After nearly 25 years as a founder of Mumsnet, I considered myself pretty unshockable when it came to the workings of big tech. But my jaw hit the floor last week when I read that Google was pushing to overhaul UK copyright law in a way that would allow it to freely mine other publishers’ content for commercial gain without compensation.
At Mumsnet, we’ve been on the sharp end of this practice, and have recently launched the first British legal action against the tech giant OpenAI. Earlier in the year, we became aware that it was scraping our content – presumably to train its large language model (LLM). Such scraping without permission is a breach of copyright laws and explicitly of our terms of use, so we approached OpenAI and suggested a licensing deal. After lengthy talks (and signing a non-disclosure agreement), it told us it wasn’t interested, saying it was after “less open” data sources.
You might ask why the lifting of online content for model-training poses a problem – hasn’t Google been crawling all over websites and ingesting their data for search purposes since the dawn of the internet? That’s true, but there is a clear value exchange in allowing Google to access that data, namely the resulting search traffic that comes from being indexed by Google. In contrast, the AI companies are building models such as ChatGPT to provide the answers to any and all prospective questions, and that will mean people no longer need to go elsewhere for solutions. And they’re building those models with illegally scraped content from the very websites they are poised to replace.
Allowing the AI companies to simply steal content isn’t just enormously unfair to publishers who see no reward for the work they put in, or the risks they take; it’s also an existential threat to them (and ultimately counterproductive). If publishers wither and die because the AIs have hoovered up all their traffic, then who’s left to produce the content to feed the models? And let’s be honest – it’s not as if these tech giants can’t afford to properly compensate publishers. OpenAI is currently fundraising to the tune of $6.5bn, the single largest venture capital round of all time, valuing the enterprise at a cool $150bn. In fact, it has just been reported that the company is planning to change its structure and become a for-profit enterprise.
Some larger publishers with legal and financial muscle have managed to cut licensing deals with the AI giants, and several others are engaged in lawsuits to try to protect their rights. But the smaller publishers will be at the back of the queue and may never get compensated if Google et al have their way with copyright law.
At Mumsnet, we’re actually in a stronger position than most to withstand AI’s onslaught because much of our traffic comes to us directly rather than via a search engine. An AI chatbot can spit out a “Mumsnet-style” answer to a parenting question, but it will never be as funny about parking wars or as brutally honest about relationships, and it will certainly never provide the emotional support that sees about 1,000 women a year, according to our estimates, helped to leave abusive partners by other Mumsnet users. But if these trillion-dollar giants are allowed to ride roughshod over content producers, and get away with it, they will destroy many of them, and all the jobs dependent on them.
I’m not anti-AI. It plainly has the potential to advance human progress and improve our lives in myriad ways. We used it at Mumsnet to build MumsGPT, which uncovers and summarises what parents are thinking about – everything from beauty trends to supermarkets to politicians – and we licensed OpenAI’s API (application programming interface) to build it. Plus, we think there are some very good reasons why AI companies should ingest Mumsnet’s conversations to train their models. The 6bn-plus words on Mumsnet are a unique record of 24 years of female interaction about everything from global politics to relationships with in-laws. By contrast, most of the content on the web was written by and for men. AI models have misogyny baked in and we’d love to help counter their gender bias.
But Google’s proposal to change our laws would allow billion-dollar companies to waltz untrammelled over any notion of a fair value exchange in the name of rapid “development”. Everything that’s unique and brilliant about smaller publisher sites would be lost, and a handful of Silicon Valley giants would be left with even more control over the world’s content and commerce.
It doesn’t have to be like this – there’s more than enough money flooding into AI companies for everyone to be fairly and sustainably rewarded for their contribution. But we, and by that I mean the publishing industry and government, need to wake up and smell the coffee because, as the recent Google antitrust trial in the US showed, left to their own devices big tech companies will happily ride roughshod over the law to grow their dominance.