Context Window 59
This week’s AI stories are less about capability and more about control: how models are trained, when licensing works, and where consent is quietly assumed rather than granted. For publishers, these mechanics are starting to matter more than the rhetoric.
The thing I’ve seen most shared in my social channels this week is a piece in The Atlantic, AI’s Memorization Crisis, which responds to the Stanford/Yale research that I wrote about last week. The piece argues that major LLMs have “memorised” large chunks of copyrighted books and can reproduce them verbatim when prompted, undermining the industry’s claims that models don’t store copies of training data and creating serious legal risk.
However, it relies heavily on a handful of dramatic examples while giving little context on how common this is in regular use, or what this actually proves about how LLMs store and retrieve information. Readers should treat it as a strong warning shot about transparency, copyright, and governance—but not a settled, comprehensive account of how LLMs work.
Meanwhile, an interesting update on LLM training: some publishers and sales channels are seeing an increase in physical books being bought, apparently to be scanned for training purposes, relying on the right of first sale and Fair Use (something Judge Alsup explicitly addressed in his judgement in Bartz v. Anthropic). Wholesale giant Ingram is allowing publishers to opt out of these sales.
Wikipedia has announced licensing deals with AI platforms including Microsoft, Mistral, Perplexity, Meta and Amazon. The reaction has been mixed, with many people criticising Wikipedia for working with AI companies. But as a commercial precedent it is a helpful one for all publishers (particularly in Open Access), supporting the argument that tech companies will pay for convenience and speed of access to data, even if they could scrape the same information from the web for free.
Microsoft’s AI Economy Institute released a new research paper on the use of AI around the world. Counterintuitively, given its strength in tech, the US was ranked 24th in the world for the second half of 2025. The leader in the English-speaking world was Ireland, ranked 4th. New Zealand ranked 7th, the UK 9th, Australia 11th and Canada 14th. Ranked first and second were the UAE and Singapore, where 64% and 61% of the population respectively had used a generative AI product in the period. I would be fascinated to see a comparative study of AI use in the global publishing industry and whether these national trends apply there too.
In trying to find different data sources by reading around the topic, I found this Substack from an editor reflecting on three years of running a journal. It makes some interesting points for anyone looking at editorial submissions in any area of publishing, including trade. The key transferable question for every publisher: what happens when the market moves from easily recognised slop to competent authors quietly using AI as a generative tool?
Meanwhile, the music industry is grappling with issues of provenance and labelling of AI content, with Spotify facing complaints about AI music being surfaced in users’ Discover Weekly playlists and Bandcamp banning AI-generated material.
The Authors Alliance has a useful update on the Bartz v. Anthropic settlement, covering key dates, a change of judge, and a group of authors who are opting out of the settlement to pursue their own parallel litigation against Anthropic, Google, OpenAI, Meta, xAI, and Perplexity.
Whilst we await the UK government’s final response to last year’s AI and copyright consultation, Ed Newton-Rex highlighted an unguarded admission by a senior civil servant that the UK government’s preferred approach last year was as much about ensuring some access to content for training as giving creatives control over their IP.
Amazon’s new “Buy For Me” feature uses AI to show products from third-party brands in Amazon search and checkout, then completes the purchase on the brand’s own website—without the brand opting in. Retailers are pushing back, arguing this collapses consent and control. For publishers, it’s a familiar warning: AI convenience features increasingly assume access first, and negotiate rights later.
As a creative example of what can be done with AI, developer Iain Tait produced a website exploring fashion over his lifetime, with AI-generated coding and images. It’s the sort of niche interactive that would have taken an agency team several weeks to build in Flash at the start of my career. Now it’s a single-person weekend project with AI. He also documented exactly how he built it on his GitHub repo, so if you wanted to build a bookish equivalent…
This was originally published in my email newsletter. To receive weekly updates on how AI is affecting the publishing industry, sign up here.