Context Window 61

There are two big themes in this week’s newsletter: utility and accountability, and they don’t sit easily together. The productivity gains many of us are seeing from AI are increasingly in tension with unanswered questions about training and transparency.

Matt Webb published a really thought-provoking piece on AI agents, which got me thinking about the promised benefits of AI and what we’re actually seeing at publishers. His argument is less about agents as magical new things, and more about them as systems that coordinate work over time, which feels much closer to how a publisher actually operates.

That prompted me to write a short piece on my website about AI productivity and marginal gains. My argument is that too many people have been looking for transformational leaps, when the real, durable value from AI is more likely to come from aggregation: shaving minutes off dozens of small tasks across workflows. Individually those gains look trivial; collectively, they can be material—especially in an industry like publishing where margins are thin and complexity is high. I’d love your thoughts on this.

On the question of AI productivity, I also really enjoyed Ethan Mollick’s latest post, which describes his recent experience with AI agents in business school teaching, and sets out a framework for thinking about how and when to delegate tasks to AI agents, based on human baselines, AI process time and the probability of success. The context in the piece is interesting, but the framework is invaluable for publishers and others looking at using AI agents.
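To make those three inputs concrete, here is a minimal sketch in Python. It is not Mollick’s own formula: the function names, the retry-then-fall-back assumption and the numbers in the usage note are all hypothetical, chosen only to show how a human baseline, AI process time and probability of success might combine into a single comparison.

```python
def expected_minutes_with_agent(human_minutes, ai_minutes_per_attempt,
                                p_success, max_attempts=2):
    """Expected time if you try the agent up to max_attempts times,
    then do the task yourself if every attempt fails.

    Assumption (not from the original post): each attempt costs the same
    time (prompting, waiting, checking) and succeeds independently.
    """
    expected = 0.0
    p_reach = 1.0  # probability that we get as far as this attempt
    for _ in range(max_attempts):
        expected += p_reach * ai_minutes_per_attempt
        p_reach *= (1.0 - p_success)  # only continue if this attempt failed
    # After all attempts have failed, fall back to the human baseline.
    expected += p_reach * human_minutes
    return expected


def should_delegate(human_minutes, ai_minutes_per_attempt,
                    p_success, max_attempts=2):
    """Delegate only when the expected time beats doing it yourself."""
    expected = expected_minutes_with_agent(
        human_minutes, ai_minutes_per_attempt, p_success, max_attempts
    )
    return expected < human_minutes
```

With made-up numbers, `should_delegate(60, 10, 0.7)` returns `True` (an hour-long task, ten minutes per attempt, a 70% success rate), while `should_delegate(5, 10, 0.9)` returns `False`, because checking the agent’s work takes longer than the task itself.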

Anthropic expanded access to Claude for Excel, previously available only to Max and Enterprise customers, to users on its Pro tier (around £18 per month), making it significantly more accessible to teams in smaller publishers looking to experiment with data analysis and workflows.

DeepSeek open-sourced its Optical Character Recognition model, which combines one of the highest benchmark performances for OCR with superior efficiency. This could be interesting for anyone digitising printed materials.

OpenAI released a new tool for researchers called Prism, which integrates GPT-5.2 with a text editor for writing scientific papers in LaTeX. For journals and publishers, this is a potentially big deal, even if it doesn’t live up to the hyperbolic promise of “vibe coding for science”. It will be interesting to see what sort of adoption it gets.

The best analysis I saw of the strategy behind this was from Adam Hyde at Kotahi, who highlighted that this moves OpenAI up the stack from training on finished papers to training on the process that creates those papers. Being able to see, and train on, workflow rather than the final output signals a shift toward models that mimic processes of reasoning. That has deeper implications for editorial integrity and reproducibility than training on static texts alone.

Last week I featured the first part of Cashmere founder Jonathan Woahn’s thoughts on the AI content licensing market for The Scholarly Kitchen. The second part dropped yesterday, and provides thoughtful context on how AI licensing is developing compared to examples like the music business and YouTube. Jonathan also proposes four business norms that publishers should be establishing now. It’s an essential read for publishers, and it has that rare thing, a constructive comment thread beneath it.

The Washington Post has a detailed piece on Anthropic’s systematic efforts to buy and scan physical books for AI training, something the company was at pains to keep quiet. The piece reveals that millions of books were bought from vendors including the second-hand/circular economy platform World of Books, meaning those purchases are essentially invisible in industry sales data. (Incidentally, I wrote a piece about World of Books, and an attempt to size this market, for the Society of Authors back in 2021.)

The Anthropic book scanning project emphasises that questions about how models have been trained are not going away. The EU AI Act requires foundation model providers to disclose the sources of their training data, but the Commission is not actively enforcing that requirement yet, and the failure of many platforms to comply is rightly frustrating rightsholder groups.

For anyone trying to keep track of the various pending lawsuits over AI and copyright, the Authors Alliance provides an annotated list, with some useful commentary and analysis. They make particularly valuable points about Bartz v. Anthropic: the proposed settlement doesn’t seem to have impaired Anthropic’s valuation or progress, and so settling may look like a pragmatic decision for other defendants, especially where courts certify a class.

This was originally published in my email newsletter. To receive weekly updates on how AI is affecting the publishing industry, sign up here.

Written on January 30, 2026