Context Window 13
This edition covers Adam Hyde’s hands-on demonstration of OpenAI’s Operator agent for research publishing workflows, the unredacted Meta court documents alleging the torrenting of more than 80TB of pirated ebooks, Amazon’s bet on automated reasoning to suppress AI hallucinations, the proliferation of Gemini 2.0 model variants and what model fragmentation means for non-expert users, and a write-up from Jellyfish on building AI agents inside creative agencies.
A quick and early newsletter today, as I've been on the road most of this week; I'm writing this on the first train down to Devon for the Creative UK regional investment summit at the University of Plymouth.

Adam Hyde published an overview of using OpenAI's new Operator agent for a publishing workflow: in this demo, extracting metadata from a scholarly article and submitting it to a publishing platform. It's a flexible and powerful approach compared to simple web scraping: while Hyde acknowledges that it was slow, it worked well, and (a point I often make in my training courses) this is the worst version of the tool you'll ever use. He also offers a number of thought starters for other areas where this could be useful to publishers; I'd extend that by pointing out that the same approach could be equally useful for trade publishing workflows. For the technically minded, a rough sketch of the extract-and-submit pattern appears at the end of this issue. His conclusion: "By breaking down publishing infrastructure into modular, interoperable services, we can create a system that is more flexible, scalable, and responsive to the needs of researchers and publishers alike. This is not just an optimization of current workflows but a rethinking of what research publishing infrastructure can be."

Further developments in the litigation between authors and Meta: unredacted corporate emails suggest that Meta torrented over 80 terabytes of ebooks from sources such as LibGen. Significantly, this involved not only downloading the ebooks and using them for training, which Meta asserts is Fair Use, but also seeding the torrents: in lay terms, re-uploading the same content to other downloaders, thus actively enabling the distribution of pirated material, which could further complicate Meta's Fair Use defence.

The WSJ reports on Amazon's efforts to prove that hallucination can be prevented, in order to unlock reluctant corporate customers. This automated reasoning approach validates AI outputs against logical rules and external data, ensuring they align with the underlying source material (the second sketch below illustrates the basic idea). As AI becomes more integral to content production, adopting technologies like this will be essential for maintaining accuracy and credibility.

Google released several new Gemini 2.0 models: depending on their tier, a user could choose from 1.5 Pro, 1.5 Flash, 1.5 Flash-8B, 2.0 Flash, 2.0 Flash-Lite Preview… Similarly, as an OpenAI customer I currently have a choice of GPT-4, GPT-4o mini, GPT-4o, GPT-4o with scheduled tasks, o1, o3-mini and o3-mini-high (and, for US customers, Sora and Operator). The differences between the models make sense to experts, but I wonder how clear model choice is to the average user; it makes me think of the story of Steve Jobs returning to Apple in 1997 and delivering its turnaround by focusing development on just four key segments. For publishers, navigating this fragmented landscape of model choices could create confusion, inefficiency and increased cost, especially for smaller teams without dedicated AI expertise; one mitigation is to hide the choice behind a simple routing policy (third sketch below). I'll be talking about model selection at my Lunch and Learn session for IPG members on 12 February.

There's an interesting write-up from Jellyfish on developing AI agents within creative agencies. The insights are broadly applicable: I particularly liked the observations that no use of AI can overcome poorly defined goals, and that degrees of autonomy are best built up iteratively.
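As promised, here is a rough sketch of the extract-and-submit pattern behind Hyde's demo. Operator itself is driven interactively in the browser rather than through an API, so this sketch substitutes the standard OpenAI chat API for the extraction step; the publishing-platform endpoint and field names are hypothetical placeholders, not a real service.

```python
# A minimal sketch of the extract-and-submit pattern: pull structured metadata
# from article text with the OpenAI chat API, then POST it to a publishing
# platform. The platform URL and JSON keys are illustrative assumptions.
import json
import requests
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_metadata(article_text: str) -> dict:
    """Ask the model to return title, authors and abstract as strict JSON."""
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": ("Extract scholarly metadata. Reply with a JSON object "
                         "with keys 'title', 'authors' (a list) and 'abstract'.")},
            {"role": "user", "content": article_text[:20000]},
        ],
    )
    return json.loads(response.choices[0].message.content)

def submit_to_platform(metadata: dict) -> None:
    """POST the metadata to a (hypothetical) publishing-platform endpoint."""
    r = requests.post(
        "https://platform.example.org/api/submissions",  # placeholder URL
        json=metadata,
        timeout=30,
    )
    r.raise_for_status()

if __name__ == "__main__":
    with open("article.txt") as f:
        submit_to_platform(extract_metadata(f.read()))
```

Whatever the plumbing, Hyde's larger point stands: once extraction and submission are separate, scriptable steps, the workflow becomes modular rather than monolithic.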
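To illustrate the automated reasoning idea in miniature: the principle is that generated output is checked against explicit rules and trusted data before anyone sees it. The toy sketch below is my own illustration of that shape, not Amazon's implementation (which reportedly uses formal logic techniques); the policy fields and values are invented for the example.

```python
# A toy rule-based check: every claim in a model's output is validated against
# source data and logical rules before publication. Fields, values and rules
# here are invented purely for illustration.
from dataclasses import dataclass

@dataclass
class Claim:
    field: str
    value: float

# "Ground truth" pulled from a trusted source, e.g. a policy database.
SOURCE_DATA = {"refund_window_days": 30, "max_discount_pct": 15}

# Logical rules the output must satisfy relative to the source data.
RULES = [
    ("refund_window_days", lambda v: v == SOURCE_DATA["refund_window_days"]),
    ("max_discount_pct", lambda v: 0 <= v <= SOURCE_DATA["max_discount_pct"]),
]

def validate(claims: list[Claim]) -> list[str]:
    """Return rule violations; an empty list means the output passed."""
    failures = []
    for field, rule in RULES:
        for claim in claims:
            if claim.field == field and not rule(claim.value):
                failures.append(f"{field}={claim.value} violates policy")
    return failures

# A hallucinated "60-day refund window" is caught before it reaches a reader:
print(validate([Claim("refund_window_days", 60)]))
# ['refund_window_days=60 violates policy']
```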
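And on model selection: one practical way for a small team to cope with the fragmentation is a single routing function that picks the model for each job, so individual staff never face the menu directly. The model names and thresholds below are illustrative assumptions on my part, not recommendations.

```python
# One possible routing policy: map a task profile to a model once, centrally,
# instead of asking every user to choose. Names and rules are illustrative.
from typing import Literal

Task = Literal["metadata", "copyedit", "long_analysis"]

def choose_model(task: Task, needs_reasoning: bool = False) -> str:
    if needs_reasoning:
        return "o3-mini"      # reasoning model for multi-step problems
    if task == "long_analysis":
        return "gpt-4o"       # stronger general model for nuanced work
    return "gpt-4o-mini"      # cheap, fast default for routine extraction

print(choose_model("metadata"))          # gpt-4o-mini
print(choose_model("copyedit", True))    # o3-mini
```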
This was originally published in my email newsletter. To receive weekly updates on how AI is affecting the publishing industry, sign up here.