MEMORY, the next big Feature race in Foundation LLM AIs, Open or Closed. Large or Small.
Anthropic kicked this off late this week with its announcement of much larger ‘Context windows’ in the LLM AI model underlying Claude, its competitor to OpenAI’s ChatGPT. Much more to come on Memory and AI. Stay tuned.
https://www.anthropic.com/index/100k-context-windows
WHY IT’S IMPORTANT: Today’s LLM AIs and AI chatbots have the “memory of a goldfish”, as the author of the TechCrunch piece (link below) puts it so aptly.
Specifically, they go on to say:
“Historically and even today, poor memory has been an impediment to the usefulness of text-generating AI. As a recent piece in The Atlantic aptly puts it, even sophisticated generative text AI like ChatGPT has the memory of a goldfish. Each time the model generates a response, it takes into account only a very limited amount of text — preventing it from, say, summarizing a book or reviewing a major coding project.
But Anthropic’s trying to change that.
Today, the AI research startup announced that it’s expanded the context window for Claude — its flagship text-generating AI model, still in preview — from 9,000 tokens to 100,000 tokens. Context window refers to the text the model considers before generating additional text, while tokens represent raw text (e.g., the word “fantastic” would be split into the tokens “fan,” “tas” and “tic”).
So what’s the significance, exactly? Well, as alluded to earlier, models with small context windows tend to “forget” the content of even very recent conversations — leading them to veer off topic. After a few thousand words or so, they also forget their initial instructions, instead extrapolating their behavior from the last information within their context window rather than from the original request.”
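For a feel of what tokens and a context window mean in practice, here is a minimal sketch using OpenAI’s open-source tiktoken tokenizer as a stand-in (Claude uses its own tokenizer, so the exact splits and counts below are illustrative only):

```python
# Minimal sketch: how text becomes tokens, and how tokens are counted
# against a context window. Uses OpenAI's tiktoken tokenizer as a stand-in;
# Claude's tokenizer differs, so splits and counts are illustrative only.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

tokens = enc.encode("fantastic")
print(tokens)                               # a short list of integer token IDs
print([enc.decode([t]) for t in tokens])    # the word broken into sub-word pieces

def fits_in_context(prompt: str, context_window: int = 100_000) -> bool:
    """Return True if the prompt's token count fits inside the context window."""
    return len(enc.encode(prompt)) <= context_window

# A book-length input of roughly 75,000 words would overflow a 9k-token
# window but fits comfortably within a 100k-token one.
book_text = "word " * 75_000
print(fits_in_context(book_text, context_window=9_000))
print(fits_in_context(book_text, context_window=100_000))
```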
So besides enabling longer, more detailed prompts, a bigger memory helps the model stay on topic and keep following its original instructions, reducing hallucinations and increasing reliability.
The other reason bigger context windows are important is that the API fees for using LLM AIs are generally based on tokens. Larger context windows potentially mean higher revenues, as both commercial and consumer customers take advantage of the expanded Memory capabilities of these foundation models.
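As a rough illustration of why that matters commercially, here is a back-of-envelope sketch; the per-token rates are hypothetical placeholders, not Anthropic’s (or anyone’s) actual price list:

```python
# Back-of-envelope API cost for a single long-context request.
# The rates below are hypothetical placeholders, NOT real Anthropic pricing;
# the point is simply that fees scale with the number of tokens processed.
INPUT_RATE_PER_1K = 0.0015    # hypothetical dollars per 1,000 input tokens
OUTPUT_RATE_PER_1K = 0.0050   # hypothetical dollars per 1,000 output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the fee for one request, given token counts in and out."""
    return (input_tokens / 1_000) * INPUT_RATE_PER_1K \
         + (output_tokens / 1_000) * OUTPUT_RATE_PER_1K

# Summarizing a book-length input: roughly 75k tokens in, 2k tokens out.
print(f"${estimate_cost(75_000, 2_000):.2f}")
```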
Note that at this week’s Google I/O 2023, Sundar Pichai specifically mentioned Memory as a key area of focus for Google’s upcoming PaLM 2 and Gemini LLMs, which are being designed to go toe to toe with GPT-4 and beyond.
And this is just the beginning of the focus on Memory enhancements. Memory can meaningfully enable new LLM AI user experiences, changing current UI/UX at its core.
Next up is persistent, per-user Memory that remembers and tracks each user’s needs and queries independently of the centralized foundation LLM models.
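What that could look like in practice is anyone’s guess, but as a purely speculative sketch, an application could keep each user’s conversation history in its own store and replay the most recent turns into the model’s context window on every request. Nothing below is an existing Anthropic or OpenAI API; the file-based storage and prompt assembly are assumptions for illustration:

```python
# Speculative sketch of "persistent memory by user": conversation history is
# stored by the application itself, keyed by user, and replayed into the
# model's context window on each new request. The storage layer and prompt
# format here are assumptions, not any vendor's actual memory feature.
import json
from pathlib import Path

MEMORY_DIR = Path("user_memory")
MEMORY_DIR.mkdir(exist_ok=True)

def remember(user_id: str, role: str, text: str) -> None:
    """Append one conversation turn to the user's persistent history."""
    path = MEMORY_DIR / f"{user_id}.jsonl"
    with path.open("a") as f:
        f.write(json.dumps({"role": role, "text": text}) + "\n")

def build_prompt(user_id: str, new_question: str, max_turns: int = 50) -> str:
    """Prepend the user's most recent stored turns to a new question."""
    path = MEMORY_DIR / f"{user_id}.jsonl"
    history = []
    if path.exists():
        history = [json.loads(line) for line in path.read_text().splitlines()]
    lines = [f'{turn["role"]}: {turn["text"]}' for turn in history[-max_turns:]]
    lines.append(f"user: {new_question}")
    return "\n".join(lines)
```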
Much more to come here. Good to see Anthropic leading in this important area.
https://techcrunch.com/2023/05/11/anthropics-latest-model-can-take-the-great-gatsby-as-input/