A pencil without an eraser is a pain. The same may be true of AI without the ability to ‘forget’, especially given the pivotal debate in AI over ‘fair use’ versus data copyrights. As Axios explains in “AI doesn’t forget, and that’s a problem”:
“Users want answers from artificial intelligence, but sometimes they want AI to forget things, too — creating a new category of research known as "machine unlearning," Axios' Alison Snyder reports.”
“Why it matters: Interest in techniques that can remove traces of data without degrading AI models' performance is driven in part by copyright and "right to be forgotten" laws, but also by concerns about biased or toxic AI outputs rooted in training data.”
“Deleting information from computer storage is a straightforward process, but today's AI doesn't copy information into memory — it trains neural networks to recognize and then reproduce relationships among bits of data.”
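To make that distinction concrete, here is a purely illustrative Python sketch (my own toy example, not from the Axios piece): a tiny character-level n-gram ‘model’ is built from a text file, the file is then deleted, and the model can still reproduce the text, because the information now lives in its learned statistics rather than in the stored copy.

```python
# Toy illustration (not from the Axios piece): "knowledge" can survive deletion
# of the stored data, because it has been absorbed into a model's parameters,
# here just character n-gram counts.
import os
from collections import defaultdict

TEXT = "When Harry went back to school that fall, the castle felt colder than before."

# 1) Store the data on disk, as an ordinary file or database record would be.
with open("source.txt", "w") as f:
    f.write(TEXT)

# 2) "Train" a tiny model: count which character follows each 5-character context.
model = defaultdict(lambda: defaultdict(int))
with open("source.txt") as f:
    data = f.read()
for i in range(len(data) - 5):
    context, nxt = data[i:i + 5], data[i + 5]
    model[context][nxt] += 1

# 3) Deleting the stored copy is the easy, well-understood part...
os.remove("source.txt")

# 4) ...but the trained model can still reproduce the text from its parameters.
out = "When "
for _ in range(200):
    context = out[-5:]
    if context not in model:
        break
    out += max(model[context], key=model[context].get)

print(out)  # rebuilds the sentence from the learned counts alone
```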
An urgent issue driving the need to teach AIs to ‘unlearn’ is of course the growing issue of copyrights on the data used to train today’s Foundation/Frontier LLM AI models.
As I’ve noted earlier, there are negotiations and legal actions underway for publishers to either get paid for their data, or have their data erased from the training sets of the offending AI models.
As a Microsoft researcher recently noted in “Who’s Harry Potter? Making LLMs forget,” a post accompanying a recent AI paper:
“It's like "trying to remove specific ingredients from a baked cake — it seems nearly impossible.”
“Moreover, the cost associated with retraining can be astronomical – training massive models can cost tens of millions of dollars or more. Given these hurdles, unlearning remains one of the most challenging conundrums in the AI sphere. There’s skepticism in the community around its feasibility. Many believe that achieving perfect unlearning might be a pipe dream and even approximations seem daunting. Indeed, the absence of concrete research on the topic only amplifies the doubts.”
As Axios observed:
“It's a pressing question if companies are going to be held liable for people's requests that their information be deleted or if policymakers are going to mandate unlearning.”
The Microsoft team in particular tried to get Meta’s open source Llama 2 model, with 7 billion parameters, to forget ‘Harry Potter’ content:
“In a new paper, we decided to embark on what we initially thought might be impossible: make the Llama2-7b model, trained by Meta, forget the magical realm of Harry Potter. Several sources claim that this model’s training data included the “books3” dataset, which contains the books among many other copyrighted works (including the novels written by a co-author of this work).”
“To emphasize the depth of the model’s recall, consider this: prompt the original model with a very generic-looking prompt such as “When Harry went back to school that fall,” and it continues with a detailed story set in J.K. Rowling’s universe.”
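The paper itself combines several steps, and the sketch below is not the authors’ recipe. It is only a rough, hedged illustration of the broader idea of approximate unlearning by fine-tuning a model toward generic completions, written against the Hugging Face transformers API; the checkpoint name, term dictionary, and passages are placeholder assumptions, and an actual run on a 7-billion-parameter model would need substantial GPU resources.

```python
# Rough sketch of "approximate unlearning" by fine-tuning toward generic completions.
# Illustrative only: not the exact recipe from the Microsoft paper. The checkpoint,
# term dictionary, and passages are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint; gated, requires access
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)

# Map idiosyncratic target-universe terms to generic counterparts (toy examples).
GENERIC_TERMS = {"Harry Potter": "the student", "Hogwarts": "the school", "wand": "pen"}

def make_generic(text: str) -> str:
    for term, generic in GENERIC_TERMS.items():
        text = text.replace(term, generic)
    return text

# Passages whose original continuations the model should no longer reproduce.
target_passages = ["When Harry Potter went back to Hogwarts that fall, ..."]  # placeholder

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for passage in target_passages:
    # Fine-tune the model to predict the generic version instead of the original text.
    batch = tokenizer(make_generic(passage), return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```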
The piece is worth reading to see the techniques tried, and the work that needs to be done to make the models ‘forget’ or ‘unlearn’. And as Axios notes in its piece:
“But other researchers audited the unlearned model and found that, by rewording the questions they posed, they could get it to show it still "knew" some things about Harry Potter.”
“For low stakes problems it might be sufficient to stop a model from reproducing something verbatim, but serious privacy and security issues might require complete unlearning of information.”
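That kind of audit can be sketched simply: probe the supposedly ‘unlearned’ model with reworded prompts and check whether target-universe terms still surface in its completions. The checkpoint path, prompts, and term list below are illustrative assumptions, not the auditors’ actual test set.

```python
# Toy audit loop: reworded prompts to check whether an "unlearned" model still
# leaks knowledge of the target content. Checkpoint, prompts, and terms are placeholders.
from transformers import pipeline

generator = pipeline("text-generation", model="path/to/unlearned-llama2-7b")  # assumed path

# Paraphrased prompts that avoid the wording used during unlearning and evaluation.
reworded_prompts = [
    "The boy with the lightning-shaped scar boarded the train to his school of magic and",
    "Describe the sport played on broomsticks at a certain British school for wizards.",
]
target_terms = ["Harry", "Hogwarts", "Quidditch", "Voldemort"]

for prompt in reworded_prompts:
    completion = generator(prompt, max_new_tokens=60)[0]["generated_text"]
    leaked = [term for term in target_terms if term.lower() in completion.lower()]
    print(f"PROMPT: {prompt!r}\nLEAKED TERMS: {leaked or 'none'}\n")
```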
Venturebeat had this to add on the Harry Potter experiment by Microsoft researchers:
“As the debate heats up around the use of copyrighted works to train large language models (LLMs) such as OpenAI’s ChatGPT, Meta’s Llama 2, Anthropic’s Claude 2, one obvious question arises: can these models even be altered or edited to remove their knowledge of such works, without totally retraining them or rearchitecting them?”
“In a new paper published on the open access and non-peer reviewed site arXiv.org, co-authors Ronen Eldan of Microsoft Research and Mark Russinovich of Microsoft Azure propose a new way of doing exactly this by erasing specific information from a sample LLM — namely, all knowledge of the existence of the Harry Potter books (including characters and plots) from Meta’s open source Llama 2-7B.”
“As the authors note, more testing is still needed given limitations of their evaluation approach. Their technique may also be more effective for fictional texts than non-fiction, since fictional worlds contain more unique references.”
The non-fiction arena is of course an area of contention in the recent New York Times copyright suit against OpenAI and Microsoft.
In OpenAI’s response to the New York Times ChatGPT copyright suit, the company noted:
“Interestingly, the regurgitations The New York Times induced appear to be from years-old articles that have proliferated on multiple third-party websites. It seems they intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate. Even when using such prompts, our models don’t typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts.”
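Whether a completion is ‘regurgitated’ verbatim can be checked mechanically, for example by measuring the longest run of consecutive words a completion shares with the source article. The sketch below is just one illustrative way to do that, with placeholder inputs; it is not how either party ran its analysis.

```python
# Toy verbatim-overlap check: longest run of consecutive words shared between a
# model completion and a source article. Inputs are placeholders.
def longest_shared_run(completion: str, source: str) -> int:
    a, b = completion.lower().split(), source.lower().split()
    # Classic longest-common-substring dynamic program, computed over words.
    best, prev = 0, [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

source_article = "..."    # the original article text (placeholder)
model_completion = "..."  # what the model produced for a long excerpt prompt (placeholder)
print(f"Longest shared run: {longest_shared_run(model_completion, source_article)} words")
```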
Both the academic research and real-world cases illustrate how gnarly this topic is: teaching AI models to ‘forget’ or ‘unlearn’ what they have learned, whether during training or in inference runs, is easier said than done.
And there may be no easy or effective way to ‘erase’ the offending data items from the training set at all. Nothing like a pencil and eraser.
Erasing ‘Harry Potter’ from our AI is not going to be as easy as waving a wand.
As I’ve noted before in “AI: On the Shoulders of Giants (OTSOG), Go-Getter ‘Creators’ and Grunts”, our AI computers are simply doing what we’ve done for centuries: learning from all that’s come before to create new works and insights, to keep learning from “the Shoulders of Giants” (OTSOG). The computers just ‘learn’ a whole lot faster, with more persistent calculations and memories than ours.
Erasure may not be an option.
Much remains for us to learn about how to make AI ‘unlearn’, on this long AI Tech Wave journey. Stay tuned.
(NOTE: The discussions here are for information purposes only, and not meant as investment advice at any time. Thanks for joining us here)