Large Language Models Recite Copyrighted Books Verbatim After Finetuning
alignment copyright
Source: Mastodon
Researchers find finetuning large language models can activate verbatim recall of copyrighted books.
Researchers have found that finetuning a large language model can reactivate verbatim recall of copyrighted books that the model's safeguards had previously suppressed. The phenomenon, dubbed "Alignment Whack-a-Mole," suggests that alignment training hides rather than removes memorized content: even with safeguards in place, a modest amount of finetuning can surface long passages of copyrighted text. The study, published on arxiv.org, highlights the difficulty of aligning language models with human values while also respecting intellectual property rights.
This finding matters because it underscores the tension between the capabilities of large language models and the need to protect copyrighted content. As we reported on April 29, OpenAI has been expanding its reach by bringing its models to Amazon's cloud, and wider deployment may amplify the problem. If finetuned models can reproduce copyrighted material verbatim, providers face real copyright-infringement exposure and will need safeguards that survive downstream finetuning.
As development of large language models accelerates, it is worth watching how researchers and developers respond. Can the "Alignment Whack-a-Mole" problem be solved, or will suppressed memorization remain recoverable by anyone who finetunes a model? The answer will shape the relationship between AI and intellectual property law.