LinkedIn
Source: Mastodon
LinkedIn has become the latest high‑profile battleground in the growing clash over whether large language models (LLMs) may be trained on copyrighted material harvested from online platforms. A Dutch court last week accepted a complaint filed by a coalition of authors and publishers alleging that several AI firms scraped LinkedIn posts, résumé data and articles, much of it still under copyright, to feed their models. The plaintiffs argue that the practice violates EU copyright law, while the tech companies have so far relied on a “transformative use” defence, claiming that the output of an LLM is a new creation that does not infringe the original works.
The case matters because LinkedIn hosts billions of professional posts, many of which are original articles, white papers and industry analyses. If the court rules that such content cannot be harvested without explicit permission, AI developers could lose a vast source of high‑quality training data, potentially slowing the pace of model improvement and raising costs for startups that lack proprietary corpora. Conversely, a ruling in favour of the defendants would cement a legal pathway for AI firms to continue mining publicly accessible text, intensifying the debate over data ownership and the adequacy of existing copyright frameworks.
All eyes now turn to the upcoming hearing, scheduled for June, where LinkedIn’s legal team is expected to argue that the models’ outputs are “transformative” and therefore exempt from infringement claims. Observers will also watch for reactions from the European Commission, which is drafting AI‑specific provisions under the Digital Services Act. The outcome could shape licensing practices, prompt new data‑use policies on professional networks, and influence how AI companies structure future training pipelines.