Summary
The Centre for Media, Technology and Democracy published an empirical audit showing that major AI systems can reproduce a large amount of journalism while, by default, failing to attribute the original reporting. For this watch, it is a strong direct-journalism story because it turns vague complaints about AI scraping and substitution into measured evidence of attribution failure, substitution risk, and the conditions under which citation behavior improves.
Why It Matters
This matters directly to journalists and publishers because it documents several operational realities at once:
- AI systems appear to have absorbed newsroom reporting at scale
- default chatbot behavior often omits the originating outlet
- newsroom value can be displaced even when links are sometimes present
- attribution is not technically impossible, because it improves sharply when the system is explicitly asked to cite sources
That makes the story useful not only as a policy or copyright reference, but also as a practical warning for newsroom product, licensing, and audience-strategy work.
PI Tool Angle
`n/a`
What the Source Says
The memo says the researchers tested four major AI models on 2,267 Canadian news stories in English and French, for a total of 18,134 queries, then ran a second study on 140 recent articles under 3,360 conditions with web search enabled. In the first study, the models provided no source attribution 82% of the time when asked about news events from their training data. In the web-enabled study, the systems reproduced enough of the underlying reporting to substitute for the source in 54% to 81% of cases and linked to Canadian news sites in 29% to 69% of responses, but named the originating outlet in the response text only 1% to 16% of the time. When the outlet was named in the prompt and citations were explicitly requested, attribution rates rose to 74% to 97%.
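The audit's headline figures are proportions over labeled responses. A minimal sketch of how such a tally might work, using entirely hypothetical labels (the field names and data here are illustrative, not from the study):

```python
# Hypothetical sketch: tallying an attribution rate in an audit like this one.
# Each response is labeled with whether it named the originating outlet.
responses = [
    {"model": "A", "web_search": False, "named_outlet": False},
    {"model": "A", "web_search": False, "named_outlet": False},
    {"model": "A", "web_search": True,  "named_outlet": True},
    {"model": "B", "web_search": True,  "named_outlet": False},
]

def attribution_rate(rows):
    """Fraction of responses that named the originating outlet."""
    return sum(r["named_outlet"] for r in rows) / len(rows)

overall = attribution_rate(responses)  # 1 of 4 labeled responses
print(f"{overall:.0%}")  # 25%
```

The per-model ranges the memo reports (e.g. 1% to 16%) would come from running the same tally filtered to each model's responses.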