In a world rife with disinformation and distrust of information, tracing the source and history of the media we consume (its provenance) can inform our trust, just as knowing where our food comes from and how it was prepared can help us make better decisions about what we eat.
With the advent of generative AI and its growing capacity to create realistic content, transparency is even more critical. This has been recognized by industry leaders, civil society, and international organizations, as well as legislative and regulatory bodies across the globe, from the European Union to China and Brazil to Australia, which have begun to impose some form of disclosure requirements on those creating or displaying AI-generated or AI-edited content.
Bolstered by regulation and standardization, mechanisms of disclosure such as watermarks, cryptographic metadata, and digital fingerprints are gaining momentum. For example, major platforms and services like YouTube, LinkedIn, and Adobe have already adopted the C2PA standard, which can help capture the provenance of all types of content, including pictures taken with your camera or phone. The World Standards Cooperation, formed by the IEC, ISO, and ITU, has also begun discussing priority areas and requirements in this field.
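To make "cryptographic metadata" concrete, here is a minimal sketch in Python of the underlying idea: hash the media bytes and sign a small record of claims about them. It is illustrative only and does not reproduce the actual C2PA manifest format; the function and field names are hypothetical, and it assumes the open-source `cryptography` package is installed.

```python
# Illustrative only: a minimal sketch of what "cryptographic metadata" involves,
# not the actual C2PA manifest format. Field and function names are hypothetical.
import hashlib
import json

from cryptography.hazmat.primitives.asymmetric import ed25519


def sign_provenance(media_bytes: bytes, claims: dict,
                    key: ed25519.Ed25519PrivateKey) -> dict:
    """Bind provenance claims to one specific piece of media.

    The hash ties the claims to these exact bytes; the signature lets anyone
    holding the matching public key detect later tampering with either.
    """
    record = {
        "content_sha256": hashlib.sha256(media_bytes).hexdigest(),
        "claims": claims,  # describes HOW the media was made, e.g. the tool used
    }
    payload = json.dumps(record, sort_keys=True).encode()
    return {"record": record, "signature": key.sign(payload).hex()}


# Hypothetical usage with placeholder media bytes.
signing_key = ed25519.Ed25519PrivateKey.generate()
media = b"...example image bytes..."
signed = sign_provenance(media, {"action": "captured", "tool": "camera-app/1.0"}, signing_key)
```

Anyone holding the matching public key could then detect whether the media or its claims were altered after signing; standards such as C2PA build on this basic idea with certificates and richer assertion structures.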
However, even as these technologies move into the mainstream, a different reality is coming into view: a significant portion of online content will always exist without traceable provenance.
Several factors make universal traceability unfeasible. First, even if we succeed in creating accessible standards and a tightly regulated AI ecosystem, millions of people using open-source AI tools could still bypass these requirements. Others, even if willing to comply, may find it technically impossible: the C2PA standard, for instance, will not work with tools that grant users access to its system. Then there is the massive volume of legacy media, content created before these technologies existed, that could endure online, untraceable.
And it isn't just impractical to expect that all media could somehow carry its provenance; there are also human rights considerations at stake. Perhaps most concerning is the risk that authoritarian regimes or other malicious actors could suppress freedom of expression by exploiting security flaws in these mechanisms of disclosure, deriving private information from available metadata (such as device identifiers), or enacting laws that require personally identifiable information (PII) to be attached to a piece of content's provenance.
There are safeguards that could be put in place to try to avert these risks. Among others, users should be able to retain control over their information, and captured provenance should focus on the HOW of multimedia (how it was created or edited), not the WHO (who created, edited, or published it). But even with these and other guardrails in place, the risk of misuse and abuse remains significant.
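Building on the hypothetical record structure sketched above, a verifier could enforce the "HOW, not WHO" principle mechanically: accept only claims that describe actions and tools, and reject records carrying identity fields such as names, account IDs, or device identifiers. Again, this is a hedged sketch under those same assumptions, not a real C2PA validator.

```python
# Continues the hypothetical record/signature structure sketched earlier;
# this is not a real C2PA validator, just an illustration of the principle.
import hashlib
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric import ed25519

# Claim fields a verifier accepts: the HOW of the media, never the WHO.
ALLOWED_CLAIM_FIELDS = {"action", "tool", "edits", "ai_generated"}


def verify_provenance(media_bytes: bytes, signed: dict,
                      public_key: ed25519.Ed25519PublicKey) -> bool:
    record = signed["record"]
    # 1. Reject records that carry identity fields (names, accounts, device IDs).
    if not set(record["claims"]) <= ALLOWED_CLAIM_FIELDS:
        return False
    # 2. The hash must match these exact media bytes.
    if record["content_sha256"] != hashlib.sha256(media_bytes).hexdigest():
        return False
    # 3. The signature must verify against the signer's public key.
    payload = json.dumps(record, sort_keys=True).encode()
    try:
        public_key.verify(bytes.fromhex(signed["signature"]), payload)
        return True
    except InvalidSignature:
        return False
```

The allow-list here is a design choice rather than a technical necessity: what a manifest can express is a policy decision, and keeping identity out of it is one of the guardrails described above.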
Given both the impossibility and the potential harms of universal transparency based on the prevailing mechanisms of disclosure, we must now expand our focus to prepare for what we have set in motion: a digital world split between content that has verifiable provenance, in varying levels and expressions, and content that does not.
This means preparing for, and trying to avert, a world where one journalist's image could be trusted while another's is not, simply because the second journalist lacks access to, or cannot use, tools that capture its source and history; where evidence of human rights violations could be dismissed in legal proceedings for not including cryptographic metadata; or where those speaking truth to power could be forced to choose between visibility and safety when platforms require potentially dangerous provenance information.
Tomorrow's content divide requires clarity about what verifiable provenance means and what it doesn't. It is a marker of authenticity, one among others, including fact-checking, forensic analysis, and even personal and professional experience. It does not mean that the content is necessarily true, or that it should be automatically trusted. By the same token, content without verifiable provenance should not be automatically discredited or dismissed.
This emerging challenge will demand media literacy at all levels of society, not just among the general public. Key industries and communities will face distinct hurdles. Courts, for example, must develop new methods to determine truth in this dichotomous digital future. Social media platforms will need to reimagine content moderation to ensure fair treatment of voices that will not or cannot add verifiable provenance to their content, and they must resist pressure to increasingly and arbitrarily tie social media activity to individual human identity rather than focusing on distinguishing AI creation from human activity.
User experience will also be critical in shaping how people process provenance information. Too much information could be overwhelming and ultimately ignored; too little could be confusing or misleading. The challenge becomes even more complex with aggregated content, such as Google image search results, where provenance must be conveyed clearly across multiple items at once.
Finally, as we develop frameworks and strategies to address these and other challenges through policy, law, and standards, it’s crucial that we learn from diverse global perspectives – especially from vulnerable populations and those whose experiences can help us both prevent harm and leverage the potential that these technologies can offer.
Published: 05 March, 2025