“History is written by the victor” – and it will be rewritten by AI.
Much ink has been spilled on the problem of disinformation supercharged by generative AI. The consequences of photorealistic synthetic content for how organisations ingest media should be obvious to anyone reading this.
However, there is an even deeper, more urgent problem that hasn’t received nearly enough attention: the security of archives and their long-term value in the face of photorealistic AI-generated content.
At the Archives and Records Conference in Belfast this year, I spoke to several archivists and record-keepers who clearly saw their role as guardians of the truth. However, I fear our guardians aren’t fully awake to the forces attacking our fragile fortress.
Generative AI represents a unique threat to archival security. No longer are we concerned only with ransomware or the deletion of records; we are concerned with the veracity of every piece of content, regardless of its age, origin or creator. Adding a few realistic-looking synthetic images to an archive is all it takes to dissolve any sense of trust in the archive as a whole.
When deepfakes were first invented, they didn’t know how to blink. There aren’t many high-resolution pictures of famous people with their eyes closed, so the AI models never learned to close the eyes of their clones. Once this unblinking stare was discovered, deepfakes became simple to detect. That lasted about five months before the models were updated with the ability to blink.
This is a common theme: AI detectors find some quirk of AI content that distinguishes it from real content, AI creators learn of the quirk and fix it, and the detectors go hunting for a new one.
With each iteration of this cycle, the AI models get better until, eventually, their output is truly undetectable. For text we have already reached the end of the cycle: AI-generated text can no longer be reliably detected by humans or machines, and it never will be.
I believe we are 24 months away from this happening for photographic content. That gives us only two years to secure all historical content. After that, any unsecured archive will remain forever unverifiable, with no reason to believe its account of history over that of any of the thousands of AI-generated accounts. If you cannot prove that a piece of media existed before the AI inflection point, you can never again prove that it is real. This effectively drives the value of any archive to zero: without trust in its content, an archive holds no value as a documentation of history.
So how do we continue to protect the future of the newsroom? We start by safeguarding the past.
The process of securing an archive is not as simple as hiring a team of cybersecurity wizards to lock it up in what might feel like an impenetrable vault. As at Troy, there is always a horse: a single point of failure can always lead to a breach. Every individual organisation is inherently susceptible to synthetic content injection, simply because a siloed database has no external provability.
Moreover, the images and videos in an archive hold no unique fingerprint, no incorruptible point of origin that can serve as proof that they have not been edited or changed. This siloing, combined with the lack of any proof, completely erodes trust for any external party trying to validate an archive or its contents.
The first step in securing an archive is to create a fingerprint for each piece of content: a demonstrable marker of its existence and proof that the photo or video is the original object placed in the database, not an added fake. The second is to anchor all of these fingerprints to a secure ledger of trust, which allows external parties to verify that the content is in fact what it claims to be. The third is to distribute that trust, creating thousands of points of failure instead of just one. We do this by building a network of ‘verifiers’ who collaboratively act as the trust layer for each piece of content. This removes the single-point-of-failure problem while ensuring that the content itself never leaves each organisation’s infrastructure.
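To make the first two steps concrete, here is a minimal sketch of the general technique they describe: hashing each asset to produce a fingerprint, then folding those fingerprints into a single Merkle root that can be anchored to a shared ledger. This is an illustration of the standard approach, not OpenOrigins’ actual implementation; the asset names are hypothetical, and a real system would also publish per-item Merkle proofs rather than rebuilding the whole tree.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """SHA-256 digest acts as the content's unique fingerprint:
    any single-bit edit to an image or video yields a different hash."""
    return hashlib.sha256(content).hexdigest()

def merkle_root(fingerprints: list[str]) -> str:
    """Fold a list of fingerprints into one Merkle root. Publishing just
    this short root (e.g. to a distributed ledger) anchors every item
    beneath it without the content ever leaving the archive."""
    level = [bytes.fromhex(f) for f in fingerprints]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:                 # duplicate last node on odd levels
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()

# A toy archive of three (hypothetical) assets:
archive = [b"photo-1945-original", b"newsreel-1969", b"interview-2001"]
root = merkle_root([fingerprint(a) for a in archive])

# An external verifier recomputes the fingerprints of the held assets and
# checks that they reproduce the anchored root: the archive is intact.
assert merkle_root([fingerprint(a) for a in archive]) == root

# Injecting a doctored asset changes its fingerprint, so the recomputed
# root no longer matches the anchored one and the tampering is exposed.
tampered = [b"photo-1945-doctored"] + archive[1:]
assert merkle_root([fingerprint(a) for a in tampered]) != root
```

Because only the root is published, verification requires no access to the underlying media, which is what keeps ownership of the content with the archive itself.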
At OpenOrigins we call this process Archive Anchoring. Archive Anchoring ensures that archives remain resilient in the face of AI and retain their value indefinitely while keeping content ownership firmly in the hands of the owners.
The next 24 months will lay the groundwork for the information landscape of the next 20 years. Organisations that do not take steps today to protect themselves will have no role to play in that landscape.
The post To protect future newsrooms from AI fakery we must first protect the past appeared first on Press Gazette.