Decoding Deepfakes: The Role of Zero-Knowledge Proofs in AI Verification
How Cryptography Can Help Verify and Protect AI-Generated Content
In AI policy, a pressing question is how to verify whether an image or text was generated by a machine learning model. Verification matters most when the content carries significant consequences, such as politically motivated deepfakes. For example, if an image crafted with MidJourney is shared on social media, could users discern its AI origins? The practice of verifying AI-generated content is still emerging, and while zero-knowledge proofs (ZKPs) offer a privacy-preserving tool worth exploring for this challenge, they are not the ultimate solution.
Currently, the most common approach to this verification problem is to train deep neural networks to distinguish between "real" and "fake" images. Verification services use models such as MidJourney, Stable Diffusion, and DALL-E to generate large sets of images and label them by source. These AI-generated images, along with a substantial collection of camera-captured images, are then used to train a classifier that sorts new images into categories such as "not AI-generated," "generated with MidJourney," "generated with Stable Diffusion," or "generated with DALL-E." Despite its popularity, this method faces significant hurdles.
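To make the approach concrete, here is a minimal sketch of such a source classifier in PyTorch. The dataset path, class names, and training settings are illustrative assumptions, not details from any actual verification service.

```python
# Sketch: train a classifier to label an image's source (hypothetical setup).
# Assumes a folder of labeled examples; dataset path and class names are illustrative.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

CLASSES = ["camera", "midjourney", "stable_diffusion", "dalle"]  # assumed labels

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Expects data/train/<class_name>/*.png -- an assumption for illustration.
train_data = datasets.ImageFolder("data/train", transform=transform)
loader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=True)

# Fine-tune a pretrained backbone for the 4-way "source" classification task.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(CLASSES))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # short demo run
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```

The weakness of this setup is exactly the one discussed next: the classifier only knows the distribution it was trained on.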
Neural networks are limited in identifying the source of an image because they depend on patterns learned from their training data. Generative AI models, designed to create a nearly infinite variety of new images, often produce content vastly different from the training data of verification models. Consequently, these models struggle with out-of-distribution images, leading to degraded performance and increased false positives and negatives, thereby undermining the goal of accurate disclosure.
To mitigate the out-of-distribution problem, AI verification services could continuously retrain their models on updated data. But continuous retraining and data acquisition are expensive enough that, realistically, only large AI labs and companies could afford them; a startup would struggle to sustain itself as a dedicated AI verification service.
Given these limitations, neural networks might not be the most effective method for verifying the source of AI-generated images.
Exploring the Potential of Zero-Knowledge Proofs
Zero-knowledge proofs (ZKPs) are cryptographic protocols that let one party prove a statement is true without revealing the underlying information, and they already play a crucial role in blockchain technology, enabling both data privacy and transaction transparency. In blockchain systems, ZKPs allow independent auditors to verify the validity of cryptocurrency transactions without learning users' private data. The same principle can be applied to verifying AI-generated images while preserving user privacy.
For instance, an Ethereum user signs transactions with a private key and receives funds at an address derived from the corresponding public key. An ordinary signature already shows that a transaction is authorized without exposing the private key; ZKP-based systems, such as privacy-focused chains and rollups, go further, proving that a transaction is valid while keeping its details confidential.
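To show the core pattern, here is a toy Schnorr-style proof of knowledge of a private key, written in plain Python. This is not Ethereum's actual machinery (mainnet uses ECDSA signatures, and production ZK systems use far more elaborate constructions such as zk-SNARKs), and the parameters below are deliberately tiny and insecure; the point is only that the verifier becomes convinced the prover holds the private key without ever seeing it.

```python
# Toy Schnorr proof of knowledge of a private key (discrete log).
# Parameters are tiny and insecure -- for illustration only.
import hashlib
import secrets

P = 23          # small safe prime (toy value)
Q = 11          # prime order of the subgroup generated by G
G = 4           # generator of the order-Q subgroup mod P

def hash_to_challenge(*values) -> int:
    """Fiat-Shamir: derive the challenge from the public values."""
    data = ":".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(private_key: int, public_key: int):
    """Prove knowledge of private_key such that G**private_key % P == public_key."""
    k = secrets.randbelow(Q - 1) + 1              # random nonce
    commitment = pow(G, k, P)
    challenge = hash_to_challenge(G, public_key, commitment)
    response = (k + challenge * private_key) % Q
    return commitment, response

def verify(public_key: int, commitment: int, response: int) -> bool:
    challenge = hash_to_challenge(G, public_key, commitment)
    return pow(G, response, P) == (commitment * pow(public_key, challenge, P)) % P

# The verifier learns that the prover knows the key, but not the key itself.
x = 7                     # private key (never shared)
y = pow(G, x, P)          # public key
r, s = prove(x, y)
assert verify(y, r, s)
```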
Applying this concept to AI image verification, ZKPs could facilitate a privacy-preserving protocol for maintaining a database of AI-generated images. This database would securely record the generation of images without revealing sensitive user information. Scott Aaronson discussed the potential of using a database to store AI-generated content in a 2022 blog post but highlighted privacy concerns as a major obstacle:
“If OpenAI controls the server, then why go to all the trouble to watermark? Why not just store all of GPT’s outputs in a giant database, and then consult the database later if you want to know whether something came from GPT? Well, the latter could be done, and might even have to be done in high-stakes cases involving law enforcement or whatever. But it would raise some serious privacy concerns: how do you reveal whether GPT did or didn’t generate a given candidate text, without potentially revealing how other people have been using GPT? The database approach also has difficulties in distinguishing text that GPT uniquely generated, from text that it generated simply because it has very high probability (e.g., a list of the first hundred prime numbers).”
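One way to picture a privacy-preserving registry, before bringing in ZKPs proper, is a ledger that stores only salted hash commitments of generated outputs. The sketch below is a simplification, not a full zero-knowledge protocol, and every name in it is hypothetical; it shows how a specific output could be checked later without the ledger itself revealing how other people used the system. Aaronson's other concerns, such as near-duplicates and high-probability text, remain unsolved here.

```python
# Hypothetical sketch: a commitment registry for generated outputs.
# The ledger stores only salted hashes, so entries reveal nothing about
# other users' content; a specific output can still be checked later.
import hashlib
import secrets

ledger: set[str] = set()   # stand-in for an append-only public ledger

def commit(content: bytes, salt: bytes) -> str:
    return hashlib.sha256(salt + content).hexdigest()

def register_output(content: bytes) -> bytes:
    """Called by the generation service at creation time; returns the salt."""
    salt = secrets.token_bytes(16)
    ledger.add(commit(content, salt))
    return salt

def was_generated_here(content: bytes, salt: bytes) -> bool:
    """Check one candidate output, given the salt disclosed for that item only."""
    return commit(content, salt) in ledger

image_bytes = b"...raw bytes of a generated image..."
salt = register_output(image_bytes)
assert was_generated_here(image_bytes, salt)
```

Note the obvious limitation: any edit to the content changes the hash, so the registry only recognizes exact copies.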
Understanding the Roles of Provers and Keys in ZKPs
In the context of zero-knowledge proofs, a "prover" is the party that wants to convince a verifier that it knows a secret value (the witness) without revealing anything about that value beyond the fact that the claim is true. For AI-generated images, the prover could be either the company that developed the AI model or the individual user who created the image with it.
Model Company as the Prover
When the model company acts as the prover, a circuit representing the model's neural network architecture and weights serves as the private key (in ZKP terms, the private witness). This secret encapsulates the company's intellectual property, including the details of the model's structure and parameters. The public keys derived from it correspond to the AI-generated images themselves. Zero-knowledge proofs then allow anyone to verify that an image was created with a specific model, such as MidJourney, by checking against a public ledger like a blockchain.
In this scenario, the private key (the model architecture and weights) remains confidential, while the public key (the AI-generated image and its associated proof) is accessible. Authenticity can be verified without exposing the proprietary details of the model.
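The sketch below lays out which values would stay private and which would be published in this arrangement. It is only a hash-based skeleton with hypothetical names: the hard part, a zero-knowledge proof that the committed weights actually produced a given image (a zk-SNARK of the model's inference), is represented by a placeholder.

```python
# Hypothetical sketch of the "model company as prover" setup.
# It shows which values stay private and which are published; the
# zero-knowledge proof of inference itself is represented by a stub.
import hashlib

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Private to the company: architecture + weights (the "witness").
model_weights = b"...serialized architecture and weights..."
weights_commitment = digest(model_weights)   # published once, reveals nothing

def prove_image_origin(image: bytes) -> dict:
    """Company-side: emit a public record tying an image to the committed model.
    In a real system, `proof` would be a zk-SNARK that the committed weights
    produced this image; here it is only a placeholder string."""
    return {
        "image_digest": digest(image),
        "weights_commitment": weights_commitment,
        "proof": "zk-proof-of-inference-goes-here",
    }

def verify_image_origin(image: bytes, record: dict, published_commitment: str) -> bool:
    """Anyone can run this against the public ledger."""
    return (
        record["image_digest"] == digest(image)
        and record["weights_commitment"] == published_commitment
        # and verify_zk_proof(record["proof"], ...)  # the real cryptographic step
    )
```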
User as the Prover
Alternatively, the prover can be the individual user who generates images with the AI model. In this case, the user holds the private key, which encodes their credentials and usage rights, while the public keys correspond to the images they create. This configuration lets users retain ownership and control over their AI-generated images, so their creations can be authenticated without compromising their privacy.
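A rough sketch of the user-as-prover split follows, using ordinary Ed25519 signatures from the `cryptography` package to stand in for the user-held secret. Signatures are not zero-knowledge proofs in the full sense; a real design might layer a ZKP on top so the user could also hide their identity or credentials while still proving authorship.

```python
# Hypothetical sketch of the "user as prover" setup, using plain
# Ed25519 signatures to illustrate the private-key / public-verification split.
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

user_key = Ed25519PrivateKey.generate()   # stays on the user's device
user_pub = user_key.public_key()          # shared or registered on a public ledger

def claim_authorship(image: bytes) -> bytes:
    """User signs a digest of the image they generated."""
    return user_key.sign(hashlib.sha256(image).digest())

def check_authorship(image: bytes, signature: bytes) -> bool:
    """Anyone holding the user's public key can verify the claim."""
    try:
        user_pub.verify(signature, hashlib.sha256(image).digest())
        return True
    except InvalidSignature:
        return False

image = b"...bytes of the user's generated image..."
sig = claim_authorship(image)
assert check_authorship(image, sig)
```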
Differences in Mechanism Design
The choice of who acts as the prover significantly affects the design and implementation of the verification system. When the model company is the prover, the focus is on protecting the company's intellectual property while providing a way to authenticate the generated content. This approach is beneficial for maintaining the integrity and security of the model itself.
On the other hand, when users are the provers, the system emphasizes user ownership and control over the generated images. This method supports a decentralized approach, where individual creators have the power to prove the authenticity of their work independently.
So What?
Consider a scenario involving politically driven deepfakes. Suppose a malicious actor creates a deepfake video using an AI model, depicting a politician making controversial statements. The video is then shared widely on social media, potentially influencing public opinion and election outcomes.
If the AI model company is the prover, each AI-generated video can be linked to a public key stored on a blockchain. When the video is shared, viewers or independent auditors can check the blockchain to verify whether it was generated by the AI model. This establishes the video's provenance without exposing the model's architecture and weights.
Alternatively, if users are the provers, individuals who create and share AI-generated content would have their own private keys, with public keys linked to their creations. This allows content creators to authenticate their work, and viewers can verify the source of the video. This decentralized approach empowers users to maintain control over their content while ensuring transparency.
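The audit step in either arrangement could look something like the sketch below: a viewer fingerprints the suspicious file and consults the public ledger. The ledger layout and record fields are assumptions made up for illustration.

```python
# Hypothetical audit flow: a viewer checks a suspicious video against
# a public ledger of records published by model companies or creators.
import hashlib

public_ledger = {
    # video digest -> provenance record (e.g., which model or creator key)
    # "ab34...": {"source": "model-X", "registered_at": "2024-05-01"},
}

def audit_video(video_bytes: bytes):
    """Return the provenance record if this exact video was registered."""
    fingerprint = hashlib.sha256(video_bytes).hexdigest()
    return public_ledger.get(fingerprint)

record = audit_video(b"...bytes of the shared video...")
if record is None:
    print("No registration found: unknown origin (or the file was altered).")
else:
    print(f"Registered as AI-generated by {record['source']}.")
```

The "or the file was altered" caveat matters: exact-match lookups break down as soon as the content is edited, which is one of the limitations discussed below.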
Takeaways
"If we knew what we were doing, it would not be called research, would it?"
~ Albert Einstein
Verifying the source of deepfakes is an open technical problem, and there is no guarantee of when or how it will be solved. Despite their promise, zero-knowledge proofs face real challenges in this setting: access to a model's source code could let bad actors route around verification, and edits to AI-generated images in tools like Photoshop can break the link between an image and its registered record. ZKPs are therefore a promising avenue for tracking AI-generated content, but not a complete solution; they are more likely to be one component of a larger technological framework for verifying and authenticating AI-generated media.
Because the technical landscape is shifting and unpredictable, policy interventions on deepfakes should not be prescriptive. Rules that mandate specific technologies, such as watermarks, will quickly become outdated as innovators find more effective ways to handle deepfakes.
Hey Logan, interesting take on ZKPs for AI verification. While I see the potential, I'm not fully convinced it's the best solution. Here's my perspective:
I think something like Twitter's Community Notes might be more practical for dealing with deepfakes. ZKPs sound promising, but I'm concerned about the data storage and compute resources they'd require. Is it really necessary (or feasible) to verify every AI image so intensively?
As AI-generated content becomes more common, I suspect we'll naturally become more skeptical of what we see online. While ZKPs could be one way for platforms to verify content, I wonder if simpler, less resource-intensive methods might emerge for everyday use. We might end up relying more on a combination of technological solutions and human judgment, like trusting certain individuals and organizations with solid track records.
Don't get me wrong, ZKPs could be useful for highly sensitive content. But for everyday stuff? Seems like overkill to me.
What are your thoughts on these concerns? Am I missing something about how ZKPs would work in practice?