Unlocking the Magic: zkML — Uniting AI and Blockchain!
Machine learning (ML) is a branch of artificial intelligence (AI) that enables computers to learn from data and perform tasks that would otherwise require human intelligence. ML is widely utilized in various areas — from the more traditional aspects such as fraud detection and credit risk assessment to more advanced deep neural networks for computer vision, natural language processing, and speech recognition. However, ML also faces certain challenges, such as privacy, security, and verifiability, particularly when dealing with sensitive or confidential data, or more broadly, when it comes to safeguarding the intellectual property of the underlying ML model.
Blockchain is a distributed ledger technology that enables peer-to-peer transactions without intermediaries. It offers immutability, transparency, and decentralization, which build trust among counterparties and can improve the efficiency of various applications. However, blockchain has limitations in scalability, interoperability, and usability, particularly for complex or compute-intensive workloads.
Zero-Knowledge Machine Learning (zkML) is a field of active research and development that aims to bridge the gap between AI and blockchain by using zero-knowledge proofs (ZKPs) for ML. ZKPs are a cryptographic method that allows one party to prove to another party that a statement is true, without revealing any additional information beyond the fact that the statement is true. ZKPs can be used to achieve both privacy and verifiability for ML models and data, as well as to enable off-chain computation and on-chain verification for scalability and efficiency.
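To make the idea of "proving a statement without revealing anything else" more tangible, here is a minimal sketch of a classic ZKP: a Schnorr proof of knowledge, made non-interactive with the Fiat-Shamir heuristic. It is not a zkML system and not any particular library's API. The group parameters `P`, `Q`, `G` and the function names are illustrative assumptions, and the numbers are far too small to be secure. The prover shows that she knows a secret `x` behind a public value `y = G^x mod P` without disclosing `x`.

```python
import hashlib
import secrets

# Toy group parameters (assumed for illustration, far too small to be secure):
# P = 2*Q + 1 with P and Q prime, and G = 4 generates the subgroup of order Q.
P, Q, G = 1019, 509, 4


def challenge(public_y, commitment):
    """Derive the challenge by hashing the transcript (Fiat-Shamir heuristic)."""
    data = f"{G}|{public_y}|{commitment}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(secret_x):
    """Return the public value y = G^x mod P and a proof of knowledge of x."""
    public_y = pow(G, secret_x, P)
    nonce = secrets.randbelow(Q)            # fresh randomness masks secret_x
    commitment = pow(G, nonce, P)
    c = challenge(public_y, commitment)
    response = (nonce + c * secret_x) % Q
    return public_y, (commitment, response)


def verify(public_y, proof):
    """Check G^response == commitment * y^c (mod P) without seeing secret_x."""
    commitment, response = proof
    c = challenge(public_y, commitment)
    return pow(G, response, P) == (commitment * pow(public_y, c, P)) % P


y, proof = prove(secret_x=123)
print("proof accepted:", verify(y, proof))
```

The verifier only ever handles `y` and the proof, never the secret. zkML applies the same principle to a much larger statement, namely "this output is the result of running this model on this input".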
Non-Technical Introduction to zkML
zkML is a way of using machine learning on private data without revealing the data or the model (it also applies to public data, where only the correctness of the computation needs to be proven). Private data is often protected by encryption, meaning it has been scrambled with a secret key so that only those who hold the key can read it. zkML uses ZKPs to prove that a machine learning model has been applied to, or trained on, hidden data and produced a correct output, without revealing the underlying data or the model, and optionally not the output either. ZKPs are like magic tricks that convince someone that something is true, without showing them how it is done or giving away any secrets.
As a concrete example, suppose Alice has private data that she wants to use for machine learning but does not want to share with anyone. She encrypts her data with a secret key that only she knows and sends the encrypted data to Bob. Bob has a machine learning model that he wants to sell or license without disclosing its parameters. Bob applies his model to Alice’s encrypted data (computing on encrypted data relies on a companion technique such as homomorphic encryption) and uses zkML to generate a zero-knowledge proof that he has done so correctly, without revealing the model or the data. He sends the proof and the still-encrypted output to Alice. Alice verifies the proof, which does not require any secret, and then decrypts the output with her secret key, so Bob never sees the output. This way, Alice benefits from Bob’s model without compromising her data privacy, and Bob serves Alice without compromising the confidentiality of his model.
Recent Advancements in zkML
zkML is an emerging field that has been making waves in cryptography and AI/ML circles recently. Several research papers, projects, and applications demonstrate its potential and feasibility. Until around a year ago, the practicality of zkML was in doubt; recent efficiency gains, however, have made it more suitable for practical, everyday applications.
Here are some examples of the recent advancements of zkML:
- zkml by Daniel Kang: This is a framework for constructing ZKPs of ML model execution using zkSNARKs, a type of ZKP that is succinct and non-interactive. Daniel also recently introduced TensorPlonk, which targets lower compute requirements and faster proving times. However, at the time of writing, TensorPlonk’s technical paper has not been released, nor has the code been open-sourced.
- Modulus Labs: This team aims to enable on-chain AI platforms powered by zkML. It has published a series of blog posts that explain how and why to put AI on-chain and showcase examples of on-chain AI applications.
- ezkl by zkonduit: This is a library and command-line tool for generating and verifying ZKPs of inferences from deep learning models and more traditional ML models, using halo2-based zkSNARKs. ezkl is open source and under active development. It recently added support for scikit-learn and XGBoost models, albeit with certain practical limitations. The generated proofs can then be verified on-chain (only the Ethereum Virtual Machine (EVM) is supported at the moment).
Despite its promise, zkML faces challenges, including computational complexity and scalability limitations. Ongoing research and development are addressing these challenges with the aim of making zkML more efficient and practical for real-world applications, as evidenced by ezkl’s updates over just the last couple of months.
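One source of the computational cost mentioned above is that zkSNARK circuits work with integers in a finite field, while ML models are usually defined over floating-point numbers. zkML tools therefore convert weights and inputs to fixed-point integers before proving, which trades a little accuracy for provability. The sketch below illustrates that idea only. The scale of 7 fractional bits and all function names are assumptions chosen for illustration, not the settings or code of ezkl or any other framework.

```python
SCALE_BITS = 7                  # hypothetical choice: 7 fractional bits
SCALE = 1 << SCALE_BITS         # 128


def quantize(x):
    """Map a float to a fixed-point integer with SCALE_BITS fractional bits."""
    return round(x * SCALE)


def fixed_point_dot(weights, inputs):
    """Dot product using integer arithmetic only, as a circuit would."""
    # Each product carries a combined scale of SCALE * SCALE.
    return sum(quantize(w) * quantize(x) for w, x in zip(weights, inputs))


def dequantize(acc):
    """Undo the combined scale to compare against the float result."""
    return acc / (SCALE * SCALE)


weights = [0.25, -0.5, 1.5]
inputs = [1.0, 2.0, 0.1]

exact = sum(w * x for w, x in zip(weights, inputs))
approx = dequantize(fixed_point_dot(weights, inputs))
print(f"float: {exact:.4f}  fixed-point: {approx:.4f}  error: {abs(exact - approx):.4f}")
```

A larger scale reduces the rounding error but makes the integers, and typically the circuit, bigger. That tension is part of why proving large models remains expensive.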
zkML Applications
zkML can enable a wide range of applications that can benefit from the combination of AI and blockchain, such as:
- Data marketplaces: zkML can allow data owners to sell or share their data without revealing the actual data, and data consumers to verify the quality and relevance of the data without accessing the actual data. This can create a win-win situation for both parties, as well as ensure and protect data privacy and security.
- Privacy-Preserving Data Sharing: zkML allows organizations to share data for collaborative ML without compromising the privacy of individual data contributors. This is particularly beneficial in domains like healthcare and finance, where sensitive data is prevalent.
- Model marketplaces: zkML can allow model owners to monetize or license their models without disclosing the model parameters, and model consumers to verify the performance and correctness of the model inferences without running the models themselves. This can create a new business model for ML, as well as ensure the verifiability and accountability of the models. For example, ezkl can be used to generate ZKPs for the inferences of an ML model, such as a neural network, a decision tree, or a logistic regression, so that model users receive the outputs together with proofs that they can verify on-chain. Have a look at Spectral, whose vision is exactly this: a marketplace for the inference economy!
- Federated learning: zkML can allow multiple parties to collaboratively train an ML model without sharing their local data and verify the contributions and outcomes of each party without revealing their local models. This can enable a distributed and decentralized approach to ML and enhance the privacy and efficiency of the training process.
- Verifiable ML Predictions: zkML allows users to verify the correctness of ML predictions without revealing the underlying data or model parameters. This enhances transparency and trust in ML-driven systems, particularly in critical applications like medical diagnosis or financial risk assessment.
- Secure Model Deployment: zkML enables the deployment of ML models on untrusted platforms without compromising the model’s integrity or confidentiality. This is crucial for scenarios where data privacy is paramount, such as edge computing or cloud-based ML applications.
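Several of the applications above rely on a model owner being bound to one specific model without publishing its parameters. A common building block for that is a commitment: the owner publishes a short fingerprint of the weights up front, and later claims can be tied back to it. The sketch below uses a salted SHA-256 hash and hypothetical helper names. It only illustrates the binding idea. In this toy version, checking the commitment requires revealing the weights, whereas a real zkML system would typically perform the check inside the proof so that the weights stay hidden.

```python
import hashlib
import json
import secrets


def commit(weights, salt=None):
    """Return (commitment, salt) for a list of model weights."""
    if salt is None:
        salt = secrets.token_bytes(16)      # random salt hides the weights
    payload = json.dumps(weights).encode()
    digest = hashlib.sha256(salt + payload).hexdigest()
    return digest, salt


def opens_to(commitment, weights, salt):
    """Check that (weights, salt) match a previously published commitment."""
    return commit(weights, salt)[0] == commitment


# The model owner publishes `commitment` and keeps `weights` and `salt` private.
weights = [0.12, -0.7, 3.4]
commitment, salt = commit(weights)

print("same model:", opens_to(commitment, weights, salt))
print("swapped model:", opens_to(commitment, [0.12, -0.7, 3.5], salt))
```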
Practical Usage of zkML
Recent advancements in zkML have accelerated its adoption and practical usage. Notable examples include:
Reputation Systems
zkML can be used to build transparent and verifiable reputation systems for decentralized applications, ensuring trust and accountability among participants. A closely related case is a recommendation system that works from your private preferences. Think of a scenario where:
- With zkML, your preferences (like the movies you’ve watched or the products you’ve bought) can be encrypted. It’s like putting your preferences in a magic box that the recommendation system can work with but cannot look inside.
- The ML recommendation system can perform calculations on the encrypted preferences without decrypting them. It’s like the system is doing math on the outside of the magic box without looking inside. This allows it to make personal recommendations without knowing the specifics of your preferences.
- Since your preferences are always encrypted, your personal details remain private. The recommendation system doesn’t need to know your exact preferences; it just needs to know enough to make good suggestions. This way, you get personalized recommendations without sacrificing your privacy.
Healthcare
zkML can securely analyze medical data and enable collaborative research without compromising patient privacy. For instance:
- zkML enables healthcare institutions and researchers to collaborate on building predictive models without sharing raw patient data. With homomorphic encryption, computations can be performed on encrypted data, allowing multiple parties to jointly train models without revealing individual patient records. This facilitates large-scale collaborative research efforts, potentially leading to advancements in disease prediction, treatment optimization, and epidemiological studies.
- Healthcare organizations often possess vast amounts of patient data that could yield valuable insights. zkML allows these organizations to perform analytics on encrypted health data without decrypting it. This ensures patient privacy is maintained throughout the data analysis process, making it possible to glean meaningful information without compromising confidentiality.
- Pharmaceutical companies and research institutions can use zkML to collaborate on clinical trials and drug discovery initiatives. By training models on encrypted patient data, organizations can collectively analyze the efficacy and safety of drugs without revealing proprietary information. This not only accelerates the drug development process but also ensures the confidentiality of sensitive research data.
- zkML can enhance the security of health information exchanges (HIEs) by allowing different healthcare providers to share insights and collaborate on patient care without exchanging raw patient data.
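The homomorphic encryption mentioned in the first bullet can feel abstract, so here is a toy sketch of what "computing on encrypted data" means. It uses the Paillier cryptosystem, which supports adding ciphertexts and multiplying them by plain integers. The primes are tiny and insecure, the names are illustrative, and the "model" is a hypothetical linear risk score with non-negative integer weights. This is not how any specific zkML product is implemented.

```python
import math
import secrets

# Toy primes (insecure, for illustration only).
P_PRIME, Q_PRIME = 1009, 1013
N = P_PRIME * Q_PRIME
N_SQ = N * N
# Private key: lambda = lcm(p - 1, q - 1) and its inverse modulo N.
LAM = (P_PRIME - 1) * (Q_PRIME - 1) // math.gcd(P_PRIME - 1, Q_PRIME - 1)
MU = pow(LAM, -1, N)                        # modular inverse (Python 3.8+)


def encrypt(m):
    """Encrypt an integer 0 <= m < N using only the public value N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c):
    """Decrypt with the private key (LAM, MU)."""
    return ((pow(c, LAM, N_SQ) - 1) // N * MU) % N


def encrypted_score(enc_features, weights):
    """Weighted sum computed on ciphertexts only; no decryption happens here."""
    acc = encrypt(0)
    for c, w in zip(enc_features, weights):
        # Raising a ciphertext to w multiplies the hidden value by w;
        # multiplying two ciphertexts adds their hidden values.
        acc = (acc * pow(c, w, N_SQ)) % N_SQ
    return acc


patient_features = [3, 5, 2]                # known only to the data owner
model_weights = [2, 1, 4]                   # known only to the analyst

enc = [encrypt(x) for x in patient_features]
enc_result = encrypted_score(enc, model_weights)
print("decrypted score:", decrypt(enc_result))
```

The analyst who runs `encrypted_score` never sees the features or the score. A zero-knowledge proof can then be layered on top to convince the data owner that the agreed model was the one actually applied.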
Finance
zkML can be used to facilitate confidential risk assessment, fraud detection, or personalized experiences. Practically speaking:
- Imagine you’re using a banking app or an online financial service. These platforms use machine learning to analyze your spending patterns, make predictions, and offer financial advice. Now, you might be concerned about sharing sensitive information like your income, expenses, and investment details. This is where zkML comes into play.
- With zkML, your financial data — like how much money you have, your investment portfolio, and your spending habits — can be encrypted. Think of it like putting all this information in a secure digital vault. The magic of zkML ensures that this vault can be used for calculations without anyone seeing the specifics of your financial situation.
- The financial platform, using zkML, can perform calculations on the encrypted data without decrypting it. It’s like a financial advisor who can give you advice without knowing exactly how much money you have or where you’re spending it. This protects your privacy while still providing you with useful insights and recommendations.
- Let’s say you’re applying for a loan. zkML can be used to assess your creditworthiness without exposing your detailed financial history. The lender can make decisions based on encrypted information, ensuring that sensitive details are kept private while still allowing for fair evaluations.
- Banks and financial institutions use machine learning for fraud detection. zkML enhances this process by allowing the analysis of encrypted transaction data. It’s like having a security system that can detect unusual patterns and potential fraud without having to look at the specifics of individual transactions.
Conclusion
zkML is a promising and exciting field that can bring together the best of AI and blockchain. By leveraging ZKPs, it can provide privacy, verifiability, scalability, and efficiency for ML models and data, and enable new applications that benefit from the synergy of the two technologies. zkML is still in its infancy, and there are many open challenges and opportunities for further research and development.
