Summary

Google's introduction of the Web Environment Integrity (WEI) API marks a significant shift in online security, aiming to enhance the integrity of web environments and combat issues like AI-driven web scraping.

Abstract

The article discusses the advent of Google's Web Environment Integrity (WEI) API, which is poised to usher in a new era of online security. The author reflects on a previous post about the broader issue of web scraping by AI companies, exemplified by challenges faced by platforms like Twitter and Reddit. The WEI API, detailed in a technical document and initial proposal, allows developers to retrieve tokens to verify the trustworthiness of a web environment, leveraging platform-level attesters and asymmetric cryptography. The author also touches upon Firebase's App Check feature, which uses attestation to authenticate apps, and the limitations of reCAPTCHA v3 on the web, which can incorrectly deny access to legitimate users. The article posits that WEI could address these issues by enabling the operating system to validate browser integrity, despite concerns about potential negative impacts on the open internet. The author, initially skeptical of HTTPS, now acknowledges its security benefits and similarly views WEI as a positive step, albeit with reservations about its implementation and the potential for it to be limited to certain browser/OS combinations initially.

Opinions

The author views the current state of web scraping by AI companies as a significant problem, with platforms like Twitter and Reddit being symptomatic of this larger issue.
The WEI API is seen as a response to these security challenges, with the potential to improve the integrity of web environments.
Attestation, as used in Firebase's App Check, is recognized as an effective tool for ensuring app legitimacy, though it is not without its issues, such as false rejections and lack of advanced analysis tools.
reCAPTCHA v3 is criticized for its potential to incorrectly deny access to humans and for the extensive user data it collects, which could infringe on privacy.
The author is optimistic about WEI's ability to solve the problems associated with reCAPTCHA v3 and enhance user privacy.
There is skepticism about the discourse surrounding WEI, particularly claims that it could destroy the open internet, with the author arguing that it is an optional API and that the open web is already compromised by measures like HTTPS.
The author acknowledges the security advantages of HTTPS, despite initial reservations, and suggests that WEI could offer similar benefits.
Concerns are raised about the practicality of WEI, such as the possibility of OS or browser modifications that could undermine its security intentions.
Overall, the author sees WEI as a positive development for online security, despite its imperfections and potential limitations during its initial rollout.

Google’s Upcoming API Signals a New Era For Online Security

The New Web Environment Integrity (WEI) API

A while ago I wrote a post arguing that Twitter and Reddit are just symptoms of a much larger problem: web scraping by AI companies.

Twitter And Reddit Are Just Symptoms Of A Much Larger Problem

I call it the Scrapening. I have another article coming out about it later. Anyway, maybe in response to the Scrapening, maybe for unrelated reasons, Google has come out with a new API: the Web Environment Integrity (WEI) API. You can read about it here.

It’s more of a technical document though. It goes into way more detail than you need and at the same time tells you nothing. It’s meant for people trying to implement it. A more interesting piece about it is the initial proposal here:

This is a new JavaScript API that lets web developers retrieve a token to attest to the integrity of the web environment. This can be sent to websites’ web servers to verify that the environment the web page is running on is trusted by the attester. The web server can use asymmetric cryptography to verify that the token has not been tampered with. This feature relies on platform level attesters (in most cases from the operating system).

Ah, attestation. This is something I’ve been playing around with in my apps, because Firebase has a new ‘App Check’ feature that uses attestation to tell whether an app is legitimate. I’m not exactly sure how it works, but it appears to use the Play Integrity API on Android, the DeviceCheck and App Attest APIs on iOS, and the reCAPTCHA v3 API on web. Before making a backend request, the app contacts one of these APIs to request an attestation token. I have no idea how these tokens are generated, but I assume the APIs check the hash of the app to ensure it has not been modified.
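As a rough sketch of that round trip: an attester judges the app by its hash (the guess above, not a documented behavior of any of these APIs), signs the verdict with a private key, and the server checks the signature with the matching public key. Everything here is an assumption made for illustration: the token fields (`verdict`, `request_id`), the stand-in app bytes, and the key pair, which is textbook RSA on a tiny modulus. A real attester would use a vetted crypto library and hardware-backed keys.

```python
import hashlib
import json

# Toy textbook-RSA key pair built from two Mersenne primes. A ~150-bit modulus
# with no padding is far too weak for real use; it only illustrates the idea.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, held by the attester

# Digest of the released build, recorded ahead of time (stand-in bytes).
KNOWN_GOOD = hashlib.sha256(b"app release build 1.0").hexdigest()


def _digest_int(data: bytes) -> int:
    """Hash the data and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def attester_issue_token(app_bytes: bytes, request_id: str) -> dict:
    """Attester side: judge the app by its hash, then sign the verdict."""
    unmodified = hashlib.sha256(app_bytes).hexdigest() == KNOWN_GOOD
    verdict = "trusted" if unmodified else "untrusted"
    payload = json.dumps({"verdict": verdict, "request_id": request_id},
                         sort_keys=True)
    signature = pow(_digest_int(payload.encode()), D, N)
    return {"payload": payload, "signature": signature}


def server_accepts(token: dict) -> bool:
    """Server side: verify with the public key, then read the verdict."""
    expected = _digest_int(token["payload"].encode())
    if pow(token["signature"], E, N) != expected:
        return False  # payload or signature was altered after signing
    return json.loads(token["payload"])["verdict"] == "trusted"


genuine = attester_issue_token(b"app release build 1.0", "req-123")
modified = attester_issue_token(b"app release build 1.0 + cheat", "req-124")
# An attacker edits the verdict but cannot produce a matching signature.
forged = dict(modified, payload=modified["payload"].replace("untrusted", "trusted"))

print(server_accepts(genuine))   # True
print(server_accepts(modified))  # False: signature is valid, verdict is not
print(server_accepts(forged))    # False: verdict was edited after signing
```

The server only ever needs the public values `N` and `E`, which is the point of using asymmetric cryptography: it can check tokens without being able to mint them.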

It works pretty well but I don’t enforce anything yet because I get a lot of false positives. Or should that be false negatives? False incorrect attestation rejections. I think it’s coming from one of my debugging apps but I’m not sure. I hope Firebase adds more advanced tools to tell which apps are not attested.

Otherwise attestation works really well… except on web. The problem is reCAPTCHA v3. reCAPTCHA v3 isn’t your parents’ CAPTCHA. It does not ask you to solve one of those manual challenges. Instead it relies on a few signals to determine whether a human is using the site. These include mouse movements, typing, and navigation patterns.
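On the server side, those signals arrive as a single number: the reCAPTCHA v3 verification response carries a `score` between 0.0 and 1.0 along with `success` and `action` fields, and the site picks its own cutoff. The sketch below assumes the response has already been fetched and parsed into a dict. The sample responses, the `login` action name, and the 0.5 threshold are illustrative choices, not values taken from a real deployment.

```python
def allow_request(verification: dict, threshold: float = 0.5,
                  expected_action: str = "login") -> bool:
    """Decide whether to let a request through based on a reCAPTCHA v3 result."""
    if not verification.get("success", False):
        return False  # token was invalid or expired
    if verification.get("action") != expected_action:
        return False  # token was issued for a different action
    return verification.get("score", 0.0) >= threshold


# Hand-written sample responses, not real traffic.
confident_human = {"success": True, "score": 0.9, "action": "login"}
hesitant_human = {"success": True, "score": 0.3, "action": "login"}

print(allow_request(confident_human))                # True
print(allow_request(hesitant_human))                 # False: a real person, rejected
print(allow_request(hesitant_human, threshold=0.2))  # True with a looser cutoff
```

Wherever the cutoff sits, some real people score below it and some bots score above it, so tuning the threshold trades one kind of error for the other.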

This approach is good… but it also results in a lot of humans being incorrectly denied. I can see all the ‘reCAPTCHA failed’ error messages, and there are a lot of them. It is also a lot of information to be sending to a service. It is already possible to determine a person’s identity with just mouse movements. And with even more data like screen resolution, user agent, language, and IP address, it’s probably even easier to identify users. Certainly companies like Apple do not like this form of tracking. So what do we do?

Enter Web Integrity

I see Web Environment Integrity as a way to solve all of these problems. It is a way for the operating system to validate the integrity of a browser.

But what I don’t get is the amount of… negative discourse around this idea. People are saying how it’ll destroy the open internet. Well, no, it’s just another API for people to use. You don’t actually have to use it if you don’t want to. And we’ve already had things like this in the form of reCAPTCHA v3, just less efficient.

And the open web has been dead for a while now. The most prominent example is HTTPS (HTTP over TLS, the successor to SSL). You need a certificate to use HTTPS, and for years that meant paying for one. And who issues the certificates? The SSL Mafia, that’s who.

That’s a really nice site you have there. It would be a shame if any browsers marked it as insecure.

I was originally negative on HTTPS. I thought it was unnecessary and just an excuse to line the pockets of the big tech companies. And it can break a site if someone forgets to renew an SSL certificate. You can still use the web without HTTPS but not without constantly being reminded that the site you’re using is ‘insecure’.

But I’ve come around to HTTPS a little. I’m still not sure if it’s the best way to handle it, but I can’t deny that there are certain attacks that HTTPS makes impossible. Well, impossible unless you’re Tom Cruise breaking into a certificate authority.

And because HTTPS is built into the browsers it’s way more intrusive than this new API. As I said before: don’t like it? Don’t use it.

Oh, and one more thing. People are complaining that Google is controlling this. And it is, at first.

We initially support this only for Android platforms (Android, and Android WebView). This feature requires an attester backed by the target platform so it will require active integration per platform.

But I suspect all major browser/OS combos will support it soon. It is meant to be an open standard, after all.

Final Thoughts

However I’m not really sure how secure this is. What’s stopping an attacker from modifying the OS to automatically validate any attestation requests? If it’s your machine you can do whatever you want to it.

Then again, reCAPTCHA v3 is not foolproof either. And perhaps the big companies are going to lock down this API pretty hard, so that you would have to be a skilled attacker to pull it off.

But then there’s the browser. What would stop someone from modifying a browser, or forking Chromium and starting a new one? Questions, questions. I suspect only a few browser/OS combos will be able to use this for a while. All the other browsers will be forced to use the less reliable, more privacy-invasive reCAPTCHA, or force users to solve an actual CAPTCHA.

Overall I do see Google’s Web Environment Integrity API as a positive step forward. It’s not perfect. Nothing on the web is. But it’s a step forward.

Thank you for reading this article. One of my favourite aspects of Medium is the vibrant community of readers who engage with my writing. So feel free to share your insights, jokes, critiques, or any topics you’d like me to explore in future articles. I do read them. Eventually.
