Pretend AI, aka Microsoft Recall

Prof Bill Buchanan OBE FRSE
6 min read · Nov 2, 2024

“If you want to keep a secret, you must also hide it from yourself.”
― George Orwell, 1984

Microsoft has again delayed its Recall feature in order to add more security to it. It was planned for release in May 2024, and then put back until October 2024. It is now likely to be released in December 2024, and security testers all over the world are rubbing their hands with glee at the opportunity to break it. When it was first announced, it was going to be on by default, but Microsoft has since backtracked on this. As it takes screenshots every few seconds and keeps a searchable record of them, it could be an opportunity to create a Big Brother world and a privacy nightmare. It is likely, though, that Microsoft will focus on adding biometric authentication to Recall (using Windows Hello) in order to address security concerns.

Preface

What’s the dumbest and most inefficient way to collect data from your computer? Take a screenshot every few seconds, and then play it back like a flickbook. It sounds like the worst student IT project in the world, and it is now a new feature on Copilot+ PCs. And so Microsoft, with all the power of OpenAI at its fingertips, has taken one of the dumbest routes to smart AI with the implementation of a Recall function.

When it was first announced, the feature was enabled by default, and some people identified that it was difficult to disable.

Introduction

I had a meeting on Friday, and an AI note-taking guest joined. Basically, it was there to take notes about what was said and produce transcribed minutes. So, like it or not, we can’t hide from AI bots. They now understand our speech, they learn how we speak, and they can almost instantly link us to multiple sources of data, gathering who, where and when, along with what we did and said. Basically, machines can now link almost every part of our lives, yet you may not find any log of your data, or even a database in which it is stored. In fact, “you” could be stored within a deep learning model, from which it would be almost impossible to extract yourself.

Your computer, smartphone and smart devices record a whole lot of things about your activities, such as in Internet caches, cookie stores, and other logs. But imagine if a machine could record every single thing that you did and then have the ability to play it all back. With this, you might remember that you had a file named “wobbly bridge”, and an AI agent could search back through all of your previous activity for a file with a similar name. Basically, the system would act like a photographic memory. All sounds great?

Now, imagine that someone also gains access to your computer and manages to steal data on every single action you have taken on a device. What might it reveal about your inner thoughts? Taken to the extreme, and combined with other information gathered from electronic devices, such as your location, it would amount to a complete recorded history of your life.

The Recall function

As a Mac user, I use Spotlight, and it is usually pretty good at finding things that I’ve lost. For Windows, there isn’t even anything like that. But that is all going to change with the release of Copilot+ PCs and their Recall function, which will take a screenshot every two seconds. Along with this, every action performed is also recorded. For many, the security risks are likely to be a barrier to adopting Copilot+, especially in a corporate setting, where there is the potential for a large-scale data breach.

Along with the cybersecurity risks, one must wonder whether Microsoft could start training machine learning models on users or feed targeted adverts to them.

Microsoft, though, has said that the user will be in control of the screenshots, can pick the applications recorded, and can pause or delete them at any time. The company will also have to show that Recall complies with the UK’s Data Protection Act and the EU’s General Data Protection Regulation (GDPR).

How does it work?

The system takes a screenshot every few seconds, converts it into text using OCR software with Azure, and stores the result in an SQLite database in the user’s folder. There is then a plaintext record of every piece of text that was viewed. The files are stored in the CoreAIPlatform folder within AppData.
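To make the risk concrete, here is a minimal Python sketch of the kind of pipeline described above, assuming the OCR step has already produced text. The table name `captures`, its columns and the sample rows are all hypothetical, as the real Recall schema is not covered here. It simply shows that once screen text sits in plaintext in SQLite, a single query is enough to find anything that was ever on screen.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    # Hypothetical schema for illustration; not the real Recall layout.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE captures (taken_at TEXT, window_title TEXT, ocr_text TEXT)"
    )
    return conn


def record_capture(conn, taken_at, window_title, ocr_text):
    # In a real pipeline ocr_text would come from OCR on a screenshot;
    # here it is passed in directly to keep the sketch self-contained.
    conn.execute(
        "INSERT INTO captures VALUES (?, ?, ?)", (taken_at, window_title, ocr_text)
    )
    conn.commit()


def search(conn, term):
    # Plaintext storage means a simple LIKE query finds any on-screen text.
    rows = conn.execute(
        "SELECT taken_at, window_title FROM captures "
        "WHERE ocr_text LIKE ? ORDER BY taken_at",
        (f"%{term}%",),
    )
    return rows.fetchall()


conn = make_store()
record_capture(conn, "2024-06-01T10:00:00", "Editor", "Draft notes on the wobbly bridge design")
record_capture(conn, "2024-06-01T10:00:02", "Browser", "Online banking: account 1234 balance")
record_capture(conn, "2024-06-01T10:00:04", "Mail", "Meeting moved to Friday")

print(search(conn, "banking"))
```

Anything that can open the database file can run the same query, which is rather the point.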

A demo of Microsoft engineers accessing the file:

https://cyberplace.social/system/media_attachments/files/112/535/509/719/447/038/original/7352074f678f6dec.mp4

The stored screenshots will likely be encrypted, but with the user’s encryption key. This means that when a user is logged in, all the screenshots will be decrypted and available to view. Malware installed on the computer will thus be able to view and copy the screenshots easily, as they will be as available to it as they are to the user.
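Here is a rough sketch of the problem. It is not how Windows actually protects the data: the toy cipher, the `UserSession` class and the `read_capture` helper are all assumptions made up for illustration. The key is derived when the user logs in, and any code running in that session can then decrypt the store just as the user's own apps can.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (illustration only): XOR data with a SHA-256 keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # zip() stops at len(data), so surplus keystream bytes are ignored.
    return bytes(d ^ s for d, s in zip(data, stream))


class UserSession:
    """Hypothetical login session that holds the user's key while logged in."""

    def __init__(self) -> None:
        self.key = None

    def log_in(self, password: str) -> None:
        # Derive the key from the password (fixed demo salt, low iteration count).
        self.key = hashlib.pbkdf2_hmac("sha256", password.encode(), b"demo-salt", 1000)

    def log_out(self) -> None:
        self.key = None


def read_capture(session: UserSession, blob: bytes) -> str:
    # Any code running inside the session gets the same access as the user.
    if session.key is None:
        raise PermissionError("store is locked: no user key in this session")
    return keystream_xor(session.key, blob).decode()


session = UserSession()
session.log_in("correct horse")
blob = keystream_xor(session.key, b"Online banking: balance shown on screen")

# "Malware" running as the logged-in user calls the same helper.
stolen = read_capture(session, blob)
print(stolen)

session.log_out()
try:
    read_capture(session, blob)
except PermissionError as err:
    print("After logout:", err)
```

The encryption only helps when the user is logged out or the disk is taken away; while the session is live, the key is there for whoever asks.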

A dream for info stealers, advertisers and law enforcement

The screenshot approach would be a dream for both law enforcement and advertisers, and would allow a complete playback of the trail of someone’s life. Added to all the other data that computers gather, we have a system that could record you from birth to death. This is the Internet we have created. Unfortunately, the methods we have used to create the Internet are not really fit for purpose when it comes to things like consent, privacy and trust.

For a digital investigator, the ability to view two-second screenshots and to search someone’s whole activity would be the best investigation tool known. And we can say the same about a cybercriminal, where a leak of your most sensitive secrets could be extremely embarrassing. The store of screenshots is thus likely to become a single target for malware writers. In fact, it builds up virtually everything that correlates:

  • Who? Gained from speech, face detection, device, login, etc.
  • With whom? Gained from dialogues of your conversations and correspondence with others.
  • Where? Gained from that smartphone in your pocket, and from IP/MAC address logs.
  • When? Timestamps and logs.
  • What? Gained from all the traces of your purchases.
  • Why? Your timeline of activity on your device.

If a machine can cross-correlate all of these things, it will be able to develop a deep understanding of our world, and how we interact with it.
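As a sketch of what this cross-correlation might look like in practice, the Python below merges a few separate logs into a single timeline keyed on user and time. The log names, field names and sample entries are all made up for illustration.

```python
from collections import defaultdict


def correlate(*logs):
    """Merge entries that share the same (user, time) into one record."""
    timeline = defaultdict(dict)
    for log in logs:
        for entry in log:
            key = (entry["user"], entry["time"])
            for field, value in entry.items():
                if field not in ("user", "time"):
                    timeline[key][field] = value
    # Sort by (user, time) so each person's trail reads in order.
    ordered = sorted(timeline.items(), key=lambda item: item[0])
    return [{"user": user, "time": time, **fields} for (user, time), fields in ordered]


# Hypothetical logs from different sources.
location_log = [{"user": "alice", "time": "10:00", "where": "cafe wifi"}]
purchase_log = [{"user": "alice", "time": "10:00", "what": "train ticket"}]
screen_log = [
    {"user": "alice", "time": "10:00", "with_whom": "bob (chat window)"},
    {"user": "alice", "time": "10:02", "what": "bank statement"},
]

profile = correlate(location_log, purchase_log, screen_log)
for record in profile:
    print(record)
```

Each log on its own says little; it is the join on who and when that builds the profile.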

Conclusions

I’m a Mac user, and if I want to find something on my machine, I just open up Spotlight, and it will find it for me. When I go to a Microsoft Windows machine, it just struggles to find things. Perhaps Recall is a way to address this in the simplest way possible: just keep taking screenshots. And so there’s a reason that this type of product does not exist in the market … it is just too risky!

Overall, there are many questions to ask here, especially around whether the logs will be encrypted and protected by strong access control. Does Microsoft have a good track record for this? Well, no! And all this data will take up lots of space, so will the cloud be used to store or archive it? If so, who might have access to it?

So, in the 21st Century, there will be no hiding places. Flipping Andy Warhol’s famous quote, we might have, “In the future, everyone will have privacy for 15 minutes”.
