Shades of Privacy

What is privacy in AI? The reason there is no simple answer is itself not simple to explain. Even without AI, there are degrees of privacy. In the most privacy-friendly setting, you run an app such as PianoMeter: it does some calculation on your phone, shows you the result, and that’s it. A less private case is something like Trello: you send your data to the company, it is processed somewhere, and you get to see the result. Ideally nobody else can access your data, but in reality there is a long slide, beginning with company insiders being able to look up your data (think Uber employee stalkers) and ending with apps whose whole purpose is to sell your data to whoever will buy it.

With AI involved, the problem gains a whole new dimension: somebody has to provide the training data. That somebody is often you, the user, because user data is the best data: it reflects the actual statistical distribution of the users. Such data is so valuable that giving it away for training is often the condition for using the app at all. Consider an app that analyzes naked pictures of you and calculates something useful about your skin. To use it, you must not only hand your naked pictures to the company, but also agree to share them with third-party institutions. Of course these institutions are “carefully selected”, otherwise you probably wouldn’t allow them to see your naked butt. But wouldn’t you? There is no choice here. Either you agree to share your pictures with an unknown number of grad students in all sorts of places, or you get no help detecting skin cancer.

Does it have to be this way? Do we just get used to the fact that whatever gets processed by any of these apps takes on a life of its own in the cloud, passing from third parties to fourth parties and beyond? Wouldn’t it be nice to go back to the PianoMeter situation, with naked pictures processed on the phone and the AI trained without uploading or sharing them?

A short systematic overview was put together by Patricia Thaine, opening with this beautifully misleading schema:

Four pillars of Private AI

For now, perfectly privacy-preserving AI is still a research problem, but there are a few tools that can address some of the most urgent privacy needs.
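The article doesn’t name these tools here, but one of the most commonly cited building blocks is differential privacy: before releasing an aggregate statistic computed over user data, you add calibrated random noise so that no single user’s record can be inferred from the result. A minimal sketch of the classic Laplace mechanism for a bounded mean (the function names, bounds, and the epsilon value are illustrative, not from any particular library):

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Inverse-CDF sampling from a Laplace(0, scale) distribution.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_mean(values, lower, upper, epsilon):
    """Epsilon-differentially-private mean of bounded values.

    Each value is clipped to [lower, upper], so one user can shift the
    mean by at most (upper - lower) / n. Adding Laplace noise with
    scale = sensitivity / epsilon hides any individual contribution.
    """
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / n
    sensitivity = (upper - lower) / n
    return true_mean + laplace_noise(sensitivity / epsilon)

# Hypothetical example: skin scores from 1000 users, each in [0, 1].
random.seed(0)
scores = [random.random() for _ in range(1000)]
print(private_mean(scores, 0.0, 1.0, epsilon=1.0))
```

The company still learns the population statistic it needs, but the published number is deliberately fuzzy at the level of any single user, and smaller epsilon means more noise and stronger privacy.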

So there is hope! A lot of math is involved, which makes building private AI harder than building an AI with everybody’s naked pictures sitting on its disks. Do we even care enough to do the math? I’m sure some people do, and there may be a push to make this the new normal: back to apps that just do what you need, no stalkers involved.