Thursday, July 18, 2019

Robust Against Manipulation

As algorithms get more sophisticated, so do the opportunities to trick them. An algorithm can be forced or nudged to make incorrect decisions, in order to yield benefits to a (hostile) third party. John Bates, one of the pioneers of Complex Event Processing, has raised fears of algorithmic terrorism, but algorithmic manipulation may also be motivated by commercial interests or simple vandalism.

An extreme example of this could be a road sign that reads STOP to humans but is misread as something else by self-driving cars. Another example might be false signals that are designed to trigger algorithmic trading and thereby nudge markets. Given the increasing reliance on automatic screening machines at airports and elsewhere, there are obvious incentives for smugglers and terrorists to develop ways of fooling these machines - either to get their stuff past the machines, or to generate so many false positives that the machines aren't taken seriously. And of course email spammers are always looking for ways to bypass the spam filters.

"It will also become increasingly important that AI algorithms be robust against manipulation. A machine vision system to scan airline luggage for bombs must be robust against human adversaries deliberately searching for exploitable flaws in the algorithm - for example, a shape that, placed next to a pistol in one's luggage, would neutralize recognition of it. Robustness against manipulation is an ordinary criterion in information security; nearly the criterion. But it is not a criterion that appears often in machine learning journals, which are currently more interested in, e.g., how an algorithm scales upon larger parallel systems." [Bostrom and Yudkowsky]

One kind of manipulation involves the construction of misleading inputs (known in the literature as "adversarial examples"). For example, an attacker might exploit the inaccuracies of a specific image recognition algorithm to produce an image that will be incorrectly classified, thus triggering an incorrect action (or suppressing the correct action).
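To make the idea concrete, here is a minimal sketch of an adversarial example against a toy linear classifier. It is an illustration only, not a real image recognition system: the weights, the input and the function names are all made up for the purpose. The perturbation nudges every feature by a small fixed amount in the direction that most reduces the classifier's score, in the spirit of the "gradient sign" attacks discussed in the adversarial examples literature.

```python
# Toy illustration of an adversarial example (all values are hypothetical).

def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def score(weights, bias, x):
    """Linear score: positive means class 1, otherwise class 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def predict(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0

def adversarial_perturbation(weights, x, label, epsilon):
    """Shift each feature by at most epsilon, pushing the score away from `label`.

    For a linear model the gradient of the score with respect to the input
    is just the weight vector, so the sign of each weight gives the direction.
    """
    direction = -1 if label == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]

# Hypothetical model and input, chosen so the input sits near the decision boundary.
weights = [0.5, -0.25, 0.75, -0.5]
bias = -0.5
x = [1.0, 0.5, 0.5, 0.25]

original_label = predict(weights, bias, x)
x_adv = adversarial_perturbation(weights, x, original_label, epsilon=0.1)
adversarial_label = predict(weights, bias, x_adv)

print("original:", original_label, "adversarial:", adversarial_label)
```

No single feature moves by more than 0.1, yet the predicted class changes. Real attacks on image classifiers work on far larger models, but the underlying weakness is similar: many small, coordinated changes can add up to a large change in the output.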

Another kind of manipulation involves poisoning the model - deliberately feeding a machine learning algorithm with biased or bad data, in order to disrupt or skew its behaviour. (Historical analogy: manipulation of pop music charts.)
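A rough sketch of poisoning, under heavily simplified assumptions: a hypothetical spam filter learns a single threshold as the midpoint between the average "spamminess" score of known ham and known spam. The scores, labels and function names below are invented for illustration. If an attacker can get a few high-scoring messages labelled as ham into the training data, the learned threshold drifts upwards and borderline spam starts to get through.

```python
# Toy illustration of training-data poisoning (all data is hypothetical).

def train_threshold(samples):
    """Learn a threshold as the midpoint between the ham mean and the spam mean."""
    ham = [value for value, label in samples if label == "ham"]
    spam = [value for value, label in samples if label == "spam"]
    return (sum(ham) / len(ham) + sum(spam) / len(spam)) / 2

def classify(threshold, value):
    return "spam" if value > threshold else "ham"

# Clean training data: (spamminess score, label).
clean = [(1, "ham"), (2, "ham"), (3, "ham"),
         (7, "spam"), (8, "spam"), (9, "spam")]

# The attacker injects high-scoring messages that are wrongly labelled as ham.
poison = [(9, "ham")] * 4

clean_threshold = train_threshold(clean)
poisoned_threshold = train_threshold(clean + poison)

borderline_message = 6.5
print("clean model:   ", classify(clean_threshold, borderline_message))
print("poisoned model:", classify(poisoned_threshold, borderline_message))
```

The attacker never touches the algorithm itself, only the data it learns from. This is why the provenance and integrity of training data matter as much as the code.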

We have to assume that some bad actors will have access to the latest technologies, and will themselves be using machine learning and other techniques to design these attacks. This sets up an arms race between the good guys and the bad guys. Is there any way to keep advanced technologies from getting into the wrong hands?

In the security world, people are familiar with the concept of Distributed Denial of Service (DDoS). But perhaps this now becomes Distributed Distortion of Service, which may be more subtle but no less dangerous.

While there are strong arguments for algorithmic transparency of automated systems, some people may be concerned that transparency will aid such attacks. The argument here is that the more adversaries can discover about the algorithm and its training data, the more opportunities they have for manipulation. But it would be wrong to conclude that we should keep algorithms safe by keeping them secret ("security through obscurity"). A better conclusion would be that transparency should be a defence against manipulation, by making it easier for stakeholders to detect and counter such attempts.

John Bates, Algorithmic Terrorism (Apama, 4 August 2010); To Catch an Algo Thief (Huffington Post, 26 February 2015)

Nick Bostrom and Eliezer Yudkowsky, The Ethics of Artificial Intelligence (2011)

Ian Goodfellow, Patrick McDaniel and Nicolas Papernot, Making Machine Learning Robust Against Adversarial Inputs (Communications of the ACM, Vol. 61 No. 7, July 2018), pp. 56-66. See also video interview with Papernot.

Neil Strauss, Are Pop Charts Manipulated? (New York Times, 25 January 1996)

Wikipedia: Security Through Obscurity

Related posts: The Unexpected Happens (January 2017)
