2020-01-21

What is AI safety?

Tonya Hall sits down with Max Tegmark, scientist, author and co-founder of the Future Of Life Institute, to determine what AI safety looks like and how one can achieve this.

Is it possible to make artificial intelligence more reliable by involving a person in the decision making process of machine learning?

Maybe so, but there is a catch. That person had better be someone who knows a lot about what the neural network is trying to find out. And that is a conundrum, given that one of the most important promises of AI is to discover things that people don’t know.

It is a conundrum that a new piece of AI work by scientists at the Technische Universität Darmstadt in Germany sidesteps. Lead author Patrick Schramowski and colleagues propose that people should check a neural network’s explanations. The idea is to extend so-called “explainable AI” and “interpretable AI”. They argue that it is not enough to have an explanation of what a neural net does; people should be closely involved in fixing what goes wrong with it.

In doing so, Schramowski and colleagues hope that people will gain more confidence in machine learning.

“Learning and explaining interactively is necessary for the user to understand and build confidence in model decisions,” write Schramowski and colleagues in “Right for the Wrong Scientific Reasons: Revising Deep Networks by Interacting with their Explanations.”

Their solution is “XIL”, which stands for “explanatory interactive learning”, with the emphasis not only on explaining machine behavior, but also on the exchange between person and machine.

The work is strongly inspired by recent work by Sebastian Lapuschkin of the Fraunhofer Heinrich Hertz Institute in Berlin, who has written that neural networks can sometimes be like “Clever Hans”. Clever Hans was a famous horse that dazzled the public in the early 1900s by seeming to be able to count.

Further investigation revealed that Hans was only responding to cues from the people around him, such as a nod of the head. Despite the impressive performance of AI, Lapuschkin argues that it sometimes merely exploits quirks of the data set instead of learning truly relevant representations of a problem. People should therefore be a bit careful about all the “excitement about machine intelligence”.

Philosophically, the authors of the current work take a page from the algorithms guru Don Knuth. They agree with Knuth that “instead of imagining that our main task is to instruct a computer what to do,” the goal is to “concentrate rather on explaining to human beings what we want a computer to do.”

Schramowski and the team’s experimental set-up for XIL is to have a convolutional neural network solve a simple problem: classifying the phenotype of a plant as healthy or diseased. They have the convolutional net examine images of leaves of the sugar beet plant, a staple crop around the world, for signs of disease. They then visualize which features the network used, and have an expert in plant biology check where the neural network went wrong. If it has learned properly, the net should focus only on the dark spots on the leaves that indicate the “Cercospora Leaf Spot” disease.

Schramowski and colleagues from the Darmstadt University of Technology in Germany propose putting a person back in the loop in AI by having a domain expert correct where a neural net goes wrong.

As Schramowski and colleagues put it: “In every step the learner [the neural net] explains its interactive query to the domain expert, and she responds by correcting the explanation, if necessary, to give feedback […] we let an expert revise the learning of the machine by constraining the machine’s explanations to domain knowledge.”

Feedback in this case is formalized as an additional loss term, added to the two usual terms of “cross-entropy” and “L2 regularization” that are commonly used when training a neural net. That third term acts as a new constraint on the convolutional neural network.

In the case of the convolutional net that looks at beet leaves, the visualizations reveal that an uncorrected network sometimes looks at the wrong signals: it takes into account artifacts in the image that are not in the area of the leaf, such as the plate on which the leaf is lying. That is a mistake, an example of what you could call a naturally occurring adversarial example. Another name for it is a “confounder”, a variable that should not enter into the calculation.

They create a binary mask of the outline of the sugar beet leaf in each photo. It then becomes straightforward to use the loss function to penalize the neural net whenever it looks anywhere other than the leaf area of the photo. They find that accuracy can improve in some cases but, more importantly, that the accuracy now appears to rest on the right signals, which makes it more trustworthy.
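To make the idea of a mask-based penalty concrete, here is a minimal Python sketch of a three-part loss of the kind described above. It is an illustration under stated assumptions, not the authors’ code: the function names, the toy 4×4 arrays and the weighting factors `lam_expl` and `lam_l2` are all hypothetical, and the explanation heat map is simply passed in as an array rather than computed from a real network.

```python
import numpy as np

def cross_entropy(probs, label):
    """Negative log-probability the model assigns to the true class."""
    return -float(np.log(probs[label] + 1e-12))

def l2_penalty(weights):
    """Sum of squared weights over all parameter arrays."""
    return float(sum(np.sum(w ** 2) for w in weights))

def explanation_penalty(heatmap, leaf_mask):
    """Squared relevance that falls outside the leaf.

    leaf_mask is 1 inside the leaf outline and 0 elsewhere (assumed layout).
    """
    outside = 1.0 - leaf_mask
    return float(np.sum((outside * heatmap) ** 2))

def total_loss(probs, label, weights, heatmap, leaf_mask,
               lam_expl=1.0, lam_l2=1e-4):
    """Cross-entropy + explanation penalty + L2 regularization."""
    return (cross_entropy(probs, label)
            + lam_expl * explanation_penalty(heatmap, leaf_mask)
            + lam_l2 * l2_penalty(weights))

# Toy data: the "leaf" occupies the central 2x2 block of a 4x4 image.
leaf_mask = np.zeros((4, 4))
leaf_mask[1:3, 1:3] = 1.0

on_leaf = np.zeros((4, 4))
on_leaf[1, 1] = 1.0          # the network attends to a spot on the leaf

on_background = np.zeros((4, 4))
on_background[0, 0] = 1.0    # the network attends to the background

probs = np.array([0.2, 0.8])     # hypothetical class probabilities
label = 1                        # true class index
weights = [np.ones((2, 2))]      # hypothetical model parameters

loss_on_leaf = total_loss(probs, label, weights, on_leaf, leaf_mask)
loss_on_background = total_loss(probs, label, weights, on_background, leaf_mask)
print(loss_on_leaf, loss_on_background)
```

With identical predictions, the network that attends to the background pays a higher loss than the one that attends to the leaf, which is the pressure that would push training toward the “right” signals.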

On the right is an example of heat maps made with the Grad-CAM technique to show what a neural network is looking at, and on the left is a clustering of the solution strategies that the neural network uses.

Schramowski and colleagues build on a great deal of prior work in explainable AI. They take SpRAy, or “spectral relevance analysis”, from Lapuschkin and his colleagues, a program that makes heat maps of what convolutional networks “see” based on activations of neurons in different layers. SpRAy was itself built on a visualization technique developed by Bolei Zhou and colleagues at MIT in 2015 called “class activation maps” and refined in 2017 by Ramprasaath R. Selvaraju and colleagues at Georgia Institute of Technology into what are called “Grad-CAMs.” With Grad-CAMs, someone can create a heat map of a particular feature as it flows through the network from start to finish.
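For readers curious how such a heat map is computed, here is a minimal NumPy sketch of the core Grad-CAM step. It assumes the feature maps of the last convolutional layer and the gradients of the class score with respect to them have already been obtained from some framework; the function name `grad_cam` and the toy arrays are hypothetical, chosen only for illustration.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Core Grad-CAM step.

    activations: (K, H, W) feature maps of the last convolutional layer.
    gradients:   (K, H, W) gradients of the class score w.r.t. those maps.
    Returns an (H, W) non-negative heat map.
    """
    # One weight per feature map: the spatial average of its gradients.
    alphas = gradients.mean(axis=(1, 2))
    # Weighted sum of the feature maps, contracting over the K axis.
    cam = np.tensordot(alphas, activations, axes=1)
    # Keep only the regions that speak in favour of the class.
    return np.maximum(cam, 0.0)

# Toy example with two 2x2 feature maps.
activations = np.array([
    [[1.0, 0.0], [0.0, 0.0]],   # map 0 fires in the top-left corner
    [[0.0, 0.0], [0.0, 1.0]],   # map 1 fires in the bottom-right corner
])
gradients = np.array([
    np.ones((2, 2)),            # map 0 raises the class score
    -np.ones((2, 2)),           # map 1 lowers the class score
])

heat = grad_cam(activations, gradients)
print(heat)
```

In this toy case only the top-left corner survives in the heat map, because the second feature map counts against the class and is clipped away by the final step.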

The authors write in their conclusion that they hope to bring this interactive element to many other forms of explainable or interpretable AI, such as “Coactive Learning,” developed in 2015 by scientists at LinkedIn and Cornell University, and “human-led probabilistic learning,” developed in 2018 by scientists from Georgia Tech and UT Dallas.

What remains open at the end of the article is whether any of these approaches is applicable beyond very simple supervised classification of the kind they have demonstrated. Given that feature discovery in deep learning is supposed to find things that may well be unknown to a person, it is not clear how a person could step into the loop to correct the machine when it goes wrong, since the person’s own domain knowledge has, in a sense, presumably been surpassed.

But that doesn’t mean it can’t happen. At the very least, the work of Schramowski and colleagues offers a framework for how a person and a machine can work together, and a set of “desiderata,” things to strive for. It remains to be seen how broadly applicable it can be.