“We need to get started with AI.”

Prof. Dr. Ute Schmid is Professor of Cognitive Systems at the Faculty of Information Systems and Applied Computer Sciences at the University of Bamberg and a member of the Board of Directors of the Bavarian Research Institute for Digital Transformation. She also heads the Fraunhofer IIS project group Explainable Artificial Intelligence.


Hardly any other technology carries hopes as high as artificial intelligence (AI). But there are also concerns: Will machines soon be smarter than us? Will robots be in charge then? Professor Dr. Ute Schmid, AI expert from the University of Bamberg, shines a light into this black box.

Professor Dr. Schmid, for years artificial intelligence has been portrayed as a huge gray area between the promise of salvation and primal fears. You encourage all people to deal with the topic of AI. Why?
Prof. Dr. Ute Schmid: AI technology will dramatically change many areas of life and of the economy. To better assess the potential positive and negative effects of AI applications, a basic understanding of AI methods, especially machine learning, is important. Let me illustrate this with a simple example – recommender systems for news, search results and products. These are based on classical approaches to information retrieval, enhanced by AI methods. People who use these technologies every day need to understand what Google’s PageRank does, why we end up in filter bubbles and what the effects are – for example on opinion-forming processes.
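To make the idea behind a page rank tangible: at its core, PageRank is an iterative computation over the link graph that repeatedly passes each page’s score on to the pages it links to. The following is a minimal sketch of that power-iteration idea in Python; the four-page link graph and the damping value are invented for illustration, and real search engines combine link structure with many other signals.

```python
# Minimal PageRank sketch (power iteration) over an invented four-page link graph.
# Illustrative only: a real search engine works on billions of pages and
# combines link structure with many other ranking signals.

links = {            # page -> pages it links to
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
pages = list(links)
damping = 0.85
rank = {p: 1 / len(pages) for p in pages}

for _ in range(50):  # iterate until the scores stabilize
    new_rank = {}
    for p in pages:
        # rank flowing into p from every page that links to it
        incoming = sum(rank[q] / len(links[q]) for q in pages if p in links[q])
        new_rank[p] = (1 - damping) / len(pages) + damping * incoming
    rank = new_rank

print(sorted(rank.items(), key=lambda kv: -kv[1]))
# Page "C" collects the most links, so it ends up with the highest score.
```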

What is the situation in companies, where AI is seen as a technology that determines future viability?
Information and education are absolutely necessary in companies too. They generally have very high expectations where the use of deep learning methods is concerned. However, they often lack the understanding that using such data-intensive methods can be very expensive and require a lot of effort. After all, a great deal of data of very high quality must be available in order to learn robust and reliable models. By now, however, there are more and more initiatives to teach AI skills. The German Federal Ministry of Education and Research, for example, is currently spending a lot of money to promote AI skills in companies.

What should the measures target?
I believe that in a company all employees should have at least a basic idea of how AI systems work. Depending on the position in the company, that knowledge should be more or less broad and more or less deep. While a clerk needs to understand how data influences the quality of a learned model – say, one used to predict demand for a quarter – strategic management needs to know as much as possible about deep learning approaches and other AI methods in order to make a meaningful assessment of possible applications. In conversations I often hear the following slogan being bandied about in companies: “We need to get started with AI.” If you then look at what tasks and issues are actually to be dealt with, it often turns out that standard approaches are quite sufficient or even more suitable.

Differentiation seems difficult with all those buzzwords and application scenarios flying about.
At the moment, there is quite a confusion of terms. AI is sometimes equated with machine learning or even directly with deep learning. Even digitization and AI are often used synonymously. Digitization means the transformation of analog information into digital formats so that this information can be processed by computer algorithms. AI methods should only be used when there are no clearly predefined rules or when the problems are so complex that a solution cannot be computed efficiently. Blockchain technologies and robotics are not AI either.

You have recently been awarded a prize for explaining AI in simple terms; you bring the subject to primary schools and have published a book entitled “AI programming for dummies – Junior”. So what exactly is AI?
A well-known definition is that AI research is concerned with solving problems that people can currently still solve better. I like this definition because it does without the term “intelligence”. In everyday life, this term is used in a way that does not really describe what AI research is all about. Typically, we consider people intelligent if they play chess very well or have studied physics. By contrast, we do not find it impressive if someone can build a stable tower out of building blocks or mix an apple spritzer. However, the latter two activities are much more difficult to implement in an AI system that can perform them reliably under different conditions. They require a wide range of problems to be solved – from object recognition to action planning. Often, a certain amount of world knowledge is needed – so-called common-sense knowledge – which is difficult to formalize and which every child has: for example, the knowledge that a tower will not be stable if you put a cuboid on top of a pyramid. Conclusions like that come easily to people, while they tend to find arithmetic and the strategic forward planning needed to play chess rather difficult.

What types of problems are researchers aiming to solve with the help of AI?
AI methods are mainly used when one of two conditions applies. Either the problem lies in a very complex domain, where a solution is hidden in the search space like a needle in a haystack. These are the so-called exponentially complex problems, where it is not possible to simply search through all variants with a standard algorithm; you can find such problems in logistics, for example. Or the problem cannot be fully modeled – in this case machine learning can be used to approximate models from data. In most other problem areas, standard methods of software development are sufficient.
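To give a feeling for how quickly such search spaces explode, here is a back-of-the-envelope sketch in Python that counts the possible round trips through a handful of delivery stops; the stop counts are arbitrary and chosen purely for illustration.

```python
from math import factorial

# Number of distinct round trips through n delivery stops (start fixed,
# travel direction ignored): (n - 1)! / 2. A brute-force search would have
# to look at every single one of them in the worst case.
for n in (5, 10, 15, 20):
    tours = factorial(n - 1) // 2
    print(f"{n:2d} stops -> {tours:,} possible tours")

# 20 stops already yield roughly 6 * 10^16 tours - far too many to enumerate,
# which is why such logistics problems call for heuristic or AI search methods.
```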

So what caused the hype surrounding AI a few years ago, apart from chess, Jeopardy and Go championships?
About ten years ago, an artificial neural network won a competition in image recognition. In contrast to classical methods of image processing (computer vision), this approach made it possible to learn directly from the image data which object was visible in an image. In addition, the neural network correctly recognized far more images than the methods used previously. This so-called convolutional neural network allows end-to-end learning, i.e. learning a classifier directly from the raw data, in this case from bitmaps. Previously in machine learning, even with classical neural networks, features always had to be extracted from the raw data first so that the programs could learn from these feature vectors. This “feature extraction” is a lot of work, and with complex problems you have to think hard about what kind of features to extract from the raw data. Since the new deep learning approaches allow raw data to be fed into neural networks unprocessed, many people hope to obtain models that can be used in complex areas for which no good solutions have been found so far – without going to a lot of trouble. There is the hope that thinking can be replaced by data, combined with the belief that data is cheaper to come by than employees who are able to think up good solutions.
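As a rough illustration of what end-to-end learning means in practice, here is a minimal convolutional network written with PyTorch (a framework chosen here purely for illustration; the interview names no specific library). Raw bitmaps go in, class scores come out, and no hand-crafted feature extraction sits in between.

```python
import torch
import torch.nn as nn

# Minimal sketch of a convolutional network that learns directly from raw
# pixels (end-to-end), instead of from hand-engineered feature vectors.
class TinyConvNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # raw RGB bitmap in
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)          # learned filters replace manual features
        return self.classifier(x.flatten(1))

# A batch of raw 32x32 RGB images goes straight in; class scores come out.
images = torch.randn(4, 3, 32, 32)
print(TinyConvNet()(images).shape)    # torch.Size([4, 10])
```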

So where is the fallacy, in data or in thinking?
What is often overlooked in data-intensive deep learning approaches is that you need a lot of training data, which must, firstly, be sampled as representatively as possible and, secondly, carry correct class labels. If the data does not adequately reflect the distribution of classes, you end up with unfair models that discriminate against people of certain ethnic groups, for example. In addition, gross errors can occur due to overfitting – the learned model uses irrelevant information to make decisions. Assigning correct class labels to very large amounts of training data can quickly become the bottleneck for using deep nets. Animal pictures or traffic signs can still be labeled quite reliably and inexpensively, for example by crowdsourcing. But in order to distinguish good parts from bad parts or to classify tumors in tissue sections, the data must be labeled by experts. This is what makes data expensive as well – and thinking is necessary to obtain data of good quality.
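A small, self-contained sketch (with made-up numbers) shows why skewed training data is misleading: on a data set where only about 1 % of the examples are “defective parts”, a model that has effectively learned nothing but the majority class still scores around 99 % accuracy, yet never finds a single defect.

```python
import random

random.seed(0)

# Invented, highly imbalanced data set: ~1 % "defective" (1), ~99 % "good" (0).
labels = [1 if random.random() < 0.01 else 0 for _ in range(10_000)]

# A "model" that has effectively learned nothing but the majority class.
predictions = [0] * len(labels)

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
defects_found = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))

print(f"accuracy: {accuracy:.1%}")                    # ~99 % - looks impressive
print(f"defective parts detected: {defects_found}")   # 0 - the model is useless
```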

Is that why banks and insurance companies are still hesitant to use AI methods?
Financial service providers have long used statistical methods to predict risk or the creditworthiness of customers, for example. In the 1990s, the EU project “StatLog” was launched, in which the machine learning algorithms available at the time – from decision trees to classical neural networks – were compared on many data sets. Among other things, it included the so-called German credit data set, which is still often used as a benchmark today. As far as I know, statistical and machine learning approaches are increasingly also being used by insurance companies to determine individual rates, for example by monitoring car drivers or analyzing fitness trackers.

Where do you see the biggest obstacles in implementation?
You need people who have sound expertise in artificial intelligence and machine learning. Most employees in software development did not learn about these topics when studying computer science, because until recently AI was an elective subject that was not chosen very often. You also need people who know the area of application very well. The two groups must understand enough of each other’s field to ensure that data and methods fit together. “Blindly” feeding a deep net with existing data usually goes wrong. It is a truism in statistics as in machine learning that the greatest influence on the quality of the result is the quality of the data – garbage in, garbage out.

How can companies bring their existing data into play here?
It is true that banks or hospital operators often have big data. But if the distribution of the data differs from the real world, or if the data is highly imbalanced because certain customer profiles or diseases are rare, the learned models are very inaccurate and new data is classified wrongly. Often you do not have big data available at all, but only “small data”. In current deep networks, more than half a million weights often have to be adapted. To train such nets well, you actually need many times that amount of training data, which, as already mentioned, must be annotated with the correct class labels.
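To make the figure of “more than half a million weights” tangible, here is a back-of-the-envelope count for a modest fully connected network; the layer sizes are invented for illustration, and even this small architecture already ends up with roughly 800,000 trainable parameters.

```python
# Back-of-the-envelope parameter count for a modest fully connected network.
# Layer sizes are invented for illustration; real deep nets are often larger.
layer_sizes = [784, 512, 512, 256, 10]   # e.g. a 28x28 input image, 10 classes

total_weights = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    total_weights += n_in * n_out + n_out   # weight matrix plus bias terms

print(f"{total_weights:,} trainable parameters")
# ~800,000 parameters - and every one of them must be fitted from labeled data.
```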

What negative consequences can this have?
Insufficient data leads to models that frequently make mistakes, for example because they are biased. Cases have become known where images of people of color were classified as gorillas or where female applicants were not considered for jobs in the IT sector. Especially in the human resources sector, it is claimed that AI systems make more objective decisions when screening applications. With an unreflected “data in, thinking out” approach, prejudices are even reinforced, because previous decisions by HR staff are used as “ground truth” labels to train the models. Without good data there can be no good decision-making model.

How will AI develop over the next five years?
The current hype surrounding artificial intelligence has brought a lot of dynamism into AI research and generated many exciting research questions. In science, there is an increasing exchange between different streams of AI, for example, cooperation between statistical machine learning and knowledge-based approaches. I believe that especially the development of hybrid approaches, where machine learning and logical inference are combined and complement each other, will advance research for some time to come. Last but not least, methods of so-called “explainable AI”, which try to make the decisions of black box models transparent and comprehensible, will become more and more important.

And how does AI affect the economy and society?
Currently, we are running the risk that deep learning methods will be used in real life without reflection. The resulting models will be highly flawed if there is not enough data of the required quality or if the problems to be solved do not fit the methods. There is a danger that users will conclude that AI is useless, even though the unsatisfactory results are due to the incorrect use of the methods, not to the methods themselves. But this is not unusual – in the history of AI there have been several so-called AI winters following hype phases. However, each new wave has also led to some of the methods and technologies from pure research being transferred into practice – and no longer even being perceived as AI methods. In my opinion, we have reached a point where the future use of AI systems will be decided: will it support the economy and our society in a helpful way, for example in education and health, or will it deprive people of competences? In order to avoid this “sorcerer’s apprentice effect”, we need to engage in a broad social discourse now and ask ourselves how we want to live, learn, work, produce and do business with AI in the future.