When AI meets big data with Núria Oliver

The relationship between AI and big data.

AI & Big data: the biggest development shaping our future?

In which specific areas are we seeing the most exciting advancements?

What about climate change?

Opportunities of public administrations using ADM systems to make decisions.

Impact of machine learning in the future.

Are algorithms biased?

What efforts are being made to counteract this apparent sexism?

Núria Oliver is a computer scientist with over 20 years of research experience in Artificial Intelligence, Human Computer Interaction and Mobile Computing at MIT, Microsoft Research, and as the first female Scientific Director at Telefonica R&D, the first Chief Data Scientist at DataPop Alliance and the first Director of Research in Data Science at Vodafone. Her research interests include artificial intelligence, health monitoring, mobile computing, personal and big data analysis, statistical machine learning and data mining, ubiquitous computing, personalization, computational social sciences and human computer interaction.

Can you explain a little bit about the relationship between AI and big data?

Fundamentally, artificial intelligence is a discipline within computer science or engineering whose objective is to build computational, non-biological systems that exhibit a level of intelligence, taking human intelligence as a reference. To define or build AI, we need to understand what human intelligence entails. As we know, humans have the ability to learn, to constantly adapt, to plan, to find a solution to a problem, to improvise, and to perceive, observe and respond to the world around them. And we have different types of intelligence: creative, verbal, mathematical, musical and so on. Similarly, there are many different areas within AI.

In addition, when AI appeared in the 50s there were two schools of thought, which have continued over time. The first was called ‘top-down’, ‘logical’ or ‘neat’. Its proponents said that if we want to have AI, all we need is humans with a lot of intelligence who enter as much knowledge as they can into the computer, and then use logic to derive additional knowledge from all this built-in knowledge. The other school of thought was called ‘bottom-up’ or ‘scruffy’. Its proponents said, “Well, why don’t we find the inspiration in biology? Biological beings learn from interacting with the world and they have these neurons, so maybe we can build artificial neurons.” This was also called the connectionist approach. It was based on learning from data in a more biologically inspired way, because our way to learn is through interacting with the world. These schools have had their ups and downs over time. The top-down school was dominant and showed the most commercial successes in the 80s and 90s with expert systems. It’s only in the last 20 years that the bottom-up approach has become more important and useful, to the point that today most state-of-the-art AI-based systems use the bottom-up approach.

This bottom-up approach needs a lot of data to be able to train sophisticated models. The availability of larger-scale data has been one of the factors that has led to the exponential growth of AI today. So that’s the relationship: we need the data to be able to train all these machine learning algorithms. In addition, we need computation. Computing power doubles every year for the same price, so we have affordable large-scale computing that we can use to train very complicated models with a lot of data.
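As a rough illustration of the two schools described in this answer, the sketch below contrasts a hand-written rule (top-down) with a single artificial neuron that learns the same behaviour from examples (bottom-up). It is a toy example, not anything from the interview: the function names, the four-row dataset and the classic perceptron update rule are all assumptions chosen for brevity.

```python
# Top-down ("neat"): a human writes the knowledge into the program as a rule.
def rule_based_and(a: int, b: int) -> int:
    return 1 if (a == 1 and b == 1) else 0


# Bottom-up ("scruffy" / connectionist): one artificial neuron learns from data.
def neuron_predict(weights, bias, a, b):
    """Fire (1) if the weighted sum of the inputs is positive, else stay silent (0)."""
    return 1 if weights[0] * a + weights[1] * b + bias > 0 else 0


def train_neuron(examples, epochs=20, learning_rate=1.0):
    """Classic perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (a, b), target in examples:
            error = target - neuron_predict(weights, bias, a, b)
            weights[0] += learning_rate * error * a
            weights[1] += learning_rate * error * b
            bias += learning_rate * error
    return weights, bias


# Hypothetical training data: every input pair with the answer we want.
EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
WEIGHTS, BIAS = train_neuron(EXAMPLES)

if __name__ == "__main__":
    for (a, b), _ in EXAMPLES:
        print(a, b, rule_based_and(a, b), neuron_predict(WEIGHTS, BIAS, a, b))
```

Nobody tells the neuron what the rule is; it only sees examples and corrects itself. Real systems follow the same idea with millions of parameters, which is why they need so much data and computation.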

The convergence of Big data and AI is said to be the single most important development in shaping our future. Why?

It’s an important area because AI as a discipline has a number of characteristics which make it play a very similar role to that of electricity in the second industrial revolution. Today we are in the fourth industrial revolution. If we look at the characteristics of AI they are similar to those of electricity. AI is invisible, because it’s software; it is transversal because it’s not just a discipline that you can apply in e-commerce or in transportation or in medicine, you can apply it in any field with a lot of data; it is scalable; updateable; and it has the ability to predict the future. AI is at the core of the fourth industrial revolution. It is transforming and it will transform our world as we know it. It will enable autonomous cars, personalized medicine, personalized education, and the optimisation of resources.

In which specific areas are we seeing the most exciting advancements? For example, how is this technology being used to solve the world’s challenges?

As I said, it’s transversal, so it applies to any field. But my particular motivation has always been the ways technology can be used for social good. I’m interested in the potential AI has in health: personalised medicine won’t be possible without AI. The same goes for public health, as the availability of large-scale human behavioural data enables us to build more accurate epidemiological models of the spread of diseases. Another interesting area is public administration: optimising the different services provided to citizens so people can live in a fairer, better-functioning society. We can make decisions based more on evidence, and less on cognitive bias and manipulation.

What about climate change?

It applies to any domain where there is a large amount of large-scale, non-structured, dynamic data (not numbers in a spreadsheet, but data coming from sensors, images, videos and so forth). We humans are unable to make sense of data of such volume and characteristics without machine learning algorithms. If we want to have more accurate models, to be able to understand what is happening on the planet, predict the weather, detect anomalies… all of this is done with data-driven machine learning algorithms.

Many public administrations are now using ADM systems to make decisions that directly impact the lives of many people. What are the most significant opportunities and the biggest challenges when it comes to governance?

Our aspiration is to overcome the limitations of human decision making, but there are multiple challenges. The first one is discrimination and bias: we want to make decisions that are based on evidence and less biased than those of humans, but we are finding that data-driven algorithmic decisions also have biases. The second one is computational violations of privacy. We can infer very personal attributes about people from non-personal data. For example, I could infer your sexual or political orientation, happiness or mental health from non-personal data. Another challenge is veracity.

With AI today we can generate text, video, audio and photos that are completely artificially generated but look 100% real and human. If you combine that with the ability to reach millions of people, then you have a lot of power to shape public opinion.

Another challenge is opacity, which is the opposite of transparency. Algorithmic decision-making systems can be opaque for three main reasons. The first is that the creators of the algorithms want to protect their intellectual property. This is called intentional opacity: they don’t want to lose their edge. The second reason is about public understanding. Even if I try to explain how an algorithm works, 99% of society will not understand me because they don’t have the skills. And the last one is called intrinsic opacity, which is related to the complexity of the methods that we use: with hundreds of layers and millions of parameters, they are very difficult to interpret. There is an entire field in AI or, more broadly, computer science, called interpretable AI or explainable AI, which aims to build systems whose decisions can be explained to humans. We also have a very big asymmetry in terms of data. Most of the data that is being used to train the algorithms is privately held; it’s not publicly available. So there is an asymmetry, because whoever has the data has the power.

There is a very small elite of people who have access to the data and even if the public had access, most of them wouldn’t know what to do with it. We really need to focus on tackling the lack of knowledge in our industry and in society, when it comes to automated decision making.

There are more and more initiatives from governments. Since 2016, most of the developed and some developing countries have published their national strategies on AI. So it has become a political issue. People are talking about it, writing about it, and thinking about how we can move forward as a society.

Large data sets make it possible for machine learning applications to learn independently and rapidly. What could this mean over the next few years?

As with any industrial revolution, there will be a transformation in society. New jobs will be appearing that didn’t exist before and many will disappear due to AI and automation.

The key is to feel empowered in determining where we want to go with this technology. We feel like these technological developments are happening to us, when really it’s us driving the change.

No one is imposing it, we just need to understand it. It’s about compulsory education for the new generations. They are receiving the education of the second industrial revolution, not the fourth. Employers, citizens, decision-makers, they all need better education.

AI has a close relationship with men. If algorithm designers are predominantly male, does this mean automated decision making is from a male perspective?

There are numerous studies that have demonstrated that diversity enriches goals metaphorically and literally. Institutions, disciplines and companies that are more diverse do better, but they also design more meaningful and inclusive solutions and more innovative products. We don’t have enough diversity right now, especially in computer science and the field of AI. Given how important this field is, I think this is something that we really need to address urgently, because the trend is heading in the wrong direction. Up to the mid-80s the percentage of women who chose computer science was growing every year, and at one point it was around 40%. Since the mid-80s, the percentage has been declining. In Spain, we have about 10%. That is a very negative situation, and the concern is that these ubiquitous algorithms have not been designed by a diverse group of people, so we’re probably missing a very significant amount of innovation and we’re probably not designing inclusive enough systems because of the lack of diversity. The other worrying element is that we’re actually excluding half of the population, who will not be able to participate and contribute in such an important area of our economy, one that is going to generate a huge amount of wealth.

It’s dangerous for women as a collective because unless they are AI- and tech-savvy they are not going to be able to contribute to one of the most disruptive and important technologies in our society.

From Apple’s Siri to Amazon’s Alexa, voice recognition and AI assistants are typically female. What efforts are being made to counteract this apparent sexism?

There is definitely an element of gender, and there are definitely some research projects on that. There is also an element of how pleasant and intelligible an artificial voice should be. Higher-pitched voices are naturally more pleasant to listen to; we can understand them and we feel that we can trust them more. However, it can be seen as a conspiracy theory that all these guys are designing female slaves, because we see this in science fiction movies. Even in the Greek myths, Hephaestus, the god of the forge, was already talking about creating two female gold robots to support him. Of course, if there was more diversity, we might see more interesting voices, diverse contributions and neutral voices coming through.
