Prof. Zoubin Ghahramani
Your Excellency the Rector, distinguished authorities, members of the Faculty of Doctors, ladies and gentlemen,
It is a great honor to receive this Honoris Causa Doctorate from Universidad Carlos III de Madrid. Many thanks to the Rector and to my sponsor, Professor Antonio Artés-Rodríguez, for his generous laudatio.
Madrid holds a very special place in my heart. Although I am Iranian, I grew up here, living here from the age of 6 to 16, and although I have lived in the United States, Canada and the United Kingdom for the last 36 years, I am still a little bit Madrileño.
For the rest of my speech I will switch to English, since I find it easier to talk about technical topics in English.
---
My lecture today will be on the topic of Artificial Intelligence: what is it, how does it relate to human intelligence, and how does it affect our lives?
I’ve been fascinated by AI since I was a child. I would read science fiction books by Asimov and nonfiction like Hofstadter’s Gödel, Escher, Bach, and wonder: what is intelligence, what is learning?
At the time, in the 70s and 80s, the dominant paradigm in AI was based on rules, logic and search algorithms. But these symbolic tools were brittle and couldn’t explain perception, human vision, language and speech understanding, or movement control. Such AI systems also didn’t learn or adapt to changing environments well.
In the mid-1980s a new paradigm for AI emerged based on neural networks. Neural networks are loosely inspired by the brain. They consist of simple computing elements - called neurons or units - connected to each other by weighted connections. The computing elements are organized into layers or stages of computation, and because these neural networks (NNs) often have many layers they are sometimes called ‘deep NNs’. These NNs can learn from data; given examples of inputs and desired outputs the system tries to reproduce those outputs. At first it makes many errors, but over time it modifies its weights to reduce these errors, getting better at the task it’s given to achieve. In other words, it learns!
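The error-driven learning just described can be sketched in a few lines of Python. This is only a toy illustration, not any system mentioned in this lecture: a single linear unit with one weight and one bias is fitted by gradient descent, and the target rule (y = 2x + 1), the learning rate and the number of steps are all assumptions chosen for the example.

```python
# Toy sketch: one "neuron" (a weight w and a bias b) learns y = 2x + 1
# from examples by repeatedly nudging w and b to reduce its error.
# The data, learning rate, and step count are illustrative assumptions.

def mean_squared_error(w, b, data):
    """Average squared difference between the unit's output and the desired output."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(data, learning_rate=0.05, steps=2000):
    w, b = 0.0, 0.0  # start knowing nothing, so the first predictions are poor
    errors = [mean_squared_error(w, b, data)]
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        # Move the weights a small step in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        errors.append(mean_squared_error(w, b, data))
    return w, b, errors

# Examples of inputs and desired outputs.
data = [(x, 2 * x + 1) for x in [-2.0, -1.0, 0.0, 1.0, 2.0]]
w, b, errors = train(data)
print(f"learned w={w:.3f}, b={b:.3f}")
print(f"error before training: {errors[0]:.3f}, after: {errors[-1]:.6f}")
```

Real deep NNs stack many such units into layers and compute the gradients automatically, but the loop of predicting, measuring the error and adjusting the weights is the same idea.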
The learning in neural networks is fundamentally data-driven and statistical. So along with logic, rules and search, the field of AI added tools from probability, statistics and optimization.
The neural network paradigm has now become dominant in AI and the field has exploded in the last decade. Scientific conferences which used to be a few hundred people now attract over 20 thousand participants and have had visits by the CEOs of major companies like Facebook and NVIDIA. Neural networks research attracts billions of dollars of investment from industry and hundreds of startup companies a year, as well as the attention of policy makers.
So why did this happen? Why has the field of AI exploded in importance since the 1980s? Well, AI systems are now working - they may not be “intelligent” yet in most senses of that word - but they regularly contribute to breakthroughs. And importantly they have now become extremely practically useful. There have been breakthroughs in games such as chess, Go, Atari, and Poker, where AI systems outperform humans. Poker is particularly interesting because it involves human psychology, risk taking, a great deal of uncertainty, bluffing and deception.
But the breakthroughs are not just in games. AI systems have proven useful in the sciences, helping to discover new drugs, predict how proteins fold and what their functions are, design new materials, predict natural disasters and the weather, and control fusion reactors.
Finally, and very importantly, AI systems underpin many of the modern technologies we have. We now have systems that can reliably recognize speech, translate between dozens of languages, have conversations with humans and answer questions. We have AI systems that can recognize objects in images, identify plants, and analyze and interpret satellite and medical images. AI systems help drive newsfeeds, recommend products, and optimize online advertising. AI systems are the brains of self-driving cars.
So what are the technical innovations that have made all these breakthroughs possible? First, we have vastly more data than in the 1980s. Remember, before the web, it was very hard to collect and share vast amounts of images, text, and other kinds of data. Second, we have vastly more powerful computers, including special architectures like GPUs and TPUs, connected in large data centers. Data and compute are two of the key ingredients in this AI revolution. Third, we have much better open-source software, democratizing the use of AI to the point that my daughter could learn to train a neural network in her AI summer camp when she was 10 years old. Fourth, there is a virtuous cycle, where breakthroughs lead to investment, which leads to more people joining the field, which leads to more good ideas and breakthroughs. When you have 20 times as many researchers in the field, it’s not so hard to achieve perhaps 5-10 times the rate of progress.
How do modern AI systems compare to human brains? First I want to dispel a number of myths. The human brain appears incredibly complex. We have about 90 billion neurons with thousands of connections per neuron. But there are several ways in which the human brain is remarkably limited! We can use information theory to estimate the amount of information contained in the human brain about the outside world. Information can come from only two channels, our genes and our senses. Our genome can be encoded in far less than one GB, which fits on a USB memory stick, and since the senses and human memory are highly lossy, the entire sensory experience of a human lifetime is perhaps 2 TB of information. So our brains collect and store only about one hard-drive worth of information about the outside world, in a lifetime. It doesn’t matter how many neurons or molecules are in a brain… the information content is formally very bounded and far less than any typical AI system is exposed to.
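The estimate above can be reproduced as a back-of-envelope calculation. In the sketch below, the genome length of roughly 3.1 billion base pairs and the naive 2-bits-per-base encoding are assumptions added for illustration; only the 2 TB lifetime figure comes from the talk itself. Even without any compression the genome comes in under one GB.

```python
# Back-of-envelope sketch; the genome constants are assumptions for illustration.
BASE_PAIRS = 3.1e9             # assumed approximate length of one copy of the human genome
BITS_PER_BASE = 2              # four possible bases (A, C, G, T) fit in 2 bits, uncompressed
LIFETIME_SENSORY_BYTES = 2e12  # the 2 TB lifetime figure quoted in the talk (decimal TB)

# Information arriving through the two channels: genes and senses.
genome_bytes = BASE_PAIRS * BITS_PER_BASE / 8
total_bytes = genome_bytes + LIFETIME_SENSORY_BYTES

print(f"genome: {genome_bytes / 1e9:.2f} GB")
print(f"genes + senses: {total_bytes / 1e12:.4f} TB")
```

The genome term barely changes the total, so under these assumptions the bound is dominated by the sensory estimate.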
Another limitation of the brain is well documented by cognitive scientists: human brains are rather poor at decision making, falling for many fallacies of logic and probability theory. Moreover, even the best human communication channel, namely speech, has a bit rate of only about 50 bits per second, which is about 20 million times less than my home Gigabit wifi network.
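The bit-rate comparison is simple arithmetic, sketched here in Python. It assumes a nominal rate of exactly one billion bits per second for the Gigabit network, which real wifi rarely sustains.

```python
# Sketch of the ratio quoted above; both rates are nominal figures.
SPEECH_BITS_PER_SECOND = 50                  # approximate information rate of speech, from the talk
GIGABIT_BITS_PER_SECOND = 1_000_000_000      # assumed nominal Gigabit rate

ratio = GIGABIT_BITS_PER_SECOND // SPEECH_BITS_PER_SECOND
print(f"the network is about {ratio:,} times faster than speech")
```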
Finally, and amazingly, the energy consumption of a human brain is about 20 W, which is equivalent to a weak incandescent light bulb. This isn’t really a limitation but rather a feat of marvelous biological efficiency.
So how does this underpowered and limited system, the brain, appear to be so amazingly intelligent? I frankly don’t know. But we must not confuse the intelligence of a single human brain with the accomplishments of human civilization, which are the product of billions of interacting brains over thousands of years. When I sit in an airplane looking out the window over the complex and beautiful structure of a big city like Madrid, I note that my airplane and that city were not created by any single human brain.
In fact, it’s not clear that humans should be a model for intelligent systems. Humans are primates with a particular set of cognitive skills that have evolved for our survival. Human reasoning is notoriously flawed; human memory, calculation and communication abilities are very limited. Moreover, there are many kinds of intelligence out there, from the octopus intelligence described in the fascinating book “Other Minds”, to the amazing intelligence of a smartphone that can already calculate things, navigate the world, and translate between languages much better than its owner. We should move on from the human-centric view where humans are some pinnacle of a linear measure of intelligence, and instead embrace the idea that humans are just one point in the many dimensions of possible intelligence.
Intelligence involves sensing the world, interpreting these senses into percepts, being able to predict future outcomes, and taking actions or decisions so as to achieve certain goals. All these aspects of intelligence need to be learned from data and experience. The focus of my work has been on the probabilistic foundations of machine learning.
Probabilistic machine learning is based on the idea that AI systems need to learn models of their world. A model helps the AI system understand, but importantly also predict future data. A good model is good at predicting data; a poor model isn’t. Learning aims to get the model to be better with more data.
Of course, whenever there is a model, there are aspects of the model which are uncertain. Probability theory is the mathematics of uncertainty. It can be used to represent that uncertainty, and update it in light of new data. The mathematical rule for updating knowledge given data is called Bayes’ rule. It forms the foundation of learning from data, making good predictions, and making decisions that maximize expected utility.
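To make the updating rule concrete: for a hypothesis h and observed data, the rule says that P(h | data) is proportional to P(data | h) times P(h). The Python sketch below applies it to a deliberately small, made-up problem, where the three candidate coin biases, the uniform prior and the sequence of flips are all hypothetical choices for illustration.

```python
def bayes_update(prior, likelihoods):
    """Return the posterior P(h | data) given a prior P(h) and likelihoods P(data | h)."""
    unnormalized = {h: prior[h] * likelihoods[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(data), which makes the posterior sum to 1
    return {h: p / evidence for h, p in unnormalized.items()}

# Hypothetical model: the coin's probability of heads is one of three values,
# and before seeing any data we consider them equally likely.
beliefs = {0.2: 1 / 3, 0.5: 1 / 3, 0.8: 1 / 3}
flips = "HHTHHHTH"  # made-up observations: 6 heads, 2 tails

# Update the beliefs one observation at a time.
for flip in flips:
    likelihoods = {h: (h if flip == "H" else 1 - h) for h in beliefs}
    beliefs = bayes_update(beliefs, likelihoods)

# Prediction: probability that the next flip is heads, averaged over the uncertainty.
prob_next_heads = sum(h * p for h, p in beliefs.items())

for h, p in beliefs.items():
    print(f"P(bias={h} | data) = {p:.3f}")
print(f"P(next flip is heads | data) = {prob_next_heads:.3f}")
```

The prediction averages over all the hypotheses, weighted by how plausible each remains after seeing the data, rather than committing to a single best guess.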
Over the last two decades we have advanced both the foundations of this field and its practical applications.
Probabilistic machine learning, and especially how it can be used to overcome limitations of deep learning, is one of several recent trends in the field of AI. Another trend is exploring the power of scale in deep learning. It turns out that we can now train deep learning models with not millions, but billions, and most recently over a trillion weights. Such models, trained on huge datasets of text, exhibit impressive and surprising emergent abilities, such as the ability to complete code, make calculations, and hold seemingly natural conversations with a user.
Another trend in AI is the use of AI systems to improve other AI systems. This sounds like science fiction but is actually very practical. AI is being used to tune hyperparameters of machine learning models. AI is being used to design new computer hardware such as Google’s TPUs. AI is being used to help software engineers write software faster and with fewer errors.
A third trend I’ll mention is the use of AI to solve intelligence problems that are not the ones single human brains evolved to solve. For example, AI in the sciences can analyze and find subtle patterns in genetic sequences; AI in transportation can optimize the movement of people in a city; AI for power grids can substantially reduce waste in electricity production and transmission, resulting in a reduction in carbon emissions.
A final trend I will mention is that we are seeing more and more emphasis on the human side of AI. When AI systems are used in society, they interact with people and they should be designed so as to maximize benefits and minimize harms. To do this, the systems need to be designed in a way that considers impacts on user data privacy; they need to minimize the potential for harm due to gender, racial, or ethnic bias; they need to enhance rather than suppress human creativity; they need to be transparent and interpretable; they need to give people the right controls so that individuals and societies are empowered to make the choices that work for them.
I want to leave you with some final thoughts. I’ve been on an amazing journey in this field of AI and machine learning, first as an academic, and more recently also in industry. The more I work in this field, the more I see this human side as the ultimate prize in AI. AI is a tool for society, and we have so many immense opportunities to use this for human flourishing!
Thank you!
Muchas Gracias!