
In August 2015, a number of carefully selected Facebook users in the Bay Area discovered a new feature on Facebook Messenger. Known as M, the service was designed to rival Google Now and Apple’s Siri. A personal assistant that would answer questions in a natural way, make restaurant reservations and help with Uber bookings, M was meant to be a step forward in natural language understanding, the virtual assistant that – unlike Siri – wasn’t a dismal experience.

Fast forward a couple of years, and the general purpose personal assistant has been demoted within Facebook’s product offering. Poor M. The hope was that it would tell users jokes and act as a guide, life coach and optimisation tool.


The disappointment around M largely derives from the fact that it attempted a new approach: instead of depending solely on AI, the service introduced a human layer – the AI was supervised by human beings. If the machine received a question or task it was incapable of answering or responding to, a human would step in. In that way, the human being would act to further train the algorithm.
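The fallback described above can be sketched in a few lines of Python. This is only an illustration and not how Facebook built M: the `Assistant` class, the confidence threshold and the lookup table standing in for a trained model are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Assistant:
    # Hypothetical confidence threshold below which a human takes over.
    threshold: float = 0.8
    # A lookup table stands in for a trained model (an assumption for illustration).
    known: dict = field(default_factory=dict)
    # Questions answered by humans, kept as labelled examples for retraining.
    training_log: list = field(default_factory=list)

    def model_answer(self, question):
        """Return (answer, confidence) from the toy 'model'."""
        if question in self.known:
            return self.known[question], 1.0
        return None, 0.0

    def ask(self, question, human):
        answer, confidence = self.model_answer(question)
        if confidence >= self.threshold:
            return answer, "machine"
        # The machine cannot handle the request, so a human steps in.
        answer = human(question)
        # The human's answer becomes a new labelled example for the machine.
        self.training_log.append((question, answer))
        self.known[question] = answer
        return answer, "human"


assistant = Assistant()
first = assistant.ask("Book a table for two", lambda q: "Booked for 8pm")
second = assistant.ask("Book a table for two", lambda q: "unused")
print(first, second)
```

The first request is routed to the human. The second, identical request is answered by the machine, because the human's earlier answer was logged and learned.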

This, of course, is where we are with virtual assistants: on the cusp of what may well be a transformational technology – in the form of deep learning – we’re at the peak of inflated expectations (right next to the connected home which, despite the best efforts of the electronics industry at this year’s CES, appears not to be fulfilling a single human need – I’m looking at you, Cloi). To misquote Peter Thiel, we were promised general purpose artificial intelligence, and we got a home assistant that looks at the contents of our fridge and tells us to make a sandwich. What a time to be alive.


The aspirational narrative is that AI will be everywhere and in every object, as ubiquitous as oxygen. It can help us read X-rays more accurately, pursue science more effectively, empower us to understand foreign languages without studying and ensure that autonomous vehicles behave the way we would like them to. There will be breakthroughs in agriculture, medicine and science. Governments will discover ways to combat inequality and crime.


But we’re not there yet. And we may never be, says Gary Marcus, the former head of AI at Uber and a professor at New York University. Marcus – who participated in a robust exchange of views with DeepMind’s Demis Hassabis at the Neural Information Processing Systems Conference in December last year – is known for tempering the excitement within the tech community regarding the progress of research into machine learning. In a paper, Deep Learning: A Critical Appraisal, published earlier this month, he outlines “concerns” that must be addressed if the most widely known technique in artificial intelligence research is to lead to general purpose AI. Marcus writes that the field may be subject to “irrational exuberance” and sets out what he feels might be done in order to move the field forward.

Marcus argues that when data sets are large enough and labelled, and computing power is unlimited, deep learning acts as a powerful tool. However, “systems that rely on deep learning frequently have to generalise beyond the specific data that they have seen, whether to a new pronunciation of a word or to an image that differs from one that the system has seen before, and where data are less than infinite, the ability of formal proofs to guarantee high-quality performance is more limited.”

The paper outlines ten areas in which Marcus argues deep learning has limitations; for instance, its need for vast labelled data sets. In applications such as image recognition, a more restricted volume of data can mean that deep learning has difficulty in generalising to novel perspectives (a round sticker on a parking sign could be identified as, say, a ball). Marcus points out that the ‘deep’ in ‘deep learning’ refers to the architectural nature of the system – meaning the number of layers in today’s neural networks – rather than the conceptual notion of depth. “The representations acquired by such networks don't, for example, naturally apply to abstract concepts like ‘justice’, ‘democracy’ or ‘meddling’”, he writes, arguing that there is a superficiality to some patterns extracted by the technique.


Deep learning-based language models are questioned. Marcus’s view is that the technology learns correlations between words that are flat, rather than hierarchical. “As a result, deep learning systems are forced to use a variety of proxies that are ultimately inadequate, such as the sequential position of a word presented in a sequences [sic].” Similarly, Marcus argues that the technology is limited when it comes to open-ended inferences based on real-world knowledge, meaning that machines wouldn’t understand the difference between “John promised Mary to leave” and “John promised to leave Mary”.
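A small Python sketch shows what a flat representation can miss. It is an illustration only, not taken from Marcus’s paper: a bag of words is far cruder than any deep learning model, and the function name `bag_of_words` is an assumption made for the example. It borrows the two sentences quoted above as test data.

```python
from collections import Counter


def bag_of_words(sentence):
    """Count each word, discarding all order and structure."""
    return Counter(sentence.lower().split())


a = "John promised Mary to leave"
b = "John promised to leave Mary"

# The word counts are identical, so a purely flat view cannot tell the sentences apart.
same_bag = bag_of_words(a) == bag_of_words(b)
# Word order differs, but order alone does not say who is leaving whom.
same_order = a.lower().split() == b.lower().split()

print(same_bag, same_order)  # prints: True False
```

Position in the sequence does separate the two sentences, but it is only a proxy. Nothing in either representation records that in one sentence John is the person leaving and in the other Mary is the person being left.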

Transparency too is a cause for concern. The millions or billions of parameters in a neural network are identifiable via their geography rather than “human interpretable labels”. Marcus argues that, in fields such as finance or medicine, researchers are unable to discover why a particular decision has been made, and that this type of opacity could mask in-built bias.

Bias is also evident in research done without the addition of prior knowledge: most deep learning is self-contained and pertains to correlations, rather than abstractions. “Problems that have less to do with categorisation and more to do with common sense reasoning essentially lie outside the scope of what deep learning is appropriate for, and so far as I can tell, deep learning has little to offer such problems.”


Other concerns are that deep learning operates well with data sets that are fixed and stable – such as the games of Go, or chess – but less well in dynamic and shifting environments, such as economics; that applications such as image recognition are easily ‘spoofed’ – meaning that they misinterpret what they’re ‘seeing’ – and that deep learning is inherently difficult to engineer with as it has none of the transparency or debuggability of programming languages.

What is apparent from Marcus’s paper is that, while there is immense excitement around the potential of the field, machine learning is extremely hard. While there is significant progress being made, researchers are still a long way from applications of machine learning beyond pattern classification.

Marcus argues that, while there is – justifiably – excitement about research at companies such as DeepMind, there is unlikely to be a revolutionary breakthrough in the short term. Rather, the science will develop as most sciences do, incrementally over time. While we’re out of the AI winter, we may be heading towards the trough of disillusionment. Marcus uses a quote from Geoffrey Hinton, seen by many as the godfather of machine learning, to illustrate his point. “Science progresses one funeral at a time. The future depends on some graduate student who is deeply suspicious of everything I have said.” The same should go for Marcus too.