By: John Manslow
Recent years have seen an explosion of interest in machine learning, with companies such as Amazon, Facebook, Google, IBM and Microsoft investing billions of dollars in acquisitions and R&D. This has caused a tenfold increase in the number of machine learning patent filings in the last decade and resulted in the development of numerous new products and capabilities – machine learning allows Amazon to perform dynamic price optimizations; Microsoft to offer live translation in Skype; and Google to rank web pages based on more nuanced interpretations of their content and users' search queries. So, where did machine learning come from and why is there so much interest in it now?
The term 'machine learning' was coined in the late 1950s by researchers in artificial intelligence who believed that the best way to make computers behave intelligently was to give them the ability to learn. Earlier research had focused on using human experts to write rules that computers could follow but it quickly became apparent that, even for simple problems like playing checkers, writing effective rules was extremely difficult. In the late 1950s and throughout the 1960s, simple machine learning algorithms were developed and successfully applied to a number of problems that had previously been impossible to solve. Despite these early successes, the 1970s and 1980s were, for the most part, decades of disappointment and disillusionment; progress was slow, research was poorly funded and there was very little serious commercial interest in applying machine learning to real world problems.
In the late 1980s, however, there were two major breakthroughs: a way of training arbitrarily complex neural networks was discovered, and reinforcement learning provided a solution to the problem of behaving optimally in an environment with delayed rewards.
In the early 1990s, machine learning research was pursued with greater mathematical rigour, leading to the development of new algorithms and kernel methods – such as Bayesian neural networks, support vector machines (SVMs) and Gaussian processes – that significantly improved real world performance. Machine learning was finally ready for commercial exploitation and a number of early adopters started to apply it to problems such as fraud detection, credit scoring and churn prediction. The success of machine learning on real world problems and its clear potential for further development led to an increase in research funding and the field began to attract researchers from mathematics and physics as well as computer science.
During the 1990s, the amount of storage in desktop PCs increased by a factor of a thousand and the race to 1GHz that occurred between AMD and Intel late in the decade resulted in a similar increase in processing power. By the year 2000, most desktop PCs had enough memory to store the amount of data that is required to learn effective solutions to complex problems and enough processing power for that learning to take place in a reasonable amount of time. Technology had even reached the stage where machine learning could be used in computer games – competing against the player on equal terms in Codemasters’ Colin McRae Rally 2.0 and allowing players to train their own avatar in real time in Lionhead’s Black & White.
In the mid-2000s, general purpose computing on graphics processing units (GPGPU) became possible, low cost multicore processors were developed and cloud computing emerged, bringing further substantial increases in processing power. At the same time, the growth of the internet brought about the big data revolution: vast datasets became available, enabling ever more complex and powerful machine learning algorithms to be applied to real world problems. Learning with deep neural networks (DNNs) – neural networks with large numbers of interconnected layers of neurons – became practical, and DNNs were quickly found to outperform previous approaches in fields such as speech recognition and image labelling, eventually surpassing human performance on some benchmark tasks.
The last decade has seen widespread commercial exploitation of machine learning, which has become the technology of choice for market leaders in a wide range of industries due to its ability to reliably extract extremely complex relationships from data, provide unique insights and make accurate predictions. The theoretical advances of the 1990s provided machine learning algorithms such as Bayesian neural networks and SVMs that perform well in environments where data is scarce, and the developments that occurred in the late 2000s provided algorithms such as DNNs that offer unrivalled performance with big data.
Today, machine learning algorithms run on almost every platform, from embedded processors in central heating systems to vast parallel processing cloud computing networks. They are used commercially for tasks as varied as reading postal/ZIP codes, processing cheques, generating credit scores, detecting fraud, recognizing speech, translating text, recommending movies, matching players in online games, optimizing prices, ranking search results, filtering spam, animating virtual characters, optimizing data centres, playing videogames, labelling images, driving autonomous vehicles and identifying pirated content.
Business is only starting to scratch the surface of what machine learning has to offer but there’s no doubt about its potential to transform business processes and create entirely new classes of products. Machine learning is no longer the preserve of the researcher and the academic but has become firmly established as a tool for business.
Upcoming posts in this series will introduce the general principles of machine learning and examine the internals of some of the most powerful and widely used machine learning algorithms – SVMs, Bayesian networks, decision trees, Bayesian neural networks and deep neural networks – and describe how they can be applied in practice to solve real world problems.
Many of these machine learning techniques will be added to the next version of the WPS data analytics platform, due later in 2017, offering the opportunity to gain greater insight into data with minimum fuss using the solid and consistent WPS software suite.
About John Manslow
John joined World Programming in 2014 as a senior R&D software engineer in the mathematics and statistics team. He is currently developing state-of-the-art deep neural network capabilities for the WPS platform, due for release in 2017. John has a Ph.D. in Machine Learning from the University of Southampton and more than 20 years of commercial experience with machine learning and statistical methods. He holds more than ten patents, has contributed to seven books on game AI, has sat on review committees for the AAAI and has lectured at the University of Southampton.