Data Science & Augmented Intelligence – Reblog from “Data Science: a new discipline to change the world” by Alan Wilson

This is a reblog of the post by Alan Wilson that appeared in the EPSRC blog. You can see the original here.


Data science – the new kid on the block

I have re-badged myself several times in my research career: mathematician, theoretical physicist, economist (of sorts), geographer, city planner, complexity scientist, and now data scientist. This is partly personal idiosyncrasy but also a reflection of how new interdisciplinary research challenges emerge. I now have the privilege of being the Chief Executive of The Alan Turing Institute – the national centre for data science. ‘Data science’ is the new kid on the block. How come?

First, there is an enormous amount of new ‘big’ data; second, this has had a powerful impact on all the sciences; and third, on society, the economy and our way of life. Data science represents these combinations. The data comes from widespread digitisation combined with the ‘open data’ initiatives of government and extensive deployment of sensors and devices such as mobile phones. This generates huge research opportunities.

In broad terms, data science has two main branches. First, what can we do with the data? Applications of statistics and machine learning fall under this branch. Second, how can we transform existing science with this data and these methods? Much of the second is rooted in mathematics. To make this work in practice, there is a time-consuming first step: making the data useable by combining different sources in different formats. This is known as ‘data wrangling’, which coincidentally is the subject of a new Turing research project to speed up this time-consuming process. The whole field is driven by the power of the computer, and computer science. Understanding the effects of data on society, and the ethical questions it provokes, is led by the social sciences.

All of this combines in the idea of artificial intelligence, or AI. While the ‘machine’ has not yet passed the ‘Turing test’ and cannot compete with humans in thought, in many applications AI and data science now support human decision making. The current buzz phrase for this is ‘augmented intelligence’.

Cross-disciplinary potential

I can illustrate the research potential of data science through two examples, the first from my own field of urban research; the second from medicine – with recent AI research in this field learned, no doubt imperfectly, from my Turing colleague Mihaela van der Schaar.

There is a long history of developing mathematical and computer models of cities. Data arrives very slowly for model calibration – the census, for example, is critical. A combination of open government data and real-time flows from mobile phones and social media networks has changed this situation: real-time calibration is now possible. This potentially transforms both the science and its application in city planning. Machine learning complements, and potentially integrates with, the models. Data science in this case adds to an existing deep knowledge base.

Medical diagnosis is also underpinned by existing knowledge – physiology, cell and molecular biology for example. It is a skilled business, interpreting symptoms and tests. This can be enhanced through data science techniques – beginning with advances in imaging and visualisation and then the application of machine learning to the variety of evidence available. The clinician can add his or her own judgement. Treatment plans follow. At this point, something really new kicks in. ‘Live’ data on patients, including their responses to treatment, becomes available. This data can be combined with personal data to derive clusters of ‘like’ patients, enabling the exploration of the effectiveness of different treatment plans for different types of patients. This combination of data science techniques and human decision making is an excellent example of augmented intelligence. This opens the way to personalised intelligent medicine, which is set to have a transformative effect on healthcare (for those interested in finding out more, reserve a place for Mihaela van der Schaar’s Turing Lecture on 4 May).

An exciting new agenda

These kinds of developments of data science, and the associated applications, are possible in almost all sectors of industry. It is the role of the Alan Turing Institute to explore both the fundamental science underpinnings, and the potential applications, of data science across this wide landscape.

We currently work in fields as diverse as digital engineering, defence and security, computer technology and finance as well as cities and health. This range will expand as this very new Institute grows. We will work with and through universities and with commercial, public and third sector partners, to generate and develop the fruits of data science. This is a challenging agenda but a hugely exciting one.

Artificial Intelligence, Revealed

A few weeks ago I was invited by General Assembly to give a short intro to Data Science to a group of interested (and interesting) students. They all had different backgrounds, but they all shared an interest in technology and related subjects.

While I was explaining some of the differences between supervised and unsupervised machine learning, I used my example of an alien life form trying to cluster (and eventually classify) cats and dogs. If you are interested in knowing more about this, you will probably have to wait for the publication of my “Data Science and Analytics with Python” book. I digress…
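To make the distinction a little more concrete, here is a minimal sketch in plain Python. It is not taken from the talk or from the book: the measurements are made up, and the function names (`kmeans`, `fit_centroids`, `classify`) as well as the choice of k-means for clustering and a nearest-centroid rule for classification are assumptions made purely for illustration.

```python
# Made-up measurements (weight in kg, height in cm) for six animals.
# All numbers and names here are invented for illustration only.
animals = [(4.0, 25.0), (3.5, 23.0), (5.0, 26.0),
           (20.0, 55.0), (25.0, 60.0), (30.0, 65.0)]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]


def distance_sq(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def nearest(point, centres):
    """Index of the centre closest to the point."""
    return min(range(len(centres)),
               key=lambda i: distance_sq(point, centres[i]))


def kmeans(points, k, iterations=10):
    """Unsupervised: group points into k clusters without using labels."""
    centres = list(points[:k])  # naive deterministic start: first k points
    for _ in range(iterations):
        assignment = [nearest(p, centres) for p in points]
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # keep the old centre if a cluster ends up empty
                centres[i] = mean_point(members)
    return [nearest(p, centres) for p in points]


def fit_centroids(points, known_labels):
    """Supervised: compute one mean point per known label."""
    return {
        lab: mean_point([p for p, known in zip(points, known_labels)
                         if known == lab])
        for lab in set(known_labels)
    }


def classify(point, centroids):
    """Assign the label whose centroid is nearest to the point."""
    return min(centroids, key=lambda lab: distance_sq(point, centroids[lab]))


# The alien has no names for the animals: it can only group them.
clusters = kmeans(animals, k=2)

# Once someone supplies the names, a new animal can be classified.
centroids = fit_centroids(animals, labels)
prediction = classify((4.5, 24.0), centroids)
```

In the unsupervised step the alien only learns that there seem to be two groups of animals; the cluster numbers carry no meaning of their own. It is only in the supervised step, once the labels are known, that a new animal can be called a cat or a dog.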

So, Ed Shipley – one of the admissions managers at GA London – asked me and the students if we had seen the videos that Facebook had produced to explain machine learning… He was reminded of them as they use an example about a machine distinguishing between dogs and cars… (see what they did there?…). If you haven’t seen the videos, here you go:

Intro to AI

Machine Learning

Convolutional Neural Nets

First full draft of “Data Science and Analytics with Python”

It has been in development for nearly 12 months, almost to the day, and I am very pleased to tell you that the first full draft of my new book entitled “Data Science and Analytics with Python” is ready.


The book is aimed at data enthusiasts and professionals with some knowledge of programming principles, as well as developers and business people interested in learning more about data science and analytics. The proposed table of contents is as follows:

  1. The Trials and Tribulations of a Data Scientist
  2. First Slithers with Python
  3. The Machine that Goes “Ping”: Machine Learning and Pattern Recognition
  4. The Relationship Conundrum: Regression
  5. Jackalopes and Hares, Unicorns and Horses: Clustering and Classification
  6. Decisions, Decisions: Hierarchical Clustering, Decision Trees and Ensemble Techniques
  7. Dimensionality Reduction and Support Vector Machines

At the moment the book contains 53 figures and 18 tables, plus plenty of bits and pieces of code ready to be tried.

The next step is to start the re-reading, re-drafting and revising in preparation for the final version and submission to my publisher, CRC Press, later in the year. I will keep you posted on how things go.

Keep in touch!


How much should we fear the rise of artificial intelligence?

  1. When the arena is something as pure as a board game, where the rules are entirely known and always exactly the same, the results are remarkable. When the arena is something as messy, unrepeatable and ill-defined as actuality, the business of adaptation and translation is a great deal more difficult.

Tom Chatfield

From Tom Chatfield’s opinion article in The Guardian.


Artificial Intelligence – Debunking Myths

Exploring the interwebs, I came across this article by Rupert Goodwins in Ars Technica debunking myths about Artificial Intelligence.

HAL 9000 in the film 2001.

It is a good read and if you have a few minutes to spare, do give it a go.

Rupert addresses the following myths:

  1. AI makes machines that can think.
  2. AI will not be bound by human ethics.
  3. AI will get out of control.
  4. Breakthroughs in AI will all happen in sudden jumps.

It is true that there are a number of efforts to try to replicate (and therefore understand) human thought. One example is the Blue Brain Project at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland. However, this does not imply that they will immediately produce a machine such as HAL or C-3PO.

This is because the brain is far more complex than the current efforts are able to simulate. As a matter of fact, even much simpler brains are still too complex for current simulations. This does not mean that we should not try to understand and learn how brains work.

Part of the problem is that it is difficult to even define what we mean by “thought”: the so-called hard problem. So a solution to the strong AI problem is not going to arrive soon, but we should definitely try.

So, once that myth is out of the way, the idea that a Terminator-like robot is around the corner is put into perspective. Sure, there are attempts at building self-driving cars and the like, but we are not quite there yet. All in all, it is true that a number of technological advances can be used for good or bad causes, and that is surely something that we all should bear in mind.