Data Skeptic Podcast

I had the opportunity to be one of the panellists in the Data Skeptic podcast recently. It was great to have been invited and, as a listener to the podcast, it was a real treat to be able to take part. Also, recording it was fun…

You can listen to the episode here.

More information about the Data Skeptic Journal Club can be found on their site. I would like to thank Kyle Polich, Lan Guo and George Kemp for having me as a guest. I hope it is not the last time!

In the episode Kyle talks about the relationship between Covid-19 and Carbon Emissions. George tells us about the new Hateful Memes Challenge from Facebook. Lan joins us to talk about Google’s AI Explorables. I talk about a paper that uses neural networks to detect infections in the ear.

Let me know what you guys think!

Getting Answers for Core ML deployment from my own Book

I was working today on the deployment of a small neural network model prototype, converted to Core ML, to be used in an iPhone app.

I was trying to find the best way to get things to work and then it occurred to me I had solved a similar issue before… where‽ when‽ aha!

The answer was actually in my own book, Advanced Data Science and Analytics with Python.

Top Free Books for Deep Learning

This collection includes books on all aspects of deep learning. It begins with titles that cover the subject as a whole, before moving on to works that should help beginners expand their knowledge from machine learning to deep learning. The list concludes with books that discuss neural networks, both titles that introduce the topic and ones that go in depth, covering the architecture of such networks.

1. Deep Learning
By Ian Goodfellow, Yoshua Bengio and Aaron Courville

The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is now complete and will remain available online for free.

2. Deep Learning Tutorial
By LISA Lab, University of Montreal

Developed by the LISA lab at the University of Montreal, this free and concise tutorial, presented in the form of a book, explores the basics of machine learning. The book emphasizes the use of the Theano library (originally developed by the university itself) for creating deep learning models in Python.

3. Deep Learning: Methods and Applications
By Li Deng and Dong Yu

This book provides an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks.

4. First Contact with TensorFlow, get started with Deep Learning Programming
By Jordi Torres

This book is oriented towards engineers with only a basic understanding of Machine Learning who want to expand their knowledge of the exciting world of Deep Learning with a hands-on approach that uses TensorFlow.

5. Neural Networks and Deep Learning
By Michael Nielsen

This book teaches you about neural networks, a beautiful biologically inspired programming paradigm which enables a computer to learn from observational data. It also covers deep learning, a powerful set of techniques for learning in neural networks.

6. A Brief Introduction to Neural Networks
By David Kriesel

This title covers neural networks in depth. Neural networks are a bio-inspired mechanism of data processing that enables computers to learn in a way technically similar to a brain, and even to generalise once they have been taught solutions to enough problem instances. Available in English and German.

7. Neural Network Design (2nd edition)
By Martin T. Hagan, Howard B. Demuth, Mark H. Beale and Orlando De Jesús

Neural Network Design (2nd edition) provides a clear and detailed survey of fundamental neural network architectures and learning rules. In it, the authors emphasize a fundamental understanding of the principal neural networks and the methods for training them. They also discuss applications of networks to practical engineering problems in pattern recognition, clustering, signal processing, and control systems. Readability and natural flow of material are emphasized throughout the text.

8. Neural Networks and Learning Machines (3rd edition)
By Simon Haykin

This third edition of Simon Haykin’s book provides an up-to-date treatment of neural networks in a comprehensive, thorough and readable manner, split into three sections. The book begins by looking at the classical approach to supervised learning, before continuing on to kernel methods based on radial-basis function (RBF) networks. The final part of the book is devoted to regularization theory, which is at the core of machine learning.

CoreML – iOS App Implementation for the Boston Price Model (Part 1)

Hey! How are things? I hope the beginning of the year is looking great for you all. As promised, I am back to continue the open notebook for the implementation of a Core ML model in a simple iOS app. In one of the previous posts we created a linear regression model to predict prices for Boston properties (1970s prices, that is!) based on two inputs: the crime rate per capita in the area and the average number of rooms in the property. We also saw (in a different post) the way in which Core ML exposes the properties of the model so it can be used in an iOS app to carry out the prediction on device!

In this post we will start building the iOS app that will use the model to enable our users to generate a prediction based on input values for the parameters used in the model. Our aim is to build a simple interface where the user enters the values and the predicted price is shown. Something like the following screenshot:

You will need access to a Mac with the latest version of Xcode. At the time of writing I am using Xcode 9.2. We will cover the development of the app, but not so much the deployment (we may do so if people let me know there is interest).

In Xcode, select “Create New Project”. In the next dialogue box, make sure you pick “iOS” from the menu at the top, select the “Single View App” option from the choices shown, and then click the “Next” button.

This will create an iOS app with a single page. If you need more pages/views, this is still a good place to start, as you can add further “View Controllers” while you develop the app. Right, so in the next dialogue box Xcode will ask for options to create the new project. Give your project a name, something that makes it easy to tell what your project is about. In this case I am calling the project “BostonPricer”. You can also provide the name of a team (a team of developers contributing to your app, for instance) as well as an organisation name and identifier. In our case these are not that important and you can enter any suitable values you like. Please note that they become more important if you are planning to submit your app to Apple for approval. Anyway, make sure that you select “Swift” as the programming language and leave the option boxes for “Use Core Data”, “Include Unit Tests” and “Include UI Tests” unticked. I am redacting some values below:

On the left-hand side menu, click on the “Main.storyboard”. This is the main view that our users will see and interact with. It is here where we will create the design, look-and-feel and interactions in our app.

 

We will start by placing a few objects in our app; some of them will be used simply to display text (labels and information), whereas others will be used to create interactions, in particular to select input values and to generate the prediction. To do that we will use the “Object Library”. In the current Xcode window, in the bottom-right corner, you will see an icon that looks like a little square inside a circle; this is the “Show the Object Library” icon. When you select it, a search bar appears at the bottom of the area. There you will look for the following objects:

  • Label
  • Picker View
  • Button

You will need three labels, one picker and one button. You can drag each of the elements from the “Object Library” results into the storyboard. You can edit the text for the labels and the button by double-clicking on them. Do not worry about the text shown for the picker; we will deal with those values in future posts. Arrange the elements as shown in the screenshot below:

OK, so far so good. In the next few posts we will start creating the functionality for each of these elements and implement the prediction generated by the model we have developed. Keep in touch.

You can look at the code (in development) on my GitHub site here.

CoreML – Model properties

If you have been following the posts in this open notebook, you may know that by now we have managed to create a linear regression model for the Boston Price dataset based on two predictors, namely crime rate and average number of rooms. It is by no means the best model out there, as our aim is simply to explore the creation of a model (in this case with Python) and convert it to a Core ML model that can be deployed in an iOS app.

Before moving on to the development of the app, I thought it would be good to take a look at the properties of the converted model. If we open the PriceBoston.mlmodel we saved in the previous post (in Xcode, of course) we will see the following information:

We can see the name of the model (PriceBoston) and the fact that it is a “Pipeline Regressor”. The model can be given various attributes such as Author, Description, License, etc. We can also see the listing of the Model Evaluation Parameters in the form of Inputs (crime rate and number of rooms) and Outputs (price). There is also an entry to describe the Model Class (PriceBoston); without attaching the model to a target, the class is not actually present. Once we make this model part of a target inside an app, Xcode will generate the appropriate code.
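
Incidentally, these attributes do not have to stay empty: coremltools lets us set the metadata from Python before saving. Here is a minimal sketch of how that could look (the description strings below are purely illustrative, not something we defined when building the model):

import coremltools

# Load the converted model saved in the previous post
model = coremltools.models.MLModel("PriceBoston.mlmodel")

# Fill in the metadata that Xcode displays in the model viewer
model.author = "Your name here"
model.license = "MIT (for example)"
model.short_description = "Linear regression for 1970s Boston house prices"
model.input_description["crime"] = "Per capita crime rate in the area"
model.input_description["rooms"] = "Average number of rooms in the property"
model.output_description["price"] = "Predicted median price (in $1000s)"

# Overwrite the file with the annotated version
model.save("PriceBoston.mlmodel")

After re-saving, the Author, Description and License fields appear populated when the model is inspected in Xcode.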

Just to give you a flavour of the code that will be generated when we attach this model to a target, please take a look at the screenshot below:


You can see that the code was generated automatically (see the comment at the beginning of the Swift file). The code defines the input variables and feature names, provides a way to extract values from the input strings, sets up the model output, and takes care of other bits and pieces such as defining the class for model loading and prediction (not shown). All this is handled by Xcode, making it very easy for us to use the model in our app. We will start building that app in the following posts (bear with me, I promise we will get there).

Enjoy!

CoreML – Building the model for Boston Prices

In the last post we took a look at the Boston Prices dataset loaded directly from Scikit-learn. In this post we are going to build a linear regression model and convert it to a .mlmodel file to be used in an iOS app.

We are going to need some modules:

import coremltools
import pandas as pd
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split
from sklearn import metrics
import numpy as np

The coremltools module is the one that will enable the conversion so our model can be used in iOS.

Let us start by defining a main function to load the dataset:

def main():
    print('Starting up - Loading Boston dataset.')
    boston = datasets.load_boston()
    boston_df = pd.DataFrame(boston.data)
    boston_df.columns = boston.feature_names
    print(boston_df.columns)

In the code above we have loaded the dataset and created a pandas dataframe to hold the data and the names of the columns. As we mentioned in the previous post, we are going to use only the crime rate and the number of rooms to create our model:

    print("We now choose the features to be included in our model.")
    X = boston_df[['CRIM', 'RM']]
    y = boston.target

Please note that we are separating the target variable from the predictor variables. Although this dataset is not too large, we are going to follow best practice and split the data into training and testing sets:

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=7)

We will only use the training set in the creation of the model and will test with the remaining data points.

    my_model = glm_boston(X_train, y_train)

The line of code above assumes that we have defined the function glm_boston as follows:

def glm_boston(X, y):
    print("Implementing a simple linear regression.")
    lm = linear_model.LinearRegression()
    gml = lm.fit(X, y)
    return gml

Notice that we are using the LinearRegression implementation in Scikit-learn. Let us go back to the main function we are building and extract the coefficients for our linear model. Refer to the CoreML – Linear Regression post to recall that the type of model we are building is of the form y=\alpha + \beta_1 x_1 + \beta_2 x_2 + \epsilon:

    coefs = [my_model.intercept_, my_model.coef_]
    print("The intercept is {0}.".format(coefs[0]))
    print("The coefficients are {0}.".format(coefs[1]))

We can also take a look at some metrics that let us evaluate our model against the test data. Note that we first need to generate predictions for the test set:

    # generate predictions for the test set
    y_pred = my_model.predict(X_test)

    # calculate MAE, MSE, RMSE
    print("The mean absolute error is {0}.".format(
        metrics.mean_absolute_error(y_test, y_pred)))
    print("The mean squared error is {0}.".format(
        metrics.mean_squared_error(y_test, y_pred)))
    print("The root mean squared error is {0}.".format(
        np.sqrt(metrics.mean_squared_error(y_test, y_pred))))

CoreML conversion

And now for the big moment: We are going to convert our model to an .mlmodel object!! Ready?

    print("Let us now convert this model into a Core ML object:")
    # Convert model to Core ML
    coreml_model = coremltools.converters.sklearn.convert(my_model,
                                        input_features=["crime", "rooms"],
                                        output_feature_names="price")
    # Save Core ML Model
    coreml_model.save("PriceBoston.mlmodel")
    print("Done!")

We are using the sklearn.convert method of coremltools.converters to convert my_model, specifying the necessary inputs (i.e. crime and rooms) and output (price). Finally we save the model to a file named PriceBoston.mlmodel.
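
As an optional sanity check (not part of the original script), we could add a few lines at the end of main() to load the saved file back and compare its prediction with the one from Scikit-learn. Note that MLModel.predict only runs on macOS, where the Core ML framework is available, and the input values below are purely illustrative:

    # Optional: load the saved Core ML model back and compare predictions
    # (MLModel.predict requires macOS, since it calls into Core ML itself).
    saved_model = coremltools.models.MLModel("PriceBoston.mlmodel")
    sample = {"crime": 0.25, "rooms": 6.2}  # illustrative input values
    coreml_price = saved_model.predict(sample)["price"]
    sklearn_price = my_model.predict([[sample["crime"], sample["rooms"]]])[0]
    print("Core ML prediction: {0:.2f}".format(coreml_price))
    print("scikit-learn prediction: {0:.2f}".format(sklearn_price))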

Et voilà! In the next post we will start creating an iOS app to use the model we have just built.

You can look at the code (in development) on my GitHub site here.

CoreML – Boston Prices exploration

In the previous post of this series we described some of the basics of linear regression, one of the most well-known models in machine learning. We saw that we can relate the values of the input parameters x_i to the target variable y to be predicted. In this post we are going to create a linear regression model to predict the price of houses in Boston (based on valuations from the 1970s). The dataset provides information such as the per capita crime rate (CRIM), the proportion of non-retail business acres in the town (INDUS), the proportion of owner-occupied units built before 1940 (AGE) and the average number of rooms (RM), as well as the median value of homes in $1000s (MEDV) and other attributes.

Let us start by exploring the data. We are going to use Scikit-learn and fortunately the dataset comes with the module. The input variables are held in the data attribute and the prices are given by target. We are going to load the input variables into the dataframe boston_df and the prices into the array y:

from sklearn import datasets
import pandas as pd 
boston = datasets.load_boston() 
boston_df = pd.DataFrame(boston.data)
boston_df.columns = boston.feature_names
y = boston.target

We are going to build our model using only a limited number of inputs. In this case let us pay attention to the average number of rooms and the crime rate:

X = boston_df[['CRIM', 'RM']]
X.columns = ['Crime', 'Rooms']
X.describe()

The description of these two attributes is as follows:

            Crime       Rooms
count  506.000000  506.000000
mean     3.593761    6.284634
std      8.596783    0.702617
min      0.006320    3.561000
25%      0.082045    5.885500
50%      0.256510    6.208500
75%      3.647423    6.623500
max     88.976200    8.780000

As we can see, the minimum number of rooms is about 3.56 and the maximum is 8.78, whereas for the crime rate the minimum is 0.006 and the maximum is 88.98, although the median is only about 0.26. We will use some of these values to define the ranges offered to our users when asking for price predictions.

Finally, let us visualise the data:
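
The figure itself is not reproduced here, but a minimal sketch of how one might plot the two chosen predictors against the price (assuming matplotlib is installed) is:

import matplotlib.pyplot as plt

# Scatter each predictor against the median price
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].scatter(X['Crime'], y, alpha=0.5)
axes[0].set_xlabel('Crime rate per capita')
axes[0].set_ylabel('Median price ($1000s)')
axes[1].scatter(X['Rooms'], y, alpha=0.5)
axes[1].set_xlabel('Average number of rooms')
axes[1].set_ylabel('Median price ($1000s)')
plt.tight_layout()
plt.show()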

We shall bear these values in mind when building our regression model in subsequent posts.

You can look at the code (in development) on my GitHub site here.

What Is Artificial Intelligence?

Original article by JF Puget here.

Here is a question I was asked to discuss at a conference last month: what is Artificial Intelligence (AI)? Instead of trying to answer it, which could take days, I decided to focus on how AI has been defined over the years. Nowadays, most people probably equate AI with deep learning. This has not always been the case, as we shall see.

Most people say that AI was first defined as a research field in a 1956 workshop at Dartmouth College. The reality is that it had been defined six years earlier by Alan Turing in 1950. Let me cite Wikipedia here:

The Turing test, developed by Alan Turing in 1950, is a test of a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. Turing proposed that a human evaluator would judge natural language conversations between a human and a machine designed to generate human-like responses. The evaluator would be aware that one of the two partners in conversation is a machine, and all participants would be separated from one another. The conversation would be limited to a text-only channel such as a computer keyboard and screen so the result would not depend on the machine’s ability to render words as speech.[2] If the evaluator cannot reliably tell the machine from the human, the machine is said to have passed the test. The test does not check the ability to give correct answers to questions, only how closely answers resemble those a human would give.

The test was introduced by Turing in his paper, “Computing Machinery and Intelligence”, while working at the University of Manchester (Turing, 1950; p. 460).[3] It opens with the words: “I propose to consider the question, ‘Can machines think?'” Because “thinking” is difficult to define, Turing chooses to “replace the question by another, which is closely related to it and is expressed in relatively unambiguous words.”[4] Turing’s new question is: “Are there imaginable digital computers which would do well in the imitation game?”[5] This question, Turing believed, is one that can actually be answered. In the remainder of the paper, he argued against all the major objections to the proposition that “machines can think”.[6]


So, the first definition of AI was about thinking machines.  Turing decided to test thinking via a chat.

The definition of AI rapidly evolved to include the ability to perform complex reasoning and planning tasks. Early successes in the 50s led prominent researchers to make imprudent predictions about how AI would become a reality in the 60s. The failure of these predictions to materialize led to funding cuts known as the AI winter of the 70s.

In the early 80s, building on some success in medical diagnosis, AI came back with expert systems. These systems tried to capture the expertise of humans in various domains, and were implemented as rule-based systems. Those were the days when AI focused on the ability to perform tasks at the level of the best human experts. Successes like IBM's Deep Blue beating the chess world champion, Garry Kasparov, in 1997 were the acme of this line of AI research.

Let’s contrast this with today’s AI. The focus is on perception: can we have systems that recognize what is in a picture, what is in a video, what is said in a soundtrack? Rapid progress is underway on these tasks thanks to the use of deep learning. Is it still AI? Are we automating human thinking? The reality is that we are working on automating tasks that most humans can do without any thinking effort. Yet we see lots of bragging about AI being a reality when all we have is some ability to mimic human perception. I really find it ironic that our definition of intelligence has become one of mere perception rather than thinking.

Granted, not all AI work today is about perception. Work on natural language processing (e.g. translation) is a bit closer to reasoning than the mere perception tasks described above. Successes like IBM Watson at Jeopardy!, or Google's AlphaGo at Go, are two examples of traditional AI aiming to replicate tasks performed by human experts. The good news (to me at least) is that progress on perception is so rapid that it will move from a research field to an engineering field in the coming years. We will then see a re-positioning of researchers on other AI-related topics such as reasoning and planning. We’ll be closer to Turing’s initial view of AI.

Data Science & Augmented Intelligence – Reblog from “Data Science: a new discipline to change the world” by Alan Wilson

This is a reblog of the post by Alan Wilson that appeared in the EPSRC blog. You can see the original here.

====

Data science – the new kid on the block

I have re-badged myself several times in my research career: mathematician, theoretical physicist, economist (of sorts), geographer, city planner, complexity scientist, and now data scientist. This is partly personal idiosyncrasy but also a reflection of how new interdisciplinary research challenges emerge. I now have the privilege of being the Chief Executive of The Alan Turing Institute – the national centre for data science. ‘Data science’ is the new kid on the block. How come?

First, there is an enormous amount of new ‘big’ data; second, this has had a powerful impact on all the sciences; and thirdly, on society, the economy and our way of life. Data science represents these combinations. The data comes from wide-spread digitisation combined with the ‘open data’ initiatives of government and extensive deployment of sensors and devices such as mobile phones. This generates huge research opportunities.

In broad terms, data science has two main branches. First, what can we do with the data? Applications of statistics and machine learning fall under this branch. Second, how can we transform existing science with this data and these methods? Much of the second is rooted in mathematics. To make this work in practice, there is a time-consuming first step: making the data useable by combining different sources in different formats. This is known as ‘data wrangling’, which coincidentally is the subject of a new Turing research project to speed up this time-consuming process. The whole field is driven by the power of the computer, and computer science. Understanding the effects of data on society, and the ethical questions it provokes, is led by the social sciences.

All of this combines in the idea of artificial intelligence, or AI. While the ‘machine’ has not yet passed the ‘Turing test’ and cannot compete with humans in thought, in many applications AI and data science now support human decision making. The current buzz phrase for this is ‘augmented intelligence’.

Cross-disciplinary potential

I can illustrate the research potential of data science through two examples, the first from my own field of urban research; the second from medicine – with recent AI research in this field learned, no doubt imperfectly, from my Turing colleague Mihaela van der Schaar.

There is a long history of developing mathematical and computer models of cities. Data arrives very slowly for model calibration – the census, for example, is critical. A combination of open government data and real-time flows from mobile phones and social media networks has changed this situation: real-time calibration is now possible. This potentially transforms both the science and its application in city planning. Machine learning complements, and potentially integrates with, the models. Data science in this case adds to an existing deep knowledge base.

Medical diagnosis is also underpinned by existing knowledge – physiology, cell and molecular biology for example. It is a skilled business, interpreting symptoms and tests. This can be enhanced through data science techniques – beginning with advances in imaging and visualisation and then the application of machine learning to the variety of evidence available. The clinician can add his or her own judgement. Treatment plans follow. At this point, something really new kicks in. ‘Live’ data on patients, including their responses to treatment, becomes available. This data can be combined with personal data to derive clusters of ‘like’ patients, enabling the exploration of the effectiveness of different treatment plans for different types of patients. This combination of data science techniques and human decision making is an excellent example of augmented intelligence. This opens the way to personalised intelligent medicine, which is set to have a transformative effect on healthcare (for those interested in finding out more, reserve a place for Mihaela van der Schaar’s Turing Lecture on 4 May).

An exciting new agenda

These kinds of developments of data science, and the associated applications, are possible in almost all sectors of industry. It is the role of the Alan Turing Institute to explore both the fundamental science underpinnings, and the potential applications, of data science across this wide landscape.

We currently work in fields as diverse as digital engineering, defence and security, computer technology and finance as well as cities and health. This range will expand as this very new Institute grows. We will work with and through universities and with commercial, public and third sector partners, to generate and develop the fruits of data science. This is a challenging agenda but a hugely exciting one.