Data Analytics Python

Data Science and Analytics with Python

A collection of post related to my upcoming book “Data Science and Analytics with Python”Take a look and enjoy.

CoreML - Boston Model: The Complete App

Look how far we have come... We started this series by looking at what CoreML is and made sure that our environment was suitable. We decided to use linear regression as our model, and chose to use the Boston Price dataset in our exploration for this implementation. We built our model using Python and created our .mlmodel object and had a quick exploration of the model's properties. We then started to build our app using Xcode (see Part 1, Part 2 and Part 3). In this final part we are going to take the .mlmodel and include it in out Xcode project, we will then use the inputs selected from out picker and calculate a prediction (based on our model) to be displayed to the user. Are you ready? Nu kör vi!

Let us start by adding the .mlmodel we created earlier on so that it is an available resource in our project. Open your Xcode project and locate your PriceBoston.mlmodel file. From the menu on the left-hand side select the "BostonPricer" folder. At the bottom of the window you will see a + sign, click on it and select "New Groups". This will create a sub-folder within "BostonPricer". Select the new folder and hit the return key, this will let you rename the folder to something more useful. In this case I am going to call this folder "Resources".

Open Finder and navigate to the location of your BostonPricer.mlmodel. Click and drag the file inside the "Resources" folder we just created. This will open a dialogue box asking for some options for adding this file to your project. I selected the "Create Folder References" and left the rest as it was shown by default. After hitting "Finish" you will see your model now being part of your project. Let's now go the code in ViewController and make some needed changes.  The first one is to tell our project that we are going to need the powers of the CoreML framework. At the top of the file, locate a line of code that imports UIKit, right below it type the following:

import CoreML

Inside the definition of the ViewController class, let us define a constant to reference the model. Look for the definitions of the crimeData and roomData constants and nearby them type the following:

let model = PriceBoston()

You will see that when you start typing the name of the model, Xcode will suggest the right name as it knows about the existence of the model as part of its resources, neat!

We need to make some changes to the getPrediction()function we created in the last post. Go to the function and look for place where we pick the values of crime and rooms and right after that write the following:

guard let priceBostonOutput = try? model.prediction(
            rooms: Double(rooms)
            ) else {
                fatalError("Unexpected runtime error.")

You may get a warning telling you that the constant priceBostonOutput was defined but not used. Don't worry, we will indeed use it in a little while. Just a couple of words about this piece of code, you will see that we are using the prediction method defined in the model and that we are passing the two input parameters that the model expects, namely crime and rooms. We are wrapping this call to the prediction method around a try statement so that we can catch any exceptions. This is where we are implementing our CoreML mode!!! Isn't that cool‽

We are not done yet though; remember that we have that warning from Xcode about using the model. Looking at the properties of the model, we can see that we also have an output attribute called price. This is the prediction we are looking for and the one we would like to display. Out of the box it may have a lot of decimal figures, and it is never a good practice to display those to the user (although they are important in precision terms...). Also, with Swift's strong typing we would have to typecast the double returned by the model into a string that can be printed. So, let us prepare some code to format the predicted price. At the top of the ViewController class, find the place where we defined the constants crimeData and roomData. Below them type the following code:

let priceFormat: NumberFormatter = {
        let formatting = NumberFormatter()
        formatting.numberStyle = .currency
        formatting.maximumFractionDigits = 2
        formatting.locale = Locale(identifier: "en_US")
        return formatting

We are defining a format that will show a number as currency in US dollars with two decimal figures. We can now pass our predicted price to this formatter and assign it to a new constant for future reference. Below the code where the getPrediction function was defined, write the following:

let priceText = priceFormat.string(from: NSNumber(value:

Now we have a nicely formatted string that can be used in the display. Let us change the message that we are asking our app to show when pressing the button:

let message = "The predicted price (in $1,000s) is " + priceText!

We are done! Launch your app simulator, select a couple of values from the picker and hit the "Calculate Prediction" button... Et voilà, we have completed our first implementation of a CoreML model in a working app.

There are many more things that we can do to improve the app. For instance, we can impose some constraints on the position of the different elements shown in the screen so that we can deploy the application in the various screen sizes offered by Apple devices. Improve the design and usability of the app and designing appropriate icons for the app (in various sizes). For the time being, I will leave some of those tasks for later. In the meantime you can take a look at the final code in my github site here.

Enjoy and do keep in touch, I would love to hear if you have found this series useful.



CoreML - iOS Implementation for the Boston Model (part 3) - Button

We are very close at getting a functioning app for our Boston Model. In the last post we were able to put together the code that fills in the values in the picker and were able to "pick" the values shown for crime rate and number of rooms respectively. These values are fed to the model we built in one of the earlier posts of this series and the idea is that we will action this via a button that triggers the calculation of the prediction. In turn the prediction will be shown in a floating dialogue box.

In this post we are going to activate the functionality of the button and show the user the values that have been picked. With this we will be ready to weave in the CoreML model in the final post of this series. So, what are we waiting for? Let us launch Xcode and get working. We have already done a bit of work for the button in the previous post where we connected the button to the ViewController generating a line of code that read as follows:

@IBOutlet weak var predictButton: UIButton!

If we launch the application and click on the button, sadly, nothing will happen. Let's change that: in the definition of the UIViewController class, after the didReceiveMemoryWarning function write the following piece of code:

@IBAction func getPrediction() {
        let selectedCrimeRow = inputPicker.selectedRow(inComponent: inputPredictor.crime.rawValue)
        let crime = crimeData[selectedCrimeRow]

        let selectedRoomRow = inputPicker.selectedRow(inComponent: inputPredictor.rooms.rawValue)
        let rooms = roomData[selectedRoomRow]

        let message = "The picked values are Crime: \(crime) and Rooms: \(rooms)"

        let alert = UIAlertController(title: "Values Picked",
                                      message: message,
                                      preferredStyle: .alert)

        let action = UIAlertAction(title: "OK", style: .default,
                                   handler: nil)

        present(alert, animated: true, completion: nil)

The first four lines of the getPrediction function takes the values from the picker and creates some constants for crime and rooms that will then be used in a message to be displayed in the application. We are telling Xcode to treat this message as an alert and ask it to present it to the user (last line in the code above). What we need to do now is tell Xcode that this function is to be triggered when we click on the button.

There are several way we can connect the button with the code above. In this case we are going to go to the Main.storyboard, control+click on the button and drag. This will show an arrow, we need to connect that arrow with the View Controller icon (a yellow circle with a white square inside) at the top of the view controller window we are putting together. When you let go, you will see a drop-down menu. From there, under "Sent Events" select the function we created above, namely getPrediction. See the screenshots below:

You can now run the application. Select a number from each of the columns in the picker, and when ready, prepare to be amazed: Click on the "Calculate Prediction" button, et voilà - you will see a new window telling you the values you have just picked. Tap "OK" and start again!

In the next post we will add the CoreML model, and modify the event for the button to take the two values picked and calculate a prediction which in turn will be shown in the floating window. Stay tuned.

You can look at the code (in development) in my github site here.


nteract - a great Notebook experience

I am a supporter of using Jupyter Notebooks for data exploration and code prototyping. It is a great way to start writing code and immediately get interactive feedback. Not only can you document your code there using markdown, but also you can embed images, plots, links and bring your work to life.

Nonetheless, there are some little annoyances that I have, for instance the fact that I need to launch a Kernel to open a file and having to do that "the long way" - i.e. I cannot double-click on the file that I am interested in seeing. Some ways to overcome this include looking at Gihub versions of my code as the notebooks are rendered automatically, or even saving HTML or PDF versions of the notebooks. I am sure some of you may have similar solutions for this.

Last week, while looking for entries on something completely different, I stumbled upon a post that suggested using nteract. It sounded promising and I took a look. It turned out to be related to the Hydrogen package available for Atom, something I have used in the past and loved it. nteract was different though as it offered a desktop version and other goodies such as in-app support for publishing, a terminal-free experience sticky cells, input and output hiding... Bring it on!

I just started using it, and so far so good. You may want to give it a try, and maybe even contribute to the git repo.



Intro to Data Science Talk

Full room and great audience at General Assembly his evening. Lots of thoughtful questions and good discussion.


Apple ML

Machine Learning with Apple - An Open Notebook

We all know how cool machine learning, predictive analytics and data science concepts and problems are. There are a number of really interesting technologies and frameworks to use and choose from. I have been a Python and R user for some time now and they seem to be pretty good for a lot of the things I have to do on a day-to-day basis.

As many of you know, I am also a mac user and have been for quite a lot time. I remember using early versions of Mathematica on PowerMacs back at Uni... I digress..


Apple has also been moving into the machine learning arena and has made available a few interesting goodies that help people like me make the most of the models we develop.

I am starting a series of posts that I hope can be seen as an "open notebook" of my experimentation and learning with Apple technology. One that comes to mind is CoreML, a new framework that makes running various machine learning and statistical models on macOS and iOS natively supported. The idea is that the framework helps data scientists and developers bridge the gap between them by integrating trained models into our apps. Sounds cool, don't you think? Ready... Let's go!


Now... presenting at ODSC Europe

Data science is definitely in everyone’s lips and this time I had the opportunity of showcasing some of my thoughts, practices and interests at the Open Data Science Conference in London.

The event was very well attended by data scientists, engineers and developers at all levels of seniority, as well as business stakeholders. I had the great opportunity to present the landscape that newcomers and seasoned practitioners must be familiar with to be able to make a successful transition into this exciting field.

It was also a great opportunity to showcase “Data Science and Analytics with Python” and to get to meet new people including some that know other members of my family too.



Data Science and Analytics with Python - New York Team

Earlier this week I received this picture of the team in New York. As you can see they have recently all received a copy of my "Data Science and Analytics with Python" book.

Thanks guys!



Another "Data Science and Analytics with Python" Delivered

Another "Data Science and Analytics with Python" Delivered. Thanks for sharing the picture Dave Groves.


Data Science and Analytics - In the hands of readers!

I’m very pleased to see that my “Data Science and Analytics” book is arriving to the hands of readers.

Here’s a picture that my colleague and friend Rob Hickling sent earlier today:



Data Science and Analytics with Python already being suggested!

"Data Science and Analytics with Python" was published yesterday and now it is already appearing as a suggested book for related titles.

You can find it with the link above or in Amazon here.




"Data Science and Analytics with Python" is published

Very pleased to see that finally the publication of my "Data Science and Analytics with Python" book has arrived.


Final version of "Data Science and Analytics with Python" approved

It has been a long road, one filled with unicorns and Jackalopes, decision trees and random forests, variance and bias, cats and dogs, and targets and features.

Well over a year ago, the idea of writing another book seemed like a farfetched proposition. Writing the book came about from the work that I have been doing in the area as well as from discussions with my colleagues and students, including also practitioners and beneficiaries of data science and analytics.

It is my sincere hope that the book is useful to those coming afresh to this new field as well as to those more seasoned data scientists.

This afternoon I had the pleasure of approving the final version of the book that will be sent to the printers in the next few days.

Once the book is available you can get a copy directly with CRC Press or from Amazon.




Data Science and Analytics with Python - Cover

Well, I am very pleased to show you the cover that will be used for "Data Science and Analytics with Python" book. Not long to publication day!



%d bloggers like this: