Data Science and Analytics with Python

A collection of posts related to my upcoming book “Data Science and Analytics with Python”. Take a look and enjoy.

Apple ML

Machine Learning with Apple - An Open Notebook

We all know how cool machine learning, predictive analytics and data science concepts and problems are. There are a number of really interesting technologies and frameworks to use and choose from. I have been a Python and R user for some time now and they seem to be pretty good for a lot of the things I have to do on a day-to-day basis.

As many of you know, I am also a Mac user and have been for quite a long time. I remember using early versions of Mathematica on PowerMacs back at Uni... I digress...


Apple has also been moving into the machine learning arena and has made available a few interesting goodies that help people like me make the most of the models we develop.

I am starting a series of posts that I hope can be seen as an "open notebook" of my experimentation and learning with Apple technology. One that comes to mind is Core ML, a new framework that natively supports running various machine learning and statistical models on macOS and iOS. The idea is that the framework helps data scientists and developers bridge the gap between them by making it easier to integrate trained models into our apps. Sounds cool, don't you think? Ready... Let's go!


Now... presenting at ODSC Europe

Data science is definitely on everyone’s lips and this time I had the opportunity to showcase some of my thoughts, practices and interests at the Open Data Science Conference in London.

The event was very well attended by data scientists, engineers and developers at all levels of seniority, as well as business stakeholders. I had the great opportunity to present the landscape that newcomers and seasoned practitioners must be familiar with to be able to make a successful transition into this exciting field.

It was also a great opportunity to showcase “Data Science and Analytics with Python” and to meet new people, including some who know other members of my family too.



Data Science and Analytics with Python - New York Team

Earlier this week I received this picture of the team in New York. As you can see, they have all recently received a copy of my "Data Science and Analytics with Python" book.

Thanks guys!



Another "Data Science and Analytics with Python" Delivered

Another copy of "Data Science and Analytics with Python" has been delivered. Thanks for sharing the picture, Dave Groves.


Data Science and Analytics - In the hands of readers!

I’m very pleased to see that my “Data Science and Analytics” book is arriving in the hands of readers.

Here’s a picture that my colleague and friend Rob Hickling sent earlier today:



Data Science and Analytics with Python already being suggested!

"Data Science and Analytics with Python" was published yesterday and now it is already appearing as a suggested book for related titles.

You can find it with the link above or on Amazon here.




"Data Science and Analytics with Python" is published

I am very pleased to see that my "Data Science and Analytics with Python" book has finally been published.


Final version of "Data Science and Analytics with Python" approved

It has been a long road, one filled with unicorns and Jackalopes, decision trees and random forests, variance and bias, cats and dogs, and targets and features.

Well over a year ago, the idea of writing another book seemed like a far-fetched proposition. Writing the book came about from the work that I have been doing in the area as well as from discussions with my colleagues and students, and with practitioners and beneficiaries of data science and analytics.

It is my sincere hope that the book is useful to those coming afresh to this new field as well as to those more seasoned data scientists.

This afternoon I had the pleasure of approving the final version of the book that will be sent to the printers in the next few days.

Once the book is available you can get a copy directly from CRC Press or from Amazon.




Data Science and Analytics with Python - Cover

Well, I am very pleased to show you the cover that will be used for my "Data Science and Analytics with Python" book. Not long to publication day!


Data Science and Analytics with Python - Proofread Manuscript

I have now received comments and corrections for the proofreading of my “Data Science and Analytics with Python” book.

Two weeks and counting to return corrections and comments to the editor and project manager.



Anaconda - Guaranteed Python packages via Conda and Conda-Forge

During the weekend a member of the team got in touch because he was unable to get a Python package working. He had just installed Python on his machine, but things were not quite right... For example, pip was not working and he had a bit of bother setting some environment variables... I recommended that he have a look at installing Python via the Anaconda distribution. Today he was up and running with his app.

Given that outcome, I thought it was a great coincidence that the latest episode of Talk Python To Me that started playing on my way back home happened to be about Conda and Conda-Forge. I highly recommend listening to it. Take a look:

Talk Python To Me - Python conversations for passionate developers - #94 Guaranteed packages via Conda and Conda-Forge

Have you ever had trouble installing a package you wanted to use in your Python app? Likely it contained some odd dependency, required a compilation step, maybe even using an uncommon compiler like Fortran. Did you try it on Windows? How many times have you seen "Cannot find vcvarsall.bat" before you had to take a walk?

If this sounds familiar, you might want to check out conda (the package manager), Anaconda (the distribution), conda-forge, and conda-build. They dramatically lower the bar for installing packages on all the platforms.

This week you'll meet Phil Elson, Kale Franz, and Michael Sarahan who all work on various parts of this ecosystem.

Links from the show:

Anaconda distribution:

Phil Elson on Twitter: @pypelson
Kale Franz: @kalefranz
Michael Sarahan:


Data Analytics Python

"Data Science and Analytics with Python" enters production


I am very pleased to tell you about some news I received a couple of weeks ago from my editor: my book "Data Science and Analytics with Python" has been transferred to the production department so that they can begin the publication process!

The book has been assigned a Project Editor who will handle the proofreading and all other aspects of the production process. This was after clearing the review process I told you about some time ago. The review was lengthy but it was very positive and the comments of the reviewers have definitely improved the manuscript.

As a result of the review, the table of contents has changed a bit since the last update I posted. Here is the revised table:

  1. The Trials and Tribulations of a Data Scientist
  2. Python: For Something Completely Different!
  3. The Machine that Goes “Ping”: Machine Learning and Pattern Recognition
  4. The Relationship Conundrum: Regression
  5. Jackalopes and Hares: Clustering
  6. Unicorns and Horses: Classification
  7. Decisions, Decisions: Hierarchical Clustering, Decision Trees and Ensemble Techniques
  8. Less is More: Dimensionality Reduction
  9. Kernel Trick Under the Sleeve: Support Vector Machines

Each of the chapters is intended to be sufficiently self-contained. There are some occasions where reference to other sections is needed, which I am confident is a good thing for the reader. Chapter 1 is effectively a discussion of what data science and analytics are, paying particular attention to the data exploration process and munging. It also offers my perspective as to what skills and roles are required to build a successful data science function.

Chapter 2 is a quick reminder of some of the most important features of Python. In Chapter 3 we then move into the core machine learning concepts that are used in the rest of the book. Chapter 4 covers regression, from ordinary least squares to LASSO and ridge regression. Chapter 5 covers clustering (k-means for example) and Chapter 6 covers classification algorithms such as Logistic Regression and Naïve Bayes.
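To give a flavour of the step from ordinary least squares to ridge and LASSO regression, here is a minimal sketch that is not taken from the book. It assumes a single feature and an unpenalised intercept, so each fit has a closed form. The function name fit_line, the toy data and the penalty values are all made up for illustration.

```python
# Illustrative sketch only: one-feature OLS, ridge and LASSO fits in closed form.
# The function names, data and penalty values are assumptions for this example.

def _centred_sums(xs, ys):
    """Return the means of x and y plus the centred sums Sxx and Sxy."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return x_bar, y_bar, sxx, sxy


def fit_line(xs, ys, penalty="none", lam=0.0):
    """Fit y = intercept + slope * x, leaving the intercept unpenalised.

    penalty="none":  minimise the residual sum of squares (OLS)
    penalty="ridge": add lam * slope**2 to that objective
    penalty="lasso": add lam * abs(slope) to that objective
    """
    x_bar, y_bar, sxx, sxy = _centred_sums(xs, ys)
    if penalty == "none":
        slope = sxy / sxx
    elif penalty == "ridge":
        # The squared penalty shrinks the slope smoothly towards zero.
        slope = sxy / (sxx + lam)
    elif penalty == "lasso":
        # Soft-thresholding: the slope becomes exactly zero once lam is large.
        shrunk = max(abs(sxy) - lam / 2.0, 0.0)
        slope = (shrunk if sxy >= 0 else -shrunk) / sxx
    else:
        raise ValueError(f"unknown penalty: {penalty!r}")
    intercept = y_bar - slope * x_bar
    return intercept, slope


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
for name, lam in [("none", 0.0), ("ridge", 10.0), ("lasso", 10.0), ("lasso", 100.0)]:
    intercept, slope = fit_line(x, y, penalty=name, lam=lam)
    print(f"{name:5s} lam={lam:6.1f} intercept={intercept:.3f} slope={slope:.3f}")
```

The thing to notice is the difference in behaviour: ridge only ever scales the slope down, whereas LASSO can switch the feature off altogether when the penalty is large enough.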

In Chapter 7 we introduce the use of hierarchical clustering, decision trees and talk about ensemble techniques such as bagging and boosting.
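As a rough illustration of bagging, and not the book's own code, the sketch below trains very simple one-split "decision stumps" on bootstrap resamples of a toy dataset and combines them by majority vote. The names fit_stump, bagged_stumps and predict, the data, the number of models and the random seed are all assumptions made for this example.

```python
import random

# Illustrative sketch only: bagging (bootstrap aggregation) of decision stumps
# on one feature. All names, data and settings are assumptions for this example.


def fit_stump(xs, ys):
    """Return the threshold t that minimises errors for 'predict 1 if x > t'."""
    # Include a value below the smallest x so "always predict 1" is possible.
    candidates = [min(xs) - 1] + sorted(set(xs))

    def errors(t):
        return sum((x > t) != bool(y) for x, y in zip(xs, ys))

    return min(candidates, key=errors)


def bagged_stumps(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds


def predict(thresholds, x):
    """Majority vote across all the stumps."""
    votes = sum(x > t for t in thresholds)
    return 1 if 2 * votes > len(thresholds) else 0


xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = bagged_stumps(xs, ys)
print([predict(model, x) for x in (2, 8)])
```

Each stump sees a slightly different resample, so the individual thresholds vary; the vote averages that variation out, which is the point of bagging. Boosting, by contrast, trains its models sequentially and is not shown here.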

Dimensionality reduction techniques such as Principal Component Analysis are discussed in Chapter 8, and Chapter 9 covers the support vector machine algorithm and the all-important kernel trick in applications such as regression and classification.
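For a sense of what Principal Component Analysis does, here is a small sketch, again not taken from the book. It assumes two-dimensional data, which lets the eigenvalues of the 2x2 covariance matrix be written in closed form without any linear algebra library. The function name pca_2d and the sample points are made up for illustration.

```python
import math

# Illustrative sketch only: PCA for two-dimensional points using the closed-form
# eigen-decomposition of the 2x2 sample covariance matrix [[a, b], [b, c]].


def pca_2d(points):
    """Return (first principal direction, 1-D scores, explained variance ratio)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the sample covariance matrix.
    a = sum(u * u for u, _ in centred) / (n - 1)
    c = sum(v * v for _, v in centred) / (n - 1)
    b = sum(u * v for u, v in centred) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix.
    mid = (a + c) / 2.0
    spread = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    lam1, lam2 = mid + spread, mid - spread

    # Eigenvector belonging to the larger eigenvalue.
    if b != 0:
        vx, vy = b, lam1 - a
    elif a >= c:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm

    # Project each centred point onto the first principal direction.
    scores = [u * vx + v * vy for u, v in centred]
    total = lam1 + lam2
    ratio = lam1 / total if total > 0 else 0.0
    return (vx, vy), scores, ratio


sample = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 3.8), (5.0, 5.1)]
direction, scores, ratio = pca_2d(sample)
print(direction, ratio)
```

The explained variance ratio tells you how much of the spread in the data survives when two columns are replaced by the single column of scores.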

The book contains 55 figures and 18 tables, plus plenty of bits and pieces of Python code to play with.

I guess I will have to sit and wait for the proofreading to be completed and then start the arduous process of going through the comments and suggestions. As ever, I will keep you posted as to how things go.

Ah! By the way, I will start a mailing list to tell people when the book is ready, so if you are interested, please let me know!

Keep in touch!

PS. The table of contents is also now available at CRC Press here.


Artificial Intelligence, Revealed

A few weeks ago I was invited by General Assembly to give a short intro to Data Science to a group of interested (and interesting) students. They all had different backgrounds, but they all shared an interest in technology and related subjects.

While I was explaining some of the differences between supervised and unsupervised machine learning, I used my example of an alien life form trying to cluster (and eventually classify) cats and dogs. If you are interested in knowing more about this, you will probably have to wait for the publication of my "Data Science and Analytics with Python" book... I digress...

So, Ed Shipley - one of the admissions managers at GA London - asked me and the students if we had seen the videos that Facebook had produced to explain machine learning... He was reminded of them as they use an example about a machine distinguishing between dogs and cars... (see what they did there?...). If you haven't seen the videos, here you go:

Intro to AI

Machine Learning

Convolutional Neural Nets


