A collection of Data Science and Data Visualisation related posts, pics and thoughts. Take a look and enjoy.
Now Reading: Dark Data by David Hand
I first came across this book in a feature note in the Summer 2020 issue of Imperial, the magazine for the Imperial College community.
It sounded like an interesting read, so I looked for the Princeton University Press edition and, to my surprise, found a version in Italian published by Rizzoli a few months earlier... I wonder how that worked out. It was cheaper, and I was tempted to give it a go in Italian under the title Il tradimento dei numeri (i.e. “The betrayal of the numbers”...). I wonder what hidden story is behind all this...
In the end I decided to go for the English version... Let’s see how it goes.
David Hand is emeritus professor of mathematics at Imperial College London, a former president of the Royal Statistical Society, and a Fellow of the British Academy.
There is a website dedicated to the book: https://darkdata.website
I recently had the opportunity to be one of the panellists on the Data Skeptic podcast. It was great to be invited and, as a listener of the podcast, it was a real treat to be able to take part. Also, recording it was fun...
You can listen to the episode here.
In the episode Kyle talks about the relationship between Covid-19 and Carbon Emissions. George tells us about the new Hateful Memes Challenge from Facebook. Lan joins us to talk about Google's AI Explorables. I talk about a paper that uses neural networks to detect infections in the ear.
Let me know what you guys think!
I was working today on the deployment of a small neural network model prototype converted to Core ML to be used in an iPhone app.
I was trying to find the best way to get things to work and then it occurred to me I had solved a similar issue before... where‽ when‽ aha!
The answer was actually in my Advanced Data Science and Analytics with Python.
With the lockdown and social distancing rules forcing all of us to adjust our calendars, events and even lesson plans and lectures, I was not surprised to hear of speaking opportunities that might not otherwise have arisen.
A great example is the reprise of a talk I gave about a year ago while visiting Mexico. It was a great opportunity to talk to Social Science students at the Political Science Faculty of the Universidad Autónoma del Estado de México. The subject was open but had to cover the use of technology and I thought that talking about the use of natural language processing in terms of digital humanities would be a winner. And it was...
In March this year I was approached by the Faculty to re-run the talk, but this time instead of doing it face to face we would use a teleconference room. Not only was I, the speaker, talking from the comfort of my own living room, but all the attendees would be at home too. Furthermore, some of the students might not have access to the live presentation (lack of broadband, equipment, etc.), and recording the session for later viewing was the best option for them.
I didn’t hesitate in saying yes, and I enjoyed the interaction a lot. Today I learnt that the session was the focus of a small note in a local newspaper. The session was run in Spanish and the note in Portal, the local newspaper, is in Spanish too. I really liked that they picked a line I used in the session to convince the students that technology is not just for the natural sciences:
“Hay que hacer ciencias sociales con técnicas del Siglo XXI... El mundo es de los geeks.”
“We should study social sciences applying techniques of the 21st Century. The world today belongs to us, the geeks.”
The point is that although qualitative and quantitative techniques are widely used in social science, the use of new platforms and even programming languages such as Python opens up opportunities for social scientists too.
The talk is available on the blog the class uses to share their discussions: The Share Knowledge Network - Follow this link for the talk.
The newspaper article by Ximena Barragán can be found here.
It is official! "Advanced Data Science and Analytics with Python" is published!
According to the information I had received from CRC Press, my publisher, the book would be available on May 7th. According to the official page of the book, the volume has been available since May 5th.
Looking forward to hearing what you think of the book.
I am reaching out as volume 2 of my data science book will be out for publication in May and my publisher has made it possible for me to offer 20% off. You can order the book here.
This follows from "Data Science and Analytics with Python" and both books are intended for practitioners in data science and data analytics in both academic and business environments.
The new book aims to present the reader with concepts in data science and analytics that were deemed more advanced, or simply out of scope, in the author's first book, using tools developed in Python such as SciKit Learn, Pandas, Numpy, etc. The use of Python is of particular benefit given its recent popularity in the data science community. The book is therefore a reference to be used by seasoned programmers and newcomers alike, and its key benefit is the practical approach presented throughout.
More information about the first book can be found here.
With all the changes that have taken place in the last couple of weeks, I was thinking of the support that we can provide to each other while keeping to the new ways of working around us. Working from home is nothing new for some, but it is for many. Socialising is an important part of the human experience.
I therefore thought of putting an open invite for a virtual coffee to the data science/physics/maths community dealing with the new ways of working, business, mental health and general stuff:
The response was great and I promptly created a new page in this site dedicated to some information for the new Jackalope Data Science Community. The first call took place on March 26th, 6.30pm via Meet. There were about 12 attendees mainly from the UK, with some from Cyprus, the US and other places around the world.
It was great to see so many friends there and the chat ranged from how to distinguish between weekdays and weekends these days, to how we are coping with working from home and how companies and businesses are reacting. It was entertaining, and personally I found it very useful.
We are planning to get together again in a couple of weeks. If you are interested in joining us and learning more about the Jackalope Data Science Community, get in touch.
Well, these are the final corrections for my latest book "Advanced Data Science and Analytics with Python". Next stop publication!
Super excited to have received the proofread version of Advanced Data Science and Analytics with Python. They all seem to be very straightforward corrections: a few missing commas, some italics here and there and capitalisation bits and bobs.
I hope to be able to finish the corrections before my deadline of March 25th, and then enter the last phase before publication in May 2020.
I have received the latest information about the status of my book “Advanced Data Science and Analytics with Python”. This time I am reviewing the latest cover drafts for the book.
This is currently my favourite one.
Awaiting the proofreading comments, and I hope to update you about that soon.
If you are interested in #DataScience you have surely heard of #pandas, and you will be pleased to hear that version 1.0 is finally out, with better integration with NumPy and improvements with Numba, among others. Take a look!
— Read on www.anaconda.com/pandas-1-0-is-here/
It was great to be invited to give the joint Physics, Astronomy and Maths + Computer Science research seminar today at the University of Hertfordshire. I had a good opportunity to catch up with old colleagues and meet new faculty. There were also many students, and they had many questions.
I was glad to hear they are thinking about offering more data science courses and even a dedicated programme. I would definitely be interested to hear more about that.
There you go, the first checkpoint is completed: I have officially submitted the completed version of "Advanced Data Science and Analytics with Python".
The book has been some time in the making (and in the thinking...). It is a follow-up to my previous book, imaginatively called "Data Science and Analytics with Python". The book covers aspects that were necessarily left out in the previous volume; however, the readers in mind are still technical people interested in moving into the data science and analytics world. I have tried to keep the same tone as in the first book, peppering the pages with some bits and bobs of popular culture, science fiction and indeed Monty Python puns.
Advanced Data Science and Analytics with Python enables data scientists to continue developing their skills and apply them in business as well as academic settings. The subjects discussed in this book are complementary to, and a follow-up to, the topics discussed in Data Science and Analytics with Python. The aim is to cover important advanced areas in data science using tools developed in Python such as SciKit-learn, Pandas, Numpy, Beautiful Soup, NLTK, NetworkX and others. The development is also supported by the use of frameworks such as Keras, TensorFlow and Core ML, as well as Swift for the development of iOS and macOS applications.
The book can be read independently from the previous volume, and each of the chapters in this volume is sufficiently independent from the others, providing flexibility for the reader. Each of the topics addressed in the book tackles the data science workflow from a practical perspective, concentrating on the process and results obtained. The implementation and deployment of trained models are central to the book.
Time series analysis, natural language processing, topic modelling, social network analysis, neural networks and deep learning are comprehensively covered in the book. The book discusses the need to develop data products and tackles the subject of bringing models to their intended audiences. In this case, literally to the users' fingertips in the form of an iPhone app.
While the book is still in the oven, you may want to take a look at the first volume. You can get your copy here:
Furthermore you can see my Author profile here.
It was a pleasure to come to the opening day of ODSC Europe 2019. This time round I was the first speaker of the first session, and it was very apt as the talk was effectively an introduction to Data Science.
The next 4 days will be very hectic for the attendees, and if the quality is similar to the previous editions we are going to have a great time.
Last October I had the great opportunity to come and give a talk at the Facultad de Ciencias Políticas, UAEM, México. The main audience were students of the qualitative analysis methods course, but there were also people from informatics and systems engineering.
It was an opportunity to showcase some of the advances that natural language processing offers to social scientists interested in analysing discourse, from politics through to social interactions.
The talk covered an introduction to and a brief history of the field. We went through the different stages of the analysis: reading the data, obtaining tokens, labelling their part of speech (POS), and then looking at syntactic and semantic analysis.
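To give a flavour of the first stages, here is a minimal sketch of tokenising a sentence and labelling each token with a part of speech. This is not the code used in the talk: the `TOY_LEXICON` dictionary and the `tokenize` and `tag` helpers are made up for illustration, and a real analysis would rely on a tagger trained on an annotated corpus (for example via a library such as NLTK) rather than a hand-written lookup table.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon, for illustration only. A real POS tagger
# learns these labels from an annotated corpus.
TOY_LEXICON = {
    "the": "DET",
    "a": "DET",
    "world": "NOUN",
    "geeks": "NOUN",
    "belongs": "VERB",
    "to": "ADP",
    "us": "PRON",
}


def tokenize(text):
    """Lower-case the text and return its runs of letters as tokens."""
    # [^\W\d_]+ matches Unicode letters only, so accented words survive.
    return re.findall(r"[^\W\d_]+", text.lower())


def tag(tokens, lexicon=TOY_LEXICON, default="X"):
    """Pair each token with its label, or `default` if it is unknown."""
    return [(token, lexicon.get(token, default)) for token in tokens]


text = "The world belongs to us, the geeks."
tokens = tokenize(text)      # stage 1: obtain tokens
tagged = tag(tokens)         # stage 2: label part of speech
tag_counts = Counter(label for _, label in tagged)

for token, label in tagged:
    print(f"{token:10s}{label}")
```

Counting the labels, as in `tag_counts`, is the kind of simple summary that can then feed the later syntactic and semantic stages.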
We finished the session with a couple of demos. One looking at speeches of Clinton and Trump during their presidential campaigns; the other one was a simple analysis of a novel in Spanish.
Thanks for the invite.