A collection of Data Science and Data Visualisation related posts, pics and thoughts. Take a look and enjoy.
There you go, the first checkpoint is completed: I have officially submitted the completed version of "Advanced Data Science and Analytics with Python".
The book has been some time in the making (and in the thinking...). It is a follow-up to my previous book, imaginatively called "Data Science and Analytics with Python". The book covers aspects that were necessarily left out of the previous volume; however, the readers in mind are still technical people interested in moving into the data science and analytics world. I have tried to keep the same tone as in the first book, peppering the pages with some bits and bobs of popular culture, science fiction and indeed Monty Python puns.
Advanced Data Science and Analytics with Python enables data scientists to continue developing their skills and apply them in business as well as academic settings. The subjects discussed in this book are complementary and a follow-up to the topics discussed in Data Science and Analytics with Python. The aim is to cover important advanced areas in data science using tools developed in Python such as scikit-learn, pandas, NumPy, Beautiful Soup, NLTK, NetworkX and others. The development is also supported by the use of frameworks such as Keras, TensorFlow and Core ML, as well as Swift for the development of iOS and macOS applications.
The book can be read independently from the previous volume, and each of the chapters in this volume is sufficiently independent from the others, providing flexibility for the reader. Each of the topics addressed in the book tackles the data science workflow from a practical perspective, concentrating on the process and results obtained. The implementation and deployment of trained models are central to the book.
Time series analysis, natural language processing, topic modelling, social network analysis, neural networks and deep learning are comprehensively covered in the book. The book discusses the need to develop data products and tackles the subject of bringing models to their intended audiences. In this case, literally to the users' fingertips in the form of an iPhone app.
While the book is still in the oven, you may want to take a look at the first volume. You can get your copy here:
Furthermore you can see my Author profile here.
It was a pleasure to come to the opening day of ODSC Europe 2019. This time round I was the first speaker of the first session, and it was very apt as the talk was effectively an introduction to Data Science.
The next 4 days will be very hectic for the attendees, and if the quality is similar to the previous editions we are going to have a great time.
Last October I had the great opportunity to come and give a talk at the Facultad de Ciencias Políticas, UAEM, México. The main audience were students of the qualitative analysis methods course, but there were also people from informatics and systems engineering.
It was an opportunity to showcase some of the advances that natural language processing offers to social scientists interested in analysing discourse, from politics through to social interactions.
The talk covered an introduction to and brief history of the field. We went through the different stages of the analysis, from reading the data, obtaining tokens and labelling their part of speech (POS), to looking at syntactic and semantic analysis.
We finished the session with a couple of demos. One looked at speeches of Clinton and Trump during their presidential campaigns; the other was a simple analysis of a novel in Spanish.
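For readers who want a feel for the first of those stages, here is a minimal sketch of reading text, obtaining tokens and counting them. It uses only the Python standard library; the sample sentence, the `tokenize` helper and the token pattern are my own illustrative assumptions and not the material from the talk. POS tagging and the later syntactic and semantic steps would need a dedicated library such as NLTK and are not shown here.

```python
import re
from collections import Counter

# Assumed token pattern: runs of letters (including accented ones),
# optionally followed by an apostrophe and more letters ("people's").
TOKEN_PATTERN = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def tokenize(text):
    """Split raw text into lowercase word tokens (hypothetical helper)."""
    return [match.group(0).lower() for match in TOKEN_PATTERN.finditer(text)]


# Made-up sample text standing in for a speech or a novel
sample = "The people want change. The people's voice matters, and change takes time."

tokens = tokenize(sample)
counts = Counter(tokens)

print(tokens)
print(counts.most_common(3))
```

Because the pattern relies on Unicode letter classes, the same helper copes with accented Spanish words without changes.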
Thanks for the invite.
It has been a few months of writing, testing, re-writing and starting again, and I am pleased to say that the first complete draft of "Advanced Data Science and Analytics with Python" is ready. Last chapter is done and starting revisions now. Yay!
I know there are a ton of posts out there covering this very topic. I am writing this post more for my own benefit, so that I have a reliable place to check the commands I need to add a new conda environment to my Jupyter and nteract IDEs.
First, to create an environment that contains, say, TensorFlow, Pillow, Keras and pandas, we need to type the following in the command line:
$ conda create -n tensorflow_env tensorflow pillow keras pandas jupyter ipykernel nb_conda
Now, to add this to the list of available environments in either Jupyter or nteract, we type the following:
$ conda activate tensorflow_env
$ python -m ipykernel install --name tensorflow_env
$ conda deactivate
Et voilà, you should now see the environment in the dropdown menu!
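Once the environment has been picked from the dropdown, a quick check from inside a notebook cell can confirm which interpreter the kernel is actually running. The sketch below is only an illustration using the standard library; the `describe_kernel_environment` helper is a name I made up, and the `CONDA_DEFAULT_ENV` variable is set by `conda activate`, so it may well be missing when the kernel is launched directly from its kernelspec.

```python
import os
import sys


def describe_kernel_environment():
    """Return details about the interpreter behind the running kernel."""
    return {
        # Full path of the Python executable in use
        "executable": sys.executable,
        # Root directory of the environment the interpreter belongs to
        "prefix": sys.prefix,
        # Environment name reported by conda, or None if the variable is unset
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_kernel_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If everything went well, the `prefix` entry should point at the folder of the environment created above rather than at the base installation.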
Using the time wisely during the Bank Holiday weekend. As my dad would say, "resting while making bricks"... Currently reviewing/editing/correcting Chapter 3 of "Advanced Data Science and Analytics with Python". Yes, that is volume 2 of "Data Science and Analytics with Python".
On my way back to London and making the most of the time on the train to work on my Data Science and Analytics Vol 2 book. Working with #StarWars data to explain Social Network Analysis #datascience #geek
It is that time of year when we have an opportunity to look back and see what we have achieved while taking an opportunity to see what the next year will bring. This may be of interest just to me, so please accept my apologies... Here we go:
In no particular order:
- I signed up with my publisher Taylor & Francis to write a volume 2 for my "Data Science and Analytics with Python" book
- During the year I had opportunities to attend some great events such as the EGG Conference by Dataiku or the BBC Machine Learning Fireside Chats, as well as multiple events with the Turing Institute
- I continued delivering training at General Assembly, reaching out to people interested in learning more about Python and Data Science. It has been an interesting year and it is great to see what former students are currently doing with the skills learnt
- The work delivered for companies such as Louis Vuitton, Volvo, Foster & Partners, and others was fantastic. I am also very proud to have tackled some strategy work for the Mayo Clinic and delivered a presentation in a lecture theatre at Mayo
- I contributed to some open source software projects
- It was a busy year in terms of speaking engagements: I delivered keynotes at Entrepares 2018 and the IV Seminario de Periodismo Iberoamericano de Ciencia Tecnología e Innovación, both in Puebla, Mexico. I also ran an Introduction to Data Science workshop at ODSC18 in London and an Introduction to Python at Entrepares 2018. I gave a talk about Data Science Practices at Google Campus in London. The interactive Q&A session was a fun way to answer queries from the audience. I was also a member of various debate panels
- I rekindled playing board games with a couple of good friends of mine, and it has been a geeky blast!
- I started a new role and am still looking to get my foot through the door with Apple
- I've been delving more into Machine Learning systems and platforms, learning about interpretability, reliability, monitoring, and more. There is still plenty more to learn
- I met Chris Robshaw and attended a bunch of rugby matches through the year
Looking forward to 2019, learning and developing more.
Reblog from here.
New tools: Data Illustrator and Charticulator
Anyone interested in creating their own data visualizations should be giddy with delight with the quickly growing number of tools available to create them without any need for programming skills, and in most cases for free: Tableau, Flourish, Datawrapper, RawGraphs, Chartbuilder or QGIS (for mapping) are some of the best, and the list goes on and on. I’m convinced that in a relatively short time drag and drop tools will be as powerful and flexible as D3.js and other developer tools, making data visualization accessible to everyone.
The exciting news is seeing two software giants entering the field with new web-based tools: Adobe launched Data Illustrator a few months ago in a collaboration with the Georgia Institute of Technology, and Microsoft Research is behind the just released Charticulator. Both work very intuitively, allowing the author to bind multiple attributes of data to graphical elements. They are indeed powered by D3.js, among other libraries.
Both offer introduction videos on their home pages. Here is Data Illustrator:
And here is Charticulator:
The tools offer tutorial sections and multiple step-by-step videos in their galleries; and they link to the research papers describing the tools, which are worth reading (Data Illustrator, Charticulator).
Creating complex visualizations like the chord diagram below seems ridiculously simple in Charticulator, and the same can be said of Data Illustrator’s visualizations. See the video:
This is not a review as I have just started playing with them, but on first look both tools are impressive. It’s still really early in their development, but if Adobe and Microsoft throw their mighty resources to support and improve them, we can expect great things in the near future. Perhaps one day Data Illustrator could be embedded within Adobe Illustrator, allowing designers to work fluidly and easily between D3 and Illustrator without leaving the graphical interface. And Charticulator could integrate into PowerPoint. Stay tuned!
I recently came across Flourish, a data visualisation tool that makes things easy and can be used even if your programming skills are a bit rusty. The tool is the brainchild of studio Kiln, who have made the tool entirely web-based and they even offer a free public version.
Starting up is easy as you are encouraged to use templates and can upload your data from a CSV or Excel file. Some of the templates offer the usual scatterplots and bar charts, but you also have things like Sankey diagrams or 3D globe maps. If you are interested you can also create your own custom templates.
Flourish’s free version allows you to publish and share visualisations, or to embed them in your website. Beware that the data will be visible to everyone once you publish. Give it a go and let me know what you think.
This collection includes books on all aspects of deep learning. It begins with titles that cover the subject as a whole, before moving onto work that should help beginners expand their knowledge from machine learning to deep learning. The list concludes with books that discuss neural networks, both titles that introduce the topic and ones that go in-depth, covering the architecture of such networks.
1. Deep Learning
By Ian Goodfellow, Yoshua Bengio and Aaron Courville
The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is now complete and will remain available online for free.
2. Deep Learning Tutorial
By LISA Lab, University of Montreal
Developed by LISA Lab at the University of Montreal, this free and concise tutorial presented in the form of a book explores the basics of machine learning. The book emphasizes using the Theano library (developed originally by the university itself) for creating deep learning models in Python.
3. Deep Learning: Methods and Applications
By Li Deng and Dong Yu
This book provides an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks.
This book is oriented to engineers with only some basic understanding of Machine Learning who want to expand their wisdom in the exciting world of Deep Learning with a hands-on approach that uses TensorFlow.
5. Neural Networks and Deep Learning
By Michael Nielsen
This book teaches you about Neural networks, a beautiful biologically-inspired programming paradigm which enables a computer to learn from observational data. It also covers deep learning, a powerful set of techniques for learning in neural networks.
6. A Brief Introduction to Neural Networks
By David Kriesel
This title covers neural networks in depth. Neural networks are a bio-inspired mechanism of data processing that enables computers to learn in a way technically similar to a brain, and even to generalize once solutions to enough problem instances have been taught. Available in English and German.
7. Neural Network Design (2nd edition)
By Martin T. Hagan, Howard B. Demuth, Mark H. Beale and Orlando De Jesús
NEURAL NETWORK DESIGN (2nd Edition) provides a clear and detailed survey of fundamental neural network architectures and learning rules. In it, the authors emphasize a fundamental understanding of the principal neural networks and the methods for training them. The authors also discuss applications of networks to practical engineering problems in pattern recognition, clustering, signal processing, and control systems. Readability and natural flow of material are emphasized throughout the text.
8. Neural Networks and Learning Machines (3rd edition)
By Simon Haykin
This third edition of Simon Haykin’s book provides an up-to-date treatment of neural networks in a comprehensive, thorough and readable manner, split into three sections. The book begins by looking at the classical approach on supervised learning, before continuing on to kernel methods based on radial-basis function (RBF) networks. The final part of the book is devoted to regularization theory, which is at the core of machine learning.