Random thoughts about random subjects… From science to literature and between manga and watercolours, passing by data science and rugby; including film, physics and fiction, programming, pictures and puns.
I recently asked the guys at the Data Science class at GA to bring a good example of a “daft graph” and they all did that with gusto. As usual, there was a certain cable news channel that was mentioned a lot of times for their misleading use of graphics.
1919– Extracts from an Investigation into the Physical Properties of Books as They Are At Present Published. The Society of Calligraphers, Boston.
This is a small pamphlet that was designed and authored by the graphic designer W.A. Dwiggins and his cousin L.B. Sigfried. It pilloried the format of books and voiced Dwiggins's concern about the poor methods used to print trade books in the US at that time.
The book was published by the imaginary Society of Calligraphers and the stinging investigation was a hoax cooked up by Dwiggins – nevertheless it did have an effect on publishing in the US following its wide distribution.
The graph by Dwiggins shows the reduction in book quality since 1910.
The Business Intelligence and Analytics Platform report by Gartner is close to the heart of many people interested in the analytics market, and it reflects the changes and new tools that people are using and will be using.
The so-called Magic Quadrant is a concise summary of the position of various players in the BI/Analytics space and it makes for good marketing, particularly for those in the Leaders-Visionaries quadrant. A particular mention goes to Tableau which has been there for three years in a row. The “ability to execute” has positioned them quite high in the quadrant, followed by Qlik. The usual suspects such as IBM and SAS are still there, and it is interesting to see Microsoft there too, one may assume taking some of the space that Revolution Analytics would have used…
I came across this blog post by Kevin Markham where he describes his experiences and thoughts about having taught the 11-week data science course with General Assembly. He has managed to capture some of my own thoughts with the course that I ran in London and I agree with the points he makes. The use of Python as the main language for the course went really well, and although there was the nagging itch of implementing some things in R, I think it paid off in the long run. We also used Anaconda as the recommended distribution and although some were happy with using other distros, in general it made things easier for the availability of packages and their usage.
Kevin mentions the need for “more concepts than maths” and I agree with this view. I think it is important to discuss some of the maths, but given that the mix of students includes a wide range of abilities, placing the emphasis on the concepts is far more important. Having said that, I think it was important to make this clear from the beginning. I also tried to provide enough references and reading material for those more inclined to delve into the maths.
I had a session on APIs and databases (MySQL and MongoDB) and I am glad I did, as this enabled the students to consider that part of the project lifecycle. Kevin (and Alessandro Gagliardi, in the comments) comment on their experiences of including some NLP in their courses and that seems to be a good idea to consider. In my case I had a session on social network analysis, and although it only covered the basics, I think the students really enjoyed it.
As for visualisation, I had an invited speaker from Tableau and then worked with the tool for the rest of the class. I think it went well, and perhaps the only issue is that it came way too early in the course. One thing that was emphasised from the start was the project, and even though the students were constantly reminded about this, it was still difficult to get the projects “finished”. I put that in quotes as I believe that the main purpose of the project is to get the students started, particularly as it is hard to say when a project is actually done… there are always more things to tweak or do!
I came across this post by Andy Cotgreave about features to use in an interactive dashboard such as those created in Tableau.
He deconstructed a dashboard (a bit meta there, right?) to quantify the Impact vs Difficulty of each of the design choices you can potentially make. You can see a screenshot below, but you can play with the interactive version in the link above.
Yesterday I had the chance to attend the first Visualized.io conference in London. It was a fully packed day with lots of interesting speakers and fun people. The variety of the talks was quite good and most of the presentations were very well prepared. I was surprised at the bad use of video in a couple of the talks in the morning session, but apart from that it was all very good.
I ended up winning a print and it is now decorating one of the walls at home. You can see a picture at the end of the gallery below. The conference took place at Protein in the heart of Hipsterland (aka Shoreditch) and it was a well-attended event.
I particularly enjoyed the talk by David McCandless who turned out to be the mystery guest. Similarly, the presentation by Pascal Raabe about memories was very good and inspiring. Another good presentation was the “smelly” talk given by Kate McLean.
Andy Kirk gave a view about the Design of Time and you can see the slides here.
If you are interested in seeing what Twitter was saying before, during and after the conference, check this page.
Finally, the conference was archived at Eventfire here, and I am surprised to see that I was the top contributor according to them! :D
Talking to some friends from General Assembly, I ended up being asked to provide a brief quote about what data science is and given the short amount of time to think about the question I ended up with the following:
“Data science and analytics are rapidly gaining prominence as some of the more sought after disciplines in academic and professional circles. In a nutshell, data science can be understood as the extraction of knowledge and insight from various sources of data, and the skills required to achieve this range from programming to design, and from mathematics to storytelling.”
I am convinced there is more to it than the above lines, but I was asked for a small quote. Anyway, what do you think?
The idea of the application is to generate a “never-ending and ever changing version of any song”, and this is done in a very engaging and entertaining way. You can upload your own track, which in turn is uploaded to The Echo Nest, where it is decomposed into individual beats.
The beats of the song get analysed and matched to similar bits in the same song; the result is presented in a chord diagram and, as the song is played, the paths that join similar sounding beats come into play and make the song branch out to a completely different part of the song. Enjoy!
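To get a feel for the idea, here is a minimal Python sketch of matching similar beats and branching between them. It is only an illustration and not the application's actual code: I am assuming each beat has already been summarised as a small vector of numbers (the made-up `beats` list below stands in for whatever analysis The Echo Nest returns), and the function names, the distance threshold and the jump probability are all hypothetical choices of mine.

```python
import math
import random


def beat_distance(a, b):
    """Euclidean distance between two beat feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_similar_beats(beats, threshold, min_gap=2):
    """Map each beat index to the indices of similar-sounding beats.

    Beats fewer than `min_gap` positions apart are ignored, so that a
    jump always lands in a different part of the song.
    """
    links = {i: [] for i in range(len(beats))}
    for i in range(len(beats)):
        for j in range(i + min_gap, len(beats)):
            if beat_distance(beats[i], beats[j]) <= threshold:
                links[i].append(j)
                links[j].append(i)
    return links


def play_order(links, n_steps, jump_probability=0.3, seed=None):
    """Return the indices of the beats played over `n_steps` steps."""
    rng = random.Random(seed)
    n_beats = len(links)
    order = []
    current = 0
    for _ in range(n_steps):
        order.append(current)
        candidates = links[current]
        if candidates and rng.random() < jump_probability:
            # Branch: carry on from the beat that follows a similar one.
            current = rng.choice(candidates)
        # Simplification: wrap around to the start at the end of the song.
        current = (current + 1) % n_beats
    return order


# Made-up feature vectors (an assumption, not real audio analysis):
# beat 3 sounds like beat 0, and beat 4 sounds like beat 1.
beats = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.1, 0.0), (1.0, 0.1), (5.0, 5.0)]
links = find_similar_beats(beats, threshold=0.2)
order = play_order(links, n_steps=20, seed=1)
```

The `links` dictionary plays the role of the paths in the chord diagram, and `order` is one possible never-ending walk through the song, cut off after 20 beats.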
When I first heard about the plans that the British Library had for an exhibition called Science is Beautiful I got very excited. I even made an entry in my diary for the date that it was planned to open. Closer to the time I even encouraged Twitter followers and colleagues to go to the exhibition.
The exhibition promised to explore how “our understanding of ourselves and our planet has evolved alongside our ability to represent, graph and map the mass data of the time.” So I finally made some time and made it to the British Library today… the exhibition was indeed there with some nice looking maps and graphics, but I could not help feeling utterly disappointed. I was very surprised they even call this an exhibition: the images, documents and interactive displays were very few and not very immersive. Probably my favourite part was looking at “The Pedigree of Man” and the “Nightingale’s Rose” together with an interactive show. Nonetheless, I felt that the British Library could have done a much better job given the wealth of documents they surely have at hand. Besides, the technology used to support the exhibits was not that great… for example the touch screens were not very responsive and did not add much to the presentation.