Random thoughts about random subjects… From science to literature and between manga and watercolours, passing by data science and rugby; including film, physics and fiction, programming, pictures and puns.
With the lockdown and social distancing rules forcing all of us to adjust our calendars, events and even lesson plans and lectures, I was not surprised to hear of speaking opportunities that might not have arisen otherwise.
A great example is the reprise of a talk I gave about a year ago while visiting Mexico. It was a great opportunity to talk to Social Science students at the Political Science Faculty of the Universidad Autónoma del Estado de México. The subject was open but had to cover the use of technology and I thought that talking about the use of natural language processing in terms of digital humanities would be a winner. And it was…
In March this year I was approached by the Faculty to re-run the talk, but this time instead of doing it face to face we would use a teleconference room. Not only was I, the speaker, talking from the comfort of my own living room, but all the attendees would be at home too. Furthermore, some of the students might not have access to the live presentation (lack of broadband, equipment, etc.), and recording the session for later viewing was the best option for them.
I didn’t hesitate in saying yes, and I enjoyed the interaction a lot. Today I learnt that the session was the focus of a small note in a local newspaper. The session was run in Spanish and the note in Portal, the local newspaper, is in Spanish too. I really liked that they picked a line I used in the session to convince the students that technology is not just for the natural sciences:
“Hay que hacer ciencias sociales con técnicas del Siglo XXI… El mundo es de los geeks.”
“We should study social sciences applying techniques of the 21st Century. The world today belongs to us, the geeks.”
The point is that although qualitative and quantitative techniques are widely used in social science, the use of new platforms and even programming languages such as Python opens up opportunities for social scientists too.
What kind of probability are people talking about when they say something is “highly likely” or has “almost no chance”? The chart below, created by Reddit user zonination, visualizes the responses of 46 other Reddit users to “What probability would you assign to the phrase: <phrase>” for various statements of probability. Each set of responses has been converted to a kernel density estimate and presented as a joyplot using R.
Somewhat surprisingly, the results from the Redditors hew quite closely to a similar study of 23 NATO intelligence officers in 2007. In that study, the officers, who were accustomed to reading intelligence reports with assertions of likelihood, were given a similar task with the same descriptions of probability. The results, here presented as a dotplot, are quite similar.
For details on the analysis of the Redditors, including the data and R code behind the joyplot chart, check out the GitHub repository linked below.
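To give a rough idea of what a kernel density estimate does with a handful of responses, here is a minimal sketch. It is written in Python rather than R, it is not the code from the repository, and the response values are invented purely for illustration. The function name `gaussian_kde` and the bandwidth of 5 percentage points are my own assumptions.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x.

    Each sample contributes a Gaussian bump of width `bandwidth`;
    the estimate is the average of those bumps.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical responses in percent, invented for illustration only.
# They are NOT the Reddit survey data.
responses = {
    "Highly likely": [80, 85, 90, 90, 95],
    "Almost no chance": [1, 2, 2, 5, 10],
}

grid = list(range(0, 101))  # probabilities from 0% to 100%
peaks = {}
for phrase, values in responses.items():
    density = gaussian_kde(values, bandwidth=5.0)
    curve = [density(x) for x in grid]
    # Location of the highest point of the smoothed curve
    peaks[phrase] = grid[curve.index(max(curve))]
    print(f"{phrase}: density peaks near {peaks[phrase]}%")
```

A joyplot is then just one such curve per phrase, stacked vertically with a slight overlap. The bandwidth controls how smooth each ridge looks: a smaller value gives a spikier curve, a larger one blurs nearby responses together.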
Today I had the opportunity of running a #DataScience bootcamp in London. It was an all-day affair and although the attendees were engaged, I’m sure that by the end of the 6th hour they were quite tired.
The discussions ranged from what data science is to the skills required to become a data scientist, and how to manage data scientists. Finally we implemented some data analyses based on linear regression, all using R. I was very pleased to see some of the results.
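For readers who were not there, the core of a simple linear regression fits in a few lines. The sketch below is in Python rather than the R we used on the day, and the data points are made up for illustration. `fit_line` is a hypothetical helper name, not something from the bootcamp material.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x with y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x, for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

intercept, slope = fit_line(xs, ys)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

With real, noisy data the points will not sit on the line, and the fitted slope and intercept are simply the values that minimise the sum of squared vertical distances to it.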
Visualising data is definitely a very powerful tool. If I were to give you a table full of numbers, and I told you that the data “clearly” shows something, you might take a look at the tabulated data and quite possibly ignore it. However, if I presented the data in a format that is appealing to the eye, you would probably take a look and start your own interpretation.
Hans Rosling makes this point in a very interesting and, quite frankly, enthusing way. He plots the income per person versus life expectancy for several countries and takes us on a 200-year tour. The income per person (GDP per capita) is adjusted for inflation and for differences in costs of living (purchasing power) across countries. Catch what happens at the time of the First World War and the Spanish Flu Epidemic. Also note the behaviour of African and Asian countries. You can play with the data yourself in Gapminder World.
This is a short clip from the longer film The Joy of Stats, recently shown on BBC4. Enjoy!