# Coding and Computer Tricks

A collection of posts related to Coding, Programming, Hacking and Computer Tricks. Take a look and enjoy!

## Persistent "Previous Recipients" in Mac Mail

Hello everyone! I am very pleased to take a question from John who got in touch with Quantum Tunnel using the form here. John's favourite scientist is Einstein and his question is as follows:

In Mac Mail I cannot delete unwanted email addresses. I have done the routine of deleting all addresses from the "Previous Recipients" list, but when starting a new email the unwanted addresses appear. Any help is appreciated. Thanks, John

John is referring to the solution I provided in this earlier post. Sadly, the list of his lucky friends/colleagues/family (delete as appropriate) he has emailed recently persists even after clearing the "Previous Recipients" list as explained in that post.

There may be a way to force the clearing of these persistent email addresses:

• Quit Mail and Address Book (in case the latter is open)
• Open a terminal and type the following command:
• rm ~/Library/Application\ Support/AddressBook/MailRecents-v4.abcdmr (note the escaped space in "Application Support")
• Log out and back in again
• Start Mail
• You may have to clear the "Previous Recipients" list as per the post mentioned above

You should now be able to clear the list. And... in case you were wondering, the file we deleted will be created afresh to start accumulating new "Previous Recipients" (yay!)

Et voilà!

## Finding iBooks Files in My Mac

I was looking for the location of iBooks files (including ePubs, PDFs and others) so that I could curate the list of manually exported files. Finding iBooks files on my Mac should not be a difficult task, although it took a few minutes. I thought of sharing the answer here in the blog for future reference, and in the hope that some of you may find it useful.

We will use the Terminal, as doing things from Finder tends to redirect us. The first place to look is the following:

~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks

Now, that may not be the entire list of your books. If you have enabled iCloud, things may be stored in your Mobile Documents folder:

cd ~/Library/Mobile\ Documents/iCloud~com~apple~iBooks/Documents/

For things that you have bought in the iBooks store, take a look here:

cd ~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks

Et voilà!

## CoreML - Boston Model: The Complete App

Look how far we have come... We started this series by looking at what CoreML is and made sure that our environment was suitable. We decided to use linear regression as our model, and chose the Boston Price dataset for this implementation. We built our model using Python, created our .mlmodel object and had a quick exploration of the model's properties. We then started to build our app using Xcode (see Part 1, Part 2 and Part 3). In this final part we are going to take the .mlmodel and include it in our Xcode project; we will then use the inputs selected from our picker and calculate a prediction (based on our model) to be displayed to the user. Are you ready? Here we go!

Let us start by adding the .mlmodel we created earlier on so that it is an available resource in our project. Open your Xcode project and locate your PriceBoston.mlmodel file. From the menu on the left-hand side select the "BostonPricer" folder. At the bottom of the window you will see a + sign; click on it and select "New Group". This will create a sub-folder within "BostonPricer". Select the new folder and hit the return key; this will let you rename the folder to something more useful. In this case I am going to call this folder "Resources".

Open Finder and navigate to the location of your PriceBoston.mlmodel. Click and drag the file inside the "Resources" folder we just created. This will open a dialogue box asking for some options for adding this file to your project. I selected "Create Folder References" and left the rest as shown by default. After hitting "Finish" you will see your model now being part of your project. Let's now go to the code in ViewController and make some needed changes. The first one is to tell our project that we are going to need the powers of the CoreML framework. At the top of the file, locate the line of code that imports UIKit, and right below it type the following:

import CoreML

Inside the definition of the ViewController class, let us define a constant to reference the model. Look for the definitions of the crimeData and roomData constants and near them type the following:

let model = PriceBoston()

You will see that when you start typing the name of the model, Xcode will suggest the right name as it knows about the existence of the model as part of its resources, neat!

We need to make some changes to the getPrediction() function we created in the last post. Go to the function and look for the place where we pick the values of crime and rooms, and right after that write the following:

guard let priceBostonOutput = try? model.prediction(
    crime: crime,
    rooms: Double(rooms)
) else {
    fatalError("Unexpected runtime error.")
}

You may get a warning telling you that the constant priceBostonOutput was defined but not used. Don't worry, we will indeed use it in a little while. Just a couple of words about this piece of code: you will see that we are using the prediction method defined in the model, and that we are passing the two input parameters that the model expects, namely crime and rooms. We wrap the call in a guard statement with try? so that a failing prediction is caught instead of crashing unannounced. This is where we are implementing our CoreML model!!! Isn't that cool‽

We are not done yet though; remember that we have that warning from Xcode about not using the model's output. Looking at the properties of the model, we can see that we also have an output attribute called price. This is the prediction we are looking for and the one we would like to display. Out of the box it may have a lot of decimal figures, and it is never good practice to display those to the user (although they are important in precision terms...). Also, with Swift's strong typing we will have to convert the Double returned by the model into a String that can be printed. So, let us prepare some code to format the predicted price. At the top of the ViewController class, find the place where we defined the constants crimeData and roomData. Below them type the following code:

let priceFormat: NumberFormatter = {
    let formatting = NumberFormatter()
    formatting.numberStyle = .currency
    formatting.maximumFractionDigits = 2
    formatting.locale = Locale(identifier: "en_US")
    return formatting
}()

We are defining a format that will show a number as currency in US dollars with two decimal figures. We can now pass our predicted price to this formatter and assign it to a new constant for future reference. Back inside the getPrediction function, right after the guard statement where we obtained priceBostonOutput, write the following:

let priceText = priceFormat.string(from: NSNumber(value: priceBostonOutput.price))

Now we have a nicely formatted string that can be used in the display. Let us change the message that we are asking our app to show when pressing the button:

let message = "The predicted price (in 1,000s) is " + priceText!

We are done! Launch your app in the simulator, select a couple of values from the picker and hit the "Calculate Prediction" button... Et voilà, we have completed our first implementation of a CoreML model in a working app.

There are many more things that we can do to improve the app. For instance, we can impose some constraints on the position of the different elements shown on the screen so that we can deploy the application on the various screen sizes offered by Apple devices, improve the design and usability of the app, and design appropriate icons for it (in various sizes). For the time being, I will leave some of those tasks for later. In the meantime you can take a look at the final code in my github site here. Enjoy and do keep in touch, I would love to hear if you have found this series useful.

## CoreML - iOS Implementation for the Boston Model (part 3) - Button

We are very close to getting a functioning app for our Boston Model. In the last post we put together the code that fills in the values in the picker and were able to "pick" the values shown for crime rate and number of rooms respectively. These values are fed to the model we built in one of the earlier posts of this series, and the idea is that we will action this via a button that triggers the calculation of the prediction. In turn, the prediction will be shown in a floating dialogue box.

In this post we are going to activate the functionality of the button and show the user the values that have been picked. With this we will be ready to weave in the CoreML model in the final post of this series. So, what are we waiting for? Let us launch Xcode and get working.

We have already done a bit of work for the button in the previous post, where we connected the button to the ViewController, generating a line of code that reads as follows:

@IBOutlet weak var predictButton: UIButton!
If we launch the application and click on the button, sadly, nothing will happen. Let's change that: in the definition of the ViewController class, after the didReceiveMemoryWarning function, write the following piece of code:

@IBAction func getPrediction() {
    let selectedCrimeRow = inputPicker.selectedRow(inComponent: inputPredictor.crime.rawValue)
    let crime = crimeData[selectedCrimeRow]
    let selectedRoomRow = inputPicker.selectedRow(inComponent: inputPredictor.rooms.rawValue)
    let rooms = roomData[selectedRoomRow]

    let message = "The picked values are Crime: \(crime) and Rooms: \(rooms)"

    let alert = UIAlertController(title: "Values Picked", message: message, preferredStyle: .alert)
    let action = UIAlertAction(title: "OK", style: .default, handler: nil)
    alert.addAction(action)
    present(alert, animated: true, completion: nil)
}

The first four lines of the getPrediction function take the values from the picker and create constants for crime and rooms, which are then used in a message to be displayed in the application. We are telling Xcode to treat this message as an alert and asking it to present it to the user (last lines in the code above).

What we need to do now is tell Xcode that this function is to be triggered when we click on the button. There are several ways we can connect the button with the code above. In this case we are going to go to Main.storyboard, control+click on the button and drag. This will show an arrow; we need to connect that arrow with the View Controller icon (a yellow circle with a white square inside) at the top of the view controller window we are putting together. When you let go, you will see a drop-down menu. From there, under "Sent Events", select the function we created above, namely getPrediction. See the screenshots below. You can now run the application.
Select a number from each of the columns in the picker, and when ready, prepare to be amazed: click on the "Calculate Prediction" button, et voilà - you will see a new window telling you the values you have just picked. Tap "OK" and start again!

In the next post we will add the CoreML model and modify the event for the button to take the two values picked and calculate a prediction, which in turn will be shown in the floating window. Stay tuned. You can look at the code (in development) in my github site here.

## JupyterLab is Ready for Users

This is a reblog of this original post.

We are proud to announce the beta release series of JupyterLab, the next-generation web-based interface for Project Jupyter.

Project Jupyter, Feb 20

tl;dr: JupyterLab is ready for daily use (documentation, try it with Binder)

[Image: JupyterLab is an interactive development environment for working with notebooks, code, and data.]

### The Evolution of the Jupyter Notebook

Project Jupyter exists to develop open-source software, open standards, and services for interactive and reproducible computing. Since 2011, the Jupyter Notebook has been our flagship project for creating reproducible computational narratives. The Jupyter Notebook enables users to create and share documents that combine live code with narrative text, mathematical equations, visualizations, interactive controls, and other rich output. It also provides building blocks for interactive computing with data: a file browser, terminals, and a text editor.
The Jupyter Notebook has become ubiquitous with the rapid growth of data science and machine learning and the rising popularity of open-source software in industry and academia:

• Today there are millions of users of the Jupyter Notebook in many domains, from data science and machine learning to music and education. Our international community comes from almost every country on earth.¹
• The Jupyter Notebook now supports over 100 programming languages, most of which have been developed by the community.
• There are over 1.7 million public Jupyter notebooks hosted on GitHub. Authors are publishing Jupyter notebooks in conjunction with scientific research, academic journals, data journalism, educational courses, and books.

At the same time, the community has faced challenges in using various software workflows with the notebook alone, such as running code from text files interactively. The classic Jupyter Notebook, built on web technologies from 2011, is also difficult to customize and extend.

### JupyterLab: Ready for Users

JupyterLab is an interactive development environment for working with notebooks, code and data. Most importantly, JupyterLab has full support for Jupyter notebooks. Additionally, JupyterLab enables you to use text editors, terminals, data file viewers, and other custom components side by side with notebooks in a tabbed work area.

[Image: JupyterLab enables you to arrange your work area with notebooks, text files, terminals, and notebook outputs.]

JupyterLab provides a high level of integration between notebooks, documents, and activities:

• Drag-and-drop to reorder notebook cells and copy them between notebooks.
• Run code blocks interactively from text files (.py, .R, .md, .tex, etc.).
• Link a code console to a notebook kernel to explore code interactively without cluttering up the notebook with temporary scratch work.
• Edit popular file formats with live preview, such as Markdown, JSON, CSV, Vega, VegaLite, and more.

JupyterLab has been over three years in the making, with over 11,000 commits and 2,000 releases of npm and Python packages. Over 100 contributors from the broader community have helped build JupyterLab in addition to our core JupyterLab developers.

To get started, see the JupyterLab documentation for installation instructions and a walk-through, or try JupyterLab with Binder. You can also set up JupyterHub to use JupyterLab.

### Customize Your JupyterLab Experience

JupyterLab is built on top of an extension system that enables you to customize and enhance JupyterLab by installing additional extensions. In fact, the built-in functionality of JupyterLab itself (notebooks, terminals, file browser, menu system, etc.) is provided by a set of core extensions.

[Image: JupyterLab extensions enable you to work with diverse data formats such as GeoJSON, JSON and CSV.²]

Among other things, extensions can:

• Provide new themes, file editors and viewers, or renderers for rich outputs in notebooks;
• Add menu items, keyboard shortcuts, or advanced settings options;
• Provide an API for other extensions to use.

Community-developed extensions on GitHub are tagged with the jupyterlab-extension topic, and currently include file viewers (GeoJSON, FASTA, etc.), Google Drive integration, GitHub browsing, and ipywidgets support.

### Develop JupyterLab Extensions

While many JupyterLab users will install additional JupyterLab extensions, some of you will want to develop your own.
The extension development API is evolving during the beta release series and will stabilize in JupyterLab 1.0. To start developing a JupyterLab extension, see the JupyterLab Extension Developer Guide and the TypeScript or JavaScript extension templates.

JupyterLab itself is co-developed on top of PhosphorJS, a new JavaScript library for building extensible, high-performance, desktop-style web applications. We use modern JavaScript technologies such as TypeScript, React, Lerna, Yarn, and webpack. Unit tests, documentation, consistent coding standards, and user experience research help us maintain a high-quality application.

### JupyterLab 1.0 and Beyond

We plan to release JupyterLab 1.0 later in 2018. The beta releases leading up to 1.0 will focus on stabilizing the extension development API, user interface improvements, and additional core features. All releases in the beta series will be stable enough for daily usage.

JupyterLab 1.0 will eventually replace the classic Jupyter Notebook. Throughout this transition, the same notebook document format will be supported by both the classic Notebook and JupyterLab.

### Get Involved

There are many ways you can participate in the JupyterLab effort. We welcome contributions from all members of the Jupyter community:

• Use our extension development API to make your own JupyterLab extensions. Please add the jupyterlab-extension topic if your extension is hosted on GitHub. We appreciate feedback as we evolve toward a stable API for JupyterLab 1.0.
• Contribute to the development, documentation, and design of JupyterLab on GitHub. To get started with development, please see our Contributing Guide and Code of Conduct. We label issues that are ideal for new contributors as "good first issue" or "help wanted".
• Connect with us on our GitHub Issues page or on our Gitter Channel. If you find a bug, have questions, or want to provide feedback, please join the conversation.

We are thrilled to see how you use and extend JupyterLab.
Sincerely,

We thank Bloomberg and Anaconda for their support and collaboration in developing JupyterLab. We also thank the Alfred P. Sloan Foundation, the Gordon and Betty Moore Foundation, and the Helmsley Charitable Trust for their support.

[1] Based on the 249 country codes listed under ISO 3166-1, recent Google Analytics data from 2018 indicates that jupyter.org has hosted visitors from 213 countries.

[2] Data visualized in this screenshot is licensed CC-BY-NC 3.0. See http://datacanvas.org/public-transportation/ for more details.

## nteract - a great Notebook experience

I am a supporter of using Jupyter Notebooks for data exploration and code prototyping. It is a great way to start writing code and immediately get interactive feedback. Not only can you document your code there using markdown, but you can also embed images, plots and links, and bring your work to life.

Nonetheless, there are some little annoyances, for instance the fact that I need to launch a kernel to open a file and have to do that "the long way" - i.e. I cannot double-click on the file that I am interested in seeing. Some ways to overcome this include looking at GitHub versions of my code, as the notebooks are rendered automatically, or even saving HTML or PDF versions of the notebooks. I am sure some of you may have similar solutions for this.

Last week, while looking for entries on something completely different, I stumbled upon a post that suggested using nteract. It sounded promising and I took a look. It turned out to be related to the Hydrogen package available for Atom, something I have used in the past and loved. nteract is different though, as it offers a desktop version and other goodies such as in-app support for publishing, a terminal-free experience, sticky cells, and input and output hiding... Bring it on!

I have just started using it, and so far so good. You may want to give it a try, and maybe even contribute to the git repo.
## CoreML - iOS Implementation for the Boston Model (part 2) - Filling the Picker

Right! Where were we? Yes, last time we put together a skeleton for the CoreML Boston Model application that will take two inputs (crime rate and number of rooms) and provide a prediction of the price of a Boston property (yes, based on somewhat old prices...). We are making use of three labels, one picker and one button.

Let us start by creating variables to hold the potential values for the input variables. We will do this in the ViewController, selecting this file from the left-hand side menu. Inside the ViewController class definition enter the following variable assignments:

let crimeData = Array(stride(from: 0.1, through: 0.3, by: 0.01))
let roomData = Array(4...9)

These values are informed by the data exploration we carried out in an earlier post. We are going to use the arrays defined above to populate the values that will be shown in our picker. For this we need to define a data source for the picker and make sure that there are two components to choose values from.

Before we do any of that, we need to connect the view from our storyboard to the code; in particular we need to create outlets for the picker and the button. Select Main.storyboard from the menu on the left-hand side. With Main.storyboard in view, in the top right-hand corner of Xcode you will see a button with an icon that has two intersecting circles; click on that icon. You will now see the storyboard side-by-side with the code. While pressing the Control key, select the picker by clicking on it; without letting go, drag into the code window (you will see an arrow appear as you drag). You will see a dialogue window where you can now enter a name for the element in your storyboard. In this case I am calling my picker inputPicker, as shown in the figure on the left.
After pressing the "Connect" button a new line of code appears, and you will see a small circle on top of the code line number indicating that a connection with the storyboard has been made. Do the same for the button and call it predictButton.

In order to make our lives a little bit easier, we are going to bundle together the input values. At the bottom of the ViewController code write the following:

enum inputPredictor: Int {
    case crime = 0
    case rooms
}

We have defined an object called inputPredictor that will hold the values for crime and rooms. In turn we will use this object to populate the picker as follows: in the same ViewController file, after the class definition that is provided in the project by default, we are going to write an extension for the data source. Write the following code:

extension ViewController: UIPickerViewDataSource {
    func numberOfComponents(in pickerView: UIPickerView) -> Int {
        return 2
    }

    func pickerView(_ pickerView: UIPickerView,
                    numberOfRowsInComponent component: Int) -> Int {
        guard let inputVals = inputPredictor(rawValue: component) else {
            fatalError("No predictor for component")
        }
        switch inputVals {
        case .crime:
            return crimeData.count
        case .rooms:
            return roomData.count
        }
    }
}

With the function numberOfComponents we are indicating that we want to have 2 components in this view. Notice that inside the pickerView function we are creating a constant inputVals defined by the values from inputPredictor.

So far we have indicated where the values for the picker come from, but we have not delegated the actions that can be taken with those values, namely displaying them and picking them (after all, this element is a picker!) so that we can use the values elsewhere. If you were to execute this app, you would see an empty picker...
OK, so what we need to do is create the UIPickerViewDelegate, and we do this by entering the following code right under the previous snippet:

extension ViewController: UIPickerViewDelegate {
    func pickerView(_ pickerView: UIPickerView, titleForRow row: Int,
                    forComponent component: Int) -> String? {
        guard let inputVals = inputPredictor(rawValue: component) else {
            fatalError("No predictor for component")
        }
        switch inputVals {
        case .crime:
            return String(crimeData[row])
        case .rooms:
            return String(roomData[row])
        }
    }

    func pickerView(_ pickerView: UIPickerView, didSelectRow row: Int,
                    inComponent component: Int) {
        guard let inputVals = inputPredictor(rawValue: component) else {
            fatalError("No predictor for component")
        }
        switch inputVals {
        case .crime:
            print(String(crimeData[row]))
        case .rooms:
            print(String(roomData[row]))
        }
    }
}

In the first function we are defining the values that are supposed to be shown for the titleForRow in the picker, and we do this for each of the two elements we have, i.e. crime and rooms. In the second function we are defining what happens when we didSelectRow, in other words when we select the value that is being shown by each of the two elements in the picker. Not too bad, right?

Well, if you were to run this application you would still see no change in the picker... Why is that? The answer is that we need to let the application know what needs to be shown when the elements load. Go back to the top of the code (around line 20 or so), below the code lines that defined the outlets for the picker and the button, and write the following code:

override func viewDidLoad() {
    super.viewDidLoad()
    // Picker data source and delegate
    inputPicker.dataSource = self
    inputPicker.delegate = self
}

OK, we can now run the application: on the top left-hand side of the Xcode window you will see a play button; clicking on it will launch the Simulator and you will be able to see your picker working.
Go on, select a few values from each of the elements. In the next post we will write code to activate the button to run a prediction using our CoreML model with the values selected from the picker, and show the result to the user. Stay tuned! You can look at the code (in development) in my github site here.

## Siri doesn’t like Rugby

Well, it seems that Siri does not like rugby. Only information about baseball, basketball, American football, ice hockey or cricket (!). Apparently golf and tennis are to follow... Oh well...

## CoreML - Model properties

If you have been following the posts in this open notebook, you may know that by now we have managed to create a linear regression model for the Boston Price dataset based on two predictors, namely crime rate and average number of rooms. It is by no means the best model out there, and our aim is to explore the creation of a model (in this case with Python) and convert it to a Core ML model that can be deployed in an iOS app.

Before moving on to the development of the app, I thought it would be good to take a look at the properties of the converted model. If we open the PriceBoston.mlmodel we saved in the previous post (in Xcode of course) we will see the following information: we can see the name of the model (PriceBoston) and the fact that it is a "Pipeline Regressor". The model can be given various attributes such as Author, Description, License, etc. We can also see the listing of the Model Evaluation Parameters in the form of Inputs (crime rate and number of rooms) and Outputs (price). There is also an entry to describe the Model Class (PriceBoston); without attaching this model to a target, the class is actually not present.
Once we make this model part of a target inside an app, Xcode will generate the appropriate code. Just to give you a flavour of the code that will be generated when we attach this model to a target, please take a look at the screenshot below: you can see that the code was generated automatically (see the comment at the beginning of the Swift file). The code defines the input variables and feature names, defines a way to extract values out of the input strings, sets up the model output and other bits and pieces such as defining the class for model loading and prediction (not shown). All this is taken care of by Xcode, making it very easy for us to use the model in our app. We will start building that app in the following posts (bear with me, I promise we will get there). Enjoy!

## CoreML - Building the model for Boston Prices

In the last post we took a look at the Boston Prices dataset loaded directly from Scikit-learn. In this post we are going to build a linear regression model and convert it to a .mlmodel to be used in an iOS app. We are going to need some modules:

import coremltools
import pandas as pd
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split
from sklearn import metrics
import numpy as np

coremltools is the module that will enable the conversion of our model for use in iOS. Let us start by defining a main function to load the dataset:

def main():
    print('Starting up - Loading Boston dataset.')
    boston = datasets.load_boston()
    boston_df = pd.DataFrame(boston.data)
    boston_df.columns = boston.feature_names
    print(boston_df.columns)

In the code above we have loaded the dataset and created a pandas dataframe to hold the data and the names of the columns.
As we mentioned in the previous post, we are going to use only the crime rate and the number of rooms to create our model:

    print("We now choose the features to be included in our model.")
    X = boston_df[['CRIM', 'RM']]
    y = boston.target

Please note that we are separating the target variable from the predictor variables. Although this dataset is not too large, we are going to follow best practice and split the data into training and testing sets:

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=7)

We will only use the training set in the creation of the model and will test with the remaining data points:

    my_model = glm_boston(X_train, y_train)

The line of code above assumes that we have defined the function glm_boston as follows:

def glm_boston(X, y):
    print("Implementing a simple linear regression.")
    lm = linear_model.LinearRegression()
    gml = lm.fit(X, y)
    return gml

Notice that we are using the LinearRegression implementation in Scikit-learn. Let us go back to the main function we are building and extract the coefficients for our linear model. Refer to the CoreML - Linear Regression post to recall that the type of model we are building is of the form $y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \epsilon$:

    coefs = [my_model.intercept_, my_model.coef_]
    print("The intercept is {0}.".format(coefs[0]))
    print("The coefficients are {0}.".format(coefs[1]))

We can also take a look at some metrics that let us evaluate our model against the test data (note that we first need to compute the predictions y_pred for the test set):

    # calculate MAE, MSE, RMSE
    y_pred = my_model.predict(X_test)
    print("The mean absolute error is {0}.".format(
        metrics.mean_absolute_error(y_test, y_pred)))
    print("The mean squared error is {0}.".format(
        metrics.mean_squared_error(y_test, y_pred)))
    print("The root mean squared error is {0}.".format(
        np.sqrt(metrics.mean_squared_error(y_test, y_pred))))

## CoreML conversion

And now for the big moment: we are going to convert our model to an .mlmodel object!! Ready?
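(One quick aside before the conversion: to make the $\alpha$ and $\beta$ terms and the MAE concrete, here is a small self-contained sketch. It uses synthetic data drawn from a known plane rather than the Boston set, so it runs anywhere; the data and names are mine, not from the post.)

```python
import numpy as np
from sklearn import linear_model, metrics

# Synthetic, noiseless data from a known plane: y = 1.5 + 2.0*x1 - 0.5*x2
rng = np.random.default_rng(7)
X = rng.uniform(0, 10, size=(200, 2))
y = 1.5 + 2.0 * X[:, 0] - 0.5 * X[:, 1]

lm = linear_model.LinearRegression().fit(X, y)

# A prediction is just alpha + beta_1*x1 + beta_2*x2
x_new = np.array([[3.0, 4.0]])
manual = lm.intercept_ + lm.coef_ @ x_new[0]
assert np.isclose(lm.predict(x_new)[0], manual)

# MAE is the mean absolute difference between truth and prediction
y_pred = lm.predict(X)
mae = metrics.mean_absolute_error(y, y_pred)
assert np.isclose(mae, np.abs(y - y_pred).mean())
```

Because the synthetic data is noiseless, the fitted intercept and coefficients recover 1.5, 2.0 and -0.5 almost exactly.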
    print("Let us now convert this model into a Core ML object:")
    # Convert model to Core ML
    coreml_model = coremltools.converters.sklearn.convert(
        my_model,
        input_features=["crime", "rooms"],
        output_feature_names="price")
    # Save Core ML Model
    coreml_model.save("PriceBoston.mlmodel")
    print("Done!")

We are using the sklearn.convert method of coremltools.converters to convert my_model, specifying the necessary inputs (i.e. crime and rooms) and output (price). Finally we save the model in a file with the name PriceBoston.mlmodel. Et voilà! In the next post we will start creating an iOS app to use the model we have just built. You can look at the code (in development) in my github site here.

## Core ML - Preparing the environment

Hello again! In preparation for training a model to be converted by Core ML for use in an application, I would like to make sure we have a suitable environment to work in. One of the first things that came to my attention looking at the coremltools module is the fact that it only supports Python 2! Yes, you read correctly: you will have to make sure you use Python 2.7 if you want to make this work. As you probably know, Python 2 will be retired in 2020, so I hope that Apple is considering this in their development cycles. In the meantime you can see the countdown to Python 2's retirement here, and thanks, Python 2, for the many years of service...

Anyway, if you are a Python 2 user, then you are good to go. If on the other hand you have moved with the times, you may need to make appropriate installations.
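Either way, a quick run-time check can tell you which interpreter your script is on (a small sketch of mine, not part of coremltools; the version requirement is the one discussed above):

```python
import sys

def is_supported(version=sys.version_info):
    """coremltools, at the time of writing, supports Python 2.7 only."""
    return (version[0], version[1]) == (2, 7)

if not is_supported():
    print("Warning: coremltools expects Python 2.7, found %d.%d instead."
          % (sys.version_info[0], sys.version_info[1]))
```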
I am using Anaconda (you may use your favourite distro) and I will be creating a conda environment (I'm calling it coreml) with Python 2.7 and some of the libraries I will be using:

```shell
> conda create --name coreml python=2.7 ipython jupyter scikit-learn
> source activate coreml
(coreml) > pip install coremltools
```

I am sure there may be other modules that will be needed, and I will make appropriate installations (and additions to this post) as that becomes clearer. You can take a look at Apple's coremltools github repo here.

ADDITIONS: As I mentioned, there were other modules that needed installing in the new environment; here is a list:

• pandas
• matplotlib
• pillow

Read me...

## Core ML - What is it?

In a previous post I mentioned that I will be sharing some notes about my journey with doing data science and machine learning with Apple technology. This is the first of those posts, and here I will go over what Core ML is...

Core ML is a computer framework. So what is a framework? Well, in computing terms it is a software abstraction that enables generic functionality to be modified as required by the user, transforming it into software for specific purposes to enable the development of a system or even a humble project. So Core ML is an Apple-provided framework to speed up the development of apps that use trained machine learning models. Notice that the word in bold - trained - is part of the description of the framework. This means that the model has to be developed externally, with appropriate training data for the specific project in mind. For instance, if you are interested in building a classifier that distinguishes cats from cars, then you need to train the model with lots of cat and car images. As it stands, Core ML supports a variety of machine learning models, from generalised linear models (GLMs for short) to neural nets.
Furthermore, it helps with the task of adding the trained machine learning model to your application by automatically creating a custom programmatic interface that supplies an API to your model. All this within the comfort of Xcode!

There is an important point to remember: the model has to be developed externally from Core ML. In other words, you may want to use your favourite machine learning framework (that word again), computer language and environment to cover the different aspects of the data science workflow. You can read more on that in Chapter 3 of my "Data Science and Analytics with Python" book. So whether you use Scikit-learn, Keras or Caffe, the model you develop has to be trained (tested and evaluated) beforehand. Once you are ready, then Core ML will support you in bringing it to the masses via your app. As mentioned in the Core ML documentation:

Core ML is optimized for on-device performance, which minimizes memory footprint and power consumption. Running strictly on the device ensures the privacy of user data and guarantees that your app remains functional and responsive when a network connection is unavailable.

OK, so in the next few posts we will be using Python and coremltools to generate a so-called .mlmodel file that Xcode can use and deploy. Stay tuned!

Read me...

## Machine Learning with Apple - An Open Notebook

We all know how cool machine learning, predictive analytics and data science concepts and problems are. There are a number of really interesting technologies and frameworks to use and choose from. I have been a Python and R user for some time now and they seem to be pretty good for a lot of the things I have to do on a day-to-day basis. As many of you know, I am also a Mac user and have been for quite a long time. I remember using early versions of Mathematica on PowerMacs back at Uni... I digress...
Apple has also been moving into the machine learning arena and has made available a few interesting goodies that help people like me make the most of the models we develop. I am starting a series of posts that I hope can be seen as an "open notebook" of my experimentation and learning with Apple technology. One that comes to mind is CoreML, a new framework that brings native support for running various machine learning and statistical models on macOS and iOS. The idea is that the framework helps data scientists and developers bridge the gap between them by integrating trained models into our apps. Sounds cool, don't you think? Ready... Let's go!

Read me...

## 10-10 Celebrate Ada Lovelace day!

It's Ada Lovelace day, celebrating the work of women in mathematics, science, technology and engineering. To join the celebration +Plus Magazine revisits a collection of interviews with female mathematicians produced earlier this year. The interviews accompany the Women of Mathematics photo exhibition, which celebrates female mathematicians from institutions throughout Europe. It was launched in Berlin in the summer of 2016 and is now touring European institutions. To watch the interviews with the women or read the transcripts, and to see the portraits that featured in the exhibition, click on the links below. For more content by or about female mathematicians click here.

Read me...

## The incredible growth of Python

A reblog of the post by Dave Robinson | 09/12/2017

We recently explored how wealthy countries (those defined as high-income by the World Bank) tend to visit a different set of technologies than the rest of the world. Among the largest differences we saw was in the programming language Python. When we focus on high-income countries, the growth of Python is even larger than it might appear from tools like Stack Overflow Trends, or in other rankings that consider global software development.
In this post, we’ll explore the extraordinary growth of the Python programming language in the last five years, as seen by Stack Overflow traffic within high-income countries. The term “fastest-growing” can be hard to define precisely, but we make the case that Python has a solid claim to being the fastest-growing major programming language.

All the numbers discussed in this post are for high-income countries; they’re generally representative of trends in the United States, United Kingdom, Germany, Canada, and other such countries, which in combination make up about 64% of Stack Overflow’s traffic. Many other countries such as India, Brazil, Russia, and China also make enormous contributions to the global software development ecosystem, and this post is less descriptive of those economies, though we’ll see that Python has shown growth there as well.

It’s worth emphasizing up front that the number of users of a language isn’t a measure of the language’s quality: we’re describing the languages developers use, but not prescribing anything. (Full disclosure: I used to program primarily in Python, though I have since switched entirely to R).

### Python’s growth in high-income countries

You can see on Stack Overflow Trends that Python has been growing rapidly in the last few years. But for this post we’ll focus on high-income countries, and consider visits to questions rather than questions asked (this tends to give similar results, but has less month-by-month noise, especially for smaller tags).

We have data on Stack Overflow question views going back to late 2011, and in this time period we can consider the growth of Python relative to five other major programming languages. (Note that this is therefore a shorter time scale than the Trends tool, which goes back to 2008). These are currently six of the ten most-visited Stack Overflow tags in high-income countries; the four we didn’t include are CSS, HTML, Android, and JQuery.
June 2017 was the first month that Python was the most visited tag on Stack Overflow within high-income nations. This included being the most visited tag within the US and the UK, and in the top 2 in almost all other high-income nations (next to either Java or JavaScript). This is especially impressive because in 2012, it was less visited than any of the other 5 languages, and it has grown by 2.5-fold in that time.

Part of this is because of the seasonal nature of traffic to Java. Since it’s heavily taught in undergraduate courses, Java traffic tends to rise during the fall and spring and drop during the summer. Will it catch up with Python again by the end of the year? We can try forecasting the next two years of growth with an STL model (http://otexts.org/fpp2/sec-6-stl.html), which combines growth with seasonal trends to make a prediction about future values. According to this model, Python could either stay in the lead or be overtaken by Java in the fall (it’s roughly within the variation of the model’s predictions), but it’s clearly on track to become the most visited tag in 2018. STL also suggests that JavaScript and Java will remain at similar levels of traffic among high-income countries, just as they have for the last two years.

### What tags are growing the fastest overall?

The above was looking only at the six most-visited programming languages. Among other notable technologies, which are currently growing the fastest in high-income countries? We defined the growth rate in terms of the ratio between the 2017 and 2016 share of traffic. We decided to consider only programming languages (like Java and Python) and platforms (such as iOS, Android, Windows and Linux) in this analysis, as opposed to frameworks like Angular or libraries like TensorFlow (although many of those showed notable growth that may be examined in a future post).
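That growth-rate definition — the ratio between a tag's share of traffic in 2017 and its share in 2016 — can be sketched in a few lines of Python (the traffic shares below are made up for illustration):

```python
def yoy_growth(share_prev, share_curr):
    """Year-over-year growth of a tag's traffic share.

    Returns the percentage change: 27.0 means 27% growth,
    a negative value means the tag shrank.
    """
    return (share_curr / share_prev - 1.0) * 100.0

# Hypothetical shares of all question visits for one tag
share_2016, share_2017 = 0.0800, 0.1016

print("Growth: {0:.0f}%".format(yoy_growth(share_2016, share_2017)))
```

Using shares of traffic rather than raw visit counts keeps the measure insensitive to overall growth in site traffic between the two years.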
Because of the challenges in defining “fastest-growing” described in this comic, we compare the growth to the overall average in a mean-difference plot. With a 27% year-over-year growth rate, Python stands alone as a tag that is both large and growing rapidly; the next-largest tag that shows similar growth is R. We see that traffic to most other large tags has stayed pretty steady within high-income countries, with visits to Android, iOS, and PHP decreasing slightly. (We previously examined some of the shrinking tags like Objective-C, Perl and Ruby in our post on the death of Flash.) We can also notice that among functional programming languages, Scala is the largest and growing, while F# and Clojure are smaller and shrinking, with Haskell in between and remaining steady.

There’s an important omission from the above chart: traffic to TypeScript questions grew by an impressive 142% in the last year, enough that we left it off to avoid overwhelming the rest of the scale. You can also see that some other smaller languages are growing similarly to or faster than Python (like R, Go and Rust), and there are a number of tags like Swift and Scala that are also showing impressive growth. How does their traffic over time compare to Python’s?

The growth of languages like R and Swift is indeed impressive, and TypeScript has shown especially rapid expansion in an even shorter time. Many of these smaller languages grew from getting almost no question traffic to become notable presences in the software ecosystem. But as this graph shows, it’s easier to show rapid growth when a tag started relatively small. Note that we’re not saying these languages are in any way “competing” with Python. Rather, we’re explaining why we’d treat their growth in a separate category; these were lower-traffic tags to start with. Python is an unusual case for being both one of the most visited tags on Stack Overflow and one of the fastest-growing ones. (Incidentally, it is also accelerating!
Its year-over-year growth has become faster each year since 2013).

### Rest of the world

So far in this post we’ve been analyzing the trends in high-income countries. Does Python show a similar growth in the rest of the world, in countries like India, Brazil, Russia and China? Indeed it does. Outside of high-income countries Python is still the fastest-growing major programming language; it simply started at a lower level and the growth began two years later (in 2014 rather than 2012). In fact, the year-over-year growth rate of Python in non-high-income countries is slightly higher than it is in high-income countries. We don’t examine it here, but R, the other language whose usage is positively correlated with GDP, is growing in these countries as well.

Many of the conclusions in this post about the growth and decline of tags (as opposed to the absolute rankings) in high-income countries hold true for the rest of the world; there’s a 0.979 Spearman correlation between the growth rates in the two segments. In some cases, you can see a “lagging” phenomenon similar to what happened with Python, where a technology was widely adopted within high-income countries a year or two before it expanded in the rest of the world. (This is an interesting phenomenon and may be the subject of a future blog post!)

### Next time

We’re not looking to contribute to any “language war.” The number of users of a language doesn’t imply anything about its quality, and certainly can’t tell you which language is more appropriate for a particular situation. With that perspective in mind, however, we believe it’s worth understanding what languages make up the developer ecosystem, and how that ecosystem might be changing. This post demonstrated that Python has shown a surprising growth in the last five years, especially within high-income countries. In our next post, we’ll start to explore the “why”.
We’ll segment the growth by country and by industry, and examine what other technologies tend to be used alongside Python (to estimate, for example, how much of the growth has been due to increased usage of Python for web development versus for data science).

Original Source. [tags Programming, Python, Data Science]

Read me...

## Data Science and Analytics with Python - New York Team

Earlier this week I received this picture of the team in New York. As you can see they have all recently received a copy of my "Data Science and Analytics with Python" book. Thanks guys!

Read me...

## Languages for Data Science

Very often the question comes up of what programming language is best for data science work. The answer may depend on who you ask; there are many options out there and they all have their advantages and disadvantages. Here are some thoughts from Peter Gleeson on this matter:

While there is no correct answer, there are several things to take into consideration. Your success as a data scientist will depend on many points, including:

#### Specificity

When it comes to advanced data science, you will only get so far reinventing the wheel each time. Learn to master the various packages and modules offered in your chosen language. The extent to which this is possible depends on what domain-specific packages are available to you in the first place!

#### Generality

A top data scientist will have good all-round programming skills as well as the ability to crunch numbers. Much of the day-to-day work in data science revolves around sourcing and processing raw data, or ‘data cleaning’. For this, no amount of fancy machine learning packages are going to help.

#### Productivity

In the often fast-paced world of commercial data science, there is much to be said for getting the job done quickly. However, this is what enables technical debt to creep in — and only with sensible practices can this be minimized.
#### Performance

In some cases it is vital to optimize the performance of your code, especially when dealing with large volumes of mission-critical data. Compiled languages are typically much faster than interpreted ones; likewise, statically typed languages are considerably more fail-proof than dynamically typed ones. The obvious trade-off is against productivity.

To some extent, these can be seen as a pair of axes (Generality-Specificity, Performance-Productivity). Each of the languages below falls somewhere on these spectra. With these core principles in mind, let’s take a look at some of the more popular languages used in data science. What follows is a combination of research and the personal experience of myself, friends and colleagues — but it is by no means definitive! In approximate order of popularity, here goes:

### R

#### What you need to know

Released in 1995 as a direct descendant of the older S programming language, R has since gone from strength to strength. Written in C, Fortran and itself, the project is currently supported by the R Foundation for Statistical Computing.

#### License

Free!

#### Pros

• Excellent range of high-quality, domain-specific and open source packages. R has a package for almost every quantitative and statistical application imaginable. This includes neural networks, non-linear regression, phylogenetics, advanced plotting and many, many others.
• The base installation comes with very comprehensive, in-built statistical functions and methods. R also handles matrix algebra particularly well.
• Data visualization is a key strength, with the use of libraries such as ggplot2.

#### Cons

• Performance. There’s no two ways about it, R is not a quick language.
• Domain specificity. R is fantastic for statistics and data science purposes. But less so for general purpose programming.
• Quirks. R has a few unusual features that might catch out programmers experienced with other languages.
For instance: indexing from 1, using multiple assignment operators, unconventional data structures.

#### Verdict — “brilliant at what it’s designed for”

R is a powerful language that excels at a huge variety of statistical and data visualization applications, and being open source allows for a very active community of contributors. Its recent growth in popularity is a testament to how effective it is at what it does.

### Python

#### What you need to know

Guido van Rossum introduced Python back in 1991. It has since become an extremely popular general purpose language, and is widely used within the data science community. The major versions are currently 3.6 and 2.7.

#### License

Free!

#### Pros

• Python is a very popular, mainstream general purpose programming language. It has an extensive range of purpose-built modules and community support. Many online services provide a Python API.
• Python is an easy language to learn. The low barrier to entry makes it an ideal first language for those new to programming.
• Packages such as pandas, scikit-learn and TensorFlow make Python a solid option for advanced machine learning applications.

#### Cons

• Type safety: Python is a dynamically typed language, which means you must show due care. Type errors (such as passing a String as an argument to a method which expects an Integer) are to be expected from time to time.
• For specific statistical and data analysis purposes, R’s vast range of packages gives it a slight edge over Python. Among general purpose languages, there are faster and safer alternatives to Python.

#### Verdict — “excellent all-rounder”

Python is a very good choice of language for data science, and not just at entry-level. Much of the data science process revolves around the ETL process (extraction-transformation-loading). This makes Python’s generality ideally suited. Libraries such as Google’s TensorFlow make Python a very exciting language to work in for machine learning.
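That ETL-friendliness is easy to illustrate with the standard library alone. A minimal sketch (the data and column names are made up for illustration): extract rows from CSV text, transform them by dropping malformed records and casting types, and load the clean records back out:

```python
import csv
import io

# Hypothetical raw data: ages arrive as strings, one row has a missing age
raw = io.StringIO("name,age\nAda,36\nGrace,\nAlan,41\n")

# Extract: parse the CSV into dictionaries
rows = list(csv.DictReader(raw))

# Transform: drop rows with a missing age, cast the rest to int
clean = [{"name": r["name"], "age": int(r["age"])}
         for r in rows if r["age"]]

# Load: write the cleaned records out again
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "age"])
writer.writeheader()
writer.writerows(clean)

print(clean)
```

The same three-stage shape scales from this toy example up to pandas pipelines, which is a large part of why Python fits the day-to-day data-cleaning work described above.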
### SQL

#### What you need to know

SQL (‘Structured Query Language’) defines, manages and queries relational databases. The language appeared in 1974 and has since undergone many implementations, but the core principles remain the same.

#### License

Varies — some implementations are free, others proprietary

#### Pros

• Very efficient at querying, updating and manipulating relational databases.
• Declarative syntax makes SQL an often very readable language. There’s no ambiguity about what SELECT name FROM users WHERE age > 18 is supposed to do!
• SQL is widely used across a range of applications, making it a very useful language to be familiar with. Modules such as SQLAlchemy make integrating SQL with other languages straightforward.

#### Cons

• SQL’s analytical capabilities are rather limited — beyond aggregating, summing, counting and averaging data, your options are limited.
• For programmers coming from an imperative background, SQL’s declarative syntax can present a learning curve.
• There are many different implementations of SQL, such as PostgreSQL, SQLite and MariaDB. They are all different enough to make inter-operability something of a headache.

#### Verdict — “timeless and efficient”

SQL is more useful as a data processing language than as an advanced analytical tool. Yet so much of the data science process hinges upon ETL, and SQL’s longevity and efficiency are proof that it is a very useful language for the modern data scientist to know.

### Java

#### What you need to know

Java is an extremely popular, general purpose language which runs on the Java Virtual Machine (JVM). It's an abstract computing system that enables seamless portability between platforms. Currently supported by Oracle Corporation.

#### License

Version 8 — Free! Legacy versions, proprietary.

#### Pros

• Ubiquity. Many modern systems and applications are built upon a Java back-end.
The ability to integrate data science methods directly into the existing codebase is a powerful one to have.

• Strongly typed. Java is no-nonsense when it comes to ensuring type safety. For mission-critical big data applications, this is invaluable.
• Java is a high-performance, general purpose, compiled language. This makes it suitable for writing efficient ETL production code and computationally intensive machine learning algorithms.

#### Cons

• For ad-hoc analyses and more dedicated statistical applications, Java’s verbosity makes it an unlikely first choice. Dynamically typed scripting languages such as R and Python lend themselves to much greater productivity.
• Compared to domain-specific languages like R, there aren’t a great number of libraries available for advanced statistical methods in Java.

#### Verdict — “a serious contender for data science”

There is a lot to be said for learning Java as a first-choice data science language. Many companies will appreciate the ability to seamlessly integrate data science production code directly into their existing codebase, and you will find Java’s performance and type safety are real advantages. However, you’ll be without the range of stats-specific packages available to other languages. That said, definitely one to consider — especially if you already know one of R and/or Python.

### Scala

#### What you need to know

Developed by Martin Odersky and released in 2004, Scala is a language which runs on the JVM. It is a multi-paradigm language, enabling both object-oriented and functional approaches. The cluster computing framework Apache Spark is written in Scala.

#### License

Free!

#### Pros

• Scala + Spark = High performance cluster computing. Scala is an ideal choice of language for those working with high-volume data sets.
• Multi-paradigmatic: Scala programmers can have the best of both worlds, with both object-oriented and functional programming paradigms available to them.
• Scala is compiled to Java bytecode and runs on a JVM. This allows inter-operability with the Java language itself, making Scala a very powerful general purpose language, while also being well-suited for data science.

#### Cons

• Scala is not a straightforward language to get up and running with if you’re just starting out. Your best bet is to download sbt and set up an IDE such as Eclipse or IntelliJ with a specific Scala plug-in.
• The syntax and type system are often described as complex. This makes for a steep learning curve for those coming from dynamic languages such as Python.

#### Verdict — “perfect, for suitably big data”

When it comes to using cluster computing to work with Big Data, Scala + Spark are fantastic solutions. If you have experience with Java and other statically typed languages, you’ll appreciate these features of Scala too. Yet if your application doesn’t deal with the volumes of data that justify the added complexity of Scala, you will likely find your productivity being much higher using other languages such as R or Python.

### Julia

#### What you need to know

Released just over 5 years ago, Julia has made an impression in the world of numerical computing. Its profile was raised thanks to early adoption by several major organizations, including many in the finance industry.

#### License

Free!

#### Pros

• Julia is a JIT (‘just-in-time’) compiled language, which lets it offer good performance. It also offers the simplicity, dynamic typing and scripting capabilities of an interpreted language like Python.
• Julia was purpose-designed for numerical analysis. It is capable of general purpose programming as well.
• Readability. Many users of the language cite this as a key advantage.

#### Cons

• Maturity. As a new language, some Julia users have experienced instability when using packages. But the core language itself is reportedly stable enough for production use.
• Limited packages are another consequence of the language’s youthfulness and small development community. Unlike the long-established R and Python, Julia doesn’t have the same choice of packages (yet).

#### Verdict — “one for the future”

The main issue with Julia is one for which it cannot be blamed. As a recently developed language, it isn’t as mature or production-ready as its main alternatives Python and R. But, if you are willing to be patient, there’s every reason to pay close attention as the language evolves in the coming years.

### MATLAB

#### What you need to know

MATLAB is an established numerical computing language used throughout academia and industry. It is developed and licensed by MathWorks, a company established in 1984 to commercialize the software.

#### License

Proprietary — pricing varies depending on your use case

#### Pros

• Designed for numerical computing. MATLAB is well-suited for quantitative applications with sophisticated mathematical requirements such as signal processing, Fourier transforms, matrix algebra and image processing.
• Data visualization. MATLAB has some great inbuilt plotting capabilities.
• MATLAB is often taught as part of many undergraduate courses in quantitative subjects such as Physics, Engineering and Applied Mathematics. As a consequence, it is widely used within these fields.

#### Cons

• Proprietary licence. Depending on your use case (academic, personal or enterprise) you may have to fork out for a pricey licence. There are free alternatives available, such as Octave. This is something you should give real consideration to.
• MATLAB isn’t an obvious choice for general-purpose programming.

#### Verdict — “best for mathematically intensive applications”

MATLAB’s widespread use in a range of quantitative and numerical fields throughout industry and academia makes it a serious option for data science.
The clear use-case would be when your application or day-to-day role requires intensive, advanced mathematical functionality; indeed, MATLAB was specifically designed for this.

### Other Languages

There are other mainstream languages that may or may not be of interest to data scientists. This section provides a quick overview… with plenty of room for debate of course!

#### C++

C++ is not a common choice for data science, although it has lightning fast performance and widespread mainstream popularity. The simple reason may be a question of productivity versus performance.

“If you’re writing code to do some ad-hoc analysis that will probably only be run one time, would you rather spend 30 minutes writing a program that will run in 10 seconds, or 10 minutes writing a program that will run in 1 minute?”

The dude’s got a point. Yet for serious production-level performance, C++ would be an excellent choice for implementing machine learning algorithms optimized at a low level.

Verdict — “not for day-to-day work, but if performance is critical…”

## Python 3, Pandas and Encoding Issues

It is not unusual to come across encoding problems when opening files in Python 3. The subject matter is a large topic of discussion, and here I am providing some quick ways to deal with a typical encoding issue you are likely to encounter.

Say you are interested in opening a CSV file to be loaded into a pandas dataframe. If the stars align and the generator of your CSV is magnanimous, they may have saved the file using UTF-8. If so you may get away with reading the file (here called myfile.csv) as follows:

```python
import pandas as pd
df = pd.read_csv('myfile.csv')
```

You should in principle pass a parameter to pandas telling it what encoding the file has been saved with, so a more complete version of the snippet above would be:

```python
import pandas as pd
df = pd.read_csv('myfile.csv', encoding='utf-8')
```

### Encoding conundrum

What happens when you don't know what encoding was used to save the file?
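Before reaching for a detection library, one brute-force, standard-library-only answer is to try a shortlist of candidate encodings until one decodes cleanly (the candidate list below is an assumption; adjust it to the encodings you expect to meet):

```python
def guess_encoding(data, candidates=('utf-8', 'latin-1')):
    """Return the first candidate encoding that decodes `data` without error.

    Note: latin-1 decodes ANY byte sequence, so it acts as a
    catch-all and should go last in the candidate list.
    """
    for enc in candidates:
        try:
            data.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None

# 'café' saved as UTF-8 decodes cleanly with the first candidate
print(guess_encoding('café'.encode('utf-8')))
# bytes that are invalid UTF-8 fall through to latin-1
print(guess_encoding(b'caf\xe9'))
```

This is only a heuristic — a file can decode without error under the wrong encoding and still produce garbage characters — which is where a proper detection library earns its keep.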
Well, you can ask, but it is very unlikely that the file generator knows... What to do? Well, there are some libraries that can be helpful. Install the chardet module as follows from the terminal:

```shell
pip install chardet
```

And use the following snippet as a guide:

```python
import chardet
import pandas as pd

def find_encoding(fname):
    r_file = open(fname, 'rb').read()
    result = chardet.detect(r_file)
    charenc = result['encoding']
    return charenc

my_encoding = find_encoding('myfile.csv')
df = pd.read_csv('myfile.csv', encoding=my_encoding)
```

Et voilà!

Read me...

## Running a Python workshop

## iPad keyboard - Caps Lock key changes language

I have been experiencing this issue for some time now... I have an external keyboard for my iPad, and every time I hit the Caps Lock key, instead of locking the capital letters on the keyboard, the iPad changes language... This is particularly annoying as I use several languages, from Spanish to Japanese. I decided that enough is enough and I have now managed to find a way to avoid this:

1. Go to Settings, General
2. Open Keyboard
3. Select Hardware Keyboard
4. Switch off the toggle for switching to/from Latin with Caps Lock

Et voilà!

Read me...

## Jupyter not Launching after Updating to MacOS 10.12.5

If you have been itching to update the operating system on your shiny Mac, beware that there is a broken link when launching Jupyter. Do not panic; simply follow these simple steps:

• Open a terminal and type the following command:

```shell
jupyter notebook --generate-config
```

• In the terminal, navigate to the place where the configuration file is. In other words, type the following:

```shell
cd ~/.jupyter
```

• Open the jupyter_notebook_config.py file and look for a line that says:

```python
# c.NotebookApp.browser = ''
```

Change the line as follows: delete the hash and write the name of your browser in between the quotes.
For SAFARI it would look as follows:

```python
c.NotebookApp.browser = 'safari'
```

If you are a CHROME supporter, the command looks a tad more complicated:

```python
c.NotebookApp.browser = 'open -a /Applications/Google\ Chrome.app %s'
```

• Save the script and close it.
• Restart your terminal and launch Jupyter...

Et voilà!

Read me...

## Jetpack Issue Solved - HTTP status code was not 200 (500)

I have been having some trouble managing a WordPress site via the WordPress App. The stats and other things worked fine, but I was not able to see my existing posts or pages. Every time I tried to synchronise them I would get an error that read something along the lines of:

"transport error – HTTP status code was not 200 (500)"

I did not pay too much attention to it as I thought it was an error with the App and that in new updates it would get sorted... but that did not seem to be the case. This morning I rolled up my sleeves and decided to take a look at things. I ended up updating the PHP version on the site from 5.3 to 5.4 and it seems to have done the trick!! So, I thought of letting you know. Now, I cannot guarantee that that is the end of the story, but at least my posts are showing in the mobile app.

-j

Read me...

## Data Science and Analytics with Python - Cover

Well, I am very pleased to show you the cover that will be used for the "Data Science and Analytics with Python" book. Not long to publication day!

Read me...

## GA - Intro to Data Science and Analytics

Very pleased to have given an intro talk on Data Science and Analytics at General Assembly yesterday.

Read me...

## "Essential Matlab and Octave" in the CERN Document Server

I got pinged this screenshot from a friend who saw "Essential MATLAB and Octave" being included in the CERN Document Server! Chuffed!

Read me...

## Data Science and Analytics with Python - Proofread Manuscript

I have now received comments and corrections for the proofreading of my “Data Science and Analytics with Python” book.
Two weeks and counting to return corrections and comments to the editor and project manager.

Read me...

## Anaconda - Guaranteed Python packages via Conda and Conda-Forge

During the weekend a member of the team got in touch because he was unable to get a Python package working for him. He had just installed Python on his machine, but things were not quite right... For example, pip was not working and he had a bit of a bother setting some environment variables... I recommended he have a look at installing Python via the Anaconda distribution. Today he was up and running with his app.

Given that outcome, I thought it was a great coincidence that the latest episode of Talk Python To Me that started playing on my way back home happened to be about Conda and Conda-Forge. I highly recommend listening to it. Take a look:

### Talk Python To Me - Python conversations for passionate developers - #94 Guaranteed packages via Conda and Conda-Forge

Have you ever had trouble installing a package you wanted to use in your Python app? Likely it contained some odd dependency, required a compilation step, maybe even using an uncommon compiler like Fortran. Did you try it on Windows? How many times have you seen "Cannot find vcvarsall.bat" before you had to take a walk?

If this sounds familiar, you might want to check out conda the package manager, Anaconda the distribution, conda-forge, and conda-build. They dramatically lower the bar for installing packages on all the platforms. This week you'll meet Phil Elson, Kale Franz, and Michael Sarahan, who all work on various parts of this ecosystem.

Links from the show:

conda: conda.pydata.org
conda-build: conda.pydata.org/docs/commands/build/conda-build.html
Anaconda distribution: continuum.io/anaconda-overview
conda-forge: conda-forge.github.io
Phil Elson on Twitter: @pypelson
Kale Franz: @kalefranz
Michael Sarahan: github.com/msarahan

Read me...
## The Winton Gallery opens at the Science Museum

During the recent Christmas and New Year break I had the opportunity to visit the Science Museum (yes, again...). This time to see the newly opened Winton Gallery that houses the Mathematics exhibit in the museum. Not only is the exhibit about a subject matter close to my heart, but the gallery was also designed by Zaha Hadid Architects. I must admit that the first I heard of this was on a recent visit to the IMAX at the Science Museum to see Rogue One... Anyway, I took some pictures that you can see in the photo gallery here, and I am also re-posting an entry that appeared in the London Mathematical Society newsletter Number 465 for January 2017.

Mathematics: The Winton Gallery opens at the Science Museum, London

On 8 December 2016 the Science Museum opened a pioneering new gallery that explores how mathematicians, their tools and ideas have helped shape the modern world over the last 400 years. Mathematics: The Winton Gallery places mathematics at the heart of all our lives, bringing the subject to life through remarkable stories, artefacts and design. More than 100 treasures from the Science Museum's world-class science, technology, engineering and mathematics collections help tell powerful stories about how mathematical practice has shaped and been shaped by some of our most fundamental human concerns – including money, trade, travel, war, life and death.

From a beautiful 17th-century Islamic astrolabe that used ancient mathematical techniques to map the night sky to an early example of the famous Enigma machine, designed to resist even the most advanced mathematical techniques for codebreaking, each historical object has an important story to tell about how mathematics has shaped our world. Archive photography and film help capture these stories, and digital exhibits alongside key objects introduce the wide range of people who made, used or were affected by each mathematical device.
Dramatically positioned at the centre of the gallery is the Handley Page 'Gugnunc' aircraft, built in 1929 for a competition to construct a safe aircraft. Ground-breaking aerodynamic research influenced the wing design of this experimental aircraft, helping transform public opinion about the safety of flying and securing the future of the aviation industry. This aeroplane highlights perfectly the central theme of the gallery: how mathematical practice is driven by, and influences, real-world concerns and activities.

Mathematics also defines Zaha Hadid Architects' design for the gallery. Inspired by the Handley Page aircraft, the gallery is laid out using principles of mathematics and physics. These principles also inform the three-dimensional curved surfaces representing the patterns of airflow that would have streamed around this aircraft. Patrik Schumacher, Partner at Zaha Hadid Architects, recently noted that mathematics was part of Zaha Hadid's life from a young age and was always the foundation of her architecture, describing the new mathematics gallery as 'an important part of Zaha's legacy in London'.

Gallery curator David Rooney, who was responsible for the Science Museum's recent award-winning Codebreaker: Alan Turing's Life and Legacy exhibition, explained that the gallery tells 'a rich cultural story of human endeavour that has helped transform the world'. The mathematics gallery was made possible through an unprecedented donation from long-standing supporters of science, David and Claudia Harding. Additional support was also provided by Principal Sponsor Samsung, Major Sponsor MathWorks and a number of individual donors. A lavishly illustrated new book, Mathematics: How It Shaped Our World, written by David Rooney and published by Scala Arts & Heritage Publishers, accompanies the new display.
It expands the stories covered in the gallery and contains an absorbing series of newly commissioned essays by prominent historians and mathematicians including June Barrow-Green, Jim Bennett, Patricia Fara, Dame Celia Hoyles and Helen Wilson, with an afterword from Dame Zaha Hadid with Patrik Schumacher.

Read me...

## World of Watson - Talk on Data Science

Last week I had the opportunity to attend the annual IBM conference in Las Vegas. The World of Watson conference, formerly known as Insight, provided me with an opportunity to meet new interesting people, talk to colleagues and customers, learn new things and share some ideas with like-minded people. As you can imagine, with Watson being at the centre stage of the event, there were a large number of presentations, stands and marketing featuring Watson-related things: from cognitive chocolate and brews through to cognitive computing and beyond.

My session took place on Monday October 24th and I was very pleased to see a full room, and even standing-room only just minutes before the start. We covered some of the fundamentals of data science and machine learning and took the pulse of their use in the insurance industry in particular. I then had the opportunity of sharing some of the results of the work we have been doing over the past 12 months at the Data Science Studio in London. The case studies showcased included examples in insurance, banking, wealth management and retail. All in all, it was a very successful and enjoyable trip, in spite of the constant flashing lights of the slot machines around Las Vegas' different venues.

Read me...

## Extract tables from messy spreadsheets with jailbreakr (reblog)

The original blog can be seen here.

R has some good tools for importing data from spreadsheets, among them the readxl package for Excel and the googlesheets package for Google Sheets.
But these only work well when the data in the spreadsheet are arranged as a rectangular table, and not overly encumbered with formatting or generated with formulas. As Jenny Bryan pointed out in her recent talk at the useR!2016 conference (embedded below, or download PDF slides here), in practice few spreadsheets have "a clean little rectangle of data in the upper-left corner", because most people use spreadsheets not just as a file format for data retrieval, but also as a reporting/visualization/analysis tool.

Nonetheless, for a practicing data scientist, there's a lot of useful data locked up in these messy spreadsheets that needs to be imported into R before we can begin analysis. As just one example given by Jenny in her talk, this spreadsheet was included as one of 15,000 spreadsheet attachments (one with 175 tabs!) in the Enron Corpus.

To make it easier to import data into R from messy spreadsheets like this, Jenny and co-author Richard G. FitzJohn created the jailbreakr package. The package is in its early stages, but it can already import Excel (xlsx format) and Google Sheets into R as new "linen" objects from which small sub-tables can easily be extracted as data frames. It can also print spreadsheets in a condensed text-based format with one character per cell — useful if you're trying to figure out why an apparently simple spreadsheet isn't importing as you expect. (Check out the "weekend getaway winner" story near the end of Jenny's talk for a great example.)

The jailbreakr package isn't yet on CRAN, but if you want to try it out you can download it from the GitHub repository (or even contribute!) at the link below.

GitHub (rsheets): jailbreakr

Read me...

## Raspberry Pi

I am very pleased to have finally received the Raspberry Pi 3 that I ordered the other day.
I also got a Sense HAT - an add-on board for Raspberry Pi, made especially for the Astro Pi mission. The Sense HAT has an 8×8 RGB LED matrix, a five-button joystick and includes the following sensors:

• Gyroscope
• Accelerometer
• Magnetometer
• Temperature
• Barometric pressure
• Humidity

There is even a Python library providing easy access to everything on the board. I can't wait to start using it with some of the APIs available at Bluemix, for example. Any ideas are more than welcome.

Read me...

## Bluemix - a set of tools/tutorials for app development

IBM's Bluemix provides access to a large set of APIs such as Watson services like AlchemyAPI, Natural Language Classifier, Visual Recognition, Personality Insights and more. I have recently started playing with it a bit more. You can set up a free account (free for 30 days) and see what you think. Check it out:

Here is what IBM has to say about it:

Bluemix is the latest cloud offering from IBM. It enables organizations and developers to quickly and easily create, deploy, and manage applications on the cloud. Bluemix is an implementation of IBM's Open Cloud Architecture based on Cloud Foundry, an open source Platform as a Service (PaaS). Bluemix delivers enterprise-level services that can easily integrate with your cloud applications without you needing to know how to install or configure them.

I will be happy to hear what you build and how you use Bluemix. Keep in touch.

Read me...

## Now Reading: Algorithms to Live By

A pretty good read about how computer algorithms can be applied to our everyday lives, helping to solve common decision-making problems and more.

Read me...

## Getting ready for WWDC16 in San Francisco

Read me...

## Installing Spark 1.6.1 on a Mac with Scala 2.11

I have recently gone through the process of installing Spark on my Mac for testing and development purposes.
I also wanted to make sure I could use the installation not only with Scala, but also with PySpark through a Jupyter notebook. If you are interested in doing the same, here are the steps I followed. First of all, here are the packages you will need:

• Python 2.7 or higher
• Java SE Development Kit
• Scala and Scala Build Tool
• Spark 1.6.1 (at the time of writing)
• Jupyter Notebook

Python

You can choose the best Python distribution that suits your needs. I find Anaconda to be fine for my purposes. You can obtain a graphical installer from https://www.continuum.io/downloads. I am using Python 2.7 at the time of writing.

Java SE Development Kit

You will need to download Oracle Java SE Development Kit 7 or 8 from the Oracle JDK downloads page. In my case, at the time of writing I am using 1.7.0_80. You can check the version you have by opening a terminal and typing:

java -version

You also have to make sure that the appropriate environment variable is set up. In your ~/.bash_profile add the following line:

export JAVA_HOME=$(/usr/libexec/java_home)

Scala and Scala Build Tool

In this case, I found it much easier to use Homebrew to install and manage the Scala language. If you have never used Homebrew, I recommend that you take a look. To install it, type the following in your terminal:

ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Once you have Homebrew you can install Scala and the Scala Build Tool (sbt) as follows:

> brew install scala
> brew install sbt

You may want to set the appropriate environment variables in your ~/.bash_profile:

export SCALA_HOME=/usr/local/bin/scala
export PATH=$PATH:$SCALA_HOME/bin

Spark 1.6.1

Obtain Spark from https://spark.apache.org/downloads.html

Note that for building Spark with Scala 2.11 you will need to download the Spark source code and build it appropriately. Once you have downloaded the tgz file, unzip it into an appropriate location (your home directory for example) and navigate to the unzipped folder (for example ~/spark-1.6.1). To build Spark with Scala 2.11 you need to type the following commands:

> ./dev/change-version-to-2.11.sh
> build/sbt clean assembly

This may take a while, so sit tight! When finished, you can check that everything is working by launching either the Scala shell:

> ./bin/spark-shell

or the Python shell:

> ./bin/pyspark

Once again there are some environment variables that are recommended:

export SPARK_PATH=~/spark-1.6.1
export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
alias sparknb='$SPARK_PATH/bin/pyspark --master local[2]'

The last line is an alias that will enable us to launch a Jupyter notebook with PySpark. Totally optional!

Jupyter Notebook

If all is working well you are ready to go. Source your ~/.bash_profile and launch a Jupyter notebook:

> sparknb

Et voilà!

## Makeover Monday: Will a sugar tax have an impact on childhood obesity?

Following up the Data+Visual Meetup hosted at IBM last Wednesday, I wanted to take part in the Makeover Monday project that Andy Kriebel highlighted during his talk.

This week the data came from the BBC, in particular the visualisation that shows how people in the UK get their added sugar:

This is a story that follows up the recent announcement by Chancellor George Osborne about a tax on sugary drinks in the UK. Here is my Makeover Monday for the visualisation above:

Voilà!

## How much should we fear the rise of artificial intelligence?

When the arena is something as pure as a board game, where the rules are entirely known and always exactly the same, the results are remarkable. When the arena is something as messy, unrepeatable and ill-defined as actuality, the business of adaptation and translation is a great deal more difficult.

Tom Chatfield

From the opinion article by Tom Chatfield in The Guardian.

## Google Drive not synching on your Mac? Here is what to do

I am not a big user of Google Drive. It is a good service and it mostly does what one may need from a suite of productivity apps... but for some reason I only use it in very limited cases.

So, no surprise that I had not noticed that the syncing between the cloud version of my documents and those on my Mac had gone pear-shaped. I tried logging out of Drive but that did not help. I attempted forcing the sync by making changes in both the cloud version and the Mac, but got the same result.

I managed to sort it out in the end and here is what I did:

1. Exit the Drive application
2. Navigate to the Application Support folder and look for the Google folder. You may need to find the hidden Library folder:
• In Finder, open the Go menu and hold Option (⌥) to reveal the hidden Library folder
• Once there, look for "Application Support"
• Alternatively you can press Cmd + Shift + G and go to "~/Library/Application Support/Google"
3. Delete the Drive folder
4. Start the Drive application
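For the terminal-inclined, steps 2 and 3 collapse into a single command. This is a sketch: quit Drive first, and double-check the path before running, since it deletes Drive's local settings folder.

```shell
# Remove Google Drive's local settings folder (quit the Drive app first!).
# Drive will recreate it the next time it starts.
rm -rf "$HOME/Library/Application Support/Google/Drive"
```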

Et voilà

## A quick way to tame Mac notifications while giving a presentation

Surely you have suffered this same situation: you are giving a really good presentation, with a fantastic slide deck on your shiny MacBook; you are dominating the stage and people are nodding at your witty insights... and then an email notification appears in the top right-hand corner of the screen, followed by a FaceTime call from your other half... Noooooo!

A good way to disable these notifications is to ⌥-click (Option-click) the Notification Center icon in the menu bar:

## How Watson answers a question

If you wanted to know more about how Watson works, here is a good video that may help.

https://youtu.be/DywO4zksfXw

## Watson and some well-known people

I recently came across a few interesting videos of IBM's Watson appearing alongside some well-known people like Ridley Scott and Carrie Fisher, here are some:

https://youtu.be/EjLDFqGaOE4

https://youtu.be/NX8y9T1MaP4

## Quantum algorithms for topological and geometric analysis of data

Story Source:

This post is reprinted from materials provided by the Massachusetts Institute of Technology. The original item was written by David L. Chandler. Note: materials may be edited for content and length.

From gene mapping to space exploration, humanity continues to generate ever-larger sets of data -- far more information than people can actually process, manage, or understand.

Machine learning systems can help researchers deal with this ever-growing flood of information. Some of the most powerful of these analytical tools are based on a strange branch of geometry called topology, which deals with properties that stay the same even when something is bent and stretched every which way.

Such topological systems are especially useful for analyzing the connections in complex networks, such as the internal wiring of the brain, the U.S. power grid, or the global interconnections of the Internet. But even with the most powerful modern supercomputers, such problems remain daunting and impractical to solve. Now, a new approach that would use quantum computers to streamline these problems has been developed by researchers at MIT, the University of Waterloo, and the University of Southern California.

The team describes their theoretical proposal this week in the journal Nature Communications. Seth Lloyd, the paper's lead author and the Nam P. Suh Professor of Mechanical Engineering, explains that algebraic topology is key to the new method. This approach, he says, helps to reduce the impact of the inevitable distortions that arise every time someone collects data about the real world.

In a topological description, basic features of the data (How many holes does it have? How are the different parts connected?) are considered the same no matter how much they are stretched, compressed, or distorted. Lloyd explains that it is often these fundamental topological attributes "that are important in trying to reconstruct the underlying patterns in the real world that the data are supposed to represent."

It doesn't matter what kind of dataset is being analyzed, he says. The topological approach to looking for connections and holes "works whether it's an actual physical hole, or the data represents a logical argument and there's a hole in the argument. This will find both kinds of holes."

Using conventional computers, that approach is too demanding for all but the simplest situations. Topological analysis "represents a crucial way of getting at the significant features of the data, but it's computationally very expensive," Lloyd says. "This is where quantum mechanics kicks in." The new quantum-based approach, he says, could exponentially speed up such calculations.

Lloyd offers an example to illustrate that potential speedup: If you have a dataset with 300 points, a conventional approach to analyzing all the topological features in that system would require "a computer the size of the universe," he says. That is, it would take 2^300 (two to the 300th power) processing units -- approximately the number of all the particles in the universe. In other words, the problem is simply not solvable in that way.

"That's where our algorithm kicks in," he says. Solving the same problem with the new system, using a quantum computer, would require just 300 quantum bits -- and a device this size may be achieved in the next few years, according to Lloyd.

"Our algorithm shows that you don't need a big quantum computer to kick some serious topological butt," he says.

There are many important kinds of huge datasets where the quantum-topological approach could be useful, Lloyd says, for example understanding interconnections in the brain. "By applying topological analysis to datasets gleaned by electroencephalography or functional MRI, you can reveal the complex connectivity and topology of the sequences of firing neurons that underlie our thought processes," he says.

The same approach could be used for analyzing many other kinds of information. "You could apply it to the world's economy, or to social networks, or almost any system that involves long-range transport of goods or information," Lloyd says. But the limits of classical computation have prevented such approaches from being applied before.

While this work is theoretical, "experimentalists have already contacted us about trying prototypes," he says. "You could find the topology of simple structures on a very simple quantum computer. People are trying proof-of-concept experiments."

The team also included Silvano Garnerone of the University of Waterloo in Ontario, Canada, and Paolo Zanardi of the Center for Quantum Information Science and Technology at the University of Southern California.

## Data Science Bootcamp - Done

Today I had the opportunity of running a #DataScience bootcamp in London. It was an all-day affair and although the attendees were engaged, I’m sure that by the end of the 6th hour they were quite tired.
The discussions ranged from what data science is, to the skills required to become a data scientist, and how to manage data scientists. Finally we implemented some data analyses based on linear regression, all using R. I was very pleased to see some of the results.


## Opening old Keynote/Pages files in new versions

Greetings readers! I hope you are all enjoying the break and getting ready for 2016.

This time I wanted to bring to your attention some information that you may find very useful. Particularly if, like me, you happen to need some old slides, presentations or talks you have in Keynote but forgot (or rather did not need) to update to a newer version of the software. You may have thought that there would be some backward compatibility for this sort of thing, and you may be surprised that there is not an obvious click-and-update type solution. Nonetheless, not all is lost and you will not have to trash your presentations, unless of course they were not the slides you were looking for... This trick also works with Pages, by the way.

You may find that when opening your old slide decks, Keynote complains with:

This document can't be opened because it's too old. To open it, save it with Keynote '09 first.

and Pages with:

This document can't be opened because it's too old. To open it, save it with Pages '09 first.

Of course, if you have both versions installed this should not be a problem, but why would you do that? So, if you cannot open the old file in the first place, here is what you need to do (please make sure that you have a backup copy of your file... you never know...):

1. Open the Terminal and navigate to the directory where the old file is saved. So if your file is called my_presentation.keynote and it is saved on your Desktop, just type:
> cd Desktop
2. Rename the file with a .zip extension:
> mv my_presentation.keynote my_presentation.zip
3. Unzip the file:
> unzip my_presentation.zip -d my_presentation
4. Navigate into the unzipped folder (where index.apxl.gz now lives) and type the following command:
> cd my_presentation
> gunzip --stdout index.apxl.gz | sed 's-:version="72007061400"-:version="92008102400"-g' > index.apxl

and hit return. If you do not get any errors you are good to go.

5. Remove the index.apxl.gz file
6. Re-compress the folder and change the extension to the original one.

Try opening your file, it may still complain but at least you will be able to open it. Et voilà!
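If you want to convince yourself that the sed rewrite in step 4 does what it should before touching a real presentation, here is a self-contained sketch that fabricates a tiny index.apxl.gz (with a made-up element name) and rewrites its version attribute the same way:

```shell
# Create a minimal gzipped XML file carrying the old version attribute...
printf '<presentation sl:version="72007061400"/>' | gzip > index.apxl.gz
# ...rewrite the version exactly as in step 4...
gunzip --stdout index.apxl.gz | sed 's-:version="72007061400"-:version="92008102400"-g' > index.apxl
# ...and confirm the new version string is in place.
grep -c '92008102400' index.apxl
```

Run it in a throwaway directory; once you trust the pipeline, repeat it inside the unzipped presentation folder.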

## MacTex updates for El Capitan

El Capitan! Great! The new version of the OS X operating system. New features, new fonts, new problems... I knew that updating was going to bring some unexpected problems with my applications, but I wanted to update... And ditto: as soon as I tried to take a look under the hood for a couple of things, I realised that a fresh installation of homebrew was going to be needed.

More importantly, with my new book on data science (aka "Data Science and Analytics with Python"), LaTeX is probably one of the most used things on my computer. So I wanted to check that things were fine, and although I could compile (currently trying to finish Chapter 3, in case you are wondering), there were some issues here and there; for example, TeX Live thought I was using version 0 (yes, zero!) and it could not find some files.

It turns out that El Capitan does not let us write to /usr, so the symbolic link at /usr/texbin that the 2015 TeX distribution creates is removed (if it was there from a previous OS version) and cannot be re-created. If a GUI looks by default at that location it will sadly no longer find it. That is why the terminal was not affected! (Phew!)

The solution is to tell the broken applications to look at /Library/TeX/texbin instead; /Library/TeX is "owned" by MacTeX and so is allowed by El Capitan. To fix TeX Live Utility do the following:

• Open TeX Live Utility Preferences and click on Choose...
• That opens a file chooser. Type Shift-Cmd-G, enter /Library/TeX into the dialog box and then press Return.
• Finally, double-click on texbin.
• Et voilà!
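If you also drive LaTeX from scripts, you can point your PATH at the new location explicitly. A one-line fragment for your ~/.bash_profile, assuming the standard MacTeX location mentioned above:

```shell
# Put the MacTeX binaries (pdflatex, tlmgr, etc.) on the PATH.
export PATH="/Library/TeX/texbin:$PATH"
```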

## Google's and Gerty's logos are quite similar

I have recently updated my applications and got confused when trying to launch my book reader Gerty: instead of opening the book(s) I'm currently reading, I found myself staring at Google's search bar...

I am sure that is something neither of them would like, but hey... Just pointing it out. The similarity is superficial, but enough to get confused when looking at small icons in a screen. Check it out:

## iPython Notebook is now Jupyter... I knew it!

It is not really news... Jupyter is the new name of the beloved IPython project, and it has been for a while. As the Jupyter project puts it:

"The language-agnostic parts of IPython are getting a new home in Project Jupyter"

As announced on the python.org page, as of version 4.0 the Big Split from the old IPython starts. I knew this and I even tweeted about it:

All great, right? Well, I still got surprised when, after updating my Python installation, I tried to start my IPython notebook and got an error that ended with:

File "importstring.py", line 31, in import_item
module = __import__(package, fromlist=[obj])
ImportError: No module named notebook.notebookapp


Then I remembered, and to fix my problem I simply tried installing Jupyter (I am using Anaconda) with the following command:

conda install jupyter

Et voilà!
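A quick way to check that the fix worked, without launching anything, is to ask Python whether the module the traceback complained about is now importable. A small helper (not part of Jupyter itself, just a convenience):

```python
import importlib.util

def module_available(name):
    """Return True if the named module can be imported in this environment."""
    return importlib.util.find_spec(name) is not None

# After `conda install jupyter`, "notebook" should be importable again:
print(module_available("notebook"))
```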


## Cloudera Breakfast Briefing and Tofu Scientists

Last Thursday I attended a Cloudera Breakfast Briefing where Sean Owen was speaking about Spark and the examples were related to building decision trees and random forests. It was a good session in general.

Sean started his talk with an example using the Iris dataset using R, in particular the "party" library. He then moved on to talk about Spark and MLlib.

For the rest of the talk he used the "Covertype" data set, which contains 581,012 data points describing trees using 54 features (elevation, slope, soil type, etc.), predicting forest cover type (spruce, aspen, etc.). A very apt dataset for the construction of random forests, right? I was very pleased to see a new (for me) dataset being used!

Sean went over some bits and pieces about using Spark, highlighting the compactness of the code. He also turned his attention to the tuning of hyper-parameters and their importance.

There are different ways to approach this, but it is always about finding a balance, a trade-off. For a tree we can play with the depth of the tree, the maximum number of bins (i.e. the number of different decision rules to be tried), and the impurity measure (Gini or entropy).
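As an aside, the Gini impurity mentioned above is easy to compute by hand; a minimal Python sketch (the label lists are just illustrative):

```python
def gini(labels):
    """Gini impurity of a set of class labels: 1 minus the sum of
    squared class proportions. 0 means a perfectly pure node."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

print(gini(["spruce"] * 4))           # pure node -> 0.0
print(gini(["spruce", "aspen"] * 2))  # even 50/50 split -> 0.5
```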

If we don't know the right values for the hyperparameters, we can try several, particularly if you have enough room on your cluster.

• Building a random forest: let various trees see only a subset of the data, then combine them. Another approach is to let the trees see a subset of the features. The latter is a nice idea as it may be more reasonable for large clusters, where communication among nodes is kept to a minimum -> good for Spark or Hadoop.

Sean finished with some suggestions of things one can try:

• Try SVM and LogisticRegression in MLlib
• Real-time scoring with Spark Streaming
• Use random decision forests for regression

Nonetheless, the best bit of it all was that after asking a couple of questions I managed to get my hands on a "Tofu Scientist" T-shirt! Result!

## No shuffle in new iOS 8.4 Music App

I was not too sure about the new Apple Music offering, but so far it seems quite alright! The music choices are generally good, and I hope that as I use the music app in iOS 8.4 more the choices get better.

Unfortunately I ended up using the app while having no mobile coverage and no WiFi either... so I reverted to "My Music" and, since I was in the middle of a run, I just wanted to hit the shuffle button and hope for the best... However, I was surprised that there was no shuffle button to be seen... I ended up hitting the first song in the list and taking it from there. It turns out that the shuffle option is set by default; you just have to seed it by starting to play any song. That seems good, except for the fact that it is not obvious at all.

You can select whether you want shuffle mode or not after starting to play any song and expanding the "Now Playing" bar:

And there you will be able to see the usual Shuffle icon:

## You have big data? MIT researchers can help you shrink it!

In a lot of machine learning and data science applications, it is not unusual to use matrices to represent data. It is indeed a very convenient way to keep the information but also to do manipulations, calculations and other useful tricks. As the size of the data increases, of course the size of the matrices grows too and that can be a bit problematic. Finding a way to reduce the size of these matrices while keeping the information is a challenge that a lot of us have faced. Using techniques that exploit the sparsity of the matrices, or even reducing the dimensionality via principal components is common practice.

Reading the latest World Economic Forum newsletter I came to find out about a new algorithm that MIT researchers will present at the ACM Symposium on Theory of Computing in June. The algorithm is said to find the smallest possible approximation of an original matrix while guaranteeing reliable computations. Indeed, the best way to determine how well the "reduced" matrix approximates the original one is to measure the "distance" between them, and a common distance to use is the typical Euclidean measure that we are all familiar with... What? You say you aren't?... Remember Pythagoras?

The square of the hypotenuse (the side opposite the right angle) is equal to the sum of the squares of the other two sides.

There you go... all you have to do is extend it to n dimensions et voilà... That is not the only way to measure distance, though. Think, for example, of the way you move in a grid city such as Manhattan, New York City... You cannot move diagonally (except on Broadway, I suppose...), so you need to go north-south or east-west. That distance is actually called the "Manhattan distance".

Mathematicians refer to "norms" when talking about distance measurement and indeed both the Euclidean and Manhattan distances mentioned above are norms:

• Manhattan distance is a 1-norm measure: the differences are raised to the power of 1 and summed.
• Euclidean distance is a 2-norm measure: the differences are raised to the power of 2 and summed (before taking the square root).
• etc...
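The two distances above can be written in a couple of lines; the points here are made up for the example:

```python
def manhattan(u, v):
    # 1-norm: sum of absolute differences
    return sum(abs(a - b) for a, b in zip(u, v))

def euclidean(u, v):
    # 2-norm: square root of the sum of squared differences
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

p, q = (0, 0), (3, 4)
print(manhattan(p, q))  # 7
print(euclidean(p, q))  # 5.0  (Pythagoras's 3-4-5 triangle)
```

Note how both follow the same recipe, only the power changes; that is exactly what the p-norm generalisation captures.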

So what about the MIT algorithm proposed by Richard Peng and Michael Cohen? Well, they show that their algorithm is optimal for "reducing" matrices under any norm! The first step is to assign each row of the original matrix a "weight". A row's weight is related to the number of other rows it is similar to, and it also determines the likelihood that the row will be included in the reduced matrix.

Let us imagine that the row is indeed included in the reduced matrix. Then its values will be multiplied according to its weight. So, for instance, if 10 rows are similar to one another, but not to any other rows of the matrix, each will have a 10 percent chance of getting into the condensed matrix. If one of them does, its entries will all be multiplied by 10, so that it will reflect the contribution of the other nine rows.
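Here is a toy sketch of that sampling-and-rescaling step. To be clear, this is only an illustration of the idea, not Peng and Cohen's actual weighting scheme, and the matrix and probabilities are invented for the example:

```python
import random

def sample_rows(matrix, probs, seed=0):
    """Keep each row with its given probability; rescale kept rows by 1/p
    so the reduced matrix is an unbiased stand-in for the original."""
    random.seed(seed)
    reduced = []
    for row, p in zip(matrix, probs):
        if random.random() < p:
            reduced.append([x / p for x in row])
    return reduced

# Ten nearly identical rows: each gets a 10 percent chance of survival,
# and any survivor is scaled up tenfold to represent the whole group.
matrix = [[1.0, 2.0]] * 10
probs = [0.1] * 10
kept = sample_rows(matrix, probs)
print(len(kept), "rows kept, each rescaled to", kept[:1])
```

The expected contribution of each group of similar rows is preserved, which is why such a sketch can stand in for the full matrix in subsequent computations.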

You would think that using the Manhattan distance would be simpler than the Euclidean one when calculating the weights... Well, you would be wrong! The previous best effort to reduce a matrix under the 1-norm would return a matrix whose number of rows was proportional to the number of columns of the original matrix raised to the power of 2.5. In the case of the Euclidean distance it would return a matrix whose number of rows is proportional to the number of columns of the original matrix times its own logarithm.
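To get a feel for the gap between those two bounds (ignoring constant factors), compare the two growth rates for a matrix with, say, 1,000 columns; the column count is made up for illustration:

```python
import math

d = 1000                        # number of columns in the original matrix
rows_1norm_old = d ** 2.5       # previous best for the 1-norm: d^2.5
rows_2norm = d * math.log(d)    # Euclidean-norm reduction: d * log(d)

print(f"1-norm (old): ~{rows_1norm_old:,.0f} rows")
print(f"2-norm:       ~{rows_2norm:,.0f} rows")
```

Up to constants, the old 1-norm bound is in the tens of millions of rows while the 2-norm bound is in the thousands, which is why closing that gap is significant.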

The MIT algorithm of Peng and Cohen is able to reduce matrices under the 1-norm as well as it does under the 2-norm. One important thing is that for the Euclidean norm the reduction is as good as that of other algorithms... and that is because they use the same best algorithm out there... However, for the 1-norm they apply it recursively!

Interested in reading the paper? Well, go to the arXiv and take a look!

## Markup for Fast Data Science Publication - Reblog

I am an avid user of Markdown via Mou and R Markdown (with RStudio). The facility that the iPython Notebook offers in combining code and text to be rendered in an interactive webpage makes it the choice for a number of things, including the 11-week Data Science course I teach at General Assembly.

As for LaTeX, well, I could not have survived my PhD without it and I still use it heavily. I have even created some videos about how to use LaTeX; you can take a look at them.

My book "Essential Matlab and Octave" was written and formatted in its entirety using LaTeX. My new book "Data Science and Analytics with Python" is having the same treatment.

I was very pleased to see the following blog post by Benjamin Bengfort. This is a reblog of that post and the original can be found here.

Markup for Fast Data Science Publication
Benjamin Bengfort

A central lesson of science is that to understand complex issues (or even simple ones), we must try to free our minds of dogma and to guarantee the freedom to publish, to contradict, and to experiment. — Carl Sagan in Billions & Billions: Thoughts on Life and Death at the Brink of the Millennium

As data scientists, it's easy to get bogged down in the details. We're busy implementing Python and R code to extract valuable insights from data, train effective machine learning models, or put a distributed computation system together. Many of these tasks, especially those relating to data ingestion or wrangling, are time-consuming but are the bread and butter of the data scientist's daily grind. What we often forget, however, is that we must not only be data engineers, but also contributors to the data science corpus of knowledge.

If a data product derives its value from data and generates more data in return, then a data scientist derives their value from previously published works and should generate more publications in return. Indeed, one of the reasons that Machine Learning has grown ubiquitous (see the many Python-tagged questions related to ML on Stack Overflow) is thanks to meticulous blog posts and tools from scientific research (e.g. Scikit-Learn) that enable the rapid implementation of a variety of algorithms. Google in particular has driven the growth of data products by publishing systems papers about their methodologies, enabling the creation of open source tools like Hadoop and Word2Vec.

By building on a firm base for both software and for modeling, we are able to achieve greater results, faster. Exploration, discussion, criticism, and experimentation all enable us to have new ideas, write better code, and implement better systems by tapping into the collective genius of a data community. Publishing is vitally important to keeping this data science gravy train on the tracks for the foreseeable future.

In academia, the phrase "publish or perish" describes the pressure to establish legitimacy through publications. Clearly, we don't want to take our role as authors that far, but the question remains, "How can we effectively build publishing into our workflow?" The answer is through markup languages - simple, streamlined markup that we can add to plain text documents that build into a publishing layout or format. For example, the following markup languages/platforms build into the accompanying publishable formats:

• Markdown → HTML
• iPython Notebook (JSON + Markdown) → Interactive Code
• reStructuredText + Sphinx → Python Documentation, ReadTheDocs.org
• AsciiDoc → ePub, Mobi, DocBook, PDF
• LaTeX → PDF

The great thing about markup languages is that they can be managed inline with your code workflow in the same software versioning repository. Github goes even further as to automatically render Markdown files! In this post, we'll get you started with several markup and publication styles so that you can find what best fits into your workflow and deployment methodology.

Markdown

Markdown is the most ubiquitous of the markup languages we'll describe in this post, and its simplicity means that it is often chosen for a variety of domains and applications, not just publishing. Markdown, originally created by John Gruber, is a text-to-HTML processor, where lightweight syntactic elements are used instead of the more heavyweight HTML tags. Markdown is intended for folks writing for the web, not designing for the web, and in some CMS systems, it is simply the way that you write, no fancy text editor required.

Markdown has seen special growth thanks to Github, which has an extended version of Markdown, usually referred to as "Github-Flavored Markdown." This style of Markdown extends the basics of the original Markdown to include tables, syntax highlighting, and other inline formatting elements. If you create a Markdown file in Github, it is automatically rendered when viewing files on the web, and if you include a README.md in a directory, that file is rendered below the directory contents when browsing code. Github Issues are also expected to be in Markdown, further extended with tools like checkbox lists.

Markdown is used for so many applications it is difficult to name them all. Below are a select few that might prove useful to your publishing tasks.

• Jekyll allows you to create static websites that are built from posts and pages written in Markdown.
• Github Pages allows you to quickly publish Jekyll-generated static sites from a Github repository for free.
• Silvrback is a lightweight blogging platform that allows you to write in Markdown (this blog is hosted on Silvrback).
• Day One is a simple journaling app that allows you to write journal entries in Markdown.
• iPython Notebook expects Markdown to describe blocks of code.
• Stack Overflow expects questions, answers, and comments to be written in Markdown.
• MkDocs is a software documentation tool written in Markdown that can be hosted on ReadTheDocs.org.
• GitBook is a toolchain for publishing books written in Markdown to the web or as an eBook.

There are also a wide variety of editors, browser plugins, viewers, and tools available for Markdown. Both Sublime Text and Atom support Markdown and automatic preview, as well as most text editors you'll use for coding. Mou is a desktop Markdown editor for Mac OSX and iA Writer is a distraction-free writing tool for Markdown for iOS. (Please comment your favorite tools for Windows and Android). For Chrome, extensions like Markdown Here make it easy to compose emails in Gmail via Markdown or Markdown Preview to view Markdown documents directly in the browser.

Clearly, Markdown enjoys a broad ecosystem and diverse usage. If you're still writing HTML for anything other than templates, you're definitely doing it wrong at this point! It's also worth including Markdown rendering for your own projects if you have user submitted text (also great for text-processing).

Rendering Markdown can be accomplished with the Python Markdown library, usually combined with the Bleach library for sanitizing bad HTML and linkifying raw text. A simple demo of this is as follows:

First install markdown and bleach using pip:

    $ pip install markdown bleach

Then create a markdown parsing function as follows:

    import bleach
    from markdown import markdown

    def htmlize(text):
        """
        This helper method renders Markdown then uses Bleach to sanitize it,
        as well as converting all links in the text to actual anchor tags.
        """
        text = bleach.clean(text, strip=True)  # Clean the text by stripping bad HTML tags
        text = markdown(text)                  # Convert the Markdown to HTML
        text = bleach.linkify(text)            # Linkify raw text and add nofollow to existing links
        return text

Given a markdown file test.md whose contents are as follows:

    # My Markdown Document

    For more information, search on [Google](http://www.google.com).

    _Grocery List:_

    1. Apples
    2. Bananas
    3. Oranges

The following code:

    >>> with open('test.md', 'r') as f:
    ...     print htmlize(f.read())

Will produce the following HTML output:

    <h1>My Markdown Document</h1>
    For more information, search on <a href="http://www.google.com" rel="nofollow">Google</a>.
    <em>Grocery List:</em>
    <ol>
    <li>Apples</li>
    <li>Bananas</li>
    <li>Oranges</li>
    </ol>

Hopefully this brief example has also served as a demonstration of how Markdown and other markup languages work to render much simpler text with lightweight markup constructs into a larger publishing framework. Markdown itself is most often used for web publishing, so if you need to write HTML, then this is the choice for you! To learn more about Markdown syntax, please see Markdown Basics.

iPython Notebook

iPython Notebook is a web-based, interactive environment that combines Python code execution, text (marked up with Markdown), mathematics, graphs, and media into a single document. The motivation for iPython Notebook was purely scientific: How do you demonstrate or present your results in a repeatable fashion where others can understand the work you've done?
By creating an interactive environment where code, graphics, mathematical formulas, and rich text are unified and executable, iPython Notebook gives a presentation layer to otherwise unreadable or inscrutable code. Although Markdown is a big part of iPython Notebook, it deserves a special mention because of how critical it is to the data science community. iPython Notebook is interesting because it combines both the presentation layer and the markup layer. When run as a server, usually locally, the notebook is editable, explorable (a tree view will present multiple notebook files), and executable - any code written in Python in the notebook can be evaluated and run using an interactive kernel in the background. Math formulas written in LaTeX are rendered using MathJax. To enhance the delivery and shareability of these notebooks, the NBViewer allows you to share static notebooks from a Github repository.

iPython Notebook comes with most scientific distributions of Python like Anaconda or Canopy, but it is also easy to install iPython with pip:

    $ pip install ipython

iPython itself is an enhanced interactive Python shell or REPL that extends the basic Python REPL with many advanced features, primarily allowing for a decoupled two-process model that enables the notebook. This process model essentially runs Python as a background kernel that receives execution instructions from clients and returns responses back to them.

To start an iPython notebook execute the following command:

    $ ipython notebook

This will start a local server at http://127.0.0.1:8888 and automatically open your default browser to it. You'll start in the "dashboard view", which shows all of the notebooks available in the current working directory. Here you can create new notebooks and start to edit them. Notebooks are saved as .ipynb files in the local directory, a format called "Jupyter" that is simply JSON with a specific structure for representing each cell in the notebook. The Jupyter notebook files are easily versioned via Git and Github since they are also plain text.

To learn more about iPython Notebook, please see the iPython Notebook documentation.

reStructuredText

reStructuredText is an easy-to-read plaintext markup syntax specifically designed for use in Python docstrings or to generate Python documentation. In fact, the reStructuredText parser is a component of Docutils, an open-source text processing system that is used by Sphinx to generate intelligent and beautiful software documentation, in particular the native Python documentation.

Python software has a long history of good documentation, particularly because of the idea that batteries should come included. And documentation is a very strong battery! PyPi, the Python Package Index, ensures that third-party packages provide documentation, and that the documentation can be easily hosted online through Python Hosted. Because of the ease of use and ubiquity of the tools, Python programmers are known for having very consistently documented code; sometimes it's hard to tell the standard library from third-party modules!

In How to Develop Quality Python Code, I mentioned that you should use Sphinx to generate documentation for your apps and libraries in a docs directory at the top level. Generating reStructuredText documentation in a docs directory is fairly easy:

    $ mkdir docs
    $ cd docs
    $ sphinx-quickstart

The quickstart utility will ask you many questions to configure your documentation. Aside from the project name, author, and version (which you have to type in yourself), the defaults are fine. However, I do like to change a few things:

...
> todo: write "todo" entries that can be shown or hidden on build (y/n) [n]: y
> coverage: checks for documentation coverage (y/n) [n]: y
...
> mathjax: include math, rendered in the browser by MathJax (y/n) [n]: y

Similar to iPython Notebook, reStructuredText can render LaTeX-syntax mathematical formulas. This utility will create a Makefile for you; to generate HTML documentation, simply run the following command in the docs directory:

    $ make html

The output will be built in the folder _build/html, where you can open index.html in your browser. While hosting documentation on Python Hosted is a good choice, a better choice might be Read the Docs, a website that allows you to create, host, and browse documentation. One great part of Read the Docs is the stylesheet that they use; it's more readable than older ones. Additionally, Read the Docs allows you to connect a Github repository so that whenever you push new code (and new documentation), it is automatically built and updated on the website. Read the Docs can even maintain different versions of documentation for different releases.

Note that even if you aren't interested in the overhead of learning reStructuredText, you should use your newly found Markdown skills to ensure that you have good documentation hosted on Read the Docs. See MkDocs for document generation in Markdown that Read the Docs will render. To learn more about reStructuredText syntax, please see the reStructuredText Primer.

AsciiDoc

When writing longer publications, you'll need a more expressive tool that is just as lightweight as Markdown but able to handle constructs that go beyond simple HTML, for example cross-references, chapter compilation, or multi-document build chains. Longer publications should also move beyond the web and be renderable as an eBook (ePub or Mobi formats) or for print layout, e.g. PDF. These requirements add more overhead, but simplify workflows for larger media publication.

Writing for O'Reilly, I discovered that I really enjoyed working in AsciiDoc - a lightweight markup syntax, very similar to Markdown, which renders to HTML or DocBook. DocBook is very important because it can be post-processed into other presentation formats such as HTML, PDF, EPUB, DVI, MOBI, and more, making AsciiDoc an effective tool not only for web publishing but also print and book publishing.
Most text editors have an AsciiDoc grammar for syntax highlighting, in particular sublime-asciidoc and Atom AsciiDoc Preview, which make writing AsciiDoc as easy as Markdown. AsciiDoctor is an AsciiDoc-specific toolchain for building books and websites from AsciiDoc. The project connects the various AsciiDoc tools and provides a simple command-line interface as well as preview tools. AsciiDoctor is primarily used for HTML and eBook formats, but at the time of this writing there is a PDF renderer in beta.

Another interesting project of O'Reilly's is Atlas, a system for push-button publishing that manages AsciiDoc using a Git repository and wraps editorial build processes, comments, and automatic editing in a web platform. I'd be remiss not to mention GitBook, which provides a similar toolchain for publishing larger books, though with Markdown. Editor's Note: GitBook does support AsciiDoc. To learn more about AsciiDoc markup, see AsciiDoc 101.

LaTeX

If you've done any graduate work in the STEM fields then you are probably already familiar with using LaTeX to write and publish articles, reports, conference and journal papers, and books. LaTeX is not a simple markup language, to say the least, but it is effective. It is able to handle almost any publishing scenario you can throw at it, including (and in particular) rendering complex mathematical formulas correctly from a text markup language. Most data scientists still use LaTeX, via MathJax or the Daum Equation Editor, if only for the math.

If you're going to be writing PDFs or reports, I can provide two primary tips for working with LaTeX. First, consider cloud-based editing with Overleaf or ShareLaTeX, which allow you to collaborate on and edit LaTeX documents similarly to Google Docs. Both of these systems have many of the classes and stylesheets already, so that you don't have to worry too much about the formatting and can instead just get down to writing.
Additionally, they aggregate other tools like LaTeX templates and provide templates of their own for most document types. My personal favorite workflow, however, is to use the Atom editor with the LaTeX package and the LaTeX grammar. When using Atom, you get very nice Git and Github integration - perfect for collaboration on larger documents. If you have a TeX distribution installed (and you will need to do that on your local system, no matter what), then you can automatically build your documents within Atom and view them in PDF preview. A complete tutorial for learning LaTeX can be found at Text Formatting with LaTeX.

Conclusion

Software developers agree that testing and documentation are vital to the successful creation and deployment of applications. However, although Agile workflows are designed to ensure that documentation and testing are included in the software development lifecycle, too often testing and documentation are left to last, or forgotten. When managing a development project, team leads need to ensure that documentation and testing are part of the "definition of done."

In the same way, writing is vital to the successful creation and deployment of data products, and is similarly left to last or forgotten. Through publication of our work and ideas, we open ourselves up to criticism, an effective methodology for testing ideas and discovering new ones. Similarly, by explicitly sharing our methods, we make it easier for others to build systems rapidly, and in return, write tutorials that help us better build our systems. And if we translate scientific papers into practical guides, we help to push science along as well.

Don't get bogged down in the details of writing, however. Use simple, lightweight markup languages to include documentation alongside your projects. Collaborate with other authors and your team using version control systems, and use free tools to make your work widely available.
All of this is possible because of lightweight markup languages, and the more proficient you are at including writing in your workflow, the easier it will be to share your ideas.

Special thanks to Rebecca Bilbro for editing and contributing to this post. Without her, this would certainly have been much less readable! As always, please follow @DistrictDataLab on Twitter and subscribe to this blog by clicking the Subscribe button on the blog home page.

Benjamin Bengfort

## Using curl to download a shortened URL - Dropbox, bit.ly

I was in the middle of an introductory workshop for Data Science at General Assembly, talking about using command-line instructions to facilitate the manipulation of files and folders. We covered some of the usual ones such as ls, mv, mkdir, cat, more, less, etc. I was then going to demonstrate how easy it is to download a file from the command line using curl, and I had prepared a small file uploaded to Dropbox, shortening its URL with bit.ly. "So far so good" - I thought - and then proceeded with the demonstration... only to find out that the command I was using was indeed downloading a file, but it was only downloading the wrapper HTML created by bit.ly for the redirection... I should have known better than that! Of course, all this happened while various pairs of gazing eyes were upon me... I tried again using a different flag and... nothing! And again... nothing... Pressure mounting, I decided to cut the embarrassment short and apologised, getting them to download the file in the less glamorous way, using the browser...

So, if you are ever in that predicament, here is the solution: use the -L flag with curl:

    $ curl -L -o newname.ext http://your.shortened.url

The -L flag deals with the redirection of the shortened URL, and make sure that you use the -o flag to assign a new name to your file.

Et voilà!

## The physical book! Essential MATLAB and Octave

It has been a long wait, but finally today I got my hands on the physical version of my book. So pleased.

It is available from the publishers
http://www.crcpress.com/product/isbn/9781482234633

## How to choose between learning Python or R first - Reblog

This post is a reblog of a post by Cheng Han Lee; the original can be seen at Udacity.

January 12, 2015

If you’re interested in a career in data, and you’re familiar with the set of skills you’ll need to master, you know that Python and R are two of the most popular languages for data analysis. If you’re not exactly sure which to start learning first, you’re reading the right article.

When it comes to data analysis, both Python and R are simple (and free) to install and relatively easy to get started with. If you’re a newcomer to the world of data science and don’t have experience in either language, or with programming in general, it makes sense to be unsure whether to learn R or Python first.

Luckily, you can’t really go wrong with either.

The Case for R

R has a long and trusted history and a robust supporting community in the data industry. Together, those facts mean that you can rely on online support from others in the field if you need assistance or have questions about using the language. Plus, there are plenty of publicly released packages, more than 5,000 in fact, that you can download to use in tandem with R to extend its capabilities to new heights. That makes R great for conducting complex exploratory data analysis. R also integrates well with other computer languages like C++, Java, and C.

When you need to do heavy statistical analysis or graphing, R’s your go-to. Common mathematical operations like matrix multiplication work straight out of the box, and the language’s array-oriented syntax makes it easier to translate from math to code, especially for someone with no or minimal programming background.

The Case for Python

Python is a general-purpose programming language that can pretty much do anything you need it to: data munging, data engineering, data wrangling, website scraping, web app building, and more. It’s simpler to master than R if you have previously learned an object-oriented programming language like Java or C++.

In addition, because Python is an object-oriented programming language, it’s easier to write large-scale, maintainable, and robust code with it than with R. Using Python, the prototype code that you write on your own computer can be used as production code if needed.

Although Python doesn’t have as comprehensive a set of packages and libraries available to data professionals as R, the combination of Python with tools like Pandas, Numpy, Scipy, Scikit-Learn, and Seaborn will get you pretty darn close. The language is also slowly becoming more useful for tasks like machine learning, and basic to intermediate statistical work (formerly just R’s domain).

Choosing Between Python and R

Here are a few guidelines for determining whether to begin your data language studies with Python or with R.

Personal preference

Choose the language to begin with based on your personal preference, on which comes more naturally to you, which is easier to grasp from the get-go. To give you a sense of what to expect, mathematicians and statisticians tend to prefer R, whereas computer scientists and software engineers tend to favor Python. The best news is that once you learn to program well in one language, it’s pretty easy to pick up others.

Project selection

You can also make the Python vs. R call based on a project you know you’ll be working on in your data studies. If you’re working with data that’s been gathered and cleaned for you, and your main focus is the analysis of that data, go with R. If you have to work with dirty or jumbled data, or to scrape data from websites, files, or other data sources, you should start learning, or advancing your studies in, Python.

Collaboration

Once you have the basics of data analysis under your belt, another criterion for evaluating which language to further your skills in is what language your teammates are using. If you’re all literally speaking the same language, it’ll make collaboration—as well as learning from each other—much easier.

Job market

Jobs calling for skill in Python compared to R have increased similarly over the last few years.

That said, Python has started to overtake R in data jobs. Thanks to the expansion of the Python ecosystem, tools for nearly every aspect of computing are readily available in the language. In addition, since Python can be used to develop web applications, it enables companies to employ crossover between Python developers and data science teams. That’s a major boon given the shortage of data experts in the current marketplace.

The Bottom Line

In general, you can’t err whether you choose to learn Python first or R first for data analysis. Each language has its pros and cons for different scenarios and tasks. In addition, there are actually libraries to use Python with R, and vice versa—so learning one won’t preclude you from being able to learn and use the other. Perhaps the best solution is to use the above guidelines to decide which of the two languages to begin with, then fortify your skill set by learning the other one.

Is your brain warmed up enough yet? Get to it!

## Failed Battery

I have had this 17-in MacBook Pro for a few years… perhaps about 8 years? Probably a bit more? In any case, I have it more as a memento than anything else, as I have a more modern one these days. I still keep it updated and all the rest of it, so I was rather surprised to get it out and see that the battery has effectively burst!!! I hope the rest of the machine still works though :(

## Programming Languages Ranking 2014

Well, it seems that it is that time of the month when the TIOBE index releases the rankings of programming languages. Happy to see R improving its position, going from 15 to 12. Matlab is at 24 though...

The index is based on the number of skilled engineers worldwide, the courses and third-party vendors that use each of the languages, and popular search engines, which are used to calculate the ratings. Just remember that the TIOBE index is not about the best programming language, nor the language in which most lines of code have been written.

The definition of the TIOBE index can be found here. In any case here are the rankings:

| Nov 2014 | Nov 2013 | Programming Language | Ratings | Change |
|---------:|---------:|----------------------|--------:|-------:|
| 1 | 1 | C | 17.469% | -0.69% |
| 2 | 2 | Java | 14.391% | -2.13% |
| 3 | 3 | Objective-C | 9.063% | -0.34% |
| 4 | 4 | C++ | 6.098% | -2.27% |
| 5 | 5 | C# | 4.985% | -1.04% |
| 6 | 6 | PHP | 3.043% | -2.34% |
| 7 | 8 | Python | 2.589% | -0.52% |
| 8 | 10 | JavaScript | 2.088% | +0.04% |
| 9 | 12 | Perl | 2.073% | +0.55% |
| 10 | 11 | Visual Basic .NET | 2.061% | +0.09% |
| 11 | - | Visual Basic | 1.657% | +1.66% |
| 12 | 31 | R | 1.548% | +1.14% |
| 13 | 9 | Transact-SQL | 1.408% | -1.11% |
| 14 | 13 | Ruby | 1.211% | -0.09% |
| 15 | 17 | Delphi/Object Pascal | 0.957% | +0.31% |
| 16 | 23 | F# | 0.892% | +0.39% |
| 17 | 18 | PL/SQL | 0.870% | +0.27% |
| 18 | - | Swift | 0.834% | +0.83% |
| 19 | 14 | Pascal | 0.831% | +0.12% |
| 20 | 81 | Dart | 0.816% | +0.73% |

## Apple Notes and Gmail Notes

I accidentally ended up creating some notes in the Gmail Notes on my iDevice, only to be completely confounded by the fact that I could not see them on my desktop. I tried to find some resolution by looking at the instructions for Apple Notes, but got frustrated with the lack of information.

So, here it is how I solved my issue:

It seems that as an Apple Notes user, one can select to have the Notes saved "On my iPhone/iPad/Mac" or synced to any email account of one's choice. If you choose the first option, there are no issues, but the "fun" part comes with the latter. In that case the application will send notes from the device to the Gmail servers, or for that matter to whichever email account you designated, via IMAP. This means that your notes are treated as normal email and labelled "Notes". Not only that, they are automatically archived on arrival. The initial transfer is one-way only, which implies that the notes can't be restored from Gmail to the device. To find your Notes in Gmail you have to search for the "Notes" label!

If you call up your note on your device, the application accesses it from Gmail and displays it. But if you deleted it, as many of us do, then the app gets confused, as it does not know where the notes are... Deleting them from the device removes the label in Gmail, so they can no longer be accessed by the device and they get zombiefied in Gmail! They will still be present in All Mail, but without a label.

How to fix this... well, it depends. If the Notes were deleted from the Gmail account via the web interface, they will still be in the Trash for 30 days. You can "restore" them during that time and they will show up in the Notes app on the device.

If the Notes folder was deleted using the Mail App on the device, the notes will (probably) still be there under "All Mail" but without a label. You can search for them and re-apply the label!

My advice would be not to use the syncing at all... it has caused more pain than it should. Let me know if this helps.

## Writing to NTFS drives from a Mac

I was confronted with an old issue that had not been an issue for a while: writing from my Mac to an external hard drive formatted with Windows (NTFS). I used to have NTFS-3G (together with MacFUSE) installed and that used to be fine. However, I guess something went a bit eerie with Mavericks, as I was not able to get my old solution to work.

So, here is what I did (you will need superuser powers, so be prepared to type your password):

Open a Terminal (Terminal.app) and create a file called fstab in the /etc folder. For instance you can type:

$ sudo nano /etc/fstab

You can now enter some information in your newly created file, telling macOS about your device. If your external drive is called mydevice, enter the following:

LABEL=mydevice none ntfs rw,auto,nobrowse

Use tabs between the fields listed above. Save your file and you are now ready to plug in your device. There is a small caveat: once you do this, your hard drive is not going to appear on your Desktop. But do not despair, you can still use the terminal to access the drives mounted in the /Volumes folder, or link it to your Desktop as follows:

$ sudo ln -s /Volumes ~/Desktop/Volumes
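As an aside, if two drives happen to share the same label, /etc/fstab also accepts entries keyed by volume UUID instead of LABEL. The UUID below is a made-up placeholder; look up the real one for your drive with Disk Utility or diskutil info:

```
UUID=01234567-89AB-CDEF-0123-456789ABCDEF none ntfs rw,auto,nobrowse
```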

et voilà!

## WWDC programme

Yay, it looks like the programme for WWDC has been released.

## Installing Jekyll with Homebrew

I have been playing on and off with Jekyll and I find it a very interesting, useful and, once installed, easy tool for creating posts. However, the installation may or may not be that easy. Last time I installed it using Ruby directly and did not bother updating it for a while.

Recently I decided to update it and this time I decided to do that with the help of the fantastic Homebrew.

Everything seemed to work fine, except that discount kept on complaining. So I started afresh, and this time round the terminal complained saying:

-bash: jekyll: command not found

The problem was easy to solve once I remembered that brew places the code in the brew Cellar and thus Ruby could not find the gem directory. So I simply added the correct path and exported it:

export PATH=/usr/local/Cellar/ruby/2.1.1/bin:$PATH
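When setting this, it is worth prepending the Cellar directory to the existing $PATH rather than replacing it wholesale; otherwise every other command on the system becomes unfindable. A quick sketch (the Ruby version number is from my install; adjust to yours):

```shell
# Remember the current search path
before="$PATH"

# Prepend the Cellar bin directory (version-specific path; check yours with: ls /usr/local/Cellar/ruby)
export PATH="/usr/local/Cellar/ruby/2.1.1/bin:$PATH"

# Everything that was reachable before is still reachable
case "$PATH" in
  *"$before") echo "original PATH preserved" ;;
esac
```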

et voilà!

## Changing date/time in Ubuntu virtualbox

I was a bit puzzled by the fact that I could not easily change the date/time in a VirtualBox instance used by the High Performance Scientific Computing course run by Dr. Randall J. LeVeque via Coursera.

I tried using the simple date command but I kept on being told that

date: cannot set date: Operation not permitted

I tried updating the Ubuntu distro, but no luck. Eventually I found a solution using a symlink to localtime:

$ cd /etc
$ mv localtime localtime_original
$ ln -s /usr/share/zoneinfo/Europe/London ./localtime

You will have to use the correct zone for your location. Et voilà!

## Navigating the terminal

The work computer of one of my colleagues has recently closed the circle and he now has a shiny new Apple computer. He is very well-versed in a bunch of computer-related tasks; nonetheless, he asked me the other day about shortcuts to navigate a shell terminal. I showed him a few tricks, and I thought of posting some here just in case they are helpful to my readers too:

• To go to the beginning of the command line - Ctrl+A
• To go to the end of the command line - Ctrl+E
• To delete from the current position to the beginning of the line - Ctrl+U
• To undelete - Ctrl+Y
• To delete from the current position to the end of the line - Ctrl+K
• To delete the word behind the current position - Ctrl+W

He was also wondering about an easy way to create a file and open it immediately. The way I do that is with a bash function placed in my .bashrc file:

mytouch() {
  touch "$1"
  open "$1"
}

Enjoy!

## An alternative way to reduce the size of PDF in a mac

I am sure you, like me, have had the need to reduce the file size of a PDF. Take for example the occasional need to send a PDF by email, just to find out that the size is such that the message is rejected. I have used Adobe Acrobat Pro to help, but recently I came across an alternative way of achieving this: use ColorSync Utility on a Mac. Here is how:

1. Right click the PDF that needs reducing and select "Open with…"
2. Select ColorSync Utility and wait for the application to open the file
3. At the bottom of the status bar in the application, you can now select one of the Quartz filters available
4. Press "Apply"
5. and voilà

## Getting Gephi 0.8.2 to work with a Mac

Ever since the previous Java update for the mac, my Gephi installation was not happy. I resorted to uninstalling version 0.8.2-beta and going back to 0.8.1.
Not a bad version, but definitely not one with the latest updates. Well, at least it worked and did not freeze or panic when trying to click on the menus. :D

I am very pleased to say that I have managed to get my installation of Gephi 0.8.2-beta working, and here is how: edit the contents of the package located in

/Applications/gephi_0.8.2-beta.app/Contents/Resources/gephi/bin/gephi

To do so you can right-click on the Gephi application and select "Show Package Contents". You can then navigate to the location mentioned above. I used Aquamacs to edit the file, but you can use your favourite plain-text editor. Towards the beginning of the file add the following line:

jdkhome="/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home"

Save the file and start Gephi as usual. This did the trick for me. I would like to credit the GitHub page for Gephi, where I ended up connecting the dots.

## Aquaterm plotting issue with Octave and Gnuplot (Mac)

I recently updated my version of Octave using Homebrew and something went a bit eerie... Nothing major, except that instead of plotting to the AquaTerm terminal, Octave and Gnuplot were only happy with X11. Not the greatest of issues, but I really prefer the look of graphs in AquaTerm, so here are the steps I followed to get things sorted:

First I uninstalled gnuplot from Homebrew using:

brew uninstall gnuplot

Just in case the problem was with AquaTerm, I re-downloaded it and installed it again. You can obtain AquaTerm here. I then reinstalled gnuplot, just to realise that some symlinks were not created.
You can check them by typing:

ls /usr/local/lib/libaquaterm*
ls /usr/local/include/aquaterm/*

If they do not exist, you can set them up by typing the following commands in your shell:

sudo ln -s /Library/Frameworks/AquaTerm.framework/Versions/A/AquaTerm /usr/local/lib/libaquaterm.dylib
sudo ln -s /Library/Frameworks/AquaTerm.framework/Versions/A/AquaTerm /usr/local/lib/libaquaterm.1.0.0.dylib
sudo ln -s /Library/Frameworks/AquaTerm.framework/Versions/A/Headers/* /usr/local/include/aquaterm/

That did the trick for me. I hope you find this helpful.

## Essential MATLAB and Octave

As some of you probably know, I am currently writing a book about MATLAB and Octave aimed at newcomers to both programming and the MATLAB/Octave environments. The book is tentatively entitled "Essential MATLAB and Octave" and I am getting closer and closer to finishing the text. The next step is preparing exercises and finalising things. My publisher, CRC Press, has been great and I hope the book does well. I'm aiming to finish things by May and in principle the book will be available from November or so. The whole process does take a while, but I am really looking forward to seeing the finished thing out there.

So, what triggered this post? Well, I have seen the appearance of a site with the book announced. I am not sure if these are usual practices, but in any case it is a good thing, don't you think?

## Grace Hopper Doodle

Once again Google puts out a doodle worth mentioning. This time they celebrate the 107th anniversary of the birth of computer scientist Grace Hopper. In case you do not know who Hopper is, well, let me say that she is the amazing woman behind COBOL (Common Business Oriented Language), which is still very much used today.

Grace Hopper was born in New York in 1906 and studied Mathematics and Physics (of course) at Vassar College, where she graduated in 1928. She then obtained a master's degree at Yale in 1930 and a PhD in 1934.
Hopper joined the US Navy Reserve during World War II and was assigned to the Bureau of Ordnance Computation Project at Harvard University, where she was only the third person to program the Harvard Mark I computer. She continued to work at Harvard until 1949, when she joined the Eckert-Mauchly Computer Corporation as a senior programmer. She helped to develop the UNIVAC I, which was the second commercial computer produced in the US. In the 1950s Hopper created the first ever compiler, known as the A compiler; its first version was called A-0. Hopper continued to serve in the Navy until 1986, by which time she was the oldest commissioned officer on active duty in the United States Navy. She died in Arlington, Virginia in 1992 at the age of 85.

## LondonR - Shiny

I had the chance to attend the latest LondonR meeting last week. It was an interesting gathering and I was pleasantly surprised to see that it was well attended by a variety of like-minded people. The meeting had talks by:

• Andy South - Making beautiful world maps with country-referenced data using rworldmap and other R packages
• Malcolm Sherrington - Algorithmic Trading with R
• Chris Beeley - Shiny happy web interfaces - Shiny, HTML, CSS, JavaScript, and Shiny Server working together

I am also very pleased that I managed to be on time to answer the question that Chris Beeley put on the day, winning a digital copy of his book Web Application Development with R Using Shiny. The book is available from Packt Publishing. Thanks to Chris Beeley and Packt for the book.

## Furigana (ふりがな) in LaTeX

Some time ago I wrote a post about adding furigana using MS Word for Mac. It seems that the post has been quite useful to a few readers; nonetheless, some of you have contacted me about the remark I made about doing this in LaTeX. So far I have helped people when they have requested help, but as I promised in that post, I have finally come round to adding a post about adding furigana using LaTeX.
Here is how: you will need the CJK-related packages used in the preamble below (CJKutf8, ruby and CJKulem, among others) installed in your LaTeX distribution. With these packages installed and working in your distribution, you can use a document similar to the following:

\documentclass[12pt]{article}
\usepackage[10pt]{type1ec} % use only 10pt fonts
\usepackage[T1]{fontenc}
\usepackage{CJKutf8}
\usepackage[german, russian, vietnam, USenglish]{babel}
\usepackage[overlap, CJK]{ruby}
\usepackage{CJKulem}

\renewcommand{\rubysep}{-0.2ex}

\newenvironment{Japanese}{%
  \CJKfamily{min}%
  \CJKtilde
  \CJKnospace}{}

\begin{document}
\begin{CJK}{UTF8}{}
\begin{Japanese}
\noindent これは日本語の文章

\noindent Hello
\begin{equation}
  \frac{2}{\pi}
\end{equation}

私は日本語の勉強します！

furigana: \ruby{私}{わたし}
\end{Japanese}
\end{CJK}
\end{document}

The outcome of the script above can be seen below:

## Disabled bundles for Mail in Mavericks

I have just updated a previous post with some of the UUIDs for using some plugins with Mail.app. The correct strings for Mail 7.0 in Mavericks are:

<string>0941BB9F-231F-452D-A26F-47A43863C991</string>
<string>04D6EC0A-52FF-4BBE-9896-C0B5FB851BBA</string>

For instructions on what to do with the strings above, please refer to this post.

## Command Line: a few tips

In various posts in the past I have given some tips using the Terminal, and some comments have arrived about how complicated they may seem. Nonetheless, I still think that the flexibility offered by these tools is what makes the UNIX/Mac environment so good. So in this post I would like to share some useful tips for using the terminal. Let me know what you think!

### 1. Download a File from the Web & Watch Progress

If you know the URL of a file that you need to download from the web, you can use curl with the -O flag to start downloading it:

$ curl -O url

Be sure to use the full URL. Also, remember to use the upper case ‘O’ and not the lowercase ‘o’ to keep the same file name on your local machine.
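To see the flag in action without hitting the network, here is a small sketch using a file:// URL (the file and directory names are just placeholders; with a real remote file you would pass its http or https URL instead):

```shell
# Create a stand-in for a "remote" file
mkdir -p remote local
printf 'some payload\n' > remote/data.txt

# -O saves the download under its remote name; drop -s to watch the progress meter
( cd local && curl -s -O "file://$(cd ../remote && pwd)/data.txt" )

cat local/data.txt
```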

### 2. List Directory Contents by Modification Date

You can indeed take a look at the graphical interface, but if all you want is a quick list of the files in a directory showing permissions, users, file size, and modification date, with the most recently modified files and folders appearing from the bottom up, then simply type the following:

$ ls -thor

### 3. Search Spotlight with Live Results from the Command Line

To do that you can use the mdfind command:

$ mdfind -live findme

This can go awfully quick depending on the specificity of the search terms; once you see a match, hit Control+C to stop looking.

### 4. Kill Processes Using Wildcards

Simply use the pkill command. For example, if you want to get rid of all the processes that start with "Sam" just type:
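The exact pattern below is a sketch on my part: pkill matches its argument as a regular expression against process names, and anchoring with ^ keeps it to names that start with the given text.

```shell
# For the "Sam" example this would be:
#   pkill '^Sam'
# A safe demonstration with a throwaway process instead:
sleep 300 &                # something disposable to kill
pkill '^sleep'             # terminates our processes whose name starts with "sleep"
```

Be careful with broad patterns: pkill signals every match it finds, so it is worth testing first with pgrep -l using the same pattern.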

### 6. Get the Last Occurrence of a Command Without Executing It

Once again, the bang is your friend. Use the following command, where "searchterm" must be substituted by the command you are looking for:

$ !searchterm:p

For example, to find the last full command that used the prefix "sudo" you would use:

$ !sudo:p

### 7. Instantly Create a Blank File or Multiple Files

All you have to do is "touch" the file...
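For example (the file names here are just placeholders):

```shell
# touch creates an empty file if it does not exist
# (or updates its timestamp if it does)
touch notes.txt

# Several files can be created in one go
touch draft1.txt draft2.txt draft3.txt

ls notes.txt draft*
```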

Where $MATLAB is the path to your installation. And voilà! Incidentally, if you are having problems with the graphics in MATLAB, such as the application crashing when plotting and the like, you can type the following command before launching MATLAB as specified above:

export DYLD_LIBRARY_PATH=/System/Library/Frameworks/JavaVM.framework/Libraries

Let me know how you get on with this, and should you find an alternative solution, let me know too! Enjoy!

## Python for iOS

There used to be a time when computers came with tools for someone to start programming; something like a version of BASIC would get you started. That has changed, to the point that some users cannot even imagine how to interact with their machines without a nice, eye-candy, even cumbersome graphical interface. iPhones and iPads are indeed powerful devices, but in their wisdom Apple would not let you easily program them. Fortunately, people are not easily convinced to drop the idea, and recently I came across Python for iOS, which is available in the AppStore. The application provides us with a simple Python interpreter that makes it easy to use on these devices. The user needs to remember that the application does not create native apps, but the tool might be very handy in conjunction with a more advanced development tool. Also, it has the added bonus of allowing newcomers to start programming, in a popular language, on devices that are largely seen as purely consumer ones. Will you give it a go? Let me know what you think.

## A hidden shortcut to switch to previous Desktop Space in Mac OS X

Imagine this: you are using Desktop 1 to write a long document, and you are doing so with information that is displayed in Desktop 4. You can indeed move the relevant window from Desktop 4 to Desktop 1, but that simply does not help. So you end up moving back and forth between the two. Did you know that you can do this using a double-tap with four fingers on your trackpad? No?
Well, this is because it is a hidden gesture. To activate the gesture, all you have to do is open a Terminal (Finder - Applications - Utilities - Terminal) and type the following two commands:

defaults write com.apple.dock double-tap-jump-back -bool TRUE
killall Dock

The changes take effect immediately after the second command is issued. Enjoy!

## Japanese writing in iOS and Mac OS X

I have written Japanese (not as well as I would like, though) for some time, and doing so with the computer is always a pleasure. I almost always use the same dictionary, the same method of writing, the same shortcuts, etc. Here are some of the things I use.

In iOS:

I have always used the "qwerty" keyboard for both Japanese and the rest (UK English, Spanish), and I have occasionally used the 10-key swiping method with the Kana keyboard (see picture below). In order to use the kana keyboard you really need to know your syllabary (hiragana), as the keys are arranged by sound; all you have to do is select the correct consonant sound and swipe in the direction of the vowel sound:

• A - in the middle,
• I - swipe to the left,
• U - swipe upwards,
• E - swipe to the right,
• O - swipe downwards.

It may take a while to get used to it, but I just love the simplicity of it all, and it even has a key for emoticons! The Japanese auto-completion seems much more advanced on mobiles than on computers, and I think better than editors in English, Spanish or other languages, because it automatically chooses for you. As for a dictionary, I use Kotoba. It is free and works without a network connection.

For Mac:

I use "kotoeri" by default. I have also enabled the option to write kanji by hand with the trackpad to search the dictionary for kanji. Instructions on how to activate and use it can be found here. In a nutshell: 1. Choose System Preferences from the Apple () menu. 2. Choose Language & Text from the View menu. 3.
Select the Input Sources tab. 4. Enable the checkboxes for Pinyin, Wubi Xing, or Wubi Hua under Chinese - Simplified, or Cangjie, Dayi Pro, Jianyi, Pinyin, or Zhuyin under Chinese - Traditional.

In OS X you can type accents and other characters with "option key" combinations without changing the keyboard layout. Also, you can press a letter key for a few seconds and this will open up a menu box similar to the iOS version. Try it out! As a native dictionary I use JEDict (free), plus a few that I consult on the web, such as Denshi Jisho. To convert text to katakana you can hit Ctrl+K, which converts things directly to that script. Finally, I recommend writing Japanese text in a Japanese font, because most Western fonts do not have the characters and this can always be an issue.

## It seems Apple took down the iOS version of Chrome rather quickly. Tantrum?

I heard via Cult of Mac that the iOS version of Chrome was available in the AppStore. My friend downloaded it successfully and kept going on about how quick it was. And indeed it was. After dinner I tried to download it too, but whenever I tried I kept on getting "The item you tried to buy is no longer available". Did Apple just throw the toys out of the pram? Have you been able to download the app? Here is the article from Cult of Mac: Chrome for iOS

UPDATE: 29th June, 2012. It looks like this was a glitch with the App Store... I managed to download it just now.

## Happy birthday Turing

Today, 100 years ago, Alan Turing was born. As a form of celebration Google has put a functioning Turing machine up as their latest doodle. A Turing machine is a device that uses a tape with symbols that are manipulated according to certain rules; as you can imagine, it was proposed by Turing in 1936.

## Setting up Posterous in Tweetbot for iPad

UPDATE: Sadly this post is now obsolete with the shutting down of Posterous...
I have recently started using Tweetbot as a Twitter client and I must say that I am quite pleased with the way it handles things like mentions, RTs and particularly the display of media such as photos and video. It seems to be quite easy to use and to set up multiple accounts. However, there was something that I didn't quite like... I tend to use Posterous to upload pictures and other media. I prefer this to services such as Twitpic or Moby, and as such I was expecting Tweetbot to handle Posterous as easily as these other services. Although on their site Tweetbot mention that they support Posterous, once in the application it was nowhere to be seen. If, like me, you want to use Posterous, do not despair; it is just a matter of configuring the "Custom" service. Here is what you need to do:

1. In Tweetbot, open the Settings (at the bottom of the navigation bar on the left hand side).
2. Under account settings, tap your username and tap either "Image Upload" or "Video Upload" (changing one will make the service available in the other).
3. Scroll to the bottom of the menu and select "Custom".
4. You will be asked to enter an API endpoint; enter one of the two following options:

• https://posterous.com/api2/upload.xml
• https://posterous.com/api2/upload.json

And you are ready to go! Please note that this assumes that you already have a Posterous account and that it knows about your Twitter identity. If it doesn't, Posterous will create a new account for you. For more info about the API, visit this page.

## I hadn't seen one of these for ages...

And these are brand new!

## Saving screenshots with a useful name (Mac)

Are you a Mac user? Do you end up taking one, two, thousands of screenshots? Well, then you would agree with me when I say that having them automatically saved with a long and not very useful name is a bit of a pain. It would be great if you could have a choice over the name used to save your screenshots. Well, there is!
Here is how: open a terminal and type the following (using your own useful text string in between the quotes):

defaults write com.apple.screencapture name "A useful name"

Then, restart the user interface system with the following command:

killall SystemUIServer

And that is it! Enjoy!

## Working Collaboratively Online: Wunderkit and Hojoki

Working collaboratively is nothing new... The challenge is continuing to do so with an ever-increasing number of online tools, which could definitely make life much easier. However, should you not be careful, you can quickly end up with a large number of new accounts in services that only a few people use. You can tackle collaborative work using things such as email, but I am sure you will agree with me that by the second iteration of doing and undoing tracked changes (once you have managed to convince others to track them, that is...) it becomes a bit tiring. In that respect, tools such as Google Docs have a distinctive advantage. More recently I have come across a couple of new takes on the subject: one is Wunderkit and the other one is Hojoki. I have started having a look at both of them, so this post is more about first impressions than fixed recommendations. Should you have any views on this, please do let me know.

### Wunderkit

This platform is brought to us by 6 Wunderkinder, a Berlin startup that also created Wunderlist (which is a good to-do application). Wunderkit lets you create projects that can then be shared with other people. The application lets you connect to Twitter and Facebook. Your contacts are treated in a similar way to followers in Twitter and you can invite contacts to your projects. You are supposed to be able to discover other people, but I must admit that the process was a bit cumbersome. Once you have created a project and invited some people, your followers can post messages, comment on tasks, set up discussions and send status updates.
A very interesting aspect of Wunderkit is that it includes some applications that can be very useful:

• A progress tracker: you can easily see what the status of the project is, what people have been discussing, and the activities that your collaborators have been working on.
• A to-do list: the to-do list lets you set up tasks and lists. You can assign these tasks to specific members and set up due dates. I wish they could synchronise these lists with Wunderlist... but never mind.
• A notepad: this is a useful addition to the task lists, as you can add ideas, notes, scripts, etc., to your project.

Another useful thing about this application is the fact that not only does it live on the web, but the 6 Wunderkinder have created mobile applications that let you take your projects and lists with you. They also have a desktop application, but it seems that currently these additions are only for Apple devices. The accounts are free and should you need more support you can get a pro account. So far so good.

### Hojoki

The other tool I wanted to talk about is Hojoki. Hojoki is also the creation of a German team and the prospect is a very interesting one. The main premise of the application is the accessibility, in a single place, of a number of existing outlets you already use: Dropbox, Google Docs, GitHub, Highrise, Mendeley, etc. Once you connect your different services, Hojoki creates a single feed that gets updated as soon as team members perform actions such as saving or creating files, submitting updates, etc. It also integrates with Twitter, but should you be following a lot of people, this can be a bit too much! You can also set up workspaces. It is a good idea and it exploits the cloud features of many applications. Currently it only works from a web browser, although they say that a mobile app is a work in progress. Accounts are also free.

Well, all you have to do now is give them a go and let me know what you think.
We might even be able to start a project using one of these tools.

## Uploading videos to Vimeo

Now that you have created your videos with either your PC or your Mac, you are ready to share them with the world. I find Vimeo very easy to use and quite flexible in terms of content, size of files and things of that sort. In this video I show you very quickly how to create an account and how to upload your masterpiece. As usual, let me know what you think.

http://vimeo.com/36840631

## Videocasting with a Mac

Continuing with the subject of capturing video, in this tutorial we will cover some tools to capture your screen using a Mac. The tools are QuickTime Player, MPEG Streamclip and iMovie. These tools either come with the Mac (QuickTime and iMovie) or are available from the web. Enjoy and keep in touch!

http://vimeo.com/36830569

## Videocasting with a PC

Talking to some people about screen capturing and video tutorials, I came across the fact that, although there is some interest in the activity, there is the idea that you need sophisticated tools to create even the simplest video presentation. In this video I show how some simple videos can be produced by capturing screenshots using a PC with Windows installed. The tools that I use are CamStudio and Freemake Video Converter, which are readily available on the web. As usual, any comments are more than welcome. Enjoy!

http://vimeo.com/36821557

## Structured Documents in LaTeX

Continuing with the brief introduction to LaTeX that I posted recently, in this video I discuss the use of LaTeX to produce a document with a structure similar to that of a book, for example. The idea is to build a master file that controls the flow of the document and to separate each "chapter" into its own file. This provides the author with a lot of flexibility in terms of organising content and makes large documents far more manageable than using a single LaTeX file.
Enjoy, and any feedback, comments or suggestions are more than welcome.

http://vimeo.com/36550754

## Using LaTeX to write mathematics

I have been meaning to do something like this for a long time and finally got the courage to do it. A lot of times I am completely horrified by the way in which some documents that contain mathematical notation are mangled (quite literally) by using MS Word. It helps sometimes that some people have access to MathType, but still... So, in this video I intend to provide some help to those who are interested in using LaTeX to typeset mathematics and produce their documents. LaTeX is freely available for various platforms: you can obtain MiKTeX for Windows here, and MacTeX for Mac here. There is a great variety of editors to choose from; in this video I recommend TeXmaker, which I believe provides quite a lot of help to those of us still attached to the pointing and clicking of MS Word. Let me know what you think! Any feedback is always welcome.

http://vimeo.com/36401920

## iBooks Author

So, I just found out about Apple announcing iBooks Author, which according to the information they provide "is an amazing new app that allows anyone to create beautiful Multi-Touch textbooks", and is a free download from the Mac App Store. Installation was not too slow, considering that perhaps lots of other users were doing exactly the same. I had a quick go at selecting a template and it really seemed quite straightforward to use. It does look like a combination of Pages and Keynote. I will have to play more with it, but something that I did find disappointing was the lack of support for handling mathematics. I am not after LaTeX (I already use that quite a lot), but it would be nice to be able to handle equations natively. I do hope someone at Apple is reading!

## Access your Library folder in Mac OS Lion

I am not entirely sure I think of OS X Lion as the best operating system ever.
It does have some nice features, but it also has some annoyances. One of them is the seemingly missing Library folder. As a matter of fact, the folder is not missing; by default, Apple now hides this folder to prevent users from messing with it. But do not despair, there are ways to get to this folder, either temporarily or on a more permanent basis. Here is how:

Temporary access:

1. In Finder, access the Go menu
2. You can make the Library folder visible by hitting the Option key
3. And that's it, you can now open the Library folder

Permanent access:

1. Open up a Terminal
2. Type the following command: chflags nohidden ~/Library
3. If you want to undo this, simply type: chflags hidden ~/Library

Enjoy!

## Apple Knowledge Navigator

In 1987 Apple released this video about a hypothetical device called the Knowledge Navigator. This can be seen as the idea behind Siri, the personal assistant recently announced by Apple.

## Lion and Air Display don't like each other

I am generally quite happy using a Mac and things seem to be going quite well with my machine. Nonetheless, I could not resist upgrading my operating system from Leopard to Lion... after all, Apple markets it as "the most advanced desktop operating system". The update itself happened without a glitch, but the machine seemed to have become more sluggish. I assumed it was the number of applications that I had installed and the fact that some of them, such as Maple 9.5 and the version of Photoshop that I had, relied on the use of Rosetta to work. I got rid of the newly obsolete software, but this did not sort out the issues. One of the more annoying issues, even more than the lack of malleability in Launchpad, was the very insufferable fact that the screensaver acquired a mind of its own: it would just spring into action even when I was typing or using the mouse... After searching for a solution, the only thing that worked was to turn the screensaver off...
Now, this is not ideal. But now I think I have found the answer: the problem was a limitation that Air Display has when installed on Lion. Avatron, the makers of Air Display (a screen extension application), know about this, and although they mention that only certain models are affected, I found that as soon as I got rid of Air Display, not only did my machine not run into trouble with the screensaver, but it also woke up from the horrendous sluggishness it had been suffering.

How to uninstall Air Display:

• Go to Applications -> Utilities
• Run "Uninstall Air Display"
• The machine will automatically re-start
• Et voilà

## Furigana (ふりがな) in Mac

First of all, I guess I must explain what ふりがな (furigana) are. Japanese uses characters of Chinese origin called kanji. Because of the way they have been adopted into Japanese, a single character is more often than not used to write a variety of words, and this means that a kanji acquires different readings depending on the word. Deciding which reading is meant depends on context, intended meaning, use in conjunction with other kanji, etc. The readings are usually categorised as either onyomiー音読み (literally, sound reading) or kunyomiー訓読み (literally, meaning reading).

So, what about these furigana? Well, since the reading of kanji can get a bit tricky when you are learning to read them, sometimes small hiragana are used to indicate the intended phonetic reading (see the picture above). Furigana are commonly used for children, who might not recognise the kanji but are able to read the word when written in hiragana. It is also common to see them used in textbooks for learners of Japanese as a second language. Japanese adults make use of them for words written in uncommon or difficult-to-read kanji.

About a month ago I had to prepare a speech to be given in my Japanese lesson and I had the idea that it would be great to add furigana to my script. But how do you place furigana along your sentences?
Well, here are some instructions to add furigana to kanji in Word 2011 for Mac (I also managed to do it in LaTeX, but I will create a separate entry to explain that - if you need this info, please contact me and I will be happy to help):

Under the Microsoft Office folder in your Applications directory, there should be a folder called "Additional Tools". Inside it there is a directory called "Microsoft Language Register". Open the "Microsoft Language Register" application that lives there, select "Japanese" from the dropdown menu and click OK. This enables some advanced features when using Japanese, such as furigana writing, vertical text and character combination.

Open Word and start typing something in hiragana. You can convert the text into kanji by hitting the spacebar. Here is where the magic comes in: highlight the kanji that needs a furigana entry, click "Format" in the menu bar (at the top) and select "Phonetic Guide", and there you go! ちょっといいですね！

UPDATE: I finally created a post about using furigana in LaTeX. Find the post here.

Read me...

## Mathematically inclined CAPTCHA

I'm sure you have encountered CAPTCHAs before. You might not know them by that name, but they have become a familiar feature of many websites. So, you want to book some tickets for a gig of your favourite band? Do you want to sign up to a new social network? Or are you simply interested in recovering your lost password? Well, you are more than likely to have used a CAPTCHA.

A CAPTCHA is a way to verify that a request to the services mentioned above (and many others) is not generated by a computer. This usually involves asking the user to complete a task that is simple for a human being but harder for a computer to replicate. One such task is character recognition: the text is distorted enough that a computer might have trouble identifying it, yet a human being can solve the problem in a very straightforward manner.
Recently this has been put to good use with reCAPTCHA, a service that helps digitise printed material. On many occasions the quality of some words is poor and therefore OCR (Optical Character Recognition) software struggles. However, many CAPTCHAs are solved by humans every single day, and this is the resource that reCAPTCHA is channelling: the idea is to send out words that the computer is having problems identifying. So, if the computer cannot do it, how does the system know that you have given the correct answer? Well, you are presented with two words: one is known and the other is the word that needs resolving. If the answer for the known one is correct, the system assumes that the second one is also correct. The key is that you don't know which word is which. If many people provide the same answer for the unknown word, then it is highly likely that it has been identified correctly.

All of this is great, but what is the connection with the mathematically inclined CAPTCHA? Well, recently a friend of mine came across the following CAPTCHA. That is an excellent way to prove that you are not a bot, and that you are definitely a geek! Well done!

Read me...

## Pretend you are Hercule Poirot...

I really enjoyed this error message in a LaTeX file that my office mate came across yesterday while preparing some slides. Usually the errors are quite obvious, so there is no need to check the console. This time it was something a bit obscure... so much so that LaTeX suggested:

Pretend that you're Hercule Poirot: Examine all the clues, and deduce the truth by order and method.

Great! Where are my hat and fake moustache???

Read me...

## Pac-Man animated with humans

The Original Human PAC-MAN Performance by Guillaume Reymond

Read me...

## How to resize a window in Mac

It may sound a bit strange to have a post about how to resize a window. "You just select the lower right-hand corner and drag it!" I hear you say.
But what happens when the window is so tall that you can't even get to the resize handle? Well, here is the answer: it may not be obvious, but the solution is right there in front of you! Clicking the green zoom (+) button on the toolbar of any window will automatically resize it to best fit your current screen resolution!

Read me...

## Downgrade iPhone 3GS from iOS 4 back to 3.1.3

I have been very happy with the performance of my iPhone, but I could not help noticing that after upgrading the 3GS to iOS 4, the phone not only slowed down, but effectively stood still. Not really what you want when you need to get directions, find the name of that actor in that film, or simply make a phone call. So, if you are in that boat, here is a recipe to downgrade your device and recover some functionality! You will need the following ingredients:

1. iPhone 3GS
2. iTunes
3. Cable to connect iPhone to iTunes
4. A copy of iOS 3.1.3
5. RecBoot
6. Some patience

Preparation 1 - Get iOS 3.1.3 ready

This sounds like a tricky one, but do not panic: it may well be that you already have a copy of the iOS on your hard drive. Check in:

~/Library/iTunes/iPhone Software Updates

On Windows, your iPhone OS updates should be stored in:

C:\Documents and Settings\[username]\Application Data\Apple Computer\iTunes\iPhone Software Updates

If you see a file inside this folder named iPhone1,1_3.1.3_7E18_Restore.ipsw or iPhone1,2_3.1.3_7E18_Restore.ipsw, those are likely the restore images you need. If you don't see anything that resembles the 3.1.3 OS, or you just want a freshly downloaded one, iClarified has a list of iPhone firmware files. Just find 3.1.3 for your phone and download it to a place on your hard drive that you will remember.

Preparation 2 - RecBoot

Later on in the process, you will need RecBoot to be able to tell iTunes to free your iPhone after downgrading. You can download it here (available for Mac and Windows).
Preparation 3 - Put your iPhone into DFU mode

You need to put your iPhone into Device Firmware Update (DFU) mode in order to downgrade to 3.1.3. Here is how:

1. Plug in your iPhone.
2. Power it down by holding the sleep/lock button at the top and sliding to power off.
3. Once it's powered down, press and hold both the sleep/lock button and the home button for ten seconds.
4. After ten seconds, release the sleep/lock button but continue holding down the home button.
5. If you did it right, iTunes will pop up a window telling you that it has detected an iPhone in recovery mode, and your iPhone's screen will be black. If it didn't work, start from the beginning and try again.

Preparation 4 - Downgrade to 3.1.3

It is now time to do the downgrading. Dismiss the iTunes alert that told you you're in recovery mode, then select the iPhone in the iTunes sidebar.

1. Hold Cmd and click the Restore button.
2. iTunes will pop up a window prompting you to choose a file. Navigate to the location of the 3.1.3 OS file you obtained in Preparation 1.
3. Select that file, and iTunes will start the OS restore process. You will now use that bit of patience, as this takes a few minutes.
4. When it's finished, you'll receive an error message and your iPhone will boot up with a "Connect to iTunes" screen.

Preparation 5 - Recovering the iPhone

This is where RecBoot becomes useful. Open RecBoot and click "Exit Recovery Mode". After a few seconds the software should prompt your iPhone to leave the plug-me-into-iTunes mood, and there you go: you have a freshly downgraded iPhone! Serve cold and enjoy!

Read me...

## Get rid of Ctrl-M characters

Have you found yourself opening an old text file on your shiny new Mac only to see strange characters such as ^M appear all over the place? Do not panic, this is nothing that this post will not help you fix. Those strange characters come from the format in which different operating systems encode things like carriage returns at the end of a line.
Some applications won't recognise the carriage returns and will display the file as a single line, interspersed with Ctrl-M characters. In Mac OS X, the situation is more complicated given that it is a flavour of Unix itself. In some cases text files have carriage returns and in others they have new lines. For the most part, classic applications still require text files to have carriage returns, while the command-line Unix utilities require new lines (aka line feeds). Mac OS X-native applications are usually capable of interpreting both.

There are many ways to resolve the differences in format. There are some Unix command-line utilities, such as tr, awk and Perl, to do the conversion. On Mac OS X, each can be accessed from the Terminal application. In order to understand some of the syntax used below, it is important to mention the Unix escape sequences that identify different types of "spaces":

• \r: Carriage Return
• \n: New Line
• \t: Tab
• \v: Vertical Tab
• \f: Form Feed
• \b: Backspace

## tr

The Unix program tr is used to translate between two sets of characters: characters specified in one set are converted to the matching character in the second set. Thus, to convert the Ctrl-M of a Mac OS text file to the new line (Ctrl-J) of a Unix text file, at the Unix command line, enter:

• tr '\r' '\n' < macfile.txt > unixfile.txt

Here, \r and \n are special escape sequences that tr interprets as Ctrl-M (a carriage return) and Ctrl-J (a new line), respectively. Thus, to convert a Unix text file to a Mac OS text file, enter:

• tr '\n' '\r' < unixfile.txt > macfile.txt

## awk

To use awk to convert a Mac OS file to Unix, at the Unix prompt, enter:

• awk '{ gsub("\r", "\n"); print $0; }' macfile.txt > unixfile.txt

To convert a Unix file to Mac OS using awk, at the command line, enter:

• awk '{ gsub("\n", "\r"); print $0; }' unixfile.txt > macfile.txt

On some systems, the version of awk may be old and not include the function gsub. If so, try the same command, but replace awk with gawk or nawk.
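To sanity-check the tr conversion shown above without touching a real document, here is a minimal sketch using throwaway files (the file names macfile.txt, unixfile.txt and dosfile.txt are just illustrative). It creates a classic Mac-style CR-delimited file, converts it, and counts the resulting lines; the tr -d variant at the end is handy when the ^M characters come from Windows CRLF files and you simply want the carriage returns gone:

```shell
# Create a throwaway CR-delimited "classic Mac" file (three lines).
printf 'one\rtwo\rthree\r' > macfile.txt

# Convert carriage returns to new lines.
tr '\r' '\n' < macfile.txt > unixfile.txt
wc -l < unixfile.txt          # counts the three LF-terminated lines

# For Windows CRLF files, simply delete the carriage returns instead.
printf 'a\r\nb\r\n' > dosfile.txt
tr -d '\r' < dosfile.txt > unixfile2.txt
```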

## Perl

To convert a Mac OS text file to a Unix text file using Perl, at the Unix shell prompt, enter:

• perl -p -e 's/\r/\n/g' < macfile.txt > unixfile.txt

To convert from a Unix text file to a Mac OS text file with Perl, at the Unix shell prompt, enter:

• perl -p -e 's/\n/\r/g' < unixfile.txt > macfile.txt

I hope this helps.

UPDATE - Thanks to D Asirvadem for some comments and corrections. He adds in regards to "In Mac OS X, the situation is more complicated given that it is a flavour of Unix itself":

Yes and no. I would not put it that way. The problem only happens when you import a text file from an older variant of Unix (eg AIX) to a newer variant of Unix (eg. MacOS or Linux RHEL). It is no more complicated on MacOS.

## OS-tan

You surely must be familiar with the concept of an operating system (OS) for your computer. Who has not heard of Windows and its different incarnations? Windows 95, Windows 98, Vista or Windows 7? What about Mac OS X - Panther, Snow Leopard or Lion? And what about Linux and its different distros?

If you are a long-term user of Microsoft products you are thus well aware of the moodiness of the operating system, the blue screen of death and the endless restarts after a seemingly infinite number of updates. Well, you would not be alone if you were to start anthropomorphising your favourite OS, and if you are a Japanese user it doesn't take long before you start creating manga characters for them, i.e. OS-tans.

Why OS-tan? Well, you probably have heard the Japanese suffix -san (ーさん) used at the end of someone's name. It is an honorific suffix and roughly translates as Mr, Mrs, Miss, or Ms. If you want to be more familiar with someone and want to show that the person is close to you, you might use the diminutive honorific suffix "-chan" (-ちゃん). A common childish mispronunciation of this suffix is "-tan" (-たん), and thus the meaning of OS-tan becomes clear.

OS-tans are personifications of various operating systems, which started with the common perception of Windows Me as unstable and prone to frequent crashes. Discussions on Futaba Channel likened this to the stereotype of a fickle, troublesome girl, and so Me-tan was born. The characters are usually represented as girls, although some male OS-tans exist. In particular, the OS-tans for the different Windows versions are represented as sisters of various ages.

For instance, XP-tan is a dark-haired girl with ribbons in her hair and an "XP" hair ornament typically worn on the left side. Windows XP is criticised for bloating a system and for being very pretty without being as useful. Additionally, as a reference to the memory usage of Windows XP, she is often seen eating or holding an empty rice bowl labeled "Memory". Windows 7 is represented by a character called Nanami Madobe (窓辺ななみ Madobe Nanami). The premium set of the OS includes a Windows 7 theme featuring 3 Nanami wallpapers, 19 event sound sets, and a CD with 5 extra Nanami sounds. This makes it the first OS-tan marketed by the company producing the operating system. In addition, the character also got its own Twitter account and Facebook page.

The Mac OS X girl is often portrayed as a catgirl, in keeping with the Apple "wild cat" naming tradition; she wears a platinum white coat and a wireless AirPort device fashioned as a hat. In the Linux case, sometimes a penguin is used as a reference to Tux, but there is also the image of a girl with a helmet and flippers. Her helmet usually has horns on it, likely a reference to the GNU software that comprises the common system programs present in nearly all Linux distributions.

There are many more OS-tans and there are even mangas and animations featuring the characters, including supporting ones such as Dr Norton, Firefox-tan and Opera-tan. A list of OS-tans can be found here. So next time your system crashes or you need an extra driver, you can always think of the OS-tan behind the machine.

## Japanese chiisai characters in Katakana

Writing the "chiisai" Japanese characters when using Hiragana (ひらがな) is quite straightforward. Chiisai (小さい) means small, and a chiisai character combines with a regular one to make a different sound.

For instance, to write ちょっと, you type the keys "chotto", which produces the "chiisai yo" and "chiisai tsu" needed to write this Japanese word.

However, it seems that when using Katakana (カタカナ) things become a bit complicated, so here is how to produce the chiisai characters:

ッ (Katakana)
xtu (key sequence)

キャ キュ キョ (Katakana)
kya kyu kyo (key sequence)

シャ シュ ショ (Katakana)
sha shu sho (key sequence)

チャ チュ チョ (Katakana)
cha chu cho (key sequence)

ニャ ニュ ニョ (Katakana)
nya nyu nyo (key sequence)

ヒャ ヒュ ヒョ (Katakana)
hya hyu hyo (key sequence)

ミャ ミュ ミョ (Katakana)
mya myu myo (key sequence)

リャ リュ リョ (Katakana)
rya ryu ryo (key sequence)

ギャ ギュ ギョ (Katakana)
gya gyu gyo (key sequence)

ジャ ジュ ジョ (Katakana)
ja ju jo (key sequence)

ビャ ビュ ビョ (Katakana)
bya byu byo (key sequence)

ピャ ピュ ピョ (Katakana)
pya pyu pyo (key sequence)

Exceptional characters.
These are basically used only for technical words.
[Katakana - (key sequence)]
ウィ (uxi)
クァ (kuxa)
クィ (kuxi)
クェ (kuxe)
クォ (kuxo)
ティ (texi)
フュ (fyu)
ディ (dexi)
デュ (dexyu)
ヴァ (va)
ヴィ (vi)
ヴェ (ve)
ヴォ (vo)

I came across this trick on Yahoo Answers.

## Quick switching for Kana syllabaries in OS X

If you type things in Japanese, you have quite likely come across the problem of switching between Hiragana (ひらがな) and Katakana (カタカナ).
You can switch between Hiragana and Katakana by holding down Shift while typing.

As an example, typing わたしはヘススです。 requires Hiragana, Katakana, and Hiragana again. The quickest way to write this sentence starting from the US English keyboard would be as follows (assuming you have enabled the Command-Space Bar switching method):

1. Press Command-Space Bar.
2. Type the following keys: "watashiha ".
3. Hold down the Shift key to switch to Katakana, then type "HESUSU ".
4. Let go of Shift to switch back to Hiragana, and type "desu."

## Keyboard Shortcut for "Save as PDF..." in MAC OS X

I have been wondering about the possibility of having a shortcut to "Save as PDF" in OS X. I came across  a hint at MacOSXHints. And even better than that, a walkthrough version in this link.

## Deleting E-mail addresses from MacMail autocomplete list

Like many other email applications, Mac OS X Mail is pretty good at remembering the email addresses of people you usually communicate with. As soon as you start typing one in the To: field of an email, a few suggestions will come up. It is pretty useful, except when you want to get rid of those addresses that you don't use very often.

Fortunately, there's a way to delete old (or unwanted) addresses from the auto-complete list in Mac OS X Mail. New addresses will be remembered automatically, and soon the auto-complete feature will be as useful as ever.

### Delete an Email Address from Auto-Complete in Mac OS X Mail

To remove an email address from the auto-complete list in Mac OS X Mail:

• Start typing the recipient's address or name in a new message.
• Select the desired address from the auto-complete list as if you were composing an email to them.
• Click the small down arrow on the recipient.
• Select Remove from Previous Recipients List from the menu.

You can also search for the unwanted address directly in the previous recipients list:

• Select Window | Previous Recipients from the menu in Mac OS X Mail.
• Highlight the address you want to remove.
• You can highlight multiple addresses by holding down the Command key.
• Click Remove from List.

### Clean up Mac OS X Mail's Auto-Complete List

To clean up or empty the auto-complete list of previous recipients' addresses in Mac OS X Mail:

• Select Window | Previous Recipients from the menu.
• Click on the Last Used header so the arrow points downward.
• Make sure no entry is highlighted.
• Hold down the Shift key.
• Click on an address last used a year ago.
• Of course, you can choose a different interval and select all addresses not used in the past month, for example.
• Verify all entries not used in the last year are highlighted.
• Click Remove From List.

## Really great day at Bletchley Park @bletchleypark. Worth paying a visit!

## NASA Explores Semantic Search

I came across a news article about NASA using technology from Google and Smartlogic to perform semantic searches of its manned space-flight program.

Smartlogic is a UK based company and the software that NASA is using is called Semaphore which retrieves data semantically; the data is organised semantically and the search is done by parsing each sentence of the query to obtain its meaning.

The original article can be found here.

## A flash version of HELL

A few days ago the great xkcd published a version of Hell based on the famous Tetris game... except that the bottom of the game is not flat, but curved.

I found this amusing and couldn't help tweeting about it, and I was told by a friend that if anything this "only shows our age more than anything else :(".

I guess he is right, but having seen a new flash version of this Hell-ish game is brilliant and goes to show that there are other geeks out there who are also proud of showing their age.

Let me know if you manage to score :)

## Programming Languages

I remember the first time I had the opportunity to program a computer. As you might imagine, it was nothing too complicated; after all, it was the first time I had done anything like that. It was a simple programme of the "Hello World!" type. Written in BASIC (aka Beginner's All-purpose Symbolic Instruction Code), it printed the sequence of numbers from 1 to 10. Pretty neat, but not very useful. Since then I have had a go at a number of programming languages, scripts and tools, going from COBOL and Pascal to C++ and Python.

When people ask me about my favourite programming language, I tend to reply with another question: "What for?". I sincerely believe that there is no such thing as the perfect programming language, and it all depends on what it is that you need your computer to do. I mean, you would not bang in a nail with a spanner; you would rather use a hammer for that. Of course, you could use the spanner for that particular task, but you would find that doing so has advantages (it's the tool you already know) and disadvantages (the tool is not designed with that particular purpose in mind).

There is a plethora of programming tools, and some of them have been around for years, either because they are indeed very well designed for their purpose, or because the amount of underlying programmes and functions written with them is so overwhelming that it is easier to keep them alive. Some other languages are more recent, and I am sure that some of them will stand the test of time... but not all of them.

Very recently, TIOBE Software released their April index ranking the most popular programming languages. It shows that the reliable C language is back at number 1. I was not totally surprised by this; I always thought that the popularity of the language would place it among the top 5, along with C++ and Java. What I did not expect was to find MATLAB at number 18.

The index is updated once a month. The ratings are based on the number of skilled engineers worldwide, courses and third-party vendors. The definition of the TIOBE index can be found here, and the first 20 places are listed below:

| Position Apr 2010 | Position Apr 2009 | Programming Language | Ratings Apr 2010 | Delta Apr 2009 | Status |
|---|---|---|---|---|---|
| 1 | 2 | C | 18.058% | +2.59% | A |
| 2 | 1 | Java | 18.051% | -1.29% | A |
| 3 | 3 | C++ | 9.707% | -1.03% | A |
| 4 | 4 | PHP | 9.662% | -0.23% | A |
| 5 | 5 | (Visual) Basic | 6.392% | -2.70% | A |
| 6 | 7 | C# | 4.435% | +0.38% | A |
| 7 | 6 | Python | 4.205% | -1.88% | A |
| 8 | 9 | Perl | 3.553% | +0.09% | A |
| 9 | 11 | Delphi | 2.715% | +0.44% | A |
| 10 | 8 | JavaScript | 2.469% | -1.21% | A |
| 11 | 42 | Objective-C | 2.288% | +2.15% | A |
| 12 | 10 | Ruby | 2.221% | -0.35% | A |
| 13 | 14 | SAS | 0.717% | -0.07% | A |
| 14 | 12 | PL/SQL | 0.710% | -0.38% | A |
| 15 | - | Go | 0.710% | +0.71% | A |
| 16 | 15 | Pascal | 0.648% | -0.07% | B |
| 17 | 17 | ABAP | 0.625% | -0.03% | B |
| 18 | 20 | MATLAB | 0.616% | +0.13% | B |
| 19 | 22 | ActionScript | 0.545% | +0.09% | B |
| 20 | 19 | Lua | 0.521% | +0.03% | B |

### Other programming languages

Well, where does this index place some of the languages that I have used at some point? Here we go: C, C++, VB, Python, Java, Pascal, MATLAB and Perl are all in the first 20 places.

Bourne Shell (26), COBOL (29), Fortran (34 - although they do not mention which flavour: 77, 95, etc.), Prolog (43 - is anyone using that for anything? Seriously?) and VBScript (50) are all in the first 50 places. They also list (in no particular order) numbers 51 to 100, including LabView, Maple, Mathematica, R and SPSS.

### Curiosities (or are they?)

Some of you, dear readers, might say that a lot of the languages are not really programming languages. A friend of mine rejected, for example, the idea of MATLAB as a programming language.

"Surely all scripting languages are programming languages, but not all programming languages are scripting languages," I hear you say. Well, as another friend of mine pointed out: "If you really want to hurt yourself, look at 'Root'" - a framework developed in 1994 by CERN, which has a scriptable command-line C++ interpreter! Really!

For the hardcore programmer in you, there are some interesting languages out there to have a look at and definitely play with. For example, there is Whitespace (it seems that the original link is dead now; please check a Wayback page here) which, unlike any other programming tool, ignores any non-whitespace characters: only spaces, tabs and linefeeds have meaning. You can see an example here. In a similar fashion, Brainfuck's language consists of only eight commands, namely: > < + - . , [ ]. You can see an example here.

Now, if you really want to see how the text-messaging culture has made it into the "Hello World!" of computer programming, look no further than LOLCODE, whose commands are expressed in lolcat, and, as you can imagine, the language is not clearly defined in terms of operator priorities and correct syntax (LOL!). Here is an example:

HAI
CAN HAS STDIO?
PLZ OPEN FILE "LOLCATS.TXT"?
AWSUM THX
VISIBLE FILE
O NOES
INVISIBLE "ERROR!"
KTHXBYE

Other commands include "I HAS A variable", "variable R value" and "BTW" to denote comments!

Honestly, what next?...

## Screenshot in Macs

Here are some useful commands to take snapshots of the screen on a Mac:

• Command-Shift-3: Take a screenshot of the screen, and save it as a file on the desktop
• Command-Shift-4, then select an area: Take a screenshot of an area and save it as a file on the desktop
• Command-Shift-4, then space, then click a window: Take a screenshot of a window and save it as a file on the desktop
• Command-Control-Shift-3: Take a screenshot of the screen, and save it to the clipboard
• Command-Control-Shift-4, then select an area: Take a screenshot of an area and save it to the clipboard
• Command-Control-Shift-4, then space, then click a window: Take a screenshot of a window and save it to the clipboard

In Leopard, the following keys can be held down while selecting an area (via Command-Shift-4 or Command-Control-Shift-4):

• Space, to lock the size of the selected region and instead move it when the mouse moves
• Shift, to resize only one edge of the selected region
• Option, to resize the selected region with its center as the anchor point