First steps into the Internet of Things – Meetup 25th August

Hello guys,

I am joining forces with Bob Yelland (IBM) again to organise a joint meetup. I say again because we organised a joint session a few months back between the Big Data Developers in London and the Data+Visual meetups. I even gave a talk at that one, on “Data Visualisation: The good, the bad and the ugly”. Unlike the previous one, this time we are actually physically joining the attendees rather than running parallel sessions.

The event is now live and it will take place on the 25th of August. Shall I see you there?

On the Skills Matter site:  https://skillsmatter.com/meetups/8259-datapalooza-nights-meetup#overview

and the MeetUp site: https://www.meetup.com/Big-Data-Developers-in-London/events/232919166/

 

[Image: Datapalooza IoT]

Installing Spark 1.6.1 on a Mac with Scala 2.11

I have recently gone through the process of installing Spark on my Mac for testing and development purposes. I also wanted to make sure I could use the installation not only with Scala, but also with PySpark through a Jupyter notebook.

If you are interested in doing the same, these are the steps I followed. First of all, the packages you will need:

  • Python 2.7 or higher
  • Java SE Development Kit
  • Scala and Scala Build Tool
  • Spark 1.6.1 (at the time of writing)
  • Jupyter Notebook

Python

You can choose the Python distribution that best suits your needs. I find Anaconda to be fine for my purposes. You can obtain a graphical installer from https://www.continuum.io/downloads. I am using Python 2.7 at the time of writing.
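Once installed, you can confirm which Python you are running; Anaconda should be the first python on your PATH:

> python --version
> which python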

Java SE Development Kit

You will need to download the Oracle Java SE Development Kit 7 or 8 from the Oracle JDK downloads page. At the time of writing I am using 1.7.0_80. You can check the version you have by opening a terminal and typing:

java -version

You also have to make sure that the appropriate environment variable is set up. In your ~/.bash_profile, add the following line:

export JAVA_HOME=$(/usr/libexec/java_home)
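After reloading your profile with source ~/.bash_profile, you can confirm the variable points at your JDK:

> echo $JAVA_HOME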

Scala and Scala Build Tool

In this case, I found it much easier to use Homebrew to install and manage the Scala language. If you have never used Homebrew, I recommend that you take a look. To install it, type the following in your terminal:

ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
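You can then check that Homebrew is ready to use:

> brew doctor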

Once you have Homebrew you can install Scala and the Scala Build Tool (sbt) as follows:

> brew install scala
> brew install sbt

You may want to set the appropriate environment variables in your ~/.bash_profile (with Homebrew, Scala is linked under /usr/local/opt/scala):

export SCALA_HOME=/usr/local/opt/scala
export PATH=$PATH:$SCALA_HOME/bin
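You can then verify both installations; note that the first sbt run will download its own dependencies, so it may take a moment:

> scala -version
> sbt about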

Spark 1.6.1

Obtain Spark from https://spark.apache.org/downloads.html

Note that to build Spark with Scala 2.11 you will need to download the Spark source code and build it yourself; the pre-built packages are compiled against Scala 2.10.


Once you have downloaded the tgz file, unpack it into an appropriate location (your home directory, for example) and navigate to the unpacked folder (for example ~/spark-1.6.1).
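If you are unsure how to unpack the file, something like the following should work (assuming the tgz was saved to your Downloads folder):

> tar -xzf ~/Downloads/spark-1.6.1.tgz -C ~/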

To build Spark with Scala 2.11 you need to type the following commands:

> ./dev/change-scala-version.sh 2.11
> build/sbt clean assembly

This may take a while, so sit tight! When finished, you can check that everything is working by launching either the Scala shell:

> ./bin/spark-shell

or the Python shell:

> ./bin/pyspark

Once again, there are some recommended environment variables to add to your ~/.bash_profile:

export SPARK_PATH=~/spark-1.6.1
export PYSPARK_DRIVER_PYTHON="jupyter" 
export PYSPARK_DRIVER_PYTHON_OPTS="notebook" 
alias sparknb='$SPARK_PATH/bin/pyspark --master local[2]'

The last line is an alias that will enable us to launch a Jupyter notebook with PySpark. Totally optional!
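For clarity, with the exports above in place, running sparknb is effectively the same as typing the following, where local[2] tells Spark to run locally with two worker threads:

> PYSPARK_DRIVER_PYTHON="jupyter" PYSPARK_DRIVER_PYTHON_OPTS="notebook" ~/spark-1.6.1/bin/pyspark --master local[2]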

Jupyter Notebook

If all is working well you are ready to go. Source your ~/.bash_profile and launch a Jupyter notebook:

> sparknb
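Once the notebook opens, you can run a quick sanity check in a new cell. This is a minimal sketch; sc is the SparkContext that PySpark creates for you on startup:

# `sc` is the SparkContext PySpark creates on startup
rdd = sc.parallelize(range(100))
print(rdd.sum())  # should print 4950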

Et voilà!