CoreML – Building the model for Boston Prices

In the last post we took a look at the Boston Prices dataset loaded directly from Scikit-learn. In this post we are going to build a linear regression model and convert it to a .mlmodel file to be used in an iOS app.

We are going to need some modules:

import coremltools
import pandas as pd
from sklearn import datasets, linear_model
from sklearn.model_selection import train_test_split
from sklearn import metrics
import numpy as np

The coremltools module is what enables the conversion so that we can use our model in iOS.

Let us start by defining a main function to load the dataset:

def main():
    print('Starting up - Loading Boston dataset.')
    # Note: load_boston was removed in scikit-learn 1.2, so this needs an older version.
    boston = datasets.load_boston()
    boston_df = pd.DataFrame(boston.data)
    boston_df.columns = boston.feature_names
    print(boston_df.columns)

In the code above we have loaded the dataset and created a pandas dataframe to hold the data and the names of the columns. As we mentioned in the previous post, we are going to use only the crime rate and the number of rooms to create our model:

    print("We now choose the features to be included in our model.")
    X = boston_df[['CRIM', 'RM']]
    y = boston.target

Please note that we are separating the target variable from the predictor variables. Although this dataset is not too large, we are going to follow best practice and split the data into training and testing sets:

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=7)

We will only use the training set in the creation of the model and will test with the remaining data points.
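For intuition, the idea behind a train/test split can be sketched in plain Python. This is only an illustration under simple assumptions, not Scikit-learn's implementation: the function holdout_split and the toy data are made up for this example.

```python
import random


def holdout_split(rows, targets, test_size=0.2, seed=7):
    """Shuffle the row indices and hold out a fraction of them for testing."""
    indices = list(range(len(rows)))
    # A seeded generator makes the shuffle reproducible, like random_state.
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(rows) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [rows[i] for i in train_idx]
    X_test = [rows[i] for i in test_idx]
    y_train = [targets[i] for i in train_idx]
    y_test = [targets[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Made-up (crime rate, rooms) rows and prices, for illustration only.
toy_X = [(0.1 * i, 5.0 + 0.1 * i) for i in range(10)]
toy_y = [20.0 + i for i in range(10)]

X_train, X_test, y_train, y_test = holdout_split(toy_X, toy_y)
print(len(X_train), len(X_test))
```

With ten rows and test_size=0.2, eight rows end up in the training set and two are held out. No row appears in both sets, which is what makes the test error an honest estimate.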

    my_model = glm_boston(X_train, y_train)

The line of code above assumes that we have defined the function glm_boston as follows:

def glm_boston(X, y):
    print("Implementing a simple linear regression.")
    lm = linear_model.LinearRegression()
    gml = lm.fit(X, y)
    return gml

Notice that we are using the LinearRegression implementation in Scikit-learn. Let us go back to the main function we are building and extract the coefficients for our linear model. Refer to the CoreML – Linear Regression post to recall that the model we are building is of the form $y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \epsilon$:

    coefs = [my_model.intercept_, my_model.coef_]
    print("The intercept is {0}.".format(coefs[0]))
    print("The coefficients are {0}.".format(coefs[1]))
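To make the link between these numbers and a prediction concrete, here is a minimal sketch of how an intercept and two coefficients turn a crime rate and a number of rooms into a price, following the formula above without the noise term. The function predict_price and the numeric values are hypothetical; they are not the values fitted on the Boston data.

```python
def predict_price(crime, rooms, intercept, coefficients):
    """Evaluate alpha + beta_1 * x_1 + beta_2 * x_2 for one observation."""
    beta_crime, beta_rooms = coefficients
    return intercept + beta_crime * crime + beta_rooms * rooms


# Hypothetical values standing in for my_model.intercept_ and my_model.coef_.
example_intercept = -30.0
example_coefficients = [-0.25, 8.0]

price = predict_price(crime=0.5, rooms=6.0,
                      intercept=example_intercept,
                      coefficients=example_coefficients)
print("Predicted price: {0}".format(price))
```

With these made-up numbers, a higher crime rate lowers the prediction and each extra room raises it. The order of the coefficients matches the order of the columns used for fitting, here CRIM first and RM second.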

We can also take a look at some metrics that let us evaluate our model against the test data:

    # predict prices for the held-out test set
    y_pred = my_model.predict(X_test)
    # calculate MAE, MSE, RMSE
    print("The mean absolute error is {0}.".format(
        metrics.mean_absolute_error(y_test, y_pred)))
    print("The mean squared error is {0}.".format(
        metrics.mean_squared_error(y_test, y_pred)))
    print("The root mean squared error is {0}.".format(
        np.sqrt(metrics.mean_squared_error(y_test, y_pred))))
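If it helps to see what these three metrics measure, they can be computed by hand from the residuals. The sketch below uses plain Python and made-up numbers; the function regression_errors is a hypothetical helper for illustration, not part of Scikit-learn.

```python
import math


def regression_errors(actual, predicted):
    """Return (MAE, MSE, RMSE) for two equally long sequences."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    mse = sum(r * r for r in residuals) / len(residuals)
    rmse = math.sqrt(mse)
    return mae, mse, rmse


# Made-up actual and predicted prices, for illustration only.
toy_actual = [24.0, 21.0, 34.0, 30.0]
toy_predicted = [25.0, 19.0, 34.0, 27.0]

mae, mse, rmse = regression_errors(toy_actual, toy_predicted)
print("MAE: {0}, MSE: {1}, RMSE: {2}".format(mae, mse, rmse))
```

The RMSE is in the same units as the target, which makes it easier to read than the MSE. Because the errors are squared before averaging, it also penalises large misses more heavily than the MAE does.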

CoreML conversion

And now for the big moment: We are going to convert our model to an .mlmodel object!! Ready?

    print("Let us now convert this model into a Core ML object:")
    # Convert model to Core ML
    coreml_model = coremltools.converters.sklearn.convert(my_model,
                                        input_features=["crime", "rooms"],
                                        output_feature_names="price")
    # Save Core ML Model
    coreml_model.save("PriceBoston.mlmodel")
    print("Done!")

We are using the sklearn.convert method of coremltools.converters to convert my_model into a Core ML model with the necessary inputs (i.e. crime and rooms) and output (price). Finally, we save the model in a file named PriceBoston.mlmodel.

Et voilà! In the next post we will start creating an iOS app to use the model we have just built.

You can look at the code (in development) on my GitHub site here.
