Model Evaluation for Regression Algorithms

Table of contents:

  1. Mean Absolute Error (MAE)
  2. Mean Squared Error (MSE)
  3. Root Mean Squared Error (RMSE)
  4. Coefficient of Determination (R2)
  5. Python Code

1. Mean Absolute Error (MAE)

Mean Absolute Error (MAE) is the mean of the absolute value of the errors:

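$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of observations.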
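
As a minimal sketch of the definition above, MAE can be computed by hand with NumPy. The arrays `y_true` and `y_pred` are made-up illustrative values, not data from this article.

```python
import numpy as np

# Hypothetical actual and predicted values (illustrative only)
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# MAE: mean of the absolute errors
mae = np.mean(np.abs(y_true - y_pred))
print(mae)  # 0.75
```

The absolute errors are 0.5, 0, 1.5 and 1, so their mean is 3.0 / 4 = 0.75.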

2. Mean Squared Error (MSE)

Mean Squared Error (MSE) is the mean of the squared errors:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$
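
A short sketch of the same calculation for MSE, again on made-up values (`y_true` and `y_pred` are assumptions for illustration):

```python
import numpy as np

# Hypothetical actual and predicted values (illustrative only)
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# MSE: mean of the squared errors
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.875
```

The squared errors are 0.25, 0, 2.25 and 1, so their mean is 3.5 / 4 = 0.875. Note that the result is in squared units of y.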

3. Root Mean Squared Error (RMSE)

Root Mean Squared Error (RMSE) is the square root of the mean of the squared errors:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$

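MAE is the easiest to interpret, because it is the average size of the errors, in the same units as y.
MSE is often preferred over MAE, because squaring “punishes” larger errors more heavily, which is useful when large errors are disproportionately costly. Its units are the square of the y units.
RMSE is often preferred over MSE, because it keeps the penalty on larger errors while being interpretable in the “y” units.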
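
The comparison above can be illustrated with a small sketch. The two error vectors are hypothetical: both have the same MAE, but one concentrates all of its error in a single large miss, and RMSE reflects that. The helper names `mae` and `rmse` are defined here for illustration only.

```python
import numpy as np

def mae(errors):
    """Mean of the absolute errors."""
    return np.mean(np.abs(errors))

def rmse(errors):
    """Square root of the mean of the squared errors."""
    return np.sqrt(np.mean(np.square(errors)))

# Hypothetical error vectors (actual minus predicted), illustrative only
errors_even = np.array([2.0, 2.0, 2.0, 2.0])   # four moderate misses
errors_spiky = np.array([0.0, 0.0, 0.0, 8.0])  # one large miss

print(mae(errors_even), rmse(errors_even))    # 2.0 2.0
print(mae(errors_spiky), rmse(errors_spiky))  # 2.0 4.0
```

Both vectors have an MAE of 2.0, but the RMSE doubles for the vector with the single large error.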

4. Coefficient of Determination (R2)

The coefficient of determination (R2 or R-squared) measures the goodness of fit of a regression model. It is the proportion of the variance of the dependent variable that the model explains using the independent variables, that is, the explanatory power of the regression model. A value of 1 indicates a perfect fit, and a value of 0 means the model does no better than always predicting the mean of y.

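$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the actual values.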
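
A minimal sketch of R2 computed by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The values of `y_true` and `y_pred` are made up for illustration.

```python
import numpy as np

# Hypothetical actual and predicted values (illustrative only)
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# Residual sum of squares: variation the model fails to explain
ss_res = np.sum((y_true - y_pred) ** 2)

# Total sum of squares: variation of y around its own mean
ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)

r2 = 1 - ss_res / ss_tot
print(round(r2, 3))  # 0.724
```

Here the residual sum of squares is 3.5 and the total sum of squares is 12.6875, so about 72% of the variance in these made-up values is explained by the predictions.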

5. Python Code

Python code for model evaluation with MAE, MSE, RMSE and R-squared. The data are randomly generated and no random seed is set, so the exact numbers differ from run to run.

# import libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# use the ggplot style for all plots
plt.style.use('ggplot')

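# create a synthetic dataset: 30 temperatures (20 to 49 °C) and sales that
# rise linearly from 70 to 99 with Gaussian noise (standard deviation 5)
X = np.arange(20, 50)
y = np.random.randn(30) * 5 + np.arange(70, 100)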

# scatter plot
plt.scatter(x=X, y=y)
plt.title("Ice Cream Sales Data")
plt.ylabel("Ice cream sales")
plt.xlabel("Temperature (°C)")
plt.show()

# train test split (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
# scikit-learn expects a 2-D feature array, so reshape to one column
X_train = np.array(X_train).reshape(-1, 1)
X_test = np.array(X_test).reshape(-1, 1)

# fit linear regression model
lr = LinearRegression()
lr.fit(X_train,y_train)
y_test_pred = lr.predict(X_test)

# MODEL EVALUATION - MAE, MSE, RMSE, R-SQUARED
print("The Model Evaluation on testing set")
print("--------------------------------------")
print('MAE :', mean_absolute_error(y_test, y_test_pred))
print('MSE :', mean_squared_error(y_test, y_test_pred))
print('RMSE:', np.sqrt(mean_squared_error(y_test, y_test_pred)))
print(f'R2  : {round(r2_score(y_true=y_test,y_pred=y_test_pred),2)}')

Sample output from one run:

The Model Evaluation on testing set
--------------------------------------
MAE : 3.4469317885024315
MSE : 17.312706352184176
RMSE: 4.160854041201659
R2  : 0.83

© 2022 KeyToDataScience