Machine Learning 101: Evaluating regression models, MAE, MSE, RMSE, R-squared explained

by mark · Published 22 July 2020 · Updated 22 July 2020

You’ve probably already started your Data Science journey and implemented your first Linear Regression model. Great, but how do you know your model is good? How good is it? How good is it compared to other models (spoiler: there’s no single right answer)? Today you’ll learn some error metrics and how to use them to evaluate regression models.

Classification vs Regression

Machine Learning essentially deals with two kinds of problems:

Classification: predicting a class, for example whether a user is male or female (the two classes) given their history of purchased items.
Regression: predicting a value, for example the price (the value) of a used car given the model, the age, the kilometers on the odometer.

It is important to remember that Machine Learning is not magic: ML algorithms are still algorithms that take multiple inputs and produce one output. The most important difference between a traditional algorithm and an ML one is the “experience” the ML algorithm gains during the training phase.

In Classification problems the algorithm tries to predict the class an entry will fall into. There may be two classes (as in the example above, male versus female) or more than two. The former is often called Binary Classification, the latter Multiclass Classification.

In Regression there is no class to predict; instead there is a scale, and the algorithm tries to predict a value on that scale. In the example above the price is the sought value.

Distances and error metrics

Most tutorials and bootcamps out there will throw metrics at you without explaining (they assume you know) what they really represent, or what intuition is behind cryptic acronyms such as MAE or RMSE. I do believe it is essential for a data scientist or anyone working with predictive models to understand what these metrics are, what they represent and how to use them.

Back to the first question: “how do you know your model is good?”. The answer to this one is simple: “let’s measure it somehow!”. That’s when you know you need an error measurement, or more commonly a metric, that describes “how big the error is”. But how do you know how big it is? That’s when you start thinking about errors in real life.

You may remember when you learned how to use a ruler, and you mistakenly drew a line that was just 1cm shorter or longer than it ought to have been (we’ve all been there). So suppose your perfect line ought to have been 10cm, you drew it 9cm long. You made a mistake, your error was 1cm.

That 1cm is really just a distance. A distance between what it is and what it should have been. That’s it! For regression problems we’ll use distances, to quantify the error and know whether the model is good or not. But why is that?

Because at the end of the day you’re still just drawing lines, maybe in two dimensions, or maybe in n dimensions. Think of Linear Regression: isn’t it a line that your model draws between points? Now that we’ve cleared up the intuition behind errors and distances, let’s take a look at some metrics (which you have probably already seen).

MAE (Mean Absolute Error)

Before we take a look at the mean, let’s take a look at the Absolute Error:

$$AE = \left |x-\hat{x} \right |$$

Since the size of an error can’t be negative, the absolute value is used. The $\hat{x}$ is what you measured, the $x$ is what you expected it to be.

Back to the perfect line you never drew in school: whether you drew it 11cm or 9cm, the error is still 1cm. Now imagine you want to know how precise you are at drawing lines. You decide to draw many lines, measure each one and compute its (absolute) error. Then you take the average of those errors.

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left |x_i-\hat{x}_i \right |$$

And just like that, you understand what Mean Absolute Error is, kudos! It wasn’t that difficult, right? Observe that MAE uses the same unit of measurement as the original measurements.
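The formula above can be sketched in a few lines of plain Python. The function name `mean_absolute_error` and the three sample measurements are made up for illustration; they are not taken from any particular library.

```python
def mean_absolute_error(expected, measured):
    """Average of the absolute differences between expected and measured values."""
    if len(expected) != len(measured):
        raise ValueError("both sequences must have the same length")
    return sum(abs(x - x_hat) for x, x_hat in zip(expected, measured)) / len(expected)


# Three attempts at drawing a 10cm line (hypothetical measurements, in cm).
expected = [10, 10, 10]
measured = [9, 11, 10]

mae = mean_absolute_error(expected, measured)
print(mae)  # (1 + 1 + 0) / 3, still expressed in cm
```

Drawing a line 1cm too short or 1cm too long contributes the same amount to the result, which is exactly what the absolute value is for.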

MSE (Mean Squared Error)

Let’s start with an example. Imagine you have to draw four lines: let x=[2, 4, 8, 12] be the real values you expect (in centimeters), and let y=[2, 3, 9, 11] and z=[2, 4, 5, 12] be two sets of measurements; you can think of two people drawing the lines. Who’s better at drawing lines, y or z?

Next you calculate the MAE of y and z. Surprise: 0.75(cm) each! You could very well say it’s a tie. However notice that y made three errors of 1cm and z made one error of 3cm.
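If you want to check the tie yourself, here is a minimal sketch using the three lists from the example; the helper name `mae` is just an illustrative choice.

```python
x = [2, 4, 8, 12]   # expected lengths (cm)
y = [2, 3, 9, 11]   # lines drawn by the first person
z = [2, 4, 5, 12]   # lines drawn by the second person


def mae(expected, measured):
    """Mean Absolute Error of measured against expected."""
    return sum(abs(e - m) for e, m in zip(expected, measured)) / len(expected)


errors_y = [abs(e - m) for e, m in zip(x, y)]  # [0, 1, 1, 1]
errors_z = [abs(e - m) for e, m in zip(x, z)]  # [0, 0, 3, 0]

mae_y = mae(x, y)
mae_z = mae(x, z)
print(errors_y, mae_y)  # [0, 1, 1, 1] 0.75
print(errors_z, mae_z)  # [0, 0, 3, 0] 0.75
```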

Y is very precise across the whole four lines, while Z sometimes makes very big mistakes. Using MAE they look the same. You might think that you want to punish Z because he made such a big error. So how do you do this?

Remember that we measure errors as distances, and a distance can be thought of as a line (a segment). Now if you square that segment you will get an area (that of the square with that segment as its side). The area of the square with side 1 is still 1, while side 2 yields 4 and side 3 yields 9. We’re now ready to define the Mean Squared Error:

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left (x_i-\hat{x}_i \right )^2$$

Essentially the MSE is the MAE with each error squared instead of taken as an absolute value. MSE punishes values that are distant from the expected value: the greater the distance, the greater the punishment. You don’t need the absolute value because squaring the difference already makes it non-negative.

Let’s now calculate the MSE for y and z from the previous example: y is still 0.75 while z is 2.25. Recall that both y and z had the same MAE: 0.75cm. So in this case the MAE of y is equal to the MSE of y, right? Wrong!
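The two MSE values can be checked with a short sketch; the helper name `mean_squared_error` is chosen here for illustration only.

```python
x = [2, 4, 8, 12]   # expected lengths
y = [2, 3, 9, 11]   # three errors of 1
z = [2, 4, 5, 12]   # one error of 3


def mean_squared_error(expected, measured):
    """Average of the squared differences between expected and measured values."""
    return sum((e - m) ** 2 for e, m in zip(expected, measured)) / len(expected)


mse_y = mean_squared_error(x, y)  # (0 + 1 + 1 + 1) / 4
mse_z = mean_squared_error(x, z)  # (0 + 0 + 9 + 0) / 4
print(mse_y, mse_z)  # 0.75 2.25
```

The single 3cm error contributes 9 to the sum, three times as much as the three 1cm errors combined, which is why z now comes out worse.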

Notice that I didn’t state the unit of measurement for MSE. Can you guess what it is? The answer is $cm^2$, simply because you squared each error, and its unit was squared along with it. So MSE is $0.75cm^2$ for y and $2.25cm^2$ for z. Since you were drawing lines, it’s hard to tell whether that is good or bad (the MSE values in this case are areas). Of course z is worse because its MSE is higher, but you still can’t quantify how big the error is compared to the lines. Takeaway: MSE doesn’t use the same unit of measurement as your data!

RMSE (Root Mean Squared Error)

RMSE is essentially MSE under a square root. What does it accomplish? The same as MSE but you get an error with the same unit of measurement.

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left (x_i-\hat{x}_i \right )^2}$$

Let’s now calculate RMSE for y and z from the previous example: RMSE for y is ~0.8660cm while RMSE for z is 1.5cm. Since you took the square root of MSE, the result is back in the same unit of measurement as your data. You can now say for sure: y is the winner.
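A minimal sketch of the same calculation, using only Python’s standard `math` module; the function name `root_mean_squared_error` is an illustrative choice.

```python
import math

x = [2, 4, 8, 12]   # expected lengths (cm)
y = [2, 3, 9, 11]
z = [2, 4, 5, 12]


def root_mean_squared_error(expected, measured):
    """Square root of the mean of the squared differences."""
    mse = sum((e - m) ** 2 for e, m in zip(expected, measured)) / len(expected)
    return math.sqrt(mse)


rmse_y = root_mean_squared_error(x, y)  # sqrt(0.75)
rmse_z = root_mean_squared_error(x, z)  # sqrt(2.25)
print(round(rmse_y, 4), rmse_z)  # 0.866 1.5
```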

R^2 (R-squared)

The last metric I want to show you is $r^2$, also known as the coefficient of determination. Along with RMSE it is probably the most used metric when it comes to regression problems. While MAE, MSE and RMSE depend on the scale of your data, $r^2$ is unitless: for a least-squares linear regression with an intercept, evaluated on the data it was fitted on, it always lies between 0 and 1, and the higher the value, the better the fit. In other settings (for example a poor model scored on unseen data) it can even be negative, meaning the model does worse than always predicting the mean. $r^2$ is not an error metric although it is often (mis)used as one. I won’t delve too deep into this one as you need some solid basics in statistics, but most of the time you’ll see $r^2$ described as the ratio of explained variance to total variance.

$$r^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$$

$$ESS = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$

$$TSS = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

$$RSS = \sum_{i=1}^{n} (y_i -\hat{y}_i)^2$$

In short:

$$r^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

Now it may seem daunting at first, but it really isn’t. Keep in mind:

$y_i$ is the real value,
$\bar{y}$ is the mean of the real values,
$\hat{y}_i$ is the predicted value.

If you look at the first formula (ESS/TSS), the numerator adds up the squared distances between each predicted value $\hat{y}_i$ and the mean $\bar{y}$, while the denominator adds up the squared distances between each real value $y_i$ and the same mean. In other words, it compares how far the predictions spread around the mean with how far the real values spread around it. It is not an easy concept to grasp without solid statistics foundations so don’t worry for now.
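To make ESS, TSS and RSS a little more concrete, here is a rough sketch in plain Python. It assumes a simple least-squares line with a single input; the data points and the helper name `fit_line` are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: one input (xs) and the real values to predict (ys).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
mean_y = sum(ys) / len(ys)

ess = sum((p - mean_y) ** 2 for p in predicted)          # explained sum of squares
tss = sum((y - mean_y) ** 2 for y in ys)                 # total sum of squares
rss = sum((y - p) ** 2 for y, p in zip(ys, predicted))   # residual sum of squares

r2_from_ess = ess / tss
r2_from_rss = 1 - rss / tss
print(r2_from_ess, r2_from_rss)  # the two agree up to floating-point rounding
```

The two forms agree here because the line was fitted by least squares with an intercept; for predictions obtained some other way, ESS/TSS and 1 - RSS/TSS are not guaranteed to match.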

$r^2$ is the only metric presented in this article that is not expressed in the unit of your data (or its square); for a least-squares fit it sits on a scale between 0 and 1. Because it doesn’t represent an error it is not an error metric. $r^2$ is often used as one, but the truth is it only says how much of the variation in $y$ your model can explain from the inputs, and it can be very good (~1) or very bad (~0).
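As a rough sketch, the 1 - RSS/TSS form can also be applied to the line-drawing example from the MSE section, treating the expected lengths as the real values and each person’s lines as predictions. The helper name `r_squared` and the third, deliberately bad set of lines are made up for illustration.

```python
def r_squared(real, predicted):
    """Coefficient of determination computed as 1 - RSS/TSS."""
    mean_real = sum(real) / len(real)
    rss = sum((r - p) ** 2 for r, p in zip(real, predicted))
    tss = sum((r - mean_real) ** 2 for r in real)
    return 1 - rss / tss


real = [2, 4, 8, 12]      # expected lengths
drawn_y = [2, 3, 9, 11]   # three small errors
drawn_z = [2, 4, 5, 12]   # one big error
drawn_bad = [12, 8, 4, 2]  # hypothetical: lengths drawn in reverse order

r2_y = r_squared(real, drawn_y)
r2_z = r_squared(real, drawn_z)
r2_bad = r_squared(real, drawn_bad)
print(round(r2_y, 4), round(r2_z, 4), round(r2_bad, 4))  # 0.9492 0.8475 -2.9322
```

One caveat worth knowing: these “predictions” do not come from a least-squares fit, so nothing keeps the result above 0. The reversed lines score a negative value, meaning they are worse than drawing every line at the mean length.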

Conclusion

You now know the most common error metrics (+ $r^2$) used in machine learning and data science for regression problems. While there are more error metrics out there, the ones presented in this article are by far the most used. When it comes to regression problems you’ve built an understanding of what MAE, MSE and RMSE represent, what they are used for and how to calculate them. Lastly, you’re now aware of $r^2$, which is not an error metric but is often used to evaluate regression models.

