Mazen Ahmed

Understanding Multiple Linear Regression

With video explanation | Data Series | Episode 4.4

In Episode 4.1 we looked at Simple Linear Regression, where we had one variable x to predict y. But what if we now have multiple variables, not just x but 𝑥₁, 𝑥₂, 𝑥₃, …, to predict y? How would we approach this problem? That is what I hope to explain in this article.

Simple Linear Regression Recap

From Episode 4.1 we had our data of temperature and humidity:

We plotted our data and found a linear relationship, making linear regression suitable:

We then calculated our regression line, ŷ = θ₀ + θ₁x, using gradient descent to find our parameters θ₀ and θ₁.

We then used the regression line calculated to make predictions for Humidity given any Temperature value.

What is Multiple Linear Regression?

Multiple linear regression takes the exact same concept as simple linear regression but applies it to multiple variables. So instead of just looking at temperature to predict humidity, we can look at other factors such as wind speed or pressure.

We are still trying to predict Humidity so this remains as y.

We rename Temperature, Wind Speed and Pressure to 𝑥₁, 𝑥₂ and 𝑥₃.

Just as with Simple Linear Regression, we must check that each of our variables 𝑥₁, 𝑥₂ and 𝑥₃ forms a linear relationship with y. If one does not, including it will produce a very inaccurate model.

Let’s plot each of our variables against Humidity:

  • Temperature and Humidity form a strong linear relationship
  • Wind Speed and Humidity form a linear relationship
  • Pressure and Humidity do not form a linear relationship

We therefore cannot use Pressure (𝑥₃) in our multiple linear regression model.
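As a numerical complement to the plots, the strength of a straight-line relationship can be summarised with the Pearson correlation coefficient. The sketch below is only an illustration: the function name `pearson_r` and the toy readings are hypothetical and are not taken from the article's dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy readings, made up for illustration only.
humidity = [0.90, 0.80, 0.70, 0.60, 0.50]
temperature = [5.0, 10.0, 15.0, 20.0, 25.0]            # falls on a straight line with humidity
pressure = [1010.0, 1000.0, 1015.0, 1000.0, 1010.0]    # no straight-line trend with humidity

r_temperature = pearson_r(temperature, humidity)
r_pressure = pearson_r(pressure, humidity)
```

With these made-up numbers, `r_temperature` is very close to -1 (a strong negative linear relationship) and `r_pressure` is very close to 0. A value near 0 only rules out a straight-line trend, so it is still worth looking at the plot as well.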

Plotting our Data

Let’s now plot both Temperature (𝑥₁) and Wind Speed (𝑥₂) against Humidity.

We can see that our data follows a roughly linear relationship; that is, we can fit a plane to our data that captures the relationship between Temperature and Wind Speed (𝑥₁, 𝑥₂) and Humidity (y).

Calculating the Regression Model

Because we are dealing with more than one 𝑥 variable, our linear regression model takes the form:

ŷ = θ₀ + θ₁𝑥₁ + θ₂𝑥₂

Just as with simple linear regression, to find our parameters θ₀, θ₁ and θ₂ we need to minimise our cost function:
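To make the idea of a cost function concrete, here is a minimal sketch. It assumes the cost is the mean squared error between predictions and observed values, scaled by 1/(2m). That scaling is a common convention and an assumption here, not something the text confirms. The names `predict` and `cost` and the toy data are hypothetical.

```python
def predict(theta, x1, x2):
    """Model prediction: y_hat = theta0 + theta1 * x1 + theta2 * x2."""
    theta0, theta1, theta2 = theta
    return theta0 + theta1 * x1 + theta2 * x2


def cost(theta, rows, ys):
    """Half the mean squared error over all m examples (assumed form)."""
    m = len(ys)
    total = sum((predict(theta, x1, x2) - y) ** 2 for (x1, x2), y in zip(rows, ys))
    return total / (2 * m)


# Hypothetical toy data generated exactly by y = 1 + 1*x1 + 2*x2.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
ys = [1.0, 2.0, 3.0]

perfect_cost = cost((1.0, 1.0, 2.0), rows, ys)  # parameters that fit the toy data exactly
poor_cost = cost((0.0, 0.0, 0.0), rows, ys)     # parameters that always predict 0
```

The parameters that reproduce the toy data give a cost of zero, while the all-zero parameters give a larger cost. Minimising the cost therefore pushes the parameters towards the best-fitting plane.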

We do this using the gradient descent algorithm:

This algorithm is explained in more detail here: https://readmedium.com/understanding-gradient-descent-de136ed5e962
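The update loop can be sketched in a few lines of plain Python. This is a sketch under stated assumptions rather than the exact procedure used for the article's data: it uses batch gradient descent on the mean squared error, and the learning rate `alpha`, the iteration count and the toy data are all hypothetical choices.

```python
def gradient_descent(rows, ys, alpha=0.1, iterations=5000):
    """Batch gradient descent for y_hat = theta0 + theta1*x1 + theta2*x2."""
    theta = [0.0, 0.0, 0.0]
    m = len(ys)
    for _ in range(iterations):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), y in zip(rows, ys):
            # Prediction error for this example under the current parameters.
            error = theta[0] + theta[1] * x1 + theta[2] * x2 - y
            grad[0] += error
            grad[1] += error * x1
            grad[2] += error * x2
        # Update all parameters simultaneously from the same gradient.
        theta = [t - alpha * g / m for t, g in zip(theta, grad)]
    return theta


# Hypothetical toy data generated exactly by y = 1 + 2*x1 + 3*x2.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ys = [1.0, 3.0, 4.0, 6.0]

theta = gradient_descent(rows, ys)
```

On this toy data the returned parameters approach θ₀ = 1, θ₁ = 2 and θ₂ = 3. On real data the learning rate usually needs tuning, especially when the variables are on very different scales.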

After running our gradient descent algorithm we find our optimal parameters to be θ₀ = 1.14, θ₁ = -0.031 and θ₂ = -0.004.

Giving our final regression model:

ŷ = 1.14 - 0.031𝑥₁ - 0.004𝑥₂

We can then use this regression model to make predictions for Humidity (ŷ) given any pair of Temperature (𝑥₁) and Wind Speed (𝑥₂) values.

In general, models that include more relevant variables tend to be more accurate, since we are incorporating more of the factors that have an effect on Humidity.
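As a worked example, the parameters found above can be plugged straight into a small function. The function name `predict_humidity` and the input values are hypothetical, and the inputs are assumed to be in the same units as the data the model was fitted on.

```python
# Parameters reported in the article.
THETA_0 = 1.14
THETA_1 = -0.031
THETA_2 = -0.004


def predict_humidity(temperature, wind_speed):
    """Predicted humidity y_hat = theta0 + theta1 * x1 + theta2 * x2."""
    return THETA_0 + THETA_1 * temperature + THETA_2 * wind_speed


# Hypothetical inputs chosen only to illustrate the arithmetic.
example = predict_humidity(temperature=20.0, wind_speed=10.0)
```

For these inputs the prediction is 1.14 - 0.031 × 20 - 0.004 × 10 = 1.14 - 0.62 - 0.04 = 0.48.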

_________________________________________

Potential Problems

When including more and more variables in our model we run into a few problems:

  • Certain variables may become redundant. For example, look at our regression model above: θ₂ = -0.004, so multiplying our wind speed (𝑥₂) by -0.004 barely changes our predicted value for humidity ŷ, which makes wind speed less useful in our model.
  • The scale of our data can also cause problems. For example, we can expect temperature to have a range of, say, -10 to 100, but pressure may have a range of 1000 to 1100. Using different scales of data can heavily affect the accuracy of our model.
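To illustrate the scale issue, one common remedy is to standardise each variable so that it has mean 0 and standard deviation 1 before fitting. This is not necessarily the approach the later episodes take. The sketch below uses a hypothetical helper `standardize` and made-up readings spanning the ranges mentioned above.

```python
import math


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


# Hypothetical readings, made up for illustration only.
temperature = [-10.0, 20.0, 50.0, 100.0]
pressure = [1000.0, 1030.0, 1060.0, 1100.0]

temperature_scaled = standardize(temperature)
pressure_scaled = standardize(pressure)
```

After standardising, both variables sit on a comparable scale, so neither dominates the gradient descent updates simply because its raw numbers are larger.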

How we solve these issues will be covered in future episodes.

If you have any questions please leave them below!
