Benjamin Obi Tayo Ph.D.


Tackling Bias-Variance Problems | Towards AI

Bias-Variance Tradeoff Illustration Using Pylab

In statistics and machine learning, the bias-variance tradeoff is the property of a set of predictive models whereby models with a lower bias in parameter estimation have a higher variance of the parameter estimates across samples and vice versa. The bias-variance dilemma or problem is the conflict in trying to simultaneously minimize these two sources of error that prevent supervised learning algorithms from generalizing beyond their training set:

  • The bias is an error from erroneous assumptions in the learning algorithm. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).
  • The variance is an error from sensitivity to small fluctuations in the training set. High variance can cause an algorithm to model the random noise in the training data, rather than the intended outputs (overfitting).
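The two error sources above are often summarised by the standard decomposition of expected squared error. The notation here is an assumption for illustration and does not appear elsewhere in the article: the data are taken to follow y = f(x) + ε with zero-mean noise of variance σ², and f̂ is the model fitted to a random training set.

```latex
\mathbb{E}\!\left[\bigl(y - \hat{f}(x)\bigr)^{2}\right]
  = \underbrace{\bigl(\mathbb{E}[\hat{f}(x)] - f(x)\bigr)^{2}}_{\text{bias}^{2}}
  + \underbrace{\mathbb{E}\!\left[\bigl(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\bigr)^{2}\right]}_{\text{variance}}
  + \underbrace{\sigma^{2}}_{\text{irreducible noise}}
```

The expectations are taken over training sets and noise. An overly rigid model inflates the first term, an overly flexible one inflates the second, and no choice of model removes the third.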

In this article, we illustrate the bias-variance problem using PyLab. Using an example, we discuss the concepts of underfitting (bias error) and overfitting (variance error).

Example: Position of an Object Hurled Upwards into the Air

We consider an experiment in which an object has been hurled into the air and its position measured as a function of time. The data obtained from this hypothetical experiment are shown below:

#import necessary libraries
import pylab
import numpy as np
import matplotlib.pyplot as plt
#create the dataset
t = np.linspace(0,1,11)
h = np.array([1.67203, 1.79792, 2.37791, 2.66408, 2.11245, 2.43969,
              1.88843, 1.59447, 1.79634, 1.07810, 0.21066])
The measured height of a particle at various times.

From our studies of mechanics, we know that the height of an object in free flight varies as the square of the time, so a quadratic fit to the data is more appropriate than a linear one.
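One common way to write the quadratic dependence is the constant-acceleration equation of motion. This is a sketch under the assumptions of constant gravitational acceleration g and negligible air resistance; the symbols h_0 (initial height) and v_0 (initial upward velocity) are introduced here for illustration.

```latex
h(t) = h_{0} + v_{0}\,t - \tfrac{1}{2}\,g\,t^{2}
```

Under these assumptions the model has exactly three free parameters, which is what a degree-2 polynomial fit provides.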

To illustrate the bias-variance problem, let's return to the position data and fit a linear, a quadratic, and a degree = 10 polynomial to the data:

plt.figure(figsize=(15,5))
# fig 1
plt.subplot(131)
#perform linear fit using pylab
degree = 1
model=pylab.polyfit(t,h,degree)
est_h=pylab.polyval(model,t)
#calculating R-squared value
R2 = 1 - ((h-est_h)**2).sum()/((h-h.mean())**2).sum()
#plot of observed and modeled data
pylab.scatter(t,h, c='b', label='observed')
pylab.plot(t,est_h, c='r', label='predicted:' + ' R2' '='+ ' ' + str(round(R2,4)))
pylab.xlabel('t(s)')
pylab.ylabel('h(m)')
pylab.title('linear model is not good (underfit)')
pylab.legend()
# fig 2
plt.subplot(132)
#perform quadratic fit using pylab
degree = 2
model=pylab.polyfit(t,h,degree)
est_h=pylab.polyval(model,t)
#calculating R-squared value
R2 = 1 - ((h-est_h)**2).sum()/((h-h.mean())**2).sum()
#plot of observed and modeled data
pylab.scatter(t,h, c='b', label='observed')
pylab.plot(t,est_h, c='r', label='predicted:' + ' R2' '='+ ' ' + str(round(R2,4)))
pylab.xlabel('t(s)')
pylab.ylabel('h(m)')
pylab.title('quadratic model is what we need')
pylab.legend()
# fig 3
plt.subplot(133)
#perform higher-degree fit using pylab
degree = 10
model=pylab.polyfit(t,h,degree)
est_h=pylab.polyval(model,t)
#calculating R-squared value
R2 = 1 - ((h-est_h)**2).sum()/((h-h.mean())**2).sum()
#plot of observed and modeled data
pylab.scatter(t,h, c='b', label='observed')
pylab.plot(t,est_h, c='r', label='predicted:' + ' R2' '='+ ' ' + str(round(R2,4)))
pylab.xlabel('t(s)')
pylab.ylabel('h(m)')
pylab.title('degree=10 captures random error (overfit)')
pylab.legend()
pylab.show()
Illustration of bias error (underfit) and variance error (overfit).
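The R2 values shown in the three panels can also be computed without any plotting. The following is a minimal sketch that wraps the same R2 formula in a small helper; the names `r_squared` and `r2_by_degree` are hypothetical, and `np.polyfit` / `np.polyval` are used directly in place of the `pylab` aliases.

```python
import numpy as np

# Same dataset as in the article
t = np.linspace(0, 1, 11)
h = np.array([1.67203, 1.79792, 2.37791, 2.66408, 2.11245, 2.43969,
              1.88843, 1.59447, 1.79634, 1.07810, 0.21066])


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = ((observed - predicted) ** 2).sum()
    ss_tot = ((observed - observed.mean()) ** 2).sum()
    return 1 - ss_res / ss_tot


# Fit each polynomial degree and record its in-sample R2
r2_by_degree = {}
for degree in (1, 2, 10):
    coeffs = np.polyfit(t, h, degree)   # least-squares polynomial fit
    est_h = np.polyval(coeffs, t)       # fitted values at the sample times
    r2_by_degree[degree] = r_squared(h, est_h)
    print(f"degree {degree:2d}: R2 = {r2_by_degree[degree]:.4f}")
```

Keeping the metric in one function avoids repeating the formula for every panel and makes it easy to try other degrees.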

For the linear fit, R2 (R-squared) is 0.3953, which is low; a good fit would give a value close to 1.0. For the quadratic fit, R2 is 0.8895, a considerable improvement over the linear one. The degree = 10 polynomial gives an R2 equal to 1.0, but this is not evidence of a better model: a degree-10 polynomial has 11 coefficients, so it can pass exactly through all 11 data points, capturing the random measurement error along with the real trend. In this problem, our knowledge of mechanics tells us that there is no advantage in going to higher-order approximations beyond the quadratic model.
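An R2 of 1.0 on the fitted points says nothing about how a model behaves on points it has not seen. One way to probe this, which is not part of the original analysis, is leave-one-out validation: fit to ten of the points and predict the eleventh. The sketch below uses a hypothetical helper `loo_mse`, and it takes degree 9 as the most flexible model because only ten points remain in each fit.

```python
import numpy as np

# Same dataset as in the article
t = np.linspace(0, 1, 11)
h = np.array([1.67203, 1.79792, 2.37791, 2.66408, 2.11245, 2.43969,
              1.88843, 1.59447, 1.79634, 1.07810, 0.21066])


def loo_mse(t, h, degree):
    """Mean squared error when each point is predicted from a fit to the others."""
    squared_errors = []
    for i in range(len(t)):
        keep = np.arange(len(t)) != i                  # mask out the held-out point
        coeffs = np.polyfit(t[keep], h[keep], degree)  # fit on the remaining points
        prediction = np.polyval(coeffs, t[i])          # predict the held-out point
        squared_errors.append((h[i] - prediction) ** 2)
    return float(np.mean(squared_errors))


# Degree 9 interpolates the 10 remaining points exactly (assumed stand-in for the overfit case)
loo = {degree: loo_mse(t, h, degree) for degree in (1, 2, 9)}
for degree, mse in loo.items():
    print(f"degree {degree}: leave-one-out MSE = {mse:.4g}")
```

The interpolating polynomial reproduces its training points perfectly yet predicts the held-out point far worse than the quadratic does, which is the variance error in concrete form.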

In summary, we’ve discussed the bias-variance problem using a very simple example. We’ve seen that the primary factor in determining a good fit is the validity of the functional form being fitted. Theoretical or analytic information about the physical problem should be incorporated into the model whenever it is available. In general, a simple model with fewer parameters is easier to interpret than an overly complex one.
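As an illustration of interpretability, the three coefficients of the quadratic fit can be read as physical quantities. This is a sketch that assumes ideal projectile motion, h(t) = h0 + v0*t - (1/2)*g*t**2, with SI units as suggested by the axis labels; the names `g_est`, `v0_est` and `h0_est` are hypothetical.

```python
import numpy as np

# Same dataset as in the article
t = np.linspace(0, 1, 11)
h = np.array([1.67203, 1.79792, 2.37791, 2.66408, 2.11245, 2.43969,
              1.88843, 1.59447, 1.79634, 1.07810, 0.21066])

# np.polyfit returns coefficients from the highest power down: a2*t**2 + a1*t + a0
a2, a1, a0 = np.polyfit(t, h, 2)

# Map coefficients onto the assumed physical model
g_est = -2.0 * a2   # gravitational acceleration, since a2 = -g/2
v0_est = a1         # initial upward velocity
h0_est = a0         # initial height

print(f"g  ~ {g_est:.2f} m/s^2")
print(f"v0 ~ {v0_est:.2f} m/s")
print(f"h0 ~ {h0_est:.2f} m")
```

For this dataset the estimate of g should come out in the neighbourhood of 10 m/s^2, reasonably close to the familiar 9.8 m/s^2 given the noise. The eleven coefficients of the degree-10 fit have no comparable physical reading.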

References

  1. Bias-variance tradeoff, Wikipedia. https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff
  2. Paul L. DeVries, A First Course in Computational Physics, John Wiley & Sons, 1994.