RL — Optimization Algorithms

This article covers the optimization algorithms that often come up in reinforcement learning (RL).

Trust region method

There are two major families of optimization methods: line search and trust region. Gradient descent is a line search method. We determine the descent direction first and then take a step in that direction. In gradient descent, the step is the negative gradient × learning rate.
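To make the update concrete, here is a minimal Python sketch of gradient descent with a fixed learning rate. The function name, the quadratic test objective and the parameter values are illustrative assumptions, not something taken from the article.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient with a fixed learning rate."""
    x = x0
    for _ in range(steps):
        # Direction: the negative gradient. Step: learning_rate * gradient.
        x = x - learning_rate * grad(x)
    return x


# Hypothetical objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

With this objective, every step shrinks the distance to the minimizer x = 3 by a constant factor, so the choice of learning rate directly controls how fast (or whether) the iteration converges.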

In the trust region method, we first determine the maximum step size that we want to explore and then locate the optimal point within this trust region. Let’s start with an initial maximum step size δ as the radius of the trust region (the yellow circle in the original illustration). Our objective is to find the optimal point of m, a local model that approximates the objective function f, within the radius δ.

The trust region can expand or shrink at runtime to adapt to the curvature of the surface, and there are many ways to do this. In the traditional trust region method, since we approximate the objective function f with m, one possibility is to shrink the trust region if m turns out to be a poor approximator of f at the optimal point. Conversely, if the approximation is good, we expand it. The trust region method adjusts the trust region size dynamically according to this criterion.
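As a rough illustration of this shrink-or-expand rule, here is a one-dimensional Python sketch. It compares the actual reduction in f with the reduction predicted by a quadratic model m. The thresholds 0.25 and 0.75, the acceptance threshold eta, the scaling factors and the test function are conventional choices that we assume here; they are not given in the article.

```python
def trust_region_1d(f, grad, hess, x0, delta=1.0, delta_max=10.0,
                    eta=0.1, tol=1e-8, max_iter=100):
    """1-D trust region sketch with a quadratic model (assumed settings)."""
    x = x0
    for _ in range(max_iter):
        g, h = grad(x), hess(x)
        if abs(g) < tol:
            break
        # Minimize m(p) = f(x) + g*p + 0.5*h*p^2 subject to |p| <= delta.
        if h > 0:
            p = max(-delta, min(delta, -g / h))   # clipped Newton step
        else:
            p = -delta if g > 0 else delta        # model is unbounded: go to the edge
        predicted = -(g * p + 0.5 * h * p * p)    # m(0) - m(p), always positive here
        actual = f(x) - f(x + p)
        rho = actual / predicted                  # how well m predicted f
        if rho < 0.25:
            delta *= 0.25                         # poor approximation: shrink
        elif rho > 0.75 and abs(p) >= delta:
            delta = min(2.0 * delta, delta_max)   # good and at the boundary: expand
        if rho > eta:
            x = x + p                             # accept the step
    return x, delta


# Hypothetical non-convex objective f(x) = x^4 - 2x^2 with local minima at x = -1 and x = 1.
x_star, final_delta = trust_region_1d(
    lambda x: x**4 - 2.0 * x**2,
    lambda x: 4.0 * x**3 - 4.0 * x,
    lambda x: 12.0 * x**2 - 4.0,
    x0=0.5,
)
```

Starting from x0 = 0.5, the first trial step overshoots, so the radius shrinks and the step is rejected. Later steps are accepted as the model becomes more accurate near the minimum.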

Newton’s method

To solve f(x)=0, we repeat the following iteration until it converges.
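The iteration here is the standard Newton update x_{n+1} = x_n − f(x_n) / f′(x_n). A minimal Python sketch follows. The stopping tolerance, the iteration cap and the example function are our own assumptions.

```python
def newton_root(f, f_prime, x0, tol=1e-12, max_iter=50):
    """Find a root of f with Newton's update x <- x - f(x) / f'(x)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)
        x = x - step
        if abs(step) < tol:   # stop once the update is negligible
            break
    return x


# Hypothetical example: solve x^2 - 2 = 0 starting from x = 1.
root = newton_root(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
```

The sketch assumes f′(x) stays away from zero along the way; a production version would guard against that division.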

It can also be used to minimize f(x), by solving f′(x) = 0 instead:

The iteration used will be:
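Applying Newton's update to f′ instead of f gives x_{n+1} = x_n − f′(x_n) / f″(x_n). The Python sketch below assumes the second derivative is positive near the solution, so that the stationary point is a minimum. The example objective is hypothetical.

```python
import math


def newton_minimize(f_prime, f_double_prime, x0, tol=1e-12, max_iter=50):
    """Minimize f by driving f'(x) to zero with Newton's update."""
    x = x0
    for _ in range(max_iter):
        step = f_prime(x) / f_double_prime(x)
        x = x - step
        if abs(step) < tol:   # stop once the update is negligible
            break
    return x


# Hypothetical objective f(x) = exp(x) - 2x, which is convex with its minimum at x = ln 2.
x_min_newton = newton_minimize(lambda x: math.exp(x) - 2.0,
                               lambda x: math.exp(x),
                               x0=0.0)
```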

Nonlinear least-squares problem with Gauss-Newton method

(Example credit for the Gauss-Newton method: https://www8.cs.umu.se/kurser/5DA001/HT07/lectures/lsq-handouts.pdf)

Suppose we have a model g(t) parameterized by x1 and x2.

We also have training samples of t, each with an output label y.

We can define a function r which measures the error between our model and the label:

Our goal is to fit the model g to the samples by minimizing the following objective:
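For reference, the usual nonlinear least-squares objective can be written as below. The sign of the residual and the factor 1/2 are the common conventions and are assumed here; n denotes the number of training samples and x = (x1, x2).

```latex
r_i(x) = g(t_i; x) - y_i, \qquad i = 1, \dots, n

f(x) = \frac{1}{2} \sum_{i=1}^{n} r_i(x)^2 = \frac{1}{2}\, r(x)^T r(x)
```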

In this section, we solve this nonlinear least-squares problem with the Gauss-Newton method.

Recall Newton’s method for optimization:

In the equation above, we can rewrite Δx as the search direction p. The equation becomes:
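In multiple dimensions, the Newton step comes from minimizing the second-order Taylor model of f around the current x. Written with the search direction p, and assuming the Hessian is invertible, this is:

```latex
f(x + p) \approx f(x) + \nabla f(x)^T p + \frac{1}{2}\, p^T \nabla^2 f(x)\, p

\nabla^2 f(x)\, p = -\nabla f(x)
```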

Next, compute the first- and second-order derivatives of f:
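Assuming the objective is half the sum of squared residuals, f(x) = ½ Σ r_i(x)², the derivatives take the standard form below, where J is the Jacobian of the residual vector r with entries J_ij = ∂r_i/∂x_j:

```latex
\nabla f(x) = J(x)^T r(x)

\nabla^2 f(x) = J(x)^T J(x) + Q(x), \qquad
Q(x) = \sum_{i=1}^{n} r_i(x)\, \nabla^2 r_i(x)
```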

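In the Gauss-Newton method, we approximate the second derivative above by dropping the Q term. Near the solution, the Q term is often small compared with the first term, because every term in Q is multiplied by a residual. In particular, if the problem has zero residual at the solution, r = 0 and therefore Q = 0.

The Gauss-Newton method applies this approximation to Newton’s method.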

The Gauss-Newton method search direction is:
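Substituting ∇f = Jᵀr and the approximation ∇²f ≈ JᵀJ into the Newton equation gives the usual Gauss-Newton system for the search direction p. This assumes J has full column rank, so that JᵀJ is invertible:

```latex
J(x)^T J(x)\, p = -J(x)^T r(x)
```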

Let’s look at an example. We want to determine the growth rate of antelope. We develop a model g and fit the data to the model to compute the residual error r:
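Since the model and data figures are not reproduced here, the Python sketch below uses an assumed exponential growth model g(t) = x1 · exp(x2 · t) and synthetic data generated from hypothetical parameters (x1 = 2, x2 = 0.3). It is not the antelope data from the example; it only illustrates how the Gauss-Newton iteration runs for a two-parameter fit.

```python
import math


def gauss_newton_exp_fit(ts, ys, x1, x2, iters=20):
    """Fit the assumed model g(t) = x1 * exp(x2 * t) with full Gauss-Newton steps."""
    for _ in range(iters):
        e = [math.exp(x2 * t) for t in ts]
        # Residuals r_i = g(t_i) - y_i.
        r = [x1 * ei - y for ei, y in zip(e, ys)]
        # Jacobian columns: dr_i/dx1 = exp(x2*t_i), dr_i/dx2 = x1 * t_i * exp(x2*t_i).
        j1 = e
        j2 = [x1 * t * ei for t, ei in zip(ts, e)]
        # Normal equations (J^T J) p = -J^T r, written out for the 2x2 case.
        a11 = sum(u * u for u in j1)
        a12 = sum(u * v for u, v in zip(j1, j2))
        a22 = sum(v * v for v in j2)
        b1 = -sum(u * ri for u, ri in zip(j1, r))
        b2 = -sum(v * ri for v, ri in zip(j2, r))
        det = a11 * a22 - a12 * a12
        p1 = (b1 * a22 - a12 * b2) / det   # Cramer's rule
        p2 = (a11 * b2 - a12 * b1) / det
        x1, x2 = x1 + p1, x2 + p2
    return x1, x2


# Synthetic, noise-free data from hypothetical parameters x1 = 2.0, x2 = 0.3.
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.3 * t) for t in ts]
fit_x1, fit_x2 = gauss_newton_exp_fit(ts, ys, x1=1.5, x2=0.4)
```

Because the synthetic data has zero residual at the true parameters, the dropped Q term vanishes at the solution and the iteration recovers the parameters from a nearby starting guess. With noisy data or a poor starting point, a line search or a trust region is usually added on top of the plain step.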