Showing 6 changed files with 295 additions and 1 deletion.
@@ -0,0 +1,69 @@
+++
title = "Automatic Differentiation"
author = ["Houjun Liu"]
draft = false
+++


## Forward Accumulation {#forward-accumulation}

First, make a computation graph.

Consider \\(\ln (ab + \max (a,2))\\)

{{< figure src="/ox-hugo/2024-04-04_09-40-27_screenshot.png" >}}

Say we want \\(\pdv{f}{a}(3,2)\\).

Let's begin by tracking, left to right, both the **value** of each node and its **derivative**.

Layer 1:

- \\(b = 2, \pdv{b}{a} = 0\\)
- \\(a = 3, \pdv{a}{a} = 1\\)

Layer 2:

- \\(c\_1 = a\times b = 6, \pdv{c\_1}{a} = b\pdv{a}{a} + a \pdv{b}{a} = 2 \cdot 1 + 3 \cdot 0 = 2\\)

and so on, until we get to \\(c\_4\\). A small sketch of this bookkeeping is below.

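Concretely, in Python (one plausible labeling of the intermediate nodes \\(c\_1, \dots, c\_4\\), consistent with the notes above; the variable names are our own):

```python
import math

# Forward accumulation on f(a, b) = ln(a*b + max(a, 2)),
# carrying (value, derivative-w.r.t.-a) pairs through each node.

a, da = 3.0, 1.0   # seed: da/da = 1
b, db = 2.0, 0.0   # db/da = 0

c1, dc1 = a * b, b * da + a * db            # product rule: 6.0, 2.0
c2, dc2 = (a, da) if a > 2 else (2.0, 0.0)  # max follows the active branch: 3.0, 1.0
c3, dc3 = c1 + c2, dc1 + dc2                # sum rule: 9.0, 3.0
c4, dc4 = math.log(c3), dc3 / c3            # chain rule: ln(9), 1/3

print(c4, dc4)  # f(3,2) = ln 9 ≈ 2.1972, df/da(3,2) = 1/3
```
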


## Dual Number Method {#dual-number-method}


### Dual Number {#dual-number}

Consider:

\begin{equation}
a+b \epsilon
\end{equation}

Let's declare:

\begin{equation}
\epsilon^{2} = 0
\end{equation}

The standard arithmetic operations still apply (strictly, dual numbers form a ring rather than a field, since \\(\epsilon\\) has no inverse):

\begin{equation}
(a+b\epsilon) + (c+d\epsilon) = (a+c) + (b+d) \epsilon
\end{equation}
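
Multiplication works out similarly; the \\(\epsilon^{2} = 0\\) rule is what does the work:

\begin{equation}
(a+b\epsilon)(c+d\epsilon) = ac + (ad + bc)\epsilon + bd\epsilon^{2} = ac + (ad + bc)\epsilon
\end{equation}

Note that the \\(\epsilon\\) coefficient is exactly the product rule.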


### The Method {#the-method}

We can write down the usual Taylor expansion of \\(f\\) about \\(a\\):

\begin{equation}
f(a+b\epsilon) = \sum\_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (b \epsilon)^{k}
\end{equation}

Since \\(\epsilon^{2} = 0\\), every term with \\(k \geq 2\\) vanishes.

IMPORTANTLY:

\begin{equation}
f(a+1\epsilon) = f(a) + f'(a) \epsilon
\end{equation}

This means that we can use [Dual Number](#dual-number)s to directly compute derivatives: evaluate \\(f\\) at \\(a + \epsilon\\) and read \\(f'(a)\\) off the \\(\epsilon\\) coefficient. A minimal sketch follows.
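
For instance, a minimal dual-number implementation in Python (class and function names are our own invention, not a standard API):

```python
import math

class Dual:
    """Dual number a + b*eps with eps^2 = 0: re holds f, du holds f'."""
    def __init__(self, re, du=0.0):
        self.re, self.du = re, du

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.re + o.re, self.du + o.du)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, since eps^2 = 0
        return Dual(self.re * o.re, self.re * o.du + self.du * o.re)
    __rmul__ = __mul__

def dlog(x):
    # chain rule: (ln u)' = u'/u
    return Dual(math.log(x.re), x.du / x.re)

def dmax(x, c):
    # derivative follows whichever branch is active
    return x if x.re > c else Dual(c)

# f(a,b) = ln(ab + max(a,2)) at (3,2); seed a with derivative 1
a, b = Dual(3.0, 1.0), Dual(2.0)
out = dlog(a * b + dmax(a, 2.0))
print(out.re, out.du)  # ln 9 ≈ 2.1972, 1/3 ≈ 0.3333
```

Evaluating \\(f\\) once on dual inputs produces both the value and the exact derivative, with no step size \\(h\\) to tune.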
@@ -0,0 +1,20 @@
+++
title = "Derivatives, Bracketing, Descent, and Approximation"
author = ["Houjun Liu"]
draft = false
+++

- [Formal Formulation of Optimization]({{< relref "KBhsu_cs361_apr022024.md#formal-formulation-of-optimization" >}})
- [constraint]({{< relref "KBhsu_cs361_apr022024.md#constraint" >}})
- types of conditions
    - [FONC]({{< relref "KBhsu_cs361_apr042024.md#first-order-necessary-condition" >}}) and [SONC]({{< relref "KBhsu_cs361_apr042024.md#second-order-necessary-condition" >}})
- [Derivatives]({{< relref "KBhsu_cs361_apr042024.md#derivative" >}})
    - [Directional Derivatives]({{< relref "KBhsu_cs361_apr042024.md#directional-derivative" >}})
    - numerical methods
        - [Finite-Difference Method]({{< relref "KBhsu_cs361_apr042024.md#finite-difference-method" >}})
        - [Complex-Difference Method]({{< relref "KBhsu_cs361_apr042024.md#complex-difference-method" >}})
    - exact methods: autodiff
        - [Forward Accumulation]({{< relref "KBhautomatic_differentiation.md#forward-accumulation" >}})
        - cooool: [Dual Number Method]({{< relref "KBhautomatic_differentiation.md#dual-number-method" >}})
- [Bracketing]({{< relref "KBhsu_cs361_apr042024.md#bracketing" >}})
    - [Fibonacci Search]({{< relref "KBhsu_cs361_apr042024.md#fibonacci-search" >}})
@@ -0,0 +1,5 @@
+++
title = "Single-Objective Optimization"
author = ["Houjun Liu"]
draft = false
+++
@@ -0,0 +1,176 @@
+++
title = "SU-CS361 APR042024"
author = ["Houjun Liu"]
draft = false
+++


## optimization inequalities cannot be strict {#optimization-inequalities-cannot-be-strict}

Consider:

\begin{align}
\min\_{x}&\ x \\\\
s.t.\ & x > 1
\end{align}

this has **NO SOLUTION**: \\(x = 1\\) wouldn't actually be in the [feasible set]({{< relref "KBhsu_cs361_apr022024.md#formal-formulation-of-optimization" >}}), and any feasible \\(x > 1\\) can always be improved upon. So, we usually specify optimization without a strict inequality.

So, instead, we write:

\begin{align}
\min\_{x}&\ x \\\\
s.t.\ & x \geq 1
\end{align}



## Univariate Conditions {#univariate-conditions}


### First order Necessary Condition {#first-order-necessary-condition}

\begin{equation}
\nabla f(x^{\*}) = 0
\end{equation}


### Second order necessary condition {#second-order-necessary-condition}

\begin{equation}
\nabla^{2}f(x^{\*}) \geq 0
\end{equation}

(in the multivariate case, "\\(\geq 0\\)" means the Hessian is positive semidefinite)

For example, \\(f(x) = x^{2}\\) at \\(x^{\*} = 0\\): \\(f'(0) = 0\\) and \\(f''(0) = 2 \geq 0\\), so both conditions hold.

## Derivative {#derivative}

\begin{equation}
f'(x) = \lim\_{\Delta x \to 0} \frac{\Delta f(x)}{\Delta x}
\end{equation}

Or the gradient; our convention is that gradients are **COLUMN** vectors:

\begin{equation}
\nabla f(x) = \mqty(\pdv{f(x)}{x\_1} \\\\ \pdv{f(x)}{x\_2} \\\\ \vdots \\\\ \pdv{f(x)}{x\_{n}})
\end{equation}

The Hessian matrix holds the second-order partials: rows are the first index, columns are the second, i.e. \\((\nabla^{2} f(x))\_{ij} = \pdv{f(x)}{x\_i}{x\_j}\\).

## Directional Derivative {#directional-derivative}

\begin{align}
\nabla\_{s} f(x) &= \lim\_{h \to 0} \frac{f(x+hs) - f(x)}{h} \\\\
&= \lim\_{h \to 0} \frac{f(x+\frac{hs}{2}) - f(x- \frac{hs}{2})}{h}
\end{align}

i.e. this is the "derivative along a direction" \\(s\\); for differentiable \\(f\\), it equals \\(\nabla f(x)^{\top} s\\), as the short check below illustrates.

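A quick numerical check of both forms (the example function is our own):

```python
import numpy as np

def f(x):
    return x[0]**2 + 2 * x[1]**2   # example: grad f = (2*x1, 4*x2)

x = np.array([1.0, 2.0])
s = np.array([1.0, 1.0]) / np.sqrt(2)   # unit direction

# central-difference version of the limit above
h = 1e-6
numeric = (f(x + (h / 2) * s) - f(x - (h / 2) * s)) / h

# gradient form: the column vector dotted with the direction
analytic = np.array([2 * x[0], 4 * x[1]]) @ s

print(numeric, analytic)  # both ≈ 10 / sqrt(2) ≈ 7.0711
```
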


## Numerical Method {#numerical-method}


### Finite-Difference Method {#finite-difference-method}

Recall the Taylor series of \\(f(x+h)\\) about \\(x\\):

\begin{equation}
f(x+h) = f(x) + \frac{f'(x)}{1} h + \frac{f''(x)}{2!} h^{2} + \dots
\end{equation}

Moving it around to get \\(f'(x)\\) by itself:

\begin{equation}
f'(x)h = f(x+h) - f(x) - \frac{f''(x)}{2!} h^{2} - \dots
\end{equation}

So:

\begin{equation}
f'(x) \approx \frac{f(x+h)-f(x)}{h}
\end{equation}

where the dropped terms contribute an error of \\(O(h)\\) once we divide through by \\(h\\).

#### Two Sources of Error {#two-sources-of-error}

- \\(f(x+h) - f(x)\\): at small values of \\(h\\) the two evaluations agree in their leading digits, so the subtraction cancels them, because of **floating point errors**
- the \\(O(h)\\) truncation term, which is not accounted for in the end

### Complex-Difference Method {#complex-difference-method}

Consider a Taylor expansion with a complex step:

\begin{equation}
f(x + ih) = f(x) + ih f'(x) - h^{2} \frac{f''(x)}{2!} - ih^{3} \frac{f'''(x)}{3!} + \dots
\end{equation}

Let's again try to get \\(f'(x)\\) by itself. Taking the imaginary part of both sides keeps only every other term of the expansion, so:

\begin{equation}
f'(x) = \frac{\Im (f(x+ih))}{h} + O(h^{2})
\end{equation}

**NOTICE**: we no longer have the cancellation error, because we no longer have a subtraction. A small numerical comparison is below.
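
To see both error sources concretely, here is a small comparison on \\(f = \sin\\) (our own illustration):

```python
import numpy as np

f, x, h = np.sin, 1.0, 1e-10   # exact derivative is cos(1)

finite = (f(x + h) - f(x)) / h             # forward difference: O(h) + cancellation
complex_step = np.imag(f(x + 1j * h)) / h  # complex step: O(h^2), no subtraction

exact = np.cos(x)
print(abs(finite - exact))        # around 1e-6: dominated by floating-point cancellation
print(abs(complex_step - exact))  # around 1e-16: essentially machine precision
```
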


## [Automatic Differentiation]({{< relref "KBhautomatic_differentiation.md" >}}) {#automatic-differentiation--kbhautomatic-differentiation-dot-md}

See [Automatic Differentiation]({{< relref "KBhautomatic_differentiation.md" >}})

## Bracketing {#bracketing}

Given a unimodal function, the global minimum is guaranteed to be within \\([a,c]\\), with \\(b \in [a,c]\\), if we have that \\(f(a) > f(b) < f( c)\\).

So let's find this bracket.

### Unimodality {#unimodality}

A function \\(f\\) is unimodal if:

\\(\exists\\) a unique \\(x^{\*}\\) such that \\(f\\) is monotonically decreasing for \\(x \leq x^{\*}\\) and monotonically increasing for \\(x \geq x^{\*}\\)

### Bracketing Procedure {#bracketing-procedure}

If we don't know anything, we might as well start with \\(a=-1, b=0, c=1\\).

One of three things happens:

- we already satisfy \\(f(a) > f(b) < f( c)\\), in which case we are done
- if our left side \\(f(a)\\) is too low, we move \\(a\\) to the left without moving \\(c\\), doubling the step size every time until it works
- symmetrically, if our right side \\(f( c)\\) is too low, we move \\(c\\) to the right, again doubling the step size

A sketch of this procedure appears below.

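A Python sketch of the procedure (function and argument names are our own):

```python
def bracket_minimum(f, a=-1.0, b=0.0, c=1.0, step=1.0):
    """Expand [a, c] by doubling steps until f(a) > f(b) < f(c)."""
    while True:
        if f(a) > f(b) < f(c):   # valid bracket: a minimum lies in [a, c]
            return a, b, c
        if f(a) <= f(b):         # left side too low: push a further left
            a, step = a - step, step * 2
        else:                    # right side too low: push c further right
            c, step = c + step, step * 2

# e.g. bracket_minimum(lambda x: (x - 10)**2) expands rightward
# until the bracket contains x = 10
```

For typical unimodal \\(f\\), pushing an endpoint past the minimizer eventually raises its value above \\(f(b)\\), at which point the test succeeds.
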
#### Fibonacci Search {#fibonacci-search}

Say you can only evaluate \\(f\\) a finite number of times, and you want to shrink the bracketing interval as much as possible.

<!--list-separator-->

- Two Evaluations

    At two evaluations, you should pick two points right down the middle, very close together; this will cut your interval roughly in half.

<!--list-separator-->

- \\(n\\) evaluations

    Evaluate intervals with lengths given by the Fibonacci sequence:

    \begin{equation}
    F\_{n} =
    \begin{cases}
    1, & n\leq 2 \\\\
    F\_{n-1} + F\_{n-2}, & \text{otherwise}
    \end{cases}
    \end{equation}

#### Golden Section Search {#golden-section-search}

Shrink instead by the golden ratio, which is constant (it is the limit of the ratio between consecutive Fibonacci numbers). A sketch is below.
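
A Python sketch, shrinking by the constant factor \\(\varphi - 1 \approx 0.618\\) per evaluation (our own implementation of the standard scheme):

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio ≈ 1.618

def golden_section_search(f, a, c, n=30):
    """Shrink a bracket [a, c] on a unimodal f using n evaluations."""
    rho = PHI - 1                  # ≈ 0.618, the constant shrink factor
    d = rho * c + (1 - rho) * a    # interior point, 61.8% of the way to c
    yd = f(d)
    for _ in range(n - 1):
        b = rho * a + (1 - rho) * c
        yb = f(b)
        if yb < yd:
            c, d, yd = d, b, yb    # minimum lies between a and the old d
        else:
            a, c = c, b            # shrink from the other side (interval may reverse)
    return (a, c) if a < c else (c, a)

# e.g. golden_section_search(lambda x: (x - 2)**2, 0.0, 5.0)
# returns a tiny interval around x = 2
```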