Commit

kb autocommit
Jemoka committed Apr 4, 2024
1 parent ed0c671 commit a9f8f9f
Showing 6 changed files with 295 additions and 1 deletion.
69 changes: 69 additions & 0 deletions content/posts/KBhautomatic_differentiation.md
@@ -0,0 +1,69 @@
+++
title = "Automatic Differentiation"
author = ["Houjun Liu"]
draft = false
+++

## Forward Accumulation {#forward-accumulation}

First, make a computation graph.

Consider \\(\ln (ab + \max (a,2))\\)

{{< figure src="/ox-hugo/2024-04-04_09-40-27_screenshot.png" >}}

Say we want \\(\pdv{f}{a}(3,2)\\).

Let's begin by tracking, left to right, both the **value** of each node and its **derivative**.

Layer 1:

- \\(b = 2, \pdv{b}{a} = 0\\)
- \\(a = 3, \pdv{a}{a} = 1\\)

Layer 2:

- \\(c\_1 = a\times b = 6, \pdv{c\_1}{a} = b\pdv{a}{a} + a \pdv{b}{a} = 2 \cdot 1 + 3 \cdot 0 = 2\\)

and so on, layer by layer, until we reach the output node \\(c\_4\\).
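
Here is a minimal sketch of this forward pass in Python; the intermediate names `c1` through `c4` follow the graph above, and the code is an illustrative re-derivation rather than library autodiff:

```python
import math

# Forward accumulation: carry (value, derivative w.r.t. a) through the graph.
a, da = 3.0, 1.0   # seed: da/da = 1
b, db = 2.0, 0.0   # db/da = 0

# c1 = a * b  (product rule)
c1, dc1 = a * b, da * b + a * db                        # 6, 2

# c2 = max(a, 2); for a > 2 the derivative follows the a-branch
c2, dc2 = max(a, 2.0), (da if a > 2.0 else 0.0)         # 3, 1

# c3 = c1 + c2
c3, dc3 = c1 + c2, dc1 + dc2                            # 9, 3

# c4 = ln(c3)  (chain rule)
c4, dc4 = math.log(c3), dc3 / c3                        # ln 9, 1/3

print(c4, dc4)   # f(3, 2) ≈ 2.197,  df/da at (3, 2) ≈ 0.333
```

Each node carries a (value, derivative) pair, so a single left-to-right sweep yields both \\(f(3,2)\\) and \\(\pdv{f}{a}(3,2)\\).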


## Dual Number Method {#dual-number-method}


### Dual Number {#dual-number}

Consider:

\begin{equation}
a+b \epsilon
\end{equation}

Let's declare:

\begin{equation}
\epsilon^{2} = 0
\end{equation}

The standard field operations still apply:

\begin{equation}
(a+b\epsilon) + (c+d\epsilon) = (a+c) + (b+d) \epsilon
\end{equation}


### The Method {#the-method}

We can write down the usual Taylor expansion of \\(f\\) about \\(a\\), evaluated at \\(a + b\epsilon\\):

\begin{equation}
f(a+b\epsilon) = \sum\_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (a+b \epsilon - a)^{k} = \sum\_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (b \epsilon)^{k}
\end{equation}

IMPORTANTLY, since \\(\epsilon^{2} = 0\\), every term with \\(k \geq 2\\) vanishes, so taking \\(b = 1\\):

\begin{equation}
f(a+1\epsilon) = f(a) + f'(a) \epsilon
\end{equation}

This means that we can use [Dual Number](#dual-number)s to directly compute derivatives.
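
As a sketch, a tiny Python dual-number type suffices; only the operations needed for the running example \\(\ln(ab + \max(a,2))\\) are overloaded, and the helper names `dual_log` and `dual_max` are ours for illustration:

```python
import math

class Dual:
    """Number of the form a + b*eps with eps**2 = 0."""
    def __init__(self, real, dual=0.0):
        self.real, self.dual = real, dual

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.real + other.real, self.dual + other.dual)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, since eps^2 = 0
        return Dual(self.real * other.real,
                    self.real * other.dual + self.dual * other.real)

    __rmul__ = __mul__

def dual_log(x):
    # f(a + b eps) = f(a) + f'(a) b eps
    return Dual(math.log(x.real), x.dual / x.real)

def dual_max(x, c):
    # max with a constant; the derivative follows x when x.real > c
    return Dual(max(x.real, c), x.dual if x.real > c else 0.0)

# f(a, b) = ln(ab + max(a, 2)); seed a with dual part 1 to get df/da
a, b = Dual(3.0, 1.0), Dual(2.0, 0.0)
f = dual_log(a * b + dual_max(a, 2.0))
print(f.real, f.dual)   # ≈ 2.197, ≈ 0.333
```
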
20 changes: 20 additions & 0 deletions content/posts/KBhderivatives_descent_and_approximation.md
@@ -0,0 +1,20 @@
+++
title = "Derivatives, Bracketing, Descent, and Approximation"
author = ["Houjun Liu"]
draft = false
+++

- [Formal Formulation of Optimization]({{< relref "KBhsu_cs361_apr022024.md#formal-formulation-of-optimization" >}})
- [constraint]({{< relref "KBhsu_cs361_apr022024.md#constraint" >}})
- types of conditions
- [FONC]({{< relref "KBhsu_cs361_apr042024.md#first-order-necessary-condition" >}}) and [SONC]({{< relref "KBhsu_cs361_apr042024.md#second-order-necessary-condition" >}})
- [Derivatives]({{< relref "KBhsu_cs361_apr042024.md#derivative" >}})
- [Directional Derivatives]({{< relref "KBhsu_cs361_apr042024.md#directional-derivative" >}})
- numerical methods
- [Finite-Difference Method]({{< relref "KBhsu_cs361_apr042024.md#finite-difference-method" >}})
- [Complex-Difference Method]({{< relref "KBhsu_cs361_apr042024.md#complex-difference-method" >}})
- exact methods: autodiff
- [Forward Accumulation]({{< relref "KBhautomatic_differentiation.md#forward-accumulation" >}})
- cooool: [Dual Number Method]({{< relref "KBhautomatic_differentiation.md#dual-number-method" >}})
- [Bracketing]({{< relref "KBhsu_cs361_apr042024.md#bracketing" >}})
- [Fibonacci Search]({{< relref "KBhsu_cs361_apr042024.md#fibonacci-search" >}})
26 changes: 25 additions & 1 deletion content/posts/KBhoptimization_index.md
@@ -55,6 +55,30 @@ aa222.stanford.edu
## Lectures {#lectures}


## Basic, Single-Objective Optimization {#basic-single-objective-optimization}
## Derivatives, Bracketing, Descent, and Approximations {#derivatives-bracketing-descent-and-approximations}

Topics: [Derivatives, Bracketing, Descent, and Approximation]({{< relref "KBhderivatives_descent_and_approximation.md" >}})


### Lectures {#lectures}

- [SU-CS361 APR022024]({{< relref "KBhsu_cs361_apr022024.md" >}})
- [SU-CS361 APR042024]({{< relref "KBhsu_cs361_apr042024.md" >}})


## Direct Optimization {#direct-optimization}


## Stochastic, Population, and Expressions {#stochastic-population-and-expressions}


## Constraints {#constraints}


## Sampling and Surrogates {#sampling-and-surrogates}


## Optimization Under Uncertainty {#optimization-under-uncertainty}


## What's Next {#what-s-next}
5 changes: 5 additions & 0 deletions content/posts/KBhsingle_objective_optimization.md
@@ -0,0 +1,5 @@
+++
title = "Single-Objective Optimization"
author = ["Houjun Liu"]
draft = false
+++
176 changes: 176 additions & 0 deletions content/posts/KBhsu_cs361_apr042024.md
@@ -0,0 +1,176 @@
+++
title = "SU-CS361 APR042024"
author = ["Houjun Liu"]
draft = false
+++

## optimization inequalities cannot be strict {#optimization-inequalities-cannot-be-strict}

Consider:

\begin{align}
\min\_{x}&\ x \\\\
s.t.\ & x > 1
\end{align}

this has **NO SOLUTION**: the infimum \\(x = 1\\) is not actually in the [feasible set]({{< relref "KBhsu_cs361_apr022024.md#formal-formulation-of-optimization" >}}), and any feasible \\(x\\) can still be improved. So, we usually specify optimization problems without strict inequalities.

So, instead, we write:

\begin{align}
\min\_{x}&\ x \\\\
s.t.\ & x \geq 1
\end{align}


## Univariate Conditions {#univariate-conditions}


### First-Order Necessary Condition (FONC) {#first-order-necessary-condition}

\begin{equation}
\nabla f(x^{\*}) = 0
\end{equation}


### Second-Order Necessary Condition (SONC) {#second-order-necessary-condition}

\begin{equation}
\nabla^{2}f(x^{\*}) \geq 0
\end{equation}

i.e., the Hessian at \\(x^{\*}\\) is positive semi-definite.
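
As a sketch, a quick numerical check of both conditions at a candidate point; the quadratic objective and the eigenvalue test for positive semi-definiteness are illustrative choices:

```python
import numpy as np

# Illustrative objective: f(x) = x1^2 + 3*x2^2, minimized at the origin
grad = lambda x: np.array([2 * x[0], 6 * x[1]])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 6.0]])

x_star = np.array([0.0, 0.0])

fonc = np.allclose(grad(x_star), 0)                    # gradient vanishes
sonc = np.all(np.linalg.eigvalsh(hess(x_star)) >= 0)   # Hessian is PSD
print(fonc, sonc)   # True True
```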


## Derivative {#derivative}

\begin{equation}
f'(x) = \frac{\Delta f(x)}{\Delta x}
\end{equation}

Or the gradient; our convention is that the gradient is a **COLUMN** vector:

\begin{equation}
\nabla f(x) = \mqty(\pdv{f(x)}{x\_1} \\\\ \pdv{f(x)}{x\_2} \\\\ \vdots \\\\ \pdv{f(x)}{x\_{n}})
\end{equation}

The Hessian matrix collects the second-order partials: entry \\((i, j)\\) is \\(\pdv{f(x)}{x\_i}{x\_j}\\), where rows index the first variable and columns the second.


## Directional Derivative {#directional-derivative}

\begin{align}
\nabla\_{s} f(x) &= \lim\_{h \to 0} \frac{f(x+hs) - f(x)}{h} \\\\
&= \lim\_{h \to 0} \frac{f(x+\frac{hs}{2}) - f(x- \frac{hs}{2})}{h}
\end{align}

i.e., this is the derivative of \\(f\\) along the direction \\(s\\); equivalently, \\(\nabla\_{s} f(x) = \nabla f(x)^{\top} s\\).
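
A sketch of the second (central-difference) form in Python; the function, point, direction, and step size are illustrative:

```python
import numpy as np

def directional_derivative(f, x, s, h=1e-6):
    # central difference along direction s: [f(x + h s / 2) - f(x - h s / 2)] / h
    return (f(x + h * s / 2) - f(x - h * s / 2)) / h

# Example: f(x) = x1^2 + x1 * x2, so grad f = [2 x1 + x2, x1]
f = lambda x: x[0] ** 2 + x[0] * x[1]
x = np.array([1.0, 2.0])
s = np.array([1.0, 0.0])        # direction of increasing x1

print(directional_derivative(f, x, s))   # ≈ 4.0  (= grad f(x) · s)
```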


## Numerical Method {#numerical-method}


### Finite-Difference Method {#finite-difference-method}

Recall the Taylor series of \\(f(x+h)\\) expanded about \\(x\\):

\begin{equation}
f(x+h) = f(x) + \frac{f'(x)}{1!} h + \frac{f''(x)}{2!} h^{2} + \dots
\end{equation}

Rearranging to get \\(f'(x)\\) by itself:

\begin{equation}
f'(x)h = f(x+h) - f(x) - \frac{f''(x)}{2!} h^{2} - \dots
\end{equation}

So:

\begin{equation}
f'(x) \approx \frac{f(x+h)-f(x)}{h}
\end{equation}

where the dropped terms contribute an error of \\(O(h)\\).


#### Two Sources of Error {#two-sources-of-error}

- \\(f(x+h) - f(x)\\) suffers catastrophic cancellation at small \\(h\\), because \\(f(x+h)\\) and \\(f(x)\\) become nearly equal in **floating point**
- the truncation error from the dropped \\(O(h)\\) terms
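
A small Python experiment (illustrative function and step sizes) that exhibits both error sources:

```python
import numpy as np

f, fprime = np.sin, np.cos   # ground-truth derivative for comparison
x = 1.0

for h in [1e-2, 1e-6, 1e-12]:
    approx = (f(x + h) - f(x)) / h
    print(f"h={h:.0e}  error={abs(approx - fprime(x)):.2e}")
# the error first shrinks (O(h) truncation), then grows again
# once floating-point cancellation in f(x+h) - f(x) dominates
```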


### Complex-Difference Method {#complex-difference-method}

Consider the Taylor expansion with a complex step \\(ih\\):

\begin{equation}
f(x + ih) = f(x) + ih f'(x) - h^{2} \frac{f''(x)}{2!} - ih^{3} \frac{f'''(x)}{3!} + \dots
\end{equation}

Let's again isolate \\(f'(x)\\); taking the imaginary part of the expansion above keeps only the odd-order terms, and dividing by \\(h\\) gives:

\begin{equation}
f'(x) = \frac{\Im (f(x+ih))}{h} + \dots
\end{equation}

where the dropped terms contribute an error of \\(O(h^{2})\\).

**NOTICE**: we no longer have the cancellation error because we no longer have subtraction.
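
Repeating the same illustrative experiment with a complex step (again a sketch; with no subtraction, a very small \\(h\\) is safe):

```python
import numpy as np

f, fprime = np.sin, np.cos
x = 1.0

for h in [1e-2, 1e-6, 1e-12]:
    approx = np.imag(f(x + 1j * h)) / h
    print(f"h={h:.0e}  error={abs(approx - fprime(x)):.2e}")
# the error falls as O(h^2) and stays near machine precision for tiny h
```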


## [Automatic Differentiation]({{< relref "KBhautomatic_differentiation.md" >}}) {#automatic-differentiation--kbhautomatic-differentiation-dot-md}

See [Automatic Differentiation]({{< relref "KBhautomatic_differentiation.md" >}})


## Bracketing {#bracketing}

Given a unimodal function, the global minimum is guaranteed to lie within \\([a,c]\\) if there is a \\(b \in [a,c]\\) with \\(f(a) > f(b) < f( c)\\).

So let's find this bracket.


### Unimodality {#unimodality}

A function \\(f\\) is unimodal if:

\\(\exists\\) unique \\(x^{\*}\\) such that \\(f\\) is monotonically decreasing for \\(x \leq x^{\*}\\) and monotonically increasing for \\(x \geq x^{\*}\\)


### Bracketing Procedure {#bracketing-procedure}

If we don't know anything, we might as well start with \\(a=-1, b=0, c=1\\).

One of three things:

- if we already satisfy \\(f(a) > f(b) < f( c)\\), we are done
- if the left side \\(f(a)\\) is too low, move \\(a\\) to the left without moving \\(c\\), doubling the step size each time until the condition holds
- symmetrically, if the right side \\(f( c)\\) is too low, move \\(c\\) to the right in the same doubling fashion (see the sketch below)
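
A sketch of this expansion in Python; the starting bracket and doubling schedule follow the bullets above, and the test function is illustrative:

```python
def bracket_minimum(f, a=-1.0, b=0.0, c=1.0, step=1.0):
    """Expand [a, c] (with interior point b) until f(a) > f(b) < f(c)."""
    while True:
        if f(a) > f(b) < f(c):
            return a, b, c
        if f(a) <= f(b):           # left side too low: push a further left
            a, b = a - step, a
        else:                      # right side too low: push c further right
            c, b = c + step, c
        step *= 2                  # double the step each time

# Example on a simple quadratic with minimum at x = 10
print(bracket_minimum(lambda x: (x - 10) ** 2))
```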


#### Fibonacci Search {#fibonacci-search}

Say you can only evaluate the function a fixed number of times, and you want to shrink the bracketing interval as much as possible.

<!--list-separator-->

- Two Evaluations

At two evaluations, you should pick two points just on either side of the midpoint, very close together; this cuts the interval roughly in half.

<!--list-separator-->

- \\(n\\) evaluations

Evaluate points so that successive interval lengths follow the Fibonacci numbers:

\begin{equation}
F\_{n} =
\begin{cases}
1 & n\leq 2 \\\\
F\_{n-1} + F\_{n-2} & \text{otherwise}
\end{cases}
\end{equation}


#### Golden Section Search {#golden-section-search}

Shrink the interval by a constant factor instead: the golden ratio, which is the limit of the ratio of successive Fibonacci numbers.
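
A sketch of golden section search over a bracket \\([a, b]\\) in Python; the tolerance and test function are illustrative:

```python
import math

def golden_section_search(f, a, b, tol=1e-6):
    """Shrink [a, b] by a constant factor of 1/phi per evaluation."""
    invphi = (math.sqrt(5) - 1) / 2            # 1/phi ≈ 0.618
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:                            # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:                                  # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return (a + b) / 2

print(golden_section_search(lambda x: (x - 10) ** 2, -1, 16))   # ≈ 10
```
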
Binary file added static/ox-hugo/2024-04-04_09-40-27_screenshot.png