Showing 6 changed files with 295 additions and 1 deletion.
@@ -0,0 +1,69 @@
+++
title = "Automatic Differentiation"
author = ["Houjun Liu"]
draft = false
+++


## Forward Accumulation {#forward-accumulation}

First, make a computation graph.

Consider \\(\ln (ab + \max (a,2))\\)

{{< figure src="/ox-hugo/2024-04-04_09-40-27_screenshot.png" >}}

Say we want \\(\pdv{f}{a}(3,2)\\).

Let's begin by tracking, left to right, both the **value** of each node and its **derivative**.

Layer 1:

- \\(b = 2, \pdv{b}{a} = 0\\)
- \\(a = 3, \pdv{a}{a} = 1\\)

Layer 2:

- \\(c\_1 = a\times b = 6, \pdv{c\_1}{a} = b\pdv{a}{a} + a \pdv{b}{a} = 2 \cdot 1 + 3 \cdot 0 = 2\\)

and so on, until we get to \\(c\_4\\). A small sketch of this bookkeeping is below.

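Concretely, in Python (one plausible labeling of the intermediate nodes \\(c\_1, \dots, c\_4\\), consistent with the notes above; the variable names are our own):

```python
import math

# Forward accumulation on f(a, b) = ln(a*b + max(a, 2)),
# carrying (value, derivative-w.r.t.-a) pairs through each node.

a, da = 3.0, 1.0   # seed: da/da = 1
b, db = 2.0, 0.0   # db/da = 0

c1, dc1 = a * b, b * da + a * db            # product rule: 6.0, 2.0
c2, dc2 = (a, da) if a > 2 else (2.0, 0.0)  # max follows the active branch: 3.0, 1.0
c3, dc3 = c1 + c2, dc1 + dc2                # sum rule: 9.0, 3.0
c4, dc4 = math.log(c3), dc3 / c3            # chain rule: ln(9), 1/3

print(c4, dc4)  # f(3,2) = ln 9 ≈ 2.1972, df/da(3,2) = 1/3
```
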


## Dual Number Method {#dual-number-method}


### Dual Number {#dual-number}

Consider:

\begin{equation}
a+b \epsilon
\end{equation}

Let's declare:

\begin{equation}
\epsilon^{2} = 0
\end{equation}

The standard arithmetic operations still apply (strictly, dual numbers form a ring rather than a field, since \\(\epsilon\\) has no inverse):

\begin{equation}
(a+b\epsilon) + (c+d\epsilon) = (a+c) + (b+d) \epsilon
\end{equation}
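
Multiplication works out similarly; the \\(\epsilon^{2} = 0\\) rule is what does the work:

\begin{equation}
(a+b\epsilon)(c+d\epsilon) = ac + (ad + bc)\epsilon + bd\epsilon^{2} = ac + (ad + bc)\epsilon
\end{equation}

Note that the \\(\epsilon\\) coefficient is exactly the product rule.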


### The Method {#the-method}

We can write down the usual Taylor expansion of \\(f\\) about \\(a\\):

\begin{equation}
f(a+b\epsilon) = \sum\_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (b \epsilon)^{k}
\end{equation}

Since \\(\epsilon^{2} = 0\\), every term with \\(k \geq 2\\) vanishes.

IMPORTANTLY:

\begin{equation}
f(a+1\epsilon) = f(a) + f'(a) \epsilon
\end{equation}

This means that we can use [Dual Number](#dual-number)s to directly compute derivatives: evaluate \\(f\\) at \\(a + \epsilon\\) and read \\(f'(a)\\) off the \\(\epsilon\\) coefficient. A minimal sketch follows.
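
For instance, a minimal dual-number implementation in Python (class and function names are our own invention, not a standard API):

```python
import math

class Dual:
    """Dual number a + b*eps with eps^2 = 0: re holds f, du holds f'."""
    def __init__(self, re, du=0.0):
        self.re, self.du = re, du

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.re + o.re, self.du + o.du)
    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, since eps^2 = 0
        return Dual(self.re * o.re, self.re * o.du + self.du * o.re)
    __rmul__ = __mul__

def dlog(x):
    # chain rule: (ln u)' = u'/u
    return Dual(math.log(x.re), x.du / x.re)

def dmax(x, c):
    # derivative follows whichever branch is active
    return x if x.re > c else Dual(c)

# f(a,b) = ln(ab + max(a,2)) at (3,2); seed a with derivative 1
a, b = Dual(3.0, 1.0), Dual(2.0)
out = dlog(a * b + dmax(a, 2.0))
print(out.re, out.du)  # ln 9 ≈ 2.1972, 1/3 ≈ 0.3333
```

Evaluating \\(f\\) once on dual inputs produces both the value and the exact derivative, with no step size \\(h\\) to tune.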
@@ -0,0 +1,20 @@
+++
title = "Derivatives, Bracketing, Descent, and Approximation"
author = ["Houjun Liu"]
draft = false
+++

- [Formal Formulation of Optimization]({{< relref "KBhsu_cs361_apr022024.md#formal-formulation-of-optimization" >}})
- [constraint]({{< relref "KBhsu_cs361_apr022024.md#constraint" >}})
- types of conditions
    - [FONC]({{< relref "KBhsu_cs361_apr042024.md#first-order-necessary-condition" >}}) and [SONC]({{< relref "KBhsu_cs361_apr042024.md#second-order-necessary-condition" >}})
- [Derivatives]({{< relref "KBhsu_cs361_apr042024.md#derivative" >}})
    - [Directional Derivatives]({{< relref "KBhsu_cs361_apr042024.md#directional-derivative" >}})
    - numerical methods
        - [Finite-Difference Method]({{< relref "KBhsu_cs361_apr042024.md#finite-difference-method" >}})
        - [Complex-Difference Method]({{< relref "KBhsu_cs361_apr042024.md#complex-difference-method" >}})
    - exact methods: autodiff
        - [Forward Accumulation]({{< relref "KBhautomatic_differentiation.md#forward-accumulation" >}})
        - cooool: [Dual Number Method]({{< relref "KBhautomatic_differentiation.md#dual-number-method" >}})
- [Bracketing]({{< relref "KBhsu_cs361_apr042024.md#bracketing" >}})
    - [Fibonacci Search]({{< relref "KBhsu_cs361_apr042024.md#fibonacci-search" >}})
@@ -0,0 +1,5 @@
+++
title = "Single-Objective Optimization"
author = ["Houjun Liu"]
draft = false
+++
@@ -0,0 +1,176 @@
+++
title = "SU-CS361 APR042024"
author = ["Houjun Liu"]
draft = false
+++


## optimization inequalities cannot be strict {#optimization-inequalities-cannot-be-strict}

Consider:

\begin{align}
\min\_{x}&\ x \\\\
s.t.\ & x > 1
\end{align}

this has **NO SOLUTION**: \\(x = 1\\) wouldn't actually be in the [feasible set]({{< relref "KBhsu_cs361_apr022024.md#formal-formulation-of-optimization" >}}), and any feasible \\(x > 1\\) can always be improved upon. So, we usually specify optimization without a strict inequality.

So, instead, we write:

\begin{align}
\min\_{x}&\ x \\\\
s.t.\ & x \geq 1
\end{align}



## Univariate Conditions {#univariate-conditions}


### First order Necessary Condition {#first-order-necessary-condition}

\begin{equation}
\nabla f(x^{\*}) = 0
\end{equation}


### Second order necessary condition {#second-order-necessary-condition}

\begin{equation}
\nabla^{2}f(x^{\*}) \geq 0
\end{equation}

(in the multivariate case, "\\(\geq 0\\)" means the Hessian is positive semidefinite)

For example, \\(f(x) = x^{2}\\) at \\(x^{\*} = 0\\): \\(f'(0) = 0\\) and \\(f''(0) = 2 \geq 0\\), so both conditions hold.

## Derivative {#derivative}

\begin{equation}
f'(x) = \lim\_{\Delta x \to 0} \frac{\Delta f(x)}{\Delta x}
\end{equation}

Or the gradient; our convention is that gradients are **COLUMN** vectors:

\begin{equation}
\nabla f(x) = \mqty(\pdv{f(x)}{x\_1} \\\\ \pdv{f(x)}{x\_2} \\\\ \vdots \\\\ \pdv{f(x)}{x\_{n}})
\end{equation}

The Hessian matrix holds the second-order partials: rows are the first index, columns are the second, i.e. \\((\nabla^{2} f(x))\_{ij} = \pdv{f(x)}{x\_i}{x\_j}\\).

## Directional Derivative {#directional-derivative}

\begin{align}
\nabla\_{s} f(x) &= \lim\_{h \to 0} \frac{f(x+hs) - f(x)}{h} \\\\
&= \lim\_{h \to 0} \frac{f(x+\frac{hs}{2}) - f(x- \frac{hs}{2})}{h}
\end{align}

i.e. this is the "derivative along a direction" \\(s\\); for differentiable \\(f\\), it equals \\(\nabla f(x)^{\top} s\\), as the short check below illustrates.

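A quick numerical check of both forms (the example function is our own):

```python
import numpy as np

def f(x):
    return x[0]**2 + 2 * x[1]**2   # example: grad f = (2*x1, 4*x2)

x = np.array([1.0, 2.0])
s = np.array([1.0, 1.0]) / np.sqrt(2)   # unit direction

# central-difference version of the limit above
h = 1e-6
numeric = (f(x + (h / 2) * s) - f(x - (h / 2) * s)) / h

# gradient form: the column vector dotted with the direction
analytic = np.array([2 * x[0], 4 * x[1]]) @ s

print(numeric, analytic)  # both ≈ 10 / sqrt(2) ≈ 7.0711
```
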


## Numerical Method {#numerical-method}


### Finite-Difference Method {#finite-difference-method}

Recall the Taylor series of \\(f(x+h)\\) about \\(x\\):

\begin{equation}
f(x+h) = f(x) + \frac{f'(x)}{1} h + \frac{f''(x)}{2!} h^{2} + \dots
\end{equation}

Moving it around to get \\(f'(x)\\) by itself:

\begin{equation}
f'(x)h = f(x+h) - f(x) - \frac{f''(x)}{2!} h^{2} - \dots
\end{equation}

So:

\begin{equation}
f'(x) \approx \frac{f(x+h)-f(x)}{h}
\end{equation}

where the dropped terms contribute an error of \\(O(h)\\) once we divide through by \\(h\\).

#### Two Sources of Error {#two-sources-of-error}

- \\(f(x+h) - f(x)\\): at small values of \\(h\\) the two evaluations agree in their leading digits, so the subtraction cancels them, because of **floating point errors**
- the \\(O(h)\\) truncation term, which is not accounted for in the end

### Complex-Difference Method {#complex-difference-method}

Consider a Taylor expansion with a complex step:

\begin{equation}
f(x + ih) = f(x) + ih f'(x) - h^{2} \frac{f''(x)}{2!} - ih^{3} \frac{f'''(x)}{3!} + \dots
\end{equation}

Let's again try to get \\(f'(x)\\) by itself. Taking the imaginary part of both sides keeps only every other term of the expansion, so:

\begin{equation}
f'(x) = \frac{\Im (f(x+ih))}{h} + O(h^{2})
\end{equation}

**NOTICE**: we no longer have the cancellation error, because we no longer have a subtraction. A small numerical comparison is below.
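
To see both error sources concretely, here is a small comparison on \\(f = \sin\\) (our own illustration):

```python
import numpy as np

f, x, h = np.sin, 1.0, 1e-10   # exact derivative is cos(1)

finite = (f(x + h) - f(x)) / h             # forward difference: O(h) + cancellation
complex_step = np.imag(f(x + 1j * h)) / h  # complex step: O(h^2), no subtraction

exact = np.cos(x)
print(abs(finite - exact))        # around 1e-6: dominated by floating-point cancellation
print(abs(complex_step - exact))  # around 1e-16: essentially machine precision
```
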


## [Automatic Differentiation]({{< relref "KBhautomatic_differentiation.md" >}}) {#automatic-differentiation--kbhautomatic-differentiation-dot-md}

See [Automatic Differentiation]({{< relref "KBhautomatic_differentiation.md" >}})

## Bracketing {#bracketing}

Given a unimodal function, the global minimum is guaranteed to be within \\([a,c]\\), with \\(b \in [a,c]\\), if we have that \\(f(a) > f(b) < f( c)\\).

So let's find this bracket.

### Unimodality {#unimodality}

A function \\(f\\) is unimodal if:

\\(\exists\\) a unique \\(x^{\*}\\) such that \\(f\\) is monotonically decreasing for \\(x \leq x^{\*}\\) and monotonically increasing for \\(x \geq x^{\*}\\)

### Bracketing Procedure {#bracketing-procedure}

If we don't know anything, we might as well start with \\(a=-1, b=0, c=1\\).

One of three things happens:

- we already satisfy \\(f(a) > f(b) < f( c)\\), in which case we are done
- if our left side \\(f(a)\\) is too low, we move \\(a\\) to the left without moving \\(c\\), doubling the step size every time until it works
- symmetrically, if our right side \\(f( c)\\) is too low, we move \\(c\\) to the right, again doubling the step size

A sketch of this procedure appears below.

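A Python sketch of the procedure (function and argument names are our own):

```python
def bracket_minimum(f, a=-1.0, b=0.0, c=1.0, step=1.0):
    """Expand [a, c] by doubling steps until f(a) > f(b) < f(c)."""
    while True:
        if f(a) > f(b) < f(c):   # valid bracket: a minimum lies in [a, c]
            return a, b, c
        if f(a) <= f(b):         # left side too low: push a further left
            a, step = a - step, step * 2
        else:                    # right side too low: push c further right
            c, step = c + step, step * 2

# e.g. bracket_minimum(lambda x: (x - 10)**2) expands rightward
# until the bracket contains x = 10
```

For typical unimodal \\(f\\), pushing an endpoint past the minimizer eventually raises its value above \\(f(b)\\), at which point the test succeeds.
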
#### Fibonacci Search {#fibonacci-search}

Say you can only evaluate \\(f\\) a finite number of times, and you want to shrink the bracketing interval as much as possible.

<!--list-separator-->

- Two Evaluations

    At two evaluations, you should pick two points right down the middle, very close together; this will cut your interval roughly in half.

<!--list-separator-->

- \\(n\\) evaluations

    Evaluate intervals with lengths given by the Fibonacci sequence:

    \begin{equation}
    F\_{n} =
    \begin{cases}
    1, & n\leq 2 \\\\
    F\_{n-1} + F\_{n-2}, & \text{otherwise}
    \end{cases}
    \end{equation}

#### Golden Section Search {#golden-section-search}

Shrink instead by the golden ratio, which is constant (it is the limit of the ratio between consecutive Fibonacci numbers). A sketch is below.
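
A Python sketch, shrinking by the constant factor \\(\varphi - 1 \approx 0.618\\) per evaluation (our own implementation of the standard scheme):

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio ≈ 1.618

def golden_section_search(f, a, c, n=30):
    """Shrink a bracket [a, c] on a unimodal f using n evaluations."""
    rho = PHI - 1                  # ≈ 0.618, the constant shrink factor
    d = rho * c + (1 - rho) * a    # interior point, 61.8% of the way to c
    yd = f(d)
    for _ in range(n - 1):
        b = rho * a + (1 - rho) * c
        yb = f(b)
        if yb < yd:
            c, d, yd = d, b, yb    # minimum lies between a and the old d
        else:
            a, c = c, b            # shrink from the other side (interval may reverse)
    return (a, c) if a < c else (c, a)

# e.g. golden_section_search(lambda x: (x - 2)**2, 0.0, 5.0)
# returns a tiny interval around x = 2
```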