deploy: ed0c671

Jemoka · Apr 3, 2024 · 140eb46
1 parent bc249eb
commit 140eb46
Showing 123 changed files with 123 additions and 123 deletions.
diff --git a/index.json b/index.json
diff --git a/posts/kbhaction_value_function/index.html b/posts/kbhaction_value_function/index.html
@@ -7,7 +7,7 @@
 &ldquo;the utility that gains the best action-value&rdquo;"><meta name=author content="Houjun Liu"><link rel=stylesheet href=/css/global.css><link rel=stylesheet href=/css/syntax.css></head><body><div class=center-clearfix><header><span id=header-name onclick='window.location.href="/"' style=cursor:pointer>Houjun Liu</span><div id=socialpanel><a href=https://www.jemoka.com/search/ class=header-social id=header-search><i class="ic fa-solid fa-magnifying-glass"></i></i></a>
 <a href=https://github.com/Jemoka/ class=header-social id=header-github><i class="ic fa-brands fa-github"></i></a>
 <a href=https://maly.io/@jemoka class=header-social id=header-twitter><i class="ic fa-brands fa-mastodon"></i></a>
-<a href=https://www.reddit.com/user/Jemoka/ class=header-social id=header-reddit><i class="ic fa-brands fa-reddit"></i></a></div></header><div id=title><h1>action-value function</h1><span class=tagbox></span></div><aside id=toc><h1 id=toc-title>table of contents</h1><nav id=TableOfContents><ul><li><a href=#value-function--kbhaction-value-function-dot-md><a href=HAHAHUGOSHORTCODE16s1HBHB>value function</a></a></li><li><a href=#value-function-policy>value-function policy</a></li><li><a href=#advantage>advantage</a></li></ul></nav></aside><main><article><div><p>Quality of taking a particular action at a state&mdash;&ldquo;expected discounted return when following a <a href=/posts/kbhpolicy/>policy</a> from \(s\) and taking \(a\)&rdquo;:</p><p>\begin{equation}
+<a href=https://www.reddit.com/user/Jemoka/ class=header-social id=header-reddit><i class="ic fa-brands fa-reddit"></i></a></div></header><div id=title><h1>action-value function</h1><span class=tagbox></span></div><aside id=toc><h1 id=toc-title>table of contents</h1><nav id=TableOfContents><ul><li><a href=#value-function--kbhaction-value-function-dot-md><a href=HAHAHUGOSHORTCODE15s1HBHB>value function</a></a></li><li><a href=#value-function-policy>value-function policy</a></li><li><a href=#advantage>advantage</a></li></ul></nav></aside><main><article><div><p>Quality of taking a particular action at a state&mdash;&ldquo;expected discounted return when following a <a href=/posts/kbhpolicy/>policy</a> from \(s\) and taking \(a\)&rdquo;:</p><p>\begin{equation}
 Q(s,a) = R(s,a) + \gamma \sum_{s'} T(s'|s,a) U(s')
 \end{equation}</p><p>where, \(T\) is the transition probability from \(s\) to \(s'\) given action \(a\).</p><h2 id=value-function--kbhaction-value-function-dot-md><a href=/posts/kbhaction_value_function/>value function</a></h2><p>Therefore, the <a href=/posts/kbhutility_theory/>utility</a> of being in a state (called the <a href=/posts/kbhaction_value_function/>value function</a>) is:</p><p>\begin{equation}
 U(s) = \max_{a} Q(s,a)
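
The two equations in this hunk can be sketched in Python as follows. This is only an illustration under assumed inputs: the dictionary layout for `R`, `T`, and `U`, the discount `gamma`, and the toy two-state MDP are hypothetical and do not come from the post.

```python
def action_value(s, a, R, T, U, gamma):
    """Q(s,a) = R(s,a) + gamma * sum over s' of T(s'|s,a) * U(s')."""
    expected_next = sum(p * U[s_next] for s_next, p in T[(s, a)].items())
    return R[(s, a)] + gamma * expected_next


def value(s, actions, R, T, U, gamma):
    """U(s) = max over a of Q(s,a)."""
    return max(action_value(s, a, R, T, U, gamma) for a in actions)


# Hypothetical toy MDP (assumed values, for illustration only).
R = {("s0", "stay"): 0.0, ("s0", "go"): 1.0}
T = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "go"): {"s0": 0.5, "s1": 0.5},
}
U = {"s0": 2.0, "s1": 4.0}  # assumed utilities of the successor states
gamma = 0.9

q_stay = action_value("s0", "stay", R, T, U, gamma)  # 0.0 + 0.9 * 2.0
q_go = action_value("s0", "go", R, T, U, gamma)      # 1.0 + 0.9 * (0.5*2.0 + 0.5*4.0)
u_s0 = value("s0", ["stay", "go"], R, T, U, gamma)   # the larger of the two
```

Storing `T` as a mapping from `(s, a)` to a distribution over next states keeps the sum over \(s'\) restricted to reachable states.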

diff --git a/posts/kbhalpha_vector/index.html b/posts/kbhalpha_vector/index.html
@@ -9,7 +9,7 @@
 At every belief \(b\), there is a policy which has the highest \(U(b)\) at that \(b\) given by the alpha vector formulation."><meta name=author content="Houjun Liu"><link rel=stylesheet href=/css/global.css><link rel=stylesheet href=/css/syntax.css></head><body><div class=center-clearfix><header><span id=header-name onclick='window.location.href="/"' style=cursor:pointer>Houjun Liu</span><div id=socialpanel><a href=https://www.jemoka.com/search/ class=header-social id=header-search><i class="ic fa-solid fa-magnifying-glass"></i></i></a>
 <a href=https://github.com/Jemoka/ class=header-social id=header-github><i class="ic fa-brands fa-github"></i></a>
 <a href=https://maly.io/@jemoka class=header-social id=header-twitter><i class="ic fa-brands fa-mastodon"></i></a>
-<a href=https://www.reddit.com/user/Jemoka/ class=header-social id=header-reddit><i class="ic fa-brands fa-reddit"></i></a></div></header><div id=title><h1>alpha vector</h1><span class=tagbox></span></div><aside id=toc><h1 id=toc-title>table of contents</h1><nav id=TableOfContents><ul><li><a href=#additional-information>Additional Information</a><ul><li><a href=#top-action>top action</a></li><li><a href=#optimal-value-function-for-pomdp--kbhconditional-plan-dot-md--with-alpha-vector--kbhalpha-vector-dot-md><a href=HAHAHUGOSHORTCODE65s9HBHB>optimal value function for POMDP</a> with <a href=HAHAHUGOSHORTCODE65s10HBHB>alpha vector</a></a></li><li><a href=#one-step-lookahead-in-pomdp>one-step lookahead in POMDP</a></li><li><a href=#alpha-vector--kbhalpha-vector-dot-md--pruning><a href=HAHAHUGOSHORTCODE65s16HBHB>alpha vector</a> pruning</a></li></ul></li></ul></nav></aside><main><article><div><p>Recall, from <a href=/posts/kbhconditional_plan/#id-6f19368f-74b5-4606-a882-ec9bc5619873-conditional-plan-evaluation>conditional plan evaluation</a>, we had that:</p><p>\begin{equation}
+<a href=https://www.reddit.com/user/Jemoka/ class=header-social id=header-reddit><i class="ic fa-brands fa-reddit"></i></a></div></header><div id=title><h1>alpha vector</h1><span class=tagbox></span></div><aside id=toc><h1 id=toc-title>table of contents</h1><nav id=TableOfContents><ul><li><a href=#additional-information>Additional Information</a><ul><li><a href=#top-action>top action</a></li><li><a href=#optimal-value-function-for-pomdp--kbhconditional-plan-dot-md--with-alpha-vector--kbhalpha-vector-dot-md><a href=HAHAHUGOSHORTCODE64s9HBHB>optimal value function for POMDP</a> with <a href=HAHAHUGOSHORTCODE64s10HBHB>alpha vector</a></a></li><li><a href=#one-step-lookahead-in-pomdp>one-step lookahead in POMDP</a></li><li><a href=#alpha-vector--kbhalpha-vector-dot-md--pruning><a href=HAHAHUGOSHORTCODE64s16HBHB>alpha vector</a> pruning</a></li></ul></li></ul></nav></aside><main><article><div><p>Recall, from <a href=/posts/kbhconditional_plan/#id-6f19368f-74b5-4606-a882-ec9bc5619873-conditional-plan-evaluation>conditional plan evaluation</a>, we had that:</p><p>\begin{equation}
 U^{\pi}(b) = \sum_{s}^{} b(s) U^{\pi}(s)
 \end{equation}</p><p>let&rsquo;s write it as:</p><p>\begin{equation}
 U^{\pi}(b) = \sum_{s}^{} b(s) U^{\pi}(s) = {\alpha_{\pi}}^{\top} b

diff --git a/posts/kbhangelman_syndrome/index.html b/posts/kbhangelman_syndrome/index.html
@@ -3,4 +3,4 @@
 cause of Angelman Syndrome Angelman Syndrome is primarily caused by the UBE3A and the ubiquitin proteasome system. Poly-ubiquitin chain asks to discard cells."><meta name=author content="Houjun Liu"><link rel=stylesheet href=/css/global.css><link rel=stylesheet href=/css/syntax.css></head><body><div class=center-clearfix><header><span id=header-name onclick='window.location.href="/"' style=cursor:pointer>Houjun Liu</span><div id=socialpanel><a href=https://www.jemoka.com/search/ class=header-social id=header-search><i class="ic fa-solid fa-magnifying-glass"></i></i></a>
 <a href=https://github.com/Jemoka/ class=header-social id=header-github><i class="ic fa-brands fa-github"></i></a>
 <a href=https://maly.io/@jemoka class=header-social id=header-twitter><i class="ic fa-brands fa-mastodon"></i></a>
-<a href=https://www.reddit.com/user/Jemoka/ class=header-social id=header-reddit><i class="ic fa-brands fa-reddit"></i></a></div></header><div id=title><h1>Angelman Syndrome</h1><span class=tagbox></span></div><aside id=toc><h1 id=toc-title>table of contents</h1><nav id=TableOfContents><ul><li><a href=#cause-of-angelman-syndrome--kbhangelman-syndrome-dot-md>cause of <a href=HAHAHUGOSHORTCODE83s1HBHB>Angelman Syndrome</a></a></li></ul></nav></aside><main><article><div><p><a href=/posts/kbhangelman_syndrome/>Angelman Syndrome</a> is a ~1 in 15000, clinically recognizable, developmental delay syndrome.</p><h2 id=cause-of-angelman-syndrome--kbhangelman-syndrome-dot-md>cause of <a href=/posts/kbhangelman_syndrome/>Angelman Syndrome</a></h2><p><a href=/posts/kbhangelman_syndrome/>Angelman Syndrome</a> is primarily caused by the <a href>UBE3A</a> and the <a href>ubiquitin proteasome system.</a> Poly-<a href>ubiquitin</a> chain asks to discard cells.</p></div></article></main><footer><p id=footer>&copy; 2019-2024 Houjun Liu. Licensed CC BY-NC-SA 4.0.</p></footer></div></body></html>
+<a href=https://www.reddit.com/user/Jemoka/ class=header-social id=header-reddit><i class="ic fa-brands fa-reddit"></i></a></div></header><div id=title><h1>Angelman Syndrome</h1><span class=tagbox></span></div><aside id=toc><h1 id=toc-title>table of contents</h1><nav id=TableOfContents><ul><li><a href=#cause-of-angelman-syndrome--kbhangelman-syndrome-dot-md>cause of <a href=HAHAHUGOSHORTCODE77s1HBHB>Angelman Syndrome</a></a></li></ul></nav></aside><main><article><div><p><a href=/posts/kbhangelman_syndrome/>Angelman Syndrome</a> is a ~1 in 15000, clinically recognizable, developmental delay syndrome.</p><h2 id=cause-of-angelman-syndrome--kbhangelman-syndrome-dot-md>cause of <a href=/posts/kbhangelman_syndrome/>Angelman Syndrome</a></h2><p><a href=/posts/kbhangelman_syndrome/>Angelman Syndrome</a> is primarily caused by the <a href>UBE3A</a> and the <a href>ubiquitin proteasome system.</a> Poly-<a href>ubiquitin</a> chain asks to discard cells.</p></div></article></main><footer><p id=footer>&copy; 2019-2024 Houjun Liu. Licensed CC BY-NC-SA 4.0.</p></footer></div></body></html>
diff --git a/posts/kbhapproximate_inference/index.html b/posts/kbhapproximate_inference/index.html
@@ -7,7 +7,7 @@
 Step 2: sample from \(B,S\) We sample \(B\). We sampled that \(B=1\) today. We sample \(S\). We sampled that \(S=0\) today. Step 3: sample from \(E\) We sample \(E\) GIVEN what we already sampled, that \(B=1, S=0\); we sampled that \(E = 1\) Step 4: sample from \(D, C\) We sample \(D\) given that \(E=1\) as we sampled."><meta name=author content="Houjun Liu"><link rel=stylesheet href=/css/global.css><link rel=stylesheet href=/css/syntax.css></head><body><div class=center-clearfix><header><span id=header-name onclick='window.location.href="/"' style=cursor:pointer>Houjun Liu</span><div id=socialpanel><a href=https://www.jemoka.com/search/ class=header-social id=header-search><i class="ic fa-solid fa-magnifying-glass"></i></i></a>
 <a href=https://github.com/Jemoka/ class=header-social id=header-github><i class="ic fa-brands fa-github"></i></a>
 <a href=https://maly.io/@jemoka class=header-social id=header-twitter><i class="ic fa-brands fa-mastodon"></i></a>
-<a href=https://www.reddit.com/user/Jemoka/ class=header-social id=header-reddit><i class="ic fa-brands fa-reddit"></i></a></div></header><div id=title><h1>approximate inference</h1><span class=tagbox></span></div><aside id=toc><h1 id=toc-title>table of contents</h1><nav id=TableOfContents><ul><li><a href=#direct-sampling--kbhdirect-sampling-dot-md><a href=HAHAHUGOSHORTCODE89s0HBHB>Direct Sampling</a></a><ul><li><a href=#example>Example</a></li></ul></li><li><a href=#likelihood-weighted-sampling--kbhdirect-sampling-dot-md><a href=HAHAHUGOSHORTCODE89s6HBHB>Likelihood Weighted Sampling</a></a><ul><li><a href=#example>Example</a></li></ul></li></ul></nav></aside><main><article><div><h2 id=direct-sampling--kbhdirect-sampling-dot-md><a href=/posts/kbhdirect_sampling/>Direct Sampling</a></h2><p><a href=/posts/kbhdirect_sampling/>Direct Sampling</a> is an <a href=/posts/kbhapproximate_inference/>approximate inference</a> method where we pull samples from the given <a href=/posts/kbhjoint_probability_distribution/>joint probability distribution</a>.</p><h3 id=example>Example</h3><p>Suppose we are interested in:</p><figure><img src=/ox-hugo/2023-10-05_09-19-51_screenshot.png></figure><p>where we desire \(P(B^{1}|D^{1},C^{1})\).</p><h4 id=step-1-sort>Step 1: sort</h4><p>We obtain a <a href=/posts/kbhtopological_sort/>topological sort</a> of this network:</p><p>\begin{equation}
+<a href=https://www.reddit.com/user/Jemoka/ class=header-social id=header-reddit><i class="ic fa-brands fa-reddit"></i></a></div></header><div id=title><h1>approximate inference</h1><span class=tagbox></span></div><aside id=toc><h1 id=toc-title>table of contents</h1><nav id=TableOfContents><ul><li><a href=#direct-sampling--kbhdirect-sampling-dot-md><a href=HAHAHUGOSHORTCODE88s0HBHB>Direct Sampling</a></a><ul><li><a href=#example>Example</a></li></ul></li><li><a href=#likelihood-weighted-sampling--kbhdirect-sampling-dot-md><a href=HAHAHUGOSHORTCODE88s6HBHB>Likelihood Weighted Sampling</a></a><ul><li><a href=#example>Example</a></li></ul></li></ul></nav></aside><main><article><div><h2 id=direct-sampling--kbhdirect-sampling-dot-md><a href=/posts/kbhdirect_sampling/>Direct Sampling</a></h2><p><a href=/posts/kbhdirect_sampling/>Direct Sampling</a> is an <a href=/posts/kbhapproximate_inference/>approximate inference</a> method where we pull samples from the given <a href=/posts/kbhjoint_probability_distribution/>joint probability distribution</a>.</p><h3 id=example>Example</h3><p>Suppose we are interested in:</p><figure><img src=/ox-hugo/2023-10-05_09-19-51_screenshot.png></figure><p>where we desire \(P(B^{1}|D^{1},C^{1})\).</p><h4 id=step-1-sort>Step 1: sort</h4><p>We obtain a <a href=/posts/kbhtopological_sort/>topological sort</a> of this network:</p><p>\begin{equation}
 B, S, E, D, C
 \end{equation}</p><h4 id=step-2-sample-from-b-s>Step 2: sample from \(B,S\)</h4><ul><li>We sample \(B\). We sampled that \(B=1\) today.</li><li>We sample \(S\). We sampled that \(S=0\) today.</li></ul><h4 id=step-3-sample-from-e>Step 3: sample from \(E\)</h4><ul><li>We sample \(E\) <strong>GIVEN</strong> what we already sampled, that \(B=1, S=0\); we sampled that \(E = 1\)</li></ul><h4 id=step-4-sample-from-d-c>Step 4: sample from \(D, C\)</h4><ul><li>We sample \(D\) given that \(E=1\) as we sampled.</li><li>We sample \(C\) given that \(E=1\) as we sampled.</li></ul><h4 id=repeat>Repeat</h4><p>Repeat steps 2-4</p><h4 id=step-n-analyze>Step n: Analyze</h4><table><thead><tr><th>B</th><th>S</th><th>E</th><th>D</th><th>C</th></tr></thead><tbody><tr><td>1</td><td>0</td><td>1</td><td>0</td><td>1</td></tr><tr><td>0</td><td>1</td><td>1</td><td>0</td><td>0</td></tr><tr><td>1</td><td>1</td><td>1</td><td>1</td><td>0</td></tr><tr><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td></tr><tr><td>1</td><td>0</td><td>1</td><td>1</td><td>1</td></tr></tbody></table><p>We desire to know \(P(b^{1}|d^{1}, c^{1})\). In this table, the only row with \(D=1, C=1\) has \(B=1\), so the estimate is \(100\%\).</p><h2 id=likelihood-weighted-sampling--kbhdirect-sampling-dot-md><a href=/posts/kbhdirect_sampling/#likelihood-weighted-sampling>Likelihood Weighted Sampling</a></h2><p><a href=/posts/kbhdirect_sampling/#likelihood-weighted-sampling>Likelihood Weighted Sampling</a> is a sampling approach whereby you force the values that you want, and then weight each sample by the chance of those values happening.</p><p>This is <strong>super useful</strong> when our evidence is unlikely.</p><h3 id=example>Example</h3><p>Suppose again you are interested in \(P(b^{1}|d^{1}, c^{1})\). 
In this case, we only sample \(B,S,E\):</p><table><thead><tr><th>B</th><th>S</th><th>E</th></tr></thead><tbody><tr><td>0</td><td>1</td><td>0</td></tr><tr><td>1</td><td>0</td><td>1</td></tr></tbody></table><p>Now, for each of these results, we then compute the chance of our evidence happening given the samples.</p><ul><li>Row 1: \(p(d^{1}|e^{0})p(c^{1}|e^{0})\)</li><li>Row 2: \(p(d^{1}|e^{1})p(c^{1}|e^{1})\)</li></ul><p>Let&rsquo;s say:</p><ul><li>Row 1: \(p(d^{1}|e^{0})p(c^{1}|e^{0})=0.3\)</li><li>Row 2: \(p(d^{1}|e^{1})p(c^{1}|e^{1})=0.9\)</li></ul><p>Finally, to compute \(p(b^{1}|d^{1}, c^{1})\):</p><p>\begin{equation}
 \frac{0.9}{0.9+0.3}
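
The final ratio in this hunk can be sketched in Python as a weighted count. The function name `likelihood_weighted_estimate` and the dictionary layout are hypothetical; the two sampled rows and the weights 0.3 and 0.9 are the ones assumed in the example above.

```python
def likelihood_weighted_estimate(samples, weights, query):
    """Weighted fraction of samples satisfying `query`.

    Each weight is the probability of the forced evidence given that sample.
    """
    total = sum(weights)
    matching = sum(w for s, w in zip(samples, weights) if query(s))
    return matching / total


# The two sampled rows of (B, S, E) from the example.
samples = [
    {"B": 0, "S": 1, "E": 0},
    {"B": 1, "S": 0, "E": 1},
]
# Assumed evidence likelihoods p(d1|e) * p(c1|e) for each row.
weights = [0.3, 0.9]

# Estimate of P(b1 | d1, c1): weight of rows with B=1 over total weight,
# i.e. 0.9 / (0.9 + 0.3).
estimate = likelihood_weighted_estimate(samples, weights, lambda s: s["B"] == 1)
```

With only two samples the estimate is very noisy; in practice many more rows would be drawn before taking the ratio.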

diff --git a/posts/kbhargmax/index.html b/posts/kbhargmax/index.html
@@ -6,6 +6,6 @@
 additional information argmax of log see argmax of log"><meta name=author content="Houjun Liu"><link rel=stylesheet href=/css/global.css><link rel=stylesheet href=/css/syntax.css></head><body><div class=center-clearfix><header><span id=header-name onclick='window.location.href="/"' style=cursor:pointer>Houjun Liu</span><div id=socialpanel><a href=https://www.jemoka.com/search/ class=header-social id=header-search><i class="ic fa-solid fa-magnifying-glass"></i></i></a>
 <a href=https://github.com/Jemoka/ class=header-social id=header-github><i class="ic fa-brands fa-github"></i></a>
 <a href=https://maly.io/@jemoka class=header-social id=header-twitter><i class="ic fa-brands fa-mastodon"></i></a>
-<a href=https://www.reddit.com/user/Jemoka/ class=header-social id=header-reddit><i class="ic fa-brands fa-reddit"></i></a></div></header><div id=title><h1>argmax</h1><span class=tagbox></span></div><aside id=toc><h1 id=toc-title>table of contents</h1><nav id=TableOfContents><ul><li><a href=#finding-argmax>finding argmax</a><ul><li><a href=#direct-optimization--kbhoptimization-dot-md>direct <a href=HAHAHUGOSHORTCODE95s0HBHB>optimization</a></a></li><li><a href=#gradient-ascent>gradient ascent</a></li></ul></li><li><a href=#additional-information>additional information</a><ul><li><a href=#argmax-of-log--kbhmaximum-likelihood-parameter-learning-dot-md><a href=HAHAHUGOSHORTCODE95s1HBHB>argmax of log</a></a></li></ul></li></ul></nav></aside><main><article><div><p>function that returns the input that maximizes the expression.</p><h2 id=finding-argmax>finding argmax</h2><h3 id=direct-optimization--kbhoptimization-dot-md>direct <a href=/posts/kbhoptimization/>optimization</a></h3><p>Typical maximization system. Take derivative, set it to 0, solve, plug in, solve. This works poorly when the function is not differentiable.</p><h3 id=gradient-ascent>gradient ascent</h3><p>We take steps following the direction</p><p>\begin{equation}
+<a href=https://www.reddit.com/user/Jemoka/ class=header-social id=header-reddit><i class="ic fa-brands fa-reddit"></i></a></div></header><div id=title><h1>argmax</h1><span class=tagbox></span></div><aside id=toc><h1 id=toc-title>table of contents</h1><nav id=TableOfContents><ul><li><a href=#finding-argmax>finding argmax</a><ul><li><a href=#direct-optimization--kbhoptimization-dot-md>direct <a href=HAHAHUGOSHORTCODE94s0HBHB>optimization</a></a></li><li><a href=#gradient-ascent>gradient ascent</a></li></ul></li><li><a href=#additional-information>additional information</a><ul><li><a href=#argmax-of-log--kbhmaximum-likelihood-parameter-learning-dot-md><a href=HAHAHUGOSHORTCODE94s1HBHB>argmax of log</a></a></li></ul></li></ul></nav></aside><main><article><div><p>function that returns the input that maximizes the expression.</p><h2 id=finding-argmax>finding argmax</h2><h3 id=direct-optimization--kbhoptimization-dot-md>direct <a href=/posts/kbhoptimization/>optimization</a></h3><p>Typical maximization system. Take derivative, set it to 0, solve, plug in, solve. This works poorly when the function is not differentiable.</p><h3 id=gradient-ascent>gradient ascent</h3><p>We take steps following the direction</p><p>\begin{equation}
 \theta_{1j} = \theta_{0j} + \eta \pdv{LL(\theta_{0})}{\theta_{0j}}
 \end{equation}</p><h2 id=additional-information>additional information</h2><h3 id=argmax-of-log--kbhmaximum-likelihood-parameter-learning-dot-md><a href=/posts/kbhmaximum_likelihood_parameter_learning/#argmax-of-log>argmax of log</a></h3><p>see <a href=/posts/kbhmaximum_likelihood_parameter_learning/#argmax-of-log>argmax of log</a></p></div></article></main><footer><p id=footer>&copy; 2019-2024 Houjun Liu. Licensed CC BY-NC-SA 4.0.</p></footer></div></body></html>
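
The gradient ascent update in the argmax post, \(\theta_{1j} = \theta_{0j} + \eta \, \partial LL / \partial \theta_{0j}\), might be sketched in Python like this. The objective, its gradient `grad_ll`, the step size `eta`, and the step count are all assumptions chosen for illustration; none of them come from the post.

```python
def gradient_ascent(grad, theta0, eta=0.1, steps=100):
    """Repeat theta_j <- theta_j + eta * dLL/dtheta_j for a fixed number of steps."""
    theta = list(theta0)
    for _ in range(steps):
        g = grad(theta)
        theta = [t_j + eta * g_j for t_j, g_j in zip(theta, g)]
    return theta


def grad_ll(theta):
    """Gradient of the assumed objective LL(theta) = -(theta_0 - 3)^2 - (theta_1 + 1)^2."""
    return [-2.0 * (theta[0] - 3.0), -2.0 * (theta[1] + 1.0)]


# Starting from the origin, the iterates approach the maximizer (3, -1).
theta_hat = gradient_ascent(grad_ll, [0.0, 0.0])
```

A fixed step size is the simplest choice here; it converges for this concave quadratic but can overshoot on other objectives.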