From fadbaae81d8514bed79831a45e43fbce517fc0a5 Mon Sep 17 00:00:00 2001 From: Houjun Liu Date: Sat, 5 Oct 2024 18:26:28 -0700 Subject: [PATCH] kb autocommit --- content/posts/KBhbinary_operation.md | 2 +- content/posts/KBhcombinator_calculus.md | 5 + content/posts/KBhconfluence.md | 4 +- content/posts/KBhmatricies.md | 6 +- content/posts/KBhquestions_for_omer.md | 3 + content/posts/KBhquotient_group.md | 4 +- .../posts/KBhregular_expression_complexity.md | 6 +- content/posts/KBhsu_cs120_oct012024.md | 36 +++ content/posts/KBhsu_cs238_sep252024.md | 2 +- content/posts/KBhsu_cs242.md | 4 +- content/posts/KBhsu_cs242_oct032024.md | 216 ++++++++++++++++++ content/posts/KBhsum_of_subsets.md | 4 +- 12 files changed, 277 insertions(+), 15 deletions(-) create mode 100644 content/posts/KBhsu_cs120_oct012024.md create mode 100644 content/posts/KBhsu_cs242_oct032024.md diff --git a/content/posts/KBhbinary_operation.md b/content/posts/KBhbinary_operation.md index a0ab4bd2a..87c91a69d 100644 --- a/content/posts/KBhbinary_operation.md +++ b/content/posts/KBhbinary_operation.md @@ -10,4 +10,4 @@ A [binary operation]({{< relref "KBhbinary_operation.md" >}}) means that you are f: (\mathbb{F},\mathbb{F}) \to \mathbb{F} \end{equation} -This is also [closed]({{< relref "KBhclosed.md" >}}), but [binary operation]({{< relref "KBhbinary_operation.md" >}})s dons't have to be. \ No newline at end of file +This is also [closed]({{< relref "KBhclosed.md" >}}), but [binary operation]({{< relref "KBhbinary_operation.md" >}})s don't have to be.
diff --git a/content/posts/KBhcombinator_calculus.md b/content/posts/KBhcombinator_calculus.md index 3c7d412b2..587417cab 100644 --- a/content/posts/KBhcombinator_calculus.md +++ b/content/posts/KBhcombinator_calculus.md @@ -12,6 +12,11 @@ combinator is a variable free programming language; it is a turing complete comp - allows for illustration of ideas +## combinator {#combinator} + +a [combinator](#combinator) is a function with no [free variables]({{< relref "KBhsu_cs242_oct032024.md#free-variables" >}}) + + ## Why do we care? {#why-do-we-care} - no variables! its entirely compositional diff --git a/content/posts/KBhconfluence.md b/content/posts/KBhconfluence.md index bb416e2e3..3b5074eaf 100644 --- a/content/posts/KBhconfluence.md +++ b/content/posts/KBhconfluence.md @@ -4,7 +4,7 @@ author = ["Houjun Liu"] draft = false +++ -could a different choice of evaluation order change the terminating result of the program. +Could a different choice of evaluation order change the **terminating result** of the program? Note that this says nothing about whether a particular evaluation order terminates. ## constituents {#constituents} @@ -35,7 +35,7 @@ go build a grid using the one step diamond property {{< figure src="/ox-hugo/2024-10-01_09-49-32_screenshot.png" >}} -### SKI exhibits [one-step diamond property](#one-step-diamond-property) {#ski-exhibits-one-step-diamond-property--org237410e} +### SKI exhibits [one-step diamond property](#one-step-diamond-property) {#ski-exhibits-one-step-diamond-property--org19aa3ca} We can't naively apply [one-step diamond property](#one-step-diamond-property).
diff --git a/content/posts/KBhmatricies.md b/content/posts/KBhmatricies.md index 7418862ac..3444e5a8e 100644 --- a/content/posts/KBhmatricies.md +++ b/content/posts/KBhmatricies.md @@ -74,7 +74,7 @@ According to Jana, a third grader can add and scalar multiply [matricies]({{< re However, what's interesting is the fact that they actually work: - Suppose \\(S,T \in \mathcal{L}(V,W)\\), then \\(\mathcal{M}(S+T) = \mathcal{M}(S)+\mathcal{M}(T)\\) -- Suppose \\(\lambda \in \mathbb{F}, T \in \mathcal{L}(V,W)\\), then \\(\mathcal{M}(\lambdaT) = \lambda \mathcal{M}(T)\\) +- Suppose \\(\lambda \in \mathbb{F}, T \in \mathcal{L}(V,W)\\), then \\(\mathcal{M}(\lambda T) = \lambda \mathcal{M}(T)\\) The verification of this result, briefly, is that: @@ -119,9 +119,9 @@ See [determinants]({{< relref "KBhdeterminants.md" >}}) See [Gaussian elimination]({{< relref "KBhgaussian_elimination.md" >}}) -### [diagonal matrix]({{< relref "KBhdiagonal_matrix.md#diagonal-matrix" >}}) {#diagonal-matrix--kbhdiagonal-matrix-dot-md} +### [diagonal matrix]({{< relref "KBhdiagonal_matrix.md" >}}) {#diagonal-matrix--kbhdiagonal-matrix-dot-md} -see [diagonal matrix]({{< relref "KBhdiagonal_matrix.md#diagonal-matrix" >}}) +see [diagonal matrix]({{< relref "KBhdiagonal_matrix.md" >}}) ### [upper-triangular matricies]({{< relref "KBhupper_triangular_matrix.md" >}}) {#upper-triangular-matricies--kbhupper-triangular-matrix-dot-md} diff --git a/content/posts/KBhquestions_for_omer.md b/content/posts/KBhquestions_for_omer.md index 9bc1da164..82e2ca9c5 100644 --- a/content/posts/KBhquestions_for_omer.md +++ b/content/posts/KBhquestions_for_omer.md @@ -4,6 +4,9 @@ author = ["Houjun Liu"] draft = false +++ +## Week 3 {#week-3} + + ## Week 2 {#week-2} - is concatenation commutative? no, right? 
but the symbol used \\(\cdot\\) is typically communative diff --git a/content/posts/KBhquotient_group.md b/content/posts/KBhquotient_group.md index bf8497e16..5841a347d 100644 --- a/content/posts/KBhquotient_group.md +++ b/content/posts/KBhquotient_group.md @@ -23,7 +23,7 @@ We can use the [subgroup]({{< relref "KBhsubgroup.md" >}}) above to mask out a [ For instance, the \\(\mod 3\\) quotient group is written as: \begin{equation} -\mathbb{Z}} / 3 \mathbb{Z} +\mathbb{Z} / 3 \mathbb{Z} \end{equation} -Each element in this new group is a set; for instance, in \\(\mathbb{Z} / 3\mathbb{Z}\\), \\(0\\) is actually the set \\(\\{\dots -6, -3, 0, 3, 6, \dots\\}\\) (i.e. the [subgroup]({{< relref "KBhsubgroup.md" >}}) that we were masking by). Other elements in the quotient space ("1", a.k.a. \\(\\{ \dots, -2, 1, 4, 7 \dots \\}\\), or "2", a.k.a. \\(\\{\dots, -1, 2, 5, 8 \dots \\}\\)) are called "cosets" of \\(3 \mathbb{Z}\\). You will notice they are not a [subgroup]({{< relref "KBhsubgroup.md" >}})s. \ No newline at end of file +Each element in this new group is a set; for instance, in \\(\mathbb{Z} / 3\mathbb{Z}\\), \\(0\\) is actually the set \\(\\{\dots -6, -3, 0, 3, 6, \dots\\}\\) (i.e. the [subgroup]({{< relref "KBhsubgroup.md" >}}) that we were masking by). Other elements in the quotient space ("1", a.k.a. \\(\\{ \dots, -2, 1, 4, 7 \dots \\}\\), or "2", a.k.a. \\(\\{\dots, -1, 2, 5, 8 \dots \\}\\)) are called "cosets" of \\(3 \mathbb{Z}\\). You will notice they are not [subgroup]({{< relref "KBhsubgroup.md" >}})s.
diff --git a/content/posts/KBhregular_expression_complexity.md b/content/posts/KBhregular_expression_complexity.md index cadf3728c..1308896a1 100644 --- a/content/posts/KBhregular_expression_complexity.md +++ b/content/posts/KBhregular_expression_complexity.md @@ -15,7 +15,7 @@ Let \\(\Sigma\\) be an alphabet, we define the [regular expression]({{< relref " - for all \\(\sigma \in \Sigma\\), \\(\sigma\\) is a regexp - \\(\varepsilon\\) (empty string) is a regexp -- \\(\varnothing\\) (empty set) is a regexp +- \\(\emptyset\\) (empty set) is a regexp If \\(R\_1\\), \\(R\_2\\) are regexps, then: @@ -33,9 +33,9 @@ Operator precidence: ### regexp representing [language]({{< relref "KBhalphabet.md" >}})s {#regexp-representing-language--kbhalphabet-dot-md--s} -The regexp \\(\sigma \in \Sigma\\) represents the language \\(\qty {\sigma}\\), the regexp \\(\varepsilon\\) represents the language \\(\qty {\epsilon }\\), the regexp \\(\varnothing\\) represents the language \\(\varnothing\\). +The regexp \\(\sigma \in \Sigma\\) represents the language \\(\qty {\sigma}\\), the regexp \\(\varepsilon\\) represents the language \\(\qty {\epsilon }\\), the regexp \\(\emptyset\\) represents the language \\(\emptyset\\). 
-for \\(R\_1, R\_2\\) being [regular expression]({{< relref "KBhregular_expression_complexity.md" >}})s representing a particular [regular language]({{< relref "KBhregular_language.md" >}}) $L_1, L_2$… +for \\(R\_1, R\_2\\) being [regular expression]({{< relref "KBhregular_expression_complexity.md" >}})s representing a particular [regular language]({{< relref "KBhregular_language.md" >}}) \\(L\_1, L\_2\dots\\) - \\((R\_1 \cdot R\_2)\\) represents concatenation \\(L\_2 \cdot L\_2\\) in the language - \\((R\_1 + R\_2)\\) represents union \\(L\_1 \cup L\_2\\) in the language diff --git a/content/posts/KBhsu_cs120_oct012024.md b/content/posts/KBhsu_cs120_oct012024.md new file mode 100644 index 000000000..e2b4c6966 --- /dev/null +++ b/content/posts/KBhsu_cs120_oct012024.md @@ -0,0 +1,36 @@ ++++ +title = "SU-CS120 OCT012024" +author = ["Houjun Liu"] +draft = false ++++ + +## specification gaming {#specification-gaming} + +[specification gaming](#specification-gaming), or reward hacking, is the phenomenon where a system runs suboptimally because it exploited an underspecified part of the reward. + + +## challenges {#challenges} + +- sparse rewards +- partial observability +- dynamic rewards (and reward shifting) +- sim-to-real transfer is hard +- computational costs +- [specification gaming](#specification-gaming) + + +## AI alignment {#ai-alignment} + +[AI alignment](#ai-alignment) ensures that AI systems are aligned with human values and interests. + +there is a spectrum of unexpected solutions: undesirable novel solutions and desirable novel solutions + + +## Problems with RLHF {#problems-with-rlhf} + +- RLHF degrades model quality + + +## Goodharting {#goodharting} + +Overfitting!! is an example of goodharting.
diff --git a/content/posts/KBhsu_cs238_sep252024.md b/content/posts/KBhsu_cs238_sep252024.md index cec479abf..c7c5935a0 100644 --- a/content/posts/KBhsu_cs238_sep252024.md +++ b/content/posts/KBhsu_cs238_sep252024.md @@ -365,7 +365,7 @@ Note that the denominator is exactly the same P(p^{2} | h^{3}) = \frac{P(p^{2}, h^{3})}{P(h^{3})} \end{equation} -Our numerator is \\(P(p^{2}, h^{3}) = P(h^{3}|p^{2}}) P(h^{3})\\). The left value is \\(1\\), and the right value is still \\(\frac{1}{3}\\). Plugging it in: +Our numerator is \\(P(p^{2}, h^{3}) = P(h^{3}|p^{2}) P(p^{2})\\). The left value is \\(1\\), and the right value is still \\(\frac{1}{3}\\). Plugging it in: \begin{equation} P(p^{2} | h^{3}) = \frac{1 \frac{1}{3}}{\frac{1}{2} \frac{1}{3} + 1 \frac{1}{3}} = \frac{2}{3} diff --git a/content/posts/KBhsu_cs242.md b/content/posts/KBhsu_cs242.md index 0c8373a53..48502137f 100644 --- a/content/posts/KBhsu_cs242.md +++ b/content/posts/KBhsu_cs242.md @@ -20,6 +20,8 @@ draft = false - [SU-CS242 SEP242024]({{< relref "KBhsu_cs242_sep242024.md" >}}) -### [Combinator Calculus]({{< relref "KBhsu_cs242_sep262024.md#combinator-calculus" >}}) {#combinator-calculus--kbhsu-cs242-sep262024-dot-md} +### [Combinator Calculus]({{< relref "KBhcombinator_calculus.md" >}}) {#combinator-calculus--kbhcombinator-calculus-dot-md} - [SU-CS242 SEP262024]({{< relref "KBhsu_cs242_sep262024.md" >}}) +- [SU-CS242 OCT012024]({{< relref "KBhsu_cs242_oct012024.md" >}}) +- [SU-CS242 OCT032024]({{< relref "KBhsu_cs242_oct032024.md" >}}) diff --git a/content/posts/KBhsu_cs242_oct032024.md b/content/posts/KBhsu_cs242_oct032024.md new file mode 100644 index 000000000..c0444588e --- /dev/null +++ b/content/posts/KBhsu_cs242_oct032024.md @@ -0,0 +1,216 @@ ++++ +title = "SU-CS242 OCT032024" +author = ["Houjun Liu"] +draft = false ++++ + +## Lambda Calculus {#lambda-calculus} + +Like [SKI Calculus]({{< relref "KBhcombinator_calculus.md" >}}), it's a language of functions; unlike [SKI Calculus]({{< relref
"KBhcombinator_calculus.md" >}}), there are variables. + +\begin{equation} +e\to x \mid \lambda x.e \mid ee \mid (e) +\end{equation} + +meaning, we have: + +- a variable \\(x\\) +- an **abstraction** \\(\lambda x . e\\) (a function definition) +- an **application** \\(e\_1 e\_2\\) + + +### abstraction {#abstraction} + +```python +def f(x) = e +``` + +can be written as: + +\begin{equation} +\lambda x.e +\end{equation} + +this is just an anonymous function---it can be returned, etc. + + +### syntax {#syntax} + +Association to the left: \\(f x y z = ((f(x))y)z\\). A lambda abstraction extends as far right as possible: + +\begin{equation} +\lambda x.x \lambda y.y \to \lambda x.(x \lambda y . y) +\end{equation} + + +### substitution {#substitution} + +variables require us to be able to substitute things. + +- \\(x [x:=e] = e\\) --- similar to \\(I\\) +- \\(y [x:=e] = y\\) --- similar to \\(K\\) +- \\((e\_1 e\_2) [x := e] = (e\_1 [x:= e]) (e\_2 [x:= e])\\) --- similar to \\(S\\) +- \\((\lambda x . e\_1) [x := e] = \lambda x . e\_1\\) --- shadowing; that is, during a function application if shadowing occurs, we don't use substitution +- \\((\lambda y . e\_1) [x:= e] = \lambda y . (e\_1 [x:= e])\\), if \\(x \neq y\\) and \\(y\not \in FV(e)\\); that is, we can only substitute if the contents of our substitution is not going to be changed by new variable bindings + - if we got caught by this rule, use an [alpha reduction](#alpha-reduction) + + +#### the last rule?! {#the-last-rule} + +\\((\lambda y . e\_1) [x:= e] = \lambda y . (e\_1 [x:= e])\\), if \\(x \neq y\\) and \\(y\\) doesn't appear free in \\(e\\) + +Consider: + +\begin{equation} +(\lambda y . x) [x:= y] +\end{equation} + +it would not make sense to substitute \\(x\\) inside the function for \\(y\\).
+ + + +- free variables + + The free variables are variables not bound by an abstraction: + + - \\(FV(x) = \\{x\\}\\) + - \\(FV(e\_1 e\_2) = FV(e\_1) \cup FV(e\_2)\\) + - \\(FV(\lambda x.e) = FV(e) - \qty {x}\\) + + +### reductions {#reductions} + +we can mostly ignore [alpha reduction](#alpha-reduction) and [eta reduction](#eta-reduction) by rephrasing as "rename variables to avoid collision whenever needed". + + +#### beta reduction {#beta-reduction} + +\\((\lambda x.e\_1)e\_2 \to e\_1[x:=e\_2]\\), we replace every occurrence of \\(x\\) in \\(e\_1\\) with \\(e\_2\\); we call \\(x\\) the **formal parameter**, and \\(e\_2\\) the **actual argument** + + +#### alpha reduction {#alpha-reduction} + +we can rename variables to avoid collision; \\((\lambda y.e\_1) [x := e] = \lambda z.((e\_1 [y:=z])[x := e])\\), if \\(x \neq y\\), and \\(z\\) is fresh and never used + + +#### eta reduction {#eta-reduction} + +\\(e = \lambda x . e x\\), \\(x \not \in FV(e)\\) + + +### programming time {#programming-time} + + +#### non terminating expression {#non-terminating-expression} + +\begin{equation} +(\lambda x . x x) (\lambda x . x x) = (\lambda x . x x) (\lambda x . x x) +\end{equation} + + +#### y-combinator {#y-combinator} + +\begin{equation} +Y = \lambda f . (\lambda x . f (x x)) (\lambda x . f ( x x)) +\end{equation} + +let's do it a few times: + +\begin{equation} +Y g a \to (\lambda x . g (x x)) (\lambda x . g (x x)) a \to g ((\lambda x . g (x x)) (\lambda x . g (x x))) a +\end{equation} + + +#### booleans {#booleans} + +- \\(True\ x\ y = x\\) +- \\(False \ x\ y = y\\) + +to abstract this to combinators, we just put \\(\lambda\\) in front of each argument and we are done: + +- \\(True= \lambda x . \lambda y. x\\) +- \\(False = \lambda x. \lambda y .y\\) + +we can recycle the combinator-based definitions for SKI to deal with the rest of the boolean logic: [conditionals]({{< relref "KBhcombinator_calculus.md#conditionals" >}}) + + +#### pairs {#pairs} + +- \\(pair = \lambda a.
\lambda b . \lambda f . f a b\\) +- \\(fst = \lambda x. \lambda y. x\\) +- \\(snd = \lambda x. \lambda y. y\\) + + +#### numbers {#numbers} + +- \\(0 f x = x\\), so \\(0 = \lambda f . \lambda x . x\\) +- \\(succ\ n\ f\ x = f(nfx)\\), so \\(succ = \lambda n . \lambda f . \lambda x. f (n f x)\\) + + +#### factorial {#factorial} + +p = λp. pair (mul (p fst) (p snd)) (succ (p snd)) +! = λn.(n p (pair one one) fst) + + +#### Algebraic Data Type {#algebraic-data-type} + +```nil +Type T = constructor1 Type11 Type12 ... Type1n | + constructor2 Type21 Type22 ... Type2m | + ... +``` + +- **algebraic**: because the constructor packages up the arguments +- the constructor is a "tag" naming the case of the ADT being used +- **deconstructors** recover the arguments of the constructor to use it + + + +```nil +Type List = nil | + cons Nat List +``` + + + +- encoding ADTs in lambda calculus + + Consider an [ADT](#algebraic-data-type) \\(T\\) with \\(n\\) constructors; let each constructor have \\(k\\) arguments; so---here's an example constructor: + + \begin{equation} + \lambda a\_1 . \lambda a\_2 \dots \lambda a\_{k}. \lambda f\_1 . \lambda f\_2 \dots \lambda f\_{n} . f\_i a\_1 a\_2 \dots a\_{k} + \end{equation} + + each constructor must have \\(n\\) + + + +- natural numbers + + ```nil + Type Nat = succ Nat | + 0 + ``` + + we have two constructors--- + + \begin{equation} + succ = \lambda n . \lambda f. \lambda x. f(n f x) + \end{equation} + + \begin{equation} + 0 = \lambda f . \lambda x. x + \end{equation} + + +### examples {#examples} + +- identity \\(I\\): \\(\lambda x . x\\) +- constant \\(K\\): \\(\lambda z . \lambda y .
z\\) + + +### numbers {#numbers} + + +### {#d41d8c} diff --git a/content/posts/KBhsum_of_subsets.md b/content/posts/KBhsum_of_subsets.md index 611fe58a0..e11e4772e 100644 --- a/content/posts/KBhsum_of_subsets.md +++ b/content/posts/KBhsum_of_subsets.md @@ -98,7 +98,7 @@ a\_1u\_1+ \dots + a\_{m}u\_{m} + b\_1v\_1 + \dots + b\_{j}v\_{j} =-(c\_1w\_1 + \ Recall that \\(u\_1 \dots v\_{j}\\) are all [vector]({{< relref "KBhvector.md" >}})s in \\(U\_1\\). Having written \\(-(c\_1w\_1 + \dots + c\_{k}w\_{k})\\) as a [linear combination]({{< relref "KBhlinear_combination.md" >}}) thereof, we say that \\(-(c\_1w\_1 + \dots + c\_{k}w\_{k}) \in U\_1\\) due to closure. But also, \\(w\_1 \dots w\_{k} \in U\_2\\) as they form a [basis]({{< relref "KBhbasis.md" >}}) of \\(U\_2\\). Hence, \\(-(c\_1w\_1 + \dots + c\_{k}w\_{k}) \in U\_2\\). So, \\(-(c\_1w\_1 + \dots + c\_{k}w\_{k}) \in U\_1 \cap U\_2\\). -And we said that \\(u\_1, \dots u\_{m}\\) are a [basis]({{< relref "KBhbasis.md" >}}) for \\(U\_1 \cap U\_{2}\\). Therefore, we can write the \\(c\_{i}\\) sums as a [linear combination]({{< relref "KBhlinear_combination.md" >}}) of $u$s: +And we said that \\(u\_1, \dots u\_{m}\\) are a [basis]({{< relref "KBhbasis.md" >}}) for \\(U\_1 \cap U\_{2}\\). Therefore, we can write the \\(c\_{i}\\) sums as a [linear combination]({{< relref "KBhlinear_combination.md" >}}) of \\(u\\): \begin{equation} d\_1u\_1 \dots + \dots + d\_{m}u\_{m} = (c\_1w\_1 + \dots + c\_{k}w\_{k}) @@ -123,4 +123,4 @@ recall \\(u\_1 \dots v\_{j}\\) is the [basis]({{< relref "KBhbasis.md" >}}) of \ Having shown that the list of \\(u\_1, \dots v\_1, \dots w\_1 \dots w\_{k}\\) [spans]({{< relref "KBhspan.md#spans" >}}) \\(U\_1+U\_2\\) and is [linearly independent]({{< relref "KBhlinear_independence.md" >}}) within it, it is a basis. -It does indeed have length \\(m+j+k\\), completing the proof. \\(\blacksquare\\) \ No newline at end of file +It does indeed have length \\(m+j+k\\), completing the proof. \\(\blacksquare\\)