describe experimental setup
breandan committed Dec 4, 2024
1 parent ed933f6 commit 6aec385
Showing 4 changed files with 10 additions and 5 deletions.
Binary file modified latex/thesis/Thesis.pdf
Binary file not shown.
2 changes: 1 addition & 1 deletion latex/thesis/content/Ch3_Deterministic_Repair.tex
Expand Up @@ -19,7 +19,7 @@ \chapter{\rm\bfseries Deterministic Program Repair}

\section{Levenshtein Automata}

Levenshtein automata are finite automata that recognize all and only strings within a given edit distance of a reference string by permitting insertions, deletions, and substitutions. For example, suppose we have a string \texttt{( ) )}, and wish to find nearby repairs. To represent the language of small edits, there is an automaton, called the Levenshtein automaton, recognizing every single string that can be formed by inserting, substituting or deleting a parenthesis. We depict this automaton in Figure~\ref{fig:lev_automaton}.
Levenshtein automata are finite automata that recognize all and only strings within a given edit distance of a reference string by permitting insertions, deletions, and substitutions. For example, suppose we have a string \texttt{( ) )}, and wish to find nearby repairs. To represent the language of nearby edits, there is an automaton, called the Levenshtein automaton, recognizing every single string that can be formed by inserting, substituting or deleting a parenthesis. We depict this automaton in Figure~\ref{fig:lev_automaton}.
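
The acceptance condition described in this paragraph can be illustrated with a minimal Python sketch that simulates a nondeterministic Levenshtein automaton over tokens. This is not the thesis's construction: the function names `epsilon_closure` and `lev_accepts`, the `(position, edits)` state encoding, and the default bound `d=1` are assumptions made for illustration only.

```python
def epsilon_closure(states, n, d):
    """Follow deletion arcs (i, e) -> (i + 1, e + 1), which consume no input."""
    closure = set(states)
    stack = list(states)
    while stack:
        i, e = stack.pop()
        if i < n and e < d and (i + 1, e + 1) not in closure:
            closure.add((i + 1, e + 1))
            stack.append((i + 1, e + 1))
    return closure


def lev_accepts(reference, candidate, d=1):
    """Return True iff candidate is within edit distance d of reference.

    A state (i, e) means: i reference tokens consumed, e edits spent so far.
    (Hypothetical encoding, chosen for this sketch.)
    """
    n = len(reference)
    current = epsilon_closure({(0, 0)}, n, d)
    for tok in candidate:
        nxt = set()
        for i, e in current:
            if i < n and reference[i] == tok:
                nxt.add((i + 1, e))          # match: advance for free
            if e < d:
                nxt.add((i, e + 1))          # insertion: tok is extra
                if i < n:
                    nxt.add((i + 1, e + 1))  # substitution: tok replaces reference[i]
        current = epsilon_closure(nxt, n, d)
    # Accept if the whole reference has been consumed within the edit budget.
    return any(i == n for i, _ in current)


reference = "( ) )".split()
print(lev_accepts(reference, "( )".split()))        # one deletion
print(lev_accepts(reference, "( ( ) )".split()))    # one insertion
print(lev_accepts(reference, "( ( ) ) )".split()))  # needs two insertions
```

With the reference string \texttt{( ) )} from the text, the first two candidates are one edit away and the third is two edits away, so raising `d` to 2 is what admits the third.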

\begin{figure}[h!]
\input{content/figures/lev1_simp}
12 changes: 8 additions & 4 deletions latex/thesis/content/Ch4_Probabilistic_Repair.tex
Expand Up @@ -9,12 +9,14 @@ \chapter{\rm\bfseries Probabilistic Program Repair}

We will consider two kinds of probabilistic models: a constrained Markov model and an unconstrained transformer-based neural network trained on program repair, then evaluate the performance of these models on a syntax repair benchmark consisting of pairwise program transformations. As we will show, the constrained Markov model is able to achieve state-of-the-art precision on blind prediction of the lexical sequence.

Here we give each model 5k+ syntax repairs of varying lengths and Levenshtein distances and measure the precision at varying cutoffs. For example, if the ground truth syntax repair was contained in the top 10 results for half of the repair instances, the model's P@10 would be 50\%.
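
The P@k measure described in this paragraph can be written down as a short Python sketch. The function name `precision_at_k`, the argument layout (one ranked candidate list per repair instance), and the toy data are assumptions for illustration; they are not taken from the benchmark code.

```python
def precision_at_k(ranked_repairs, ground_truths, k):
    """Fraction of instances whose ground-truth repair is among the top-k candidates.

    ranked_repairs: one list of candidate repairs per instance, best first.
    ground_truths:  the reference repair for each instance, in the same order.
    """
    if len(ranked_repairs) != len(ground_truths):
        raise ValueError("need exactly one ground truth per ranked list")
    if not ground_truths:
        raise ValueError("cannot compute precision on an empty test set")
    hits = sum(
        1
        for candidates, truth in zip(ranked_repairs, ground_truths)
        if truth in candidates[:k]
    )
    return hits / len(ground_truths)


# Toy data (hypothetical): four repair instances with two ranked candidates each.
ranked = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]
truths = ["b", "x", "e", "h"]
print(precision_at_k(ranked, truths, 1))  # only the third instance hits at rank 1
print(precision_at_k(ranked, truths, 2))  # three of four instances hit within the top 2
```

Under this definition, a model whose top 10 candidates contain the ground truth for half of the instances has a P@10 of 0.5, matching the 50\% example in the text.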

\begin{figure}[H]
\resizebox{.24\textwidth}{!}{\input{../popl2025/len_dist_tidy}}
\resizebox{.24\textwidth}{!}{\input{../popl2025/len_dist_bifi_all}}
\resizebox{.24\textwidth}{!}{\input{../popl2025/len_dist_s2p}}
\resizebox{.24\textwidth}{!}{\input{../popl2025/len_dist_bifi}}
\caption{Tidyparse, Seq2Parse and BIFI repair precision across length and edits.}
\caption{Total repair precision across the entire test set.}
\end{figure}

If we give it an equivalent number of samples, the constrained Markov model attains an even wider margin.
Expand All @@ -24,18 +26,20 @@ \chapter{\rm\bfseries Probabilistic Program Repair}
\resizebox{.24\textwidth}{!}{\input{../popl2025/len_dist_bifi_all}}
\resizebox{.24\textwidth}{!}{\input{../popl2025/len_dist_tidy200}}
\resizebox{.24\textwidth}{!}{\input{../popl2025/len_dist_tidy20k}}
\caption{Tidyparse, Seq2Parse and BIFI repair precision across length and edits.}
\caption{Sample efficiency increases sharply at larger precision intervals.}
\end{figure}

Now, we measure latency.
Next, we measure latency: state-of-the-art precision is attained at about 10 seconds, and additional time results in higher precision.

\begin{figure}[H]
\begin{center}
% \resizebox{.19\textwidth}{!}{\input{bar_hillel_repair.tex}}
\resizebox{.24\textwidth}{!}{\input{../popl2025/bar_hillel_repair_1}}
\resizebox{.24\textwidth}{!}{\input{../popl2025/bar_hillel_repair_2}}
\resizebox{.24\textwidth}{!}{\input{../popl2025/bar_hillel_repair_3}}
% \resizebox{.24\textwidth}{!}{\input{bar_hillel_repair_5}}
%\resizebox{.3\textwidth}{!}{\input{repair1_plot.tex}}
%\resizebox{.307\textwidth}{!}{\input{repair2_plot.tex}}
\caption{Latency benchmarks. Note the varying axis ranges. The red line marks Seq2Parse and the orange line marks BIFI's Precision@1 on the same repairs.}\label{fig:human}
\end{center}
\caption{Latency benchmarks. Note the varying axis ranges. The red line marks Seq2Parse and the orange line marks BIFI's Precision@1.}\label{fig:human}
\end{figure}
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ import ai.hypergraph.kaliningraph.tensor.UTMatrix
import ai.hypergraph.kaliningraph.types.*

// Generalized regular expression: https://planetmath.org/generalizedregularexpression
// Parsing with derivatives: https://matt.might.net/papers/might2011derivatives.pdf
sealed class GRE(vararg val args: GRE) {
companion object { operator fun invoke(s: Σᐩ) = ONE(s) }
