Parameter dependency

Parameter dependency is the name given to a tricky issue that comes up in many settings where one is dealing with higher-order abstract syntax. It can be particularly devious in CLF encodings.

Introduction

Consider the following nearly-standard SSOS encoding of the untyped lambda calculus, environment-semantics style. (It's only "nearly" standard because I was too lazy to define frames, instead using poor-type-theorist's frames, which are just linear functions exp -o exp. There are a number of reasons you wouldn't want to do this in practice!)

[example1-good.clf]
exp: type. 
app: exp -o exp -o exp.
lam: (exp -> exp) -o exp.

dest: type.
eval: exp -> dest -> type. #mode eval - -.
retn: exp -> dest -> type. #mode retn - -.
cont: (exp -o exp) -> dest -> dest -> type. #mode cont - - -.
bind: exp -> exp -> type. #mode bind - -.

ev-lam: 
 eval (lam \!x. E !x) D
 -o {retn (lam \!x. E !x) D}.

ev-app:
 eval (app E1 E2) D
 -o {Exists d1.
     eval E1 d1
     * cont (\e1. app e1 E2) d1 D}.

ev-app1:
 cont (\e1. app e1 E2) D1 D
 -o retn (lam \!x. E !x) D1
 -o {Exists d2.
     eval E2 d2
     * cont (\e2. app (lam \!x. E !x) e2) d2 D}.

ev-app2:
 cont (\e2. app (lam \!x. E !x) e2) D2 D
 -o retn V2 D2
 -o {Exists x.
     eval (E !x) D
     * !bind x V2}.

ev-bind:
 bind X V 
 -o eval X D'
 -o {retn V D'}.

Here's the execution trace of the evaluation of app (lam \!x. x) (lam \!y. y), using a stylized notation for frames and expressions, and also using exponentials as a shorthand for judgments (we write !bind x (\y.y) instead of bind x (\y.y) pers).

-- d,                  eval ((\x.x)(\y.y) d 
-- d,d1                eval (\x.x) d, cont ([](\y.y)) d1 d
-- d,d1                retn (\x.x) d, cont ([](\y.y)) d1 d
-- d,d1,d2             eval (\y.y) d, cont ((\x.x)[]) d2 d
-- d,d1,d2             retn (\y.y) d, cont ((\x.x)[]) d2 d
-- d,d1,d2,x           eval x d, !bind x (\y.y) 
-- d,d1,d2,x           retn (\y.y) d, !bind x (\y.y)

Here's the execution of an only slightly more complicated program:

-- d,                  eval ((\x.xx)(\y.y) d 
-- d,d1                eval (\x.xx) d, cont ([](\y.y)) d1 d
-- d,d1                retn (\x.xx) d, cont ([](\y.y)) d1 d
-- d,d1,d2             eval (\y.y) d, cont ((\x.xx)[]) d2 d
-- d,d1,d2             retn (\y.y) d, cont ((\x.xx)[]) d2 d
-- d,d1,d2,x           eval (xx) d, !bind x (\y.y) 
-- d,d1,d2,x,d3        eval x d3, cont ([]x) d3 d, !bind x (\y.y) 
-- d,d1,d2,x,d3        retn (\y.y) d3, cont ([]x) d3 d, !bind x (\y.y) 
-- d,d1,d2,x,d3,d4     eval x d4, cont ((\y.y)[]) d4 d, !bind x (\y.y) 
-- d,d1,d2,x,d3,d4     retn (\y.y) d4, cont ((\y.y)[]) d4 d, !bind x (\y.y) 
-- d,d1,d2,x,d3,d4,y   eval y d, !bind x (\y.y), !bind y (\y.y)
-- d,d1,d2,x,d3,d4,y   retn (\y.y) d, !bind x (\y.y), !bind y (\y.y)
-- d,d1,d2,x,d3,d4,y   retn (\y.y) d, !bind x (\y.y), !bind y (\y.y)

And a third example, demonstrating how environments can "leak" into the final answer (\z.x with x bound to \y.y is actually \z.\y.y).

-- d,                  eval ((\x.\z.x)(\y.y) d 
-- d,d1                eval (\x.\z.x) d, cont ([](\y.y)) d1 d
-- d,d1                retn (\x.\z.x) d, cont ([](\y.y)) d1 d
-- d,d1,d2             eval (\y.y) d, cont ((\x.\z.x)[]) d2 d
-- d,d1,d2             retn (\y.y) d, cont ((\x.\z.x)[]) d2 d
-- d,d1,d2,x           eval (\z.x) d, !bind x (\y.y) 
-- d,d1,d2,x           retn (\z.x) d, !bind x (\y.y)

PART 1: Parameter dependency from incomplete refunctionalization

As Ian Zerny points out, the program above is highly "defunctionalized," as writing programs in a defunctionalized style has been the fashion at least since Frank's 2004 APLAS talk and our subsequent 2009 LICS paper.

We'll get at our example by re-functionalizing the program just a little bit: trying to eliminate the bind predicate and associated ev-bind rule by collapsing it into the ev-app2 rule: whenever we previously had a predicate !bind x V in the context, we now ought to have a premise !eval x D' -o retn V D'.

Of course, because I'm explaining a problem, I'll do this in a way that is subtly wrong.

[example2-buggy.clf]
exp: type. 
app: exp -o exp -o exp.
lam: (exp -> exp) -o exp.

dest: type.
eval: exp -> dest -> type. #mode eval - -.
retn: exp -> dest -> type. #mode retn - -.
cont: (exp -o exp) -> dest -> dest -> type. #mode cont - - -.

ev-lam: 
 eval (lam \!x. E !x) D
 -o {retn (lam \!x. E !x) D}.

ev-app:
 eval (app E1 E2) D
 -o {Exists d1.
     eval E1 d1
     * cont (\e1. app e1 E2) d1 D}.

ev-app1:
 cont (\e1. app e1 E2) D1 D
 -o retn (lam \!x. E !x) D1
 -o {Exists d2.
     eval E2 d2
     * cont (\e2. app (lam \!x. E !x) e2) d2 D}.

ev-app2:
 cont (\e2. app (lam \!x. E !x) e2) D2 D
 -o retn V2 D2
 -o {Exists x.
     eval (E !x) D
     * !(eval x D'
         -o {retn V D'})}.

Our first example actually works! However, our first example should also make us aware that there's a potential problem.

-- d,             eval ((\x.x)(\y.y) d 
-- d,d1           eval (\x.x) d, cont ([](\y.y)) d1 d
-- d,d1           retn (\x.x) d, cont ([](\y.y)) d1 d
-- d,d1,d2        eval (\y.y) d, cont ((\x.x)[]) d2 d
-- d,d1,d2        retn (\y.y) d, cont ((\x.x)[]) d2 d
-- d,d1,d2,x      eval x d, !(eval x D'[d/x1,d1/x2,d2/x3] 
                              -o {retn (\y.y) D'[d/x1,d1/x2,d2/x3]})
    (in this step, D' becomes "x1", so D'[d,d1,d2] becomes "d")
-- d,d1,d2,x      retn (\y.y) d, !bind x (\y.y)

Our new ev-app2 rule has D' free in the premise but not the conclusion, so our operational semantics has to place an unbound existential variables in the context. Existential variables, following contextual modal type theory, always run with their own substitutions. For the purposes of this conversation, that explicit substitution represents that what an existential variable actually "is" is a representation of the fact that we have to instantiate a variable (like D') back where it's implicit Pi-binding lives, in the rule ev-app2, so D' can only mention variables that had been introduced when we encountered the Pi binding for D' (namely, d, d1, and d2).

This will be the source of our first problem, when we try to run the more complicated program:

-- d,                  eval ((\x.xx)(\y.y) d 
-- d,d1                eval (\x.xx) d, cont ([](\y.y)) d1 d
-- d,d1                retn (\x.xx) d, cont ([](\y.y)) d1 d
-- d,d1,d2             eval (\y.y) d, cont ((\x.xx)[]) d2 d
-- d,d1,d2             retn (\y.y) d, cont ((\x.xx)[]) d2 d
-- d,d1,d2,x           eval (xx) d, !(eval x D'[d/x1,d1/x2,d2/x3] 
                                      -o {retn (\y.y) D'[d/x1,d1/x2,d2/x3]})
-- d,d1,d2,x,d3        eval x d3, cont ([]x) d3 d, 
                         !(eval x D'[d/x1,d1/x2,d2/x3] 
                           -o {retn (\y.y) D'[d/x1,d1/x2,d2/x3]})
STUCK!

Recall that, previously, the next step was an application of ev-bind that turned eval x d3 into retn (\y.y) d3. But d3 is not one of the variables mentioned in the explicit substitution for D' - it is not yet introduced when we fire ev-app2 - so we can't get to retn (\y.y) d3, which is the logical next step.

[example2-fixed.clf]
exp: type. 
app: exp -o exp -o exp.
lam: (exp -> exp) -o exp.

dest: type.
eval: exp -> dest -> type. #mode eval - -.
retn: exp -> dest -> type. #mode retn - -.
cont: (exp -o exp) -> dest -> dest -> type. #mode cont - - -.

ev-lam: 
 eval (lam \!x. E !x) D
 -o {retn (lam \!x. E !x) D}.

ev-app:
 eval (app E1 E2) D
 -o {Exists d1.
     eval E1 d1
     * cont (\e1. app e1 E2) d1 D}.

ev-app1:
 cont (\e1. app e1 E2) D1 D
 -o retn (lam \!x. E !x) D1
 -o {Exists d2.
     eval E2 d2
     * cont (\e2. app (lam \!x. E !x) e2) d2 D}.

ev-app2:
 cont (\e2. app (lam \!x. E !x) e2) D2 D
 -o retn V2 D2
 -o {Exists x.
     eval (E !x) D
     * !(Pi d'. 
         eval x d'
         -o {retn V d'})}.

PART 2a: Existential variables are linear resources

There was a subtler bug, as well, one that I'm not sure how to demonstrate with a reasonable forward-chaining program. Because D' is existentially quantified, we can only instantiate it once, so if we'd somehow run into a situation where we needed to return x to both the destinations d1 and d, we would be as hosed for a different reason.

This more subtle bug is actually easier to illustrate with a Twelf program that tries to replace all variable occurrences with some natural number (which natural number isn't important, it's just important that it changes).

[param-bad.elf]
nat: type.
z: nat.
s: nat -> nat.

exp: type.
app: exp -> exp -> exp.
lam: (exp -> exp) -> exp.
n: nat -> exp.

gogo: exp -> nat -> exp -> type.
%mode gogo +A +B -C.

-: gogo (app E1 E2) N (app E1' E2')
    <- gogo E1 (s N) E1'
    <- gogo E2 (s N) E2'.

-: gogo (lam [x] E x) N (lam [x] E' x)
    <- ({x} gogo x M (n M)
        -> gogo (E x) (s N) (E' x)).

-: gogo (n M) N (n M). 

%block govar: some {M: nat} block {x: exp} {d: gogo x M (n M)}.
%worlds (govar) (gogo _ _ _).

%query 1 * gogo (lam [x] app x x) z E.
%query 1 * gogo (lam [x] lam [y] app x (app y y)) z E.
%query 0 * gogo (lam [x] lam [y] app x (app x y)) z E.

There's no totality declaration, as the last %query shows. If we try to conclude %total T (gogo T _ _), Twelf will report the following:

Coverage error --- missing cases:
{X1:nat} {#govar:{x:exp} {d:gogo x X1 (n X1)}} {X2:exp} |- gogo #govar_x z X2,
{X1:nat} {#govar:{x:exp} {d:gogo x X1 (n X1)}} {X2:nat} {X3:exp}
   |- gogo #govar_x (s X2) X3.

The problem is not a parameter dependency issue, it's the more subtle bug that I was mentioning above: M appears as an existential variable - with two lams to traverse, we end up with two existential variables, which I'll call Mx and My - once we get to the app our context contains, x, y, the fact gogo x Mx[] (n Mx[]) and the fact gogo y My[] (n My[]). In the second example, Mx gets unified with s (s (s z)) - 3 - whereas My gets unified with s (s (s (s z))) - 4, and everything is fine. In the second example, we encounter the x in app x y and we need to return 4, but x has already been unified with 3 by the other occurrence of x, so the query as a whole fails.

Conclusion: even if/when we can avoid parameter dependency issues, unification variables in the context act like very unpredictable linear resources. I think the right solution is to always avoid programs that can leak existential variables out of the focusing phase and into the context. The way to do the thing we presumably intended to do with the program above is to make the program more higher order: locally pi-quantifying M. This is essentially the same way we fixed the SSOS program in CLF, of course!

[param-good.elf]
nat: type.
z: nat.
s: nat -> nat.

exp: type.
app: exp -> exp -> exp.
lam: (exp -> exp) -> exp.
n: nat -> exp.

gogo: exp -> nat -> exp -> type.
%mode gogo +A +B -C.

-: gogo (app E1 E2) N (app E1' E2')
    <- gogo E1 (s N) E1'
    <- gogo E2 (s N) E2'.

-: gogo (lam [x] E x) N (lam [x] E' x)
    <- ({x} ({M} gogo x M (n M))
        -> gogo (E x) (s N) (E' x)).

-: gogo (n M) N (n M). 

%block govar: block {x: exp} {d: {M} gogo x M (n M)}.
%worlds (govar) (gogo _ _ _).

%query 1 * gogo (lam [x] app x x) z E.
%query 1 * gogo (lam [x] lam [y] app x (app y y)) z E.
%query 1 * gogo (lam [x] lam [y] app x (app x y)) z E.
%total T (gogo T _ _).

A final point about parameter dependency: sometimes, we want to do something that seems bad because we know what we're doing. Take the program above; we could have just as well written the clause that deals with lam this way, because the whole point of the program is that it substitutes for all the variables:

-: gogo (lam [x] E x) N (lam [x] E')
    <- ({x} ({M} gogo x M (n M))
        -> gogo (E x) (s N) E').

If we'd done so, however, the %total declaration would have flunked us, because it doesn't do any of the analysis that would be necessary to notice that the output can't depend on the parameter x - the only analysis it does do is based on type subordination.

PART 2b: Parameter dependency in Twelf

We can, by making this example more degenerate (nats can depend on exps, which changes the subordination relation significantly), come up with an example that can fail because of parameter dependency, not because of the "linear-ish" behavior of unification. The fix is the same as before: quantify more locally.

[param-goofy-bad1.elf]
nat: type.
z: nat.
s: nat -> nat.

exp: type.
app: exp -> exp -> exp.
lam: (exp -> exp) -> exp.
n: nat -> exp.

nx: exp -> nat -> nat.

gogo: exp -> nat -> exp -> type.
%mode gogo +A +B -C.

-: gogo (app E1 E2) N (app E1' E2')
    <- gogo E1 (s N) E1'
    <- gogo E2 (s N) E2'.

-: gogo (lam [x] E x) N (lam [x] E' x)
    <- ({x} gogo x M (n M)
        -> gogo (E x) (nx x N) (E' x)).

-: gogo (n M) N (n M). 

%block govar: some {M: nat} block {x: exp} {d: gogo x M (n M)}.
%worlds (govar) (gogo _ _ _).

%query 0 * gogo (lam [x] app x x) z E.
%query 0 * gogo (lam [x] lam [y] app x (app y y)) z E.
%query 0 * gogo (lam [x] lam [y] app x (app x y)) z E.

[param-goofy-bad2.elf]
nat: type.
z: nat.
s: nat -> nat.

exp: type.
app: exp -> exp -> exp.
lam: (exp -> exp) -> exp.
n: nat -> exp.

nx: exp -> nat -> nat.

gogo: exp -> nat -> exp -> type.
%mode gogo +A +B -C.

-: gogo (app E1 E2) N (app E1' E2')
    <- gogo E1 (s N) E1'
    <- gogo E2 (s N) E2'.

-: gogo (lam [x] E x) N (lam [x] E' x)
    <- ({x} gogo x (M x) (n (M x))
        -> gogo (E x) (nx x N) (E' x)).

-: gogo (n M) N (n M). 

%block govar: some {M: exp -> nat} block {x: exp} {d: gogo x (M x) (n (M x))}.
%worlds (govar) (gogo _ _ _).

%query 1 * gogo (lam [x] app x x) z E.
%query 0 * gogo (lam [x] lam [y] app x (app y y)) z E.
%query 0 * gogo (lam [x] lam [y] app x (app x y)) z E.

[param-goofy-good.elf]
nat: type.
z: nat.
s: nat -> nat.

exp: type.
app: exp -> exp -> exp.
lam: (exp -> exp) -> exp.
n: nat -> exp.

nx: exp -> nat -> nat.

gogo: exp -> nat -> exp -> type.
%mode gogo +A +B -C.

-: gogo (app E1 E2) N (app E1' E2')
    <- gogo E1 (s N) E1'
    <- gogo E2 (s N) E2'.

-: gogo (lam [x] E x) N (lam [x] E' x)
    <- ({x} ({M} gogo x M (n M))
        -> gogo (E x) (nx x N) (E' x)).

-: gogo (n M) N (n M). 

%block govar: block {x: exp} {d: {M} gogo x M (n M)}.
%worlds (govar) (gogo _ _ _).

%query 1 * gogo (lam [x] app x x) z E.
%query 1 * gogo (lam [x] lam [y] app x (app y y)) z E.
%query 1 * gogo (lam [x] lam [y] app x (app x y)) z E.
%total T (gogo T _ _).

PART 3: Parameter dependency in queries

Another way this bug occurs is in queries. Given example1-good.clf or example2-fixed.clf, the programs corresponding to our two example programs will succeed:

#query * 1 * 1
  Pi d.
  eval (app (lam \!x. x) (lam \!y. y)) d
   -o {retn V d}.
 
#query * 1 * 1
  Pi d.
  eval (app (lam \!x. app x x) (lam \!y. y)) d
   -o {retn V d}.

However, with our third example program, we will fail:

#query * 1 * 1
  Pi d.
  eval (app (lam \!x. lam \!z. x) (lam \!y. y)) d
   -o {retn V d}.

For (\x.x)(\y.y) it was:

-- d,d1,d2,x           retn (\y.y) d, !bind x (\y.y)

For (\x.xx)(\y.y) it was:

-- d,d1,d2,x,d3,d4,y   retn (\y.y) d, !bind x (\y.y), !bind y (\y.y)

For our problematic example, (\x.\z.x)(\y.y), our final state was:

-- d,d1,d2,x           retn (\z.x) d, !bind x (\y.y)

Therein lies the problem: Celf represents V as the existential variable V[]. It's not allowed to mention any of the variables we could have allowed it to refer to the variable d by writing V d in the query, not that this would have helped anything in this particular case. It's not possible to unify V with anything that makes it equal to \z.x, because x is free in that expression.

Provide feedback

Saved searches