Mini-LISP - unification of errors in C and JS implementations #426

OfirMarkowitz1 · 2021-07-28T09:37:55Z

No description provided.

yossigil · 2021-07-28T14:59:29Z

This means in particular, very brief, and not so informational messages; no worry, eventually someone will do the bells and whistles. Not for now please.

OfirMarkowitz1 · 2021-07-28T15:18:56Z

Ok

…

On Wed, 28 Jul 2021 at 17:59, Yossi Gil ***@***.***> wrote: This means in particular, very brief, and not so informational messages; no worry, eventually someone will do the bells and whistles. Not for now please. — You are receiving this because you were assigned. Reply to this email directly, view it on GitHub <#426 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/APCETIGFO3VEFYYTCL4YHPLT2ALNZANCNFSM5BD7KAVQ> .

OfirMarkowitz1 · 2021-08-02T11:45:17Z

Mini-LISP C project, file S.h, line 182:

inline bool die(S s) { throw BUG.cons(s); }

Shouldn't it be "s.cons(BUG)", since "BUG" is the kind? This is the convention for all other error kinds.

OfirMarkowitz1 · 2021-08-02T13:52:15Z

Mini-LISP C project, file eval.cc, lines 50-57:

S only(S s) {
    s.pair() || die(s);
    s.cdr().cdr().t() && s.error(REDUNDANT).t();
    S a = evaluate_list(s.cdr());
    a.null() && s.error(MISSING).t();
    a.atom() && s.error(INVALID).t();
    return a.car();
}

This function seems wrong to me in a few levels:

s.error(...).t() - calling t() will do nothing since error throws an exception.
The line s.cdr().cdr().t() && s.error(REDUNDANT).t(); makes sure that s is a list with exactly 2 elements (otherwise REDUNDANT or CDR exception is thrown). It means that a is always a list with 1 element, thus the lines:

    a.null() && s.error(MISSING).t();
    a.atom() && s.error(INVALID).t();

are redundant - a is neither an atom nor nil

OfirMarkowitz1 · 2021-08-02T13:58:45Z

Mini-LISP C project, file eval.cc, lines 59-74:

void checkNumberOfArgs(S s) {
    S f = s.car();
    if (f.eq(QUOTE) || f.eq(CAR) || f.eq(CDR) || // 1 Args:
        f.eq(ATOM) || f.eq(NULL) || f.eq(EVAL)) {
        s.more2() && s.error(REDUNDANT).t();
        s.less2() && s.error(MISSING).t();
    }
    if (f.eq(CONS) || f.eq(SET) || f.eq(EQ) || f.eq(LAMBDA) || f.eq(NLAMBDA)) { // 2 Args:
        s.more3() && s.error(REDUNDANT).t();
        s.less3() && s.error(MISSING).t();
    }
    if (f.eq(NDEFUN) || f.eq(DEFUN)) { // 3 Args:
        s.more4() && s.error(REDUNDANT).t();
        s.less4() && s.error(MISSING).t();
    }
}

This function doesn't make sure that s is an actual list, i.e. "(a b c)" and "(a . (b . (c . d)))" is exactly the same for this function. That's because the "more" and "less" methods only check after how many "cdr" calls we arrive at an atom, but fails to make sure this atom is nil.

OfirMarkowitz1 · 2021-08-02T14:13:46Z

This combined behaviors of the only(S s) and checkNumberOfArgs(S s) functions means that there's an error handling discrepancy between single arg atomics and the rest.
Mini-LISP C project, file eval.cc, lines 76-102:

S evaluate_atomic_function(S s) { M(s);
  checkNumberOfArgs(s);
  const S f = s.car();
  // Atomic functions:
  if (f.eq(CAR))     return only(s).car();
  if (f.eq(CONS))    return s.$2$().eval().cons(s.$3$().eval());
  if (f.eq(SET))     return set(s.$2$().eval(),          s.$3$().eval());
  if (f.eq(EQ))      return s.$2$().eval().eq(s.$3$().eval()) ? T : NIL;
  if (f.eq(COND))    return evaluate_cond(s.cdr());
  if (f.eq(CDR))     return only(s).cdr();
  if (f.eq(ATOM))    return only(s).atom() ? T : NIL;
  if (f.eq(EVAL))    return only(s).eval();
  if (f.eq(ERROR))   return s.error(s.cdr().eval());
  if (f.eq(NULL))    return only(s).null() ? T : NIL;
  if (f.eq(QUOTE))   return s.cdr().car();
  if (f.eq(NLAMBDA)) return list(NLAMBDA, s.$2$(),  s.$3$());
  if (f.eq(LAMBDA))  return list(LAMBDA, s.$2$(), s.$3$());
  if (f.eq(NDEFUN)) { 
    ndefun(s.$2$(), s.$3$(), s.$4$());
    return  s.$2$();
  }
  if (f.eq(DEFUN)) { 
    defun(s.$2$(), s.$3$(), s.$4$());
    return  s.$2$();
  }
  return bug(s);
}

As you can see, the single arg atomics are called with only(S s) that makes sure that s is a list, while the rest don't - see my comment about checkNumberOfArgs(S s). I think the solution is that checkNumberOfArgs(S s) will make sure that s is a list (otherwise it will throw INVALID), and it will also render only(S s) redundant - see my comment about only(S s).

I'm referring you to this problems because I don't want to blindly copy what seems to me as an odd behavior, without making sure this is what you intended.

yossigil · 2021-08-02T16:05:41Z

@DorYeheskel

What do you say to this?

DorYeheskel · 2021-08-02T17:34:53Z

@OfirMarkowitz1
Calling evaluate_atomic_function(S s) means that s is a list (e.g: (car 'a)), so you don't need to check it.
If s is not a list, in the eval function, it will go to the lookup function (see the conditions here)

You can see also here:

(defun evaluate−atomic (S−expression a−list)
     (apply−atomic (car S−expression) (cdr S−expression) a−list))

That cdr is called (which mean s is a list, otherwise, error will occur)

OfirMarkowitz1 · 2021-08-02T23:28:18Z

@DorYeheskel @yossigil
It seems to me the confusion stems from the fact that in the C implementation there's no notion of a pair that is not a list.
If that's true, this is in inconsistent with the book.
Am I correct about this?

And in any case, the only(S s) function is redundant. You should just use eval, i.e.:

if (f.eq(CAR))     return s.$2$().eval();

or

if (f.eq(CAR))     return evaluate_list(s.cdr()).car();

yossigil · 2021-08-03T06:33:57Z

I will let you guys sort it out a little...

OfirMarkowitz1 · 2021-08-03T10:03:01Z

@OfirMarkowitz1
Calling evaluate_atomic_function(S s) means that s is a list (e.g: (car 'a)), so you don't need to check it.
If s is not a list, in the eval function, it will go to the lookup function (see the conditions here)

You can see also here:
(defun evaluate−atomic (S−expression a−list)
     (apply−atomic (car S−expression) (cdr S−expression) a−list))
That cdr is called (which mean s is a list, otherwise, error will occur)

@DorYeheskel
The discussion about pairs in #439 is relevant here. It seems that the current C implementation does not allow pairs that are not lists. If pairs that are not lists will be allowed, my remarks about the implementation of void checkNumberOfArgs(S s) are valid.

OfirMarkowitz1 · 2021-08-03T12:11:14Z

In the function S evaluate_atomic_function(S s), file eval.cc, line 88:

if (f.eq(ERROR))   return s.error(s.cdr().eval());

This isn't working properly since s.cdr().eval() will throw an "INVALID" exception, before the call to error.
There's also tests that rely on this incorrect behavior, see t-alist-global-local.cc, in test SetWithError, specifically line 101:

EXPECT_EXCEPTION(s.eval(), list(b_2, x.q()), INVALID);

I think the correct behavior is throwing an S expression containing the actuals passed to error, after their evaluation:

if (f.eq(ERROR))   return s.error(evaluate_list(s.cdr()));

I guess that's what was meant to be written but eval was called instead of evaluate_list.

OfirMarkowitz1 · 2021-08-03T16:36:29Z

I've resolved issue #433, e.g., finished the error handling in JS. Perhaps I will need to make very few changes in order to close the current issue. I'm waiting for @DorYeheskel and @yossigil final take on this, to corroborate my work. Same with #427

yossigil · 2021-08-08T11:52:08Z

@DorYeheskel are you sure that we do not have pairs in the implementation, I remember vividly supporting these in my parser.

DorYeheskel · 2021-08-08T20:30:36Z

@OfirMarkowitz1
I fixed the error message issues, thanks.

About pairs, see the following examples, using the mini-lisp interpreter:
Lists:
(car (car '(a b))) -> ERROR (car error)
(cdr (car '(a b))) -> ERROR (cdr error)

Pairs:
(car (car '(a.b))) -> A
(cdr (car '(a.b))) -> B

As you can see, (a.b) is a pair in a list (as written here too) and thus car on it will return the pair A.B
and car/cdr on it will return A/B.

If pair was not supported, then (car (car '(a.b)) should raise a car error, which is not the case.

OfirMarkowitz1 · 2021-08-09T10:26:18Z

@DorYeheskel
Firstly, from page 34 in the book:

(car '(b.a)) ⇒ B
(car '(b a)) ⇒ B
(car '(a)) ⇒ A
(car 'a) ⇒ #
(car ()) ⇒ #

A.B is not a pair, but rather (A.B). The first line makes it clear.
Also, see @yossigil remark from #439:

The C implementation recognizes all S expressions. This includes both Pairs and Lists.
By definition, the car of a pair, gives the CAR, i.e., the left child of the pair.
(car '(a.b)) is a. If it is not, it is a bug.

Also, A.B shouldn't be an atom, but rather rejected by the parser, see @yossigil remark from #439:

Well, we encountered a subtle difference between the parser of our implementation and that of common lisp. This suggests even more to avoid working on a common lisp implementation. This also may be related to the problem of unique unambiguous output.
What is (a.b)? Could be a list containing one element, or a pair. I think it is a pair...

Why?

Mini Lisp should be minimal. Identifiers are really simple. Only few characters are excluded; precisely those used for tokens, I think there are total of 6 reserved characters, out of which only 4 are in actual use; there are also spaces. That's it.

So, by this minimalist approach, (a.b) could only be a pair. it cannot be a list, since list elements are separated by spaces. It could not be a list containing one atom, because a.b is not an atom.

I hope this makes sense.

In both cases, the fact that the tests don't reflect this behavior means the tests are wrong, not the other way around.
By the way:

I think there are total of 6 reserved characters, out of which only 4 are in actual use; there are also spaces. That's it.

There should be an additional "token" character, ';', which should be used only for line comments, so 7 reserved characters in total. This character should be added in Tokenizer.h, in line 6: const char tokens[] = "()[].\'";

yossigil · 2021-08-12T17:30:18Z

Seems like an error in the book. Should be fixed.

yossigil · 2021-08-12T17:31:21Z

Comments are not tokens; never...

OfirMarkowitz1 · 2021-09-12T13:09:55Z

I still need to unify these errors.
It would be much more comfortable for me if we gathered all the cases where the user did something wrong, and decide whether it warrants and error, and if it is, which kind.
Discussing it with @yossigil, I understood that our rule of thumb is simplicity. We want the minimal number of exception scenarios in the code, and to let the stack dump do the talking.

Necessary error example - fake lambda tag:

('(fake_lambda_tag (x) (car x)) '[a.b])

This should be an error (and is, both in C and JS implementations), because it doesn't make sense for a wrong tag to act like lambda/nlambda, i.e., have a default behavior.

An existing error example, which I'm not sure is necessary:

((lambda (x) (car x)) '[a.b] 'c)

Currently, this raises a "REDUNDANT" error, but we can decide just to ignore the second variable. I think it's simpler.

Another existing error example:

((lambda (x y) (cons x y)) 'a)

Currently, this raises a "MISSING" error, but I think it also doesn't have to be an error because we can implement currying instead, i.e., this expression evaluation can return (LAMBDA (Y) (CONS (QUOTE A) Y)).

I think a discussion like this would be much more fruitful than for me to blindly copy the C implementation errors. Also, I think it would make a nice addition to the book, which currently contains only some of the present C implementation errors.

yossigil · 2021-09-13T21:29:33Z

I think there are the following errors:

Undefined variable
Missing argument (it boils down to CDR failing on some list)
Redundant argument (it boils down to CDR failing on some list)
Bad function, i.e., an anonymous function that does follow the strict format
Memory overflow (occurs in C, not in JS)
Bug, when something should not happen

This is all I could think of.

OfirMarkowitz1 · 2021-09-14T05:12:40Z

MISSING and REDUNDANT:

Missing argument (it boils down to CDR failing on some list)
Redundant argument (it boils down to CDR failing on some list)

If it boils down to a different error, this means there shouldn't be a REDUNDANT and MISSING error when applying lambda or primitive, and there are:
When binding upon applying lambda, in a-list.cc:

S bind(S formals, S actuals, S list) { 
  if (!formals.null() && actuals.null()) return formals.error(MISSING);
  if (formals.null() && !actuals.null()) return actuals.error(REDUNDANT);
  if (formals.null()) return list;
  return formals.car().cons(actuals.car()).cons(bind(formals.cdr(), actuals.cdr(), list));
}

Before applying primitive, in eval.cc:

void checkNumberOfArgs(S s) {
    S f = s.car();
    push(ARGUMENT, s.cdr());
    if (f.eq(QUOTE) || f.eq(CAR) || f.eq(CDR) || // 1 Args:
        f.eq(ATOM) || f.eq(NULL) || f.eq(EVAL)) {
        s.more2() && s.error(REDUNDANT).t();
        s.less2() && s.error(MISSING).t();
    }
    if (f.eq(CONS) || f.eq(SET) || f.eq(EQ) || f.eq(LAMBDA) || f.eq(NLAMBDA)) { // 2 Args:
        s.more3() && s.error(REDUNDANT).t();
        s.less3() && s.error(MISSING).t();
    }
    if (f.eq(NDEFUN) || f.eq(DEFUN)) { // 3 Args:
        s.more4() && s.error(REDUNDANT).t();
        s.less4() && s.error(MISSING).t();
    }
    remove_element(ARGUMENT);
}

Applying lambda:

Bad function, i.e., an anonymous function that does follow the strict format

S apply(S f, S args) {
  D(f,args);
  f.n3() || f.cons(args).error(INVALID).t();
  D(f.$1$(),f.$2$(),f.$3$(),args,f.$1$().eq(NLAMBDA));
  push(ARGUMENT, args);
  const auto actuals = f.$1$().eq(NLAMBDA)? args : f.$1$().eq(LAMBDA) ? evaluate_list(args) : f.cons(args).error(INVALID);
  remove_element(ARGUMENT);
  alist() = bind(f.$2$(), actuals, alist());
  push(ARGUMENT, actuals);
  const auto result = f.$3$().eval();
  remove_element(ARGUMENT);
  remove_local_binding(f.$2$());
  return result;
}

There's a check that the S expression is a list of length 3, and that the tag is valid. There's no check that the formals are a valid list.
Observing the bind function C implementation above, in case the formals are invalid, it boils down to a CAR error, when the formals are a non-nil atom, MISSING error when its a non-list nested pair (e.g. [a . [b . c]]) with insufficient degree of nesting, and no error will be thrown if it's a non-list nested pair with sufficient degree (the number of actuals or above).
I think it's confusing and non-coherent, and there should be a proper check that the formals are a list, and INVALID error should be thrown if it doesn't.

CAR and CDR on an atom should be included in this list.
Evaluating COND:
A COND error is thrown when one of the arguments passed to COND is an atom - eval.cc:

FUN(S, evaluate_cond, S)  IS( 
    _.null() ?  NIL:
    _.car().atom() ? _.car().error(COND):
      _.car().car().eval().t() ? _.car().cdr().car().eval():
        $$(_.cdr()) ;
)

It can easily be boiled down to CAR error.

The ERROR primitive:
We need to define exactly what happens when the user calls the ERROR primitive. In the book, you could pass the error any number of arguments, but in the C implementation, you can pass 1 argument at most, and it ignores the rest:

  if (f.eq(ERROR)) {
      if (! s.less2()) {
          args_1 = eval_argument(s.$2$());
      }
      push(ARGUMENT, args_1);
      return s.error(args_1); // throw
  }

The evaluation function:
Similar to what I described about the formals, there's no check that the S expression given to the eval function is a proper list (if it's not an atom):

S eval(S s) {
    D(s);
    inc_eval_counter();
    S res = NIL;
    D(s);
    if (s.atom()) {
       res = lookup(s);
       D(s, res);
    } else {
      D(s, s.car(), s.cdr());
      if (atomic(s.car())) {
       D(s.car());
       push(RESCUE, s.car());
       res = evaluate_atomic_function(s);
       remove_element(RESCUE);
       D(s, res);
      } else {
       S f_name = s.car();
       S f_body = s.car().eval();
       push(RESCUE, f_name);
       res = apply_defined_function(f_body, s.cdr());
       remove_element(RESCUE);
       D(s, res);
      }
    }
    D(s,res);
    dec_eval_counter();
    if (get_eval_counter() == 0) reset_set_counter();
    return res;
}

This means that [set . ['a . 'b]] will be equivalent to (set 'a 'b), and it seems weird to me. I think we should check that the evaluation function argument is a proper list, and throw INVALID error, if it's not.

yossigil · 2021-10-07T21:49:04Z

This boils down to the evaluation order: with gnu common lisp, we obtain:

>(car t t a)
Error: UNBOUND-VARIABLE :NAME A

which means that the evaluation algorithm in CL first evaluates all arguments; it then binds them and determines whether their number is correct. Check out also (cons x) which fails on x. I saw no point in using our own evaluation order. For this reason, I re-implemented eval function in the code. I keep updating, so please check this out.

yossigil assigned OfirMarkowitz1 Jul 28, 2021

yossigil added a commit that referenced this issue Jul 28, 2021

A comment on #426

781cc4d

yossigil added a commit that referenced this issue Jul 28, 2021

Another code comment on #426

0c5b5f5

OfirMarkowitz1 mentioned this issue Aug 2, 2021

Mini-LISP JS - Fix C and JS parser discrepancies #439

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mini-LISP - unification of errors in C and JS implementations #426

Mini-LISP - unification of errors in C and JS implementations #426

OfirMarkowitz1 commented Jul 28, 2021

yossigil commented Jul 28, 2021

OfirMarkowitz1 commented Jul 28, 2021 via email

OfirMarkowitz1 commented Aug 2, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 2, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 2, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 2, 2021 •

edited

Loading

yossigil commented Aug 2, 2021

DorYeheskel commented Aug 2, 2021

OfirMarkowitz1 commented Aug 2, 2021 •

edited

Loading

yossigil commented Aug 3, 2021

OfirMarkowitz1 commented Aug 3, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 3, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 3, 2021

yossigil commented Aug 8, 2021

DorYeheskel commented Aug 8, 2021

OfirMarkowitz1 commented Aug 9, 2021 •

edited

Loading

yossigil commented Aug 12, 2021

yossigil commented Aug 12, 2021

OfirMarkowitz1 commented Sep 12, 2021 •

edited

Loading

yossigil commented Sep 13, 2021

OfirMarkowitz1 commented Sep 14, 2021 •

edited

Loading

yossigil commented Oct 7, 2021

Mini-LISP - unification of errors in C and JS implementations #426

Mini-LISP - unification of errors in C and JS implementations #426

Comments

OfirMarkowitz1 commented Jul 28, 2021

yossigil commented Jul 28, 2021

OfirMarkowitz1 commented Jul 28, 2021 via email

OfirMarkowitz1 commented Aug 2, 2021 • edited Loading

OfirMarkowitz1 commented Aug 2, 2021 • edited Loading

OfirMarkowitz1 commented Aug 2, 2021 • edited Loading

OfirMarkowitz1 commented Aug 2, 2021 • edited Loading

yossigil commented Aug 2, 2021

DorYeheskel commented Aug 2, 2021

OfirMarkowitz1 commented Aug 2, 2021 • edited Loading

yossigil commented Aug 3, 2021

OfirMarkowitz1 commented Aug 3, 2021 • edited Loading

OfirMarkowitz1 commented Aug 3, 2021 • edited Loading

OfirMarkowitz1 commented Aug 3, 2021

yossigil commented Aug 8, 2021

DorYeheskel commented Aug 8, 2021

OfirMarkowitz1 commented Aug 9, 2021 • edited Loading

yossigil commented Aug 12, 2021

yossigil commented Aug 12, 2021

OfirMarkowitz1 commented Sep 12, 2021 • edited Loading

yossigil commented Sep 13, 2021

OfirMarkowitz1 commented Sep 14, 2021 • edited Loading

yossigil commented Oct 7, 2021

OfirMarkowitz1 commented Aug 2, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 2, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 2, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 2, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 2, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 3, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 3, 2021 •

edited

Loading

OfirMarkowitz1 commented Aug 9, 2021 •

edited

Loading

OfirMarkowitz1 commented Sep 12, 2021 •

edited

Loading

OfirMarkowitz1 commented Sep 14, 2021 •

edited

Loading