Clarify behavior of threading macros with fn, short-fn, etc #1478

Closed
wants to merge 1 commit into from

na-sa-do

See #1477.

@ianthehenry
Contributor

To respond to a comment on the previous issue:

I think "working as intended" is a bit strong, given that I can't imagine anyone thought of this exact outcome as desirable, but I'll grant that it isn't a bug.

I think it's very desirable! -> is for rearranging forms, and it rearranged the forms you gave it. If you want to work with expressions-that-evaluate-to-functions, you can wrap them in parentheses, like (-> 3 (|(+ $ 1))), so that the rearranging works in your favor. (But -> is probably just not the right tool for this job.)

This comment doesn't clarify that, though. There is nothing about fn or short-fn that makes -> behave poorly or differently than any other form. Any expression that evaluates to a function (except for a symbol literal) is incompatible with ->, because it doesn't take functions as arguments. as-> is a solution for the particular case you presented in your previous issue, but it doesn't really make sense as a general recommendation.

I get the impression that you're thinking of -> as "a function that takes other functions as arguments and calls them in order," which is absolutely something you can write, but it's just not what -> is.
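For contrast, here is a minimal sketch of the kind of function described above: one that really does take functions as arguments and calls them in order. The name pipe is made up for this example and is not part of Janet's core library.

```janet
# Hypothetical helper, not part of Janet's core library.
# Because pipe is an ordinary function, its arguments are evaluated first,
# so fn and short-fn expressions arrive as real function values.
(defn pipe
  [x & fns]
  (reduce (fn [acc f] (f acc)) x fns))

# (3 + 1) * 2
(pipe 3 |(+ $ 1) (fn [n] (* n 2)))
```

Unlike ->, this cannot thread a value into the middle of an existing call such as (string/split "," $); each step has to be a one-argument function.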

Member

@pepe pepe left a comment

👎🏾

@sogaiu
Contributor

sogaiu commented Jul 26, 2024

I don't know if Clojure was the originator of -> (and friends such as ->>) but FWIW there is this page that has user-contributions (and comments) from over the years regarding it.

Of note might be this bit:

;; Be cautious with anonymous functions; they must be wrapped in an outer
;; pair of parens.
(-> 10
    #(/ % 2))
;; will throw an exception, but
(-> 10
    (#(/ % 2)))
;; will work fine. Similarly,
(-> 10
    (fn [n] (/ n 2)))
;; will throw an exception, but
(-> 10
    ((fn [n] (/ n 2))))
;; works as intended.

Janet's |(...) construct is mostly analogous [1] to Clojure's #(...) construct.

So in Janet we'd have:

$ janet
Janet 1.35.2-872b39cc linux/x64/gcc - '(doc)' for help
repl:1:> (-> 10 (|(/ $ 2)))
5
repl:2:> (-> 10 ((fn [n] (/ n 2))))
5

Note that Clojure's docstring for -> is:

clojure.core/->
([x & forms])
Macro
  Threads the expr through the forms. Inserts x as the
  second item in the first form, making a list of it if it is not a
  list already. If there are more forms, inserts the first form as the
  second item in second form, etc

...and the "unexpected" behavior got documented by users elsewhere.

FWIW, there is a somewhat similar site for Janet here with a page for -> here [2].

There have been remarks made in the past along the lines of "memory isn't free", so although I can relate to wanting a place to look for gotchas and the like, I'm not sure the docstrings are the best place for them.


[1] In Janet you can also do stuff like |[:a :b], which IIUC, isn't doable in Clojure.

[2] I went ahead and added a couple of examples, but others are ofc free to do similarly :)

@na-sa-do
Author

I think it's very desirable!

@ianthehenry, I'm not saying that -> in general is broken (anymore). I'm saying that nobody would have foreseen this precise case and thought of the specific result as a good thing. It's a natural consequence of the actual semantics of -> and co, which I forgot, and those semantics are generally useful (...presumably), but that doesn't mean this specific interaction is good. It's just inevitable.

I think in my case, the root of the issue is my familiarity with other languages' "pipeline operator", which behaves as I expected -> to. But, IMO, it's likely to recur as long as -> special-cases bare symbols. Yes, that sounds unrelated; bear with me for a moment. Imagine that you're unfamiliar with ->, and you're skimming through a codebase which uses it. If it happens to look like this:

(-> value
    some-func
    another-func
    yet-another-func)

then you would naturally intuit that this was a function that takes functions, because that's exactly what it looks like. Even if you look up the docs and read about it, first impressions are a powerful thing. On the other hand, if it looks like this:

(-> value
    (some-func)
    (another-func)
    (yet-another-func))

then, assuming you know those functions' signatures, your first impression is that this code is outright wrong. You would then wonder why this code ever worked, look up -> in the docs, and take in its meaning correctly, with no faulty assumption ever entering the picture.

Programming language design is a kind of UI design. Yes, programming languages are specialist tools and should assume that their users are specialists, but that's no reason to make things that are, frankly, outright misleading for the sake of letting people who are already familiar with them shave off a few characters.


There have been remarks made in the past along the lines of "memory isn't free", so although I can relate to wanting a place to look for gotchas and the like, I'm not sure the docstrings are the best place for them.

If a few hundred bytes of memory are really that significant an issue, @sogaiu, then there should be an option to simply discard docstrings as code is loaded. That would save comparatively enormous amounts of memory, without forcing the docs to skimp on quality. Python does something similar, with its "optimizer" that (IIRC) just deletes docstrings from compiled bytecode and does nothing else.

@sogaiu
Contributor

sogaiu commented Jul 30, 2024

Regarding discarding / unbundling of docstrings, please see this issue. I don't think it's a bad idea but (depending on what that means precisely) it doesn't exist yet [1].

In case it isn't apparent, AFAIU, one of the use cases for Janet is for it to be embedded and so in that context size can make a difference. Including certain types of remarks in docstrings (vs excluding) can over time and space add up (there are a lot of built-in callables in Janet).

In the meantime, I think what was suggested above about placing certain types of clarifications elsewhere makes sense, though now, some of the concerns have been touched on at janetdocs. I'm sure additions would be welcome if what got added was insufficient.


[1] IIUC, janet can be built without docstrings using this. The resulting janet is considered non-standard though -- presumably it was considered better to have some sorts of docstrings in many cases so removing them outright is considered an extreme measure.

@CFiggers

CFiggers commented Jul 30, 2024

@na-sa-do I perceive that you are comfortable with direct communication. Permit me to engage in some.

It seems to me that you are struggling with the concept and behavior of the -> and ->> macros. And if I were to guess at the reason for that, I would guess that it is because your mental model of what macros are and how they function in general is underdeveloped.

You object above to the way that -> and ->> "special-case" symbols, because that (in your opinion) creates the impression of a function that takes functions as arguments. That impression, you say, could lead someone (or perhaps has led someone; has perhaps led you, just for example) to expect a function declaration (i.e. (fn [])) or anonymous function literal (i.e. |()) to work in the threading macro the same way as a symbol representing a function.

The fact of the matter is, a macro is not a function and the semantics of the two are subtly different.

To wit: macros do not operate upon the compiled state of their arguments, but upon the literal tokens that they are passed. In other words, at the moment when a macro executes there is no such thing as a "function," per se. At macro-execution time, there are only such things as values and data structures containing values.[0]

The entire purpose of macros is to enable rearrangement of tokens prior to compilation of those tokens. This allows macros to receive invalid Janet syntax and produce valid Janet syntax by following the macro's internal logic—every macro, from defn to tabseq, extends the space of valid Janet syntax through the use of a Domain-Specific Language (DSL) that is local to that macro (in a sense, each macro is itself a "domain" within which a specific and distinct DSL applies).

It is the established convention for -> and ->> to take two kinds of things as arguments, after the first one: lists and non-lists. That's it.

  • Given a list, the threading macros will insert the running expression (beginning with their first argument) as either the second or last item in the list. That form, that unevaluated form, then becomes the running expression that the threading macros will either return or continue on to the next form with.
  • Given anything that is not a list, the threading macros will first wrap that form in parentheses, i.e. make it into a list, and then handle it the same way as above.
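The two rules above can be sketched as a small macro. This is a simplified illustration, not Janet's actual source for ->: the names my-> and thread-into are made up for the example, and only the second-item behavior of -> is covered (->> would place the running expression last instead).

```janet
# Hypothetical re-implementation for illustration only; not the core `->`.
(defmacro my->
  [x & forms]
  (defn thread-into [running form]
    (if (= :tuple (type form))
      # A list: insert the running expression as its second item.
      (tuple (first form) running ;(tuple/slice form 1))
      # Anything else: wrap it in parentheses around the running expression.
      (tuple form running)))
  # Nothing is evaluated here; the result is just a rearranged form.
  (reduce thread-into x forms))

(my-> 10 (/ 2) inc)    # expands to (inc (/ 10 2))
(my-> :a {:a 1 :b 2})  # expands to ({:a 1 :b 2} :a)
```

Note that the only test the sketch performs is "is this form a tuple?". It never asks whether a form will evaluate to a function, which is why (fn ...) and |(...) forms are threaded into like any other list.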

At macro execution time a form like (fn [x] (* x x)) is not a function. At macro execution time there is no such thing as a function.[0] The macro "sees" such a form as a list and threads the running expression into it as it would any other list.

It is only after macro execution time, when the form has been re-written and then inserted back in place, that any evaluation or function application happens—the final threaded form is what is handed to the compiler, at which point function calls can start to happen as usual.

You accuse -> and ->> of "special-casing" symbols, as though symbols alone are being treated in an exceptional way. But in fact, the way -> and ->> treat symbols is perfectly consistent with the way they treat all other non-list forms. Such behavior is not limited to symbols. Let's look at some examples:

Structs and tables are not lists. That means that the -> and ->> macros will wrap them in parentheses and then thread the running expression into them as normal. That means dictionaries can be used directly as valid forms in a threading macro, which I find neat:

Janet 1.35.2-872b39cc linux/aarch64/clang - '(doc)' for help
repl:1:> (-> :a {:a 1 :b 2})
1
repl:2:> (-> :a {:a :z :b :y} {:y 1 :z 2})
2

Keywords similarly are not lists. That means that the -> and ->> macros will wrap them in parentheses and then thread the running expression into them as normal. That means that method call syntax can be chained in a threading macro no problem:

Janet 1.35.2-872b39cc linux/aarch64/clang - '(doc)' for help
repl:1:> (def Obj @{:value 1
repl:2:({> :triple (fn [self]
repl:3:({(> (* 3 (self :value)))})
@{:triple <function 0x006EE68B1F90> :value 1}
repl:4:> (-> Obj :triple)
3

Add to this the known case of symbols that you've brought up. Symbols are not lists. That means that the -> and ->> macros will wrap them in parentheses and then thread the running expression into them as normal. This means that symbols that are bound to function definitions in the environment table can be used directly in the threading macros without needing to wrap them in parentheses (though you can also do exactly that and it works just as well).

Once again: you have accused -> and ->> of "special-casing" symbols, as though symbols alone are being treated in an exceptional way. But in fact, the way -> and ->> treat symbols is perfectly consistent with the way they treat all other non-list forms. To make symbols behave any differently would be the inconsistent behavior.

Supposing that the programmer has an accurate mental model of what macros are and how they operate, this simply is not strange, ambiguous, or confusing. What else could a threading macro do with a symbol or a keyword or a dictionary, if not make it capable of being threaded into by wrapping it in parentheses? And what else could a threading macro do with a form like (fn []), which is clearly a list, if not insert the running expression into it like it would any other list?

So then, finally, to your argument that treating bare symbols this way creates a footgun for new programmers who are just being introduced to the language. Your opinion seems to be that this makes learning the language more difficult than necessary. On the contrary, I say: the -> and ->> macros are indeed footguns (small ones, in my opinion), but they are footguns that make it easier for new programmers to learn the language. These macros are ideal case studies for beginner Janet programmers to organically discover, as you have, that macros are different from functions and be challenged to learn how and why.

Do -> and ->> have sharp edges on them? Yes, they do (small ones, in my opinion). And much like a utility knife, the sharp edge is the useful part.


[0] Technically I'm lying a tiny bit here—functions do exist at macro execution time, but they exist as values and have no special behavior any different from any other value—say, a number, a symbol, or a keyword.

@na-sa-do
Author

It seems to me that you are struggling with the concept and behavior of the -> and ->> macros. And if I were to guess at the reason for that, I would guess that it is because your mental model of what macros are and how they function in general is underdeveloped.

I will grant that I was mistaken, but not about that. I'm already familiar with macros in other languages that have them, most notably Rust. However, there's a critical difference between Rust macros and Janet macros (or Lisp-like languages more generally), which I'd not realized until just now. Consider this Janet code.

(foo bar)

Can you tell me whether foo is a function or a macro? No, you can't, because in Janet (and other Lisps), macros are indistinguishable from functions at the use site. On the other hand, in Rust, the two cases are spelled differently:

foo(bar); // function
foo!(bar); // macro

The exclamation mark serves as an unambiguous signal that you are exiting the realm of normal Rust and entering a DSL of some description. Here be dragons, etc. In the Rust community, convention is to always include the ! in a macro's name, even in conversation, making the distinction even clearer (although the language itself doesn't always adhere to this principle -- when importing a macro, you don't use the !).

You accuse -> and ->> of "special-casing" symbols, as though symbols alone are being treated in an exceptional way. But in fact, the way -> and ->> treat symbols is perfectly consistent with the way they treat all other non-list forms.

Once again, I will grant that I was mistaken, but it doesn't change my argument. Consider a Janet in which... hang on, what's the function to append to arrays...

Hold on a second. I was about to use a hypothetical example in which the function for concatenating arrays and the function for adding one element to the end of an array were merged together, and it guessed which sense you meant based on whether it was passed an array or not, thus forcing every call site to explicitly guard against whichever behavior they didn't mean or else become ambiguous between two operations that are obviously distinct and should not be conflated. But, um,

janet:1:> (array/concat @[] 1)
@[1]
janet:2:> (array/concat @[] [1])
@[1]

You live like this?

What else could a threading macro do with a symbol or a keyword or a dictionary, if not make it capable of being threaded into by wrapping it in parentheses?

Not do that, and instead fail with a useful error message.

And what else could a threading macro do with a form like (fn []), which is clearly a list, if not insert the running expression into it like it would any other list?

Notice that fn is a special form, and blindly injecting an expression at its beginning is guaranteed to result in utter nonsense that will inevitably fail with a bizarre non-sequitur of an error message, and so not do that, and instead fail with a useful error message.

(Okay, in the very specific case where the fn form is the first of the forms to be injected into, and the initial expression being injected is a valid destructuring pattern, it will technically produce something that will compile. But it's almost certainly not what you meant, and if you did mean that it would be much clearer to write the whole fn form as the initial expression being injected rather than splitting it like that.)
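A small sketch of that corner case, assuming the core -> inserts the initial expression as the second item of the fn form as described earlier in the thread. The parameter name x is arbitrary.

```janet
# The initial expression [x] lands where fn expects its parameter list,
# so this expands to (fn [x] (* x x)).
(def square (-> [x] (fn (* x x))))

(square 4)

# By contrast, (-> 10 (fn [n] (/ n 2))) expands to (fn 10 [n] (/ n 2)),
# which the compiler rejects.
```

Writing (fn [x] (* x x)) directly says the same thing far more clearly.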

@bakpakin
Member

Closing this as the change is overly pedantic and possibly confusing. Using macex1 should give an expansion that behaves exactly as you would expect.
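As a rough illustration of the macex1 suggestion, the calls below expand the two spellings discussed in this thread without evaluating them. The shapes in the comments follow from the reader turning |(...) into a (short-fn ...) form before -> sees it.

```janet
# Bare short-fn: 3 is threaded into the short-fn form itself.
(pp (macex1 '(-> 3 |(+ $ 1))))    # (short-fn 3 (+ $ 1))

# Wrapped in an extra pair of parentheses: the short-fn form stays intact
# and ends up in call position with 3 as its argument.
(pp (macex1 '(-> 3 (|(+ $ 1)))))  # ((short-fn (+ $ 1)) 3)
```

Only the second expansion is a call to the anonymous function, which is why the extra parentheses make the difference.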

@bakpakin bakpakin closed this Jul 31, 2024
@na-sa-do na-sa-do deleted the threading-macro-docs branch August 4, 2024 03:10
6 participants