Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds ion eq command #187

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Adds ion eq command #187

wants to merge 2 commits into from

Conversation

popematt
Copy link
Contributor

@popematt popematt commented Jan 7, 2025

Issue #, if available:

Description of changes:

Adds an ion eq command to test whether two Ion streams are equal.

I'm not entirely sure of the API. Currently, there are option flags to select an output method (bool to stdout, exitcode, or both), but we could also separate this into two different commands (eq and eq! that each support only one of the output methods) with minimal code changes.

In addition, I'm not entirely satisfied with the "input mode" selection, but we might be able to overlook that given that I've included an automatic input type detection that should correctly detect "file", "Ion", or "hex" for any non-trivial input, and most trivial inputs.

Additional Changes:

  • Exit Codes - clap uses 2 for a usage error. I have tried to reserve the 1 exit code for "mismatch" according to the convention in https://www.gnu.org/software/libc/manual/html_node/Exit-Status.html, so all other errors use exit code 3.
  • Added a step to display the versions of cargo, clippy, and rustfmt in the GHA workflow.
  • hex_reader.rs had changes that are entirely clippy and/or rustfmt changes.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@popematt popematt requested review from jobarr-amzn and zslayton and removed request for jobarr-amzn January 7, 2025 20:24
@popematt popematt changed the title Update eq command to have input mode auto-detection and multiple outp… Adds ion eq command Jan 7, 2025
Comment on lines +27 to +29
# Dump the toolchain versions to make it easier to spot when there's
# a mismatch with the version you have installed locally.
- run: cargo --version && cargo clippy --version && cargo fmt --version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thinking.


static HELP_EPILOGUE: LazyLock<String> = LazyLock::new(|| {
format!(
// '\' at the end of the line indicates that CLAP will handle the line wrapping.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a hard and fast rule. It's only true because I enabled the wrap_help feature in a prior PR.

src/bin/ion/commands/eq.rs Outdated Show resolved Hide resolved
Comment on lines +30 to +33
1. If the input is not valid UTF-8, it is assumed to be Ion binary
2. If the input is a path to an existing file, the input is assumed to be a file name
3. Then, eq attempts to parse the input as Ion
4. Then, eq attempts to parse the input as a stream of hex digit pairs
Copy link
Contributor

@zslayton zslayton Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm confident you've already thought through these cases, but I'll spell them out for other reviewers to mull.)

This is clever, but I think I'd prefer just explicitly flagging hex pairs or strings of Ion text. The inference would usually work, but there are weird corner cases that I'd rather not have to think about. For example:

  • If I try to read the file example_file and it's missing, the tool will consider the name to be the Ion symbol example_file. I think it would be better to surface an error when the file is missing. (Maybe I'm a sociopath for often making temporary files with no file extension like x or foo? 😅)
  • Many hex pair sequences that don't start with a binary IVM are legal text Ion streams. Consider 63 75 72 73 65 73 21, which is both the ascii sequence curses! and a text Ion 1.0 integer sequence.

For consistency/predictability, I'd like unadorned arguments to always be considered file names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider 63 75 72 73 65 73 21, which is both the ascii sequence curses! and a text Ion 1.0 integer sequence

...but curses! is not a valid Ion text stream, so there's no ambiguity.

I've explained in more detail here why I think that it's okay, but I'm fine with removing the automatic detection if the consensus is that it is unhelpful. (Although, would you consider having a simpler file-vs-Ion auto-detection that a user can set as their own default by setting an environment variable?)

Copy link
Contributor

@zslayton zslayton Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...but curses! is not a valid Ion text stream, so there's no ambiguity.

That's what I get for trying to be funny. 😅 Lop off the ! and the point still stands.

I did see the other explanation later in my review. I think the cost/value of automatic detection is pretty neutral--it saves a small amount of typing and introduces some small corner cases. Personally I'd like --text "my Ion string" and --hex "63 75 72 73 65 73".

(Although, would you consider having a simpler file-vs-Ion auto-detection that a user can set as their own default by setting an environment variable?)

Hm, maybe? I guess I'd have to see the design.

.action(ArgAction::Count),
Arg::new(HEX_INPUT_ARG_ID)
.help("Interprets an input as a string of hex digit pairs")
.short('x')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ion inspect currently uses --hex for hex input, which I like. Capital -X is our experimental flag, which is a subtle differentiation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah. There's some consistency things I need to clean up.

Comment on lines +72 to +75
.help("Indicates that an input is an Ion stream")
.short('i')
.help_heading("Input mode")
.action(ArgAction::Count),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Ion stream mean it's a string argument containing Ion text?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could also be Ion binary. Practically speaking, you can only pass in a string as a CLI arg (so it must be Ion text), but you could pass Ion binary via stdin if you're piping the output from another command.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Does -i mean that a string follows that is Ion text, or that the application is going to read from STDIN where it could find either text/binary? Maybe an example or two would help.

Copy link
Contributor Author

@popematt popematt Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usage: ion eq [OPTIONS] <input-a> [input-b]

There are always exactly two inputs for this command. The argument labelled <input-a> is always required. The argument labelled [input-b] is optional, but when it is omitted, the command reads from stdin to get the second input.

The "input mode" flags just indicate how to interpret the arguments.

So, -ii would mean that both inputs are Ion streams, -fi would mean that input-a is a file name, and input-b (or the content of stdin) is an Ion stream, -ff would mean that both are file names (regardless of where they come from).

I'm realizing now that this is fairly different than most of the existing commands. I initially implemented this differently, but I thought it was too verbose, so I tried to redesign the API to be more concise. I can resurrect my initial implementation is this is too unexpected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for explaining! My preference would be taking two inputs, where each can be:

  • --text "text Ion"
  • --hex "<hex pairs any Ion encoding>"
  • Unadorned positional filename argument
  • STDIN (implicit when only one input is specified, as you describe.)

I think this would work nicely in our other commands too. We can wait for others to weigh in, though!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm liking your recommendation @zslayton, though it never hurts to also support an explicit --file as well as the unadorned positional filename arguments.

.short('x')
.help_heading("Input mode")
.action(ArgAction::Count),
Arg::new("bool-output")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the default behavior should be both human-friendly output and a status code. diff returns status 1 when the inputs are not the same, along with text output:

image

I foresee using ion eq a lot on the command line and I don't want to have to write out -B or && echo $? to understand what happened. 😊

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, -E/--exit-code-output should always be the case and it should default to -B behavior with an optional --silent/-q to suppress stdout.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants