[DONE] How do we refactor our errors? #636
Replies: 9 comments 13 replies
-
In #460 we also discussed introducing a |
Beta Was this translation helpful? Give feedback.
-
Here are a couple of libraries that use
I also find the pattern of using a combination of |
Beta Was this translation helpful? Give feedback.
-
An error type that fits the description in strategy 3 is built in this blog post by Jane Lusby from the Rust error handling working group. |
Beta Was this translation helpful? Give feedback.
-
With regards to the question: How do we avoid introducing irrelevant information when these errors cross module and/or crate boundaries? So suppose we have a module foo with an error |
Beta Was this translation helpful? Give feedback.
-
Generally, 1 and 2 seem good to me. I wouldn't be as strict to prescribe only using One of the most important parts to me is how wrapping errors works. In the account you can come across a wide variety of errors, from client, to storage, to DID document problems ("verification method not found", ...). These will currently be our highest-level errors exposed to developers and so the guideline should work well for those. From what I understand snafu should make that wrapping process relatively simple. Snafu seems to work best when errors are wrapped all the way to the source, in order to get a "semantic backtrace", as it was described in one of the blog posts. That seems very helpful for debugging and logging, at the cost of making the errors rather large. The consequence seems to be that if you have a relatively fragmented error landscape on the lower-levels, then the higher-level would have as many variants as there are fragmented errors. It is a downside in so far, as the developer of the higher-level needs to write a lot of wrapping code. The upside is that it is very explicit and no information is lost. Basically, what you meant with "How do we avoid introducing irrelevant information when these errors cross module and/or crate boundaries?".
I think we fundamentally want our errors to be matchable and informative without throwing errors that cannot occur, and this approach achieves both, so overall I think it's worth giving it a shot.
We can write a custom report function that produces a nice representation for JS. There's an example towards the end of that blog you linked. For guideline 3 I don't see how this is better than what we have now. Guideline 4 is also interesting, since we map the lower-level errors to instances of the higher-level module. Still, we have way more errors to handle than a function can possibly return, and this seemed like one of the main goals of the refactor to me. TL;DR: The snafu solution seems like the best trade-off, even if it's not perfect. |
Beta Was this translation helpful? Give feedback.
-
I feel that 3 is the best approach for most people to rely on, it provides a quick hint on what went wrong and then the user can see from there. In my experience in working on Identity Suite I ran into a weird issue where I had accidentally deleted the line which signed and published my updated DID with a few methods added to it and I remember that the error it gave was so vague that I could not find what was going wrong for quite a long time. If we have concise output that makes it extremely simple for most consumers of the library to deduce what went wrong then I think there is nothing better than that approach. Would love to hear your thoughts on this :) |
Beta Was this translation helpful? Give feedback.
-
It's rather difficult to analyse any of the given strategies without concrete examples of what the code would look like and how that would affect ergonomics and development "pain", both internally and for consumers of the library. There's quite a lot of background and context missing from previous discussions so I'll only comment on the approaches presented. Strategy 1
If you take a look at that guide, it's from over two years ago, which is around the same time As @PhilippGackstatter mentioned, several of the guidelines such as using I don't see any significant benefits to Example from https://nick.groenen.me/posts/rust-error-handling/ use thiserror::Error;
/// WordCountError enumerates all possible errors returned by this library.
#[derive(Error, Debug)]
pub enum WordCountError {
/// Represents an empty source. For example, an empty text file being given
/// as input to `count_words()`.
#[error("Source contains no data")]
EmptySource,
/// Represents a failure to read from input.
#[error("Read error")]
ReadError { source: std::io::Error },
/// Represents all other cases of `std::io::Error`.
#[error(transparent)]
IOError(#[from] std::io::Error),
}
This is a problem if we ever aim to provide a stable API, which includes not exposing errors from pre-1.0 crates, so I don't think This alone would be reason enough not to switch from Strategy 2
The advantage of module-level errors is specificity in that the possible errors are more defined, but has the disadvantage of being more unwieldy to developers using our library, since they have to handle multiple error types (unless they use I.e. module-level errors may hinder library developers that use identity.rs due to multiple exposed types but it probably won't affect application developers using opaque errors much. That said, I have already used a module-level error where it made sense (and due to required trait bounds) with #[derive(Debug, thiserror::Error, strum::IntoStaticStr)]
#[non_exhaustive]
pub enum DIDError {
#[error("Invalid Authority")]
InvalidAuthority,
#[error("Invalid Fragment")]
InvalidFragment,
#[error("Invalid Method Id")]
InvalidMethodId,
#[error("Invalid Method Name")]
InvalidMethodName,
#[error("Invalid Path")]
InvalidPath,
#[error("Invalid Query")]
InvalidQuery,
#[error("Invalid Scheme")]
InvalidScheme,
#[error("{0}")]
Other(&'static str),
}
I'm pretty sure the ecosystem is aligned on this: as a library, we definitely should not use opaque error crates like Strategy 3
While I have seen some crates like I also foresee long arguments about what the
I don't think implementing custom methods to return context aligns with the current error handling ecosystem when there are existing approaches like
Typically we just format them in a string if we want to add that information to the error message. Strategy 4I am most aligned with improving our current errors.
I'm not convinced the one-error-per-module approach is a significant improvement either though (see my Strategy 2 comment above). We would go from ~6 error types to more than two dozen quite easily. So what should we do?In my opinion, we should:
E.g. #[error("Invalid Document - Missing Message Id")]
InvalidDocumentMessageId,
#[error("Invalid Document - Signing Verification Method Type Not Supported")]
InvalidDocumentSigningMethodType, can become: #[error("invalid DID Document - {0}")]
InvalidDocument(&'static str), // or String if dynamic values are required. return Err(Error::InvalidDocument("missing message ID"));
return Err(Error::InvalidDocument("signing verification method type not supported"));
E.g. #[error("Invalid Root Document")]
InvalidRootDocument, needs more context: #[error("invalid root document - {0}")]
InvalidRootDocument(&'static str), // The previous message id must be null.
if !document.metadata.previous_message_id.is_null() {
return Err(Error::InvalidRootDocument("not first in chain due to non-null previous_message_id"));
}
// Validate the hash of the public key matches the DID tag.
let signature: &Signature = document.try_signature()?;
let method: &IotaVerificationMethod = document.try_resolve_method(signature)?;
let public: PublicKey = method.key_data().try_decode()?.into();
if document.id().tag() != IotaDID::encode_key(public.as_ref()) {
return Err(Error::InvalidRootDocument("signature verification method does not match DID tag"));
}
Specifically this comment which defines the project's official guidance as:
Suggestions 1-5 can be done incrementally and do not require a major all-at-once refactor, so we can easily explore whether or not they work in a single crate/module independently or combined. Suggestion 6 is confined to the Wasm bindings where we have more leeway. With regards to multiple module-level errors vs crate-level errors, I'm still undecided. See comment under Strategy 2 above. I would not support a massive refactor to module-level errors but if a proof-of-concept proves successful I would be happy to be proven wrong and go with an incremental refactor. |
Beta Was this translation helpful? Give feedback.
-
Regarding to the WASM bindings, to my understanding the bindings will somehow consume the errors and convert them to JavaScript errors. This is done now by throwing an For example the following code: try {
await account.createService({
fragment: "my-service-1",
type: "MyCustomService",
endpoint: "invalid-url"
})
}catch(e){
if(e instanceof Error){
console.log(e.name)
console.log(e.message)
}
} will print:
Although the error It might be helpful in this case if JS errors are more categorized. For example the try {
...
account.publish()
}catch(e){
if(e instanceof Error && e.name === "TangleError"){
ShowTangleStateError(e.message);
} else{
...
}
} In this case the developer doesn't care about what happened exactly as soon as it has something to do with the tangle, so maybe they can continue by checking their Tangle node or show an error screen to the user. Also one thing to keep in mind is using custom error types instad of |
Beta Was this translation helpful? Give feedback.
-
We settled on implementing Strategy 5 and re-evaluating if more should be done to improve error handling in the JS bindings once this is done. Progress on implementing the new strategy will be tracked in issue #534. |
Beta Was this translation helpful? Give feedback.
-
Introduction
In discussion #460 we discussed some of the pitfalls of our current error types and some ideas about how we can improve the situation.
This new discussion is about settling on a concrete strategy that we follow.
Here are some ideas:
Guidelines strategy 1:
Follow the same guidelines as influxdb_iox.
The aforementioned guidelines mention the pros of this approach so I will only list some cons (for identity here).
Cons:
snafu
(which is not as widely used asthiserror
andanyhow/eyre
) to everyone's dependencies if we go for this approach. Furthermoresnafu
is not yet stable (current version is 0.7) and thus subject to change.Open questions:
Guidelines strategy 2:
Similar to 1, but not necessarily using
snafu
and not necessarily as strict.eyre:Report
/anyhow::Error
or introduce our own.Report
type in errors overString/&'static str
in order to add more context at runtime (and optionally also a backtrace).Pros:
Cons:
snafu
than strategy 1.Open questions:
These are the same as in strategy 1.
Guidelines strategy 3:
Only use an abstract error type
IdentityError
that has a methodkind
that returns an enum of possible error categories (similar to https://doc.rust-lang.org/std/io/struct.Error.html#method.kind). TheErrorKind
enum should only have variants that broadly describe the kind of failure that occurred likeIO
,ConversionFailure
,MissingEntity
,InvalidProof
,InvalidData
,DuplicationAttempt
etc.Pros:
Wasm
bindings as it would allow theWasmError
to also provide the error category.Option<ContextDependentInformation>
in a non breaking manner.Cons:
Timestamp
, aDIDUrl
etc.) that may be used by callers when reacting to an error at runtime will be at best awkward.Guidelines strategy 4:
Try to improve the error types we already have. This would typically entail removing variants that represent errors from other crates. For instance these variants in
identity_account::Error
:should all be removed and then the internal code that receives these errors maps them to something that is a better fit in this context.
Furthermore we can no longer expose errors from non-stable crates in these enums so they need to be wrapped in opaque types.
Pros:
Cons:
Any thoughts on the proposed ideas or any other suggestions would be greatly appreciated.
Edit: Additional strategies after collecting feedback.
Guidelines strategy 5.
Following @cycraig's suggestion copied here:
DIDError
. (Edit: for future reference, the decision of when to define sub-error enums or inline the error message is likely based on how often that message is repeated and how many distinct variants there are. One instance => inline the message, multiple tends towards sub-error enums).E.g.
can become:
E.g.
needs more context:
source
where appropriate, to enable error chaining/context for application error handlers. This is supported by the discussions in the following Rust Error Handling Project issues:Specifically this comment which defines the project's official guidance as:
Replace blanket
#[from]
implementations of external errors with errors for specific cases as mentioned in Strategy 4. NOTE: this may actually increase the number of variants, so it does not have to be a hard rule.Mark exposed error enums as
#[non_exhaustive]
to prevent technical breaking changes from introducing new variants, similar toDIDError
.Consider using
anyhow
/eyre
to construct a report including context for error messages in the Wasm bindings. (Edit: to be clear this is because while Rust can report the context by usinganyhow
/eyre
, Wasm needs an alternative.)Suggestions 1-5 can be done incrementally and do not require a major all-at-once refactor, so we can easily explore whether or not they work in a single crate/module independently or combined. Suggestion 6 is confined to the Wasm bindings where we have more leeway.
Guidelines strategy 6:
Using something similar to strategy 3 for the javascript bindings and another strategy for Rust. This needs to be split into several sub-strategies we can call these W3-Rk where k is a number between 1 and 5 which then references the correspondingly numbered strategy above.
Note that this does not necessarily exclude also including more context in
message
as suggested in this quote from @cycraig :Pros:
name
might be in the JS bindings then knowing that it can be one of say 10 things is an improvement for many developers using the JS bindings (see @abdulmth's comment below).message
. We could even consider making the first line ofmessage
befull name: <current error.name>
.Cons:
I will directly quote @cycraig here:
Beta Was this translation helpful? Give feedback.
All reactions