Adding a Base64 type #42

Korvox · 2017-08-23T04:39:20Z

Rust is at its best when you use strong typing to indicate meaning. While in essence a base64-encoded string is always a [u8], functions that want a base64-encoded string would often rather be more specific than just taking String, &str, [u8], Vec<u8>, etc. Hence a Base64<C> type representing an encoded string: it could have convenience methods for converting to and from the various string representations and collections, and it would let library authors be more explicit about wanting base64.

It also seems to me to be much more newbie-friendly to show them an API like:

fn send(data: Base64<UrlSafe>)...

Instead of just asking for a string and saying it should be base64:

// Please give me a base64 encoded url-safe string!
fn send(data: String)...

There is certainly an argument that publicly facing APIs shouldn't ask for data in base64 and should do that conversion internally. But even internal to consumer libraries, having concrete types is still very useful for readability and maintainability: otherwise you encode something somewhere and then have to keep annotating your uses of it in other functions and structs as being base64.

This is similar to how Url and Uri types work in other crates. They use Strings or Vecs internally but expose a type to avoid ambiguity when handling the value, and then provide a range of convenience Into and From conversions to coerce them into the standard types.

The <C> generic is so that you can require the Default or UrlSafe encoding scheme; otherwise you have to manually inspect any unsanitized input string you cannot perfectly trust to check for illegal characters. Another option would be to have Base64 be an enum of Default and UrlSafe, but that turns the check into a runtime match on the variant, which is cumbersome.
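As a minimal sketch of the <C> idea, assuming nothing from the base64 crate itself: the names Alphabet, Standard, UrlSafe, and this Base64<C> are hypothetical, and the encoder is a small padded-only, standard-library-only stand-in. Standard plays the role of the Default scheme mentioned above.

```rust
use std::marker::PhantomData;

/// Compile-time description of a base64 alphabet (hypothetical trait).
pub trait Alphabet {
    const CHARS: &'static [u8; 64];
}

/// Marker for the standard alphabet (`+` and `/`).
pub struct Standard;
/// Marker for the URL-safe alphabet (`-` and `_`).
pub struct UrlSafe;

impl Alphabet for Standard {
    const CHARS: &'static [u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
}

impl Alphabet for UrlSafe {
    const CHARS: &'static [u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
}

/// Encoded text tagged with its alphabet. The marker is zero-sized, so the
/// wrapper is no larger than the `String` it holds.
pub struct Base64<C: Alphabet> {
    text: String,
    _alphabet: PhantomData<C>,
}

impl<C: Alphabet> Base64<C> {
    /// Encodes `bytes` with `=` padding, using the alphabet chosen by `C`.
    pub fn encode(bytes: &[u8]) -> Self {
        let mut text = String::with_capacity((bytes.len() + 2) / 3 * 4);
        for chunk in bytes.chunks(3) {
            // Pack up to three bytes into the low 24 bits of `n`.
            let n = chunk
                .iter()
                .enumerate()
                .fold(0u32, |acc, (i, &b)| acc | ((b as u32) << (16 - 8 * i)));
            // A chunk of k bytes yields k + 1 symbols, then `=` padding.
            for i in 0..4 {
                if i <= chunk.len() {
                    let idx = (n >> (18 - 6 * i)) & 0x3f;
                    text.push(C::CHARS[idx as usize] as char);
                } else {
                    text.push('=');
                }
            }
        }
        Base64 { text, _alphabet: PhantomData }
    }

    pub fn as_str(&self) -> &str {
        &self.text
    }
}

/// Only URL-safe values are accepted. Passing a `Base64<Standard>` here is a
/// compile-time type error rather than a runtime check.
pub fn send(data: &Base64<UrlSafe>) -> usize {
    data.as_str().len()
}
```

With this shape, `send(&Base64::<UrlSafe>::encode(bytes))` compiles while `send(&Base64::<Standard>::encode(bytes))` does not, which is the difference from the enum variant: no match is needed at the call site.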

A related issue is #17: an actual Base64 type would be the obvious place to implement the specialized display.

I wouldn't mind forking and trying to draft out a Base64 type if you are at all interested!


marshallpierce · 2017-08-23T23:18:13Z

It's an interesting idea, though I think I'd have to see some examples of it in use to get a better idea of how the API should be structured. (For that matter, I'm not sure it even needs to be part of this crate.)

If you had the opportunity to create a Base64<T> from some bytes and hand it off to something else, wouldn't you be better off refactoring that destination to accept a &[u8] rather than a Base64<T>? In other words, I suspect the common case would be to refactor send() to just take a &[u8] and let it encode in the correct fashion (though wanting to pass in a cached copy of commonly used base64 would be a use case for a Base64-as-a-type solution). I think the opportunities to have a type-system-guaranteed handoff from something that a base64 encoding routine could emit to something that cares about its encoded form (presumably because it wants to decode it) may be few and far between. But I could well be missing something...

Also, it's not just as simple as Default and UrlSafe: there's also padding and wrapping.

Korvox · 2017-08-23T23:56:20Z

> Also, it's not just as simple as Default and UrlSafe: there's also padding and wrapping.

You can infer from the Base64 you get whether it's padded or wrapped; it's apparent in the structure. You cannot tell, however, whether it was meant to be UrlSafe or not if the string never uses the Default or UrlSafe substitution characters. That being said, whether that is even valuable knowledge is fairly subjective! It would be really strange to put the owner of a Base64 object in a situation where appending two Base64s raises no conflict even though they used two different encoding schemes (albeit in the same vein, appending Base64s where one is padded / wrapped and one is not would also require handling).

> you be better off refactoring that destination to accept a &[u8]

That is the whole argument. Right now it is perfectly normal when using base64 to pass around the encoded data as any of the various forms of u8 containers. But sending around any of those containers loses information, because the data structure doesn't enforce its interpretation as base64.

> For that matter, I'm not sure it even needs to be part of this crate.

Absolutely agreed. I think this weekend I'll try impl'ing this as a separate crate. If it becomes a popular UX pattern it could be merged any time after that.

marshallpierce · 2017-08-24T02:50:07Z

One small point -- you can't necessarily infer padding or wrapping. You can't infer padding, because you might only ever see input that has length % 4 == 0, and you can't infer wrapping, because you might only ever see input shorter than whatever wrapping length you cared about.
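To make the ambiguity concrete, here is a small sketch using only the standard library. The function and enum names are made up for illustration, and the code assumes unwrapped ASCII input without validating it. It reports what the characters and length of an encoded string can and cannot tell you.

```rust
/// What the symbols of an encoded string reveal about its alphabet.
#[derive(Debug, PartialEq)]
pub enum AlphabetHint {
    /// Contains `+` or `/`.
    Standard,
    /// Contains `-` or `_`.
    UrlSafe,
    /// Contains neither, so either alphabet would have produced it.
    Ambiguous,
    /// Contains both kinds, so it is valid in neither alphabet.
    Mixed,
}

/// What the shape of an encoded string reveals about padding.
#[derive(Debug, PartialEq)]
pub enum PaddingHint {
    /// Ends with `=`.
    Padded,
    /// Length is not a multiple of 4, so padding was left off.
    Unpadded,
    /// Length is a multiple of 4 with no `=`: padded and unpadded
    /// encoders emit exactly the same text here.
    Ambiguous,
}

pub fn alphabet_hint(encoded: &str) -> AlphabetHint {
    let has_std = encoded.contains(|c: char| c == '+' || c == '/');
    let has_url = encoded.contains(|c: char| c == '-' || c == '_');
    match (has_std, has_url) {
        (true, false) => AlphabetHint::Standard,
        (false, true) => AlphabetHint::UrlSafe,
        (false, false) => AlphabetHint::Ambiguous,
        (true, true) => AlphabetHint::Mixed,
    }
}

pub fn padding_hint(encoded: &str) -> PaddingHint {
    if encoded.ends_with('=') {
        PaddingHint::Padded
    } else if encoded.len() % 4 != 0 {
        PaddingHint::Unpadded
    } else {
        PaddingHint::Ambiguous
    }
}
```

For example, "dGVz" (the encoding of the bytes "tes") is ambiguous on both counts, which is why a configuration carried in the type says more than the text alone.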

> Right now it is perfectly normal when using base64 to pass around the encoded data as any of the various forms of u8 containers. But sending around any of those containers is information lossy because the data structure doesn't enforce its interpretation as base64.

I was referring to passing around the data that you would get if you decoded the base64, not the ASCII bytes that could be interpreted as base64.

Anyway, I'm curious to see what you come up with.

Korvox · 2017-09-03T02:24:39Z

Almost forgot to do this! Just threw together an example over here. [1]

You are right that there is no really intuitive way to persist the config data in the struct without allocations (configs would have to be zero-sized PhantomData), and you need it around all the way for decoding and for validation.

That being said, I think I discovered why having a Base64 type is really useful. It makes the whole interaction error-free. You can only construct a Base64 with bytes, the encoding cannot fail, and thus any Base64 that exists must be encoded and thus decoding cannot fail. That is where strong typing is really useful, and eliminates the entire error class surrounding trying to decode unencoded data.
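The guarantee described above can be sketched as follows. This is not the linked example: the module name b64, the method names, and the hand-rolled standard-alphabet codec are assumptions made so the block runs on the standard library alone. The field is private to the module, so from_bytes is the only way to obtain a value, and decode can return a plain Vec<u8> instead of a Result.

```rust
pub mod b64 {
    const CHARS: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    /// Padded, standard-alphabet base64 text. The field is private, so code
    /// outside this module cannot wrap an arbitrary `String`.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Base64(String);

    impl Base64 {
        /// Encoding accepts any bytes and cannot fail.
        pub fn from_bytes(bytes: &[u8]) -> Self {
            let mut text = String::with_capacity((bytes.len() + 2) / 3 * 4);
            for chunk in bytes.chunks(3) {
                // Pack up to three bytes into the low 24 bits of `n`.
                let n = chunk
                    .iter()
                    .enumerate()
                    .fold(0u32, |acc, (i, &b)| acc | ((b as u32) << (16 - 8 * i)));
                // A chunk of k bytes yields k + 1 symbols, then `=` padding.
                for i in 0..4 {
                    if i <= chunk.len() {
                        text.push(CHARS[((n >> (18 - 6 * i)) & 0x3f) as usize] as char);
                    } else {
                        text.push('=');
                    }
                }
            }
            Base64(text)
        }

        pub fn as_str(&self) -> &str {
            &self.0
        }

        /// Decoding returns the bytes directly: the text is valid by
        /// construction, so there is no error case to report.
        pub fn decode(&self) -> Vec<u8> {
            let mut out = Vec::with_capacity(self.0.len() / 4 * 3);
            for quad in self.0.as_bytes().chunks(4) {
                let mut n = 0u32;
                let mut symbols = 0;
                for &c in quad.iter().filter(|&&c| c != b'=') {
                    // Linear lookup keeps the sketch short; a real decoder
                    // would use a table.
                    let value = CHARS
                        .iter()
                        .position(|&x| x == c)
                        .expect("invariant: text was produced by from_bytes");
                    n = (n << 6) | value as u32;
                    symbols += 1;
                }
                // Left-align the bits as if all four symbols were present.
                n <<= 6 * (4 - symbols);
                // k symbols carry k - 1 whole bytes.
                for i in 0..symbols - 1 {
                    out.push((n >> (16 - 8 * i)) as u8);
                }
            }
            out
        }
    }
}
```

The expect inside decode plays the same role as the unwrap mentioned in the footnote: it guards an invariant rather than handling a reachable error.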

The only piece that is missing is deserialization for this type that validates it's actually base64 when being passed generic data. I'll try to do that soon™. That would be where you want the conditionality. For libraries in Rust, passing around base64 as a concrete type with decode guarantees is a real ergonomic win in my book.
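One possible shape for that validating entry point, as a sketch only: parse and InvalidBase64 are hypothetical names, the check covers just the padded standard alphabet, and it deliberately does not verify that unused trailing bits are zero. It is the single fallible step, after which the value can be passed around as already checked.

```rust
/// Reasons untrusted text is rejected (hypothetical error type).
#[derive(Debug, PartialEq)]
pub enum InvalidBase64 {
    /// Padded base64 always has a length that is a multiple of 4.
    BadLength(usize),
    /// A byte outside the standard alphabet, or a `=` before the end.
    BadSymbol { index: usize },
    /// More than two trailing `=` characters.
    BadPadding,
}

/// Text that has passed validation as padded, standard-alphabet base64.
#[derive(Debug, PartialEq)]
pub struct Base64(String);

impl Base64 {
    /// The fallible constructor for data that arrives from outside.
    pub fn parse(text: &str) -> Result<Self, InvalidBase64> {
        let bytes = text.as_bytes();
        if bytes.len() % 4 != 0 {
            return Err(InvalidBase64::BadLength(bytes.len()));
        }
        // Count the run of `=` at the end; at most two are legal.
        let pad = bytes.iter().rev().take_while(|&&b| b == b'=').count();
        if pad > 2 {
            return Err(InvalidBase64::BadPadding);
        }
        // Everything before the padding must be an alphabet symbol.
        let body = &bytes[..bytes.len() - pad];
        for (index, &b) in body.iter().enumerate() {
            let ok = b.is_ascii_alphanumeric() || b == b'+' || b == b'/';
            if !ok {
                return Err(InvalidBase64::BadSymbol { index });
            }
        }
        Ok(Base64(text.to_owned()))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

A deserializer for the type could call a function like this on the incoming string and surface the error, so the conditional handling lives at the boundary and nowhere else.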

[1] There are some hacks here around the public base64 api, like unwrapping the decode.

shaleh · 2017-12-24T07:45:44Z

As a Haskell hacker too, I agree. Strong types communicate intent as well as contracts.

AlexanderThaller · 2018-04-09T00:44:57Z

Would also be nice when using serde. Something like this comes to mind:

[
  {
    "CreateIndex": 100,
    "ModifyIndex": 200,
    "LockIndex": 200,
    "Key": "zip",
    "Flags": 0,
    "Value": "dGVzdA==",
    "Session": "adf4238a-882b-9ddc-4a9d-5b6758e4159e"
  }
]

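use serde::{Deserialize, Serialize};

// `Base64` here is the hypothetical type proposed in this issue.
// `rename_all` maps the snake_case fields to the PascalCase JSON keys above.
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "PascalCase")]
struct RootInterface {
  create_index: i64,
  modify_index: i64,
  lock_index: i64,
  key: String,
  flags: i64,
  value: Base64,
  session: String,
}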

marshallpierce · 2018-04-09T02:58:47Z

Interesting. I do like the idea in general of having less magic in a custom serializer and more explicit DTOs. Perhaps there's room for that in the base64-serde crate.

ggriffiniii · 2019-07-28T15:48:33Z

I realize this is an ancient issue, but I thought I would mention that I implemented an alternative base64 crate that may be a slightly better fit for a strongly typed concept like this. It's called radix64, and the big difference that may help here is that each configuration is a distinct type: Std, StdNoPad, UrlSafe, UrlSafeNoPad, Crypt. Having the type of encoding as part of the Rust type seems like it would benefit this concept, so I just thought I would mention it.

Korvox changed the title ~~Adding a Base64<C> type~~ Adding a Base64 type Sep 3, 2017
