Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hash: add XOF interface #69518

Open
FiloSottile opened this issue Sep 18, 2024 · 18 comments
Open

hash: add XOF interface #69518

FiloSottile opened this issue Sep 18, 2024 · 18 comments
Labels
Proposal Proposal-Accepted Proposal-Crypto Proposal related to crypto packages or other security issues
Milestone

Comments

@FiloSottile
Copy link
Contributor

FiloSottile commented Sep 18, 2024

Background

An extendable output function (XOF) is a hash function with arbitrary or unlimited output length. They are very useful for tasks like key derivation, random number generation, and even encryption.

We have two (or rather three) XOF in x/crypto already: SHAKE in x/crypto/sha3 and BLAKE2X in x/crypto/blake2b and x/crypto/blake2s. In third-party modules at least KangarooTwelve in github.com/cloudflare/circl/xof/k12 and BLAKE3 in lukechampine.com/blake3 and github.com/zeebo/blake3 see some use.

The SHAKE XOFs return a ShakeHash interface.

type ShakeHash interface {
	hash.Hash

	// Read reads more output from the hash; reading affects the hash's
	// state. (ShakeHash.Read is thus very different from Hash.Sum)
	// It never returns an error, but subsequent calls to Write or Sum
	// will panic.
	io.Reader

	// Clone returns a copy of the ShakeHash in its current state.
	Clone() ShakeHash
}

The BLAKE2X XOFs return a blake2[bs].XOF interface.

type XOF interface {
	// Write absorbs more data into the hash's state. It may panic if called
	// after Read.
	io.Writer

	// Read reads more output from the hash. It returns io.EOF if the limit
	// has been reached.
	io.Reader

	// Clone returns a copy of the XOF in its current state.
	Clone() XOF

	// Reset resets the XOF to its initial state.
	Reset()
}

Proposal

Important

Current proposal at #69518 (comment).

Having a standard library interface for XOFs would help prevent fragmentation and help building modular higher-level implementations (although deployments should generally select one concrete implementation).

package hash

type XOF interface {
	// Write absorbs more data into the XOF's state. It panics if called
	// after Read.
	io.Writer

	// Read reads more output from the XOF. It may return io.EOF if there
	// is a limit to the XOF output length.
	io.Reader

	// Reset resets the XOF to its initial state.
	Reset()
}

Notes

The proposed interface is a subset of the two existing ones, so values from those packages can be reused. It is also compatible with the K12 implementation. https://go.dev/play/p/AtvfO8Tkbgp

Sum and Size (from ShakeHash) are not included because XOFs don't necessarily have a "default" output size. BlockSize might potentially be useful but depends on the implementation anyway, as is not worth breaking compatibility with blake2[bs].XOF.

Clone is not included because the existing interfaces return an interface type from it. (Maybe this would have been doable with generics if x/crypto/sha3 and x/crypto/blake2[bs] returned concrete implementations rather than interfaces, but we don't want to make every use of hash.XOF generic anyway.) I will file a separate proposal to add hash.Clone and hash.CloneXOF as helper functions.

Note however that the BLAKE3 implementations differ in that they return the Reader from a method on the Writer. This is probably to allow interleaving Write and Read calls.

h := blake3.New()
h.Write([]byte("foo"))
d := h.Digest()
h.Write([]byte("bar"))
d.Read(...) // won't include bar

As long as we add hash.CloneXOF or expose Clone on the underlying XOF implementations (which both ShakeHash and blake2[bs].XOF do), cloning can be used to the same effect (with a little less compile-time safety).

h := spiffyxof.New()
h.Write([]byte("foo"))
d := h.Clone()
h.Write([]byte("bar"))
d.Read(...)
// careful not to call d.Write

/cc @golang/security @cpu

@FiloSottile FiloSottile added Proposal Proposal-Crypto Proposal related to crypto packages or other security issues labels Sep 18, 2024
@FiloSottile FiloSottile added this to the Proposal milestone Sep 18, 2024
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Sep 18, 2024
@cpu
Copy link
Contributor

cpu commented Sep 18, 2024

Having a standard library interface for XOFs would help prevent fragmentation and help building modular higher-level implementations

+1

Sum and Size (from ShakeHash) are not included because XOFs don't necessarily have a "default" output size. BlockSize might potentially be useful but depends on the implementation anyway, as is not worth breaking compatibility with blake2[bs].XOF.

This makes sense to me as an argument for why both should be avoided here.

Note however that the BLAKE3 implementations differ in that they return the Reader from a method on the Writer. This is probably to allow interleaving Write and Read calls.

It took me a minute to map this description to the concrete APIs. For lukechampine.com/blake3 this is the Hasher used for writing returned from New() having an XOF() fn that yields an OutputReader. For zeebo/blake3 this is the Hasher having a Digest() fn that yields a Digest

As long as we add hash.CloneXOF or expose Clone on the underlying XOF implementations (which both ShakeHash and blake2[bs].XOF do), cloning can be used to the same effect (with a little less compile-time safety).

Also sounds reasonable to me but is it necessary/preferred to split the proposal to add the XOF interface from the proposal to support cloning helpers? Perhaps there's a policy reason for doing this I'm overlooking? From my perspective it seems like the ability to clone intermediate state ends up being important to certain use-cases for the XOF interface.

For instance in the proposed docstring for the embedded io.Writer interface you wrote:

// It panics if called after Read.

I think this would be a place where it would be beneficial to point to the clone helper to support the interleaved read/write use-case workaround from the spiffyxof example.

@gabyhelp
Copy link

Related Issues and Documentation

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

@FiloSottile
Copy link
Contributor Author

Also sounds reasonable to me but is it necessary/preferred to split the proposal to add the XOF interface from the proposal to support cloning helpers?

I didn't want to condition hash.XOF on hash.Clone and vice-versa. If hash.XOF is accepted but hash.Clone is not, having hash.CloneXOF would be weird, so I tied hash.CloneXOF to hash.Clone. XOFs can still be cloned by dropping down to the underlying implementation, like in the spiffyxof example. (Actually, calling the underlying Clone reads better because it involves no error handling.)

@seankhliao
Copy link
Member

The interface is fairly generic, I think bytes.Buffer would also satisfy it?

@FiloSottile
Copy link
Contributor Author

The interface is fairly generic, I think bytes.Buffer would also satisfy it?

Yes, agreed. I'm afraid we have to either make it generic like this, or lose ShakeHash and blake2.XOF compatibility, since we can't expand those because they are interfaces, not concrete types.

@FiloSottile
Copy link
Contributor Author

Had some time to stew over this. New proposal, with a BlockSize method to make it more specific.

package hash

type XOF interface {
	// Write absorbs more data into the XOF's state. It panics if called
	// after Read.
	io.Writer

	// Read reads more output from the XOF. It may return io.EOF if there
	// is a limit to the XOF output length.
	io.Reader

	// Reset resets the XOF to its initial state.
	Reset()

	// BlockSize returns the XOF's underlying block size.
	// The Write method must be able to accept any amount
	// of data, but it may operate more efficiently if all writes
	// are a multiple of the block size.
	BlockSize() int
}

This is a superset of blake2.XOF, but we can add a BlockSize method to the underlying implementation, and document that blake2.XOFs can be safely interface-upgraded to hash.XOF.

@aclements
Copy link
Member

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.

@aclements aclements moved this from Incoming to Active in Proposals Oct 23, 2024
@FiloSottile
Copy link
Contributor Author

Depending how #69521 goes, we might decide to add a Clone method too. It would break compatibility with x/crypto/sha3 and x/crypto/blake2[bs] but at the end of the day we can say "use the new stdlib packages if you need the interface" given #65269. Like #69521, this is not urgent for Go 1.24 as long as it doesn't influence #69982.

@aclements
Copy link
Member

Is XOF a well-known term? Could we use a more readable name like hash.Extendable?

@rsc
Copy link
Contributor

rsc commented Nov 12, 2024

Talked to @FiloSottile about this and we agreed to leave this for Go 1.25.

@aclements
Copy link
Member

Is XOF a well-known term? Could we use a more readable name like hash.Extendable?

Talked with @rolandshoemaker and it sounds like "XOF" is the industry-standard term for this, so we should probably stick with that.

Another question is, do we actually need a defined interface for this? It would be nice if we had some concrete consumers of this interface. On the other hand, this clearly parallels hash.Hash, which has many uses, so I don't think this is a big concern.

@rsc
Copy link
Contributor

rsc commented Dec 4, 2024

It sounds like we're happy with the API but not sure about whether we need to add this at all. As Austin wrote last week:

Another question is, do we actually need a defined interface for this?

One reason we might need the interface is to use it somewhere, in an API that accepts a hash.XOF. Do those exist?

Another reason we might need the interface is to establish a standard convention for implementations to follow. It sounds like that is the main justification for now.

I suppose one use would be in the newly proposed func CloneXOF from #69521.

Or should we instead add the Clone() hash.XOF method to the XOF interface from the start?

@aclements
Copy link
Member

This interface has some implicit "typestate" around when you can call Read and when you can call Write. Would it be better as a pair of types: you do all of your writing to one, and then you get the second type that lets you read? Something like

type XOF interface {
	// Write absorbs more data into the XOF's state.
	io.Writer

	// Reader returns a stream of output from the XOF for its current state.
	// Its Read method may return io.EOF if there is a limit to the XOF output length.
	Reader() io.Reader

	// Reset resets the XOF to its initial state.
	Reset()

	// BlockSize returns the XOF's underlying block size.
	// The Write method must be able to accept any amount
	// of data, but it may operate more efficiently if all writes
	// are a multiple of the block size.
	BlockSize() int
}

I don't know enough about XOFs to know if this requires cloning the XOF's state (or doing some more complicated lazy cloning only if Write is called after Reader), or if the "writer" state and the "reader" state are effectively distinct anyway.

And perhaps that ship has already sailed with golang.org/x/crypto/blake2b and ShakeHash, though their existing methods don't preclude adding the above Reader() method.

@rolandshoemaker
Copy link
Member

While I like the idea of returning a reader, to decouple the state, I think there are two possible reasons to avoid it.

First, this does set up a footgun, since the returned reader is decoupled from the initial hash (essentially cloning it), if the user calls Reader() before writing, they may be confused that a subsequent initial call to write does not cause a change in the returned reader. We can document this, but it's probably better to just be hard to misuse.

Secondly, I wonder if this encodes strange usage of an XOF. Off the top of my head, I cannot think of a case where you'd want to write to the XOF, get a reader, and then write some more and get a second reader. In typical usage I think you'd just do the initial write, and then do a number of reads. The Clone proposal does provide this functionality, but I think that is more explicit than this particular solution.

Overall while I think the panic is sub-optimal, it is probably the best solution here.

@aclements
Copy link
Member

Based on the discussion above, this proposal seems like a likely accept.

The proposal is to add the following API to package hash:

package hash

// An XOF (extendable output function) is a hash function with arbitrary or unlimited output length.
type XOF interface {
        // Write absorbs more data into the XOF's state. It panics if called
        // after Read.
        io.Writer

        // Read reads more output from the XOF. It may return io.EOF if there
        // is a limit to the XOF output length.
        io.Reader

        // Reset resets the XOF to its initial state.
        Reset()

        // BlockSize returns the XOF's underlying block size.
        // The Write method must be able to accept any amount
        // of data, but it may operate more efficiently if all writes
        // are a multiple of the block size.
        BlockSize() int
}

And to add a BlockSize method to the underlying implementation of blake2.XOF and document that blake2.XOFs can be safely interface-upgraded to hash.XOF.

@aclements aclements moved this from Active to Likely Accept in Proposals Jan 8, 2025
@aclements
Copy link
Member

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.

The proposal is to add the following API to package hash:

package hash

// An XOF (extendable output function) is a hash function with arbitrary or unlimited output length.
type XOF interface {
        // Write absorbs more data into the XOF's state. It panics if called
        // after Read.
        io.Writer

        // Read reads more output from the XOF. It may return io.EOF if there
        // is a limit to the XOF output length.
        io.Reader

        // Reset resets the XOF to its initial state.
        Reset()

        // BlockSize returns the XOF's underlying block size.
        // The Write method must be able to accept any amount
        // of data, but it may operate more efficiently if all writes
        // are a multiple of the block size.
        BlockSize() int
}

And to add a BlockSize method to the underlying implementation of blake2.XOF and document that blake2.XOFs can be safely interface-upgraded to hash.XOF.

@aclements aclements moved this from Likely Accept to Accepted in Proposals Jan 16, 2025
@aclements aclements changed the title proposal: hash: add XOF interface hash: add XOF interface Jan 16, 2025
@aclements aclements modified the milestones: Proposal, Backlog Jan 16, 2025
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/644235 mentions this issue: hash: add XOF interface

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/644255 mentions this issue: blake2b: implement hash.XOF

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Proposal Proposal-Accepted Proposal-Crypto Proposal related to crypto packages or other security issues
Projects
Status: Accepted
Development

No branches or pull requests

8 participants