Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Web3.storage driver #11

Merged
merged 28 commits into from
Feb 2, 2023
Merged

Web3.storage driver #11

merged 28 commits into from
Feb 2, 2023

Conversation

leszko
Copy link
Contributor

@leszko leszko commented Jan 17, 2023

Add web3.storage driver.

fix livepeer/catalyst#375


How does it work?

  1. Every SaveData() does the following:
    1. Converts a file into a CAR file and sends it to web3.storage (the file is already uploaded and has its own CID, but it has no notion of any directory structure).
    2. Stores the file CID in the in-memory DAG structure
    3. The DAG structure is identified by a unique string pubId
  2. Calling Publish() uploads the whole dag directory structure into web3.storage and therefore makes all the HLS manifest files usable.

Additional comments

  • W3S Driver is stateful (because it needs to store the DAG structure); note however that only file CIDs are stored in memory, the actual file data are not stored anywhere
  • 3 external process command calls are used:
    • ipfs-car --pack - it archives any file into a CAR file, I think that technically we could do the same using Golang and go-car library, but I think it may be more work and it may be more error-prone, since big files are actually sharded in the IPFS structures
    • w3 can store add - it uploads a CAR into web3.storage
    • w3 can upload add - it binds together a number of CARs and publishes their root with the public URL
  • The authentication and authorization will be handled in a separate PR (Add authentication and multi-tenancy to w3s driver catalyst#383), so don't worry that key and proof variables are not used right now; in this PR the local w3 credentials are used.
  • During the Publish() step, the each subdirectory is uploaded as a separate CAR; technically, we should not need it, we should just create one CAR with the whole subdirectories and upload it. However, currently the go-car library fails for these scenarios, so that's the workaround. Here's the go-car PR which fixes the issue: Allow using WalkOption in WriteCar function ipld/go-car#357 UPDATE: go-car merged my PR, so we there is no longer need to create each subdirectory separately. Updated the PR.

@@ -43,6 +43,7 @@ type OSDriver interface {
NewSession(path string) OSSession
Description() string
UriSchemes() []string
Publish() string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be called Flush as well for a more common concept in I/O

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Publish() is more verbose, since it's explicit that it will return a publicly accessible URL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with Rafal - Flush() has connotations of cleanup, rather than something that returns something useful

@leszko leszko changed the title Spike for adapting web3.storage as an object_store driver Web3.storage driver Jan 27, 2023
drivers/w3s.go Outdated Show resolved Hide resolved
drivers/w3s.go Outdated

func (session *W3sSession) ReadData(ctx context.Context, name string) (*FileInfoReader, error) {
// not supported
return nil, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we return an error from these unsupported methods to make failures more obvious?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good idea. I'll add an error for not implemented functions.

drivers/w3s.go Outdated Show resolved Hide resolved
drivers/w3s.go Outdated Show resolved Hide resolved
@leszko leszko requested a review from thomshutt January 30, 2023 11:30
Copy link
Member

@victorges victorges left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

drivers/drivers.go Outdated Show resolved Hide resolved
drivers/s3.go Show resolved Hide resolved
drivers/w3s.go Outdated Show resolved Hide resolved
drivers/w3s.go Show resolved Hide resolved
drivers/w3s.go Outdated
if err != nil {
return nil, err
}
child, err = rc.addFileToDag(ctx, child, tail, filename, fileCid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole DAG logic seems quite complex! does web3.storage not provide any lib for that? kudos on figuring all of this out, if not

Copy link
Contributor Author

@leszko leszko Jan 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They provide a lib, but unfortunately only JS. And it's not good enough to be worth running this part as a separate NodeJS binary.

drivers/w3s.go Show resolved Hide resolved
drivers/w3s.go Show resolved Hide resolved
drivers/drivers.go Outdated Show resolved Hide resolved
drivers/w3s.go Show resolved Hide resolved
drivers/w3s.go Outdated Show resolved Hide resolved
drivers/w3s.go Outdated Show resolved Hide resolved
@leszko leszko requested a review from yondonfu February 1, 2023 15:57
Copy link
Member

@yondonfu yondonfu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@leszko leszko merged commit 90317d6 into main Feb 2, 2023
@leszko leszko deleted the rafal/web3storage branch February 2, 2023 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

web3.storage OS driver
4 participants