Use buffering (bufio?) for reading image tarball on export #1339

Closed
abitrolly opened this issue Apr 9, 2022 · 2 comments

Comments

@abitrolly
Contributor

abitrolly commented Apr 9, 2022

In #1274, when the image tarball is read from stdin, it is first saved to a file, because the export algorithm relies on random file access, as described in #1274 (comment).

To make the export more efficient, the algorithm could cache image data in memory until it is no longer needed. For example, once manifest.json is parsed, keep the parsed structure in memory and discard the cached bytes that contained it. For well-aligned images this would improve speed and reduce memory use. For badly aligned images the performance would be the same as with the current temp file, because on Linux the temp file still ends up in memory when the temp directory is on tmpfs.

bufio (https://pkg.go.dev/bufio) could potentially help, but it looks like a library aimed at string scanning. I am not sure it can handle a memory cache of several GB efficiently.

And here is a good article with code that makes random tar access visible in debug mode: https://blog.gopheracademy.com/advent-2017/seekable-http/

@github-actions

github-actions bot commented Jul 9, 2022

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Keep fresh with the 'lifecycle/frozen' label.

@github-actions github-actions bot closed this as completed Aug 8, 2022
abitrolly added a commit to abitrolly/go-containerregistry that referenced this issue Aug 14, 2022
TarBuffered scans the stream (`io.Reader`) once for a filename and saves
unused sections in memory for later access. This should speed up
parsing a bit, because right now the tarball is scanned several times,
and should save time and resources when parsing well-formed images
from the network.

See google#1339.
@abitrolly
Contributor Author

Not sure I can reopen this, but I gave it a try without bufio in #1429.

abitrolly added a commit to abitrolly/go-containerregistry that referenced this issue Aug 17, 2022