ENABLE_GZIP activated will result in tar.gz & tgz files compressed 2 times in gzip #30649
Comments
If it is related to
It doesn't seem to be possible. At least, according to the code, the "gzip handler" is only used when Could you provide more details about which page is sent as gzip when ENABLE_GZIP=false? Lines 246 to 254 in fcdc57d
|
Hello,
Tested with this binary : gitea-1.21-linux-amd64
Same behavior, no change: with ENABLE_GZIP=true, the .tar.gz files are compressed a second time.
Well, then, the plot thickens... The following files are always gzipped: webcomponents.js, index.js, index.css, logo.svg (the Gitea one). The HTTP request/response for index.css with ENABLE_GZIP=false:
With ENABLE_GZIP=true, everything is compressed, images aside. Also, this might be the source of the double compression of the .tar.gz files: some MIME type exclusions could be missing in the code. That aside, enforcing compression for static resources or heavy files does not strike me as a bad idea. It just seems that some parts of the documentation and code need a pass or two. |
Thank you for the details. I will try to answer the questions as I understand them. For the ENABLE_GZIP option itself: Gitea has 2 different web request handlers.
For the "tar.gz" problem itself, I did a quick test with these steps: Use the docker compose:
Test the gzip is really enabled:
Create a repo named "test-archive", then download the tar.gz archive:
I think it works well. |
I can confirm your test for a new repo with a default readme.md: the tar.gz file is correct. I can also confirm this behavior is not the norm. The trigger is the repo reaching a minimum size. With your new repo test and ENABLE_GZIP=true:
Please note that the response headers also differ depending on whether gzip encoding is activated. So to summarize:
|
Oops, my bad... I forgot the minimal gzip threshold 😢 You are right, I will take a look again later. |
Tested again with my docker compose in #30649 (comment) . This time, I uploaded a large avatar.png to the repo (170KB)
It seems that my Outdated, see below. |
After more testing, I think the problem is clear now. I think the gzip compression is not wrong. There are some cases:
About this:
That's expected and correct behavior. If you would like to make "curl" work with "gzip" transparently, it should use See the differences by:
|
So I think this issue could be closed? |
No. As I said before, the browser is also not handling the tar.gz properly. It's true that I didn't specify the version, so here it is: Firefox ESR 115.10 (latest). That said, with ENABLE_GZIP on, wget and Firefox currently do not save a valid tar.gz file. The source of the problem seems to revolve around the response header
How it is handled depends on the client, as the octet-stream type has no real mandatory definition of how to manage it, aside from being saved as a file (mind that this too was originally just a recommendation in said RFC). Either the client is blind and applies the content-encoding, stating gzip => uncompress the content.
This scenario would explain the situation. I did much more digging on this than intended, for something that, from my point of view, was not correctly implemented from the start. The ENABLE_GZIP parameter should be set to false by default (which is already the case, so good) but also masked or removed from the documentation, leaving the "full" compression to a reverse proxy like nginx or traefik. |
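To make the two client behaviors discussed above concrete, here is a minimal Python sketch. It is only a simulation under stated assumptions: `save_body` and the header values are hypothetical, and it makes no claim about how wget, curl, or Firefox actually decide which path to take.

```python
import gzip


def save_body(headers: dict, body: bytes, honor_content_encoding: bool) -> bytes:
    """Return the bytes a hypothetical client would write to disk.

    honor_content_encoding=True models a client that undoes the transport
    encoding; False models a client that saves the wire bytes untouched.
    """
    encoding = headers.get("Content-Encoding", "").lower()
    if honor_content_encoding and encoding == "gzip":
        return gzip.decompress(body)
    return body


# Stand-in for a valid .tar.gz archive as generated on the server side.
original = gzip.compress(b"pretend this is a tar stream")

# Assumed response: the already-compressed file is gzipped again on the wire.
headers = {
    "Content-Type": "application/octet-stream",
    "Content-Encoding": "gzip",
}
wire_body = gzip.compress(original)

decoded = save_body(headers, wire_body, honor_content_encoding=True)
raw = save_body(headers, wire_body, honor_content_encoding=False)

print(decoded == original)  # the client that decodes gets the archive back
print(raw == original)      # the client that saves raw bytes does not
```

In this model, only the client that ignores `Content-Encoding` ends up with a file that needs two rounds of gunzip, which matches the symptom described in the issue.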
Thank you for the additional feedback. Let's do a quick test. This test is still based on my provided docker compose, with a large file in it:
Conclusion:
About your question:
These headers are right (at least, not wrong). I also agree that when transferring already-compressed content, using gzip to compress the data on the wire again is not ideal (and unnecessary), although it won't (and shouldn't) cause any real problem at the moment. I think the behavior could be improved if the 3rd gzip handler were smarter at detecting whether the content is already compressed... and maybe something could also be hard-coded on the Gitea side. |
As a quick fix, it could be like this: Skip gzip for some well-known compressed file types #30796 |
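For readers who want a feel for what "skip gzip for well-known compressed file types" can look like, here is a rough Python sketch. It is not the code from #30796: the function name `should_gzip`, the extension list, and the 1024-byte threshold are all assumptions chosen for illustration.

```python
import os

# Assumed list of extensions whose content is already compressed.
COMPRESSED_EXTENSIONS = {".gz", ".tgz", ".zip", ".bz2", ".xz", ".zst"}

# Hypothetical minimum size below which compressing is not worth it.
MIN_GZIP_SIZE = 1024


def should_gzip(path: str, size: int, accept_encoding: str) -> bool:
    """Decide whether a response body should be gzip-encoded on the wire."""
    # Simplified check: a real handler would parse q-values properly.
    if "gzip" not in accept_encoding.lower():
        return False
    # Small responses are left alone (the "minimal gzip threshold").
    if size < MIN_GZIP_SIZE:
        return False
    # Only the last extension matters: "repo.tar.gz" yields ".gz".
    ext = os.path.splitext(path)[1].lower()
    return ext not in COMPRESSED_EXTENSIONS


print(should_gzip("test-archive.tar.gz", 200_000, "gzip, deflate"))
print(should_gzip("index.css", 200_000, "gzip, deflate"))
```

Matching on the extension is cheap but coarse; sniffing the first bytes of the body for a known magic number would be an alternative, at the cost of buffering.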
#30796 and its backport have been merged. Feel free to try 1.22-nightly |
Are there still any problems? Or maybe this issue could be closed? |
I didn't test it yet, but I don't see why it wouldn't work given the changes you did. |
Just tested: it works fine with already compressed files while retaining gzip compression for the other files. |
Description
Had to dig to understand what happened.
It started with retrieving a git repo in Gitea as an archive and having tar fail on it, complaining that it is not a tar archive.
Thing is, it was a gzip archive. Then, after a gunzip, the .tar file was in fact a ... gzip file too.
Here are the shell commands to demonstrate this, but you can also use 7zip as a file browser: it shows an extra level with the archive name and no extension before the first directory level.
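The same check can also be scripted. Below is a minimal Python sketch, independent of Gitea, that counts how many gzip layers wrap a file by looking at the gzip magic bytes. The helper names and the in-memory test archive are made up for the example.

```python
import gzip
import io
import tarfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def count_gzip_layers(data: bytes, max_layers: int = 5) -> int:
    """Peel gzip layers until the payload no longer starts with the magic."""
    layers = 0
    while layers < max_layers and data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
        layers += 1
    return layers


def make_tar_gz(name: str, payload: bytes) -> bytes:
    """Build a small, valid .tar.gz archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


archive = make_tar_gz("README.md", b"# test-archive\n")
double = gzip.compress(archive)  # simulates the faulty download

print(count_gzip_layers(archive))  # a correct archive has one layer
print(count_gzip_layers(double))   # the broken download has two
```

To check a real download, read the file with `open(path, "rb").read()` and pass the bytes to `count_gzip_layers`.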
It went under the radar for a while as I rarely use this mechanism with the Gitea instance.
While digging in the configuration I found that I had the parameter
ENABLE_GZIP = true
There is no additional parameter related to tar.*.command as stated in #26620 . Gitea has the default behavior here.
As problems can sometimes come from the nginx reverse proxy in front, which also has gzip=on, I did some tests without it.
It was completely bypassed, connected directly to the gitea instance/port.
Results are :
It is worth noting the files generated by Gitea in its cache under
repo-archive/*hash*.tar.gz
are valid, including those that were already generated and still in Gitea's local cache.
I also think ENABLE_GZIP does not do what is stated in the documentation: when set to false, without a reverse proxy, most of the data was still sent gzipped. What is it really used for?
Gitea Version
1.21.9 and 1.21.11
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
No response
Screenshots
No response
Git Version
2.17.1
Operating System
Ubuntu 18.04
How are you running Gitea?
Binary release from https://github.com/go-gitea/gitea/releases
Launched by systemd
Database
SQLite