Scan through HTTP/HTTPS possible? #129

Open
Jehan opened this issue Sep 2, 2022 · 2 comments
Jehan commented Sep 2, 2022

As mentioned in my first patch #128, we now use Mirrorbits for GIMP. Mirrorbits has some additional requirements compared to the simpler (ugly) round-robin we used to have: it needs either read-only rsync or FTP access to the mirrors in order to scan them (as I understand it, this is used for health and security checks).

A good part of our mirrors already offered this, so that was easy, but a few didn't. We emailed the admins and already got a response from one of them, who had disabled rsync/FTP access completely because of too many hack attempts and massive flooding.

Can't the scan be done through the same protocol the mirror serves files over, i.e. HTTPS? I assume that rsync/FTP provide facilities that make scanning more efficient, is that it? Still, it should be possible to scan through HTTP(S).

Jehan changed the title from "Scan through HTTPS possible?" to "Scan through HTTP/HTTPS possible?" on Sep 2, 2022

lazka (Contributor) commented Sep 10, 2022

> I assume that rsync/ftp must provide facilities to make it more efficient, is that it?

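Yes, see the output of rsync -r --no-motd rsync://mirror2.sandyriver.net/pub/software/gimp, which lists the whole tree recursively, with sizes and modification times, over a single connection.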

HTTP itself doesn't have any standard way to get directory listings, recursive file listings, or bulk file metadata, so I doubt this is possible.
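
To illustrate what the rsync listing provides, here is a minimal Python sketch that parses such output into structured entries. It assumes the line layout that rsync 3.x prints in list mode (permissions, size with thousands separators, date, time, path). The sample lines, sizes and the names `Entry`, `parse_listing_line` and `parse_listing` are made up for illustration; this is not Mirrorbits' actual scanner code.

```python
import re
from datetime import datetime
from typing import List, NamedTuple, Optional

# Assumed line shape: "<perms> <size> <YYYY/MM/DD> <HH:MM:SS> <path>"
LINE_RE = re.compile(
    r"^(?P<perms>[\w-]{10})\s+(?P<size>[\d,]+)\s+"
    r"(?P<date>\d{4}/\d{2}/\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<path>.+)$"
)


class Entry(NamedTuple):
    path: str
    size: int
    mtime: datetime
    is_dir: bool


def parse_listing_line(line: str) -> Optional[Entry]:
    """Parse one listing line; return None for anything that does not match."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    mtime = datetime.strptime(
        m.group("date") + " " + m.group("time"), "%Y/%m/%d %H:%M:%S"
    )
    return Entry(
        path=m.group("path"),
        size=int(m.group("size").replace(",", "")),  # "4,096" -> 4096
        mtime=mtime,
        is_dir=m.group("perms").startswith("d"),
    )


def parse_listing(text: str) -> List[Entry]:
    """Parse a whole listing, silently skipping blank or unrecognized lines."""
    parsed = (parse_listing_line(line) for line in text.splitlines())
    return [entry for entry in parsed if entry is not None]


# Illustrative sample only; these are not real values from any mirror.
SAMPLE = """\
drwxr-xr-x          4,096 2022/09/01 10:00:00 gimp
-rw-r--r--    265,123,456 2022/08/27 18:30:05 gimp/v2.10/gimp-2.10.32.tar.bz2
"""

if __name__ == "__main__":
    for entry in parse_listing(SAMPLE):
        print(entry)
```

One command therefore yields path, size and modification time for every file, which is the bulk metadata that plain HTTP has no standard equivalent for.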

ott (Contributor) commented Sep 17, 2022

MirrorBrain supports scans over HTTP. The mirror has to provide HTML directory indices, which can be either statically or dynamically generated. Almost all web servers support them. There is also an ongoing effort to add JSON output to mod_autoindex in Apache HTTP Server. Perhaps other web servers support this format as well.
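
As a rough idea of what scanning an HTML index involves, here is a minimal Python sketch that extracts the files and subdirectories from one index page. It is not MirrorBrain's implementation; `parse_index`, the sample markup and the `mirror.example.org` URL are hypothetical, and it assumes an Apache-style index where entries are plain `<a href>` links relative to the directory.

```python
from html.parser import HTMLParser
from typing import List, Tuple
from urllib.parse import urljoin


class _HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def parse_index(html: str, base_url: str) -> Tuple[List[str], List[str]]:
    """Return (files, subdirs) as absolute URLs located below base_url.

    base_url is expected to end with "/".
    """
    collector = _HrefCollector()
    collector.feed(html)
    files: List[str] = []
    subdirs: List[str] = []
    for href in collector.hrefs:
        if href.startswith(("?", "#")):
            continue  # column-sort links such as "?C=N;O=D"
        url = urljoin(base_url, href)
        if url == base_url or not url.startswith(base_url):
            continue  # parent directory or off-tree link
        (subdirs if url.endswith("/") else files).append(url)
    return files, subdirs


# Hypothetical index page, for illustration only.
SAMPLE_INDEX = """
<html><body><h1>Index of /gimp</h1>
<a href="?C=N;O=D">Name</a>
<a href="../">Parent Directory</a>
<a href="v2.10/">v2.10/</a>
<a href="README">README</a>
</body></html>
"""

if __name__ == "__main__":
    print(parse_index(SAMPLE_INDEX, "https://mirror.example.org/gimp/"))
```

A full scan would have to repeat this for every subdirectory, and sizes or timestamps would still need either server-specific column parsing or extra HEAD requests, so it costs many more round trips than a single rsync listing.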

Another option would be to publish a manifest (a file listing the paths and metadata of all files) on the primary server and validate the mirrors against it. I think this would be the most efficient method, as the requests to the mirrors can be parallelized.
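
A minimal sketch of the manifest idea in Python, under stated assumptions: the manifest is a hypothetical mapping from relative path to expected size in bytes, and each file is checked with an HTTP HEAD request against the mirror. The names `head_size` and `check_mirror` are illustrative and no such manifest format exists in Mirrorbits today.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional, Tuple


def head_size(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return Content-Length from a HEAD request, or None if unavailable."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            length = resp.headers.get("Content-Length")
            return int(length) if length is not None else None
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError and timeouts.
        return None


def check_mirror(
    manifest: Dict[str, int],
    base_url: str,
    fetch_size: Callable[[str], Optional[int]] = head_size,
    workers: int = 8,
) -> Tuple[List[str], List[str]]:
    """Compare a mirror against the manifest.

    Returns (missing, mismatched): paths with no usable response, and
    paths whose reported size differs from the manifest.
    """
    paths = sorted(manifest)
    urls = [base_url.rstrip("/") + "/" + p.lstrip("/") for p in paths]
    # Requests are independent, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sizes = list(pool.map(fetch_size, urls))
    missing = [p for p, s in zip(paths, sizes) if s is None]
    mismatched = [
        p for p, s in zip(paths, sizes) if s is not None and s != manifest[p]
    ]
    return missing, mismatched


if __name__ == "__main__":
    # Offline demo: a dict stands in for the mirror's HEAD responses.
    fake_mirror = {"https://mirror.example.org/gimp/a.tar": 10}
    print(check_mirror({"a.tar": 10, "b.tar": 20},
                       "https://mirror.example.org/gimp/",
                       fetch_size=fake_mirror.get))
```

Comparing sizes only detects missing or truncated files; verifying checksums from the manifest would require downloading the content, so a real design would probably sample files or rely on mirrors exposing hashes.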

jbkempf added the RFC label on Nov 7, 2023