Create WAF traversal function #4558
Comments
Not that we necessarily want to do it the way it's currently done, but the current code lives here: https://github.com/ckan/ckanext-spatial/blob/master/ckanext/spatial/harvesters/waf.py#L52
Watch out for loops: many WAFs have a folder structure with links that let you go "up" a level, and if you don't exclude those you can end up in an infinite search. It doesn't matter whether we do a breadth-first or a depth-first search, but we do need to be able to look into sub-folders and list/download their contents as well.
I'm checking for the parent here. The PR is a work in progress. I added a filter list to the function as a way to exclude directories we don't want to open, which could be source-specific.
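To make the points in the comments above concrete, here is a minimal sketch of a breadth-first traversal that skips "up" links and honors an exclusion list. It is not the code from the PR or from ckanext-spatial; `traverse_waf`, `fetch_listing`, and the `exclude` parameter are hypothetical names, and it assumes the WAF serves plain HTML directory listings where sub-folder links end in `/`.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class _LinkParser(HTMLParser):
    """Collect the href of every <a> tag in a directory listing."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_listing(url):
    """Default fetcher (hypothetical helper): return a listing page as text."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def traverse_waf(root, fetch=fetch_listing, exclude=()):
    """Breadth-first walk of a WAF; returns a sorted list of file URLs.

    `exclude` is an assumed, source-specific list of substrings; any URL
    containing one of them is neither opened nor collected.
    """
    if not root.endswith("/"):
        root += "/"
    seen = {root}            # folders already queued, guards against cycles
    queue = deque([root])
    files = set()
    while queue:
        folder = queue.popleft()
        parser = _LinkParser()
        parser.feed(fetch(folder))
        for href in parser.hrefs:
            if href.startswith("?"):
                continue     # column-sort links on Apache-style listings
            url = urldefrag(urljoin(folder, href))[0]
            # Only follow links strictly below the current folder. This
            # drops "../", absolute parent links, and off-site links.
            if url == folder or not url.startswith(folder):
                continue
            if any(part in url for part in exclude):
                continue
            if url.endswith("/"):
                if url not in seen:
                    seen.add(url)
                    queue.append(url)
            else:
                files.add(url)
    return sorted(files)
```

Swapping `queue.popleft()` for `queue.pop()` would turn this into a depth-first walk; as noted above, either order works as long as parent links are excluded. Passing `fetch` as a parameter keeps the traversal testable without network access.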
User Story
In order to collect files from a WAF, datagov wants to create a function in the harvesting logic repo that can traverse a WAF and pull all the files.
Acceptance Criteria
WHEN the function is invoked
THEN all the files in the tree are collected and downloaded.
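As a rough illustration of the "downloaded" half of this criterion, the sketch below saves a list of already-collected file URLs to a local directory, mirroring their paths relative to the WAF root. `download_files` and `fetch_bytes` are hypothetical names, not part of the harvesting logic repo, and the path check is one possible safeguard rather than a complete security review.

```python
from pathlib import Path
from urllib.parse import unquote, urlparse
from urllib.request import urlopen


def fetch_bytes(url):
    """Default fetcher (hypothetical helper): return the raw body of a URL."""
    with urlopen(url, timeout=30) as resp:
        return resp.read()


def download_files(file_urls, root, dest_dir, fetch=fetch_bytes):
    """Save each URL under dest_dir, mirroring its path relative to root.

    Returns the list of local paths that were written.
    """
    root_path = urlparse(root).path
    if not root_path.endswith("/"):
        root_path += "/"
    dest = Path(dest_dir).resolve()
    saved = []
    for url in file_urls:
        path = urlparse(url).path
        if not path.startswith(root_path):
            continue  # not below the WAF root, so skip it
        relative = unquote(path[len(root_path):])
        target = (dest / relative).resolve()
        # Refuse any path that would land outside dest_dir (e.g. "../").
        if dest not in target.parents:
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch(url))
        saved.append(target)
    return saved
```

A call such as `download_files(urls, "https://example.test/waf/", "out")` would write `out/sub/b.xml` for a URL ending in `/waf/sub/b.xml`; the host name here is a placeholder.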
Background
[Any helpful contextual notes or links to artifacts/evidence, if needed]
Security Considerations (required)
[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]
Sketch