Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(file-loader): Add recursive loading from sub-directories #5126

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

EshaanAgg
Copy link

What type of PR is this?
Adds support for recursive file loading.

Which issue(s) this PR fixes:
Fixes #4689

Release Notes: Yes

new features: |
  Added support for recursive subdirectories in the FileProvider API

@EshaanAgg EshaanAgg requested a review from a team as a code owner January 22, 2025 14:41
Signed-off-by: EshaanAgg <[email protected]>
@EshaanAgg
Copy link
Author

@shawnh2
I have added the tests for some basic test cases in the case of nested directories in the new test suite. Since fsnotify does not support watching recursive directories, I listed all the sub-directories of the provided directories in paths, and then added a watcher on all of them.

I am currently a bit unsure about two things:

  • How should we be handling the various change events in this region of code:
    switch event.Op {
    case fsnotify.Create, fsnotify.Write, fsnotify.Remove:
    // Since we do not watch any events in the subdirectories, any events involving files
    // modifications in current directory will trigger the event handling.
    goto handle
    default:
    // do nothing
    continue
    }

    Leaving it as it allows the current test suite to pass, so I am not certain how we should handle it.
  • Should we add tests for cases where the user creates, deletes, or renames a complete directory in one of the watched directories, or is that unnecessary? Do we intend to support that functionality right now?

Sorry for the trouble associated with reviewing a new PR. And thanks! :)

Signed-off-by: EshaanAgg <[email protected]>
internal/provider/file/resources.go Outdated Show resolved Hide resolved
internal/provider/file/file.go Outdated Show resolved Hide resolved
internal/provider/file/file.go Outdated Show resolved Hide resolved
internal/utils/path/path.go Outdated Show resolved Hide resolved
internal/utils/path/path.go Outdated Show resolved Hide resolved
@EshaanAgg
Copy link
Author

EshaanAgg commented Jan 24, 2025

Hi! I noticed that the testing CI for this PR has been failing repeatedly, and it's because of two tests:

  1. The first test suite that is failing is present in kubernetes_test.go, and is failing probably because we are trying to start two servers on the same port. Since I have not made any changes related to the same, can someone help me figure out why it is only causing an issue here?
Logs
--- FAIL: TestNamespacedProvider (8.28s)
    kubernetes_test.go:1278: 
        	Error Trace:	/home/runner/work/gateway/gateway/internal/provider/kubernetes/kubernetes_test.go:1278
        	Error:      	Received unexpected error:
        	            	failed to create manager: error listening on :8081: listen tcp :8081: bind: address already in use
        	Test:       	TestNamespacedProvider
        	
      
 --- FAIL: TestProvider (7.51s)
    kubernetes_test.go:67: 
        	Error Trace:	/home/runner/work/gateway/gateway/internal/provider/kubernetes/kubernetes_test.go:67
        	Error:      	Received unexpected error:
        	            	failed to create manager: error listening on :8081: listen tcp :8081: bind: address already in use
        	Test:       	TestProvider
  1. This is the main test suite of file.test that I have added to this PR. All the non-recursive tests pass successfully, but the initial load fails for the recursive loading tests.
--- FAIL: TestRecursiveFileProvider (19.01s)
    --- FAIL: TestRecursiveFileProvider/initial_resource_load (0.00s)
        file_test.go:226: 
            	Error Trace:	/home/runner/work/gateway/gateway/internal/provider/file/file_test.go:226
            	Error:      	Should not be zero, but was 0
            	Test:       	TestRecursiveFileProvider/initial_resource_load

This should not happen as I have repeatedly run the same test on my system locally multiple times, and it never fails. These are the verbose logs on running go test ./internal/provider/file -v after these changes:

Logs
=== RUN   TestFileProvider
2025-01-25T02:48:13.302+0530    INFO    file/file.go:208       starting health probe server     {"address": ":8081"}
2025-01-25T02:48:13.455+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:13.455+0530    INFO    file/file.go:89 Watching path added     {"path": "/tmp/test-dir-4273140525"}
2025-01-25T02:48:13.455+0530    INFO    file/file.go:89 Watching path added     {"path": "/tmp/test-files-294709326/test.yaml"}
=== RUN   TestFileProvider/initial_resource_load
=== RUN   TestFileProvider/rename_the_watched_file_then_rename_it_back
2025-01-25T02:48:14.313+0530    INFO    file/file.go:123       file changed     {"op": "RENAME", "name": "/tmp/test-files-294709326/test.yaml"}
2025-01-25T02:48:14.313+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:15.314+0530    INFO    file/file.go:123       file changed     {"op": "RENAME", "name": "/tmp/test-files-294709326/foobar.yaml"}
2025-01-25T02:48:15.314+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:15.402+0530    INFO    file/store.go:73       loaded and stored resources successfully
=== RUN   TestFileProvider/remove_the_watched_file
2025-01-25T02:48:16.318+0530    INFO    file/file.go:123       file changed     {"op": "REMOVE", "name": "/tmp/test-files-294709326/test.yaml"}
2025-01-25T02:48:16.319+0530    INFO    file/store.go:35       reload all resources
=== RUN   TestFileProvider/add_a_file_in_watched_dir
2025-01-25T02:48:17.321+0530    INFO    file/file.go:150       file changed     {"op": "CREATE", "name": "/tmp/test-dir-4273140525/test.yaml", "dir": "/tmp/test-dir-4273140525"}
2025-01-25T02:48:17.322+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:17.411+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:17.411+0530    INFO    file/file.go:150       file changed     {"op": "WRITE", "name": "/tmp/test-dir-4273140525/test.yaml", "dir": "/tmp/test-dir-4273140525"}
2025-01-25T02:48:17.411+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:17.489+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:17.489+0530    INFO    file/file.go:150       file changed     {"op": "WRITE", "name": "/tmp/test-dir-4273140525/test.yaml", "dir": "/tmp/test-dir-4273140525"}
2025-01-25T02:48:17.489+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:17.574+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:17.574+0530    INFO    file/file.go:150       file changed     {"op": "WRITE", "name": "/tmp/test-dir-4273140525/test.yaml", "dir": "/tmp/test-dir-4273140525"}
2025-01-25T02:48:17.574+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:17.649+0530    INFO    file/store.go:73       loaded and stored resources successfully
=== RUN   TestFileProvider/remove_a_file_in_watched_dir
2025-01-25T02:48:18.326+0530    INFO    file/file.go:150       file changed     {"op": "REMOVE", "name": "/tmp/test-dir-4273140525/test.yaml", "dir": "/tmp/test-dir-4273140525"}
2025-01-25T02:48:18.326+0530    INFO    file/store.go:35       reload all resources
--- PASS: TestFileProvider (6.03s)
    --- PASS: TestFileProvider/initial_resource_load (0.01s)
    --- PASS: TestFileProvider/rename_the_watched_file_then_rename_it_back (2.01s)
    --- PASS: TestFileProvider/remove_the_watched_file (1.00s)
    --- PASS: TestFileProvider/add_a_file_in_watched_dir (1.01s)
    --- PASS: TestFileProvider/remove_a_file_in_watched_dir (1.00s)
=== RUN   TestRecursiveFileProvider
2025-01-25T02:48:19.330+0530    INFO    file/file.go:208       starting health probe server     {"address": ":8081"}
2025-01-25T02:48:19.330+0530    ERROR   file/file.go:210       failed to start health probe server      {"error": "listen tcp :8081: bind: address already in use"}
2025-01-25T02:48:19.535+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:19.535+0530    INFO    file/file.go:89 Watching path added     {"path": "/tmp/test-base-3089483487/subdir-12608968943"}
2025-01-25T02:48:19.535+0530    INFO    file/file.go:89 Watching path added     {"path": "/tmp/test-base-3089483487/subdir-21205100858"}
2025-01-25T02:48:19.535+0530    INFO    file/file.go:89 Watching path added     {"path": "/tmp/test-base-3089483487"}
=== RUN   TestRecursiveFileProvider/initial_resource_load
=== RUN   TestRecursiveFileProvider/rename_a_file_in_watched_sub_dir_and_then_rename_it_back
2025-01-25T02:48:20.340+0530    INFO    file/file.go:150       file changed     {"op": "RENAME", "name": "/tmp/test-base-3089483487/subdir-12608968943/config_1.yaml", "dir": "/tmp/test-base-3089483487/subdir-12608968943"}
2025-01-25T02:48:20.340+0530    INFO    file/file.go:150       file changed     {"op": "CREATE", "name": "/tmp/test-base-3089483487/subdir-12608968943/config_1_renamed.yaml", "dir": "/tmp/test-base-3089483487/subdir-12608968943"}
2025-01-25T02:48:20.340+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:20.657+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:21.341+0530    INFO    file/file.go:150       file changed     {"op": "RENAME", "name": "/tmp/test-base-3089483487/subdir-12608968943/config_1_renamed.yaml", "dir": "/tmp/test-base-3089483487/subdir-12608968943"}
2025-01-25T02:48:21.341+0530    INFO    file/file.go:150       file changed     {"op": "CREATE", "name": "/tmp/test-base-3089483487/subdir-12608968943/config_1.yaml", "dir": "/tmp/test-base-3089483487/subdir-12608968943"}
2025-01-25T02:48:21.341+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:21.687+0530    INFO    file/store.go:73       loaded and stored resources successfully
=== RUN   TestRecursiveFileProvider/remove_a_file_from_watched_sub_dir
2025-01-25T02:48:22.342+0530    INFO    file/file.go:150       file changed     {"op": "REMOVE", "name": "/tmp/test-base-3089483487/subdir-21205100858/config_2.yaml", "dir": "/tmp/test-base-3089483487/subdir-21205100858"}
2025-01-25T02:48:22.342+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:22.564+0530    INFO    file/store.go:73       loaded and stored resources successfully
=== RUN   TestRecursiveFileProvider/add_a_new_file_to_a_watched_sub_dir
2025-01-25T02:48:24.343+0530    INFO    file/file.go:150       file changed     {"op": "CREATE", "name": "/tmp/test-base-3089483487/subdir-21205100858/config_2.yaml", "dir": "/tmp/test-base-3089483487/subdir-21205100858"}
2025-01-25T02:48:24.343+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:24.430+0530    ERROR   file/store.go:39       failed to load and store resources       {"error": "local validation error: Backend.gateway.envoyproxy.io \"\" is invalid: [metadata.name: Required value: name or generateName is required, spec: Required value, <nil>: Invalid value: \"null\": some validation rules were not checked because the object was invalid; correct the existing errors to complete validation]"}
2025-01-25T02:48:24.430+0530    INFO    file/file.go:150       file changed     {"op": "WRITE", "name": "/tmp/test-base-3089483487/subdir-21205100858/config_2.yaml", "dir": "/tmp/test-base-3089483487/subdir-21205100858"}
2025-01-25T02:48:24.430+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:24.911+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:24.911+0530    INFO    file/file.go:150       file changed     {"op": "WRITE", "name": "/tmp/test-base-3089483487/subdir-21205100858/config_2.yaml", "dir": "/tmp/test-base-3089483487/subdir-21205100858"}
2025-01-25T02:48:24.911+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:25.310+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:25.310+0530    INFO    file/file.go:150       file changed     {"op": "WRITE", "name": "/tmp/test-base-3089483487/subdir-21205100858/config_2.yaml", "dir": "/tmp/test-base-3089483487/subdir-21205100858"}
2025-01-25T02:48:25.310+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:25.703+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:25.703+0530    INFO    file/file.go:150       file changed     {"op": "WRITE", "name": "/tmp/test-base-3089483487/subdir-21205100858/config_2.yaml", "dir": "/tmp/test-base-3089483487/subdir-21205100858"}
2025-01-25T02:48:25.704+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:26.031+0530    INFO    file/store.go:73       loaded and stored resources successfully
=== RUN   TestRecursiveFileProvider/move_a_file_from_one_subdirectory_to_other_subdirectory
2025-01-25T02:48:26.350+0530    INFO    file/file.go:150       file changed     {"op": "CREATE", "name": "/tmp/test-base-3089483487/new_config.yaml", "dir": "/tmp/test-base-3089483487"}
2025-01-25T02:48:26.350+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:26.618+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:26.618+0530    INFO    file/file.go:150       file changed     {"op": "WRITE", "name": "/tmp/test-base-3089483487/new_config.yaml", "dir": "/tmp/test-base-3089483487"}
2025-01-25T02:48:26.618+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:26.861+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:26.861+0530    INFO    file/file.go:150       file changed     {"op": "REMOVE", "name": "/tmp/test-base-3089483487/subdir-21205100858/config_2.yaml", "dir": "/tmp/test-base-3089483487/subdir-21205100858"}
2025-01-25T02:48:26.861+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:27.098+0530    INFO    file/store.go:73       loaded and stored resources successfully
=== RUN   TestRecursiveFileProvider/move_a_file_from_one_subdirectory_to_an_unwatched_subdirectory
2025-01-25T02:48:27.351+0530    INFO    file/file.go:150       file changed     {"op": "REMOVE", "name": "/tmp/test-base-3089483487/new_config.yaml", "dir": "/tmp/test-base-3089483487"}
2025-01-25T02:48:27.351+0530    INFO    file/store.go:35       reload all resources
2025-01-25T02:48:27.548+0530    INFO    file/store.go:73       loaded and stored resources successfully
2025-01-25T02:48:29.353+0530    INFO    file/file.go:150       file changed     {"op": "REMOVE", "name": "/tmp/test-base-3089483487/subdir-21205100858", "dir": "/tmp/test-base-3089483487"}
2025-01-25T02:48:29.353+0530    INFO    file/store.go:35       reload all resources
--- PASS: TestRecursiveFileProvider (10.03s)
    --- PASS: TestRecursiveFileProvider/initial_resource_load (0.01s)
    --- PASS: TestRecursiveFileProvider/rename_a_file_in_watched_sub_dir_and_then_rename_it_back (2.00s)
    --- PASS: TestRecursiveFileProvider/remove_a_file_from_watched_sub_dir (2.00s)
    --- PASS: TestRecursiveFileProvider/add_a_new_file_to_a_watched_sub_dir (2.01s)
    --- PASS: TestRecursiveFileProvider/move_a_file_from_one_subdirectory_to_other_subdirectory (1.00s)
    --- PASS: TestRecursiveFileProvider/move_a_file_from_one_subdirectory_to_an_unwatched_subdirectory (2.00s)
=== RUN   TestLoadFromDir
=== RUN   TestLoadFromDir/one_level_nesting
=== RUN   TestLoadFromDir/no_nested_dir
=== RUN   TestLoadFromDir/two_level_nesting
--- PASS: TestLoadFromDir (0.00s)
    --- PASS: TestLoadFromDir/one_level_nesting (0.00s)
    --- PASS: TestLoadFromDir/no_nested_dir (0.00s)
    --- PASS: TestLoadFromDir/two_level_nesting (0.00s)
PASS
ok      github.com/envoyproxy/gateway/internal/provider/file 16.31s

Is there any difference in how the tests run locally and in the CI? I am clueless here about how I should try to debug such a test failure that isn't locally reproducible! I would appreciate any guidance concerning this.

Signed-off-by: EshaanAgg <[email protected]>
@shawnh2
Copy link
Contributor

shawnh2 commented Jan 25, 2025

@shawnh2 I have added the tests for some basic test cases in the case of nested directories in the new test suite. Since fsnotify does not support watching recursive directories, I listed all the sub-directories of the provided directories in paths, and then added a watcher on all of them.

I am currently a bit unsure about two things:

  • How should we be handling the various change events in this region of code:

    switch event.Op {
    case fsnotify.Create, fsnotify.Write, fsnotify.Remove:
    // Since we do not watch any events in the subdirectories, any events involving files
    // modifications in current directory will trigger the event handling.
    goto handle
    default:
    // do nothing
    continue
    }

    Leaving it as it allows the current test suite to pass, so I am not certain how we should handle it.

  • Should we add tests for cases where the user creates, deletes, or renames a complete directory in one of the watched directories, or is that unnecessary? Do we intend to support that functionality right now?

Sorry for the trouble associated with reviewing a new PR. And thanks! :)

Sorry for the late response. This is not the functionality we must support right now, can open a new issue to track it once this PR has merged.

Raise #5150 to track this.

@@ -58,7 +58,7 @@ go.testdata.complete: ## Override test ouputdata
go.test.coverage: go.test.cel ## Run go unit and integration tests in GitHub Actions
@$(LOG_TARGET)
KUBEBUILDER_ASSETS="$(shell $(tools/setup-envtest) use $(ENVTEST_K8S_VERSION) -p path)" \
go test ./... --tags=integration -race -coverprofile=coverage.xml -covermode=atomic
go test ./... --tags=integration -race -coverprofile=coverage.xml -covermode=atomic -p 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the trouble! Let's leave it! We can change current healthz server port to a non-conflict one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! I will try to address #5149 first, then, and then update this PR post it. Thanks for the help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support recursive subdirectories for file-provider
2 participants