feat(fs): improve cache efficiency using git commit hash #8279

Open
knqyf263 opened this issue Jan 22, 2025 · 0 comments · May be fixed by #8278
Background

Currently, the file system scanner uses blobInfo to calculate the cache key when scanning directories. This approach has some limitations:

  1. The blobInfo-based cache key effectively does not work as a persistent cache key
  2. The cache is deleted after scanning
  3. This results in unnecessary cache invalidation and repeated scans

Proposal

When scanning a Git repository, we can utilize the Git commit hash as a cache key instead of calculating a random key. This approach provides several benefits:

  1. Stable Cache Key: The commit hash is a stable and unique identifier for a specific state of the repository
  2. Cache Reuse: The same commit hash will always represent the same content, enabling efficient cache reuse
  3. Reduced I/O: By avoiding unnecessary cache deletion and regeneration, we can reduce I/O operations

Implementation Details

Git Repository Detection

  1. Check if the target directory is a Git repository
  2. If it is a Git repository:
    • Check if the repository is in a dirty state (has uncommitted changes)
    • For clean repositories:
      • Use the latest commit hash as the cache key
      • Skip cache deletion after scanning
    • For dirty repositories:
      • Fall back to using blobInfo (current behavior)
      • Delete the cache after scanning
  3. If it's not a Git repository:
    • Continue using blobInfo (current behavior)
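
The decision flow above could be sketched roughly as follows. This is only an illustration, not the actual implementation: `repoState`, `cachePlan`, and `planCache` are hypothetical names, and the sketch assumes the caller has already detected the repository, checked for uncommitted changes, and computed the blobInfo-based key.

```go
package main

import "fmt"

// repoState is a hypothetical summary of what the scanner found
// in the target directory. It is not an existing type.
type repoState struct {
	IsGitRepo  bool   // the target directory is a Git repository
	Dirty      bool   // the working tree has uncommitted changes
	HeadCommit string // latest commit hash; empty if it could not be read
}

// cachePlan is a hypothetical result: which cache key to use and
// whether the cache should be deleted once the scan finishes.
type cachePlan struct {
	Key             string
	DeleteAfterScan bool
}

// planCache applies the rules from the list above. blobKey stands in
// for the key that the current blobInfo-based logic would produce.
func planCache(state repoState, blobKey string) cachePlan {
	// Clean Git repository: the commit hash is a stable key,
	// so the cache can be kept for later scans.
	if state.IsGitRepo && !state.Dirty && state.HeadCommit != "" {
		return cachePlan{Key: state.HeadCommit, DeleteAfterScan: false}
	}
	// Dirty repository or plain directory: keep the current behavior.
	return cachePlan{Key: blobKey, DeleteAfterScan: true}
}

func main() {
	clean := repoState{IsGitRepo: true, HeadCommit: "0123abcd"}
	dirty := repoState{IsGitRepo: true, Dirty: true, HeadCommit: "0123abcd"}
	plain := repoState{}

	for _, s := range []repoState{clean, dirty, plain} {
		fmt.Printf("%+v -> %+v\n", s, planCache(s, "blob-key"))
	}
}
```

Treating an unreadable commit hash as a fallback case is an extra assumption in this sketch. The proposal itself only distinguishes clean, dirty, and non-Git targets.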
@knqyf263 knqyf263 added this to the v0.59.0 milestone Jan 22, 2025
@knqyf263 knqyf263 self-assigned this Jan 22, 2025