From da213212dede70bb845bc080498842b3f9b0b300 Mon Sep 17 00:00:00 2001
From: Shayna <34726837+PsychedelicShayna@users.noreply.github.com>
Date: Sat, 7 Sep 2024 02:52:24 -0300
Subject: [PATCH 1/4] Create README.md

---
 README.md | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..5f1e73b
--- /dev/null
+++ b/README.md
@@ -0,0 +1,69 @@
+# jw - Jwalk CLI Frontend
+
+Are you frustrated with tools like `find`, `fd`, `erd`, `lsd`, `legdur` and others that seem to excel in some areas but fall short in others? I was too, so I built a solution that prioritizes speed and simplicity above all else. The design philosophy of modern tools have a tendency to stray away from the original Linux philosophy of each command doing a single thing, and doing it very well, instead opting to cram as many features in as possible.
+
+This isn't necessarily a bad thing, I enjoy those features, but there are many times where I want to simply grep every single path from the root of my drive, and that's when those abstractions start backfiring. Colorized output is hard to grep, they take longer to traverse, longer to print, they immediately start dumping results and create a termiiinal I/O bottleneck.
+
+Sometimes you just need to take a page out of the Sesto Elemento's book.
+
+## What is jw exactly?
+jw is a command line frontend for [jwalk](https://github.com/byron/jwalk), a blazingly fast filesystem traversal library. While jwalk itself provides unparalleled performance in recursively traversing directories, it lacks a CLI, so I created jw to fill that gap. This utility leverages the power of jwalk to allow you to efficiently sift through directories containing a massive number of files, with a focus on raw performance and minimal abstraction.
+
+It also doubles as a way to hash a very large number of files, thanks to the insanely fast [xxHash](https://github.com/Cyan4973/xxHash) algorithm; jwalk and xxh3 go together like bread and butter.
+
+Rather than fancy colorized outputs, TUIs, gathering statistics, etc., jw sticks to the essentials, providing the raw performance without any of the bloat.
+
+It simply gives you the raw output as fast as possible, for you to pipe to other utilities, such as ripgrep/grep, xargs, and the like, with no additional nonsense.
+
+## Performance
+
+To give you a rough idea of the performance, jwalk was capable of traversing through 492 GB worth of files in **3 seconds**. That's all it takes, three seconds and you can already grep for file paths.
+
+As for xxh3 combined with jwalk, it was capable of hashing 7.2 GB across more than 10,000 files, in **500 milliseconds**. Yes, it's that fast. Stupid fast.
+
+The SHA2 family and MD5 are also supported, but only for compatibility.
+
+## Usage
+
+```
+Usage: jw [OPTIONS] [directories]...
+
+Arguments:
+  [directories]...
+          The target directories to traverse, can be multiple. Use -- to read directories from stdin.
+
+          [default: .]
+
+Options:
+  -l, --live
+          Display results in realtime, rather than collecting first and displaying later.
+          This will result in a significant drop in performance due to the constant terminal output.
+
+  -c, --checksum []
+          Output an index containing the hash of every file using the specified algorithm.
+          It is highly recommended you stick with xxh3, as it is significantly more performant,
+          and directly suited for this use case. SHA2/MD5 are only provided for compatibility.
+
+          [possible values: xxh3, sha224, sha256, sha384, sha512, md5]
+
+  -t, --threads
+          The number of threads to use to hash files in parallel.
+
+          [default: 4]
+
+  -d, --depth
+          The recursion depth limit. Setting this to 1 effectively disables recursion.
+
+          [default: 0]
+
+  -x, --exclude [...]
+          Exclude one or more types of entries, separated by commas.
+
+          [possible values: files, dirs, dot, other]
+
+  -h, --help
+          Print help (see a summary with '-h')
+
+  -V, --version
+
+```

From 72aab9380924ad0026bf390e5803f8f0fe99046f Mon Sep 17 00:00:00 2001
From: Shayna <34726837+PsychedelicShayna@users.noreply.github.com>
Date: Sat, 7 Sep 2024 02:56:20 -0300
Subject: [PATCH 2/4] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 5f1e73b..8a9d172 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 
 Are you frustrated with tools like `find`, `fd`, `erd`, `lsd`, `legdur` and others that seem to excel in some areas but fall short in others? I was too, so I built a solution that prioritizes speed and simplicity above all else. The design philosophy of modern tools have a tendency to stray away from the original Linux philosophy of each command doing a single thing, and doing it very well, instead opting to cram as many features in as possible.
 
-This isn't necessarily a bad thing, I enjoy those features, but there are many times where I want to simply grep every single path from the root of my drive, and that's when those abstractions start backfiring. Colorized output is hard to grep, they take longer to traverse, longer to print, they immediately start dumping results and create a termiiinal I/O bottleneck.
+This isn't necessarily a bad thing, I enjoy those features, but there are many times where I want to simply grep every single path from the root of my drive, and that's when those abstractions start backfiring. Colorized output is hard to grep, they take longer to traverse, longer to print, use lazy algorithms because the creator never designed it thinking someone would feed a terrabyte of data to it, and they immediately start dumping results and create terminal I/O bottlenecks.. **enough**
 
 Sometimes you just need to take a page out of the Sesto Elemento's book.
 
From 38ce029ef1b8a4dfb3deeb1d41299394be0c2060 Mon Sep 17 00:00:00 2001
From: Shayna <34726837+PsychedelicShayna@users.noreply.github.com>
Date: Sat, 7 Sep 2024 03:41:32 -0300
Subject: [PATCH 3/4] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 8a9d172..bfaa251 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 
 Are you frustrated with tools like `find`, `fd`, `erd`, `lsd`, `legdur` and others that seem to excel in some areas but fall short in others? I was too, so I built a solution that prioritizes speed and simplicity above all else. The design philosophy of modern tools have a tendency to stray away from the original Linux philosophy of each command doing a single thing, and doing it very well, instead opting to cram as many features in as possible.
 
-This isn't necessarily a bad thing, I enjoy those features, but there are many times where I want to simply grep every single path from the root of my drive, and that's when those abstractions start backfiring. Colorized output is hard to grep, they take longer to traverse, longer to print, use lazy algorithms because the creator never designed it thinking someone would feed a terrabyte of data to it, and they immediately start dumping results and create terminal I/O bottlenecks.. **enough**
+This isn't necessarily a bad thing; I enjoy those features. But there are many times when I simply want to grep every single path from the root of my drive, and that's when those abstractions start backfiring. All the additional rendering tanks performance, the colorized output sometimes messes up your regex, and when you pipe it to Neovim you are met with a clusterfuck of ANSI escape codes. The higher-level languages that make pretty CLIs/TUIs easy to build are often single-threaded, the creator never anticipated that someone would feed a terabyte of data to the tool, and output immediately starts getting dumped to the terminal, creating massive I/O bottlenecks...
**enough**
 
 Sometimes you just need to take a page out of the Sesto Elemento's book.
 

From 9af928cc1d4cbc9014c088817e17f2b59cd09ba2 Mon Sep 17 00:00:00 2001
From: Shayna <34726837+PsychedelicShayna@users.noreply.github.com>
Date: Sat, 7 Sep 2024 04:46:26 -0300
Subject: [PATCH 4/4] Create rust.yml

---
 .github/workflows/rust.yml | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
 create mode 100644 .github/workflows/rust.yml

diff --git a/.github/workflows/rust.yml b/.github/workflows/rust.yml
new file mode 100644
index 0000000..000bb2c
--- /dev/null
+++ b/.github/workflows/rust.yml
@@ -0,0 +1,22 @@
+name: Rust
+
+on:
+  push:
+    branches: [ "master" ]
+  pull_request:
+    branches: [ "master" ]
+
+env:
+  CARGO_TERM_COLOR: always
+
+jobs:
+  build:
+
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v4
+    - name: Build
+      run: cargo build --verbose
+    - name: Run tests
+      run: cargo test --verbose
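
The README in this series describes two behaviours that are easy to picture in code: collecting results first and printing them afterwards (the default, as opposed to `--live`), and a recursion depth limit where `1` disables recursion. The following is only a rough sketch of that idea using the Rust standard library; it is not jw's implementation and does not use jwalk. The names `collect_paths` and `write_collected` are hypothetical, and treating a depth of `0` as "no limit" is an assumption based on the documented default.

```rust
use std::fs;
use std::io::{self, BufWriter, Write};
use std::path::{Path, PathBuf};

/// Gathers every entry under `root` into memory before anything is printed.
/// Assumption: `max_depth == 0` means "no limit"; `max_depth == 1` lists
/// only the direct children of `root`, i.e. no recursion.
fn collect_paths(root: &Path, max_depth: usize) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    // Explicit work stack instead of recursion; each item carries the depth
    // of the entries that the directory contains.
    let mut pending = vec![(root.to_path_buf(), 1usize)];
    while let Some((dir, depth)) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            // file_type() does not follow symlinks, so links are listed
            // but never descended into.
            if entry.file_type()?.is_dir() && (max_depth == 0 || depth < max_depth) {
                pending.push((path.clone(), depth + 1));
            }
            found.push(path);
        }
    }
    Ok(found)
}

/// Writes all collected paths through one buffered writer, so the sink
/// receives a few large writes instead of one small write per path.
fn write_collected<W: Write>(paths: &[PathBuf], sink: W) -> io::Result<()> {
    let mut out = BufWriter::new(sink);
    for path in paths {
        writeln!(out, "{}", path.display())?;
    }
    out.flush()
}
```

Calling `write_collected(&collect_paths(Path::new("."), 0)?, io::stdout().lock())` would print the whole tree in one buffered pass. A "live" variant would write each path as it is discovered, which is the constant terminal output the README warns about.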