This repository contains tools for manipulating CSV files, all written in Rust. Most of these are a page or two long, and they're intended to run as quickly as possible without requiring heroic optimization. If you like these, you will probably also like xsv, which provides an extensive toolkit of CSV-processing utilities written in Rust.
catcsv: Concatenate a directory of CSV files into a single CSV stream, decompressing as needed.
fixed2csv: Convert fixed-width fields to CSV.
geochunk: Add a column to a CSV file that groups records into similarly-sized chunks based on US ZIP codes and census data.
scrubcsv: Turn messy, slightly corrupt CSV files into something clean and standardized.
hashcsv: Add a new column to a CSV file, containing a hash of the other columns. Useful for de-duplicating.
geocode-csv (separate repo): Geocode CSV files in bulk using the Smarty API (or other APIs). This is in a separate repo because it depends on tokio and networking and a more complicated build system.
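To illustrate the idea behind hashcsv (hash the existing columns, append the result as a new column, and use it to spot duplicates), here is a minimal sketch that uses only the Rust standard library. It is not the tool's actual implementation. The function names (row_hash, with_hash_column, dedup_by_hash) are hypothetical, rows are assumed to be already parsed into fields, and DefaultHasher is an assumed stand-in for whatever hash hashcsv really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hash all fields of one record into a single 64-bit value.
/// (Hypothetical helper; DefaultHasher is an assumption, not hashcsv's hash.)
fn row_hash(fields: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    // Hash each field separately so that field boundaries affect the result.
    for field in fields {
        field.hash(&mut hasher);
    }
    hasher.finish()
}

/// Return a copy of the record with the hash appended as a trailing
/// column, rendered as 16 hexadecimal digits.
fn with_hash_column(fields: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = fields.iter().map(|f| f.to_string()).collect();
    out.push(format!("{:016x}", row_hash(fields)));
    out
}

/// Keep the first occurrence of each distinct record, comparing records
/// by their hash only (so a hash collision would drop a distinct record).
fn dedup_by_hash<'a>(rows: &[Vec<&'a str>]) -> Vec<Vec<&'a str>> {
    let mut seen = HashSet::new();
    rows.iter()
        .filter(|row| seen.insert(row_hash(row)))
        .cloned()
        .collect()
}
```

Calling dedup_by_hash on rows such as ["1", "alice"], ["2", "bob"], ["1", "alice"] keeps the first two. Note that DefaultHasher's algorithm is not guaranteed to stay the same across Rust releases, so a real tool that writes hashes to disk would want a hash with a stable, documented output.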
In general, this repository should contain standard modern Rust code, formatted using cargo fmt and the supplied settings. The code should produce no warnings when checked with clippy.
These tools were written over several years, and they represent a history of Rust at Faraday. The following dependencies should be replaced if we get the chance:
structopt: Upgrade to the latest clap, which includes it.
docopt: Replace with clap's new structopt support.
error_chain and failure: Replace with anyhow (plus thiserror if we need specific custom error types).
In general, it's a good idea to update any older code to match the newest code.