The Field Compression Laboratory aims to evaluate the impact of lossy compression on the accuracy of meteorological quantities used in numerical weather prediction. The current framework includes a Python library (fcpy) and example notebooks. Currently, we support latitude/longitude and Gaussian gridded data in netCDF and GRIB formats.
- Linux or macOS
- Anaconda/Miniconda
To set up or update the environment with the required dependencies and download the sample data used in the examples, run the following command from the command line:
scripts/conda_init.sh
Below is a minimal example of how to use fcpy to compare the effect of lossy compression on the relative error of specific humidity q. To set up an experiment in fcpy, load a GRIB or netCDF dataset and create an fcpy suite that defines a baseline, a list of compressors, and the metrics to compute. To plot the results, use the helper methods provided or your own plotting code.
# Expect a wait of about 30 seconds here
# because of how fcpy imports its Julia packages
import matplotlib.pyplot as plt
import fcpy
# Loads data as an xarray Dataset
ds = fcpy.open_dataset("data/cams_q_20191201_v3.nc")
# Only select specific humidity q
ds = ds[["q"]]
# Define the suite. Instead of telling fcpy how many bits
# to iterate through, we let it work out the range from the
# bit-information metric of Klöwer et al. (2021).
suite = fcpy.Suite(
    ds=ds,
    baseline=fcpy.Float(bits=32),
    compressors=[
        fcpy.Round(),
        fcpy.Log(fcpy.LinQuantization()),  # <- nested compressor
    ],
    metrics=[fcpy.RelativeError, fcpy.AbsoluteError],
    bits=None,  # <- computes number of bits using Klöwer et al. (2021)'s bit-information
)
# Plot the maximum relative error per bit and compressor combination
suite.lineplot(fcpy.RelativeError, reduction="max")
plt.savefig("sample.png", dpi=300)
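To make the two kinds of compressor used in the example more concrete, here is a minimal NumPy sketch of the underlying ideas: rounding a float32 mantissa to a fixed number of bits, and linear quantization onto an evenly spaced grid. It is not fcpy's implementation. The function names `round_mantissa` and `lin_quantize`, the round-half-up rule, and the synthetic input values are assumptions made for illustration only.

```python
import numpy as np


def round_mantissa(x, keepbits):
    """Round float32 values to `keepbits` explicit mantissa bits (illustrative)."""
    x = np.ascontiguousarray(x, dtype=np.float32)
    drop = 23 - keepbits  # float32 stores 23 explicit mantissa bits
    if drop <= 0:
        return x.copy()
    bits = x.view(np.uint32)
    half = np.uint32(1 << (drop - 1))
    mask = np.uint32((0xFFFFFFFF << drop) & 0xFFFFFFFF)
    # Add half of the last kept bit, then zero the dropped bits.
    return ((bits + half) & mask).view(np.float32)


def lin_quantize(x, bits):
    """Map values onto 2**bits evenly spaced levels (illustrative, assumes max > min)."""
    x = np.asarray(x, dtype=np.float64)
    lo, hi = x.min(), x.max()
    levels = 2**bits - 1
    codes = np.round((x - lo) / (hi - lo) * levels)  # integer codes 0..levels
    return lo + codes / levels * (hi - lo)


# Synthetic positive values on a humidity-like scale (not real data).
rng = np.random.default_rng(0)
q = rng.uniform(1e-6, 2e-2, size=10_000).astype(np.float32)
q64 = q.astype(np.float64)

rounded = round_mantissa(q, keepbits=7).astype(np.float64)
quantized = lin_quantize(q, bits=16)

max_rel_round = np.max(np.abs(rounded - q64) / q64)
max_rel_quant = np.max(np.abs(quantized - q64) / q64)
max_abs_quant = np.max(np.abs(quantized - q64))
print(f"rounding, 7 mantissa bits:    max relative error {max_rel_round:.2e}")
print(f"linear quantization, 16 bits: max relative error {max_rel_quant:.2e}")
```

In this sketch, rounding bounds the relative error (at most 2^-(keepbits+1) for normal float32 values), whereas linear quantization bounds the absolute error at half a quantization step, so its relative error grows for values close to the minimum.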
For options please refer to the API documentation.
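With bits=None, the suite in the example chooses the bit range from the bit-information metric of Klöwer et al. (2021). As a rough illustration of the idea only, the sketch below estimates, for each of the 32 bit positions of a float32 array, the mutual information between that bit in neighbouring elements. The function name `bitwise_information` and the synthetic 1-D signal are assumptions. fcpy's actual computation may differ, and the sketch omits any further processing, such as thresholding, that a production implementation would apply.

```python
import numpy as np


def bitwise_information(a):
    """Mutual information (in bits) per bit position between adjacent float32 values."""
    words = np.ascontiguousarray(a, dtype=np.float32).ravel().view(np.uint32)
    info = np.zeros(32)
    for i in range(32):
        # Extract bit i, counted from the sign bit (i = 0) to the last mantissa bit (i = 31).
        b = ((words >> np.uint32(31 - i)) & np.uint32(1)).astype(np.intp)
        left, right = b[:-1], b[1:]
        # Joint probabilities of the four (left, right) bit combinations.
        joint = np.bincount(2 * left + right, minlength=4).reshape(2, 2) / left.size
        # Product of the marginals, i.e. the joint distribution under independence.
        outer = np.outer(joint.sum(axis=1), joint.sum(axis=0))
        nz = joint > 0
        info[i] = np.sum(joint[nz] * np.log2(joint[nz] / outer[nz]))
    return info


# Synthetic smooth signal (not real data): neighbouring values are strongly correlated.
signal = (1.0 + 0.5 * np.sin(np.linspace(0.0, 20.0, 100_000))).astype(np.float32)
info = bitwise_information(signal)
for i in (0, 9, 20, 31):
    print(f"bit {i:2d}: {info[i]:.4f} bits of information")
```

Bit positions whose estimated information is close to zero behave like noise in this estimate, which makes them candidates for removal by rounding.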
The easiest way to get started is to run the Jupyter notebooks under notebooks/
with the following command:
scripts/conda_run_notebooks.sh
There you will find two example notebooks, examples-interactive and examples-programmatic. The former shows how to create plots interactively and the latter how to create them programmatically.
The fcpy command-line interface offers an easy way to determine the number of bits required for each variable along the selected dimensions, reported as a CSV table.
fcpy --input data/cams_q_20191201_v3.nc --vars q --subset lev=0-10
This will create the following CSV output table:
var_name,lev,compressor,bits,sigmas
q,1.0,Round,14.0,0.32247692346572876
q,2.0,Round,14.0,0.4317324161529541
q,3.0,Round,14.0,0.493638277053833
q,4.0,Round,15.0,0.5703839063644409
q,5.0,Round,15.0,0.6555898189544678
q,6.0,Round,16.0,0.774020254611969
q,7.0,Round,17.0,0.8185981512069702
q,8.0,Round,18.0,0.8566567897796631
q,9.0,Round,19.0,0.8976887464523315
q,10.0,Round,17.0,0.9218334555625916
q,1.0,LinQuantization,13.0,0.9965649843215942
q,2.0,LinQuantization,13.0,0.9766294956207275
q,3.0,LinQuantization,12.0,0.966712474822998
q,4.0,LinQuantization,12.0,0.9503071904182434
q,5.0,LinQuantization,12.0,0.9498687982559204
q,6.0,LinQuantization,14.0,0.9693909883499146
q,7.0,LinQuantization,15.0,0.9825363159179688
q,8.0,LinQuantization,15.0,0.9840608239173889
q,9.0,LinQuantization,16.0,1.0056097507476807
q,10.0,LinQuantization,16.0,0.9886303544044495
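One possible way to use this table, sketched below with Python's standard library, is to find for each variable and compressor the largest number of bits required over the selected levels. The rows embedded in the snippet are a subset copied from the output above so that the sketch is self-contained; in practice the table produced by fcpy would be read instead.

```python
import csv
import io

# A few rows copied from the table above, embedded to keep the sketch self-contained.
table = """var_name,lev,compressor,bits,sigmas
q,1.0,Round,14.0,0.32247692346572876
q,9.0,Round,19.0,0.8976887464523315
q,10.0,Round,17.0,0.9218334555625916
q,1.0,LinQuantization,13.0,0.9965649843215942
q,9.0,LinQuantization,16.0,1.0056097507476807
q,10.0,LinQuantization,16.0,0.9886303544044495
"""

max_bits = {}
for row in csv.DictReader(io.StringIO(table)):
    key = (row["var_name"], row["compressor"])
    bits = int(float(row["bits"]))  # bits are written as floats, e.g. "14.0"
    max_bits[key] = max(bits, max_bits.get(key, 0))

for (var, compressor), bits in max_bits.items():
    print(f"{var} / {compressor}: up to {bits} bits")
```

Taking the maximum over levels gives a single conservative setting per compressor; keeping the per-level values instead would allow fewer bits on levels that need less precision.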
For more information on how to use the tool, run fcpy --help:
usage: fcpy [-h] --input INPUT [--output OUTPUT] [--dtype {float32}] [--compressors COMPRESSORS [COMPRESSORS ...]] [--vars VARS [VARS ...]]
            [--subset SUBSET [SUBSET ...]]

options:
  -h, --help            show this help message and exit
  --input INPUT         Dataset (.nc or .grib) or MARS request file (.json)
  --output OUTPUT       Output folder
  --dtype {float32}     Convert data to different type
  --compressors COMPRESSORS [COMPRESSORS ...]
                        Example: --compressors Float,LinQuantization Log,Round
  --vars VARS [VARS ...]
                        Variables to process, otherwise all
  --subset SUBSET [SUBSET ...]
                        Variables to subset, e.g. --subset level=0-5
See CONTRIBUTING.md
See DEVELOP.md
Copyright 2022 ECMWF. Licensed under Apache License 2.0. In applying this licence, ECMWF does not waive the privileges and immunities granted to it by virtue of its status as an intergovernmental organisation nor does it submit to any jurisdiction.
Klöwer, M., Razinger, M., Dominguez, J.J. et al. Compressing atmospheric data into its real information content. Nat Comput Sci 1, 713–724 (2021). https://doi.org/10.1038/s43588-021-00156-2