Releases: glamod/cdm_reader_mapper
v1.0.2
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements
- New PyPI classifiers:

  - Development Status :: 5 - Production/Stable
  - Intended Audience :: Science/Research
  - License :: OSI Approved :: Apache Software License
  - Operating System :: OS Independent
v1.0.1
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements
- set package version to v1.0.1
v1.0.0
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer)
Announcements
- Final version used for GLAMOD marine processing release 7.0
Bug fixes
- ``cdm_mapper``: Two reports that describe each other as best duplicates are not flagged as duplicates (DupDetect) (:pull:`149`)
- ``cdm_mapper``: Reindex only if null values available (DupDetect) (:pull:`153`)
v0.4.3
Contributors to this version: Ludwig Lierhammer (:user:`ludwiglierhammer`)
Announcements
^^^^^^^^^^^^^
- First release on PyPI (:issue:`17`)
- First release on Zenodo (:issue:`18`)
v0.4.2
Contributors to this version: Ludwig Lierhammer (:user:`ludwiglierhammer`)
Announcements
^^^^^^^^^^^^^
- First release on PyPI (:issue:`17`)
- First release on Zenodo (:issue:`18`)
v0.4.1
Contributors to this version: Ludwig Lierhammer (:user:`ludwiglierhammer`)
Announcements
^^^^^^^^^^^^^
- First release on PyPI (:issue:`17`)
- First release on Zenodo (:issue:`18`)
v0.4.0
Contributors to this version: Ludwig Lierhammer (:user:`ludwiglierhammer`) and Joseph Siddons (:user:`jtsiddons`)
Announcements
^^^^^^^^^^^^^
- Now under Apache v2.0 license (:pull:`69`)
- First release on PyPI (:issue:`17`)
- First release on Zenodo (:issue:`18`)
New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ``common.getting_files.load_file``: optionally, load data within data reference syntax (:pull:`41`)
- ``common.getting_files.load_file``: optionally, clear cache directory (:pull:`45`)
- reworked readthedocs documentation for gathered ``cdm_reader_mapper`` package (:issue:`19`, :pull:`83`)
- ``mdf_reader``: new validation function for datetime objects (:pull:`89`)
- ``mdf_reader``: select time period with new arguments ``year_init`` and ``year_end`` (:pull:`98`)
- ``cdm_mapper``: duplicate check using ``recordlinkage`` (:pull:`81`)
- ``mdf_reader.read``: optionally, set left and right time bounds (``year_init`` and ``year_end``) (:issue:`11`, :pull:`97`)
- ``mdf_reader.read``: optionally, set both external schema and code table paths and external schema file (:issue:`47`, :pull:`111`)
- ``cdm_mapper``: change both columns history and report_quality during duplicate_check (:pull:`112`)
- ``cdm_mapper``: optionally, set column names to be ignored during duplicate check (:pull:`115`)
- ``cdm_mapper``: optionally, set offset values for duplicate_check (:pull:`119`)
- ``cdm_mapper``: optionally, set column entries to be ignored during duplicate_check (:pull:`119`)
- ``cdm_mapper``: add both column names ``station_speed`` and ``station_course`` to default duplicate check list (:pull:`119`)
- ``cdm_mapper``: optionally, re-index data in ascending order according to the number of nulls in each row (:pull:`119`)
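
The re-indexing option in the last entry above can be pictured with a small pandas sketch. It only illustrates the idea of ordering rows by their number of nulls and is not the package's implementation: the helper name ``reindex_by_null_count`` and the sample columns are hypothetical, and keeping the original index labels is an assumption.

```python
import pandas as pd


def reindex_by_null_count(df: pd.DataFrame) -> pd.DataFrame:
    """Return df with rows ordered by ascending number of null values.

    Hypothetical helper for illustration; original index labels are kept.
    """
    # Count the missing values in every row.
    null_counts = df.isna().sum(axis=1)
    # A stable sort keeps the original order of rows with equal counts.
    order = null_counts.sort_values(kind="stable").index
    return df.loc[order]


nan = float("nan")
reports = pd.DataFrame(
    {
        "latitude": [nan, 54.1, 54.1],
        "longitude": [nan, 7.9, nan],
        "station_speed": [3.0, 3.0, 3.0],
    }
)
ordered = reindex_by_null_count(reports)
```

Here the most complete report (no nulls) comes first and the report with two missing values comes last.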
Breaking changes
^^^^^^^^^^^^^^^^
- set chunksize from 10000 to 3 in testing suite (:pull:`35`)
- ``cdm_mapper``: read header column ``location_quality`` from ``(c1, LZ)`` and set fill_value to ``0`` (:issue:`36`, :pull:`37`)
- ``cdm_mapper``: set default value of header column ``report_quality`` to ``2`` (:issue:`36`, :pull:`37`)
- reading C-RAID data: set decimal places according to input file data precision (:pull:`60`)
- always convert data types of both ``int`` and ``float`` in schemas into default data types (:issue:`59`, :pull:`60`)
- ``cdm_mapper.map_model``: call function without input parameter ``data_atts`` (:issue:`66`, :pull:`67`)
- ``decimal_places`` information is moved from ``mdf_reader.schema`` to ``cdm_mapper.tables``; ``decimal_places`` in user-given schemas will be ignored (:issue:`66`, :pull:`67`)
- ``cdm_mapper`` does not need any attribute information from ``mdf_reader`` (:issue:`66`, :pull:`67`)
- ``cdm_mapper``: map ICOADS wind direction data (``361`` -> ``0``; ``362`` -> ``np.nan``) (:pull:`82`)
- ``cdm_mapper``: set fill_value to ``UNKNOWN`` for C-RAID's ``primary_station_id`` (:pull:`93`)
- ``cdm_mapper``: map C-RAID quality flags to CDM quality flags (:pull:`94`)
- ``mdf_reader``: summarize schema and code tables (:issue:`11`, :pull:`97`)
- ``mdf_reader``: rename ``c_raid`` to ``craid``, ``gcc_immt`` to ``gcc`` and ``imma1`` to ``icoads`` (:issue:`11`, :pull:`97`)
- ``cdm_mapper``: summarize tables and code tables (:issue:`11`, :pull:`97`)
- ``cdm_mapper``: rename ``c_raid`` to ``craid`` and ``gcc_mapping`` to ``gcc`` (:issue:`11`, :pull:`97`)
- ``metmetpy``: rename ``immt`` to ``gcc`` and ``imma`` to ``icoads`` (:issue:`11`, :pull:`97`)
- ``cdm_mapper.map_model``: use standardized imodel_name as <data_model> (e.g. icoads_r300_d701) (:issue:`11`, :pull:`97`)
- ``mdf_reader.read``: use standardized imodel_name as <data_model> (e.g. icoads_r300_d701) (:issue:`11`, :pull:`97`)
- ``mdf_reader``: (``core``, ``VS``) set column_type to ``key`` for all ICOADS decks (:issue:`11`, :pull:`97`)
- ``cdm_mapper``: rename pub47_noc mapping to pub47 (:pull:`102`)
- Note for each function call: rename ``data_model`` into ``imodel``, e.g. imodel=icoads_r300_d704 (:pull:`103`)
- ``cdm_mapper.map_model``: call with (data, imodel=imodel) (:pull:`103`)
- ``mdf_reader.read``: call with (source, imodel=imodel) (:pull:`103`)
- Re-order arguments to ``mdf_reader.validate``, and create argument for ``ext_table_path`` (:pull:`105`)
- ``operations``: delete corrections module (:pull:`104`)
- ``cdm_mapper``: duplicate check is available for header table only (:pull:`115`)
- ``cdm_mapper``: set report_quality to ``1`` for bad duplicates (:pull:`115`)
- ``cdm_mapper``: set default primary_station_id to ``4`` for C-RAID mapping (:issue:`117`, :pull:`121`)
- renamed some element names in ``icoads_r300_d730`` schema for consistency (``InsName`` to ``InstName``, ``InsPlace`` to ``InstPlace``, ``InsLand`` to ``InstLand``, ``No_data_entry`` to ``NumArchiveSet``) (:pull:`110`)
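
The ICOADS wind direction entry above (``361`` -> ``0``; ``362`` -> ``np.nan``) can be read as a simple value mapping. The following is a minimal sketch under stated assumptions, not the code used in ``cdm_mapper``: the function name ``map_wind_direction`` is hypothetical, and ``math.nan`` stands in for ``np.nan`` (both are the same floating-point NaN) so that the example needs only the standard library.

```python
import math


def map_wind_direction(value):
    """Map the two special wind direction codes; pass other values through.

    Hypothetical helper: 361 becomes 0 and 362 becomes NaN, as listed
    in the changelog entry. All other values are returned unchanged.
    """
    if value == 361:
        return 0
    if value == 362:
        return math.nan
    return value


# Apply the mapping to a few sample values.
mapped = [map_wind_direction(v) for v in (180, 361, 362)]
```

Because NaN never compares equal to itself, the mapped result should be checked with ``math.isnan`` rather than ``==``.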
Internal changes
^^^^^^^^^^^^^^^^
- replace deprecated ``datetime.datetime.utcnow()`` with ``datetime.datetime.now(datetime.UTC)`` (see: python/cpython#103857) (:pull:`39`, :pull:`43`)
- make use of ``cdm-testdata`` release ``v2024.06.07`` https://github.com/glamod/cdm-testdata/releases/tag/v2024.06.07 (:issue:`44`, :pull:`45`)
- migration to ``setup-micromamba``: https://github.com/mamba-org/provision-with-micromamba#migration-to-setup-micromamba (:pull:`48`)
- update actions to use Node.js 20: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#example-using-versioned-actions (:pull:`48`)
- ``mdf_reader.auxiliary.utils``: rename variable for missing values to ``missing_values`` (:pull:`56`)
- add ``pre-commit`` hooks: ``codespell``, ``pylint`` and ``vulture`` (:pull:`56`)
- use ``pytest.mark.parametrize`` for testing suite (:pull:`61`)
- use ``ast.literal_eval`` instead of ``eval`` (:pull:`64`)
- remove unused code tables in ``mdf_reader`` (:issue:`10`, :pull:`65`)
- ``cdm_mapper.mappings``: use ``datetime`` to convert ``float`` into hours and minutes.
- add FOSSA license scanning to GitHub workflows (:pull:`80`)
- add ``cdm_reader_mapper`` author list including ORCID iDs (:pull:`38`, :pull:`49`)
- ``mdf_reader``: replace empty strings with missing values (:pull:`89`)
- ``metmetpy``: use function ``overwrite_data`` in all platform type correction functions (:pull:`89`)
- rename ``data_model`` into ``imodel`` (:pull:`103`)
- implement assertion tests for module operations (:pull:`104`)
- ``cdm_mapper``: put settings for duplicate check in _duplicate_settings (:pull:`119`)
- ``cdm_mapper``: use pandas.apply function instead of for loops in duplicate_check (:pull:`119`)
- adding some more duplicate checks to testing suite (:pull:`119`)
- ``cdm_mapper``: re-adding consideration of indexes of nan values during transformation (:pull:`125`)
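
Two of the standard-library replacements listed above, the timezone-aware alternative to ``utcnow()`` and ``ast.literal_eval`` in place of ``eval``, can be sketched as follows. This is an illustration only, not code taken from the package: the settings string and the helper ``is_literal`` are hypothetical, and ``datetime.timezone.utc`` is used because the ``datetime.UTC`` alias only exists in Python 3.11 and later.

```python
import ast
import datetime

# Timezone-aware replacement for the deprecated datetime.datetime.utcnow().
# datetime.UTC (Python 3.11+) is an alias of datetime.timezone.utc.
now_utc = datetime.datetime.now(datetime.timezone.utc)

# ast.literal_eval accepts only Python literals, so a settings string is
# parsed without running arbitrary code the way eval would.
settings = ast.literal_eval("{'chunksize': 3, 'decks': [701, 704]}")


def is_literal(text: str) -> bool:
    """Return True if text is a plain Python literal (hypothetical helper)."""
    try:
        ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return False
    return True
```

Unlike ``utcnow()``, the resulting datetime carries its UTC offset, and ``literal_eval`` rejects expressions such as function calls instead of executing them.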
Bug fixes
^^^^^^^^^
- indexing working with user-given chunksize (:pull:`35`)
- fix reading of custom schema in ``mdf_reader.read`` (:pull:`40`)
- ensure ``format`` schema field for delimited files is passed correctly, avoiding ``"...Please specify either format or field_layout in your header schema..."`` error (:pull:`40`)
- avoid the loss of data precision caused by data type conversion by using the default data types of both ``int`` and ``float`` (:issue:`59`, :pull:`60`)
- reading C-RAID data: adjust datetime formats to read dates into ``MDFFileReader`` (:pull:`60`)
- ensure external code tables are used when using an external schema in ``mdf_reader.read`` (:pull:`105`)
- update readme and example Jupyter notebooks to reflect :pull:`103` (:pull:`110`)
- restructure ``CLIWOC_datamodel`` Jupyter notebook to add an example of data model construction (:pull:`110`)
- remove ``create_data_model.ipynb`` example Jupyter notebook (:pull:`110`)
v0.3.0
Contributors to this version: Ludwig Lierhammer (:user:`ludwiglierhammer`) and Joseph Siddons (:user:`jtsiddons`)
New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ``mdf_reader``: read C-RAID netCDF buoy data (:issue:`13`, :pull:`24`, :pull:`28`)
- adding both GCC IMMT and C-RAID netCDF data to ``test_data`` (:pull:`24`, :pull:`28`)
- ``cdm_mapper``: adding C-RAID mapping and code tables (:issue:`13`, :pull:`28`)
- ``cdm_mapper``: add ``load_tables`` to ``__init__.py`` (:pull:`32`)
Breaking changes
^^^^^^^^^^^^^^^^
- adding tests for IMMT and C-RAID data (:issue:`26`, :pull:`24`, :pull:`28`)
- ``cdm_mapper.map_model``: drop duplicated lines in pd.DataFrame before writing CDM table on disk (:pull:`28`)
- add pyarrow (see: pandas-dev/pandas#54466) to requirements
- solving pyarrow-snappy issue (see: openforcefield/openff-nagl#106) (:issue:`33`, :pull:`28`, :pull:`34`)
Internal changes
^^^^^^^^^^^^^^^^
- do not differentiate between tuple and single column names (:pull:`24`)
- ``metmetpy``: do not raise errors if ``validate_datetime``, ``correct_datetime``, ``correct_pt`` and/or ``validate_id`` do not find any entries (:pull:`24`)
- get rid of warnings (:issue:`9`, :pull:`27`)
- adding python 3.12 to testing suite (:pull:`29`)
- set time out for testing suite to 10 minutes (:pull:`29`)
Bug fixes
^^^^^^^^^^
- ``cdm_mapper``: set debugging logger into if statement (:pull:`24`)
- ``cdm_mapper``: do not use code table ``qc_flag`` with ``report_id`` (:pull:`24`)
- ``metmetpy``: fixing ICOADS 30000 NRT functions for ``pandas>=2.2.0`` (:pull:`31`)
- ``cdm_mapper.read_tables``: if table not available return empty ``pd.DataFrame`` (:pull:`32`)
v0.2.0
Contributors to this version: Ludwig Lierhammer (@ludwiglierhammer) and Joseph Siddons (@jtsiddons)
Breaking changes
- move converters and decoders from ``common`` to ``mdf_reader/utils`` (PR #3)
- delete redundant functions from ``cdm_reader_mapper.common``
- ``cdm_reader_mapper``: import common in ``__init__.py``
- remove unused modules from ``metmetpy``
- ``cdm_reader_mapper.mdf_reader``: split data_models into code_tables and schema
- logging: allow for use of log file (PR #6)
- cannot use as command-line tool anymore (PR #22)
- outsource input and result data to ``cdm-testdata`` (GH #16, PR #21)
Internal changes
- adding tests to cdm_reader_mapper testing suite (GH #12, PR #2, #20, #22)
- adding testing result data (PR #4)
- use slugify instead of unidecode for licensing reasons
- remove pip install instruction (PR #2)
- ``HISTORY.rst`` has been renamed ``CHANGES.rst``, to follow ``xclim``-like conventions (PR #7).
- speed up mapping functions with ``swifter`` (PR #4)
- ``mdf_reader``: adding auxiliary functions and classes (PR #4)
- ``mdf_reader``: read tables line-by-line (PR #20)
Bug fixes
- Fixed an issue with missing ``conda`` dependencies in the ``cdm_reader_mapper`` documentation (PR #14)