- Added:
  - A module, `tfs.testing`, has been added and made publicly available. It provides an assert function to compare `TfsDataFrame` objects, similar to the one provided by `pandas`, intended for use in unit tests.
- Changed:
  - The headers of a `TfsDataFrame` are now stored as a `dict` and no longer an `OrderedDict`. This is transparent to the user.
- Fixed:
  - Removed a workaround function which is no longer necessary due to the higher minimum `pandas` version.
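The move of the headers from `OrderedDict` to `dict` noted above can be illustrated with a minimal plain-Python sketch. It is not taken from the package, and the header keys and values are made up. It only shows why such a change can be transparent: a `dict` keeps insertion order and compares equal to an `OrderedDict` holding the same items.

```python
from collections import OrderedDict

# Hypothetical header values, for illustration only.
ordered = OrderedDict([("TITLE", "Example"), ("Q1", 0.31), ("Q2", 0.32)])
plain = dict(ordered)

# Plain dicts preserve insertion order (guaranteed since Python 3.7).
same_order = list(plain) == list(ordered)

# A dict and an OrderedDict with the same items compare equal.
same_content = plain == ordered
```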
- Changed:
  - Migrated to standard `pyproject.toml`.
  - The minimum required `numpy` version is now `numpy 1.24`.
- Fixed:
  - The package is now compatible with `numpy 2.x`.
  - The package's HDF functionality is fully compatible with `numpy 2.x` on `Python >= 3.10` thanks to a `pytables` compatibility release.
  - The package's HDF functionality is limited to `numpy < 2` on `Python 3.9` due to the lack of compatibility from `pytables` on this version.
- Changed:
  - The minimum required `pandas` version is now `pandas 2.1`.
  - Support for `Python 3.8` has been dropped. The minimum required Python version is now `3.9`.
- Fixed:
  - Solved a `DeprecationWarning` appearing when writing a `TfsDataFrame` to disk due to the use of `.applymap`, by using the now recommended `.map` method.
  - Solved a `DeprecationWarning` appearing when reading a file from disk due to the use of `delim_whitespace` in our reader, by using the now recommended `sep` option.
  - Solved a `FutureWarning` appearing when validating a `TfsDataFrame` due to the use of the `pd.option_context('mode.use_inf_as_na', True)` context manager during validation, by explicitly casting infinite values to `NaN`s.
  - Solved a `FutureWarning` appearing when validating a `TfsDataFrame` due to object downcasting happening during validation, by explicitly inferring dtypes first.
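Two of the replacements mentioned above use standard `pandas` and `numpy` calls. The sketch below shows them on made-up data. It is not the package's actual reader or validation code, and the table contents are hypothetical.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical whitespace-separated table, standing in for the body of a file.
text = "NAME S BETX\nBPM1 0.0 inf\nBPM2 1.5 12.3\n"

# `sep=r"\s+"` is the recommended replacement for `delim_whitespace=True`.
frame = pd.read_csv(io.StringIO(text), sep=r"\s+")

# Cast infinite values to NaN explicitly, instead of relying on the
# `mode.use_inf_as_na` option.
cleaned = frame.replace([np.inf, -np.inf], np.nan)
has_missing = bool(cleaned.isna().any().any())
```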
- Fixed:
  - Fixed a regression where writing a `pd.Series`-like object to disk raised an error. It is now possible again.
- Fixed:
  - Fixed the issues with `pandas` `>=v2.1.0` (see `tfs-pandas` `v3.7.1`) by overwriting the `_constructor_from_mgr` function.
- Changed:
  - The dependency on `pandas` was restricted to avoid the latest version, `2.1.0` and above, as a temporary workaround to an attribute access bug that arose with it.
Minor API changes to the `TFSCollections`:

- The old `write_to` and `get_filename` are renamed to `_write_to` and `_get_filename`, as they could only be accessed internally (due to the input parameters not being available to the user). This also means that, in case they are overwritten by a user's implementation, they need to be renamed there!
- The column which is set as index can now also be defined manually, by overwriting the attribute `INDEX`, which defaults to `"NAME"`.
- New Functions of `TFSCollection` Instances:
  - `get_filename(name)`: Returns the filename associated with the property named `name`.
  - `get_path(name)`: Returns the actual file path of the property `name`.
  - `flush()`: Writes the current state of the TFSDataFrames into their respective files.
  - `write_tfs(filename, data_frame)`: Writes the `data_frame` to `self.directory` with the given `filename`.
- New Special Properties of `TFSCollection` Instances:
  - `defined_properties`: Tuple of strings of the defined properties on this instance.
  - `filenames` is a convenience wrapper for `get_filename()`:
    - When called (`filenames(exist: bool)`), it returns a dictionary of the defined properties and their associated filenames. The `exist` boolean filters between existing files or filenames for all properties.
    - Can also be used as either `filenames.name` or `filenames[name]` to call `get_filename(name)` on the instance.
- Moved the define-properties functions directly into the `Tfs`-attribute marker class.
- Return of `None` for the `MaybeCall` class in case of attribute not found (instead of an empty function, which didn't make sense).
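The three access styles described for `filenames` (call, attribute, item) can be sketched in plain Python as below. This is a hypothetical illustration, not the package's implementation: the names `FilenameLookup` and `ExampleCollection`, the `.tfs` suffix and the property names are all assumptions, and the `exist` filter is left out for brevity.

```python
class FilenameLookup:
    """Hypothetical helper: callable, and usable via attribute or item access."""

    def __init__(self, get_filename, names):
        self._get_filename = get_filename
        self._names = tuple(names)

    def __call__(self):
        # Map every defined property to its filename.
        return {name: self._get_filename(name) for name in self._names}

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        return self._get_filename(name)

    def __getitem__(self, name):
        return self._get_filename(name)


class ExampleCollection:
    """Hypothetical stand-in for a collection with two defined properties."""

    defined_properties = ("beta", "phase")

    def get_filename(self, name):
        return f"{name}.tfs"

    @property
    def filenames(self):
        return FilenameLookup(self.get_filename, self.defined_properties)


collection = ExampleCollection()
all_names = collection.filenames()        # dictionary of all properties
by_attribute = collection.filenames.beta  # same as get_filename("beta")
by_item = collection.filenames["phase"]   # same as get_filename("phase")
```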
- Removed:
  - The `append` and `join` methods of `TfsDataFrame` have been removed.
- Changed:
  - The dependency version on `pandas` has been restored to `>=1.0.0` as the above removal restores compatibility with `pandas` `2.0`.
- Changed:
  - Fixed a wrong deprecation of the `.merge` method of `TfsDataFrames`.
- Changed:
  - The dependency on `pandas` has been pinned to `<2.0` to guarantee the proper functioning of the compatibility `append` and `join` methods in `TfsDataFrames`. These will be removed with the next release of `tfs-pandas` and users should use the `tfs.frame.concat` compatibility function instead.
- Fixed:
  - Allow reading of empty lines in headers again.
- Fixed:
  - Any empty strings (`""`) in a file's columns will now properly be read as such and not converted to `NaN`.
- Added:
  - It is now possible to only read the headers of a file by using a new function, `read_headers`. The function API is not exported at the top level of the package but is available to import from `tfs.reader`.
- Added:
  - The `read_tfs` and `write_tfs` functions can now handle reading / writing compressed files, see documentation for details.
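As a rough illustration of transparent compression handling, the sketch below uses plain `pandas`, which infers the compression scheme from the file extension. Whether `read_tfs` and `write_tfs` rely on the same mechanism is an assumption here, and the file name and data are made up. The package documentation remains the reference.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data, for illustration only.
frame = pd.DataFrame({"NAME": ["BPM1", "BPM2"], "S": [0.0, 1.5]})

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.csv.gz"

    # pandas infers gzip compression from the ".gz" suffix when writing...
    frame.to_csv(path, index=False)

    # ...so the file on disk starts with the gzip magic bytes.
    with open(path, "rb") as handle:
        magic = handle.read(2)

    # Reading infers the compression from the suffix as well.
    restored = pd.read_csv(path)
```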
- Changed:
  - Column types are now assigned at read time instead of later on, which should improve performance for large data frames.
- Added:
  - The option is now given to the user to skip data frame validation after reading from file / before writing to file. Validation is left "on" by default, but can be turned off with a boolean argument.
- Changes:
  - The `tfs.frame.validate` function has seen its internal logic reworked to be more efficient and users performing validation on large data frames should notice a significant performance improvement.
  - The documentation has been expanded and improved, notably with the addition of example code snippets.
- Changed:
  - Allow spaces in header names.
- Added:
  - HDF5 read/write.
- Changed:
  - The minimum required Python version is now `3.7`.
- Fixed:
  - Removed dependency on deprecated `numpy.str`.
- Changed:
  - No logging of error messages internally for reading files and checking dataframes. Instead, logging is either moved to `debug`-level or all info is now in the error message itself to be handled externally by the user.
- Fixed:
  - String representation of empty headers is fixed (accidentally printed 'None' before).
- Fixed:
  - Merging functionality from `TfsDataFrame.append`, `TfsDataFrame.join`, `TfsDataFrame.merge` and `tfs.concat` does not crash anymore when encountering a `pandas.DataFrame` (or more for `tfs.concat`) in its input. Signatures have been updated and tests were added for this behavior.
A long-standing issue where merging functionality used on `TfsDataFrame` (through `.merge` or `pandas.concat` for instance) would cause them to be cast back to `pandas.DataFrame` and lose their headers has been patched.
- Breaking changes:
  - The internal API has been reworked for clarity and consistency. Note that anyone previously using the high-level exports `tfs.read`, `tfs.write` and `tfs.TfsDataFrame` will not be affected.
- Added:
  - The `TfsDataFrame` class now has new `.append`, `.join` and `.merge` methods wrapping the inherited methods of the same name and fixing the aforementioned issue.
  - A `tfs.frame.concat` function, exported as `tfs.concat`, has been added to wrap `pandas.concat` and fix the aforementioned issue.
  - A `tfs.frame.merge_headers` function has been added.
  - Top level exports are now: `tfs.TfsDataFrame`, `tfs.read`, `tfs.write` and `tfs.concat`.
- Changes:
  - The `tfs.frame.validate` function is now a public-facing documented API and may be used stably.
  - The `write_tfs` function now appends an `EOL` (`\n`) at the end of the file when writing out for visual clarity and readability. This is purely cosmetic and does not change functionality / compatibility of the files.
  - Documentation and README have been updated and cleared up.
Please do refer to the documentation for the use of the new merging functionality to be aware of caveats, especially when merging headers.
- Changes:
  - The parsing in `read_tfs` has been reworked to make use of `pandas`'s C engine, resulting in drastic performance improvements when loading files. No functionality was lost or changed.
- Fixed:
  - Took care of a numpy deprecation warning when using `np.str`, which should not appear anymore for users.
- Changes:
  - Prior to version `2.0.3`, reading and writing would raise a `TfsFormatError` in case of non-unique indices or columns. From now on, this behavior is an option in `read_tfs` and `write_tfs` called `non_unique_behavior`, which by default is set to log a warning. If explicitly asked by the user, the failed check will raise a `TfsFormatError`.
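The warn-or-raise behavior described above can be sketched as follows. This is a hypothetical illustration and not the package's code: the helper `check_unique`, the exception `FormatError` (standing in for `TfsFormatError`) and the accepted values `"warn"` and `"raise"` are assumptions.

```python
import logging

import pandas as pd

LOGGER = logging.getLogger(__name__)


class FormatError(Exception):
    """Hypothetical stand-in for the package's TfsFormatError."""


def check_unique(frame: pd.DataFrame, non_unique_behavior: str = "warn") -> bool:
    """Return True if index and columns are unique, otherwise warn or raise."""
    if non_unique_behavior not in ("warn", "raise"):
        raise ValueError("non_unique_behavior must be 'warn' or 'raise'")

    problems = []
    if not frame.index.is_unique:
        problems.append("index")
    if not frame.columns.is_unique:
        problems.append("columns")
    if not problems:
        return True

    message = f"Non-unique {' and '.join(problems)} found."
    if non_unique_behavior == "raise":
        raise FormatError(message)
    LOGGER.warning(message)  # default: only log a warning
    return False


# Made-up frame with a duplicated column name.
duplicated = pd.DataFrame([[1, 2]], columns=["A", "A"])
warned_only = check_unique(duplicated)
```

With `non_unique_behavior="raise"`, the same call raises `FormatError` instead of logging.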
- Fixed:
  - Proper error on non-string columns
  - Writing numeric-only mixed type dataframes bug
- Fixed:
  - No longer warns on MAD-X styled string column types (`%[num]s`).
  - Documentation is up-to-date, and plays nicely with `Sphinx`'s parsing.
  - Fix a wrong type hint.
- Breaking Changes:
  - `FixedColumn`, `FixedColumnCollection` and `FixedTfs` have been removed from the package
  - Objects are not converted to strings upon read anymore, and will raise an error
  - Minimum pandas version is 1.0
- Fixed:
  - No longer writes an empty line to file in case of empty headers
  - "Planed" dataframes capitalize plane key attributes to be consistent with other `pylhc` packages, however they can be accessed with and without capitalizing your query.
- Changes:
  - Minimum required `numpy` version is now 1.19
  - TfsDataFrames now automatically cast themselves to pandas datatypes using `.convert_dtypes()`
  - Lighter dependency matrix
  - Full testing of supported Python versions across Linux, macOS and Windows systems through GitHub Actions
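For readers unfamiliar with `.convert_dtypes()`, the sketch below shows the plain `pandas` call on made-up data. How and when the package applies it internally is not shown here.

```python
import pandas as pd

# Hypothetical frame whose columns are all stored with the generic object dtype.
frame = pd.DataFrame({"N": [1, 2, 3], "NAME": ["A", "B", "C"]}, dtype=object)

# convert_dtypes() infers the best matching pandas extension dtypes;
# the integer column becomes the nullable "Int64" dtype.
converted = frame.convert_dtypes()
integer_dtype = str(converted["N"].dtype)
```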
- Fixed:
  - Bug with testing for headers, also in pandas DataFrames
  - Same testing method for all data-frame comparisons
  - Some minor fixes
- Added:
  - Testing of writing of pandas DataFrames
- Added:
  - support for pathlib Paths
  - strings with spaces support (all strings in data are quoted)
  - more validation checks (no spaces in header/columns)
  - nicer string representation
  - left-align of index-column
- Removed:
  - `.indx` from class (use `index="NAME"` instead)
- Fixed:
  - Writing of empty dataframes
  - Doc imports
  - Minor bugfixes
- Fixed:
  - From relative to absolute imports (IMPORTANT FIX!!)
- Fixed:
  - Additional index column after writing is removed again
  - Renamed `significant_numbers` to `significant_digits`
  - `significant_digits` throws proper error if zero-error is given
- Added:
  - Fixed Dataframe Class
  - Type Annotations
- Fixed:
  - Metaclass-Bug in Collections
- Added:
  - Additional Unit Tests
  - Versioning
  - Changelog
- Initial Release