Releases: draeger-lab/ModelPolisher
Version 2.1
Version 2.1-beta
This is a beta release with much needed improvements.
Some non-critical issues surfaced during testing and still need attention, and a few smaller features are not implemented yet; they will be included in the final release.
Features and Enhancements
- Added a runModelPolisher script to run the dockerized version with BiGG annotation and offline validation. It takes the path to a directory containing the models to be polished and writes the polished, gzipped models into an out directory inside the input directory.
This runs docker with the user's id and corresponding group id, so that the output has correct ownership.
Due to some issues, which are hopefully fixed now, this might hang when trying to find java user prefs.
Additionally, as both databases are started in detached mode, this will likely crash on first run.
Start up the DB containers manually before the first run, or wait a bit after the crash and run the script again.
- Updated to use BiGGDB release 1.6
- ModelPolisher now downloads a current version of the identifiers.org registry to validate annotations
- Optimized several bottlenecks, such as waiting on database connections, and added caching for the results of queries that only need to be run once instead of for every species, reaction, etc.
ModelPolisher should now run significantly faster than former releases.
- Changed internal BiGGId handling, so longer and more complex ids should be recognized correctly
- Updated JSONparser to conform with the schema defined by cobrapy
- Validation option now works offline instead of using the online service
- Started moving away from purely manual testing to improve reliability; this is an ongoing process
- BiGGAnnotation now has a progress bar
- Added code to convert bounds and objectives from a reaction's kinetic law to their FBC representation, if not already set
- Added code to convert gene associations set in reaction or model notes to GeneProductAssociations
- Changed annotations to https://identifiers.org from http
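The switch from http to https for identifiers.org annotations can be pictured as a plain prefix rewrite. The following is a minimal sketch in Python, not ModelPolisher's actual (Java) implementation; the function name `to_https_identifiers_uri` is hypothetical and only serves the illustration.

```python
HTTP_PREFIX = "http://identifiers.org/"
HTTPS_PREFIX = "https://identifiers.org/"


def to_https_identifiers_uri(uri: str) -> str:
    """Rewrite an http identifiers.org URI to https; leave other URIs untouched."""
    if uri.startswith(HTTP_PREFIX):
        # Swap only the scheme and keep the collection/identifier part as-is.
        return HTTPS_PREFIX + uri[len(HTTP_PREFIX):]
    return uri
```

URIs that already use https, or that point to other hosts, pass through unchanged, so the rewrite can be applied repeatedly without side effects.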
Bugfixes
- Not using BiGGAnnotations or AnnotateDB should not throw NullPointerExceptions anymore
- Fixed several possible NullPointer sources
- Mostly fixed ClassCastException for GPR parsing; it might still occur for very large gene reaction rules
- Fixed connections to BiGGDB leaking in a few places
- Changed default ports for BiGGDB and ADB to work correctly
- MatFiles are now closed after reading
- Entities with empty ids are not processed
Caveats
- add-adb-annotations should not be used for now, as some of the annotations added cannot be resolved or are wrong. This is most likely an issue with the current implementation in ModelPolisher; it will be investigated and possibly fixed for the full release
- Some annotations retrieved are not specific to the entity described, e.g. annotations might describe the same entity in different organisms or compartments, not just in the actual model being polished.
This needs further investigation for the full release
Version 2.0.1
Bugfixes
ModelPolisher no longer attempts to traverse directories in the input directory.
Features
Running ModelPolisher using docker now pulls images from DockerHub instead of building them.
A second docker-compose file (docker-compose.devel.yml) is provided for development builds using the local ModelPolisher jar.
Version 2.0
Features and Enhancements
- Extended Annotation Capabilities: Earlier versions of ModelPolisher could annotate only those model elements that have a BiGG Id. This project added functionality to annotate elements without a BiGG Id as well.
- AnnotateDB Integration: ModelPolisher now uses AnnotateDB, a database containing mappings of annotations found in computational biological models, in addition to BiGGDB to annotate models.
- Docker Containerisation: Previously, ModelPolisher used BiGG Models Database as the only resource for annotation. Now, the new project AnnotateDB has been added. To simplify the end user's interaction with the software, ModelPolisher is now available in Docker containers so that the database backend no longer has to be restored from a dump file.
- Glossary: ModelPolisher 2.0 can now produce glossary files for models which are annotated by ModelPolisher. This functionality was added during the project.
- CombineArchive Support: ModelPolisher 2.0 produces multiple files (output model, glossary). Thus CombineArchive support was added to bundle the complete output for one model into a single Combine Archive.
- Improved Documentation: ModelPolisher documentation is updated with the recent features and now provides better instructions to build and run ModelPolisher.
Version 2.0 of ModelPolisher no longer supports an SQLite database backend.
Bug Fixes
- Updated Matlab Parser: The Matlab parser used by ModelPolisher to parse .mat models was based on the library com.jmatio, which was no longer maintained. I updated the Matlab parser to use HebiRobotics/MFL.
Version 1.7
Features and enhancements:
- Better and more general text messages (English and German)
- Additional command-line option --no-model-notes to avoid adding XHTML content to the file
- Technical improvements to database connectors
- Automatic grouping and alphabetical sorting of MIRIAM annotations in addition to duplicate identification and removal
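The grouping, sorting, and duplicate removal of MIRIAM annotations described in the last item could look roughly like the sketch below. It is written in Python for brevity and is not taken from ModelPolisher; the function name and the way the collection is extracted from the URI are assumptions made for this illustration.

```python
from collections import defaultdict


def group_and_sort_annotations(uris):
    """Drop duplicate URIs, group them by collection, and sort alphabetically."""
    groups = defaultdict(set)  # a set per collection removes duplicates
    for uri in uris:
        # Assumption: the collection is the path segment right after the host,
        # e.g. "kegg.compound" in "http://identifiers.org/kegg.compound/C00002".
        parts = uri.split("/")
        collection = parts[3] if len(parts) > 3 else ""
        groups[collection].add(uri)
    # Sort collections and the URIs inside each collection.
    return {name: sorted(members) for name, members in sorted(groups.items())}
```

A dictionary keeps insertion order in Python, so iterating over the result yields the collections alphabetically, each with its sorted, duplicate-free URIs.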
Bug-fixes:
- Fixed an error in the annotation of compartments: newly created compartments were always named "default"; the name is now taken from the BiGG database again.
- Log messages are no longer duplicated when a log file option was given.
- Smaller source-code and technical improvements.
Version 1.6
Features and enhancements:
- Added support for SQLite via sqlite-jdbc: The jar now contains a ready-to-use SQLite version of BiGGDB, so no locally running PostgreSQL BiGGDB is required to use the annotation functionality of ModelPolisher. To run ModelPolisher with the in-jar DB, use the "annotate-with-bigg=true" option and omit the additional arguments you would provide when connecting to PostgreSQL. This should in general be faster than annotation with the PSQL version.
- We now use the MIRIAM registry-lib to check all resources against the identifiers.org regexes and provide simple correction mechanisms for KEGG collections, EC-codes, NCBIGI and GO IDs. If a resource does not match and cannot be corrected, it will not be added to the model and will be logged instead.
- COBRAparser is now able to parse multimodel mat files. Each model after the initial one gets an incrementing integer as suffix. Matlab fields are now matched case-insensitively if no exact match can be obtained, or are assigned if the field name is a substring of a variant.
- Basic unit definitions will now be added to models when missing.
- Added a method to check for wrong HTML tags in an SBML model and replace them.
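The check of resources against identifiers.org regexes with a simple correction step can be sketched as follows. This is a hedged Python illustration, not the MIRIAM registry-lib API: the two patterns are illustrative stand-ins for the registry's regexes, and the "add a missing GO: prefix" repair is a hypothetical example of a correction mechanism.

```python
import re

# Illustrative patterns only; the authoritative regexes come from the
# identifiers.org registry and may differ from these.
PATTERNS = {
    "go": re.compile(r"^GO:\d{7}$"),
    "kegg.compound": re.compile(r"^C\d{5}$"),
}


def check_resource(collection, identifier):
    """Return a valid identifier, a corrected one, or None if it cannot be repaired."""
    pattern = PATTERNS.get(collection)
    if pattern is None:
        return None  # unknown collection: nothing to validate against
    if pattern.match(identifier):
        return identifier
    # Hypothetical correction: GO identifiers that lack the "GO:" prefix.
    if collection == "go":
        candidate = "GO:" + identifier
        if pattern.match(candidate):
            return candidate
    return None
```

A caller would add the returned identifier to the model and, when the result is `None`, log the rejected resource instead of annotating it.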
Bugfixes:
- Added a few null checks to COBRAparser and improved its ability to recognize resource IDs (mainly PUBMED and DOI).
- Fixed a bug in COBRAparser that cut off the last character of a MIRIAM ID.
- Added a check to ensure that output files of a batch process always have the correct filetype ending.
- Made the SQL query that retrieves resources from BiGGDB a bit more specific. This should prevent annotating reaction-specific resources to metabolites and vice versa.
Miscellaneous:
- Refactoring removed some redundant assignments. This may result in a slight speedup when running ModelPolisher on a folder of models.
- If you use runModelPolisher.sh, a copy of the call to ModelPolisher.sh with your provided arguments is now saved in ModelPolisherTemplate.sh, overwriting the default template. A simple backup of your previous call can be found in ModelPolisherTemplate.bckp.
Version 1.5
Hotfix (21.12.2016)
Running ModelPolisher without BiGG annotation was missing method calls, so models created with this option disabled are possibly incomplete; this has now been fixed. Furthermore, [organism] should now be replaced correctly.
Features:
- JSONParser has been added, supporting COBRApy models stored in json format according to the schema defined at https://github.com/opencobra/cobrapy/blob/devel/cobra/io/json.py
- The core functionality of ModelPolisher has been split into two parts:
  - SBMLPolisher now creates SBML files for all supported input formats without pulling additional information from the BiGG knowledgebase
  - A new option annotate-with-bigg (default = false) has been added to optionally enable annotation from the BiGG knowledgebase
- A flow chart displaying the structure of ModelPolisher has been added to doc/img
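To give an idea of what a COBRApy JSON model looks like to a parser, the sketch below loads a tiny document and counts its main components. It is a Python illustration under assumptions, not the JSONParser added in this release: the required keys and reaction attributes reflect my reading of the cobrapy JSON layout and should be checked against the linked json.py, and `summarize_json_model` is a hypothetical helper.

```python
import json

# Assumed top-level lists of a COBRApy JSON model; verify against cobrapy's json.py.
REQUIRED_KEYS = ("metabolites", "reactions", "genes")


def summarize_json_model(text):
    """Parse a COBRApy-style JSON document and count its main components."""
    model = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in model]
    if missing:
        raise ValueError(f"missing top-level keys: {missing}")
    return {key: len(model[key]) for key in REQUIRED_KEYS}


# A toy model with one metabolite, one reaction, and no genes.
EXAMPLE = json.dumps({
    "id": "toy_model",
    "metabolites": [{"id": "atp_c", "name": "ATP", "compartment": "c"}],
    "reactions": [{
        "id": "ATPM",
        "name": "ATP maintenance",
        "metabolites": {"atp_c": -1.0},
        "lower_bound": 0.0,
        "upper_bound": 1000.0,
        "gene_reaction_rule": "",
    }],
    "genes": [],
})
```

Calling `summarize_json_model(EXAMPLE)` returns the component counts, while a document lacking one of the assumed lists raises a `ValueError`.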
Updates:
- The reactome pattern was updated
Version 1.4
Features:
- Updated pom.xml, so that the local non-maven libs work as local repositories without the need for local_maven-repo.sh; the project is now built with mvn clean compile verify
- Updated assembly.xml to use relative pathnames, as per warnings during maven build
- ModelPolisher now uses the JSBML 1.2-SNAPSHOT from http://jsbml.sourceforge.net/m2repo_snapshots/
- Added JUnit4 tests to BiGGId.java
- Refactored BiGGId.java - both constructors should now have the same behavior
- Added a basic contract class for BiGGDB.java
Bug fixes:
- Fixed a bug that occurred when changing the identifier of reactions that have subsystem information attached: the reference from the member element to the updated reaction must be updated as well
- Added the missing G prefix and fixed the R_EX and R_DM prefixes, which were a letter too short previously, in BiGGId.java
Version 1.3
New features:
- Several improvements were made in the COBRA parser:
  - A source of IndexOutOfBoundsExceptions was resolved.
  - Error messages have been improved.
  - Special cases of common errors in mat files were integrated and are now caught rather than causing failure.
- The JSBML version has been updated to revision 2492.
- It is now possible to disable the use of annotation URIs that are not (yet) included in the MIRIAM registry using the new command-line argument --include-any-uri=<true|false>. If the given value is set to false, only those URLs will be included that contain the prefix http://identifiers.org/.
- External third-party libraries are now included as local Maven repositories in order to simplify the building process for developers. Thanks to @matthiaskoenig for this contribution.
- When creating groups to represent COBRA subsystems, the new SBO term for subsystem is now used.
For documentation purposes, an overview html file has been created and the POM file was slightly updated.
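The effect of --include-any-uri described above can be illustrated with a small filter. This is a minimal Python sketch, not ModelPolisher's code; the function name `filter_uris` is hypothetical, and interpreting "contain the prefix" as "start with" is an assumption.

```python
IDENTIFIERS_PREFIX = "http://identifiers.org/"


def filter_uris(uris, include_any_uri):
    """Keep all URIs when include_any_uri is true, else only identifiers.org ones."""
    if include_any_uri:
        return list(uris)
    # Assumption: "contains the prefix" is read as "starts with the prefix".
    return [uri for uri in uris if uri.startswith(IDENTIFIERS_PREFIX)]
```

With the flag set to false, a URI such as `http://example.org/resource/42` would be dropped, while `http://identifiers.org/kegg.compound/C00002` would be kept.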
Bug fixes:
- A typo was fixed in the model notes file.
Version 1.2
This release includes ModelFix:
- If raw files lack an objective function, ModelPolisher can now automatically declare the biomass reaction as the flux objective. The coefficient defaults to 1.0, but can be influenced via a new command-line argument.
- In cases where no biomass reaction exists, i.e., no reaction whose id matches the pattern for biomass reactions, a new command-line argument can be used to specify one or multiple target reactions. This can be used in combination with the command-line argument for coefficients. ModelPolisher displays a log message in the category SEVERE if it is not possible to find a suitable target reaction. Users are then advised to manually determine appropriate reaction(s).
- Missing required group kind attributes are added.
Further improvements:
- For now, we set the units of compartments to dimensionless. This might change in the future.
- New MIRIAM resources have been added.
- Smaller bug fixes.
To find out more about all command-line arguments, launch with -? or --help. In particular, the two new command-line arguments are:
- --flux-coefficients={1, 2, 3} (here with sample values 1, 2, 3 separated by commas; spaces are optional)
- --flux-objectives={R_example1:R_example2:R_example3} (here with three example reaction identifiers separated by colon).
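How the two argument values relate can be sketched by pairing each target reaction with its coefficient. The Python snippet below is an illustration under assumptions, not ModelPolisher's argument handling: the function name is hypothetical, the values are passed without the surrounding braces, and falling back to 1.0 for reactions without an explicit coefficient is an assumption based on the default mentioned above.

```python
def parse_flux_arguments(coefficients, objectives):
    """Pair colon-separated reaction ids with comma-separated coefficients."""
    reaction_ids = [rid.strip() for rid in objectives.split(":") if rid.strip()]
    # float() tolerates surrounding spaces, so "1, 2, 3" and "1,2,3" both work.
    values = [float(v) for v in coefficients.split(",") if v.strip()]
    # Assumption: reactions without an explicit coefficient fall back to 1.0.
    values += [1.0] * (len(reaction_ids) - len(values))
    return dict(zip(reaction_ids, values))
```

For the sample values above, `parse_flux_arguments("1, 2, 3", "R_example1:R_example2:R_example3")` maps each of the three example reactions to its coefficient in order.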