Building a datapackage
abradyIGS edited this page Apr 11, 2022 · 17 revisions
There are many valid ways to build a datapackage, depending on the underlying structure of your DCC's data. Although you must submit all tables to have a valid package, most tables, and most columns within most tables, can be left blank, so you can use only the pieces of the C2M2 that best fit your data. This page does not discuss modeling per se; it is designed to give a quick overview of how to finalize your datapackage. If you need help turning your internal data model into something compatible with the C2M2, please contact the helpdesk directly for individualized support.
- Build the tables that your DCC has data for according to the technical documentation. Supplemental table information is also available in this wiki.
- Download the C2M2 JSON file from the CFDE OSF space and save it in the same directory as your tables.
- Follow the submission prep script instructions to generate controlled vocabulary tables from your data.
- If you do not have 48 table files, download blank headers for the missing tables here and save them to the same directory.
- OPTIONAL: Run the frictionless validator script.
- Submit your datapackage using the cfde-submit tool. NOTE: This will only work if you have been onboarded as a data submitter.
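Before submitting, it can help to confirm that every expected table file is present in the package directory. The following is a minimal sketch, not part of the official CFDE tooling. It assumes the C2M2 JSON file follows the Frictionless datapackage layout, with a top-level `resources` list whose entries each carry a `path` such as `file.tsv`. The function names `expected_tables` and `missing_tables` are hypothetical and exist only for illustration.

```python
import json
from pathlib import Path


def expected_tables(schema_path):
    """Return the table file names listed in a datapackage JSON file.

    Assumption: the JSON has a top-level "resources" list and each
    entry has a "path" key holding the table file name.
    """
    with open(schema_path, encoding="utf-8") as handle:
        schema = json.load(handle)
    return sorted(resource["path"] for resource in schema.get("resources", []))


def missing_tables(package_dir, expected):
    """Return the expected table files that are absent from package_dir."""
    present = {entry.name for entry in Path(package_dir).iterdir() if entry.is_file()}
    return sorted(name for name in expected if name not in present)
```

Calling `missing_tables(package_dir, expected_tables(schema_path))` with your own directory and JSON path returns the files that still need blank headers. An empty list means all tables named in the JSON file are present.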
C2M2 Table Guide
- analysis_type.tsv
- anatomy.tsv
- assay_type.tsv
- biosample.tsv
- biosample_disease.tsv
- biosample_from_subject.tsv
- biosample_gene.tsv
- biosample_in_collection.tsv
- biosample_substance.tsv
- collection.tsv
- collection_anatomy.tsv
- collection_compound.tsv
- collection_defined_by_project.tsv
- collection_disease.tsv
- collection_gene.tsv
- collection_in_collection.tsv
- collection_phenotype.tsv
- collection_protein.tsv
- collection_substance.tsv
- collection_taxonomy.tsv
- compound.tsv
- data_type.tsv
- dcc.tsv (formerly primary_dcc_contact.tsv)
- disease.tsv
- file.tsv
- file_describes_biosample.tsv
- file_describes_collection.tsv
- file_describes_subject.tsv
- file_format.tsv
- file_in_collection.tsv
- gene.tsv
- id_namespace.tsv
- ncbi_taxonomy.tsv
- phenotype.tsv
- phenotype_disease.tsv
- phenotype_gene.tsv
- project.tsv
- project_in_project.tsv
- protein.tsv
- protein_gene.tsv
- subject.tsv
- subject_disease.tsv
- subject_in_collection.tsv
- subject_phenotype.tsv
- subject_race.tsv
- subject_role_taxonomy.tsv
- subject_substance.tsv
- substance.tsv