Higashi Usage API
from higashi.Higashi_wrapper import Higashi
higashi_model = Higashi(config_path)
Run the following command to process the input data (this only needs to be run once).
higashi_model.process_data()
This function performs the following tasks:
- generate a dictionary that maps genomic bin loci to node ids
- extract data from data.txt and convert it into hyperedges (triplets)
- create contact maps from the sparse scHi-C data for visualization and the baseline model, and generate node attributes
- (Optional) run linear convolution + random-walk-with-restart (scHiCluster) to impute the contact maps as baseline and visualization
- (Optional) process co-assayed signals
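As a rough illustration of the first two tasks, the sketch below maps genomic bins to node ids and converts contact rows into triplets. It is not Higashi's implementation: the function names, the row layout (cell, chrom1, pos1, chrom2, pos2, count), and the choice to number bin nodes after the cell nodes are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Assumed row layout for illustration: cell, chrom1, pos1, chrom2, pos2, count
Contact = Tuple[int, str, int, str, int, int]


def build_bin_to_node(
    chrom_sizes: Dict[str, int], resolution: int, first_id: int
) -> Dict[Tuple[str, int], int]:
    """Give every (chromosome, bin index) pair a consecutive node id."""
    bin_to_node: Dict[Tuple[str, int], int] = {}
    next_id = first_id
    for chrom, size in chrom_sizes.items():
        n_bins = -(-size // resolution)  # ceiling division
        for bin_index in range(n_bins):
            bin_to_node[(chrom, bin_index)] = next_id
            next_id += 1
    return bin_to_node


def contacts_to_triplets(
    contacts: List[Contact],
    bin_to_node: Dict[Tuple[str, int], int],
    resolution: int,
) -> List[Tuple[int, int, int, int]]:
    """Turn each contact into (cell node, bin node, bin node, count)."""
    triplets = []
    for cell, chrom1, pos1, chrom2, pos2, count in contacts:
        node_a = bin_to_node[(chrom1, pos1 // resolution)]
        node_b = bin_to_node[(chrom2, pos2 // resolution)]
        low, high = sorted((node_a, node_b))  # store each bin pair in one order
        triplets.append((cell, low, high, count))
    return triplets


# Hypothetical toy input: two cells, two short chromosomes, 1 Mb bins.
n_cells = 2
resolution = 1_000_000
chrom_sizes = {"chr1": 2_500_000, "chr2": 1_000_000}
contacts = [
    (0, "chr1", 150_000, "chr1", 2_300_000, 3),
    (1, "chr1", 1_200_000, "chr1", 400_000, 1),
]

# Cells take ids 0..n_cells-1 in this sketch, so bin ids start at n_cells.
bin_to_node = build_bin_to_node(chrom_sizes, resolution, first_id=n_cells)
triplets = contacts_to_triplets(contacts, bin_to_node, resolution)
print(triplets)
```

Each resulting tuple links one cell node with two bin nodes, which is the sense in which a contact becomes a hyperedge.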
The function above is equivalent to calling:
higashi_model.generate_chrom_start_end()
higashi_model.extract_table()
higashi_model.create_matrix()
Before each step is executed, a message is printed to indicate progress, which helps with debugging.
higashi_model.prep_model()
higashi_model.train_for_embeddings()
higashi_model.train_for_imputation_nbr_0()
higashi_model.impute_no_nbr()
higashi_model.train_for_imputation_with_nbr()
higashi_model.impute_with_nbr()
**Extra Notes:**
Higashi saves the model parameters and embeddings every 5 epochs, so the user can check whether the embeddings look good while training is in progress. For instance, suppose the user is not sure how many epochs Higashi needs to converge on a new dataset and sets embedding_epoch to 120 just to be on the safe side. During training, the user finds that the embeddings converge at around epoch 58. Instead of waiting for all 120 epochs to finish, one can wait until the model finishes epoch 60 (as the model saves parameters every 5 epochs) and then interrupt the function. The parameters will be loaded automatically the next time.
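The save-and-resume behaviour described here can be pictured with a toy loop. This is a minimal sketch, not Higashi's training code: the `train` function, the JSON checkpoint file, and the `stop_after` argument (which stands in for a manual interrupt) are hypothetical. It shows why interrupting at epoch 58 would fall back to the epoch-55 checkpoint, whereas waiting for epoch 60 loses nothing.

```python
import json
import os
import tempfile

SAVE_EVERY = 5  # mirrors the "every 5 epochs" interval mentioned in the text


def train(total_epochs, checkpoint_path, stop_after=None):
    """Toy loop that resumes from the last checkpoint, if one exists.

    stop_after simulates the user interrupting the run after that epoch.
    """
    state = {"epoch": 0, "weight": 0.0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as handle:
            state = json.load(handle)

    for epoch in range(state["epoch"] + 1, total_epochs + 1):
        state["weight"] += 1.0  # stand-in for one epoch of optimisation
        state["epoch"] = epoch
        if epoch % SAVE_EVERY == 0:
            with open(checkpoint_path, "w") as handle:
                json.dump(state, handle)
        if stop_after is not None and epoch >= stop_after:
            break
    return state


checkpoint = os.path.join(tempfile.mkdtemp(), "toy_checkpoint.json")

# Interrupt at epoch 58: the newest checkpoint on disk is from epoch 55.
interrupted = train(120, checkpoint, stop_after=58)
with open(checkpoint) as handle:
    saved = json.load(handle)
print(interrupted["epoch"], saved["epoch"])

# Calling train again picks up from the saved epoch and runs to the end.
resumed = train(120, checkpoint)
print(resumed["epoch"])
```

In this sketch, the work done between the last save and the interrupt is simply repeated on the next run.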
A few notes:
- process_data() only needs to be called once unless the input data changes, for instance when chrom_list or the data source changes.
- prep_model() needs to be called before any training or imputation function, but only once after higashi_model = Higashi(...).
- Trained weights of Higashi are automatically saved in the temp_dir. You can continue with the next stage of training or imputation directly if the previous stage was completed or intentionally interrupted.
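One way to keep these notes straight is to wrap the calls in a small helper. The sketch below uses only the Higashi methods listed on this page. The `run_higashi` wrapper and its `first_run` flag are hypothetical conveniences, not part of the Higashi API. The import sits inside the function so that the definition itself does not require Higashi to be installed.

```python
def run_higashi(config_path, first_run=True):
    """Run the workflow in the order described on this page.

    first_run is a hypothetical flag: set it to False once process_data()
    has been run and the input data has not changed since.
    """
    from higashi.Higashi_wrapper import Higashi

    higashi_model = Higashi(config_path)

    if first_run:
        # Only needed once per dataset (or after chrom_list / data changes).
        higashi_model.process_data()

    # Called once after constructing the model, before training/imputation.
    higashi_model.prep_model()

    # Stage 1: embeddings.
    higashi_model.train_for_embeddings()

    # Stage 2: imputation without neighbouring cells.
    higashi_model.train_for_imputation_nbr_0()
    higashi_model.impute_no_nbr()

    # Stage 3: imputation with neighbouring cells.
    higashi_model.train_for_imputation_with_nbr()
    higashi_model.impute_with_nbr()

    return higashi_model
```

For example, `run_higashi(config_path, first_run=False)` would skip the data processing step on a later run with unchanged input.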