Code repository for the manuscript: Nucleotide dependency analysis of DNA language models reveals genomic functional elements
This repository contains code for the manuscript and general code to compute and visualize nucleotide dependencies using DNA language models.
For a quick start, see the notebook compute_and_visualize_dep_maps.ipynb, which includes examples and code to:
- Visualize nucleotide dependency maps for a specific sequence and DNA language model
- Compute variant influence scores for a specific sequence and DNA language model
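
As a rough illustration of what a dependency map contains, here is a minimal sketch. It assumes the dependency between a query position i and a target position j is scored as the largest absolute log2 odds-ratio change in the model's predicted nucleotide probabilities at j, taken over all substitutions at i; the notebook remains the reference for the exact definition. `toy_predict` is a hypothetical stand-in for a real DNA language model, and `dependency_map` is an illustrative name, not a function from this repository.

```python
import math

NUCLEOTIDES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def toy_predict(seq):
    """Hypothetical stand-in for a DNA language model (assumption, not a real model).

    Returns one probability vector over A, C, G, T per position. In this toy,
    position j favors the complement of the nucleotide at position j - 1, so
    each position depends only on its left neighbor.
    """
    probs = []
    for j in range(len(seq)):
        weights = [1.0, 1.0, 1.0, 1.0]
        if j > 0:
            weights[NUCLEOTIDES.index(COMPLEMENT[seq[j - 1]])] += 2.0
        total = sum(weights)
        probs.append([w / total for w in weights])
    return probs


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def dependency_map(seq, predict):
    """Return an L x L matrix where entry [i][j] scores how strongly
    substituting the nucleotide at i changes the prediction at j."""
    length = len(seq)
    ref = predict(seq)
    dep = [[0.0] * length for _ in range(length)]
    for i in range(length):
        for alt in NUCLEOTIDES:
            if alt == seq[i]:
                continue
            # In-silico substitution at the query position i.
            mutated = seq[:i] + alt + seq[i + 1:]
            alt_probs = predict(mutated)
            for j in range(length):
                for n in range(len(NUCLEOTIDES)):
                    change = abs(math.log2(odds(alt_probs[j][n]) / odds(ref[j][n])))
                    # Keep the maximum over substitutions and target nucleotides.
                    dep[i][j] = max(dep[i][j], change)
    return dep


if __name__ == "__main__":
    for row in dependency_map("ACGTAC", toy_predict):
        print(" ".join(f"{value:.2f}" for value in row))
```

With the toy model, only entries where j = i + 1 are non-zero, which mirrors the built-in neighbor dependence. A real model needs one forward pass per substitution, so the cost grows with three times the sequence length.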
SpeciesLM and RiNALMo models require FlashAttention-2 to be installed (https://github.com/Dao-AILab/flash-attention).
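
A quick way to confirm that this dependency is present before loading those models is to check whether the package can be found. This is a minimal sketch that assumes FlashAttention-2 is installed under its usual import name `flash_attn`; the helper name `flash_attention_available` is illustrative and not part of this repository.

```python
import importlib.util


def flash_attention_available() -> bool:
    """Return True if a package named `flash_attn` can be found in this environment."""
    return importlib.util.find_spec("flash_attn") is not None


if not flash_attention_available():
    print(
        "flash_attn was not found. SpeciesLM and RiNALMo models require it; "
        "see https://github.com/Dao-AILab/flash-attention for installation instructions."
    )
```

The check only looks the package up and does not import it, so it will not catch a build that is installed but incompatible with the local CUDA or PyTorch version.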
Intermediate data files for the different manuscript notebooks are available at: https://doi.org/10.5281/zenodo.12982537