My development work for the prot-fin project at usadellab during the practicum phase and bachelor thesis of my studies.

qwerdenkerXD/prot-fin-dev

prot-fin experimental space

You can find the reference proteins here

The table of Kidera factors is from here, the result of

prot-fin

Usadellab's contributions to the prot-fin project.

Project location on the IBG-4 cluster

Go to /mnt/data/hakimeh/prot-fin to find the local clone of this repository.

Project structure

The following directories contain the content indicated by their names:

  • docs
  • experiments
  • materials

Generate table of Kidera Factors

Kidera Factors are numeric values that describe the physical and chemical properties of amino acids, e.g. hydrophobicity or volume. Put simply, they are derived from a principal component analysis of more than 180 physical and chemical features of amino acids. For the original reference, see below or the R package Peptides.

Use this R script to generate the output table ./materials/Amino_Acid_Kidera_Factors.csv.
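Once the table exists, it can be used to translate a protein sequence into numeric vectors. The following Python snippet is a minimal sketch of that step, not code from this repository. It assumes a layout that is not confirmed here: a header row, then one row per amino acid with the one-letter code in the first column and the factors in the remaining columns. The function names `load_kidera_factors` and `encode_sequence` are hypothetical, and the demo values are placeholders, not real Kidera Factors.

```python
import csv
import io


def load_kidera_factors(handle):
    """Read a factor table into {amino_acid: [f1, ..., fn]}.

    Assumed layout (not confirmed by the repository): a header row, then
    one row per amino acid with the one-letter code in the first column
    and the numeric factors in the remaining columns.
    """
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return {row[0]: [float(v) for v in row[1:]] for row in reader if row}


def encode_sequence(sequence, factors):
    """Translate a protein sequence into one factor vector per residue."""
    return [factors[aa] for aa in sequence]


# Placeholder values for illustration only, NOT real Kidera Factors.
demo_csv = io.StringIO("aa,KF1,KF2\nA,0.1,-0.2\nG,0.3,0.4\n")
demo_factors = load_kidera_factors(demo_csv)
demo_encoded = encode_sequence("GAG", demo_factors)
```

To read the real table, pass an open file handle instead of the in-memory demo, e.g. `open("./materials/Amino_Acid_Kidera_Factors.csv", newline="")`, and adjust the column handling if the actual layout differs from the assumption above.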

For details on Kidera Factors, see:

Kidera, A., Konishi, Y., Oka, M., Ooi, T., & Scheraga, H. A. (1985). Statistical analysis of the physical properties of the 20 naturally occurring amino acids. Journal of Protein Chemistry, 4(1), 23-55.

Reference proteins

We use the reference proteins (amino acid sequences in Fasta format) that were used to generate the Mapman Bin hidden Markov Models (HMMs). The archive Mapman_reference_DB_202310.tar.bz2 holds the sequences together with their annotation. It contains the two files:

  • mapmanreferencebins.results.txt - The file assigning Mapman Bins to reference proteins.
  • protein.fa - The Fasta file with the UniProt identifiers and their amino acid sequences.

Note that the uncompressed content of the above archive is ignored by git to keep large data files out of the repository.
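The sequences can also be read straight from the archive without unpacking it into the working tree. The following Python snippet is a minimal sketch using only the standard library. The function names `read_fasta` and `read_fasta_from_archive` are hypothetical, and because the directory layout inside the archive is not documented here, the member is matched by the end of its name, which is an assumption.

```python
import io
import tarfile


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def read_fasta_from_archive(archive_path, member_name="protein.fa"):
    """Read a FASTA member directly from a .tar.bz2 archive.

    Assumption: the member is located by the end of its name, since the
    paths inside the archive are not known here.
    """
    with tarfile.open(archive_path, "r:bz2") as archive:
        member = next(
            m for m in archive.getmembers() if m.name.endswith(member_name)
        )
        with archive.extractfile(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8")
            return list(read_fasta(text))


# In-memory demo with made-up identifiers and sequences.
demo_records = list(read_fasta([">ID1 demo", "MKT", "AYI", ">ID2", "GG"]))
```

With the archive in the current directory, `read_fasta_from_archive("Mapman_reference_DB_202310.tar.bz2")` would return a list of header and sequence pairs. The full header line is kept as-is, so extracting the UniProt identifier from it is left to the caller.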
