added supplementary info
rizaozcelik committed Jul 10, 2021
1 parent bb97b90 commit 75fe1e1
Showing 5 changed files with 78 additions and 0 deletions.
22 changes: 22 additions & 0 deletions supplementary_info/DatasetAdditionalInfo.MD
@@ -0,0 +1,22 @@
# Additional Information on BDB and KIBA

The following figure illustrates the binding score distributions and quartiles in the BDB and KIBA datasets. BDB shows a strong peak at $pK_d = 5$ because the $K_d$ values of weak interactions are often recorded as $K_d \geq 10000$ nM $(pK_d \leq 5)$.
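
The correspondence between $K_d = 10000$ and $pK_d = 5$ can be checked with a minimal sketch. It assumes that $K_d$ is expressed in nM and that $pK_d = -\log_{10}(K_d / 10^9)$; the function name `kd_nm_to_pkd` is hypothetical and does not come from the repository.

```python
import math


def kd_nm_to_pkd(kd_nm: float) -> float:
    """Convert a dissociation constant in nM to pKd.

    Assumption: pKd = -log10(Kd in molar) = 9 - log10(Kd in nM).
    """
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    return 9.0 - math.log10(kd_nm)


# A weak interaction recorded at the 10000 nM threshold lands on the pKd = 5 peak.
weak = kd_nm_to_pkd(10000.0)
# A tighter binder (smaller Kd) gets a larger pKd.
strong = kd_nm_to_pkd(1.0)
print(weak, strong)
```

Under this convention, every interaction recorded as $K_d \geq 10000$ nM maps to $pK_d \leq 5$, which is consistent with the peak described above.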

![](../img/binding_score_distributions.png)

We create train/validation/test splits from BDB and KIBA with warm and cold biomolecules (see the manuscript for more details). The following table reports the average number of proteins, ligands, and interactions in each fold, with the standard deviation given after $\pm$.

| Dataset | Fold | #Proteins | #Ligands | #Interactions |
| ------- | ------------ | -------------------- | ---------------------- | -------------------------- |
| BDB | Train | 403.4 $\pm$ 2.8 | 740.8 $\pm$ 19.46 | 17988.2 $\pm$ 646.45 |
| BDB | Validation | 355.0 $\pm$ 5.62 | 170.0 $\pm$ 11.05 | 1494.2 $\pm$ 56.17 |
| BDB | Warm | 354.4 $\pm$ 3.44 | 179.6 $\pm$ 5.28 | 1494.4 $\pm$ 56.32 |
| BDB | Cold Ligand | 376.0 $\pm$ 4.38 | 84.8 $\pm$ 5.53 | 2448.8 $\pm$ 373.48 |
| BDB | Cold Protein | 43.6 $\pm$ 2.15 | 264.8 $\pm$ 90.17 | 2360.0 $\pm$ 216.02 |
| BDB | Cold Both | 41.4 $\pm$ 3.07 | 30.8 $\pm$ 11.92 | 274.6 $\pm$ 36.19 |
| KIBA | Train | 200.6 $\pm$ 1.36 | 1834.6 $\pm$ 6.41 | 77264.4 $\pm$ 814.94 |
| KIBA | Validation | 193.0 $\pm$ 1.67 | 1467.2 $\pm$ 23.75 | 6650.2 $\pm$ 69.53 |
| KIBA | Warm | 192.0 $\pm$ 3.16 | 1476.2 $\pm$ 17.7 | 6650.6 $\pm$ 69.1 |
| KIBA | Cold Ligand | 193.0 $\pm$ 2.45 | 140.0 $\pm$ 5.59 | 6810.0 $\pm$ 570.52 |
| KIBA | Cold Protein | 14.6 $\pm$ 0.8 | 1296.0 $\pm$ 179.09 | 6259.6 $\pm$ 1024.25 |
| KIBA | Cold Both | 14.0 $\pm$ 1.1 | 100.2 $\pm$ 14.55 | 468.6 $\pm$ 37.89 |
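
Each cell in the table above is a mean and a standard deviation over repeated splits. The sketch below shows one way such a cell could be produced. The per-split counts are hypothetical, and using the population standard deviation (`statistics.pstdev`) is an assumption, since the commit does not say which estimator was used.

```python
import statistics


def summarize(counts):
    """Format per-split counts as 'mean $\\pm$ std', as in the table cells.

    Assumption: population standard deviation; the source does not specify.
    """
    mean = statistics.mean(counts)
    std = statistics.pstdev(counts)
    return rf"{mean:.1f} $\pm$ {std:.2f}"


# Hypothetical protein counts for five splits (not taken from the datasets).
example_counts = [14, 14, 14, 15, 16]
print(summarize(example_counts))
```

Swapping `statistics.pstdev` for `statistics.stdev` gives the sample standard deviation instead, which is slightly larger for a small number of splits.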
36 changes: 36 additions & 0 deletions supplementary_info/OutOfDatasetAllModels.csv
@@ -0,0 +1,36 @@
,,Accuracy,,Precision,,Recall,,F1,,AUC
Train Dataset,Model,Mean,Std,Mean,Std,Mean,Std,Mean,Std,Mean
KIBA,DeepDTA,0.829022021,0.010749762,0.31040456,0.020405,0.207502987,0.03479197,0.24583721,0.021436183,0.566980563
KIBA,DeepDTA - IDDTA (BD),0.829209845,0.006798421,0.293646927,0.022751665,0.185663082,0.033530546,0.226221088,0.029410768,0.557881176
KIBA,DeepDTA - IDDTA (BG),0.819203368,0.004696735,0.28080571,0.019095087,0.214384707,0.02384578,0.242792593,0.020989417,0.56420303
KIBA,DeepDTA - BOW (BD),0.827467617,0.010356973,0.28936218,0.032332599,0.187574671,0.044727455,0.225127479,0.037074155,0.55767945
KIBA,DeepDTA - BOW (BG),0.810207254,0.007810392,0.257553875,0.020803436,0.213811231,0.035102976,0.232655882,0.027643565,0.558758022
KIBA,BPE-DTA,0.820906736,0.01226448,0.246859466,0.044139528,0.156224612,0.038263846,0.189819025,0.039769338,0.540667091
KIBA,BPEDTA - BOW (BG),0.832130829,0.009500644,0.249915869,0.042490612,0.117228196,0.028154841,0.158205887,0.031830479,0.530717488
KIBA,BPEDTA - BOW (BD),0.818212435,0.015839599,0.246086189,0.037590463,0.155316607,0.016840031,0.187883322,0.007881915,0.538725919
KIBA,BPEDTA - ID (BD),0.816431347,0.032021931,0.276476198,0.04315266,0.190824373,0.051231744,0.217334936,0.018326673,0.552666354
KIBA,BPEDTA - ID (BG),0.826612694,0.016590142,0.271283117,0.074091025,0.156224612,0.032526657,0.197424105,0.045041716,0.543967335
KIBA,LMDTA,0.81634715,0.009150259,0.303630638,0.015367183,0.272640382,0.034951903,0.285978618,0.019046148,0.587112474
KIBA,LMDTA - IDDTA (BG),0.81382772,0.007708506,0.29278691,0.021013592,0.261792115,0.009454156,0.276196339,0.013206293,0.581081485
KIBA,LMDTA - IDDTA (BD),0.82117228,0.008211465,0.309126278,0.015802444,0.256630824,0.033439151,0.279056895,0.019134225,0.583153397
KIBA,LMDTA - BOW' (BG),0.822279793,0.008080088,0.311151727,0.026659405,0.254719235,0.02648991,0.279486627,0.023109181,0.582988013
KIBA,LMDTA - BOW' (BD),0.818335492,0.007590701,0.308900209,0.014085704,0.273548387,0.025252427,0.289368158,0.01581499,0.58864533
KIBA,LMDTA - BOW (BG),0.818490933,0.006508503,0.305905171,0.028832375,0.268243728,0.03146096,0.285659899,0.029642493,0.586498713
KIBA,LMDTA - BOW (BD),0.814540155,0.003815256,0.290545894,0.006510072,0.256917563,0.032546312,0.271955048,0.020943972,0.579438366
BDB,DeepDTA,0.787630344,0.005322511,0.324392076,0.015503491,0.095480376,0.020490006,0.146098922,0.025017741,0.523897415
BDB,DeepDTA - BOW (BD),0.788132093,0.003046999,0.316115467,0.032871381,0.090525662,0.022331726,0.139984596,0.029676764,0.522320064
BDB,DeepDTA - BOW (BG),0.776805188,0.00546601,0.300606632,0.017693696,0.121266205,0.018750223,0.171997934,0.019856819,0.527022302
BDB,DeepDTA - IDDTA (BG),0.779684273,0.010054804,0.311045065,0.035959851,0.11433138,0.019286912,0.165401376,0.019655804,0.52616195
BDB,DeepDTA - IDDTA (BD),0.78951105,0.003299523,0.33762716,0.014986693,0.098117563,0.009174978,0.15172671,0.010967684,0.526066366
BDB,BPE-DTA,0.782595785,0.008711753,0.321500027,0.005088482,0.117190552,0.03826057,0.168094919,0.039912033,0.529053519
BDB,BPEDTA - BOW (BG),0.729294308,0.011021099,0.270343267,0.010754048,0.23977979,0.029651623,0.253067034,0.018887832,0.542772451
BDB,BPEDTA - BOW (BD),0.771176722,0.003764361,0.292450595,0.006949342,0.134532055,0.016049787,0.183817636,0.01588394,0.52859322
BDB,BPEDTA - IDDTA (BG),0.741467702,0.031915291,0.276033598,0.02498593,0.198907832,0.062318656,0.222589872,0.03780359,0.534733742
BDB,BPEDTA - IDDTA (BD),0.76672583,0.018065987,0.290455981,0.012984762,0.14359794,0.04846431,0.186226212,0.042462659,0.529292685
BDB,LMDTA,0.753366328,0.012544957,0.247825388,0.012854873,0.139904102,0.036970096,0.176400275,0.0311559,0.519616133
BDB,LMDTA - BOW' (BG),0.753929516,0.012580932,0.249518539,0.026278389,0.139877464,0.031137645,0.17774806,0.029058738,0.519954577
BDB,LMDTA - BOW' (BD),0.760559775,0.005629733,0.257506138,0.022613454,0.132525306,0.027657369,0.174199562,0.029060809,0.521257055
BDB,LMDTA - BOW (BG),0.766973291,0.005335952,0.259211652,0.02450048,0.114056118,0.011179712,0.158293749,0.014887451,0.518189405
BDB,LMDTA - BOW (BD),0.765848622,0.00512969,0.265809323,0.01238758,0.124791334,0.020280877,0.169142773,0.021052123,0.52158376
BDB,LMDTA - IDDTA (BG),0.75421111,0.012421243,0.25086088,0.021671619,0.143127331,0.039554774,0.179935287,0.037213237,0.521367185
BDB,LMDTA - IDDTA (BD),0.76108371,0.003677725,0.250004744,0.022258109,0.122979933,0.020689633,0.164531214,0.023321004,0.517944237
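
The CSV above has a two-row header: the first row names the metric and the second row says whether the column holds a mean or a standard deviation (AUC has a mean only). The sketch below is one way to read it with the standard library. It forward-fills the metric name and joins the two rows into names such as `Accuracy Mean`. The helper names `flatten_header` and `load_rows` are hypothetical, and the sample text is a two-row excerpt of the file.

```python
import csv
import io

SAMPLE = """,,Accuracy,,Precision,,Recall,,F1,,AUC
Train Dataset,Model,Mean,Std,Mean,Std,Mean,Std,Mean,Std,Mean
KIBA,DeepDTA,0.829022021,0.010749762,0.31040456,0.020405,0.207502987,0.03479197,0.24583721,0.021436183,0.566980563
BDB,DeepDTA,0.787630344,0.005322511,0.324392076,0.015503491,0.095480376,0.020490006,0.146098922,0.025017741,0.523897415
"""


def flatten_header(top, sub):
    """Forward-fill the metric row and join it with the Mean/Std row."""
    names = []
    current = ""
    for group, label in zip(top, sub):
        if group:
            current = group
        # Columns before the first metric keep their second-row label only.
        names.append(f"{current} {label}" if current else label)
    return names


def load_rows(text):
    """Return (column names, list of row dicts) with metrics parsed as floats."""
    reader = csv.reader(io.StringIO(text))
    names = flatten_header(next(reader), next(reader))
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        row = dict(zip(names, record))
        row["Model"] = row["Model"].strip()  # tolerate stray spaces in model names
        for name in names[2:]:
            row[name] = float(row[name])
        rows.append(row)
    return names, rows


names, rows = load_rows(SAMPLE)
print(names)
print(rows[0]["Model"], rows[0]["F1 Mean"])
```

To read the whole file, the same function can be given the contents of `supplementary_info/OutOfDatasetAllModels.csv` instead of `SAMPLE`.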
Binary file added supplementary_info/OutOfDatasetAllModels.xlsx
Binary file not shown.
20 changes: 20 additions & 0 deletions supplementary_info/VocabSizeSearch.csv
@@ -0,0 +1,20 @@
,,,Warm,,Cold Ligand,,Cold Protein,,Cold Both,
Dataset,Chemical Vocab Size,Protein Vocab Size,CI,R2,CI,R2,CI,R2,CI,R2
BDB,8000,8000,0.886 (0.009),0.767 (0.018),0.654 (0.081),-0.101 (0.206),0.726 (0.020),0.118 (0.107),0.557 (0.088),-0.299 (0.295)
BDB,8000,16000,0.881 (0.006),0.754 (0.013),0.641 (0.092),0.005 (0.230),0.708 (0.030),0.040 (0.126),0.531 (0.049),-0.232 (0.288)
BDB,8000,32000,0.883 (0.006),0.774 (0.013),0.657 (0.083),-0.143 (0.202),0.653 (0.060),-0.256 (0.411),0.522 (0.054),-0.442 (0.349)
BDB,16000,8000,0.876 (0.007),0.762 (0.021),0.641 (0.116),-0.206 (0.376),0.696 (0.021),0.120 (0.100),0.520 (0.087),-0.318 (0.335)
BDB,16000,16000,0.881 (0.011),0.769 (0.012),0.619 (0.070),-0.233 (0.160),0.702 (0.052),0.095 (0.261),0.502 (0.057),-0.505 (0.277)
BDB,16000,32000,0.881 (0.004),0.766 (0.017),0.613 (0.097),-0.219 (0.224),0.673 (0.042),-0.224 (0.207),0.524 (0.081),-0.502 (0.299)
BDB,32000,8000,0.887 (0.007),0.762 (0.018),0.642 (0.088),-0.116 (0.155),0.707 (0.029),0.072 (0.172),0.518 (0.064),-0.221 (0.104)
BDB,32000,16000,0.886 (0.007),0.760 (0.017),0.643 (0.066),-0.029 (0.103),0.704 (0.027),0.042 (0.139),0.516 (0.054),-0.177 (0.133)
BDB,32000,32000,0.885 (0.002),0.772 (0.015),0.666 (0.043),-0.150 (0.131),0.671 (0.052),-0.231 (0.488),0.512 (0.052),-0.396 (0.162)
KIBA,8000,8000,0.880 (0.008),0.761 (0.017),0.733 (0.026),0.256 (0.110),0.690 (0.034),0.229 (0.100),0.613 (0.031),-0.015 (0.102)
KIBA,8000,16000,0.881 (0.004),0.760 (0.017),0.735 (0.018),0.268 (0.094),0.692 (0.025),0.209 (0.087),0.623 (0.033),0.005 (0.150)
KIBA,8000,32000,0.881 (0.005),0.760 (0.016),0.735 (0.025),0.274 (0.105),0.680 (0.020),0.185 (0.077),0.605 (0.033),-0.006 (0.117)
KIBA,16000,8000,0.882 (0.007),0.767 (0.019),0.739 (0.018),0.275 (0.104),0.699 (0.029),0.263 (0.071),0.613 (0.022),0.050 (0.104)
KIBA,16000,16000,0.877 (0.003),0.762 (0.010),0.736 (0.017),0.294 (0.065),0.687 (0.025),0.202 (0.079),0.618 (0.030),0.009 (0.111)
KIBA,16000,32000,0.879 (0.006),0.760 (0.017),0.737 (0.016),0.297 (0.077),0.686 (0.031),0.194 (0.078),0.615 (0.036),0.016 (0.110)
KIBA,32000,8000,0.880 (0.004),0.759 (0.011),0.740 (0.018),0.298 (0.088),0.702 (0.022),0.254 (0.033),0.626 (0.032),0.038 (0.126)
KIBA,32000,16000,0.880 (0.005),0.762 (0.017),0.739 (0.013),0.297 (0.069),0.696 (0.035),0.219 (0.114),0.604 (0.055),-0.004 (0.145)
KIBA,32000,32000,0.878 (0.003),0.760 (0.017),0.736 (0.015),0.261 (0.089),0.683 (0.029),0.198 (0.067),0.604 (0.031),-0.042 (0.054)
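
Each metric cell in the CSV above packs two numbers into one string, for example `0.886 (0.009)`. The commit does not explain the format. Reading it as a mean followed by a standard deviation in parentheses is an assumption, and `parse_mean_std` is a hypothetical helper name. A minimal sketch for splitting such cells:

```python
import re

# A possibly negative decimal, then a non-negative decimal in parentheses.
CELL_PATTERN = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*\(\s*(\d+(?:\.\d+)?)\s*\)\s*$")


def parse_mean_std(cell):
    """Split a cell such as '0.886 (0.009)' into two floats.

    Assumption: the first number is a mean and the second a standard deviation.
    """
    match = CELL_PATTERN.match(cell)
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return float(match.group(1)), float(match.group(2))


# Cells taken from the first BDB row (warm CI and cold-ligand R2).
print(parse_mean_std("0.886 (0.009)"))
print(parse_mean_std("-0.101 (0.206)"))
```

Keeping the two values as separate floats makes it easier to sort or compare vocabulary sizes by a single metric.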
Binary file added supplementary_info/VocabSizeSearch.xlsx
Binary file not shown.
