Ordering Mutational Data by mutburden from high to low within each clinical data (subplot) cohort #379

shmashitup · 2021-12-12T19:28:37Z

Is there a way to order all mutation data by tumor mutation burden from high to low? I have divided my data into four cohorts which segregate my mutational data in the clinical data subplot. I would like to order my mutational data within each of these cohorts by tumor mutational burden from high to low.
I am not sure how to do this or if it is possible with this package.

zlskidmore · 2021-12-12T20:05:32Z

Hi @shmashitup

Which function are you using? You could probably refactor the input data.frame to get what you want.

shmashitup · 2021-12-12T20:06:59Z

I can share my code here:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("GenVisR")
install.packages("reshape2")
#mutational data
library(GenVisR)

library(reshape2)

mutationData <- read.delim("EC_Waterfall Plot_Mutation Data.txt")
mutationData
mutationData <- mutationData[,c("patient", "gene.name", "trv.type", "amino.acid.change")]
colnames(mutationData) <- c("sample", "gene", "variant_class", "amino.acid.change")
mutation_priority <- as.character(unique(mutationData$variant_class))
mutationColours <- c("nonsense"='#4f00A8', "frame_shift_del"='#A80100', "frame_shift_ins"='#CF5A59', "in_frame_del"='#ff9b34', "duplication"='#750054', "delins"='#A80079', "missense"='#009933', "splice_region"='#ca66ae', "deletion"='#888811')

# Create an initial plot

mutationHeirarchy<- c("missense", "nonsense", "frame_shift_ins", "frame_shift_del", "delins", "deletion", "duplication", "splice_region")
waterfall(mutationData, fileType = "Custom", variant_class_order=mutationHeirarchy, mainPalette=mutationColours)

# tumor mutation burden

mutationBurden <- read.delim("EC_mutationburden.txt")

# First, let's look at the sample names in the mutationData and mutationBurden

mutationData$sample
mutationBurden$sample

# Create the waterfall plot

waterfall(mutationData, fileType = "Custom", variant_class_order=mutationHeirarchy, mainPalette=mutationColours, mutBurden=mutationBurden)

# reformat clinical data to long format

clinicalData <- read.delim("EC_Clinical Data.txt")
clinicalData_2 <- clinicalData[,c(1,2,3,4,5)]
colnames(clinicalData_2) <- c("sample", "Cohort", "MSI Comprehensive", "Sex", "Age")
clinicalData_2 <- melt(data=clinicalData_2, id.vars=c("sample"))
new_samp_order <- as.character(unique(clinicalData_2[order(clinicalData_2$variable, clinicalData_2$value),
]$sample))

# create the waterfall plot

waterfall(mutationData, fileType = "Custom", variant_class_order=c("missense", "nonsense", "frame_shift_ins", "frame_shift_del", "delins", "deletion", "duplication", "splice_region"), mainPalette=mutationColours, mutBurden=mutationBurden, clinData=clinicalData_2, clinLegCol=4,
clinVarCol=c('POLE Drivers and Secondary Variant'='#ccbadc', 'POLE Drivers Only'='#9975b9', 'POLE Variants Only'='#7a5d94', 'POLE Potential New Drivers'='#5E5161', '0'='#c2ed67', '1'='#e63a27', 'Male'='#90ddee', 'Female'='#649aa6', '21-30'='#E5E8FF','31-40'='#878cfb', '41-50'='#0022ff', '51-60'='#2d41b9', '61-70'='#3d4780', '71-80'='#3a4061', '81-90'='#000000'),
clinVarOrder=c('POLE Drivers and Secondary Variant', 'POLE Drivers Only', 'POLE Potential New Drivers', 'POLE Variants Only', '0', '1', 'Male', 'Female', '21-30','31-40', '41-50', '51-60', '61-70', '71-80', '81-90'), section_heights=c(1, 3, 1), sampOrder = new_samp_order)
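
The sample-name comparison in the script above (printing `mutationData$sample` and `mutationBurden$sample`) can also be done programmatically. This is only a minimal sketch on toy vectors; the sample IDs are made up, and the real vectors would come from the two data frames read in above.

```r
# Toy stand-ins for mutationData$sample and mutationBurden$sample
# (hypothetical IDs, for illustration only).
mutation_samples <- c("s1", "s1", "s2", "s3")
burden_samples   <- c("s1", "s2", "s4")

# Samples that have mutations but no burden value, and the reverse.
missing_burden   <- setdiff(unique(mutation_samples), burden_samples)
missing_mutation <- setdiff(burden_samples, unique(mutation_samples))

print(missing_burden)
print(missing_mutation)
```

If either result is non-empty, the two tables do not describe the same set of samples, which is worth resolving before worrying about plot order.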

shmashitup · 2021-12-12T20:09:04Z

I don't know much about R (I'm a beginner). When you say refactor the input data.frame, do you mean the one for the mutation burden? I thought waterfall() automatically assigns the tumor mutation burden based on the order of the mutational data. How can I manually correct the order?
@zlskidmore

zlskidmore · 2021-12-13T15:54:31Z

If you're using a waterfall plot, there should be a parameter called sampOrder where you can give it your samples, e.g. c("samp1", "samp2").
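
One way to build such a vector for the original question (cohort first, then mutation burden from high to low within each cohort) is sketched below. It runs on toy data only. The column names `sample`, `Cohort` and `mut_burden`, the sample IDs, and the burden values are assumptions for illustration; the real tables in this thread may be laid out differently.

```r
# Toy stand-ins for the clinical and mutation-burden tables.
# Column names and values are assumptions, not the poster's real data.
clinical <- data.frame(
  sample = c("s1", "s2", "s3", "s4", "s5", "s6"),
  Cohort = c("POLE Drivers Only", "POLE Variants Only", "POLE Drivers Only",
             "POLE Variants Only", "POLE Drivers Only", "POLE Variants Only"),
  stringsAsFactors = FALSE
)
burden <- data.frame(
  sample     = c("s1", "s2", "s3", "s4", "s5", "s6"),
  mut_burden = c(12.5, 3.1, 40.2, 7.8, 25.0, 1.4),
  stringsAsFactors = FALSE
)

# Attach each sample's burden to its cohort label.
ordering <- merge(clinical, burden, by = "sample")

# Fix the cohort order explicitly, then sort by cohort and,
# within each cohort, by burden from high to low.
cohort_levels <- c("POLE Drivers Only", "POLE Variants Only")
ordering$Cohort <- factor(ordering$Cohort, levels = cohort_levels)
ordering <- ordering[order(ordering$Cohort, -ordering$mut_burden), ]

# Character vector of sample names in the desired plotting order.
new_samp_order <- as.character(ordering$sample)
print(new_samp_order)
```

The resulting `new_samp_order` could then be passed as `sampOrder = new_samp_order` in the existing `waterfall()` call, in place of the vector ordered only by the clinical variables.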

shmashitup · 2021-12-13T15:59:12Z

I see! Thank you! This makes sense, I can do that!

