Ordering Mutational Data by mutburden from high to low within each clinical data (subplot) cohort #379

Open
shmashitup opened this issue Dec 12, 2021 · 5 comments

Comments

@shmashitup

Is there a way to order all mutation data by tumor mutation burden from high to low? I have divided my data into four cohorts, which segregate my mutational data in the clinical data subplot. I would like to order my mutational data within each of these cohorts by tumor mutation burden from high to low.
I am not sure how to do this, or whether it is possible with this package.

@zlskidmore
Member

Hi @shmashitup

Which function are you using? You could probably refactor the input data.frame to get what you want.

@shmashitup
Author

I can share my code here:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("GenVisR")
install.packages("reshape2")
#mutational data
library(GenVisR)

library(reshape2)

mutationData <- read.delim("EC_Waterfall Plot_Mutation Data.txt")
mutationData
mutationData <- mutationData[,c("patient", "gene.name", "trv.type", "amino.acid.change")]
colnames(mutationData) <- c("sample", "gene", "variant_class", "amino.acid.change")
mutation_priority <- as.character(unique(mutationData$variant_class))
mutationColours <- c("nonsense"='#4f00A8', "frame_shift_del"='#A80100', "frame_shift_ins"='#CF5A59', "in_frame_del"='#ff9b34', "duplication"='#750054', "delins"='#A80079', "missense"='#009933', "splice_region"='#ca66ae', "deletion"='#888811')

# Create an initial plot

mutationHeirarchy<- c("missense", "nonsense", "frame_shift_ins", "frame_shift_del", "delins", "deletion", "duplication", "splice_region")
waterfall(mutationData, fileType = "Custom", variant_class_order=mutationHeirarchy, mainPalette=mutationColours)

# Tumor mutation burden

mutationBurden <- read.delim("EC_mutationburden.txt")

# First, let's look at the sample names in mutationData and mutationBurden

mutationData$sample
mutationBurden$sample

# Create the waterfall plot

waterfall(mutationData, fileType = "Custom", variant_class_order=mutationHeirarchy, mainPalette=mutationColours, mutBurden=mutationBurden)

# Reformat clinical data to long format

clinicalData <- read.delim("EC_Clinical Data.txt")
clinicalData_2 <- clinicalData[,c(1,2,3,4,5)]
colnames(clinicalData_2) <- c("sample", "Cohort", "MSI Comprehensive", "Sex", "Age")
clinicalData_2 <- melt(data=clinicalData_2, id.vars=c("sample"))
new_samp_order <- as.character(unique(clinicalData_2[order(clinicalData_2$variable, clinicalData_2$value), ]$sample))

# Create the waterfall plot

waterfall(mutationData, fileType = "Custom",
          variant_class_order = mutationHeirarchy,
          mainPalette = mutationColours,
          mutBurden = mutationBurden,
          clinData = clinicalData_2, clinLegCol = 4,
          clinVarCol = c('POLE Drivers and Secondary Variant' = '#ccbadc', 'POLE Drivers Only' = '#9975b9',
                         'POLE Variants Only' = '#7a5d94', 'POLE Potential New Drivers' = '#5E5161',
                         '0' = '#c2ed67', '1' = '#e63a27', 'Male' = '#90ddee', 'Female' = '#649aa6',
                         '21-30' = '#E5E8FF', '31-40' = '#878cfb', '41-50' = '#0022ff', '51-60' = '#2d41b9',
                         '61-70' = '#3d4780', '71-80' = '#3a4061', '81-90' = '#000000'),
          clinVarOrder = c('POLE Drivers and Secondary Variant', 'POLE Drivers Only',
                           'POLE Potential New Drivers', 'POLE Variants Only',
                           '0', '1', 'Male', 'Female',
                           '21-30', '31-40', '41-50', '51-60', '61-70', '71-80', '81-90'),
          section_heights = c(1, 3, 1),
          sampOrder = new_samp_order)
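
The script above prints mutationData$sample and mutationBurden$sample to compare them by eye. A minimal sketch of making that comparison explicit with setdiff() follows; the sample names here are made-up stand-ins, not values from the real files.

```r
# Hypothetical sample names standing in for mutationData$sample and
# mutationBurden$sample (illustration only).
mutation_samples <- c("s1", "s1", "s2", "s3")
burden_samples   <- c("s1", "s2", "s4")

# Samples that have mutations but no burden value.
missing_from_burden <- setdiff(unique(mutation_samples), burden_samples)

# Samples that have a burden value but no mutations.
extra_in_burden <- setdiff(burden_samples, unique(mutation_samples))

print(missing_from_burden)
print(extra_in_burden)
```

If either vector is non-empty, the sample names in the two inputs do not match exactly, which is worth resolving before passing both to waterfall().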

@shmashitup
Author

shmashitup commented Dec 12, 2021

I don't know much about R (I'm a beginner). When you say refactor the input data.frame, do you mean the one for the mutation burden? I thought waterfall() automatically assigned the tumor mutation burden based on the order of the mutational data. How can I manually correct the order?
@zlskidmore

@zlskidmore
Member

If you're using a waterfall plot, there should be a parameter called sampOrder where you can give it your samples, e.g. c("samp1", "samp2").
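
One way to build such a vector for the question asked here (cohort first, then mutation burden from high to low within each cohort) is sketched below in base R. The toy data frames, the cohort labels "A" and "B", and the mut_burden column name are assumptions for illustration; the real column names in the burden file may differ.

```r
# Toy stand-ins for the clinical and burden tables (hypothetical values).
clinical <- data.frame(
  sample = c("s1", "s2", "s3", "s4", "s5", "s6"),
  Cohort = c("B", "A", "B", "A", "A", "B"),
  stringsAsFactors = FALSE
)
burden <- data.frame(
  sample     = c("s1", "s2", "s3", "s4", "s5", "s6"),
  mut_burden = c(12.0, 3.5, 40.2, 18.9, 7.1, 25.0),  # assumed column name
  stringsAsFactors = FALSE
)

# Desired left-to-right cohort order in the plot.
cohort_order <- c("A", "B")

# Attach each sample's cohort to its burden value.
samp_tbl <- merge(clinical, burden, by = "sample")

# Sort by cohort (in the chosen order), then by burden from high to low.
samp_tbl <- samp_tbl[order(match(samp_tbl$Cohort, cohort_order),
                           -samp_tbl$mut_burden), ]

new_samp_order <- as.character(samp_tbl$sample)
print(new_samp_order)
```

The resulting new_samp_order could then be passed as sampOrder = new_samp_order in the waterfall() call from the script above. Note that merge() keeps only samples present in both tables, so any sample missing from either one would be left out of the ordering.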

@shmashitup
Author

shmashitup commented Dec 13, 2021 via email
