RamEx: An R package for high-throughput microbial ramanome analyses with accurate quality assessment
- Reliability achieved via stringent statistical control
- Robustness achieved via flexible modelling of the data and automatic parameter selection
- Reproducibility promoted by thorough recording of all analysis steps
- Ease of use: high degree of automation, an analysis can be set up in several mouse clicks, no bioinformatics expertise required
- Powerful tuning options to enable unconventional experiments
- Scalability and speed: up to 100 runs processed per minute
Download: https://github.com/qibebt-bioinfo/RamEx
Installation
Getting started
Raw data formats
Output
Changing default settings
Visualisation
Frequently asked questions (FAQ)
Contact
Install RamEx from GitHub:
library('devtools')
install_github("qibebt-bioinfo/RamEx")
Each Raman spectrum is stored in its own txt file, and its meta info is recorded in the file name. Here we assume the dataset has only one factor, because RamEx does not support multiple-factor analysis. If you have multiple factors that are independent of each other, they will be treated as one factor.
library(RamEx)
library(magrittr)
data(RamEx_data)
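As a rough illustration of the file-name convention described above, the sketch below recovers a group label from a set of file names using base R only. The naming scheme (`<group>_<cell index>.txt`) and the file names are hypothetical, chosen for this example; RamEx's own reader may parse names differently.

```r
# Hypothetical file names: the <group>_<cell index>.txt pattern is an assumption
files <- c("Control_001.txt", "Control_002.txt",
           "Treated_001.txt", "Treated_002.txt")

# Drop the directory and extension, then split each name on "_"
stems <- tools::file_path_sans_ext(basename(files))
fields <- do.call(rbind, strsplit(stems, "_", fixed = TRUE))

# One row of metadata per spectrum file
meta <- data.frame(
  filename = files,
  group = factor(fields[, 1]),
  cell = as.integer(fields[, 2])
)
print(meta)
```

With a single factor, the `group` column is all that downstream steps need; several independent factors could be pasted into one label before analysis.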
Spectral pretreatment, which comprises smoothing, baseline removal, normalization and truncation, makes the spectra cleaner. Plotting the mean spectra shows the effect of these steps. The result of each step is kept in the Ramanome object to ease debugging, and 'draw.mean' displays the final dataset.
RamEx_data %<>% Preprocessing.Smooth.Sg %>% Preprocessing.Baseline.Polyfit %>% Preprocessing.Normalize(.,'ch')
mean.spec(RamEx_data@datasets$normalized.data, RamEx_data@meta.data$group)
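To make the pretreatment steps more concrete, here is a minimal base-R sketch of smoothing, polynomial baseline removal and normalization on one simulated spectrum. It is not RamEx's implementation: the simulated data, the helper `polyfit_baseline`, the 5-point Savitzky-Golay window, the polynomial degree and the max scaling are all assumptions made for illustration (in particular, max scaling is not necessarily what the 'ch' option does).

```r
set.seed(1)
wavenumber <- seq(500, 3200, by = 2)
gauss <- function(center, width, height) {
  height * exp(-(wavenumber - center)^2 / (2 * width^2))
}
# Simulated spectrum: three peaks + curved background + noise
raw <- gauss(1003, 8, 1) + gauss(1450, 12, 0.8) + gauss(2930, 25, 2) +
  0.2 + 1e-7 * (wavenumber - 500)^2 + rnorm(length(wavenumber), sd = 0.02)

# 1) Savitzky-Golay smoothing (5-point window, quadratic fit)
sg_coef <- c(-3, 12, 17, 12, -3) / 35
smoothed <- as.numeric(stats::filter(raw, sg_coef, sides = 2))
edge <- is.na(smoothed)            # end points lack a full window
smoothed[edge] <- raw[edge]

# 2) Iterative polynomial baseline (hypothetical helper):
#    refit after clipping every point that lies above the current fit
polyfit_baseline <- function(x, y, degree = 3, n_iter = 50) {
  work <- y
  for (i in seq_len(n_iter)) {
    base <- as.numeric(fitted(lm(work ~ poly(x, degree))))
    work <- pmin(work, base)
  }
  base
}
corrected <- smoothed - polyfit_baseline(wavenumber, smoothed)

# 3) Scale so that the strongest band equals 1
normalized <- corrected / max(corrected)
```

Keeping `raw`, `smoothed`, `corrected` and `normalized` as separate vectors mirrors the idea of retaining each intermediate result for debugging.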
qc_icod <- Qualitycontrol.ICOD(RamEx_data@datasets$normalized.data,var_tol = 0.5)
data_cleaned <- RamEx_data[qc_icod$quality,]
mean.spec(data_cleaned@datasets$normalized.data, data_cleaned@meta.data$group,0.3)
qc_mcd <- Qualitycontrol.Mcd(RamEx_data@datasets$normalized.data)
qc_t2 <- Qualitycontrol.T2(RamEx_data@datasets$normalized.data)
qc_dis <- Qualitycontrol.Dis(RamEx_data@datasets$normalized.data)
qc_snr <- Qualitycontrol.Snr(RamEx_data@datasets$normalized.data, 'easy')
Extract the single-cell intensity at a band, or the accumulated intensity within a wavenumber range. Please provide a list containing multiple bands or band ranges. The feature selection results will be saved as 'interested.bands' in the given Ramanome object. You can also add your own equations.
data_cleaned <- Feature.Reduction.Intensity(data_cleaned, list(c(2000,2250),c(2750,3050), 1450, 1665))
# calculate CDR
CDR <- data.frame(data_cleaned@meta.data,
data_cleaned@interested.bands$`2000~2250`/(data_cleaned@interested.bands$`2000~2250` + data_cleaned@interested.bands$`2750~3050`))
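The band extraction and the CDR formula above can be sketched in base R as follows. This is an illustration on simulated data, not RamEx's code: the helper `band_intensity`, the choice of summing intensities inside a range, and taking the nearest sampled point for a single band are all assumptions.

```r
set.seed(2)
wavenumber <- seq(500, 3200, by = 5)
n_cells <- 6
# Simulated non-negative spectra (rows = cells, columns = wavenumbers)
spectra <- matrix(runif(n_cells * length(wavenumber)), nrow = n_cells)

# Hypothetical helper: sum the intensity over a band range,
# or take the nearest sampled point for a single band
band_intensity <- function(spectra, wavenumber, band) {
  if (length(band) == 2) {
    idx <- wavenumber >= band[1] & wavenumber <= band[2]
    rowSums(spectra[, idx, drop = FALSE])
  } else {
    spectra[, which.min(abs(wavenumber - band))]
  }
}

bands <- list(c(2000, 2250), c(2750, 3050), 1450, 1665)
features <- sapply(bands, function(b) band_intensity(spectra, wavenumber, b))
colnames(features) <- sapply(bands, paste, collapse = "~")

# Ratio of the 2000~2250 band to the sum of the 2000~2250 and 2750~3050 bands
cdr <- features[, "2000~2250"] /
  (features[, "2000~2250"] + features[, "2750~3050"])
print(round(cdr, 3))
```

Because both band intensities are non-negative here, the ratio stays between 0 and 1, which makes it easy to compare across cells.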
Nonlinear methods include UMAP and t-SNE; linear methods include PCA and PCoA. The reduced sample matrix will be stored in the Ramanome object as 'reductions'. Attention: because UMAP and t-SNE are computationally expensive, RamEx uses PCA to reduce the dimensionality of the high-dimensional spectra.
data.reduction <- Feature.Reduction.Pca(data_cleaned, draw=T, save = F) %>% Feature.Reduction.Pcoa(., draw=T, save = F) %>% Feature.Reduction.Tsne(., draw=T, save = F) %>% Feature.Reduction.Umap(., draw=T, save=F)
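For readers who want to see the two linear methods outside the package, the following base-R sketch runs PCA (`prcomp`) and PCoA (`cmdscale`) on simulated spectra. The data are made up, and t-SNE and UMAP are left out because they need additional packages.

```r
set.seed(3)
# Simulated data: 40 cells x 200 wavenumbers, two groups differing in one band
group <- rep(c("A", "B"), each = 20)
spectra <- matrix(rnorm(40 * 200, sd = 0.1), nrow = 40)
spectra[group == "B", 50:60] <- spectra[group == "B", 50:60] + 1

# PCA: linear projection onto the directions of largest variance
pca <- prcomp(spectra, center = TRUE, scale. = FALSE)
pca_scores <- pca$x[, 1:2]

# PCoA (classical multidimensional scaling) on Euclidean distances
pcoa_scores <- cmdscale(dist(spectra), k = 2)

# With Euclidean distances, PCoA reproduces the PCA scores up to sign,
# so the absolute correlation of the first axes is close to 1
print(abs(cor(pca_scores[, 1], pcoa_scores[, 1])))
```

PCoA becomes distinct from PCA only when a non-Euclidean distance is supplied to `cmdscale`.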
ROC_markers <- Raman.Markers.Roc(data_cleaned@datasets$normalized.data[,sample(1:1000, 50)],data_cleaned@meta.data$group, paired = TRUE, threshold = 0.8)
cor_markers <- Raman.Markers.Correlations(data_cleaned@datasets$normalized.data[,sample(1:1000, 50)],as.numeric(data_cleaned@meta.data$group), min.cor = 0.8)
RBCS.markers <- Raman.Markers.Rbcs(data_cleaned, threshold = 0.003, draw = F)
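The idea behind an ROC-based marker search with a threshold such as 0.8 can be sketched as below. This is only an assumption about the general approach, not the logic of `Raman.Markers.Roc`: the helper `auc_one` computes the area under the ROC curve for each wavenumber from rank sums, and the simulated data contain one planted marker.

```r
set.seed(7)
group <- rep(c("A", "B"), each = 30)
spectra <- matrix(rnorm(60 * 20), nrow = 60)
# Column 5 is a planted marker: group B is shifted upwards
spectra[group == "B", 5] <- spectra[group == "B", 5] + 3

# Hypothetical helper: AUC from the Mann-Whitney rank-sum statistic
auc_one <- function(x, is_pos) {
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(rank(x)[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}
auc <- apply(spectra, 2, auc_one, is_pos = group == "B")

# A band separates the groups well if its AUC is far from 0.5 in either direction
markers <- which(pmax(auc, 1 - auc) >= 0.8)
print(markers)
```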
- Global IRCA. This module may take longer to run because of the image drawing.
IRCA.interests <- Intraramanome.Analysis.Irca.Global(data_cleaned)
- Local IRCA
bands_ann <- data.frame(rbind(cbind(c(742,850,872,971,997,1098,1293,1328,1426,1576),'Nucleic acid'),
cbind(c(824,883,1005,1033,1051,1237,1559,1651),'Protein'),
cbind(c(1076,1119,1370,2834,2866,2912),'Lipids')))
colnames(bands_ann) <- c('Wave_num', 'Group')
Intraramanome.Analysis.Irca.Local(data_cleaned, bands_ann = bands_ann)
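The source does not spell out the IRCA computation, so the following is a sketch under the assumption that its core is the pairwise correlation of band intensities across the cells of one ramanome. The simulated intensities are invented; the band labels simply reuse wavenumbers from the annotation above, and the 0.6 cutoff is arbitrary.

```r
set.seed(4)
n_cells <- 100
latent <- rnorm(n_cells, mean = 10)   # shared source of cell-to-cell variation
bands <- cbind(
  "1005" = latent + rnorm(n_cells, sd = 0.2),
  "1651" = latent + rnorm(n_cells, sd = 0.2),       # rises with 1005
  "2912" = 20 - latent + rnorm(n_cells, sd = 0.2),  # falls as 1005 rises
  "742"  = rnorm(n_cells, mean = 10)                # unrelated band
)

# Pairwise Pearson correlation between bands, computed across cells
cor_mat <- cor(bands, method = "pearson")

# Keep strongly correlated pairs; the upper triangle lists each pair once
threshold <- 0.6
idx <- which(abs(cor_mat) >= threshold & upper.tri(cor_mat), arr.ind = TRUE)
edges <- data.frame(
  band_1 = rownames(cor_mat)[idx[, "row"]],
  band_2 = colnames(cor_mat)[idx[, "col"]],
  r = cor_mat[idx]
)
print(edges)
```

In this kind of analysis a strong negative correlation between two bands is the interesting case, since it may hint at one component being converted into another.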
- 2D-COS
data_cos <- Intraramanome.Analysis.2Dcos(data_cleaned)
clusters_louvain <- Phenotype.Analysis.Louvaincluster(object = data_cleaned, resolutions = c(0.8))
clusters_kmeans <- Phenotype.Analysis.Kmeans(data_cleaned,5)
clusters_hca <- Phenotype.Analysis.Hca(data_cleaned)
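As a package-independent illustration of two of the clustering approaches named above, this sketch applies k-means and hierarchical clustering from base R to simulated spectra after a PCA step. The data, the number of principal components and the Ward linkage are assumptions; Louvain clustering is omitted because it needs a graph package.

```r
set.seed(5)
truth <- rep(c(1, 2), each = 25)
spectra <- matrix(rnorm(50 * 100, sd = 0.1), nrow = 50)
spectra[truth == 2, 30:40] <- spectra[truth == 2, 30:40] + 1

# Reduce to a few principal components before clustering
scores <- prcomp(spectra)$x[, 1:5]

# k-means with several random starts to avoid a poor local optimum
km <- kmeans(scores, centers = 2, nstart = 10)

# Hierarchical clustering (Ward linkage), cut into two clusters
hc <- cutree(hclust(dist(scores), method = "ward.D2"), k = 2)

# Cluster labels are arbitrary, so compare the partitions by cross-tabulation
print(table(kmeans = km$cluster, hca = hc))
```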
- PC-LDA
- SVM
- Random Forest
model.gmm <- Classification.Gmm(data_cleaned)
model.lda <- Classification.Lda(data_cleaned)
model.rf <- Classification.Rf(data_cleaned)
model.svm <- Classification.Svm(data_cleaned)
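A PC-LDA classifier can be approximated outside RamEx by chaining `prcomp` with `MASS::lda`. The sketch below is a generic illustration on simulated spectra, not the implementation behind `Classification.Lda`; the train/test split, the number of principal components and the data are assumptions.

```r
library(MASS)  # recommended package shipped with standard R installations

set.seed(6)
group <- factor(rep(c("A", "B"), each = 30))
spectra <- matrix(rnorm(60 * 150, sd = 0.1), nrow = 60)
spectra[group == "B", 70:80] <- spectra[group == "B", 70:80] + 0.5

# Hold out a third of the cells for testing
test_idx <- sample(seq_len(60), 20)
train <- spectra[-test_idx, ]
test <- spectra[test_idx, ]

# Fit PCA on the training cells only, then project the test cells
pca <- prcomp(train, center = TRUE)
n_pc <- 5
train_scores <- pca$x[, 1:n_pc]
test_scores <- predict(pca, newdata = test)[, 1:n_pc]

# LDA on the principal-component scores
fit <- lda(x = train_scores, grouping = group[-test_idx])
pred <- predict(fit, test_scores)$class

accuracy <- mean(pred == group[test_idx])
print(accuracy)
```

Fitting the PCA on the training cells alone keeps information from the held-out cells out of the model, which gives a fairer accuracy estimate.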
quan_pls <- Quantification.Pls(data_cleaned)
quan_mlr <- Quantification.Mlr(data_cleaned)
quan_glm <- Quantification.Glm(data_cleaned)
decom_mcr <- Spectral.Decomposition.Mcrals(data_cleaned,2)
decom_ica <- Spectral.Decomposition.Ica(data_cleaned, 2)
decom_nmf <- Spectral.Decomposition.Nmf(data_cleaned)
RamEx accommodates data from mainstream instrument manufacturers such as Horiba, Renishaw, Thermo Fisher Scientific, WITec, and Bruker. This module efficiently manages single-point data collection, where each spectrum is stored in a separate txt file, as well as mapping data enriched with coordinate information.
The Output pane lets you specify where the output should be saved.
RamEx can be used to process almost any experiment with default settings. In general, change settings only when the help information specifically advises it.
RamEx also offers an online version at http://ramex.single-cell.cn.
Q: Why RamEx?
A: Raman spectroscopy is fast, label-free and non-destructive, and is increasingly popular for capturing vibrational energy levels and metabolic differences in cells, providing qualitative and quantitative insights at single-cell or subcellular resolution. To leverage the extensive information in the complex, high-dimensional ramanome, we developed RamEx, an R package designed to manage extensive Raman datasets generated by a wide range of devices and instruments. It features: 1) a dynamic outlier detection algorithm that operates without prior knowledge or fixed criteria; 2) optimized clustering and marker identification algorithms tailored to the unique properties of high-dimensional, collinear and nonlinear Raman spectra; 3) a unified computational framework with tools and pipelines for key Raman tasks such as cell type/species identification, clustering, phenotypic analysis, antibiotic resistance detection, and molecular composition analysis; 4) enhanced processing of large-scale datasets through C++ optimization and GPU computing; 5) a standardized Raman dataset format with integrated metadata and evaluation metrics; and 6) a graphical user interface (GUI) for intuitive data visualization and interaction.
Q: How to install OpenCL?
A: RamEx relies on OpenCL for GPU acceleration. If you encounter issues with OpenCL installation or configuration, please refer to the following additional resources:
- Linux:
  - Ensure you have the latest drivers for your GPU.
  - For more information on OpenCL on Linux, refer to the POCL documentation.
- Windows:
  - Intel and AMD provide detailed installation guides for their OpenCL SDKs. Follow the instructions specific to your hardware.
- macOS:
  - macOS users can install OpenCL support through Homebrew; see also the POCL documentation.

If you still face issues, please consider reaching out to the respective SDK support forums or the OpenCL community.
IRCA
He Y., Huang S., Zhang P., Ji Y., Xu J., 2021. Intra-Ramanome Correlation Analysis Unveils Metabolite Conversion Network from an Isogenic Population of Cells. mBio.
RBCS
Teng L., Wang X., Wang X., Gou H., Ren L., Wang T., 2016. Label-free, rapid and quantitative phenotyping of stress response in E. coli via ramanome. Scientific Reports.
Please post any questions, feedback, comments or suggestions on the GitHub Discussion board (alternatively, create a GitHub issue) or email SCC.