Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identifying Interactions Across Data Modalities (Potential New Way to Apply SpiecEasi) #271

Open
AntonKjellberg opened this issue Sep 10, 2024 · 1 comment

Comments

@AntonKjellberg
Copy link

AntonKjellberg commented Sep 10, 2024

Hi,

I want to share a new way to apply SpiecEasi and gather any input or critiques from the community.
I’m working with two datasets collected at the same time point:

  1. A profile of 20 mucosal immune markers.
  2. 16S metagenomics of hypopharyngeal samples.
    For this analysis, I’ve limited the metagenomic data to the top 30 bacteria by abundance.

In my dataset, I observe:
• Strongly correlated groups of bacteria and likewise for cytokines
• Many significant correlations between certain bacteria and immune markers, with many bacteria displaying similar
responses.

I hypothesize that the immune response to particular bacteria may be more heterogeneous than what pure correlations suggest. The underlying correlation structures within each dataset might explain the observed similarity in bacterial responses. My goal is to identify which bacteria might be driving the observed shifts in immune tone using SpiecEasi:

  1. I applied CLR transformation separately to both the bacterial count data (top 30 bacteria) and the 20 immune markers
  2. I merged the transformed data into a single dataframe.
  3. I then used this dataframe in a forked version of SpiecEasi (without the normalization function), setting pulsar.params =
    list(thresh = 1) to retrieve all interaction weights.
  4. Finally, I visualized the results through a heatmap of the weights.

The results appear promising:
• SpiecEasi correctly identified that the datasets were individually compositional, applying negative weights to interactions
within the same dataset. This is ideal for focusing on the inter-dataset interactions.
• Some results make biological sense: for instance, one of the strongest positive weights is between Staphylococcus aureus
and IL-17, a well-documented interaction.

heatmap1.pdf
heatmap2.pdf

I recognize that including all weights could be statistically challenging to interpret, but this isn’t the goal here. I already prove that strong associations in overall interaction patterns between the datasets. My primary aim is to gain additional insight into which bacteria may be driving specific changes in immune tone.

Here's some code:

# CLR-Transform them and add Species names as rows
sp <- prune_taxa(names(sort(taxa_sums(subset_samples(phy, Time == '1m')), decreasing = T)[1:30]), 
                 subset_samples(phy_counts, Time == '1m')) %>% 
  microbiome::transform('clr') %>% tax_table() %>% as_tibble() %>% .$Species
otu <- prune_taxa(names(sort(taxa_sums(subset_samples(phy, Time == '1m')), decreasing = T)[1:30]), 
                 subset_samples(phy_counts, Time == '1m')) %>% microbiome::transform('clr') %>% 
  otu_table %>% as.data.frame() 
rownames(otu) <- sp

# Add CLR-transformed cytokines
otu <- otu %>% rbind(t(get_variable(subset_samples(phy, Time == '1m'))[3:22])) %>% select_if(~ !any(is.na(.)))

# Perform SpiecEasi (Forked version without normalization function)
se_all <- SpiecEasi::spiec.easi(otu, method='mb', lambda.min.ratio=1e-2, nlambda=40, pulsar.params = list(thresh = 1))

# Obtain the weights
optbeta <- as.matrix(symBeta(getOptBeta(se_all)))

# Plot all intercations
pheatmap(optbeta,
         color = colorRampPalette(c("#0460bd", "white", '#d6023b'))(200),
         breaks = c(seq(min(optbeta), 0, length.out=ceiling(200/2) + 1), 
                    seq(max(optbeta)/200, max(optbeta), length.out=floor(200/2))),
         labels_row = rownames(otu),
         labels_col = rownames(otu))

# Plot cross-dataset intercations only
weights <- optbeta[1:30, (ncol(optbeta)-19):ncol(optbeta)]
rownames(weights) <- sp
colnames(weights) <- colnames(get_variable(phy))[3:22]
pheatmap(weights, 
         color = colorRampPalette(c("#0460bd", "white", '#d6023b'))(200),
         breaks = c(seq(min(weights), 0, length.out=ceiling(200/2) + 1), 
                    seq(max(weights)/200, max(weights), length.out=floor(200/2))))
@zdk123
Copy link
Owner

zdk123 commented Sep 10, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants