Identifying Interactions Across Data Modalities (Potential New Way to Apply SpiecEasi) #271

AntonKjellberg · 2024-09-10T09:09:28Z

Hi,

I want to share a new way to apply SpiecEasi and gather any input or critiques from the community.
I’m working with two datasets collected at the same time point:

A profile of 20 mucosal immune markers.
16S metagenomics of hypopharyngeal samples.
For this analysis, I’ve limited the metagenomic data to the top 30 bacteria by abundance.

In my dataset, I observe:
• Strongly correlated groups of bacteria and likewise for cytokines
• Many significant correlations between certain bacteria and immune markers, with many bacteria displaying similar
responses.

I hypothesize that the immune response to particular bacteria may be more heterogeneous than what pure correlations suggest. The underlying correlation structures within each dataset might explain the observed similarity in bacterial responses. My goal is to identify which bacteria might be driving the observed shifts in immune tone using SpiecEasi:

I applied CLR transformation separately to both the bacterial count data (top 30 bacteria) and the 20 immune markers
I merged the transformed data into a single dataframe.
I then used this dataframe in a forked version of SpiecEasi (without the normalization function), setting pulsar.params =
list(thresh = 1) to retrieve all interaction weights.
Finally, I visualized the results through a heatmap of the weights.

The results appear promising:
• SpiecEasi correctly identified that the datasets were individually compositional, applying negative weights to interactions
within the same dataset. This is ideal for focusing on the inter-dataset interactions.
• Some results make biological sense: for instance, one of the strongest positive weights is between Staphylococcus aureus
and IL-17, a well-documented interaction.

heatmap1.pdf
heatmap2.pdf

I recognize that including all weights could be statistically challenging to interpret, but this isn’t the goal here. I already prove that strong associations in overall interaction patterns between the datasets. My primary aim is to gain additional insight into which bacteria may be driving specific changes in immune tone.

Here's some code:

# CLR-Transform them and add Species names as rows
sp <- prune_taxa(names(sort(taxa_sums(subset_samples(phy, Time == '1m')), decreasing = T)[1:30]), 
                 subset_samples(phy_counts, Time == '1m')) %>% 
  microbiome::transform('clr') %>% tax_table() %>% as_tibble() %>% .$Species
otu <- prune_taxa(names(sort(taxa_sums(subset_samples(phy, Time == '1m')), decreasing = T)[1:30]), 
                 subset_samples(phy_counts, Time == '1m')) %>% microbiome::transform('clr') %>% 
  otu_table %>% as.data.frame() 
rownames(otu) <- sp

# Add CLR-transformed cytokines
otu <- otu %>% rbind(t(get_variable(subset_samples(phy, Time == '1m'))[3:22])) %>% select_if(~ !any(is.na(.)))

# Perform SpiecEasi (Forked version without normalization function)
se_all <- SpiecEasi::spiec.easi(otu, method='mb', lambda.min.ratio=1e-2, nlambda=40, pulsar.params = list(thresh = 1))

# Obtain the weights
optbeta <- as.matrix(symBeta(getOptBeta(se_all)))

# Plot all intercations
pheatmap(optbeta,
         color = colorRampPalette(c("#0460bd", "white", '#d6023b'))(200),
         breaks = c(seq(min(optbeta), 0, length.out=ceiling(200/2) + 1), 
                    seq(max(optbeta)/200, max(optbeta), length.out=floor(200/2))),
         labels_row = rownames(otu),
         labels_col = rownames(otu))

# Plot cross-dataset intercations only
weights <- optbeta[1:30, (ncol(optbeta)-19):ncol(optbeta)]
rownames(weights) <- sp
colnames(weights) <- colnames(get_variable(phy))[3:22]
pheatmap(weights, 
         color = colorRampPalette(c("#0460bd", "white", '#d6023b'))(200),
         breaks = c(seq(min(weights), 0, length.out=ceiling(200/2) + 1), 
                    seq(max(weights)/200, max(weights), length.out=floor(200/2))))

The text was updated successfully, but these errors were encountered:

zdk123 · 2024-09-10T11:57:03Z

If the cytokine measurements are compositional, this should be equivalent to running multi-domain SPIEC-EASI (which applies the CLR independently to each dataset as you did here).

…

On Tue, Sep 10, 2024, 5:09 AM Anton Kjellberg ***@***.***> wrote: Hi, I want to share a new way to apply SpiecEasi and gather any input or critiques from the community. I’m working with two datasets collected at the same time point: 1. A profile of 20 mucosal immune markers. 2. 16S metagenomics of hypopharyngeal samples. For this analysis, I’ve limited the metagenomic data to the top 30 bacteria by abundance. In my dataset, I observe: • Strongly correlated groups of bacteria and likewise for cytokines • Many significant correlations between certain bacteria and immune markers, with many bacteria displaying similar responses. I hypothesize that the immune response to particular bacteria may be more heterogeneous than what pure correlations suggest. The underlying correlation structures within each dataset might explain the observed similarity in bacterial responses. My goal is to identify which bacteria might be driving the observed shifts in immune tone using SpiecEasi: 1. I applied CLR transformation separately to both the bacterial count data (top 30 bacteria) and the 20 immune markers 2. I merged the transformed data into a single dataframe. 3. I then used this dataframe in a forked version of SpiecEasi (without the normalization function), setting pulsar.params = list(thresh = 1) to retrieve all interaction weights. 4. Finally, I visualized the results through a heatmap of the weights. The results appear promising: • SpiecEasi correctly identified that the datasets were individually compositional, applying negative weights to interactions within the same dataset. This is ideal for focusing on the inter-dataset interactions. • Some results make biological sense: for instance, one of the strongest positive weights is between Staphylococcus aureus and IL-17, a well-documented interaction. heatmap1.pdf <https://github.com/user-attachments/files/16942342/heatmap1.pdf> heatmap2.pdf <https://github.com/user-attachments/files/16942343/heatmap2.pdf> I recognize that including all weights could be statistically challenging to interpret, but this isn’t the goal here. I already prove that strong associations in overall interaction patterns between the datasets. My primary aim is to gain additional insight into which bacteria may be driving specific changes in immune tone. Here's some code: `# Select the top 30 Bacteria by abundance (phy is relative abundance transformed, phy_counts contains counts) CLR-Transform them and add Species names as rows sp <- prune_taxa(names(sort(taxa_sums(subset_samples(phy, Time == '1m')), decreasing = T)[1:30]), subset_samples(phy_counts, Time == '1m')) %>% microbiome::transform('clr') %>% tax_table() %>% as_tibble() %>% .$Species otu <- prune_taxa(names(sort(taxa_sums(subset_samples(phy, Time == '1m')), decreasing = T)[1:30]), subset_samples(phy_counts, Time == '1m')) %>% microbiome::transform('clr') %>% otu_table %>% as.data.frame() rownames(otu) <- sp Add CLR-transformed cytokines otu <- otu %>% rbind(t(get_variable(subset_samples(phy, Time == '1m'))[3:22])) %>% select_if(~ !any(is.na(.))) Perform SpiecEasi (Forked version without normalization function) se_all <- SpiecEasi::spiec.easi(otu, method='mb', lambda.min.ratio=1e-2, nlambda=40, pulsar.params = list(thresh = 1)) Obtain the weights optbeta <- as.matrix(symBeta(getOptBeta(se_all))) Plot all intercations pheatmap(optbeta, color = colorRampPalette(c("#0460bd", "white", '#d6023b'))(200), breaks = c(seq(min(optbeta), 0, length.out=ceiling(200/2) + 1), seq(max(optbeta)/200, max(optbeta), length.out=floor(200/2))), labels_row = rownames(otu), labels_col = rownames(otu)) Plot cross-dataset intercations only weights <- optbeta[1:30, (ncol(optbeta)-19):ncol(optbeta)] rownames(weights) <- sp colnames(weights) <- colnames(get_variable(phy))[3:22] pheatmap(weights, color = colorRampPalette(c("#0460bd", "white", '#d6023b'))(200), breaks = c(seq(min(weights), 0, length.out=ceiling(200/2) + 1), seq(max(weights)/200, max(weights), length.out=floor(200/2)))) ` — Reply to this email directly, view it on GitHub <#271>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAUD2RE5OVEJMFZDVZVG7PDZV2ZN5AVCNFSM6AAAAABN6I7AT6VHI2DSMVQWIX3LMV43ASLTON2WKOZSGUYTKOBWGQZTEMY> . You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Identifying Interactions Across Data Modalities (Potential New Way to Apply SpiecEasi) #271

Identifying Interactions Across Data Modalities (Potential New Way to Apply SpiecEasi) #271

AntonKjellberg commented Sep 10, 2024 •

edited

Loading

zdk123 commented Sep 10, 2024 via email

Identifying Interactions Across Data Modalities (Potential New Way to Apply SpiecEasi) #271

Identifying Interactions Across Data Modalities (Potential New Way to Apply SpiecEasi) #271

Comments

AntonKjellberg commented Sep 10, 2024 • edited Loading

zdk123 commented Sep 10, 2024 via email

AntonKjellberg commented Sep 10, 2024 •

edited

Loading