olgabot/2022 jun comments #12

Merged (35 commits) on Jun 27, 2022

@olgabot olgabot commented Jun 18, 2022

  • Suggestion to add intro sentence to abstract
  • Questions and comments on introduction

@olgabot
Contributor Author

olgabot commented Jun 18, 2022

Got about 1.5 figures in so far, will continue soon!

Questions/comments from inline HTML comments:

  1. Does the "size" of a pangenome mean number of sequences, number of genes, number of species, or something else?
  2. Is an "open" pangenome just all the genomes in the world? Why is it that "closed" pangenomes don't increase in size?
  3. Maybe add a citation for the Mantel test? I had to look it up.
  4. Can you add linear fit lines with $R^2 = 0.12$ and $R^2 = 0.87$ to @fig:panmers_fig A? I think it would make your point more clear. (oh wow GitHub renders LaTeX now!)
  5. Add the t-test statistic values to Figure @fig:panmers_fig B
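
For item 4, a minimal sketch of how a least-squares fit line and its $R^2$ label could be computed before drawing it on a panel. The function name `linear_fit_r2` and the toy points are made up for illustration; they are not the paper's code or data.

```python
def linear_fit_r2(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    return slope, intercept, r2


# Hypothetical toy points, not data from the paper.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
slope, intercept, r2 = linear_fit_r2(xs, ys)

# Text that could be placed next to the fit line in the figure.
label = f"$R^2 = {r2:.2f}$"
```

The returned slope and intercept give the two endpoints of the line to draw, and `label` is the annotation string.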

@fig

fig commented Jun 18, 2022

Hi!

@olgabot
Contributor Author

olgabot commented Jun 18, 2022

Hi!

Hahah hello! Didn't mean to tag you there, sorry for the spam. Meant to be referencing a particular figure in the paper 😅

@olgabot
Contributor Author

olgabot commented Jun 27, 2022

Made some more comments! Here are all the comments/questions/suggestions, listed out below:

  1. Is it possible to compute distance in millions of years ago (mya) from GTDB-Tk?
  2. Is there a figure for noncoding reads in pseudogenes?
  3. Is there a figure for read error rates in coding vs noncoding?
  4. Is "genome" the preferred name of taxonomic rank over "strain" in GTDB?
  5. The legend for metap_sfig would be clearer if shown as a table, like a 2x2 contingency table
  6. What does "other" mean in Figure 4D? The text states that BIOML-A27 was the only strain of B. uniformis, but that does not seem to be true if "other" is present.
  7. You may be dinged on quantifying how k-mers are "fast" relative to the other option

Member

@taylorreiter taylorreiter left a comment

Thank you so much for all of your comments, Olga! I removed the inline comments and turned them into issues, notes, or comments that I'll add to the conversation. I also fixed a few typos and built on your suggestions as comments... hoping I can commit those into this PR after I publish this review.

content/01.abstract.md (outdated, resolved)
content/01.abstract.md (outdated, resolved)
content/02.introduction.md (outdated, resolved)
content/02.introduction.md (outdated, resolved)
content/02.introduction.md (outdated, resolved)
content/03.results.md (outdated, resolved)
content/03.results.md (outdated, resolved)
content/03.results.md (outdated, resolved)
content/04.discussion.md (outdated, resolved)
content/95.appendix.md (outdated, resolved)

@taylorreiter
Member

OK, and some answers to your inline comments!

Question: Does the "size" of a pangenome mean number of sequences, number of genes, number of species, or something else?

That's a really good question. I think it could mean the number of distinct sequences observed across all the genomes that were looked at, but typically it refers to the number of genes.

Question: is an "open" pangenome just all the genomes in the world? Why is it that "closed" pangenomes don't increase in size?

A pangenome can be considered all of the genomes in the world (although it is not really possible to exhaustively sample them), but the openness or closedness of a pangenome is a property of the eco-evo strategy of that group of organisms. Closed pangenomes are usually associated with organisms that have a very small niche breadth.
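
To make the open/closed distinction concrete, here is a toy sketch that takes "size" to mean the number of distinct genes and tracks it as genomes are added one at a time. The function name and the gene sets are hypothetical, chosen only to show the two shapes; nothing here comes from the paper's data.

```python
def pangenome_accumulation(genomes):
    """Return the running count of distinct genes as genomes are added in order."""
    seen = set()
    sizes = []
    for genes in genomes:
        seen.update(genes)       # add any genes not observed before
        sizes.append(len(seen))  # pangenome size after this genome
    return sizes


# Hypothetical gene sets, not data from the paper.
# "Open"-like: every new genome contributes a gene not seen before.
open_like = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "e"}]
# "Closed"-like: new genomes only repeat genes already observed.
closed_like = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"b", "c"}]

open_sizes = pangenome_accumulation(open_like)
closed_sizes = pangenome_accumulation(closed_like)
```

In the open-like case the curve keeps climbing with each genome, while in the closed-like case it plateaus after the first one.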

Is "genome" the preferred name of taxonomic rank over "strain" in GTDB?

Eh...it's sort of unclear. Genome is a bit more exact, as strain can have many definitions.

What does "other" mean in Figure 4D? The text states that BIOML-A27 was the only strain of B. uniformis, but that does not seem to be true if "other" is present.

Good catch -- it's other strains from different species that get scooped in because assembly graph queries retrieve things down to about 93% ANI. I updated the language to make this clearer.

You may be dinged on quantifying how k-mers are "fast" relative to the other option

This is probably true...I'm going to leave it for now, and if necessary I'll go back and benchmark.

The legend for metap_sfig would be clearer if shown as a table, like a 2x2 contingency table

Love this idea! Will update.

Also note I made issues #13, #14, and #15 to account for the (potentially) missing figures. There's a chance I may preprint without these figures, and then will think about adding them to the supplement before submitting for publication.

Lastly, I will either merge this branch and then update the figures (add lines of fit, t-test results, and a contingency table for the legend) and add more detail to some sections of the methods, or push more changes to this branch.

Thank you again so much, Olga! Your comments were 💯

@ctb
Member

ctb commented Jun 28, 2022

Is "genome" the preferred name of taxonomic rank over "strain" in GTDB?

Eh...it's sort of unclear. Genome is a bit more exact, as strain can have many definitions.

Yes, this is something that I think Taylor and Tessa came up with and just started using, and it's SO MUCH CLEARER than using "strain"! 🎉

Like "is it a different strain or not?" Well who knows, but it's definitely a different genome sequence, so 🤷

4 participants