link Bokulich et al mock community test files from the QIIME Resources page #2105
Comments
👍 Yes please. While these are on qiita, they are not easily accessible. We could place them in a github repo, similar to www.github.com/torognes/vsearch-data |
I'm still interested in this data. Other people are too. I'm also interested in contributing to the resource page or to a new github repo for this data set. Let me know how I can help. |
@colinbrislawn, if you're available to issue a PR adding the links to the QIIME Resources page (content is here), that'd be fantastic! |
Also, ping @nbokulich so he's aware of this. |
Sure thing! Should I link to the studies on qiita, or on a FTP server, or in a github repo like I mentioned before? |
I think Qiita is ideal, if @antgonza agrees. Otherwise FTP. The files are
|
I guess I kind of like hosting these on FTP and qiita, if possible. The vsearch test data and the Mothur example data are really easy to download and this encourages reuse. While I use and love qiita, it's new and we can lower the barrier of entry with an FTP site. Could we host these files along with the other files on |
Are the files hosted in Qiita accessible through FTP, @antgonza? I can't comment on ftp.microbio.me - that's a Knight Lab resource. |
Oh if we could hard link to the files in qiita, that would remove the barrier of entry without duplicating effort. That would be perfect! |
Agree - that would be ideal. |
Greg, didn't we copy all raw data to the taxa assignment github? Or still
|
@nbokulich, thanks for the reminder. We did, and those links are here and other relevant data here. I thought we had these on S3, in which case we'd be paying for the data transfer and it's pretty expensive there, but these are all already on ftp.microbio.me. So, I think we're good to go, and we can link to these and to the Qiita studies. All good @colinbrislawn? |
I'm ready to start. Could you assign it to me? I'll use ftp.microbio.me as much as possible, defaulting to the S3 links when needed. I'll also mention the qiita study IDs. |
I'll do this in waves, starting with qiita and Bokulich, 2013. The original study mentions these data sets:
From that list, these studies are missing from qiita: Like you mentioned, this one is not yet publicly available: |
@colinbrislawn the ids that you are seeing in the original study have been kept in Qiita - so you just need to put the study id at the end of those links and you will have all those. @antgonza is working on getting all of them available through Qiita. |
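A minimal sketch of the "put the study id at the end of those links" step described above, in case it helps when building the list for the Resources page. The base URL below is a hypothetical placeholder (the actual Qiita link is not given in this thread), and `study_link` is an illustrative helper name, not part of QIIME or Qiita.

```python
def study_link(base_url: str, study_id: int) -> str:
    """Append a study ID to a base link, with exactly one slash between them."""
    return f"{base_url.rstrip('/')}/{study_id}"


# Hypothetical placeholder; substitute the real Qiita study link here.
BASE_URL = "https://example.org/study/"

# Study IDs mentioned in this thread, used only for illustration.
for study_id in (721, 722):
    print(study_link(BASE_URL, study_id))
```

Keeping the base link in one place means the whole list can be regenerated if the Qiita links change.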
Good to know. Once the links are live I will add them post haste! What study are 1972 and 1973 from? Those aren't mentioned in the nature paper. |
1972 AN
|
Sorry, finger slip hit send prematurely. 1972 and 1973 are from a study we are working on now. Unpublished but you 1626 actually = 1517 (it was given a different ID when ported to qiita... 719 was split into 721 (5' reads) and 722 (3' reads). (credit again goes to NOTE: some of these are not actually mock communities. The following IDs
|
Yeah... I think the links on the FTP site use a different nomenclature. I dug up the attached document, which should clear things up: it gives the Does that clear it up?
|
Looks like the link won't attach. Here's the relevant text (or email me
|
GitHub doesn't like FTPs. |
Oh thanks! I'll take another shot at it. |
I've got most of this wrapped up in a PR. With all your help, I'm really close! I have a quick question: the following folders and associated qiita studies are never mentioned in the Bokulich paper. Are these the data sets from that peerj paper? How should I present these?
Inversely, I don't have FTP links to these qiita studies.
Thanks for helping me construct this. |
That's correct --- those studies not mentioned in the Nature methods paper Studies 1683, 1684, 1689, and 1690 are NOT mock communities. They are Not sure why 719 isn't in the FTP. Think there was another outside link The datasets on the FTP are those used in the peerJ preprint. 1626 (now
|
I was planning to wait for the official publication to post the peerJ paper. Should I just post them now? I'm pretty close to finishing the 2013 paper. I was hoping to add the 1683, 1684, 1689, and 1690 studies. While they are not mock communities, they are included in the paper. I guess I'll just use qiita links if they are not on the server... |
I don't think there's any reason to wait for the peer-reviewed publication.
|
@nbokulich, aren't the Turnbaugh 1 sequences the ones from my 2011 PNAS paper? |
Yes, Turnbaugh 1 = your 2011 PNAS paper.
|
Just to be sure, |
I was wondering if there were reference sequences for the Bokulich mock communities that were analyzed in "Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing"? What I'm looking for is something like a list of the 16S sequences present in each of the mock community strains (eg. http://www.mothur.org/MiSeqDevelopmentData/HMP_MOCK.fasta). All I can find in the original paper are lists of the strains used. The sequences themselves would be very useful to have for certain benchmarking purposes! |
@colinbrislawn: Incorrect. Turnbaugh2 == qiita 1973 @benjjneb: No, we do not have such ref sequences compiled yet. We are
|
OK, I think my PR is done. Can someone review #2130? Should I include study |
I have all of the links in email. This is a great resource for testing new methods.