scaffold2fasta: Assertion `expectedOverlap < (int)contigString.length()' failed #125
I haven't seen that one before - did you use the |
I did at first, and I had suspected that's what had gone wrong. But, no, I repeated the estimates with |
I'm at a bit of a loss - could you send the .scaf file and .asqg.gz (or contig fasta, if the error appears with that input too). I won't be able to look until next week though. |
Here you go:
sga scaffold2fasta -m 500 --write-unplaced --use-overlap -o hsapiens-scaffolds.fa -a hsapiens-graph.asqg.gz hsapiens.scaf |
Any chance you've had some time to look into this issue, Jared? |
Sorry, no. My time is short these days I'm afraid. Is this blocking you from something? |
No worries. I am sympathetic to time constraints. We wanted to include SGA in an assembly comparison of GIAB HG004. SGA will be included in the comparison of contigs. Including scaffold results hinges on this issue. |
What's the format of the
|
The format is:
std::ostream& operator<<(std::ostream& out, const ScaffoldLink& link)
{
    out << link.endpointID << "," << link.distance << "," << link.stdDev << ","
        << link.edgeData.getDir() << "," << link.edgeData.getComp() << "," << link.getTypeCode();
    return out;
}
https://github.com/jts/sga/blob/master/src/Scaffold/ScaffoldLink.cpp#L118
|
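For anyone parsing these links outside SGA, here is a minimal Python sketch of reading one link back from the comma-separated form that `operator<<` writes. The field order comes from the code above; the Python names, the numeric types (integer distance, floating-point standard deviation, 0/1 for direction and complement), and the sample string are assumptions for illustration, not taken from SGA.

```python
from dataclasses import dataclass


@dataclass
class ScaffoldLink:
    """One link, mirroring the six fields written by operator<< above."""
    endpoint_id: str
    distance: int      # assumed to be an integer
    std_dev: float     # assumed to be a floating-point value
    direction: int     # assumed to be printed as 0 or 1
    comp: int          # assumed to be printed as 0 or 1
    type_code: str     # e.g. "D", as seen in the conversion script in this thread


def parse_link(text: str) -> ScaffoldLink:
    """Split 'id,distance,stdDev,dir,comp,type' into its six fields."""
    endpoint_id, distance, std_dev, direction, comp, type_code = text.split(",")
    return ScaffoldLink(endpoint_id, int(distance), float(std_dev),
                        int(direction), int(comp), type_code)


if __name__ == "__main__":
    # Hypothetical link string, made up for this example.
    print(parse_link("contig-42,150,12.5,1,0,D"))
```

A link with any number of fields other than six raises `ValueError` at the unpacking step, which is a cheap way to catch malformed records early.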
Could I convert this to an ABySS path file like so, or is it not quite so simple? How do I convert the
|
Is the |
The former - it is the final list of contigs that go into scaffolds. |
Can you give me a brief description of how to convert a SGA
Use cases like this one are why I so badly want a GFA standard! It would be great if |
Hi Shaun,
Agreed about GFA2. I describe the file format here:
The orientation bit describes whether the scaffold should be built left-to-right (the first contig in the record is the first contig in the scaffold) or right-to-left (the first contig in the record is the last contig in the scaffold). See here: https://github.com/jts/sga/blob/master/src/Scaffold/ScaffoldRecord.cpp#L45
I suggest you don't spend much time on this - SGA is no longer under active development and there are better short read assemblers out there.
Sorry I can't help more!
Jared
|
No worries, Jared. Quick comments are helpful. We're including SGA in this comparison to ABySS 2.0 due to its low memory usage. We're also comparing to DISCOVARdenovo, and it yields great contiguity, but takes a lot of memory and time. From the code it appears that only the direction bit of the first link matters, and the rest are ignored. Is that correct? https://github.com/jts/sga/blob/master/src/Scaffold/ScaffoldRecord.cpp#L60 |
Yes, I believe that is correct, but verify :)
|
Here's a hacktastic script to convert a SGA
#!/bin/sh
set -eu
exec gawk -e '
!/\t/ { next }
{
    # Determine whether the contigs are in reverse order.
    reverse = /^[^\t]*\t[^\t]*,1,[01],D/
    # Orient the contigs.
    $1 = $1 "+"
    rc = 0
    for (i = 2; i <= NF; ++i) {
        if ($i ~ /,1,D/)
            rc = !rc
        sub(/,.*/, rc ? "-" : "+", $i)
    }
    # Print the contig IDs and orientations.
    printf "%u", id++
    if (reverse) {
        for (i = NF; i > 0; --i)
            printf " %s", $i
        printf "\n"
    } else {
        print " " $0
    }
}' "$@" \
| gsed -e 's/ /\t/;s/ / 199N /g'
|
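For readers who want to test the logic without gawk and gsed, the same steps can be sketched in Python. This is a rough rendering of the script above, not a verified converter: the function name `scaf_record_to_path` and the sample records are hypothetical, and it assumes every link has exactly the six comma-separated fields shown earlier in the thread.

```python
from typing import Optional


def scaf_record_to_path(line: str, path_id: int, gap: str = "199N") -> Optional[str]:
    """Convert one tab-separated record into a path line, or None for a singleton."""
    if "\t" not in line:
        return None  # the awk script skips records that have no links
    fields = line.split()
    root, links = fields[0], [f.split(",") for f in fields[1:]]

    # Only the direction bit of the first link decides the order.
    first = links[0]
    reverse = first[3] == "1" and first[4] in ("0", "1") and first[5] == "D"

    # Orient the contigs: each set complement bit flips the running orientation.
    oriented = [root + "+"]
    rc = False
    for link in links:
        if link[4] == "1" and link[5] == "D":
            rc = not rc
        oriented.append(link[0] + ("-" if rc else "+"))

    if reverse:
        oriented.reverse()
    # ID, a tab, then the contigs joined by the fixed gap token.
    return f"{path_id}\t" + f" {gap} ".join(oriented)


if __name__ == "__main__":
    # Hypothetical records, made up for this example.
    print(scaf_record_to_path("c1\tc2,100,10.0,0,1,D\tc3,50,5.0,0,1,D", 0))
    print(scaf_record_to_path("c1\tc2,100,10.0,1,0,D", 1))
```

Like the awk version, this reverses only the order of the contigs for a right-to-left scaffold and leaves their orientation signs as computed, so that behaviour is worth checking against the SGA source before relying on it.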
sga scaffold2fasta blows an assertion. See jts/sga#125
Great, thanks for the script. |
Any suggestions?