Need help to run lastz.psl-100k_sim.sh #1

Open
farhan-lab opened this issue Apr 25, 2022 · 12 comments

@farhan-lab

Dear Luohao,

I am trying to investigate ZW similarity and evolutionary strata using your script "lastz.psl-100k_sim.sh", following an analysis similar to the one reported in your paper (Xu et al. 2019). I have assembled a pseudo-scaffold for each of the Z and W chromosomes. I tried to run the script according to your protocol with the following command:

sh ./BOPsexChr-master/lastz.psl-100k_sim.sh result_test Z_masked.fasta W-scaff.masked.fasta

But I get the following error messages.

FAILURE: in init_from_anchors(), structure size would exceed 2^32 (362128816 + 128*45266102)
consider raising scoring threshold (--hspthresh or --exact) or breaking your target sequence into smaller pieces
./BOPsexChr-master/lastz.psl-100k_sim.sh: 17: ./BOPsexChr-master/lastz.psl-100k_sim.sh: faSize: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 18: ./BOPsexChr-master/lastz.psl-100k_sim.sh: faSize: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 19: ./BOPsexChr-master/lastz.psl-100k_sim.sh: chainPreNet: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 21: ./BOPsexChr-master/lastz.psl-100k_sim.sh: chainNet: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 21: ./BOPsexChr-master/lastz.psl-100k_sim.sh: netSyntenic: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 23: ./BOPsexChr-master/lastz.psl-100k_sim.sh: faToTwoBit: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 24: ./BOPsexChr-master/lastz.psl-100k_sim.sh: faToTwoBit: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 26: ./BOPsexChr-master/lastz.psl-100k_sim.sh: netToAxt: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 26: ./BOPsexChr-master/lastz.psl-100k_sim.sh: axtSort: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 28: ./BOPsexChr-master/lastz.psl-100k_sim.sh: axtToMaf: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 30: ./BOPsexChr-master/lastz.psl-100k_sim.sh: mafToPsl: not found
./BOPsexChr-master/lastz.psl-100k_sim.sh: 33: ./BOPsexChr-master/lastz.psl-100k_sim.sh: pslScore: not found

Could you please guide me on how to fix these errors and run the script successfully?
Many thanks,
Regards,
Farhan

@lurebgi
Owner

lurebgi commented Apr 25, 2022 via email

@farhan-lab
Author

Hi,
Many thanks for your quick response. I have already installed LASTZ. For the UCSC genome utilities, should I install all of the listed utilities, or only specific ones?

@lurebgi
Owner

lurebgi commented Apr 26, 2022

For this task, installing only what is needed, e.g. faSize, will be fine, though it never hurts to install all of them.
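
A quick way to see which utilities are still missing is to test each name against PATH before running the script. The sketch below is only an illustration: the list of names is copied from the "not found" lines in the error output above plus lastz itself, and the script may call other programs that are not listed here. The function name `missing_tools` is made up for this example.

```shell
#!/bin/sh
# Tool names taken from the "not found" errors in this thread, plus lastz.
# Assumption: this list may be incomplete for the actual script.
required="lastz faSize chainPreNet chainNet netSyntenic faToTwoBit netToAxt axtSort axtToMaf mafToPsl pslScore"

# Print every name in the argument list that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# $required is intentionally unquoted so it splits into one word per tool.
missing=$(missing_tools $required)
if [ -n "$missing" ]; then
    echo "missing from PATH:" $missing
else
    echo "all required tools found"
fi
```

Any name it prints still needs to be installed or added to PATH.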

@lurebgi
Owner

lurebgi commented Apr 26, 2022 via email

@farhan-lab
Author

farhan-lab commented Apr 26, 2022

Thanks, I have installed all the utilities and run the script again. This time I got some files, but the main output files are empty:

./BOPsexChr-master/lastz.psl-100k_sim.sh: 7: ./BOPsexChr-master/lastz.psl-100k_sim.sh: module: not found
FAILURE: in init_from_anchors(), structure size would exceed 2^32 (393351136 + 128*49168891)
consider raising scoring threshold (--hspthresh or --exact) or breaking your target sequence into smaller pieces
Got 1 chroms in result_test/result_test.z.fa.size, 1 in result_test/result_test.w.list.mask.fa.size
Finishing nets
writing stdout
writing /dev/null
memory usage 41680896, utime 0 s/100, stime 1
result_test/result_test.z-w.psl is empty
Can't open perl script "psl-100k_sim.pl": No such file or directory

ls -lah
total 35M
-rwxrwxrwx 1 cytolab cytolab 402 Apr 26 14:14 result_test.axt
-rwxrwxrwx 1 cytolab cytolab 528 Apr 26 14:14 result_test.axt.chain
-rwxrwxrwx 1 cytolab cytolab 528 Apr 26 14:14 result_test.axt.chain.filt
-rwxrwxrwx 1 cytolab cytolab 95 Apr 26 14:14 result_test.chain.log
-rwxrwxrwx 1 cytolab cytolab 528 Apr 26 14:14 result_test.noClass.net
-rwxrwxrwx 1 cytolab cytolab 11M Apr 26 14:14 result_test.w.list.mask.fa.2bit
-rwxrwxrwx 1 cytolab cytolab 23 Apr 26 14:14 result_test.w.list.mask.fa.size
-rwxrwxrwx 1 cytolab cytolab 24M Apr 26 14:14 result_test.z.fa.2bit
-rwxrwxrwx 1 cytolab cytolab 23 Apr 26 14:14 result_test.z.fa.size
-rwxrwxrwx 1 cytolab cytolab 528 Apr 26 14:14 result_test.z-w.axt
-rwxrwxrwx 1 cytolab cytolab 559 Apr 26 14:14 result_test.z-w.maf
-rwxrwxrwx 1 cytolab cytolab 0 Apr 26 14:14 result_test.z-w.psl
-rwxrwxrwx 1 cytolab cytolab 0 Apr 26 14:14 result_test.z-w.psl.score
-rwxrwxrwx 1 cytolab cytolab 0 Apr 26 14:14 result_test.z-w.psl.score.ide95.filt
-rwxrwxrwx 1 cytolab cytolab 0 Apr 26 14:14 result_test.z-w.psl.score.ide95.filt.100k
-rwxrwxrwx 1 cytolab cytolab 0 Apr 26 14:14 result_test.z-w.psl.score.ide95.filt.ide-100k

Do you think the Z and W chromosome alignments could have been filtered out by the lastz criteria "--hspthresh=2200 --inner=2000 --ydrop=3400 --gappedthresh=10000", leaving no detectable similarity?

Many thanks for your support
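
Evaluating the expression printed in the two FAILURE messages shows how far the requested structure size is above 2^32. This is only a sketch of the arithmetic: what the two terms stand for is not stated in this thread, and the helper name `structure_size` is made up for the example. It also assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Assumption: the shell evaluates $(( )) with 64-bit integers (true for
# dash and bash on 64-bit systems); a 32-bit shell would overflow here.
limit=4294967296   # 2^32

# Evaluate "base + 128*count" exactly as written in the FAILURE message.
structure_size() {
    echo $(( $1 + 128 * $2 ))
}

first=$(structure_size 362128816 45266102)    # numbers from the first run
second=$(structure_size 393351136 49168891)   # numbers from the second run

echo "2^32       : $limit"
echo "first run  : $first"
echo "second run : $second"
```

Both values come out above the 2^32 limit, so lastz stops with a FAILURE in both runs. That is consistent with the near-empty output files, although the thread does not confirm it as the only cause.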

@lurebgi
Owner

lurebgi commented Apr 26, 2022

Can you show me result_test/result_test.z.fa.size?

@farhan-lab
Author

Sure, here it is:
chromosome_8 100101675

@lurebgi
Owner

lurebgi commented Apr 26, 2022

How about result_test/result_test.w.list.mask.fa.size?

Perhaps try lastz with default settings, i.e. without "--hspthresh=2200 --inner=2000 --ydrop=3400 --gappedthresh=10000", and try a smaller dataset to narrow down the problem.

@farhan-lab
Author

cat result_test.w.list.mask.fa.size
Super-Scaffol 43643165

OK, I will try lastz with the default settings and post an update soon.

@lurebgi
Owner

lurebgi commented Apr 26, 2022

Maybe also try splitting Super-Scaffol into smaller pieces. The issue seems to be a memory limitation.
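
One way to break a long sequence into smaller pieces is a short awk filter. This is a minimal sketch and not part of the BOPsexChr scripts: the function name `split_fasta` and the naming of pieces as `<sequence>_<0-based offset>` are choices made for this example, so that hits on a piece can later be mapped back to the original coordinates.

```shell
#!/bin/sh
# split_fasta FILE PIECE_LEN
# Writes each sequence in FILE to stdout as consecutive pieces of at most
# PIECE_LEN bases. Each piece is named <sequence>_<0-based start offset>.
split_fasta() {
    awk -v size="$2" '
        # Emit full-size pieces; when final is true, also emit the remainder.
        function flush_piece(final) {
            while (length(buf) >= size || (final && length(buf) > 0)) {
                printf ">%s_%d\n%s\n", name, start, substr(buf, 1, size)
                buf = substr(buf, size + 1)
                start += size
            }
        }
        /^>/ { flush_piece(1); name = substr($1, 2); buf = ""; start = 0; next }
             { buf = buf $0; flush_piece(0) }
        END  { flush_piece(1) }
    ' "$1"
}

# Tiny demonstration on a made-up sequence.
tmp=$(mktemp)
printf '>demo\nACGTACGTAC\n' > "$tmp"
split_fasta "$tmp" 4
rm -f "$tmp"
```

For real data the call would look like `split_fasta W-scaff.masked.fasta 10000000 > W_pieces.fa`, where the 10 Mb piece length and the output file name are arbitrary illustrative values.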

@farhan-lab
Author

Hi, I just re-ran lastz with the default options and the output files are still empty.

cat result_test.axt

lastz.v1.04.15 --step=19 --format=axt
hsp_threshold = 3000
gapped_threshold = 3000
x_drop = 910
y_drop = 9400
gap_open_penalty = 400
gap_extend_penalty = 30
     A    C    G    T
A   91 -114  -31 -123
C -114  100 -125  -31
G  -31 -125  100 -114
T -123  -31 -114   91

Is it possible that my Z and W scaffold sequences do not show any similarity or homology at all?

@lurebgi
Owner

lurebgi commented Apr 26, 2022

Please try nucmer or other tools then. Sorry that I can't be more helpful.
