Hominid script ends with error #2
Comments
Hello Ankit,
I think there might be something wrong with the data for SNP AXCM75183727.
Have you looked at that row? Maybe it has a "NaN" or "NA" or two columns
ran together. If you don't see anything wrong would you please send that
row?
Josh
…On Wed, Nov 13, 2019 at 11:47 PM ankit4035 ***@***.***> wrote:
Hi,
I have used Hominid where it was working well. However, recently it is
throwing an error in a new file.
The error seems to be something like:
ValueError: Input contains NaN, infinity or a value too large for
dtype('float64')
and
TypeError: 'NoneType' object is not iterable
After throwing the error, the script stops execution. All SNPs before the
faulty one (where the script halts) were processed and written to the
output file.
Please help me find where I could be going wrong.
I am attaching screenshots of the script as well as the error message.
Thanks,
Ankit
[image: Screenshot from 2019-11-14 10:18:10]
<https://user-images.githubusercontent.com/8553701/68827396-d4ff9680-06c7-11ea-816b-7b38b75e980e.png>
[image: Screenshot from 2019-11-14 10:19:41]
<https://user-images.githubusercontent.com/8553701/68827397-d4ff9680-06c7-11ea-9363-132ce1d512f6.png>
|
Hi Ankit,
I don't see a problem with that row either. Could you send the row above it
for comparison?
Josh
…On Thu, Nov 14, 2019, 07:13 ankit4035 ***@***.***> wrote:
Hi Josh,
I checked that row, there is no "NA" or "NaN" in it.
[image: Screenshot from 2019-11-14 17:44:48]
<https://user-images.githubusercontent.com/8553701/68856269-1dd64000-0706-11ea-88a8-77d1ebb607ca.png>
I have also reinstalled hominid and am still facing the same issue.
|
Hi Ankit,
Thanks for sending the extra rows. I haven't figured out what is wrong with
AXCM75183727 yet, but it looks like -1 is being used to indicate missing
data. Use NA instead.
Would you send AECK-final.rvcf? If you are not comfortable with that I'll
send you a little script that may identify the problem.
Josh
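The replacement of -1 with NA suggested above could be scripted roughly as follows. This is a minimal sketch, assuming a tab-separated input in which the first few columns are metadata and the remaining columns are per-sample genotype calls. The function name `replace_missing` and the column count `n_meta_cols` are hypothetical and not part of hominid; adjust them to your file.

```python
def replace_missing(line, n_meta_cols, missing="-1", replacement="NA"):
    """Replace the missing-data code in the genotype columns of one row.

    Assumes tab-separated fields. The first n_meta_cols fields are left
    untouched (hypothetical layout, adjust for your file).
    """
    fields = line.rstrip("\n").split("\t")
    meta, calls = fields[:n_meta_cols], fields[n_meta_cols:]
    # Compare whole fields so that a value such as "-10" is not altered.
    calls = [replacement if c.strip() == missing else c for c in calls]
    return "\t".join(meta + calls)
```

Comparing whole fields rather than doing a plain text search-and-replace avoids changing metadata columns or values that merely contain "-1".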
…On Thu, Nov 14, 2019 at 10:39 PM ankit4035 ***@***.***> wrote:
Hi Josh,
Here is the row above and below it.
[image: Screenshot from 2019-11-15 09:12:22]
<https://user-images.githubusercontent.com/8553701/68915342-a0ebaa80-0787-11ea-9ac0-8aeaf08360c7.png>
Ankit
|
Hi Ankit,
Please run this script on your input files. You will have to edit the two
file paths to point to your files. The script must run in the virtual
environment you use for hominid. It aligns the SNP and taxa data and checks
for "non-finite" values in the SNP data. If a row is found with bad values
it will be printed so we can maybe see what is going on.
Josh
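The script itself was sent as an attachment and is not reproduced in this thread. As a rough illustration of the kind of check described above, here is a minimal sketch that scans rows for values that are not finite numbers. It is not the attached script and it does not do the SNP/taxa alignment; it assumes a tab-separated file with `n_meta_cols` leading metadata columns, comment lines starting with `#`, and "NA" as the accepted missing-data marker. All names are hypothetical.

```python
import math


def find_nonfinite_rows(lines, n_meta_cols, id_col=0, missing=("NA",)):
    """Return [(row_id, [(column_index, raw_text), ...]), ...] for bad rows.

    A value is reported if it cannot be parsed as a float or if it parses
    to NaN or infinity. Tokens listed in `missing` are skipped.
    """
    bad = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.rstrip("\n").split("\t")
        problems = []
        for pos, text in enumerate(fields[n_meta_cols:], start=n_meta_cols):
            if text in missing:
                continue
            try:
                value = float(text)
            except ValueError:
                # e.g. two columns that ran together, or a stray label
                problems.append((pos, text))
                continue
            if not math.isfinite(value):
                problems.append((pos, text))
        if problems:
            bad.append((fields[id_col], problems))
    return bad
```

To use it on a file, pass the open file object as `lines`, for example `find_nonfinite_rows(open(path), n_meta_cols=1)`, and print whatever it returns.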
…On Fri, Nov 15, 2019 at 11:22 PM ankit4035 ***@***.***> wrote:
Hi Josh,
Yes, -1 is used for missing/undetermined calls. As per your suggestion, I
replaced all -1 with "NA" in the file and ran it again. The result is still
the same: an error at the same SNP.
[image: Screenshot from 2019-11-16 09:52:46]
<https://user-images.githubusercontent.com/8553701/68988012-a237d880-0856-11ea-9ad6-89086a3e34bf.png>
[image: Screenshot from 2019-11-16 09:50:18]
<https://user-images.githubusercontent.com/8553701/68988011-a237d880-0856-11ea-8b69-b1d8495ee85e.png>
If possible, could you send me the script you mentioned?
Ankit
|
Hi Josh, I think you forgot to attach the script. Can you please send it again? Ankit |
I had to change the file extension to .txt and attach it directly to the issue. |
Hi Josh, The script executed perfectly without any error. All the SNPs were finite and no issues were encountered. Ankit |
Sorry, I closed the issue by mistake. |
Hi Ankit,
Let's try printing the array that is causing trouble from inside hominid. I
created a branch in the github repository that will print the array that is
causing trouble right before the exception occurs. Please repeat the
installation process with these changes to step 4 (lines 1 and 4 are new,
the other lines are unchanged):
$ git checkout issue-2
$ conda create -n hom python=3.6 --file conda-requirements.txt
$ source activate hom
(hom) $ pip install -e .
(hom) $ pip install -r requirements.txt
(hom) $ conda install rpy2 r-essentials
Then run hominid from your new installation. If possible you might try
using just the three rows you sent before. The array that is causing the
exception should be printed. The exception will still occur.
I hope these instructions are clear. Let me know if you have any trouble.
Josh
|
Hi Josh, Your instructions are very clear. I installed a new copy of hominid and created a new virtual environment for it as well. I ran the first five SNPs (the problematic SNP AXCM75183727 is the second one) and the script ended with the error. I am attaching here:
Ankit |
Hi Ankit, I was not expecting to see that all elements of y_true are nan. My first thought is that none of the SNP data is aligning with the taxa data. I added another print to see what the taxa data looks like. Would you please run Josh |
Hi Josh, I did as you asked. Here are the files. test_result-rvcf.txt And thanks for sorting this out. Let's finish this. Ankit |
Hi Ankit, I added some more prints to show the data before it is sent to a worker. Would you please repeat the Josh |
Hi Josh, I did as you asked, and here is the results file. test-run-message.txt Ankit |
Hi Ankit, Sorry I have been out of touch lately. I just added some prints to the training loop since the data looks ok before that point. This could generate a lot of output. Please give it a try if you like. Josh |
Hi Josh, Here are the output files. test-run-message.txt Ankit |
Hello Ankit, Josh |
Hi Josh, Here are the files:
I tried with the good and bad SNPs combined and received the error at SNP AXCM75183727. I removed that line and tried again, and received the error at SNP AXCM75183728. After removing both SNPs, the file runs without any errors. Merry Christmas and happy holidays. Ankit |
Hi Ankit, I am not getting any errors from the bad SNP file. Do you get errors from it? Josh |
Hi Josh, This is the output from the terminal screen. Ankit |
Maybe there is something wrong with the formatting of the input file. When I run the bad SNP data in test-input-rvcf.txt I get no errors. This is the output:
|
I am using the same files. The command completes but shows an error at the end. When I was running with the complete files, the script stopped after showing this error. However, it doesn't terminate the command and keeps using all the resources. Also, no new lines are added to the output file. I will try preparing the files again and see what happens. Thanks. |
Hi Josh, I prepared both input files again and reinstalled Hominid, but the same error persists. I also tried installing Hominid on a new system and the same error remains. Would it be possible for you to run it if I share the complete file with you? Can I contact you by email for this? Ankit |
Hi Ankit, I have released hominid version 1.1.0. It has better exception handling, so a worker will not die if an exception is raised. Also, in the case of the numerical error you have experienced, the worker will retry up to three times. This is because the error seems to occur when the random selection of the 0, 1, and 2 classes results in a testing or validation set that is smaller than it should be. If all three attempts fail, the worker will give up on that SNP and request a new one. Those are the only changes. Josh |
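The retry behaviour described above might look roughly like the following. This is a minimal sketch, not the code in hominid 1.1.0. The function name `run_with_retries` and its parameters are hypothetical, and catching `ValueError` is an assumption based on the error reported earlier in this thread.

```python
def run_with_retries(task, max_attempts=3, exceptions=(ValueError,)):
    """Call task() until it succeeds or max_attempts have been used.

    Returns (result, attempts_used). The result is None if every attempt
    raised one of `exceptions`, so the caller can move on to the next item
    instead of crashing.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task(), attempt
        except exceptions:
            # A new random split may succeed, so simply try again.
            continue
    return None, max_attempts
```

A worker built this way would treat a `None` result as "skip this SNP and request a new one" rather than letting the exception end the process.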
Hi Josh, Thanks for the update. I will work with the new version and let you know how it goes. Ankit |
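Hi Josh, I used the new version, and the file completed without any error. Also, the "cv_scores.txt" file contained many entries which were missing in the previous version. I also wanted to ask/clarify a few things:
- Does running the same file again with the same parameters result in identical output, or will there be some minor changes?
- Also, if I run a subset of SNPs, should I get the same result for those SNPs as with the original file?
Thanks for your help. Ankit |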
Hi Ankit,
I'm glad to hear your file completed. For your first question, I expect
there will be slight differences between repeated runs on the same input.
This is because the method relies on random re-sampling, and in particular
SNPs with smaller numbers of individuals in any of the 3 genetic classes
(AA, Aa, aa) will probably have more variation across runs. This also
applies to your second question: I would expect some differences for the
same reasons if a subset of SNPs is run.
best regards,
Josh
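The effect described above can be illustrated with a small, self-contained sketch. It is not hominid's code and says nothing about hominid's internals. It only shows, using a plain bootstrap of the mean, that random re-sampling gives slightly different answers from run to run and that smaller groups vary more. All names are hypothetical.

```python
import random
import statistics


def bootstrap_means(values, n_resamples, seed):
    """Return the mean of each bootstrap resample of `values`."""
    rng = random.Random(seed)  # a fixed seed makes the run repeatable
    means = []
    for _ in range(n_resamples):
        # Draw len(values) items with replacement.
        resample = [rng.choice(values) for _ in values]
        means.append(statistics.mean(resample))
    return means


def spread(values, n_resamples=200, seed=0):
    """Standard deviation of the bootstrap means: larger means less stable."""
    return statistics.pstdev(bootstrap_means(values, n_resamples, seed))
```

Calling `bootstrap_means` twice with the same seed gives the same list, while different seeds generally give different lists. Comparing `spread` for a list of 5 values against the same values repeated to make 100 shows the smaller group varying more between resamples.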
…On Mon, Jul 20, 2020 at 2:20 AM ankit4035 ***@***.***> wrote:
Hi Josh,
I used the new version, and the file completed without any error. Also,
the file with "cv_scores.txt" contained many entries which was missing in
previous version.
I also wanted to ask/clarify few things:
- Does running the same file again with same parameters result in
identical output, or there will be some minor changes in both?
- Also, if I run a subset of SNPs, should I be getting the same result
for those SNPs as original file?
Thanks for your help.
Ankit
|
Hi Josh, I think I got your point. Anyway, thanks again for your help. Ankit |