Running dictionary command outputs Java memory error #11
Can you indicate your operating system please? I assume it's a version of Windows because of the backslashes.
no need to add names - the whole world can help with this :-)
Much better to include the actual text as it can be cut-and-pasted. Please repost the output as text.
SVG will only be formed after the search completes.
Please give the exact command. Possible parameters are
Open source projects cannot promise delivery dates, sorry. |
Hi, yes, the OS is Windows 10, version 10.0.17134.
C:\bin>ami-search-new -p C:\Users\Rahul\Documents\New_book_chapter --dictionary C:\Users\Rahul\Documents\New_book_chapter\stress_and_bacteria.xml
Generic values (AMISearchTool)
basename null
Specific values (AMISearchTool)
dictionaryList [C:\Users\Rahul\Documents\New_book_chapter\stress_and_bacteria.xml]
cProject: New_book_chapter
running: word; word([frequencies])[{xpath:@count>20}, {w.stopwords:pmcstop.txt stopwords.txt}]
.............Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
The exact option used to allocate the memory was: -Xmx2048m
Please find the text document containing the list of PMCs. Thanks |
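A minimal sketch of one way to rerun the command above with a larger heap, in case the 2 GB setting is not reaching the JVM. JAVA_TOOL_OPTIONS is a standard JVM environment variable; whether the ami-search-new launcher also pins its own -Xmx is an assumption here, and the paths are the ones quoted above.

```python
# Sketch: rerun ami-search-new asking for a 4 GB heap via JAVA_TOOL_OPTIONS,
# which any HotSpot JVM picks up from the environment.
import os
import subprocess

env = dict(os.environ, JAVA_TOOL_OPTIONS="-Xmx4g")
cmd = (r"ami-search-new -p C:\Users\Rahul\Documents\New_book_chapter "
       r"--dictionary C:\Users\Rahul\Documents\New_book_chapter\stress_and_bacteria.xml")
# shell=True so the .bat/.cmd launcher resolves on Windows
subprocess.run(cmd, env=env, shell=True, check=True)
```

If the crash persists with a larger heap, the cause is more likely a pathological input file than the heap size itself.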
I have run this on your PMC set but with an inbuilt dictionary. No crash:
What is in your dictionary? I think the problem may be there. Can you rerun my example and see if you get a crash. And reproduce the commandline/s |
I reran your example but again got the same crash.
C:\bin>ami-search-new -p C:\Users\Rahul\Documents\New_book_chapter\ --dictionary country
Generic values (AMISearchTool)
basename null
Specific values (AMISearchTool)
dictionaryList [country]
cProject: New_book_chapter
running: word; word([frequencies])[{xpath:@count>20}, {w.stopwords:pmcstop.txt stopwords.txt}]
.............Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
I have almost 50 terms in my dictionary; is there a limit on how many terms one can add to their own dictionary? |
No. I have dictionaries with 50,000 terms.
I can't help because no one else has had this error so it seems to be due
to the setup on your machine. I shall be gradually modifying the code to
make it less memory-intensive. But this won't be immediate.
All I can suggest is that you use a different machine. I don't know where
you are setting the memory size but you shouldn't have to. I can't do more
here.
P
--
Peter Murray-Rust
Founder ContentMine.org
and
Reader Emeritus in Molecular Informatics
Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK
|
Thank you very much! I reran the entire process with another set of files and it worked fine, but the ones I was originally working with still crashed. No worries, I'll run the same on another machine. Thanks! |
The files and dictionaries are small so I suspect there is a rogue file (or
combination of files) of some sort. Can you do a binary chop on the files,
e.g.
Split CProject into CProject1 and CProject2.
If the error still persists, recursively split (p1.1, p1.2, p2.1, p2.2) until
you find the smallest set that shows the error. If that is 1 paper it
identifies the article which gives problems. (A sketch of this chop follows
this comment.)
(Less likely) there may be a dictionary entry that gives problems. Make
sure that there are no terms which would have a huge number of hits (e.g.
term="a").
P.
|
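A rough sketch of the binary chop suggested in the comment above. It assumes the CProject is a folder with one subdirectory per paper (the usual getpapers layout); the destination paths are illustrative.

```python
# Split a CProject into two halves so each half can be searched separately.
import shutil
from pathlib import Path

def split_cproject(src: Path, dst1: Path, dst2: Path) -> None:
    """Copy the first half of the paper directories into dst1, the rest into dst2."""
    papers = sorted(p for p in src.iterdir() if p.is_dir())
    half = len(papers) // 2
    for paper in papers[:half]:
        shutil.copytree(paper, dst1 / paper.name)
    for paper in papers[half:]:
        shutil.copytree(paper, dst2 / paper.name)

split_cproject(Path(r"C:\Users\Rahul\Documents\New_book_chapter"),
               Path(r"C:\Users\Rahul\Documents\CProject1"),
               Path(r"C:\Users\Rahul\Documents\CProject2"))
```

Rerun ami-search-new on each half, keep splitting whichever half crashes, and stop when a single paper reproduces the error.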
If you have only 50 dictionary entries, I suggest you post the whole
dictionary here. (A quick check for problem terms is sketched after this
comment.)
|
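Before posting the dictionary, a quick hypothetical sanity check for terms likely to produce an enormous number of hits (single letters, stopwords). The <entry term="..."/> layout assumed here follows the usual ami dictionary format; adjust if the file differs.

```python
# Flag dictionary terms that are very short or very common.
import xml.etree.ElementTree as ET

COMMON = {"a", "an", "and", "in", "of", "or", "the", "to"}

tree = ET.parse(r"C:\Users\Rahul\Documents\New_book_chapter\stress_and_bacteria.xml")
for entry in tree.iter("entry"):
    term = (entry.get("term") or "").strip()
    if len(term) < 3 or term.lower() in COMMON:
        print("suspicious term:", repr(term))
```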
Dear Prof. Peter,
The input for getpapers was:
getpapers -q "((endophytic bacteria) AND (abiotic stress)) AND (PUB_TYPE:"Review" OR PUB_TYPE:"review-article")" -x -k 200 -o path\to\directory
This downloaded a total of 138 papers in XML format.
Next, I created a dictionary with around 50 terms, using the command:
ami-dictionary create --terms "many" "terms" "were" "created" --dictionary name of the dictionary --directory path\to\directory -outformats xml,json,html
After that, I ran the command to search for the terms in my dictionary in the papers I had downloaded, to get the data table and the SVG diagrams:
ami-search-new -p path\to\files\inXML\format --dictionary path\to\my\dictionary
This normalized the XML to HTML format. But then, while the count command was running to calculate the frequency of words, an error was thrown.
Please find attached a screenshot of the error.
Also, before this error was thrown, I got the tables for a test run, but unfortunately the SVG files were not formed.
Could you kindly tell me how I can overcome this error?
The solution I tried was changing the memory allocation for the JVM: I allocated 2 GB of memory to it so that the heap-space error could be overcome, but that did not resolve the predicament.
I hope the error gets resolved soon so that I can start my work.
Best,
Rahul
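For anyone reproducing this, a small sketch that looks for unusually large papers in the corpus, since a rogue oversized file is one plausible trigger for the heap exhaustion discussed above; the fulltext.xml naming is the usual getpapers convention and the path is a placeholder.

```python
# List the ten largest fulltext.xml files in the CProject.
from pathlib import Path

corpus = Path(r"C:\Users\Rahul\Documents\New_book_chapter")
sizes = sorted(((f.stat().st_size, f) for f in corpus.rglob("fulltext.xml")),
               reverse=True)
for size, path in sizes[:10]:
    print(f"{size / 1e6:8.2f} MB  {path}")
```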