John Chodera edited this page Mar 29, 2016 · 10 revisions

Why is my job not running?

To see what resources your job is waiting on, try running checkjob:

checkjob -v -v -v <jobid>

To see the latest time at which your job is expected to start (it may start earlier), use showstart:

showstart <jobid>

The mdiag command may provide useful information if nothing else seems to be helping:

mdiag -j <jobid>

For more information, see useful Torque/Moab commands for managing jobs.
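The three commands above can be run in one pass with a small wrapper. This is only a sketch: the function name diagnose_job is hypothetical, and it assumes checkjob, showstart, and mdiag are on your PATH (any that are missing are reported and skipped).

```shell
#!/bin/bash
# diagnose_job is a hypothetical helper name, not part of Torque/Moab.
# It runs each diagnostic from this section for a single job ID.
diagnose_job() {
    local jobid="$1"
    if [ -z "$jobid" ]; then
        echo "usage: diagnose_job <jobid>" >&2
        return 1
    fi
    local cmd tool
    for cmd in "checkjob -v -v -v" "showstart" "mdiag -j"; do
        tool="${cmd%% *}"   # first word of the command line
        echo "== $cmd $jobid =="
        if command -v "$tool" >/dev/null 2>&1; then
            $cmd "$jobid"   # intentionally unquoted so the flags split into words
        else
            echo "$tool: not available on this host"
        fi
    done
}
```

After sourcing the function, call it as diagnose_job <jobid>.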

Why does Java die with memory errors?

When running java on the login node (hal.cbio.mskcc.org or mskcc-ln1), you might see:

[username@mskcc-ln1 ~]$ java
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.

or

java --version
Unrecognized option: --version
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

To keep users from running compute-intensive processes on the login node, restrictive shell limits have been set. By default, java tries to reserve more memory than these limits allow, so it terminates immediately.
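To see the limits in effect for your current shell, you can query the bash ulimit builtin. The sketch below is illustrative: the function name show_memory_limits is hypothetical, and which of these limits actually stops the JVM on the login node is an assumption (the virtual memory limit is a likely candidate).

```shell
#!/bin/bash
# show_memory_limits is a hypothetical helper name.
# It prints the per-process limits most likely to matter for JVM startup.
# Assumption: the virtual memory limit is the one the JVM runs into.
show_memory_limits() {
    echo "virtual memory (kB): $(ulimit -v)"
    echo "max resident set (kB): $(ulimit -m)"
    echo "max user processes: $(ulimit -u)"
}
show_memory_limits
```

A value of "unlimited" means no limit is set for that resource. Comparing the output on the login node with the output inside a job should show the difference.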

To use Java interactively, request an interactive job instead, e.g.:

qsub -I -q active -l walltime=04:00:00 -l nodes=1:ppn=1

You will then be able to run Java normally.

You can also use Java through the batch queue system.
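A batch submission might look like the sketch below. The job name, the memory request, and the heap size are assumptions chosen for illustration, and no queue is specified because the batch queue name is not given here; adjust these for your workload.

```shell
#!/bin/bash
# Hypothetical Torque batch script; resource values are illustrative assumptions.
#PBS -N java-example
#PBS -l walltime=04:00:00
#PBS -l nodes=1:ppn=1
#PBS -l mem=2gb
#PBS -j oe

# Torque sets PBS_O_WORKDIR to the directory qsub was run from;
# fall back to the current directory when run outside a job.
cd "${PBS_O_WORKDIR:-.}"

# Keep the JVM heap below the memory requested above (assumed value).
JAVA_HEAP="1g"

if command -v java >/dev/null 2>&1; then
    java -Xmx"$JAVA_HEAP" -version
else
    echo "java not found on this node" >&2
fi
```

Save this to a file and submit it with qsub <scriptname>. Replace the -version call with your own class or jar once the job starts cleanly.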