Hi, it's me again (back from the holidays) with some more questions about QCFractal. I am transitioning from psi4 to ORCA 6.0.1 and I wrote a QCEngine implementation (which I will open source soon in a PR) to use it in the infrastructure.
The serial implementation works flawlessly, but when trying to use ORCA's parallel features I encountered some difficulties.
I made some modifications to QCFractal and the underlying Parsl to use the `HighThroughputExecutor` with the `SlurmProvider`, to allow jobs with a higher number of tasks. As an example, a config of the following type,
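roughly this shape on the Parsl side (a sketch only; the executor label, partition name, and core counts below are illustrative, not my exact values):

```python
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider

config = Config(
    executors=[
        HighThroughputExecutor(
            label="orca_htex",        # hypothetical label
            cores_per_worker=4,       # one worker per 4-core ORCA calculation
            max_workers_per_node=2,   # 8-core nodes -> 2 calculations per node
                                      # (named max_workers on older Parsl releases)
            provider=SlurmProvider(
                partition="compute",  # hypothetical partition
                nodes_per_block=1,
                cores_per_node=8,
                init_blocks=1,
                min_blocks=0,
                max_blocks=4,         # allow scale-out to more nodes
                walltime="24:00:00",
                launcher=SrunLauncher(),
            ),
        )
    ],
)
```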
with ORCA automatically configured to use 4 CPUs per calculation.
This works correctly on a single node, but there are some conflicts with the `ComputeManager` and its accounting of active tasks and open slots.
It seems that when Parsl/`ComputeManager` spawns another node because of the high number of open slots, extra tasks get over-allocated onto the nodes' CPUs, resulting in 4 calculations/tasks per node instead of 2. I am diving deeper into Parsl and how the manager uses it, but so far this issue has been tricky to solve. Do you have any recommendations?
I also tried fixing the maximum number of open slots to avoid the over-allocation, but then there is never any scaling to request new nodes and I am usually stuck at 1 node.
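My rough mental model of why that happens (a simplification of the scaling logic, not Parsl's actual code):

```python
def should_scale_out(outstanding_tasks: int, open_slots: int) -> bool:
    """Simplified scaling trigger: request another block (node) only when
    the task backlog exceeds what the current slots can hold."""
    return outstanding_tasks > open_slots

# If the manager never claims more tasks than it has open slots,
# the backlog never exceeds capacity and no new node is ever requested:
print(should_scale_out(outstanding_tasks=2, open_slots=2))  # False -> stuck at 1 node
```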
I'm not entirely sure, but the current manager is not particularly MPI-friendly, as you are finding out :) . We do assume that the number of slots is `num_nodes * ntasks_per_node`, but it sounds like you want to do (non-hybrid?) MPI.
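Concretely, guessing at your numbers (8-core nodes, 4 cores per ORCA calculation; these values are assumed from your description, not taken from your config), the mismatch would look something like:

```python
num_nodes = 1
cores_per_node = 8
ntasks_per_node = 4   # what the Slurm config advertises
cores_per_calc = 4    # what each ORCA calculation actually consumes

# What the manager assumes today:
open_slots = num_nodes * ntasks_per_node                        # -> 4 tasks claimed

# What actually fits on the hardware:
true_capacity = num_nodes * (cores_per_node // cores_per_calc)  # -> 2 tasks
```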
I will have to think about whether there's an easy way to shove this into the manager, and I need to consult the Parsl docs as well. I would certainly be interested if this could be done in a backwards-compatible way.