[yt-users] error while submitting development queue job to STAMPEDE supercomputer

Nathan Goldbaum nathan12343 at gmail.com
Fri May 27 06:53:32 PDT 2016


On Thursday, May 26, 2016, turhan nasri <turhannasri at gmail.com> wrote:

> Hi people,
>
> I am not sure whether this is the right place to ask, but I submitted a
> development queue job to the STAMPEDE supercomputer, using the yt toolkit
> in my script, and got the following error after about half an hour of
> running the job.
>
> [c557-702.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process
> (rank: 0, pid: 26868) terminated with signal 9 -> abort job
>

This is the important bit. Signal 9 is SIGKILL, so if I had to guess, I'd
say that your job ran out of memory and the operating system on the compute
node killed it.

Can you request a node with more RAM?
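
To check whether memory really is the problem, one option is to log the
process's peak memory use at a few points in the script and see how close it
gets to the node's RAM before the job dies. A minimal sketch using only the
Python standard library; the function names and the labels are made up for
illustration, and it assumes a Unix-like compute node where the `resource`
module is available:

```python
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes; macOS reports it in bytes.
    if sys.platform == "darwin":
        return peak / (1024 * 1024)
    return peak / 1024


def log_peak_memory(label):
    """Print the peak memory so far, tagged with a hypothetical step label."""
    # flush=True so the line reaches the job log even if the process is
    # killed with SIGKILL shortly afterwards.
    print(f"[{label}] peak RSS so far: {peak_rss_mib():.1f} MiB", flush=True)


if __name__ == "__main__":
    log_peak_memory("start")
    data = [0.0] * 5_000_000  # stand-in for loading a dataset
    log_peak_memory("after allocation")
```

Calling `log_peak_memory` after each expensive step (loading the dataset,
building a projection, and so on) shows which step drives the memory use. In
an MPI run each rank reports its own value, so the numbers add up per node.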



> [c557-702.stampede.tacc.utexas.edu:mpirun_rsh][process_mpispawn_connection]
> mpispawn_0 from node c557-702 aborted: MPI process error (1)
> TACC: MPI job exited with code: 1
>
>
> Can anyone please shed light on this error? With the development queue
> the maximum code runtime is 2 hours.
> Thanks in advance.
>
>
>
> -Turhan
>