[yt-users] parallel_objects with projection hanging

Semyeong Oh semyeong.oh at gmail.com
Sat Dec 6 17:47:27 PST 2014


Hi Matt,

Counting the "Opening MPI Barrier on XX" lines in the log, I think what happens is
that a barrier is somehow opened after a projection. As a result, the process that does the projections opens a different
number of barriers from the process that’s idle, which only opens a barrier because of this statement in parallel_objects:
    if barrier:
        my_communicator.barrier()
and this is why it hangs after the second projection.

Setting barrier=False on parallel_objects wouldn’t work either, because then it hangs at the barrier after the first projection.
Is this expected, and is there a workaround?
I couldn’t pinpoint exactly why this happens, but I’m guessing it has something to do with the fact
that yt supports running the actual projection in parallel, rather than only using parallel_objects to parallelize over multiple objects.
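
A rough way to check this reasoning is to count barrier calls per rank. The sketch below is only a model of the hypothesis above, not yt code: the function name barrier_calls_per_rank, the round-robin assignment of objects to ranks, and the assumption of exactly one barrier per projection are all illustrative guesses.

```python
def barrier_calls_per_rank(n_objects, n_procs, projections_per_object=2,
                           final_barrier=True):
    """Count barrier calls per rank under the hypothesis described above."""
    counts = [0] * n_procs
    for obj in range(n_objects):
        rank = obj % n_procs  # ASSUMPTION: objects are handed out round-robin
        # ASSUMPTION: each projection opens exactly one barrier on that rank
        counts[rank] += projections_per_object
    if final_barrier:
        # The `if barrier:` statement in parallel_objects runs once on every rank
        counts = [c + 1 for c in counts]
    return counts


# 3 objects on 2 ranks: rank 0 handles two objects, rank 1 only one
print(barrier_calls_per_rank(3, 2))  # [5, 3]
# 4 objects on 2 ranks: both ranks handle two objects
print(barrier_calls_per_rank(4, 2))  # [5, 5]
```

Under these assumptions, 3 objects on 2 ranks give unequal counts, so one rank ends up waiting at a barrier that the other never reaches, while 4 objects on 2 ranks give equal counts. With final_barrier=False the counts for 3 objects are still unequal, which would be consistent with the hang moving to an earlier barrier.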

Semyeong

> On Dec 6, 2014, at 7:39 PM, Semyeong Oh <semyeong.oh at gmail.com> wrote:
> 
> Hi Matt,
> 
> My output from using 2 processors on 3 objects is here https://gist.github.com/smoh/6e396a7606a3bbff3450
> I’ve set loglevel to 1.
> The first two objects ran fine, printing some output
> 500 …
> 501 …
> Then the third object starts running on rank 0, while rank 1 sleeps. The projection happens, but after this line,
> P000 yt : [DEBUG    ] 2014-12-06 19:03:48,494 Opening MPI Barrier on 0
> the process never ends.
> The rest of the output is from sending SIGUSR1 to each process, which I can hardly interpret.
> (There are two projections involved in my calculation)
> 
> Any clues?
> 
> Thanks,
> Semyeong
> 
> 
>> On Dec 6, 2014, at 11:53 AM, Matthew Turk <matthewturk at gmail.com> wrote:
>> 
>> Hi Semyeong,
>> 
>> This is somewhat odd.  When you say the process hangs, do you mean
>> that the process of projection hangs, or the yt script as a whole
>> hangs?  You should be able to send SIGUSR1 to the processes to get a
>> stack trace, which may help with debugging.  Or, if you Ctrl-C, it may
>> output a stack trace, which will help see where it's hanging.
>> 
>> -Matt
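
For readers unfamiliar with the signal-based approach, here is a minimal sketch of how a SIGUSR1 handler can report where a Python process currently is. It uses only the standard library and is not yt's own handler; the names dump_stack, captured, and busy_function are hypothetical, and the example assumes a Unix-like system where SIGUSR1 exists.

```python
import os
import signal
import traceback

captured = []  # hypothetical sink; a real script would more likely write to stderr


def dump_stack(signum, frame):
    # Format the stack of whatever code was running when the signal arrived.
    captured.append("".join(traceback.format_stack(frame)))


# Install the handler for SIGUSR1 (Unix only).
signal.signal(signal.SIGUSR1, dump_stack)


def busy_function():
    # Stand-in for code that appears to hang. Sending the signal to our own
    # pid has the same effect as running `kill -USR1 <pid>` from a shell.
    os.kill(os.getpid(), signal.SIGUSR1)


busy_function()
print(captured[-1])
```

In an MPI run, each rank is a separate process with its own pid, so the signal has to be sent to every rank to see where each one is waiting.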
>> 
>> On Sat, Dec 6, 2014 at 5:14 AM, Semyeong Oh <semyeong.oh at gmail.com> wrote:
>>> Hi yt,
>>> 
>>> I have two questions on using parallel_objects. I am using yt 2.6.
>>> 
>>> 1. I have a problem of parallel_objects hanging at the end.
>>> 
>>> def do(i, pf):
>>>  cube = pf.h.region(..)
>>>  proj = pf.h.proj(…., source=cube)
>>>  frb = proj.to_frb(..)
>>>  ….
>>> 
>>> objects = [list of indices]
>>> pf = load(..)
>>> for i in parallel_objects(objects):
>>>   do(i, pf)
>>> 
>>> and I run the script as
>>> mpirun -np Nprocs python myscript.py --parallel
>>> 
>>> When I tested with a simple print operation in do instead of proj, parallel_objects seemed to handle
>>> cases where Nobjects is not divisible by Nprocs just fine. But with my real script, which has proj in do, it seems to hang at the end. For example, if Nobjects is 3 and Nprocs is 2, the first two objects run without problems and the projection of the third completes, but then the process hangs there. Why is that?
>>> 
>>> 2. Is it possible to use only a portion of the Nprocs assigned? Also from playing around with a simple print operation, it seems that because of the way parallel_objects divides the work, some work is duplicated. E.g., when I run mpirun -np 5 but call parallel_objects(objects, njobs=3), I get
>>> rank i_object
>>> 0 1
>>> 1 1
>>> 2 2
>>> 3 2
>>> 4 3
>>> so object 1 would still run simultaneously on rank 0 and 1.
>>> 
>>> To prevent this, would something like below work?
>>> 
>>> from mpi4py import MPI  # assuming MPI here refers to mpi4py
>>> size = MPI.COMM_WORLD.Get_size()
>>> rank = MPI.COMM_WORLD.Get_rank()
>>> njobs = 3
>>> for ind in parallel_objects(objects, njobs):
>>>   if rank % int(size/njobs) != 0:
>>>       continue
>>>   else:
>>>       do(ind)
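
The rank test in the loop above can be checked without MPI by evaluating it for every rank. This is only a sketch of that arithmetic; the helper name ranks_that_call_do is hypothetical, and it says nothing about how parallel_objects itself assigns objects to ranks.

```python
def ranks_that_call_do(size, njobs):
    # Same test as in the loop above: a rank calls do() only when
    # rank % int(size / njobs) == 0.
    stride = int(size / njobs)
    return [rank for rank in range(size) if rank % stride == 0]


print(ranks_that_call_do(5, 3))  # stride is 1, so no rank is filtered out
print(ranks_that_call_do(6, 3))  # stride is 2, so only ranks 0, 2, 4 remain
```

For the -np 5, njobs=3 case in the table, int(5/3) is 1, so every rank passes the test and the duplication is not removed. When size is smaller than njobs the stride is 0 and the modulo raises ZeroDivisionError. Whether skipping do() on some ranks is safe when do() contains a projection is a separate question that this check does not address.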
>>> 
>>> Thanks,
>>> Semyeong
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> yt-users mailing list
>>> yt-users at lists.spacepope.org
>>> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org