[yt-dev] missing objects using parallel_objects()
Geoffrey So
gsiisg at gmail.com
Fri Mar 23 01:14:00 PDT 2012
Hi everyone,
I am trying to improve the efficiency of my analysis script, which
calculates attributes of haloes. After watching the workshop video on yt
parallelism, I was motivated to give parallel_objects a try. I am
basically trying to calculate, then output, some properties of each halo
found by parallel HOP. It turns out that even if I just output the (DM
particle) mass of each halo, I am missing one or more haloes. It doesn't
matter whether I run this in serial or in parallel: I end up missing the
same number of haloes if I use parallel_objects() like this:
haloes = LoadHaloes(pf, HaloListname)
for sto, halo in parallel_objects(haloes, num_procs, storage = my_storage):
to iterate over the haloes, and the problem goes away if I just switch to:
for halo in haloes:
I noticed this when I tried it on an 800 cube dataset with around 50k
haloes and got only 4k haloes back. I then tried to narrow things down
and ruled out the way I am calculating the attributes: even if I
just output the mass from halo.total_mass(), which is basically read in from
the .h5 file, I still end up missing haloes when using parallel_objects. For
a 128 cube dataset with 85 haloes, I am missing 3 and get 82 back, and
for a 64 cube dataset with 22 haloes, I get back 21.
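One way to pin down which haloes are dropped, rather than only how many, is
to record the ID of every halo the loop actually visits and compare that
against the full list. The following is a minimal sketch in plain Python, not
yt code: lossy_iter is a hypothetical stand-in for whatever iterator is under
suspicion (here it simply drops the last item), and find_missing is an
illustrative helper name. In a real script the visited IDs would be collected
inside the parallel_objects loop instead.

```python
def find_missing(expected_ids, visited_ids):
    """Return the sorted IDs that were expected but never visited."""
    return sorted(set(expected_ids) - set(visited_ids))


def lossy_iter(items):
    """Hypothetical stand-in for an iterator that silently drops the last item."""
    for item in items[:-1]:
        yield item


# 22 IDs, matching the halo count of the 64 cube dataset described above.
halo_ids = list(range(22))

# Record every ID the suspect iterator actually hands back.
visited = [halo_id for halo_id in lossy_iter(halo_ids)]

# Compare against the full list to see exactly which IDs were skipped.
missing = find_missing(halo_ids, visited)
print(len(visited), "visited; missing IDs:", missing)
```

Knowing the specific IDs that go missing (for example, always the last one, or
every n-th one) should make it easier to tell whether the loss follows a
pattern in how the objects are distributed.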
Has anyone else encountered this behavior or can confirm it?
From
G.S.