[yt-dev] parallelHOP in yt-3

Britton Smith brittonsmith at gmail.com
Tue Apr 5 00:22:06 PDT 2016


Hi Josh,

On a few occasions some years ago, I attempted to run parallelHOP on some
1536^3 data (I suspect the same data you refer to) and encountered some
extremely strange MPI errors that ultimately convinced me it would never
work.  I think that if you try it at that scale, you will hit the same
errors and see immediately what I'm talking about.  In general, I would
discourage you from trying to integrate parallelHOP into yt-3.0, as I
think it will be a complicated procedure and the result will likely not
scale to very large datasets.

My advice would be to reconsider how necessary it is to include star
particles in your halo finding.  Any halo that hosts stars should still be
dominated in mass by its dark matter, so you would likely get very similar
results.  Unless it is truly crucial to include the star particles,
Rockstar really is the way to go.  There is simply nothing else around, at
least integrated into yt, that will scale to the sizes you need.  Even so,
I ran Rockstar on some 1536^3 data and it was still a huge, huge endeavor.
If the 1536^3 data you refer to is the same data that I might have access
to, then please contact me off-list and I can provide you with what I
have.
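
For reference, here is roughly how I would point Rockstar at the dark
matter only, through the yt-3 halo analysis interface.  Treat it as an
untested sketch: the dataset path is a placeholder, the particle_type == 1
test assumes the Enzo convention for dark matter, and you will want to
tune num_readers/num_writers for your machine.

    import yt
    from yt.analysis_modules.halo_analysis.api import HaloCatalog
    from yt.data_objects.particle_filters import add_particle_filter

    yt.enable_parallelism()  # rockstar has to be launched under mpirun

    # Keep only dark matter particles (Enzo tags them particle_type 1).
    def dark_matter(pfilter, data):
        return data[(pfilter.filtered_type, "particle_type")] == 1

    add_particle_filter("dark_matter", function=dark_matter,
                        filtered_type="all", requires=["particle_type"])

    ds = yt.load("DD0064/DD0064")  # placeholder dataset
    ds.add_particle_filter("dark_matter")

    hc = HaloCatalog(data_ds=ds, finder_method="rockstar",
                     finder_kwargs={"particle_type": "dark_matter",
                                    "num_readers": 1, "num_writers": 1})
    hc.create()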

Britton

On Tue, Apr 5, 2016 at 12:31 AM, Josh Moloney <Joshua.Moloney at colorado.edu>
wrote:

> I've recently been trying to get parallelHOP integrated into the current
> version of yt-3.  (I need halo finding with star particles, which I think
> rules out Rockstar, and I eventually need to work with grids up to
> 1536^3, which would be problematic with regular HOP.)
>
> I currently have it working properly for single-core runs, and it
> produces output that matches the yt-2.x version.  It also runs to
> completion under MPI and produces mostly sensible output.
>
> However, when run on multiple cores it has problems with halos near the
> region boundaries.  In particular, it produces a large number (~10% of
> the total) of typically small "bad halos" that have zero values for the
> halo maximum density and its location (i.e. [0, 0, 0, 0]), and at least
> one negative value for the center-of-mass location (due to incorrect
> handling of periodicity).  In addition, some of the "real" halos near the
> region boundaries have fewer particles in parallel runs than in serial
> ones.  Comparing halo particle lists between serial and MPI runs shows
> that some of these missing particles end up in the "bad halos".  Running
> with premerge=False improves the situation slightly (slightly fewer "bad
> halos", and better agreement in some of the real ones).
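>
> For what it's worth, the negative center-of-mass values look like a
> missing minimum-image wrap.  Here is a standalone sketch of what I
> believe the correct periodic center of mass should be (not the actual
> parallelHOP code, just the convention I'm checking against):
>
>     import numpy as np
>
>     def periodic_center_of_mass(pos, mass, box=1.0):
>         """Center of mass of particles in a periodic box of side `box`.
>
>         Assumes pos is an (N, 3) array with values in [0, box).
>         """
>         ref = pos[0]
>         d = pos - ref
>         d -= box * np.round(d / box)  # minimum-image offsets from ref
>         com = ref + np.average(d, axis=0, weights=mass)
>         return np.mod(com, box)       # wrapped back into [0, box)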
>
> I think this means that the halo finder isn't correctly joining up some
> chains across processors, but I'm not familiar enough with the
> parallelHOP code to know where to look first.  The only changes I've made
> are in halo_objects.py, updating the old self.hierarchy.region_strict and
> similar calls, as well as fixing units in a few places.  I haven't made
> any changes in parallel_hop_interface.py.
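>
> To be concrete, the sort of change I made looks like the following (a
> sketch with a made-up dataset path, not the literal diff; as far as I
> can tell, ds.region is the closest yt-3 equivalent of the old
> hierarchy.region_strict, and positions now come back as unitful arrays):
>
>     import yt
>
>     ds = yt.load("DD0064/DD0064")  # placeholder dataset
>     # yt-2.x:  reg = pf.h.region_strict(center, left_edge, right_edge)
>     # yt-3.x:
>     reg = ds.region(ds.domain_center, ds.domain_left_edge,
>                     ds.domain_right_edge)
>     # units: put positions in code_length before comparing to yt-2 output
>     pos = reg["all", "particle_position"].in_units("code_length")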
>
> Has anyone run into a similar problem before?  If integrating parallelHOP
> is known to be unworkable (or extremely difficult), I'll start looking at
> other options.  Otherwise, if anyone has some insight into where the
> problem might lie or what to look at first, I'd appreciate any advice I
> can get.  Thanks.
>      - Josh