[Yt-dev] Parallelism
Eric Hallman
hallman13 at gmail.com
Thu Aug 19 12:58:11 PDT 2010
Brian and Matt,
FYI, I have had similar issues with a 1024^3 dataset (unigrid).
I was able to get the parallel HOP working eventually, but it is a
memory hog and not super fast. It required about the same number of
cores and method of running on ranger that Brian describes.
Eric
On Aug 19, 2010, at 3:39 PM, Brian O'Shea wrote:
> Hi Matt,
>
> As you know (since we discussed it off-list), I'm the reason for
> this being mentioned to you. I had some pretty horrible problems
> with the various incarnations of HOP in yt being excruciatingly slow
> and consuming huge amounts of memory for a 1024^3 unigrid dataset,
> to the point where my grad student and I ended up just using
> P-GroupFinder, the standalone halo finder that comes with
> week-of-code enzo. Note that when I say "excruciatingly slow" and
> "consuming huge amounts of memory", I mean that when we used 256
> nodes on Ranger, with 2 cores/node (so 512 cores total) for the 1024^3
> dataset, it still ran Ranger out of memory, or, alternately, didn't
> finish in 24 hours. Various permutations of cores per node, total
> nodes, and wall clock time all resulted in either seg faults or the
> code running out the wall clock time, to the tune of us wasting half
> a million CPU hours trying to do halo-finding via yt for this
> dataset. That's not cool. P-GroupFinder, in comparison, generated
> the halo catalog for the same dataset in about 10 minutes on 256
> processors. The difference in performance is striking, to say the
> least.
>
> We also had serious problems with the projections taking
> significantly more time and memory than one might think they should
> based on my old standalone tools, but this is already being dealt
> with. Slices seemed to work just fine, and other things like PDFs
> seem to work fine as well.
>
> One reason that I mentioned this to Mike Norman (presumably he is
> the person who mentioned the yt thing to you) is that when we were
> at the Teragrid conference a couple of weeks ago, the subject of
> inline data analysis came up as relating to our planned Blue Waters
> unigrid and AMR runs. I expressed reservations about whether the
> current version of yt would be an effective solution at the scales
> we need (4096^3 unigrid run, roughly 1024^3 refine-everywhere AMR
> runs), based on my recent experiences with the code. While I am on
> the yt-dev mailing list, you know that I'm not actively developing
> yt (and maybe would be considered a novice user, at best), so I
> could simply be 100% wrong in my concerns. Maybe we could run some
> performance tests? I have a 1024^3 unigrid dataset that seems to be
> yt's White Whale...
>
>
> --Brian
>
>
> On Thu, Aug 19, 2010 at 2:57 PM, Matthew Turk
> <matthewturk at gmail.com> wrote:
> Hi all,
>
> Today at a meeting, it was mentioned that perhaps yt is having trouble
> with parallelism. To everyone out there: how reflective is this of
> your experience? Is yt okay with parallelism? (Excluding
> projections, which I have a new engine ready to go on.)
>
> -Matt
> _______________________________________________
> Yt-dev mailing list
> Yt-dev at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>