[Yt-dev] Parallelism

Eric Hallman hallman13 at gmail.com
Thu Aug 19 12:58:11 PDT 2010


Brian and Matt,
   FYI, I have had similar issues with a 1024^3 unigrid dataset.  I
was able to get parallel HOP working eventually, but it is a memory
hog and not super fast.  It required about the same number of cores
and the same method of running on Ranger that Brian describes.

Eric
On Aug 19, 2010, at 3:39 PM, Brian O'Shea wrote:

> Hi Matt,
>
> As you know (since we discussed it off-list), I'm the reason for  
> this being mentioned to you.  I had some pretty horrible problems  
> with the various incarnations of HOP in yt being excruciatingly slow  
> and consuming huge amounts of memory for a 1024^3 unigrid dataset,  
> to the point where my grad student and I ended up just using
> P-GroupFinder, the standalone halo finder that comes with week-of-code
> enzo.  Note that when I say "excruciatingly slow" and "consuming  
> huge amounts of memory", I mean that when we used 256 nodes on  
> Ranger, with 2 cores/node (so 512 cores total) for the 1024^3  
> dataset, it still ran Ranger out of memory, or, alternately, didn't  
> finish in 24 hours.  Various permutations of cores per node, total  
> nodes, and wall clock time all resulted in either seg faults or the  
> code running out the wall clock time, to the tune of us wasting half  
> a million CPU hours trying to do halo-finding via yt for this  
> dataset.  That's not cool.  P-GroupFinder, in comparison, generated  
> the halo catalog for the same dataset in about 10 minutes on 256  
> processors.  The difference in performance is striking, to say the  
> least.
>
> We also had serious problems with the projections taking
> significantly more time and memory than one might think they should  
> based on my old standalone tools, but this is already being dealt  
> with.  Slices seemed to work just fine, and other things like PDFs  
> seem to work fine as well.
>
> One reason that I mentioned this to Mike Norman (presumably he is  
> the person who mentioned the yt thing to you) is that when we were  
> at the TeraGrid conference a couple of weeks ago, the subject of
> inline data analysis came up as relating to our planned Blue Waters  
> unigrid and AMR runs.  I expressed reservations about whether the
> current version of yt would be an effective solution at the scales we need
> (4096^3 unigrid run, roughly 1024^3 refine-everywhere AMR runs),  
> based on my recent experiences with the code.  While I am on the
> yt-dev mailing list, you know that I'm not actively developing yt (and
> maybe would be considered a novice user, at best), so I could simply  
> be 100% wrong in my concerns. Maybe we could run some performance  
> tests?  I have a 1024^3 unigrid dataset that seems to be yt's White  
> Whale...
>
>
> --Brian
>
>
> On Thu, Aug 19, 2010 at 2:57 PM, Matthew Turk  
> <matthewturk at gmail.com> wrote:
> Hi all,
>
> Today at a meeting, it was mentioned that perhaps yt is having trouble
> with parallelism.  To everyone out there: how reflective is this of
> your experience?  Is yt okay with parallelism?  (Excluding
> projections, which I have a new engine ready to go on.)
>
> -Matt
> _______________________________________________
> Yt-dev mailing list
> Yt-dev at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org

Eric Hallman
Google Voice: (774) 469-0278
hallman13 at gmail.com



