[yt-users] timing parallel runs

Matthew Turk matthewturk at gmail.com
Mon Mar 7 10:33:51 PST 2011


Hi Dave,

On Mon, Mar 7, 2011 at 11:55 AM, Dave Semeraro
<semeraro at ncsa.illinois.edu> wrote:
>
> Hi there,
>
> I am trying to get a feel for how yt scales in parallel. I am using Python's time() function to wrap individual parts of a yt script. For example, I do this:
>
> import time
>
> start = time.time()
> pc.add_slice("Density", 0)
> end = time.time()
> slicetime = end - start
> print "slice took %f seconds" % slicetime
>
> Each rank does this, and I get a variety of times across ranks. I am not seeing any difference in the max time with the number of processes, however. For example, if the max slice time across ranks is 0.8 seconds with 8 processors, it is still 0.8 seconds with 16 processors. So I must be doing something wrong. Has anybody done this before?
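
One quick note on the measurement itself: since every rank reports its
own wall-clock time, it helps to reduce to a single number per run (the
max across ranks) before comparing processor counts.  Here's a minimal
sketch, assuming mpi4py is importable in the same job; pc and add_slice
are just the lines from your script, and the reduction is the new part:

import time
from mpi4py import MPI

comm = MPI.COMM_WORLD

comm.Barrier()                      # start all ranks together
start = time.time()
pc.add_slice("Density", 0)          # the operation being timed
elapsed = time.time() - start

# The slowest rank sets the wall-clock time, so compare the max.
max_time = comm.reduce(elapsed, op=MPI.MAX, root=0)
if comm.rank == 0:
    print "slice took %f seconds (max over %d ranks)" % (max_time, comm.size)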

I'm actually also working on some more detailed scaling studies, but
I've been stymied lately by issues at a few supercomputer centers.  My
repository for these studies is here:

https://bitbucket.org/MatthewTurk/yt_profiling/

(My plan is to fold the scaling results into the answer testing suite.)

I am wondering if perhaps there's just not enough work to distribute
across the processors, and if the dominant cost here is generating the
data objects themselves.  Can you tell us a bit about the simulation,
and how many processors you are running on?  I believe this operation
should be conducted in parallel.

My initial suspicion was that the slice was load-on-demand, but having
just examined it I think it does in fact touch the disk and communicate
between processors.  There's just not a lot of work involved in
slicing, I suppose.

An alternative that would exercise the parallelism more effectively is
2D profiling.  (1D profiling is, interestingly enough, *slower* than 2D
profiling.)  A profile ensures that every grid gets touched by the
computation.
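
As a rough sketch of what I mean, using the yt 2.x PlotCollection
interface (the dataset path is a placeholder, and the add_phase_sphere
call is worth double-checking against the docs for your version):

import time
from yt.mods import *

pf = load("DD0010/data0010")   # placeholder dataset path
pc = PlotCollection(pf)

start = time.time()
# A 2D phase profile over a sphere gives the decomposition real work:
# every grid intersecting the sphere has to be read and binned.
pc.add_phase_sphere(1.0, "mpc", ["Density", "Temperature", "CellMassMsun"])
print "2D profile took %f seconds" % (time.time() - start)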

For timing, you could also look into the timing_counters Stephen Skory
has put in.  They are used quite a bit in the Parallel HOP code, under
analysis_modules/halo_finding, which shows how to set up nested
counters and so on.
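
From memory, the basic pattern looks roughly like the sketch below; the
exact module and method names are worth verifying against the Parallel
HOP source, and I believe the counters may be gated by a timing option
in your yt configuration:

from yt.utilities.performance_counters import yt_counters

yt_counters("slice")           # first call with a name starts that counter
pc.add_slice("Density", 0)
yt_counters("slice")           # second call with the same name stops it

yt_counters.print_stats()      # print the accumulated timings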

-Matt

>
> Dave


