[yt-dev] Call for testing: Projection performance

Sam Skillman samskillman at gmail.com
Thu May 3 15:47:21 PDT 2012


Meant to include the scaling image.

On Thu, May 3, 2012 at 4:44 PM, Sam Skillman <samskillman at gmail.com> wrote:

> Hi Matt & friends,
>
> I tested this on a fairly large nested simulation with about 60k grids,
> using 6 nodes of Janus (dual hex-core nodes), and ran on 1 to 64
> processors.  I got fairly good scaling and made a quick Mercurial repo on
> Bitbucket with everything needed to do a similar study except the dataset.
> https://bitbucket.org/samskillman/quad-tree-proj-performance
>
> Raw timing (processor count vs. wall time in seconds), from
> projects/quad_proj_scale/perf.dat:
> 64 2.444e+01
> 32 4.834e+01
> 16 7.364e+01
> 8 1.125e+02
> 4 1.853e+02
> 2 3.198e+02
> 1 6.370e+02
>
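> A quick way to turn those numbers into speedup/efficiency figures (a
> minimal sketch, not from the repo; it assumes perf.dat holds the two
> columns above, processor count and wall time in seconds):
>
> # Minimal sketch (not from the repo): compute speedup and parallel
> # efficiency from perf.dat, assuming two columns: procs, seconds.
> timings = {}
> with open("perf.dat") as f:
>     for line in f:
>         procs, seconds = line.split()
>         timings[int(procs)] = float(seconds)
>
> t_serial = timings[1]
> for procs in sorted(timings):
>     speedup = t_serial / timings[procs]
>     efficiency = speedup / procs
>     print("%2d procs: %6.1f s, speedup %5.2fx, efficiency %3.0f%%"
>           % (procs, timings[procs], speedup, 100.0 * efficiency))
>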
> A few notes:
> -- I ran the 64-core case twice so that the disks were somewhat warmed
> up, and used only the timing from the second run.
> -- While I did get full nodes, the machine doesn't have many I/O nodes,
> so in an ideal setting performance may be even better.
> -- My guess is that a lot of this speedup comes from having a parallel
> filesystem, so the speedups may not be as good on a laptop.
> -- The speedup from 32 to 64 cores is nearly ideal... this is great.
>
> This looks pretty great to me, and I'd +1 any PR.
>
> Sam
>
> On Thu, May 3, 2012 at 1:42 PM, Matthew Turk <matthewturk at gmail.com> wrote:
>
>> Hi all,
>>
>> I implemented this "quadtree extension" that duplicates the quadtree
>> on all processors, which may make projections scale better.
>> Previously the procedure was:
>>
>> 1) Locally project
>> 2) Merge across procs:
>>  2a) Serialize quadtree
>>  2b) Point-to-point communicate
>>  2c) Deserialize
>>  2d) Merge local and remote
>>  2e) Repeat from 2a
>> 3) Finish
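>>
>> Roughly, the merge in step 2 was a pairwise reduction toward rank 0,
>> something like this sketch (illustrative only; mpi4py and pickle stand
>> in for yt's internals, and the dict "tree" and merge() below are
>> placeholders, not yt's actual API):
>>
>> from mpi4py import MPI
>> import pickle
>>
>> comm = MPI.COMM_WORLD
>> rank, size = comm.Get_rank(), comm.Get_size()
>>
>> local_tree = {(rank, 0): 1.0}   # placeholder for this rank's quadtree
>>
>> def merge(a, b):                # placeholder for the quadtree merge (2d)
>>     for cell, value in b.items():
>>         a[cell] = a.get(cell, 0.0) + value
>>     return a
>>
>> step = 1
>> while step < size:              # repeat up the reduction tree (2e)
>>     if rank % (2 * step) == step:
>>         comm.send(pickle.dumps(local_tree), dest=rank - step)  # 2a, 2b
>>         break
>>     elif rank + step < size:
>>         buf = comm.recv(source=rank + step)                    # 2b
>>         local_tree = merge(local_tree, pickle.loads(buf))      # 2c, 2d
>>     step *= 2
>> # rank 0 ends up holding the fully merged tree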
>>
>> I've added a step 0) which is "initialize entire quadtree", which
>> means all of step 2 becomes "perform sum of big array on all procs."
>> This has good and bad elements: we're still doing a lot of heavy
>> communication across processors, but it will be managed by the MPI
>> implementation instead of by yt.  Also, we avoid all of the costly
>> serialize/deserialize procedures.  So for a given dataset, step 0 will
>> be fixed in cost, but step 1 will be reduced as the number of
>> processors goes up.  Step 2, which is now just one or two
>> communication steps, will increase in cost as the number of
>> processors grows.
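>>
>> Concretely, step 2 collapses to something like this sketch (again
>> illustrative only; a flat NumPy buffer stands in for the flattened
>> quadtree, and mpi4py for yt's communication layer):
>>
>> from mpi4py import MPI
>> import numpy as np
>>
>> comm = MPI.COMM_WORLD
>>
>> # Step 0: every rank allocates the same full quadtree (here, a flat
>> # stand-in buffer of illustrative size).
>> n_cells = 1 << 20
>> local_buf = np.zeros(n_cells, dtype="float64")
>>
>> # Step 1: the local projection fills in this rank's contribution
>> # (omitted here).
>>
>> # Step 2: a single collective element-wise sum across all ranks.
>> merged = np.empty_like(local_buf)
>> comm.Allreduce(local_buf, merged, op=MPI.SUM)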
>>
>> So, it's not clear whether this will *actually* be helpful or not.  It
>> needs testing, and I've pushed it here:
>>
>> bb://MatthewTurk/yt/
>> hash 3f39eb7bf468
>>
>> If anybody out there could test it, I'd be mighty glad.  This is the
>> script I've been using:
>>
>> http://paste.yt-project.org/show/2343/
>>
>> I'd *greatly* appreciate testing results -- particularly for proc
>> combos like 1, 2, 4, 8, 16, 32, 64, ... .  On my machine, the results
>> are somewhat inconclusive.  Keep in mind you'll have to run with the
>> option:
>>
>> --config serialize=False
>>
>> to get real results.  Here's the shell command I used:
>>
>> ( for i in 1 2 3 4 5 6 7 8 9 10 ; do mpirun -np ${i} python2.7 proj.py
>> --parallel --config serialize=False ; done ) 2>&1 | tee proj_new.log
>>
>> Comparison against results from the old method would also be super
>> helpful.
>>
>> The alternative idea I'd had was a bit different, harder to
>> implement, and has a glaring problem.  The idea was to serialize the
>> arrays and do the butterfly reduction, but instead of converting them
>> into data objects, progressively walk Hilbert indices.  Unfortunately,
>> this only works up to an effective size of 2^32,
>> which is not going to work in a lot of cases.
>>
>> Anyway, if this doesn't work, I'd be eager to hear if anybody has any
>> ideas.  :)
>>
>> -Matt
>> _______________________________________________
>> yt-dev mailing list
>> yt-dev at lists.spacepope.org
>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>
>
>
[Attachment: scale.png (the scaling image),
<http://lists.spacepope.org/pipermail/yt-dev-spacepope.org/attachments/20120503/a7b7a02b/attachment-0001.png>]

