[yt-users] domain-decomposition for volume rendering huge brick?

Sam Skillman samskillman at gmail.com
Fri Nov 7 11:33:58 PST 2014


Yep, the volume rendering should build the AMRKDTree itself, and *should*
automatically decompose the giant brick into Np pieces. As for memory, you
may need to (eek) allow for yt casting the data to 64-bit floats, but
you'll have to experiment a bit.
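
For scale, a rough back-of-envelope (the cell count and per-node memory are
the figures quoted later in this thread; nothing yt-specific here):

  n_cells = 1.5e11             # points in the unigrid slab
  gib = 1024.0**3
  print(n_cells * 4 / gib)     # ~560 GiB if the data stay float32
  print(n_cells * 8 / gib)     # ~1120 GiB (~1.1 TiB) if cast to float64
  # At ~64 GiB per node, the float64 case alone needs ~18 nodes before any
  # working space, so over-provisioning the node count is not unreasonable.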

Sam

On Fri Nov 07 2014 at 11:15:13 AM Stuart Levy <salevy at illinois.edu> wrote:

>  Thank you, Sam!   I think this makes sense.   Except, in case (1), do I
> need to do something to bring the AMRKDTree into the picture?   Or are you
> telling me that it is constructed automatically whenever you call
> load_uniform_grid(), or when you volume-render it?
>
> I think the available nodes have 64GB each, so loading the whole ~600GB
> might take at least 32 nodes, or 1024 cores.
>
> Will let you know how it goes!
>
>
> On 11/7/14 11:08 AM, Sam Skillman wrote:
>
> Ack, my estimate of 256-512 cores is probably low... feel free to push
> much higher.
>
> On Fri Nov 07 2014 at 9:03:51 AM Sam Skillman <samskillman at gmail.com>
> wrote:
>
>> Hi Stuart,
>>
>>  On Thu Nov 06 2014 at 8:36:28 AM Stuart Levy <salevy at illinois.edu>
>> wrote:
>>
>>> Hello all,
>>>
>>> We're hoping to use yt parallel volume rendering on a very large generic
>>> brick - it's a simple rectangular unigrid slab, but it contains something
>>> like 1.5e11 points, so it is much too large for load_uniform_grid() to
>>> load into memory on a single machine.
>>>
>>
>>    Are you loading directly using something like numpy.fromfile?  If so,
>> I think the easiest method would be to replace that with np.memmap (
>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.html).
>> Once that is set up, you should be able to use load_uniform_grid.
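>>
>> Something along these lines (an untested sketch -- the file name, shape,
>> and dtype are placeholders, and this assumes the yt 3.x style
>> load_uniform_grid; adapt to your data):
>>
>>   import numpy as np
>>   import yt
>>
>>   shape = (5000, 5000, 6000)                  # placeholder ~1.5e11 cells
>>   density = np.memmap("slab.dat", dtype=np.float32,
>>                       mode="r", shape=shape)  # nothing read from disk yet
>>
>>   data = dict(density=density)
>>   ds = yt.load_uniform_grid(data, shape)      # treated as one big grid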
>>
>>  At that point, there are two possible routes; either may or may not
>> work well.
>>
>>  1) Just try rendering with ~256-512 cores, and the AMRKDTree should try
>> to geometrically split the grid before performing any I/O.
>> or
>> 2) Use load_uniform_grid with the keyword nprocs=N (for a simulation this
>> size, you probably need something like 256-1024 processors depending on
>> the memory per core). This does essentially the same thing as (1), but
>> the I/O may happen here instead of in the kd-tree; see the sketch just
>> below.
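>>
>> For route (2) the change is basically just the nprocs keyword (again a
>> rough, untested sketch; the core count is illustrative):
>>
>>   import yt
>>   yt.enable_parallelism()   # then launch the script under mpirun
>>   ds = yt.load_uniform_grid(data, shape, nprocs=1024)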
>>
>>  I *think* (1) should be your best option, but I haven't tried rendering
>> this large of a single-grid output.
>>
>>  When you build the camera object, definitely start out using the
>> keyword no_ghost=True, as this will extrapolate rather than interpolate
>> from boundary grids to the vertices. The rendering quality won't be quite
>> as good, but for unigrid simulations there isn't a tremendous difference.
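>>
>> Something like this (a sketch assuming the yt 3.0-era camera interface;
>> the transfer function bounds, orientation, and field name below are
>> placeholders for whatever your data need):
>>
>>   tf = yt.ColorTransferFunction((-2.0, 2.0))  # log10 bounds of the field
>>   tf.add_layers(5, w=0.02)                    # a few evenly spaced layers
>>   c = [0.5, 0.5, 0.5]                         # center, in code units
>>   L = [1.0, 0.3, 0.2]                         # view direction
>>   W = 1.0                                     # width of the view volume
>>   cam = ds.camera(c, L, W, 1024, tf,
>>                   fields=["density"], no_ghost=True)
>>   im = cam.snapshot("render.png")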
>>
>>  Let us know how that goes!  I'd be very excited to see images from such
>> a large sim...
>>
>>  Sam
>>
>>
>>
>>>
>>> I imagine it wouldn't be hard to do the domain decomposition by hand,
>>> loading a different chunk of the grid into each MPI process.   But then
>>> what?   What would it take to invoke the volume renderer on each piece
>>> and composite the results together?   Would it help if the chunks were
>>> stored in a KDTree?   Is there some example (one of the existing data
>>> loaders?) that I could follow?
>>> _______________________________________________
>>> yt-users mailing list
>>> yt-users at lists.spacepope.org
>>> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
>>>
>>
>