[Yt-dev] Geometry, RAMSES and non-patch datasets

Tue Jun 21 08:07:06 PDT 2011

Hi,

This is in reply to Matt's e-mail from 3 weeks ago (I only just realised 
I forgot to hit "confirm" on the yt-dev mailing list signup).

I guess one solution to the problem would be to abstract what a "grid" 
is (I'm guessing a grid is a container for a geometrically consistent 
chunk of the entire simulation volume?) Then allow it to answer queries 
about its geometric properties itself. So for example, ask it 
"myGrid.IsInRegion(myWeirdGeometricConstruct)". I guess the trick is to 
figure out a flexible but simple interface for this, depending on how 
well you know the requirements for what the grid should be able to do. 
In general, I think this is the ideal situation because, as Matt says, 
hammering every code's data into the same structure in memory creates 
slowdowns. One possibility is to provide a few template memory 
structures that people can bolt together into new implementations 
for each code.
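
As a minimal sketch of what such a self-describing container could look 
like: all names here (Region, Sphere, GridLike, BoxGrid, is_in_region) 
are hypothetical and none of this is existing yt API. The idea is only 
that the caller asks the container, and the container decides how to 
answer for its own geometry.

```python
from abc import ABC, abstractmethod


class Region(ABC):
    """Hypothetical geometric construct that containers can be tested against."""

    @abstractmethod
    def overlaps_box(self, left, right):
        """Return True if the region touches the axis-aligned box [left, right]."""


class Sphere(Region):
    def __init__(self, center, radius):
        self.center = center
        self.radius = radius

    def overlaps_box(self, left, right):
        # Squared distance from the sphere center to the nearest point of the box.
        d2 = 0.0
        for c, lo, hi in zip(self.center, left, right):
            nearest = min(max(c, lo), hi)
            d2 += (c - nearest) ** 2
        return d2 <= self.radius ** 2


class GridLike(ABC):
    """Hypothetical container for a geometrically consistent chunk of the volume."""

    @abstractmethod
    def is_in_region(self, region):
        """Answer the geometric query for this container."""


class BoxGrid(GridLike):
    """A rectangular patch; an octree leaf or a particle chunk would be other subclasses."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def is_in_region(self, region):
        return region.overlaps_box(self.left, self.right)


grid = BoxGrid((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
near = Sphere((1.2, 0.5, 0.5), 0.5)
far = Sphere((2.0, 0.5, 0.5), 0.5)
print(grid.is_in_region(near), grid.is_in_region(far))
```

A new code would then only need to supply its own GridLike subclass 
rather than convert its data into patches.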

In terms of choosing algorithms for different types of fluid blob (e.g. 
one for particles, one for grids), this can be done using functionoids 
for the algorithms (or at least functionoid wrappers) and then a 
functionoid factory for spawning the correct functionoid to use with the 
container. You'd have to wrap all this up in a simple interface again, 
otherwise it'd be impossible to use.
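
A rough sketch of the functionoid-plus-factory idea, assuming two toy 
container types and a "total mass" operation. Every class and function 
name below is made up for illustration and is not part of yt.

```python
class ParticleContainer:
    """Toy container of particles, each carrying its own mass."""

    def __init__(self, masses):
        self.masses = masses


class CellContainer:
    """Toy container of grid cells, each with a density and a volume."""

    def __init__(self, densities, volumes):
        self.densities = densities
        self.volumes = volumes


class ParticleTotalMass:
    """Functionoid: total mass of a particle container."""

    def __call__(self, container):
        return sum(container.masses)


class CellTotalMass:
    """Functionoid: total mass of a cell container (density times volume)."""

    def __call__(self, container):
        return sum(rho * vol for rho, vol in
                   zip(container.densities, container.volumes))


# The factory's lookup table: container type -> functionoid class.
_TOTAL_MASS_FUNCTIONOIDS = {
    ParticleContainer: ParticleTotalMass,
    CellContainer: CellTotalMass,
}


def make_total_mass(container):
    """Factory: spawn the functionoid that matches the container type."""
    try:
        return _TOTAL_MASS_FUNCTIONOIDS[type(container)]()
    except KeyError:
        raise TypeError("no total-mass algorithm for %r" % type(container)) from None


def total_mass(container):
    """Simple front end, so users never see the factory or the functionoids."""
    return make_total_mass(container)(container)


print(total_mass(ParticleContainer([1.0, 2.0])))
print(total_mass(CellContainer([2.0, 4.0], [0.5, 0.25])))
```

Supporting a new kind of fluid blob then means adding one functionoid 
and one table entry; the front-end call stays the same.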

I also suggested to Matt to create a "fluid blob" iterator that works 
for all types of fluid blob (SPH particle, octree grid cell, voronoi 
tessellation cell) but this might be very slow in Python. That said, 
iterating over "grid"s as chunks of the AMR grid instead is a 
possibility. Having some kind of iterator option might be good, though, 
as doing things like tracking particles through different snapshots is 
something I've been doing extensively in my (pre-YT) work.
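
To make the iterator idea concrete, here is a small sketch using plain 
Python generators. FluidBlob, iter_particles, iter_cells and iter_blobs 
are hypothetical names, and the per-element loop is exactly the part 
that would probably be slow for large datasets.

```python
from collections import namedtuple
from itertools import chain

# One record type shared by every kind of fluid element.
FluidBlob = namedtuple("FluidBlob", ["position", "mass"])


def iter_particles(positions, masses):
    """Yield one blob per SPH-style particle."""
    for position, mass in zip(positions, masses):
        yield FluidBlob(position, mass)


def iter_cells(centers, densities, volumes):
    """Yield one blob per grid cell, converting density to mass."""
    for center, rho, vol in zip(centers, densities, volumes):
        yield FluidBlob(center, rho * vol)


def iter_blobs(*sources):
    """Walk several blob sources as if they were one sequence."""
    return chain.from_iterable(sources)


particles = iter_particles([(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)], [1.0, 2.0])
cells = iter_cells([(0.5, 0.5, 0.5)], [4.0], [0.25])
total = sum(blob.mass for blob in iter_blobs(particles, cells))
print(total)
```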

I don't know how much of this is already known; my domain is Ramses, 
which is still very slow to use with my dataset (although Matthew has 
been very helpful in working on the Ramses side of things). I thus 
haven't looked too much at YT yet as it's still prohibitively slow to 
load my dataset and play with it.

Cheers,

Sam

On Tue, Jun 7, 2011 at 16:15 AM, Matthew Turk <matthewturk at gmail.com 
<mailto:matthewturk at gmail.com>> wrote:

Hi all,

This is a portion of a conversation Sam Geen and I had off-list about
where to make changes and how to insert abstractions to allow for
generalized geometric reading of data; this would be useful for octree
codes, particle codes, and non-rectilinear geometry.  We decided to
"replay" the conversation on the mailing list to allow people to
contribute their ideas and thoughts.  I spent a bit of time last night
looking at the geometry usage in yt.

Right now I see a few places this will need to be fixed:

  * Data sources operate on the idea that grids act as a pre-selection
for cells.  If we get the creation of grids -- without including any
cell data inside them -- to be fast enough, this will not necessarily
need to be changed.  (i.e., apply a 'regridding' step of empty grids.)
  However, failing that, this will need to be abstracted into geometric
selection.  For cylindrical coordinates this will need to be
abstracted anyway.  The idea is that once you know which grids you
want, you read them from disk, and then mask out the points that are
not necessary.
  * The IO is currently set up -- in parallel -- to read in chunks.
Usually in parallel patch-based simulations, multiple grid patches are
stored in a single file on disk.  So, these get chunked in IO to avoid
too many fopen/seek/fclose operations (and the analogues in HDF5).
This will need to be rethought.  Obviously, there are still some
analogues; however, it's not clear how -- without the actual
re-gridding operation -- to keep the geometry selection and the IO
separate.  I would prefer to try to do this as much as possible.  I
think it's do-able, but I don't yet have a good strategy for it.
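
The two-stage selection described in the first bullet (pick grids by 
their extents, read only those, then mask out unneeded points) can be 
sketched as below. This is an illustration on toy 2D data in pure 
Python, not yt's actual code; select_in_sphere, read_cells and the 
dictionaries standing in for the disk are all assumptions.

```python
def box_overlaps_sphere(left, right, center, radius):
    """True if the axis-aligned box [left, right] touches the sphere."""
    d2 = 0.0
    for c, lo, hi in zip(center, left, right):
        nearest = min(max(c, lo), hi)
        d2 += (c - nearest) ** 2
    return d2 <= radius ** 2


def select_in_sphere(grid_edges, read_cells, center, radius):
    """Pre-select grids by geometry, read only those, then mask cells."""
    values = []
    for grid_id, (left, right) in grid_edges.items():
        if not box_overlaps_sphere(left, right, center, radius):
            continue  # this grid is never read from disk
        for position, value in read_cells(grid_id):
            d2 = sum((p - c) ** 2 for p, c in zip(position, center))
            if d2 <= radius ** 2:  # the per-cell mask
                values.append(value)
    return values


# Toy stand-in for the on-disk data: three grids, cells as (position, value).
grid_edges = {
    0: ((0.0, 0.0), (1.0, 1.0)),
    1: ((1.0, 0.0), (2.0, 1.0)),
    2: ((4.0, 0.0), (5.0, 1.0)),
}
cells_on_disk = {
    0: [((0.25, 0.5), 10.0), ((0.75, 0.5), 11.0)],
    1: [((1.25, 0.5), 12.0), ((1.75, 0.5), 13.0)],
    2: [((4.5, 0.5), 14.0)],
}
reads = []  # records which grids were actually read


def read_cells(grid_id):
    reads.append(grid_id)
    return cells_on_disk[grid_id]


selected = select_in_sphere(grid_edges, read_cells, (1.0, 0.5), 0.3)
print(selected, reads)
```

Here the geometry test only needs the grid extents, which is why fast 
creation of empty grids would be enough to keep this scheme working.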

My current feeling is that the re-gridding may be a somewhat
necessary evil *at the moment*, but only for guiding the point
selection.  It has now been re-written to be based on Hilbert
curve locating, so each grid has a unique index in L-8 or something
space.

I believe that geometry and chunking of IO are the only issues at this
time.  One possibility would actually be to move away from the idea of
grids and instead think in terms of 'Hilbert chunks'.  These would be
the items that would be selected, read from disk, and mapped.  This
might fit better with the Ramses method.
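
As a toy illustration of what a 'Hilbert chunk' could be, the sketch
below orders cells along a Hilbert curve and cuts the ordering into
contiguous runs.  It uses the standard 2D coordinate-to-index
conversion for brevity; it is not the indexing code in yt, and the
function names, the 2D restriction and the fixed chunk size are all
assumptions.

```python
def hilbert_index_2d(order, x, y):
    """Index of integer cell (x, y) along a 2D Hilbert curve of 2**order cells per side."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def hilbert_chunks(cells, order, chunk_size):
    """Sort cells by Hilbert index and split them into contiguous chunks."""
    ordered = sorted(cells, key=lambda c: hilbert_index_2d(order, c[0], c[1]))
    return [ordered[i:i + chunk_size]
            for i in range(0, len(ordered), chunk_size)]


cells = [(x, y) for x in range(4) for y in range(4)]
chunks = hilbert_chunks(cells, order=2, chunk_size=4)
for chunk in chunks:
    print(chunk)
```

Because neighbouring indices on the curve are neighbouring cells in
space, each chunk covers a compact region, so a chunk can serve as the
unit that is selected, read and mapped.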

What do you think?

Best,

Matt
