[Yt-dev] Geometry, RAMSES and non-patch datasets

Tue Jun 7 16:14:45 PDT 2011

Hi all,

This is a portion of a conversation Sam Geen and I had off-list about
where to make changes and how to insert abstractions to allow for
generalized geometric reading of data; this would be useful for octree
codes, particle codes, and non-rectilinear geometry.  We decided to
"replay" the conversation on the mailing list to allow people to
contribute their ideas and thoughts.  I spent a bit of time last night
looking at the geometry usage in yt.

Right now I see a few places this will need to be fixed:

 * Data sources operate on the idea that grids act as a pre-selection
for cells.  If we get the creation of grids -- without including any
cell data inside them -- to be fast enough, this will not necessarily
need to be changed.  (i.e., apply a 'regridding' step of empty grids.)
 However, failing that, this will need to be abstracted into geometric
selection.  For cylindrical coordinates this will need to be
abstracted anyway.  The idea is that once you know which grids you
want, you read them from disk, and then mask out the points that are
not necessary.
 * The IO is currently set up -- in parallel -- to read in chunks.
Usually in parallel patch-based simulations, multiple grid patches are
stored in a single file on disk.  So, these get chunked in IO to avoid
too many fopen/seek/fclose operations (and their analogues in HDF5).
This will need to be rethought.  Obviously, there are still some
analogues; however, it's not clear how -- without the actual
re-gridding operation -- to keep the geometry selection and the IO
separate.  I would prefer to try to do this as much as possible.  I
think it's do-able, but I don't yet have a good strategy for it.
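
To make the two points above concrete, here is a rough sketch of the
current pattern: grids pre-select by bounding box, the chosen grids are
grouped by file so each file is touched once, and the cells are then
masked.  This is not yt code.  The Grid class, select_cells and
sphere_overlaps_box are hypothetical names made up for illustration, the
"read" is faked by generating cell centers, and a sphere stands in for
whatever data source is doing the selecting.

```python
from collections import defaultdict


class Grid:
    """Hypothetical stand-in for a grid patch (not a yt class)."""

    def __init__(self, left, right, dims, filename):
        self.left, self.right = left, right   # bounding box corners
        self.dims = dims                      # cells per axis
        self.filename = filename              # file holding the patch

    def cell_centers(self):
        # Stand-in for reading cell data: yield every cell center.
        widths = [(r - l) / n
                  for l, r, n in zip(self.left, self.right, self.dims)]
        for i in range(self.dims[0]):
            for j in range(self.dims[1]):
                for k in range(self.dims[2]):
                    yield tuple(l + (idx + 0.5) * w for l, idx, w
                                in zip(self.left, (i, j, k), widths))


def sphere_overlaps_box(center, radius, left, right):
    # Squared distance from the sphere center to the nearest box point.
    d2 = 0.0
    for c, l, r in zip(center, left, right):
        nearest = min(max(c, l), r)
        d2 += (c - nearest) ** 2
    return d2 <= radius ** 2


def select_cells(grids, center, radius):
    # Stage 1: coarse pre-selection from bounding boxes only.
    chosen = [g for g in grids
              if sphere_overlaps_box(center, radius, g.left, g.right)]
    # IO chunking: group the chosen grids by the file that stores them.
    by_file = defaultdict(list)
    for g in chosen:
        by_file[g.filename].append(g)
    # Stage 2: "read" each chunk, then mask out cells outside the sphere.
    selected = []
    for filename, chunk in by_file.items():
        for g in chunk:
            for p in g.cell_centers():
                if sum((a - b) ** 2 for a, b in zip(p, center)) <= radius ** 2:
                    selected.append(p)
    return chosen, dict(by_file), selected
```

Note that the only geometry the IO loop sees is the list of chosen
grids.  That is the separation I would like to keep, and it is exactly
what becomes awkward once there are no grids to hand over.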

My current feeling is that the re-gridding may be a necessary evil
*at the moment*, but only for guiding the point selection.  It has
been rewritten to be based on Hilbert curve locating, so each grid has
a unique index in L-8 or something space.
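
For anyone who has not played with Hilbert indexing, here is a minimal
sketch of the idea in 2D.  It is the standard textbook
coordinate-to-distance conversion, not the routine used in the
re-gridding code (which works in 3D); the function name
hilbert_index_2d and the choice of a 2**order lattice are assumptions
made purely for illustration.

```python
def hilbert_index_2d(order, x, y):
    """Distance along a Hilbert curve of integer cell (x, y).

    The lattice is 2**order cells on a side; every cell gets a unique
    index in the range [0, 4**order).
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        # Which quadrant of the current sub-square holds the cell?
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect so the sub-square has the canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d
```

The useful property is locality: cells with consecutive indices are
neighbors on the lattice, so a contiguous range of indices corresponds
to a compact region of space.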

I believe that geometry and chunking of IO are the only issues at this
time.  One possibility would actually be to move away from the idea of
grids and instead think in terms of 'Hilbert chunks'.  These would be
the items that would be selected, read from disk, and mapped.  This
might fit more nicely with the RAMSES method.

What do you think?

Best,

Matt