[yt-dev] Spatial fields for Octrees in 3.0

Matthew Turk matthewturk at gmail.com
Wed Feb 20 07:09:25 PST 2013


Hi all,

Octree support in 3.0 is coming along quite well.  The main stumbling
block seems to be support for spatial fields -- in particular, fields
that rely on being passed 3D arrays of data.  Many of these fields
(such as DivV) also rely on ghost zone information.  Following on from
this, we need to be able to place particles inside individual Octs.
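
For concreteness, here's a minimal sketch of the kind of spatial field
I mean: a DivV-style stencil that needs a full 3D block plus a layer of
ghost zones.  The names and stencil here are illustrative, not yt's
actual field definitions.

import numpy as np

def div_v(vx, vy, vz, dx):
    # Central-difference divergence on a 3D block.  The stencil reaches
    # one cell in each direction, so the inputs must carry a layer of
    # ghost zones around the region we actually want values for.
    return (
        (vx[2:, 1:-1, 1:-1] - vx[:-2, 1:-1, 1:-1]) +
        (vy[1:-1, 2:, 1:-1] - vy[1:-1, :-2, 1:-1]) +
        (vz[1:-1, 1:-1, 2:] - vz[1:-1, 1:-1, :-2])
    ) / (2.0 * dx)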

In the past, I've assumed that the overhead of calling a field function
on many, many small (2x2x2) arrays would outweigh the gain from
vectorized numpy arithmetic on a single large array.  This morning,
after testing, it seems the situation is a bit more complex.

Here's a sample script:

http://paste.yt-project.org/show/3184/

The results:

http://paste.yt-project.org/show/3183/

What this script does is simulate the process of going from one big
grid down to many small grids, such that at the end we're operating on
2x2x2 grid objects, for a relatively complex field (RadiusCode).
These results are not at *all* what I expected -- doing this
grid-by-grid for octs outperforms doing it on the big array.  My
first guess as to why is that the RadiusCode field likely performs a
number of copies.  Now, if we do this instead with DensitySquared
(which is a flop-light routine) we see:

http://paste.yt-project.org/show/3185/

(Here I've also upped the domain dimensions to 512^3.)

Again, the Octs perform pretty well, outperforming everything else.
I assume, perhaps incorrectly, that this is related to memory
allocation patterns or something similar.
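
In case the paste links go stale, here's a rough reconstruction of the
kind of comparison the script makes: the same field computed once on
the full array and again over 2x2x2 sub-blocks.  The field and sizes
here are illustrative (the original used RadiusCode and DensitySquared
on up to 512^3 domains), and a pure-Python loop like this won't
necessarily reproduce the surprising ordering above.

import time
import numpy as np

N = 128  # domain dimensions; the original tests went up to 512^3
rho = np.random.random((N, N, N))

def density_squared(d):
    # A deliberately flop-light field, analogous to DensitySquared.
    return d * d

# One vectorized pass over the full array (the "big grid" case).
t0 = time.time()
full = density_squared(rho)
t_full = time.time() - t0

# The same work done oct-by-oct, over 2x2x2 sub-blocks.
t0 = time.time()
out = np.empty_like(rho)
for i in range(0, N, 2):
    for j in range(0, N, 2):
        for k in range(0, N, 2):
            block = rho[i:i+2, j:j+2, k:k+2]
            out[i:i+2, j:j+2, k:k+2] = density_squared(block)
t_octs = time.time() - t0

print("full array: %0.3fs   2x2x2 blocks: %0.3fs" % (t_full, t_octs))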

Regardless, it seems that what I had previously discounted as too
costly (iterating over octs) is not so costly after all.  I'd like to
propose -- pending input from at least Doug, Sam and Chris -- that the
spatial chunking for Octrees be constructed like so (there's a rough
sketch after the list):

 * When a chunk_spatial call is made, iterate over IO chunks.  This
will enable management based on IO, delegated to the individual
OctGeometryHandler.  (If the number of requested ghost zones is > 0,
we can also build that information in here.)
 * Inside the iteration over IO chunks, read as much as is necessary
from each (without child masking, as is usually the case for spatial
data).  Yield either an Oct singleton or a dict pre-filled with the
field data needed.  (At this point we will know which fields must be
read from disk and which must be generated.)
 * Back at the top level, where the spatial chunking was called, fill
back into the original data selector.  This will likely require a new
per-geometry handler function that accepts a spatial iterator and then
fills a flat, non-spatial selector.
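
Here's a toy, self-contained sketch of that flow.  None of these names
(make_io_chunks, chunk_spatial, fill_selector) are the actual
OctGeometryHandler API; they just show the shape of the iteration I
have in mind, with ghost-zone construction omitted.

import numpy as np

def make_io_chunks(n_chunks=4, octs_per_chunk=8):
    # Stand-in for IO chunks: each chunk carries raw field arrays for a
    # handful of 2x2x2 octs.
    for _ in range(n_chunks):
        yield [{"Density": np.random.random((2, 2, 2))}
               for _ in range(octs_per_chunk)]

def chunk_spatial(io_chunks, fields, ngz=0):
    # Step 1: iterate over IO chunks, so IO management stays delegated
    # to the geometry handler.  Ghost-zone data would be built in here
    # when ngz > 0 (omitted in this toy).
    for chunk in io_chunks:
        # Step 2: read what's needed from the chunk (no child masking)
        # and yield a dict pre-filled with the field data each oct needs.
        for oct_data in chunk:
            yield {f: oct_data[f] for f in fields}

def fill_selector(spatial_iterator, field_func):
    # Step 3: back at the top level, flatten the per-oct results into
    # the flat, non-spatial selection that was originally requested.
    pieces = [field_func(d).ravel() for d in spatial_iterator]
    return np.concatenate(pieces)

flat = fill_selector(chunk_spatial(make_io_chunks(), ["Density"]),
                     lambda d: d["Density"] ** 2)
print(flat.shape)  # 4 chunks * 8 octs * 8 cells = (256,)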

I think this is reasonable, and I can take a pass at implementing it.
I'll update the YTEP once it's done.  However, this has also brought up
the point that during the workshop we absolutely need to figure out a
distributed-memory octree approach.  Before we implement such a thing,
we also need rigorous testing of the three Octree codes that are
included in 3.0.

I believe this strategy for spatial fields in Octrees will also work
for particle fields: you should be able to read a particle field and
use it to generate a fluid field, for instance via the CICDeposit
functions.
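
As a rough illustration of what I mean by CIC deposition, here's a
plain-Python cloud-in-cell deposit of particle masses onto a small
grid.  This is not yt's CICDeposit code, and the boundary handling
(clamping rather than periodicity) is just to keep the sketch short.

import numpy as np

def cic_deposit(positions, masses, n, left_edge=0.0, right_edge=1.0):
    # Deposit particle masses onto an n^3 grid with cloud-in-cell
    # (trilinear) weighting.
    dx = (right_edge - left_edge) / n
    grid = np.zeros((n, n, n))
    # Fractional index of each particle relative to cell centers.
    f = (positions - left_edge) / dx - 0.5
    i0 = np.floor(f).astype(int)
    w1 = f - i0          # weight toward the upper neighbor cell
    w0 = 1.0 - w1        # weight toward the lower neighbor cell
    for p in range(len(masses)):
        for di, wx in ((0, w0[p, 0]), (1, w1[p, 0])):
            for dj, wy in ((0, w0[p, 1]), (1, w1[p, 1])):
                for dk, wz in ((0, w0[p, 2]), (1, w1[p, 2])):
                    # Clamp to the grid; a real implementation would use
                    # periodicity or ghost zones instead.
                    ii = min(max(i0[p, 0] + di, 0), n - 1)
                    jj = min(max(i0[p, 1] + dj, 0), n - 1)
                    kk = min(max(i0[p, 2] + dk, 0), n - 1)
                    grid[ii, jj, kk] += masses[p] * wx * wy * wz
    return grid / dx**3  # convert deposited mass to density

# e.g. deposit 1000 unit-mass particles into a single 2x2x2 oct:
pos = np.random.random((1000, 3))
rho = cic_deposit(pos, np.ones(1000), 2)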

Any feedback would be appreciated, but, modulo strong objections, I'd
like to get started on this early next week, once my schedule lightens
up a bit.

Also, this should be fun for the workshop!  :)

-Matt
