<div dir="ltr">Hi Matt,<div><br></div><div style>Unfortunately, I think these tests are formed incorrectly. The fake_random_pf should take Nproc**2 instead of Nproc, and so the smallest grid you've tested here is actually 64x32x32 instead of 2x2x2. Changing to Nproc**2 gets terrible function-call-dominated scaling:</div>

<div style><br></div><div style><a href="http://paste.yt-project.org/show/3194/">http://paste.yt-project.org/show/3194/</a><br></div><div style><a href="http://paste.yt-project.org/show/3191/">http://paste.yt-project.org/show/3191/</a><br>

</div><div style><br></div><div style>...which really sucks because passing-by-oct would have been really awesome, and would fix some of the problems I mention below.</div><div style><br></div><div style>Instead, hopefully at the dev workshop, I propose we talk about how to tackle issues with oct & particle codes. </div>
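<div style><br></div><div style>To illustrate the function-call-dominated scaling mentioned above, here is a minimal sketch in plain NumPy. It is not the script from the pastes: density_squared, per_block and time_call are hypothetical names, and the 32^3 array is an assumption chosen only to keep the run short.</div>
<pre>
```python
import time
import numpy as np

def density_squared(block):
    # Flop-light field, standing in for the DensitySquared case in the thread.
    return block * block

def per_block(data, n):
    # Split an N^3 array into blocks of n^3 cells and call the field once per block.
    out = np.empty_like(data)
    N = data.shape[0]
    for i in range(0, N, n):
        for j in range(0, N, n):
            for k in range(0, N, n):
                sl = (slice(i, i + n), slice(j, j + n), slice(k, k + n))
                out[sl] = density_squared(data[sl])
    return out

def time_call(func, *args):
    # Return the result together with the wall-clock time of a single call.
    t0 = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - t0

rng = np.random.default_rng(0)
data = rng.random((32, 32, 32))

whole, t_whole = time_call(density_squared, data)
blocks, t_blocks = time_call(per_block, data, 2)  # 16**3 = 4096 calls

print(f"one call: {t_whole:.2e} s, 2x2x2 blocks: {t_blocks:.2e} s")
```
</pre>
<div style>Both paths produce the same array; only the number of Python-level calls differs, so the printed ratio gives a rough feel for the per-call overhead on a given machine.</div>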

<div style><br></div><div style>1. Anything you can compute on a hydro field, you should be able to compute on a particle field. </div><div style>In non-grid-patch codes, these two have different structures, so we can think about creating unions of octrees and the like. In Enzo, sets of particles are attached to grids, which means you can calculate the dispersion of a local set of particles, deposit that onto the grid, and then compare hydro vs particle stuff. The best you can do with octrees is to fake it with combinations of covering grids and stream frontends, but that's a far cry from typical usage of fields.</div>
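<div style><br></div><div style>As a rough sketch of what "dispersion of a local set of particles, deposited onto the grid" could look like, here is a NumPy-only version on a uniform grid. It is not yt or Enzo code: deposit_dispersion, its arguments and the unit-cube domain are assumptions made for illustration.</div>
<pre>
```python
import numpy as np

def deposit_dispersion(pos, vel, nbins, left=0.0, right=1.0):
    # pos: (N, 3) particle positions; vel: (N,) one velocity component.
    dims = (nbins, nbins, nbins)
    # Integer cell index of every particle, clipped to the domain.
    idx = np.floor((pos - left) / (right - left) * nbins).astype(np.int64)
    idx = np.clip(idx, 0, nbins - 1)
    flat = np.ravel_multi_index(tuple(idx.T), dims)
    ncell = nbins ** 3
    # Per-cell particle count, sum of v, and sum of v^2.
    count = np.bincount(flat, minlength=ncell)
    s1 = np.bincount(flat, weights=vel, minlength=ncell)
    s2 = np.bincount(flat, weights=vel * vel, minlength=ncell)
    disp = np.zeros(ncell)
    filled = count != 0
    mean = s1[filled] / count[filled]
    var = s2[filled] / count[filled] - mean ** 2
    # Guard against tiny negative variances from round-off.
    disp[filled] = np.sqrt(np.maximum(var, 0.0))
    return disp.reshape(dims), count.reshape(dims)

pos = np.array([[0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [0.9, 0.9, 0.9]])
vel = np.array([1.0, 3.0, 5.0])
disp, count = deposit_dispersion(pos, vel, nbins=2)
print(disp[0, 0, 0], count[0, 0, 0])
```
</pre>
<div style>The result is an ordinary 3D array, so it can be compared cell by cell with a hydro field on the same grid. That is the step that has no obvious equivalent for octs.</div>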

<div style><br></div><div style>2. How do you expose non-local (e.g., spatial) fields in octs? </div><div style>How do you access neighboring octs if you want to, say, smooth a field or calculate a flux? At the moment, when you calculate a region, or a slice, or whatever is requested, the selector creates a matching mask for the octree, and the IO fills the selection. Once filled, you have just a long 1D array, making it much more difficult to dig up neighbors quickly, if they are even included. </div>
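<div style><br></div><div style>A small sketch of the neighbor problem, using a uniform 4x4x4 array in place of an octree. Everything here (index_map, neighbor_value, the choice of mask) is hypothetical and only shows why the flat array alone is not enough.</div>
<pre>
```python
import numpy as np

shape = (4, 4, 4)
field = np.arange(64, dtype="float64").reshape(shape)

# A selector-style boolean mask: keep only the cells with i equal to 1 or 2.
mask = np.zeros(shape, dtype=bool)
mask[1:3, :, :] = True

# What a field function receives after selection: a flat 1D array.
flat = field[mask]

# Extra bookkeeping needed to find neighbors again:
# index_map[i, j, k] is the position in flat, or -1 if the cell was not selected.
index_map = np.full(shape, -1, dtype=np.int64)
index_map[mask] = np.arange(flat.size)

def neighbor_value(i, j, k, axis, step):
    """Return the neighbor's value, or None if it lies outside the domain or the selection."""
    ijk = [i, j, k]
    ijk[axis] += step
    if ijk[axis] not in range(shape[axis]):
        return None
    pos = index_map[tuple(ijk)]
    if pos == -1:
        return None
    return flat[pos]

print(neighbor_value(1, 0, 0, axis=0, step=1))   # neighbor is selected
print(neighbor_value(1, 0, 0, axis=0, step=-1))  # neighbor was masked out
```
</pre>
<div style>Without something like index_map, the second lookup cannot even tell that the neighbor is missing, which is the "if they are even included" case.</div>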

<div style><br></div><div style>In 3.0, Matt (and others?) has solved the most fundamental issues and figured out a general chunking method for single octrees, local fields (e.g., Density) and fields deriving from local fields. However, I think we still have a ways to go to get feature parity with grid-patch codes.<br>

</div><div style><br></div><div style>Thanks!</div><div style>chris</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Feb 20, 2013 at 7:09 AM, Matthew Turk <span dir="ltr"><<a href="mailto:matthewturk@gmail.com" target="_blank">matthewturk@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi all,<br>

<br>

Octree support in 3.0 is coming along quite well.  The main stumbling<br>

block seems to be related to spatial fields.  In<br>

particular, fields that rely on being passed 3D arrays of data.  Many<br>

of these fields (such as DivV) rely on ghost zone information.<br>

Following on this, we need to be able to place particles inside<br>

individual Octs.<br>
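<br>
A minimal sketch of why a field like DivV needs ghost zones, assuming centered<br>
differences and one ghost zone per face. This is not yt's DivV implementation;<br>
div_v and the linear test velocity field are made up for illustration.<br>
<pre>
```python
import numpy as np

def div_v(vx, vy, vz, dx):
    # Inputs include one ghost zone on every face; the result covers active zones only.
    dvx = (vx[2:, 1:-1, 1:-1] - vx[:-2, 1:-1, 1:-1]) / (2.0 * dx)
    dvy = (vy[1:-1, 2:, 1:-1] - vy[1:-1, :-2, 1:-1]) / (2.0 * dx)
    dvz = (vz[1:-1, 1:-1, 2:] - vz[1:-1, 1:-1, :-2]) / (2.0 * dx)
    return dvx + dvy + dvz

# A 2x2x2 oct plus one ghost zone per face is a 4x4x4 array.
dx = 0.25
centers = (np.arange(4) - 0.5) * dx
x, y, z = np.meshgrid(centers, centers, centers, indexing="ij")

# Linear velocity field with an analytic divergence of 1 + 2 + 3 = 6.
vx, vy, vz = 1.0 * x, 2.0 * y, 3.0 * z
div = div_v(vx, vy, vz, dx)
print(div.shape)
```
</pre>
The 2x2x2 active region alone has no interior cell with neighbors on both sides,<br>
so the centered difference cannot be formed without the extra layer.<br>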

<br>

In the past, I've assumed that the overhead of a function call being<br>

called on many, many small arrays (2x2x2) would outweigh the gain from<br>

vectorized numpy arithmetic.  This morning, after testing, it seems<br>

the situation is a bit more complex.<br>

<br>

Here's a sample script:<br>

<br>

<a href="http://paste.yt-project.org/show/3184/" target="_blank">http://paste.yt-project.org/show/3184/</a><br>

<br>

The results:<br>

<br>

<a href="http://paste.yt-project.org/show/3183/" target="_blank">http://paste.yt-project.org/show/3183/</a><br>

<br>

Now, what this is doing is simulating the process of going from one<br>

big grid down to many small grids, such that at the end we're doing<br>

2x2x2 grid objects, for a relatively complex field (RadiusCode).<br>

These results are not at *all* what I expected -- doing this<br>

grid-by-grid for octs outperforms doing it for the big array.  My<br>

first guess as to why is that the RadiusCode field likely<br>

performs a number of copies.  Now, if we do this instead with<br>

DensitySquared (which is a flop-light routine) we see:<br>

<br>

<a href="http://paste.yt-project.org/show/3185/" target="_blank">http://paste.yt-project.org/show/3185/</a><br>

<br>

(Here I've also upped to a 512^3 domain dimensions)<br>

<br>

Again, the Octs perform pretty well, out-performing everything else.<br>

I assume, perhaps incorrectly, this is related to memory allocation<br>

patterns or something.<br>

<br>

Regardless, it seems that what I had previously discounted as too<br>

costly (iterating over octs) is not so costly after all.  I'd like to<br>

propose -- pending input from at least Doug, Sam and Chris -- that the<br>

spatial chunking for Octrees be constructed like so:<br>

<br>

 * When a chunk_spatial call is made, iterate over io chunks.  This<br>

will enable management based on IO, delegated to the individual<br>

OctGeometryHandler.  (If number of requested ghost zones > 0, we can<br>

also build that information in here, as well.)<br>

 * Inside the iteration of io chunks, read as much as is necessary<br>

from each (without child masking, as is usually the case for spatial<br>

data).  yield either an Oct singleton or a dict pre-filled with the<br>

field data needed.  (At this time we will know which fields must be<br>

read from disk and which must be generated.)<br>

 * Back at top level, where the spatial chunking was called, fill back<br>

into the original data selector.  This will likely require a new<br>

per-geometry handler function that accepts a spatial iterator and then<br>

fills a flat, non-spatial selector.<br>
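<br>
The three steps above can be sketched roughly as follows. This is a toy model,<br>
not the yt API: iter_io_chunks, chunk_spatial and fill_flat are hypothetical names,<br>
octs are plain (id, dict) pairs, and ghost zones and child masking are left out.<br>
<pre>
```python
import numpy as np

def iter_io_chunks(octs, chunk_size):
    # Stand-in for IO chunks: consecutive groups of octs.
    for start in range(0, len(octs), chunk_size):
        yield octs[start:start + chunk_size]

def chunk_spatial(octs, fields, chunk_size=2):
    # Steps 1 and 2: iterate over IO chunks and yield one pre-filled dict per oct.
    for io_chunk in iter_io_chunks(octs, chunk_size):
        for oct_id, oct_data in io_chunk:
            yield oct_id, {f: oct_data[f] for f in fields}

def fill_flat(spatial_iter, field_func, n_octs):
    # Step 3: evaluate the field on each 2x2x2 oct and fill a flat output array.
    out = np.empty(n_octs * 8)
    for oct_id, data in spatial_iter:
        out[oct_id * 8:(oct_id + 1) * 8] = field_func(data).ravel()
    return out

rng = np.random.default_rng(0)
octs = [(i, {"Density": rng.random((2, 2, 2))}) for i in range(5)]

result = fill_flat(
    chunk_spatial(octs, ["Density"]),
    lambda d: d["Density"] ** 2,
    len(octs),
)
print(result.shape)
```
</pre>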

<br>

I think this is reasonable, and I can have a pass at implementing it.<br>

I'll update the YTEP once it's done.  However, it's also brought up<br>

the point that during the workshop we absolutely need to figure out a<br>

distributed memory octree approach.  Before we implement such a thing,<br>

we also need rigorous testing of the three Octree codes that are<br>

included in 3.0.<br>

<br>

I believe this strategy for spatial fields in Octrees will also work<br>

for particle fields; you should be able to read a particle field and<br>

use it to generate a fluid field.  For instance, the CICDeposit<br>

functions.<br>
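<br>
For reference, a minimal 1D cloud-in-cell deposit in NumPy. It is not the<br>
CICDeposit code in yt; cic_deposit_1d, the periodic wrap and the unit domain<br>
are assumptions made to keep the sketch short.<br>
<pre>
```python
import numpy as np

def cic_deposit_1d(x, mass, ncell, left=0.0, right=1.0):
    dx = (right - left) / ncell
    # Particle position in units of the cell width, measured from cell centers.
    s = (x - left) / dx - 0.5
    i0 = np.floor(s).astype(np.int64)
    w1 = s - i0      # weight for the right-hand cell
    w0 = 1.0 - w1    # weight for the left-hand cell
    grid = np.zeros(ncell)
    # np.add.at accumulates correctly when several particles hit the same cell.
    # The modulo makes the domain periodic (an assumption of this sketch).
    np.add.at(grid, i0 % ncell, mass * w0)
    np.add.at(grid, (i0 + 1) % ncell, mass * w1)
    return grid / dx  # mass per unit length

x = np.array([0.125, 0.25])   # one particle at a cell center, one on a cell edge
mass = np.array([1.0, 2.0])
density = cic_deposit_1d(x, mass, ncell=4)
print(density)
```
</pre>
The particle at the cell center lands entirely in one cell, while the one on the<br>
edge is split evenly between two; total mass is conserved either way.<br>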

<br>

Any feedback would be appreciated, but I think modulo strong<br>

objections I want to get started on this early next week, once my<br>

schedule lightens up a bit.<br>

<br>

Also, this should be fun for the workshop!  :)<br>

<br>

-Matt<br>


</blockquote></div><br></div>