<div dir="ltr">Hi Matt,<div style>Thanks for this work! </div><div style><br></div><div style>I'm not sure which side I come down on (gridding vs not-gridding.) But I do think that we ultimately we should address two things:</div>

<div style>1. Be able to mandate the full SPH octree ahead of time. For ART, for example, we have unorganized particles for stars+dm but a defined octree for the hydro. Sometimes you might want the built-by-yt particle octree to match the hydro one. This will sometimes lead to large cells with little gas, but many particles. </div>

<div style>2. Be able to operate over multiple pfs / merge them. How should the particle octree interact with the hydro one when we have both in a single snapshot/instant?  </div><div style><br></div><div style>chris</div>

<div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Jan 8, 2013 at 11:45 PM, Nathan Goldbaum <span dir="ltr"><<a href="mailto:nathan12343@gmail.com" target="_blank">nathan12343@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Matt,<br>

<br>

Can you elaborate on why it will be so hard to implement the smoothing kernel?<br>

<br>

>From my point of view its straightforward to smooth the particles onto grids, or, if we avoid gridding, splat the particles onto pixels according to the smoothing kernel.  While there are a number of possible choices for kernel, most codes (Gadget in particular) use various flavors of the cubic spline kernel, which has only one free parameter, the smoothing length, which will be written to disk along with the particle data.<br>


<br>

As for whether to grid the data or not, I think that it will be necessary for geometric selection of data.  One could in principle set up profiles over the whole simulations using the capabilities that you've already completed, querying the fluid fields at the particle locations and calculating derived fields based on the fluid fields written to disk at each particle location.  However, it seems to me that calculating fields that require some sort of differencing, or calculating profiles over a geometric object /requires/ a gridding step.<br>


<br>

I think it should be possible to do most things without the expensive gridding, following your second suggestion for visualization based on the SPLASH algorithms, but I also think we should provide the capability to export to GDF or internally convert to gridded data in memory since it enables a lot of sophisticated analysis tasks that are more or less straightforward on grid data.<br>


<br>

That being said, I'm no SPH expert, so please correct me if I'm speaking out of turn here.<br>

<br>

Cheers,<br>

<br>

Nathan<div class="HOEnZb"><div class="h5"><br>

<br>

On 1/8/13 5:18 PM, Matthew Turk wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi all,<br>

<br>

I am writing today to request comments and suggestions for supporting<br>

SPH data in yt.  This is part of a broader effort -- which includes<br>

the native Octree support -- in yt 3.0 to correctly support<br>

non-gridpatch data.<br>

<br>

This email contains a lot of information about where we are, and I was<br>

hoping to get some feedback from people who have more experience with<br>

SPH data, Gadget, AREPO, etc.  In particular, (if they're around and<br>

have the time) it'd be great to hear from Marcel Haas, Michael J.<br>

Roberts, and Chris Moody.<br>

<br>

Additionally, I am very interested in approaching this as an<br>

iterative, incremental process.  Minimum Viable Product (MVP) first,<br>

then improving as we learn and go along.<br>

<br>

= Background =<br>

<br>

As seen in this pull request:<br>

<br>

<a href="https://bitbucket.org/yt_analysis/yt-3.0/pull-request/11/initial-n-body-data-support" target="_blank">https://bitbucket.org/yt_<u></u>analysis/yt-3.0/pull-request/<u></u>11/initial-n-body-data-support</a><br>

<br>

I've implemented a first pass at N-body data support.  This proceeds<br>

through the following steps:<br>

<br>

   * Read header<br>

   * Create octree<br>

   * For each file in output:<br>

     * Add particles to octree<br>

     * When a cell crests a certain number of particles (or when<br>

particles from more than one file are co-located) refine<br>

   * Directly query particles to get data.<br>

<br>

As it stands, for N-body data this is very poor at memory<br>

conservation.  In fact, it's extremely poor.  I have however written<br>

it so that in the future, we can move to a distributed memory octree,<br>

which should alleviate difficulties.<br>

<br>

For quantitative analysis of quantities that are *in the output*, this<br>

already works.  It does not yet provide any kind of kernel or density<br>

estimator.  By mandating refinement based on file that the particles<br>

are in, load on demand operations can (more) efficiently read data.<br>

<br>

Many of the related items are outlined in the corresponding YTEP:<br>

<br>

<a href="https://yt.readthedocs.org/projects/ytep/en/latest/YTEPs/YTEP-0005.html" target="_blank">https://yt.readthedocs.org/<u></u>projects/ytep/en/latest/YTEPs/<u></u>YTEP-0005.html</a><br>

<br>

Currently, it works for OWLS data.  However, since there has been<br>

something of a Gadget fragmentation, there is considerable difficulty<br>

in "supporting Gadget" when that includes supporting the binary<br>

format.  I've instead taken the approach of attempting to isolate<br>

nearly all the functionality into classes so that, at worst, someone<br>

could implement a couple functions, supply them to the Gadget<br>

interface, and it will use those to read data from disk.  Some<br>

integration work on this is still to come, but on some level it will<br>

be a 'build your own' system for some particle codes.  OWLS data<br>

(which was run with Gadget) on the other hand is in HDF5, is<br>

self-describing, has a considerable number of metadata items<br>

affiliated with it, and even outputs the Density estimate for the<br>

hydro particles.<br>

<br>

Where I'm now stuck is how to proceed with handling SPH data.  By<br>

extension, this also applies somewhat to handling tesselation codes<br>

like AREPO and PHURBAS.<br>

<br>

= Challenges =<br>

<br>

Analyzing quantitative data, where fields can be generated from<br>

fully-local information, is not difficult.  We can in fact largely<br>

consider this to be done.  This includes anything that does not<br>

require applying a smoothing kernel.<br>

<br>

The hard part comes in in two areas:<br>

<br>

   * Processing SPH particles as fluid quantities<br>

   * Visualizing results<br>

<br>

Here are a few of our tools:<br>

<br>

   * Data selection and reading: we can get rectangular prisms and read<br>

them from disk relatively quickly.  Same for spheres.  We can also get<br>

boolean regions of these and read them from disk.<br>

   * Neighbor search in octrees.  Given Oct A at Level N, we can get<br>

all M (<=26) Octs that neighbor A at Levels <= N.<br>

   * Voronoi tesselations: we have a wrapper for Voro++ which is<br>

relatively performant, but not amazing.  The DTFE method provides a<br>

way to use this to our advantage, although this requires non-local<br>

information.  (The naive approach, of getting density by dividing the<br>

particle mass by the volume of the voronoi cell, does not give good<br>

agreement at least for the OWLS data.)<br>

<br>

Here's what we want to support:<br>

<br>

   * Quantitative analysis (profiles, etc)<br>

   * Visualization (Projections, slices, probably not VR yet)<br>

   * Speed<br>

<br>

== Gridding Data ==<br>

<br>

In the past, one approach we have explored has been to simply grid all<br>

of the values.  Assuming you have a well-behaved kernel and a large<br>

number of particles, you can calculate at every cell-center the value<br>

of the necessary fields.  If this were done, we could reduce the<br>

complexity of the problem *after* gridding the data, but it could<br>

potentially increase the complexity of the problem *before* gridding<br>

the data.<br>

<br>

For instance, presupposing we took our Octree and created blocks of<br>

8^3 (like FLASH), we could then analyze the resultant eulerian grid<br>

identically to other codes -- projections, slices, and so on would be<br>

done exactly as elsewhere.<br>

<br>

This does bring with it the challenge of conducting the smoothing<br>

kernel.  Not only are there a handful of free parameters in the<br>

smoothing kernel, but it also requires handling edge effects.<br>

Individual octs are confined to a single domain; neighbors, however,<br>

are not.  So we'd potentially be in a situation where to calculate the<br>

values inside a single Oct, we would have to read from many different<br>

files.  From the standpoint of implementing in yt, though, we could<br>

very simply implement this using the new IO functions in 3.0.  We<br>

already have a _read_particles function, so the _read_fluid function<br>

would identify the necessary region, read the particles, and then<br>

proceed to smooth them.  We would need to ensure that we were not<br>

subject to edge effects (something Daniel Price goes into some detail<br>

about in the SPLASH paper) which may prove tricky.<br>

<br>

It's been impressed upon me numerous times, however, that this is a<br>

sub-optimal solution.  So perhaps we should avoid it.<br>

<br>

== Not Gridding Data ==<br>

<br>

I spent some time thinking about this over the last little while, and<br>

it's no longer obvious to me that we need to grid the data at all even<br>

to live within the yt infrastructure.  What gridded data buys us is a<br>

clear "extent" of fluid influence (i.e., a particle's influence may<br>

extend beyond its center) and an easy way of transforming a fluid into<br>

an image.  The former is hard.  But I think the latter is easier than<br>

I realized before.<br>

<br>

Recent, still percolating changes in how yt handles pixelization and<br>

ray traversal should allow a simple mechanism for swapping out<br>

pixelization functions based on characteristics of the dataset.  A<br>

pixelization function takes a slice or projection data object (which<br>

is, in fact, an *actual* data object and not an image) and then<br>

converts them to an image.  For SPH datasets, we could mandate that<br>

the pixelization object be swapped out for something that can apply an<br>

SPH-appropriate method of pixelization (see Daniel Price's SPLASH<br>

paper, or the GigaPan paper, for a few examples).  We would likely<br>

also want to change out the Projection object, as well, and we may<br>

need to eliminate the ability to avoid IO during the pixelization<br>

procedure (i.e., the project-once-pixelize-many advantage of grid<br>

data).  But I think it would be do-able.<br>

<br>

So we could do this.  I think this is the harder road, but I think it<br>

is also possible.  We would *still* need to come up with a mechanism<br>

for applying the smoothing kernel.<br>

<br>

= Conclusions and Contributions =<br>

<br>

So I think we have two main paths forward -- gridding, and not<br>

gridding.  I would very, very much appreciate comments or feedback on<br>

either of these approaches.  I believe that, regardless of what we<br>

choose to do, we'll need to figure out how to best apply a smoothing<br>

kernel to data; I think this is probably going to be difficult.  But I<br>

am optimistic we are nearing the point of usability.  For pure N-body<br>

data, we are already there -- the trick is now to look at the data as<br>

fluids, not just particles.<br>

<br>

The best way to contribute help right now is to brainstorm and give<br>

feedback.  I'd love to hear ideas, criticisms, flaws in my logic, ways<br>

to improve things, and so on and so forth.  I think once we have an<br>

MVP, we will be better set up to move forward.<br>

<br>

If you'd like to add a reader for your own flavor of particle data, we<br>

should do that as well -- especially if you are keen to start<br>

exploring and experimenting.  The code so far can be found in the<br>

repository linked above.  I'd also appreciate comments (or acceptance)<br>

of the pull request.  I am also happy to hold a Hangout to discuss<br>

this, go over things, help set up readers or whatever.  I consider<br>

this a relatively high-priority project.  It will be a key feature of<br>

yt 3.0.<br>

<br>

(Speaking of which, today I've allotted time to focus on getting the<br>

last major patch-refinement feature done -- covering grids.  Hopefully<br>

patch-refinement people can test it out after that's done!)<br>

<br>

Thanks for your patience with such a long email!  :)<br>

<br>

-Matt<br>

______________________________<u></u>_________________<br>

yt-dev mailing list<br>

<a href="mailto:yt-dev@lists.spacepope.org" target="_blank">yt-dev@lists.spacepope.org</a><br>

<a href="http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org" target="_blank">http://lists.spacepope.org/<u></u>listinfo.cgi/yt-dev-spacepope.<u></u>org</a><br>

</blockquote>

<br>

______________________________<u></u>_________________<br>

yt-dev mailing list<br>

<a href="mailto:yt-dev@lists.spacepope.org" target="_blank">yt-dev@lists.spacepope.org</a><br>

<a href="http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org" target="_blank">http://lists.spacepope.org/<u></u>listinfo.cgi/yt-dev-spacepope.<u></u>org</a><br>

</div></div></blockquote></div><br></div></div>