[yt-dev] RFC: SPH plan in yt 3.0

Christopher Moody cemoody at ucsc.edu
Thu Jan 10 01:24:16 PST 2013


Hi Matt,
Thanks for this work!

I'm not sure which side I come down on (gridding vs. not gridding), but I
do think that ultimately we should address two things:
1. Be able to mandate the full SPH octree ahead of time. For ART, for
example, we have unorganized particles for stars and dark matter but a
defined octree for the hydro. Sometimes you might want the particle octree
that yt builds to match the hydro one. This will sometimes lead to large
cells with little gas but many particles.
2. Be able to operate over multiple pfs, or merge them. How should the
particle octree interact with the hydro one when both are present in a
single snapshot/instant?

chris


On Tue, Jan 8, 2013 at 11:45 PM, Nathan Goldbaum <nathan12343 at gmail.com> wrote:

> Hi Matt,
>
> Can you elaborate on why it will be so hard to implement the smoothing
> kernel?
>
> From my point of view, it's straightforward to smooth the particles onto
> grids or, if we avoid gridding, to splat the particles onto pixels according
> to the smoothing kernel.  While there are a number of possible choices of
> kernel, most codes (Gadget in particular) use some flavor of the cubic
> spline kernel, which has only one free parameter, the smoothing length,
> and that is written to disk along with the particle data.
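>
> For concreteness, here is a minimal sketch of the standard Monaghan
> (1992) cubic spline in 3D (not yt code; note that conventions differ,
> e.g. Gadget defines the smoothing length as the kernel's full support
> radius, so the normalization below is just one common choice):
>
>     import numpy as np
>
>     def cubic_spline_kernel(r, h):
>         """Monaghan (1992) cubic spline in 3D; vanishes for r >= 2h."""
>         q = np.asarray(r, dtype=np.float64) / h
>         W = np.zeros_like(q)
>         m1 = q < 1.0
>         m2 = (q >= 1.0) & (q < 2.0)
>         W[m1] = 1.0 - 1.5 * q[m1]**2 + 0.75 * q[m1]**3
>         W[m2] = 0.25 * (2.0 - q[m2])**3
>         return W / (np.pi * h**3)   # 3D normalization for this convention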
>
> As for whether to grid the data or not, I think gridding will be necessary
> for geometric selection of data.  One could in principle set up profiles
> over the whole simulation using the capabilities you've already
> completed, querying the fluid fields at the particle locations and
> calculating derived fields based on the fluid fields written to disk at
> each particle location.  However, it seems to me that calculating fields
> that require some sort of differencing, or calculating profiles over a
> geometric object, /requires/ a gridding step.
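>
> A toy illustration of that distinction (plain numpy, nothing yt-specific):
>
>     import numpy as np
>
>     # On a grid, a differenced field like |grad rho| is a stencil call:
>     rho = np.random.random((64, 64, 64))   # stand-in gridded density
>     gx, gy, gz = np.gradient(rho, 1.0 / 64)
>     grad_rho = np.sqrt(gx**2 + gy**2 + gz**2)
>
>     # On scattered SPH particles there is no such stencil; gradients
>     # must come from kernel-weighted sums over neighbors instead.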
>
> I think it should be possible to do most things without the expensive
> gridding, following your second suggestion for visualization based on the
> SPLASH algorithms, but I also think we should provide the capability to
> export to GDF or internally convert to gridded data in memory since it
> enables a lot of sophisticated analysis tasks that are more or less
> straightforward on grid data.
>
> That being said, I'm no SPH expert, so please correct me if I'm speaking
> out of turn here.
>
> Cheers,
>
> Nathan
>
>
> On 1/8/13 5:18 PM, Matthew Turk wrote:
>
>> Hi all,
>>
>> I am writing today to request comments and suggestions for supporting
>> SPH data in yt.  This is part of a broader effort -- which includes
>> the native Octree support -- in yt 3.0 to correctly support
>> non-gridpatch data.
>>
>> This email contains a lot of information about where we are, and I was
>> hoping to get some feedback from people who have more experience with
>> SPH data, Gadget, AREPO, etc.  In particular, (if they're around and
>> have the time) it'd be great to hear from Marcel Haas, Michael J.
>> Roberts, and Chris Moody.
>>
>> Additionally, I am very interested in approaching this as an
>> iterative, incremental process.  Minimum Viable Product (MVP) first,
>> then improving as we learn and go along.
>>
>> = Background =
>>
>> As seen in this pull request:
>>
>> https://bitbucket.org/yt_analysis/yt-3.0/pull-request/11/initial-n-body-data-support
>>
>> I've implemented a first pass at N-body data support.  This proceeds
>> through the following steps:
>>
>>    * Read header
>>    * Create octree
>>    * For each file in output:
>>      * Add particles to octree
>>      * When a cell exceeds a certain number of particles (or when
>> particles from more than one file are co-located), refine it (see the
>> sketch after this list)
>>    * Directly query particles to get data.
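>>
>> As a rough illustration of the loop above, here is a toy sketch of the
>> insertion/refinement step (not the actual yt octree; the multi-file
>> co-location criterion is omitted, and all names here are made up):
>>
>>     import numpy as np
>>
>>     class OctNode:
>>         """Toy point-region octree: refine when a cell exceeds n_max."""
>>         def __init__(self, center, width):
>>             self.center = np.asarray(center, dtype=np.float64)
>>             self.width = width       # cell edge length
>>             self.particles = []      # positions stored in this leaf
>>             self.children = None     # list of 8 children once refined
>>
>>         def insert(self, pos, n_max=8):
>>             if self.children is not None:
>>                 self._child_for(pos).insert(pos, n_max)
>>                 return
>>             self.particles.append(pos)
>>             if len(self.particles) > n_max:   # the refinement criterion
>>                 self._refine(n_max)
>>
>>         def _refine(self, n_max):
>>             hw = self.width / 4.0
>>             self.children = [OctNode(self.center + [hw if i & 1 else -hw,
>>                                                     hw if i & 2 else -hw,
>>                                                     hw if i & 4 else -hw],
>>                                      self.width / 2.0) for i in range(8)]
>>             old, self.particles = self.particles, []
>>             for p in old:                     # push particles down a level
>>                 self._child_for(p).insert(p, n_max)
>>
>>         def _child_for(self, pos):
>>             i = (int(pos[0] >= self.center[0])
>>                  | int(pos[1] >= self.center[1]) << 1
>>                  | int(pos[2] >= self.center[2]) << 2)
>>             return self.children[i]
>>
>>     # Usage: build from an (N, 3) array of positions in a unit box.
>>     root = OctNode([0.5, 0.5, 0.5], 1.0)
>>     for p in np.random.random((1000, 3)):
>>         root.insert(p)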
>>
>> As it stands, for N-body data this is very poor at memory
>> conservation; in fact, it's extremely poor.  I have, however, written
>> it so that in the future we can move to a distributed-memory octree,
>> which should alleviate these difficulties.
>>
>> For quantitative analysis of quantities that are *in the output*, this
>> already works.  It does not yet provide any kind of kernel or density
>> estimator.  By mandating refinement based on the file the particles
>> are in, load-on-demand operations can read data (more) efficiently.
>>
>> Many of the related items are outlined in the corresponding YTEP:
>>
>> https://yt.readthedocs.org/projects/ytep/en/latest/YTEPs/YTEP-0005.html
>>
>> Currently, it works for OWLS data.  However, since there has been
>> something of a Gadget fragmentation, there is considerable difficulty
>> in "supporting Gadget" when that includes supporting the binary
>> format.  I've instead taken the approach of isolating nearly all the
>> functionality into classes so that, at worst, someone could implement
>> a couple of functions, supply them to the Gadget interface, and it
>> will use those to read data from disk.  Some integration work on this
>> is still to come, but on some level it will be a 'build your own'
>> system for some particle codes.  OWLS data (which was run with
>> Gadget), on the other hand, is in HDF5, is self-describing, has a
>> considerable number of metadata items affiliated with it, and even
>> outputs the Density estimate for the hydro particles.
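>>
>> For reference, pulling the basics out of such a self-describing snapshot
>> is a few lines of h5py.  The dataset paths below follow the common
>> Gadget/OWLS HDF5 layout (exact names vary between Gadget variants), and
>> the file name is hypothetical:
>>
>>     import h5py
>>
>>     with h5py.File("snapshot_000.hdf5", "r") as f:
>>         pos  = f["/PartType0/Coordinates"][:]       # gas positions
>>         rho  = f["/PartType0/Density"][:]           # on-disk density estimate
>>         hsml = f["/PartType0/SmoothingLength"][:]   # smoothing lengths
>>         box  = f["/Header"].attrs["BoxSize"]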
>>
>> Where I'm now stuck is how to proceed with handling SPH data.  By
>> extension, this also applies somewhat to handling tessellation codes
>> like AREPO and PHURBAS.
>>
>> = Challenges =
>>
>> Analyzing quantitative data, where fields can be generated from
>> fully-local information, is not difficult.  We can in fact largely
>> consider this to be done.  This includes anything that does not
>> require applying a smoothing kernel.
>>
>> The hard part comes in two areas:
>>
>>    * Processing SPH particles as fluid quantities
>>    * Visualizing results
>>
>> Here are a few of our tools:
>>
>>    * Data selection and reading: we can get rectangular prisms and read
>> them from disk relatively quickly.  Same for spheres.  We can also get
>> boolean regions of these and read them from disk.
>>    * Neighbor search in octrees.  Given Oct A at Level N, we can get
>> all M (<=26) Octs that neighbor A at Levels <= N.
>>    * Voronoi tessellations: we have a wrapper for Voro++ which is
>> relatively performant, but not amazing.  The DTFE method provides a
>> way to use this to our advantage, although it requires non-local
>> information.  (The naive approach, of getting density by dividing the
>> particle mass by the volume of the Voronoi cell, does not give good
>> agreement, at least for the OWLS data; see the sketch after this list.)
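>>
>> For concreteness, the naive estimate mentioned in the last point looks
>> roughly like this sketch (using scipy here rather than our Voro++
>> wrapper; the function name is made up):
>>
>>     import numpy as np
>>     from scipy.spatial import Voronoi, ConvexHull
>>
>>     def naive_voronoi_density(positions, masses):
>>         """Density = particle mass / volume of its Voronoi cell."""
>>         vor = Voronoi(positions)
>>         density = np.full(len(positions), np.nan)
>>         for i, ri in enumerate(vor.point_region):
>>             region = vor.regions[ri]
>>             if len(region) == 0 or -1 in region:
>>                 continue          # unbounded cell: volume undefined
>>             volume = ConvexHull(vor.vertices[region]).volume
>>             density[i] = masses[i] / volume
>>         return density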
>>
>> Here's what we want to support:
>>
>>    * Quantitative analysis (profiles, etc)
>>    * Visualization (Projections, slices, probably not VR yet)
>>    * Speed
>>
>> == Gridding Data ==
>>
>> In the past, one approach we have explored has been to simply grid all
>> of the values.  Assuming you have a well-behaved kernel and a large
>> number of particles, you can calculate the value of the necessary
>> fields at every cell center.  If this were done, we could reduce the
>> complexity of the problem *after* gridding the data, but it could
>> potentially increase the complexity of the problem *before* gridding
>> the data.
>>
>> For instance, supposing we took our octree and created blocks of
>> 8^3 cells (like FLASH), we could then analyze the resultant Eulerian
>> grid identically to other codes -- projections, slices, and so on
>> would be done exactly as elsewhere.
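>>
>> A toy version of that smoothing step might look like the sketch below
>> (not yt code; the names are made up, and it deliberately ignores the
>> edge effects and multi-file reads discussed next):
>>
>>     import numpy as np
>>
>>     def W(r, h):   # 3D cubic spline, as sketched earlier in the thread
>>         q = r / h
>>         w = np.where(q < 1, 1 - 1.5*q**2 + 0.75*q**3,
>>                      np.where(q < 2, 0.25*(2 - q)**3, 0.0))
>>         return w / (np.pi * h**3)
>>
>>     def smooth_to_grid(pos, mass, h, field, n=64, box=1.0):
>>         """Scatter kernel-weighted `field` values onto an n^3 grid."""
>>         dx = box / n
>>         num = np.zeros((n, n, n))
>>         den = np.zeros((n, n, n))
>>         centers = (np.arange(n) + 0.5) * dx
>>         for p, m, hp, fp in zip(pos, mass, h, field):
>>             # cells that might fall inside the kernel support (2h)
>>             lo = np.maximum(((p - 2 * hp) / dx).astype(int), 0)
>>             hi = np.minimum(((p + 2 * hp) / dx).astype(int) + 1, n)
>>             X, Y, Z = np.meshgrid(centers[lo[0]:hi[0]],
>>                                   centers[lo[1]:hi[1]],
>>                                   centers[lo[2]:hi[2]], indexing="ij")
>>             r = np.sqrt((X - p[0])**2 + (Y - p[1])**2 + (Z - p[2])**2)
>>             w = m * W(r, hp)
>>             num[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] += w * fp
>>             den[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] += w
>>         return np.divide(num, den, out=np.zeros_like(num), where=den > 0)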
>>
>> This does bring with it the challenge of applying the smoothing
>> kernel.  Not only are there a handful of free parameters in the
>> smoothing kernel, but it also requires handling edge effects.
>> Individual octs are confined to a single domain; neighbors, however,
>> are not.  So we'd potentially be in a situation where to calculate the
>> values inside a single Oct, we would have to read from many different
>> files.  From the standpoint of implementing in yt, though, we could
>> very simply implement this using the new IO functions in 3.0.  We
>> already have a _read_particles function, so the _read_fluid function
>> would identify the necessary region, read the particles, and then
>> proceed to smooth them.  We would need to ensure that we were not
>> subject to edge effects (something Daniel Price discusses in some
>> detail in the SPLASH paper), which may prove tricky.
>>
>> It's been impressed upon me numerous times, however, that this is a
>> sub-optimal solution.  So perhaps we should avoid it.
>>
>> == Not Gridding Data ==
>>
>> I spent some time thinking about this over the last little while, and
>> it's no longer obvious to me that we need to grid the data at all even
>> to live within the yt infrastructure.  What gridded data buys us is a
>> clear "extent" of fluid influence (i.e., a particle's influence may
>> extend beyond its center) and an easy way of transforming a fluid into
>> an image.  The former is hard.  But I think the latter is easier than
>> I realized before.
>>
>> Recent, still-percolating changes in how yt handles pixelization and
>> ray traversal should allow a simple mechanism for swapping out
>> pixelization functions based on characteristics of the dataset.  A
>> pixelization function takes a slice or projection data object (which
>> is, in fact, an *actual* data object and not an image) and then
>> converts it to an image.  For SPH datasets, we could mandate that
>> the pixelization object be swapped out for something that applies an
>> SPH-appropriate method of pixelization (see Daniel Price's SPLASH
>> paper, or the GigaPan paper, for a few examples).  We would likely
>> also want to swap out the Projection object, and we may need to give
>> up the ability to avoid IO during the pixelization procedure (i.e.,
>> the project-once-pixelize-many advantage of grid data).  But I think
>> it would be doable.
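>>
>> As a rough sketch of what such a pixelization function would do, here
>> is a toy scatter in the spirit of SPLASH (not yt code; the names are
>> made up, and a real implementation would use the line-of-sight-
>> integrated 3D kernel rather than a 2D spline):
>>
>>     import numpy as np
>>
>>     def splat_projection(pos, mass, h, npix=512, box=1.0):
>>         """Scatter particle mass onto an image with a 2D cubic spline."""
>>         dx = box / npix
>>         image = np.zeros((npix, npix))
>>         centers = (np.arange(npix) + 0.5) * dx
>>         for (x, y, _z), m, hp in zip(pos, mass, h):
>>             x0 = max(int((x - 2 * hp) / dx), 0)
>>             x1 = min(int((x + 2 * hp) / dx) + 1, npix)
>>             y0 = max(int((y - 2 * hp) / dx), 0)
>>             y1 = min(int((y + 2 * hp) / dx) + 1, npix)
>>             X, Y = np.meshgrid(centers[x0:x1], centers[y0:y1],
>>                                indexing="ij")
>>             q = np.sqrt((X - x)**2 + (Y - y)**2) / hp
>>             w = np.where(q < 1, 1 - 1.5*q**2 + 0.75*q**3,
>>                          np.where(q < 2, 0.25*(2 - q)**3, 0.0))
>>             # 10/(7 pi h^2) is the 2D cubic spline normalization
>>             image[x0:x1, y0:y1] += m * w * 10.0 / (7.0 * np.pi * hp**2)
>>         return image   # approximate surface density (mass per area)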
>>
>> So we could do this.  I think this is the harder road, but I think it
>> is also possible.  We would *still* need to come up with a mechanism
>> for applying the smoothing kernel.
>>
>> = Conclusions and Contributions =
>>
>> So I think we have two main paths forward -- gridding, and not
>> gridding.  I would very, very much appreciate comments or feedback on
>> either of these approaches.  I believe that, regardless of what we
>> choose to do, we'll need to figure out how to best apply a smoothing
>> kernel to data; I think this is probably going to be difficult.  But I
>> am optimistic we are nearing the point of usability.  For pure N-body
>> data, we are already there -- the trick is now to look at the data as
>> fluids, not just particles.
>>
>> The best way to contribute help right now is to brainstorm and give
>> feedback.  I'd love to hear ideas, criticisms, flaws in my logic, ways
>> to improve things, and so on and so forth.  I think once we have an
>> MVP, we will be better set up to move forward.
>>
>> If you'd like to add a reader for your own flavor of particle data, we
>> should do that as well -- especially if you are keen to start
>> exploring and experimenting.  The code so far can be found in the
>> repository linked above.  I'd also appreciate comments (or acceptance)
>> of the pull request.  I am also happy to hold a Hangout to discuss
>> this, go over things, help set up readers or whatever.  I consider
>> this a relatively high-priority project.  It will be a key feature of
>> yt 3.0.
>>
>> (Speaking of which, today I've allotted time to focus on getting the
>> last major patch-refinement feature done -- covering grids.  Hopefully
>> patch-refinement people can test it out after that's done!)
>>
>> Thanks for your patience with such a long email!  :)
>>
>> -Matt