[yt-dev] Interacting with data in yt 3.0 (was Field units from code to code)

Casey W. Stark caseywstark at gmail.com
Mon Apr 2 10:54:17 PDT 2012


Sounds good.

On Mon, Apr 2, 2012 at 10:47 AM, Matthew Turk <matthewturk at gmail.com> wrote:

> Hi Casey,
>
> On Mon, Apr 2, 2012 at 1:01 PM, Casey W. Stark <caseywstark at gmail.com>
> wrote:
> > I think I forgot to reply -- Tuesday works for me and Wednesday is good
> > before 11 or after 12:30 Pacific.
> >
> > We can sort this out during the hangout, but which issue are we focusing on?
> > Is this more for the units system, renaming fields in the 3.0 branch, or the
> > dataset change? (or maybe something else that was mentioned, there were a
> > lot)
>
> How about 1:00PM pacific on Wednesday?  And I was thinking we'd work
> in yt-refactor and change up the fields.
>
> -Matt
>
> >
> > Best,
> > Casey
> >
> >
> > On Fri, Mar 30, 2012 at 12:59 PM, Matthew Turk <matthewturk at gmail.com>
> > wrote:
> >>
> >> On Fri, Mar 30, 2012 at 1:22 PM, Nathan Goldbaum <goldbaum at ucolick.org>
> >> wrote:
> >> >> 1) Get rid of accessing parameters with an implicit __getitem__ on the
> >> >> parameter file (i.e., pf["SomethingThatOnlyExistsInOneCode"]).  I'm
> >> >> +10 on this.
> >> >> 2) Move units into the .units object (I'm mostly with Casey on this,
> >> >> but I think it should be a part of the field_info object)
> >> >> 3) Have things like current_time, domain_dimensions and so on move
> >> >> into basic_info and make them dict objects.
> >> >>
> >> >> I think of those, I'm in favor of one and two, but somewhat opposed to
> >> >> #3.  Right now we have these attributes mandated for subclasses of
> >> >> StaticOutput:
> >> >
> >> > I'd say #3 is the least important.  I'd be fine with the dataset object
> >> > having some non-dict attributes that describe the nature of the dataset
> >> > rather than storing them all in a basic_info dict.  One thing to think
> >> > about: if we want to support pure-particle datasets, then we should drop the
> >> > notion of refine_by as a basic dataset attribute.
> >>
> >> I think whether refine_by sticks around depends on how we end up
> >> wanting to address fluid quantities in particle datasets.  One
> >> possibility for handling SPH data is to grid it, and while I don't
> >> want to lock us into that (myopic at best) I don't want to exclude it
> >> as an ultimate possibility.  But yes, in general, I agree.  As I have
> >> been working on the geometry refactor, the number of times refine_by
> >> is accessed has been going down, as for the most part it relies on (for
> >> instance) the projection code knowing how to handle data from grids,
> >> which has been pushed back onto the grids instead.  Now projections
> >> simply receive data that is ordered spatially, and that data is
> >> appropriately added.
> >>
> >> >
> >> >> With the geometry_refactor, I'd like to consolidate functionality into
> >> >> the main "dataset" object.  The geometry can still provide access to
> >> >> the individual grids (of course) but data objects, finding max,
> >> >> getting stats about the simulation, etc, should all go into the main
> >> >> dataset object, and the geometry handler can simply be created on the
> >> >> fly if necessary.
> >> >
> >> > Why not get access to objects through a geometry attribute that hangs
> >> > off of the dataset object?  If I wanted to instantiate a sphere object, I
> >> > would just do:
> >> >
> >> > sp = ds.geometry.sphere()
> >> >
> >> > This is pretty much the same as the pf.h.sphere() syntax in place right
> >> > now but allows for arbitrary selection embedded inside of the new geometry
> >> > code.
> >>
> >> That's how I was implementing it.  I just wasn't sure this was as
> >> clean.  Having the plots then hang off the geometry feels a little
> >> funny.
> >>
> >> Also, I don't think I explicitly commented on Casey's hangout
> >> suggestion -- I am in favor.  Could we do Tuesday afternoon (late
> >> morning CA time) or Wednesday same?
> >>
> >> -Matt
> >>
> >> >
> >> > Nathan Goldbaum
> >> > Graduate Student
> >> > Astronomy & Astrophysics, UCSC
> >> > goldbaum at ucolick.org
> >> > http://www.ucolick.org/~goldbaum
> >> >
> >> > On Mar 30, 2012, at 3:48 AM, Matthew Turk wrote:
> >> >
> >> >> In general, I agree with the idea Nathan put out.  (Also, I think this
> >> >> is a fine time to have a bikeshed discussion.  Many of the underlying
> >> >> assumptions about how yt works were laid out a long time ago.)  But,
> >> >> I'm not entirely sure I understand how different it would be --
> >> >> conceptually, yes, I see what you're getting at, that we'd have a set
> >> >> number of attributes.  In what I was thinking of for the geometry
> >> >> refactor so far I'm trying to get rid of the "hierarchy" as existing
> >> >> for every data set, and instead relying on what amounts to an
> >> >> object-finder and io-coordinator, which I'm calling a geometry
> >> >> handler.  It sounds like what you would like is:
> >> >>
> >> >> 1) Get rid of accessing parameters with an implicit __getitem__ on the
> >> >> parameter file (i.e., pf["SomethingThatOnlyExistsInOneCode"]).  I'm
> >> >> +10 on this.
> >> >> 2) Move units into the .units object (I'm mostly with Casey on this,
> >> >> but I think it should be a part of the field_info object)
> >> >> 3) Have things like current_time, domain_dimensions and so on move
> >> >> into basic_info and make them dict objects.
> >> >>
> >> >> I think of those, I'm in favor of one and two, but somewhat opposed to
> >> >> #3.  Right now we have these attributes mandated for subclasses of
> >> >> StaticOutput:
> >> >>
> >> >> refine_by
> >> >> dimensionality
> >> >> current_time
> >> >> domain_dimensions
> >> >> domain_left_edge
> >> >> domain_right_edge
> >> >> unique_identifier
> >> >> current_redshift
> >> >> cosmological_simulation
> >> >> omega_matter
> >> >> omega_lambda
> >> >> hubble_constant
> >> >>
> >> >> The only ones here that I think would be okay to move out of
> >> >> properties would be the cosmology items, and even those I'm -0 on
> >> >> moving.
> >> >>
> >> >> But, in general, the idea of moving from this two-stage system of
> >> >> parameter file (rather than dataset) and hierarchy (rather than an
> >> >> implicitly-handled geometry) is something I am in support of.  The
> >> >> geometry is something that should nearly *always* be handled by the
> >> >> backend, rather than by the user.  So having the library require
> >> >> pf.h.sphere(...) is less than ideal, since it's exposing something
> >> >> relatively unfortunate (that building a hundred thousand grid objects
> >> >> can take some time).
> >> >>
> >> >> The main ways that the static output is interacted with:
> >> >>
> >> >> * Parameter information specific to a simulation code
> >> >> * Properties that yt needs to know about
> >> >> * To get at the hierarchy
> >> >> * Input to plot collections
> >> >>
> >> >> The main ways that the hierarchy is interacted with:
> >> >>
> >> >> * Getting data objects
> >> >> * Finding max
> >> >> * Statistics about the simulation
> >> >> * Inspecting individual grids (much less common use case now than it
> >> >> was before)
> >> >>
> >> >> All of these use cases are still valid, but I think it's clear that
> >> >> accessing individual grids and accessing simulation-specific
> >> >> parameters are not "generic" functions.  What a lot of this discussion
> >> >> has really brought up for me is that we're talking about *generic*
> >> >> functionality, not code-specific functionality, and we right now do
> >> >> not have the best enumeration of functionality and where it lies.
> >> >>
> >> >> With the geometry_refactor, I'd like to consolidate functionality into
> >> >> the main "dataset" object.  The geometry can still provide access to
> >> >> the individual grids (of course) but data objects, finding max,
> >> >> getting stats about the simulation, etc, should all go into the main
> >> >> dataset object, and the geometry handler can simply be created on the
> >> >> fly if necessary.
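A minimal sketch of the "created on the fly" idea, in plain Python rather than yt itself. The class names `Dataset` and `GeometryHandler`, the `instances` counter, and the dict returned by `sphere` are all hypothetical and exist only to show that the handler is built on first use and then reused:

```python
from functools import cached_property


class GeometryHandler:
    """Hypothetical stand-in for the object-finder / io-coordinator."""

    instances = 0  # counts constructions, only to make the laziness visible

    def __init__(self, dataset):
        GeometryHandler.instances += 1
        self.dataset = dataset

    def sphere(self, center, radius):
        # A real handler would select data; this returns a description only.
        return {"type": "sphere", "center": center, "radius": radius}


class Dataset:
    """Hypothetical dataset object that owns the user-facing API."""

    def __init__(self, filename):
        self.filename = filename

    @cached_property
    def geometry(self):
        # Built on first access and cached on the instance afterwards.
        return GeometryHandler(self)

    def sphere(self, center, radius):
        # Users call ds.sphere(...); the geometry stays an implementation detail.
        return self.geometry.sphere(center, radius)
```

With this layout, constructing a `Dataset` costs nothing geometry-related until the first data object is requested.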
> >> >>
> >> >> This brings up two points, though --
> >> >>
> >> >> 1) Does our method of instantiating objects still hold up?  i.e.,
> >> >> ds.sphere(...) and so on?  Or does our dataset object then become
> >> >> overcrowded?  I would also like to move *all* plotting objects into
> >> >> whatever we end up deciding is the location data containers come from,
> >> >> which for instance could look like ds.plot("slice", "x") (for
> >> >> instance, although we can bikeshed that later), which would return a
> >> >> plot window.
> >> >> 2) Datasets and time series should behave, if not identically, at
> >> >> least consistently in their APIs.  Moving to a completely ds-mediated
> >> >> mechanism for generating, accessing and inspecting data opens up the
> >> >> ability to then construct very nice and simple proxy objects.  As an
> >> >> example, while something like this is technically possible with
> >> >> the current Time Series API, it's a bit tricky:
> >> >>
> >> >> ts = TimeSeriesData.from_filenames(...)
> >> >> plot = ts.plot("slice", "x", (100.0, 'au'))
> >> >> ts.seek(dt = (100, 'years'))
> >> >> plot.save()
> >> >> ts.seek(dt = (10, 'years'))
> >> >> plot.save()
> >> >>
> >> >> (The time-slider, as Tom likes to call it ...)
> >> >>
> >> >> In general, this idea of moving toward more thoughtful
> >> >> dataset-construction, rather than the hokey parameter file + hierarchy
> >> >> construction brings with it a mindset shift which I'd like to spread
> >> >> to the time series, which can continue to be a focus.
> >> >>
> >> >> What do you think?
> >> >>
> >> >> -Matt
> >> >>
> >> >> On Thu, Mar 29, 2012 at 7:08 PM, Casey W. Stark <caseywstark at gmail.com>
> >> >> wrote:
> >> >>> +1 on datasets, although I would like to see the unit object(s) at the
> >> >>> field level.
> >> >>>
> >> >>>
> >> >>> On Thu, Mar 29, 2012 at 4:04 PM, Cameron Hummels
> >> >>> <chummels at astro.columbia.edu> wrote:
> >> >>>>
> >> >>>> +1 on datasets.
> >> >>>>
> >> >>>>
> >> >>>> On 3/29/12 6:58 PM, Nathan Goldbaum wrote:
> >> >>>>>
> >> >>>>> +1.  I'd also be up to help out with the sprint.  Doing a virtual sprint
> >> >>>>> using a google hangout might help mitigate some of the distance problems.
> >> >>>>>
> >> >>>>> While we're bringing up Enzo-isms that we should get rid of, I think it
> >> >>>>> might be a good idea to make a conceptual shift in the basic python UI.
> >> >>>>> Instead of referring to the interface between the user and the data as a
> >> >>>>> parameter file, I think we should be talking about datasets.  One
> >> >>>>> would instantiate a dataset just like we do now with parameter files:
> >> >>>>>
> >> >>>>> ds = load(filename)
> >> >>>>>
> >> >>>>> A dataset would also have some universal attributes which would present
> >> >>>>> themselves to the user as a dict, e.g. ds.units, ds.parameters,
> >> >>>>> ds.basic_info (like current_time, timestep, filename, and simulation code),
> >> >>>>> and ds.hierarchy (not sure how that would interfere with the geometry
> >> >>>>> refactor).
> >> >>>>>
> >> >>>>> This may be a painting-the-bike-shed discussion, but I think this shift
> >> >>>>> will help new users understand how to access their data.  Thoughts?
> >> >>>>>
> >> >>>>> On Mar 29, 2012, at 3:40 PM, Matthew Turk <matthewturk at gmail.com> wrote:
> >> >>>>>
> >> >>>>>> Hi Nathan and Casey,
> >> >>>>>>
> >> >>>>>> I agree with what both of you have said.  The Orion/Nyx units should
> >> >>>>>> be made to be consistent, but more importantly I think we should
> >> >>>>>> continue breaking away from Enzo-isms in the code.
> >> >>>>>>
> >> >>>>>> As it stands, all of the universal fields call underlying Enzo-named
> >> >>>>>> aliases -- Density, ThermalEnergy, etc.  I hope we can have a 3.0
> >> >>>>>> out within a calendar year, hopefully by the end of this year.  (I've
> >> >>>>>> been pushing on the geometry refactor, although recently other efforts
> >> >>>>>> have been paying off, which has decreased my output there.)  I am much,
> >> >>>>>> much less doubtful than Casey is that we can do this; in fact, I'm
> >> >>>>>> completely in favor of this and I think it would be relatively
> >> >>>>>> straightforward to implement.
> >> >>>>>>
> >> >>>>>> In the existing system we have a mechanism for aliasing fields.  What
> >> >>>>>> we can do is provide an additional translation system where we
> >> >>>>>> enumerate the fields that are available for items in UniversalFields,
> >> >>>>>> and then construct aliases to those.  This would mean changing what is
> >> >>>>>> aliased in existing non-Enzo frontends, and adding aliases in Enzo.
> >> >>>>>> The style of name Casey proposes is what I would also agree with:
> >> >>>>>> underscores, lower case, and erring on the side of verbosity.  The
> >> >>>>>> fields off hand that we would need to do this for (in their current
> >> >>>>>> Enzo-isms):
> >> >>>>>>
> >> >>>>>> x-velocity => velocity_x (same for y, z)
> >> >>>>>> Density => density
> >> >>>>>> TotalEnergy => ?
> >> >>>>>> GasEnergy => thermal_energy_specific (and thermal_energy_density)
> >> >>>>>> Temperature => temperature
> >> >>>>>>
> >> >>>>>> and so on.
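The translation layer described above could be sketched as a plain lookup table. This is an illustration only, not yt's aliasing mechanism: `FIELD_ALIASES`, `resolve_field`, and `legacy_names` are hypothetical names, and TotalEnergy is left out because the thread does not settle on a target name for it.

```python
# Hypothetical alias table: Enzo-style names on the left, proposed universal
# names on the right.  Only mappings listed in the thread appear here.
# GasEnergy is mapped to the specific (per unit mass) form; the thread also
# mentions thermal_energy_density as a companion field.
FIELD_ALIASES = {
    "x-velocity": "velocity_x",
    "y-velocity": "velocity_y",
    "z-velocity": "velocity_z",
    "Density": "density",
    "GasEnergy": "thermal_energy_specific",
    "Temperature": "temperature",
}


def resolve_field(name, aliases=FIELD_ALIASES):
    """Return the universal name for a legacy field, or the name unchanged."""
    return aliases.get(name, name)


def legacy_names(universal, aliases=FIELD_ALIASES):
    """Return every legacy name that maps onto one universal name, sorted."""
    return sorted(old for old, new in aliases.items() if new == universal)
```

Unknown names pass through untouched, so a frontend could adopt the new names one field at a time.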
> >> >>>>>>
> >> >>>>>> Once we have these aliases in place, an overall cleanup of
> >> >>>>>> UniversalFields should take place.  One place we should clean up is
> >> >>>>>> ensuring that there are no conditionals; rather than conditionals
> >> >>>>>> inside the functions, we should place those conditionals inside the
> >> >>>>>> parameter file types.  So for instance, if you have a field that is
> >> >>>>>> calculated differently depending on the parameter HydroMethod (in Enzo
> >> >>>>>> for instance) you simply set a validator on the field requiring the
> >> >>>>>> parameter be set to a particular value, and then only the field which
> >> >>>>>> satisfies that validator will be called when requested.
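A rough sketch of the validator idea, assuming a registry that may hold several definitions of one field. `FieldRegistry`, `add_field`, `get`, the field name `example_field`, and the HydroMethod values 0 and 1 are all made up for illustration and do not reflect yt's field code or Enzo's actual parameter meanings:

```python
class FieldRegistry:
    """Hypothetical registry holding several candidate functions per field."""

    def __init__(self):
        self._candidates = {}

    def add_field(self, name, function, required=None):
        # `required` maps parameter names to the values this candidate needs.
        entry = (required or {}, function)
        self._candidates.setdefault(name, []).append(entry)

    def get(self, name, parameters, data):
        # Call the first candidate whose required parameters all match, so
        # the field functions themselves contain no conditionals.
        for required, function in self._candidates.get(name, []):
            if all(parameters.get(k) == v for k, v in required.items()):
                return function(data)
        raise KeyError(f"no valid definition of {name!r} for these parameters")


registry = FieldRegistry()
# Two definitions of the same field, each valid for one parameter value.
registry.add_field("example_field", lambda d: d["a"],
                   required={"HydroMethod": 0})
registry.add_field("example_field", lambda d: d["b"],
                   required={"HydroMethod": 1})
```

The branch on HydroMethod lives in the registration calls, which is where a frontend would declare it.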
> >> >>>>>>
> >> >>>>>> So we've gotten rid of a bunch of enzo-isms in the parameter files;
> >> >>>>>> after fields, what else can we address?  And, I'd be up for sprinting
> >> >>>>>> on this (which should take just a few hours) basically any time next
> >> >>>>>> week or after.  I'd also be up for talking more about geometry
> >> >>>>>> refactoring, if anyone is interested, but it's not quite to the point
> >> >>>>>> that I think I am satisfied enough with the architecture to request
> >> >>>>>> input / contributions.  Sometimes (especially with big architectural
> >> >>>>>> things like this) I think it's a shame we do all of our work
> >> >>>>>> virtually, as I think a lot of this would be easier to bang out in
> >> >>>>>> person for a couple hours.
> >> >>>>>>
> >> >>>>>> -Matt
> >> >>>>>>
> >> >>>>>> On Wed, Mar 28, 2012 at 6:14 PM, Casey W. Stark <caseywstark at gmail.com> wrote:
> >> >>>>>>>
> >> >>>>>>> Hi Nathan.
> >> >>>>>>>
> >> >>>>>>> I'm also worried about this and I agree that fields with the same name
> >> >>>>>>> should all be consistent. I would support some sort of cleanup of frontend
> >> >>>>>>> fields, and I can get the Nyx fields in line and help with Enzo.
> >> >>>>>>>
> >> >>>>>>> I doubt we can do this, but I would prefer changing the field names as part
> >> >>>>>>> of the Enzo-ism removal and geometry handling refactoring pushes. For
> >> >>>>>>> instance, the field in Orion could be thermal_energy_density and the field
> >> >>>>>>> in Enzo could be specific_thermal_energy. I also noticed this issue when I
> >> >>>>>>> was using "Density" in Enzo (proper density in cgs) and "density" in Nyx
> >> >>>>>>> (comoving density in cgs).
> >> >>>>>>>
> >> >>>>>>> Best,
> >> >>>>>>> Casey
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> On Wed, Mar 28, 2012 at 1:47 PM, Nathan Goldbaum <goldbaum at ucolick.org> wrote:
> >> >>>>>>>>
> >> >>>>>>>> Hi all,
> >> >>>>>>>>
> >> >>>>>>>> On IRC today we noticed that Orion defines its ThermalEnergy field per
> >> >>>>>>>> unit volume but Enzo and FLASH define ThermalEnergy per unit mass.  Is this
> >> >>>>>>>> a problem?  Since yt defaults to the Enzo field names, should we try to make
> >> >>>>>>>> sure that all fields are defined using the same units as in Enzo?  Is there
> >> >>>>>>>> a convention for how different codes should define derived fields that are
> >> >>>>>>>> aliased to Enzo fields?
> >> >>>>>>>>
> >> >>>>>>>> One problem for this particular example is that the Pressure field is
> >> >>>>>>>> defined in terms of ThermalEnergy in universal_fields.py so the units of
> >> >>>>>>>> ThermalEnergy become important if a user merely wants the gas pressure in
> >> >>>>>>>> the simulation.
> >> >>>>>>>>
> >> >>>>>>>> One possible solution for this issue would be the units overhaul we're
> >> >>>>>>>> planning. If all fields are associated with a unit object, we can simply
> >> >>>>>>>> query the units to ensure that units are taken care of correctly and
> >> >>>>>>>> code-to-code comparisons aren't sensitive to the units chosen for fields in
> >> >>>>>>>> the frontend.
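To make the "query the units" idea concrete, here is a toy sketch that tells energy per unit volume apart from energy per unit mass by its dimensions and converts the former by dividing by density. `Quantity`, the dimension constants, and `as_specific_energy` are hypothetical; this is not the unit object the overhaul would introduce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A value with (mass, length, time) exponents; a toy unit object."""

    value: float
    dims: tuple

    def __truediv__(self, other):
        # Dividing quantities subtracts the dimension exponents.
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)


ENERGY_PER_MASS = (0, 2, -2)     # erg/g    = cm^2 s^-2
ENERGY_PER_VOLUME = (1, -1, -2)  # erg/cm^3 = g cm^-1 s^-2
MASS_DENSITY = (1, -3, 0)        # g/cm^3


def as_specific_energy(thermal_energy, density):
    """Return thermal energy per unit mass, whichever form was supplied."""
    if thermal_energy.dims == ENERGY_PER_MASS:
        return thermal_energy
    if thermal_energy.dims == ENERGY_PER_VOLUME:
        return thermal_energy / density
    raise ValueError(f"unexpected dimensions {thermal_energy.dims}")
```

A derived field such as Pressure could call a helper like this and stop caring which convention the frontend chose.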
> >> >>>>>>>>
> >> >>>>>>>> Personally, I think it would be best if we could make sure that all of the
> >> >>>>>>>> fields aliased to Enzo fields have the same units.
> >> >>>>>>>>
> >> >>>>>>>> Nathan Goldbaum
> >> >>>>>>>> Graduate Student
> >> >>>>>>>> Astronomy & Astrophysics, UCSC
> >> >>>>>>>>
> >> >>>>>>>> goldbaum at ucolick.org
> >> >>>>>>>> http://www.ucolick.org/~goldbaum
> >> >>>>>>>>
> >> >>>>>>>> _______________________________________________
> >> >>>>>>>> yt-dev mailing list
> >> >>>>>>>> yt-dev at lists.spacepope.org
> >> >>>>>>>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org