[yt-users] error in the field SoundSpeed?

Matthew Turk matthewturk at gmail.com
Tue May 24 13:47:56 PDT 2011


Hi Luigi and everyone,

Off-list Luigi and I exchanged data, and I was able to reproduce the
problem.  I've separated my response into sections for easy
skipping-of-boring-bits.

= Background =

I think some background about how all the field stuff works is in
order; it's come up recently in a couple different contexts, and I
think it is showing its weaknesses.  The current method for field
definitions dates from sometime in late 2007, and has not been updated
substantially since that time.  (Looking around in the repo indicates
it was added in the 365th revision; we're now at around 4300
revisions.)  It's based around a design pattern called "The Borg" --
this is a python version of a "Singleton", such that *effectively*
only one instance of an object can exist.  So one could in theory
instantiate tons and tons of objects of class Whatever, and they would
all retain the same state; you could manipulate the OrionFieldInfo
affiliated with pf1 and it would affect the OrionFieldInfo affiliated
with pf2, even though they pointed to different memory locations.

So how this works currently is that for every field that *might* exist
in the data file, a dummy entry is appended to the field info
container specific to that type of output (Enzo, Orion, ART, etc.)
The code then detects which fields are in the data file (this is where
the problem starts) and then if it doesn't recognize them, it adds a
new dummy entry.  Of course, this works differently for different
codes, since not all codes use self-describing formats (i.e., HDF5 or
BoxLib) and some even have fixed sets of fields in each data file.

= The Problem =

The problem is that up until relatively recently, we did not allow for
appropriate fallbacks.  There were a number of circular field
definitions, and the actual field infos that were selected for each
data file were incorrectly specified, so you could end up with
incorrect conversions and whatnot.

When this problem was fixed, it became necessary to avoid circular
dependency analysis.  This is where the aliasing of Gas_Energy and
GasEnergy enters in.  The derived fields were taking precedent over
the non-derived fields, and the detection was being executed on the
original field info containers, not on the field info containers
affiliated with a given parameter file.  So they'd hit recursion
limits.  The idea was to remove the aliasing, which we had hoped would
not cause a problem, and pass the buck to Future Development down the
road.

This was, it seems, the wrong decision.

= The Solution, and a Roadmap =

The solution is, I think, is to rework FieldInfoContainers.  I have
begun doing this, and with my first couple passes at it I believe it
is giving good results.  I will need you to test it out, Luigi, before
we move forward.

However, a change like this is going to end up being disruptive.  We
broke API last fall in the unstable branch when we moved to the 2.0
API; I'm loathe to break it again so soon.  So any changes will be
API-compatible, which will make them a bit less invasive than I was
thinking.

So here's my roadmap.  I'm going to continue working on refactoring
out the fields, in this repository:

https://bitbucket.org/MatthewTurk/yt-newfields

in the branch "deliberate_fields"; I have committed an initial pass at
this that reduces the overall complexity of the system as well as
fixes the specific issues that Luigi reported.  These problems will
not be fixed in the main branch until we have both a new system for
fields that is as good as we can make it and we have a system for
automated binary diffing of the old results versus the new.  Luigi,
I've pasted a script here:

http://paste.enzotools.org/show/1666/

that demonstrates that ThermalEnergy now works and that data-on-disk
will replace generated fields.

We had originally plannet a release of yt 2.2 before the time that
this could be finished, which would have the new GUI (Reason v5, or
"Cinco Analysis Generator") as its centerpiece.  (We're very proud of
the GUI and want to get it in as many hands as possible.)  However,
it's entirely possible that we'll be able to finish all of these
agenda items, and so I think I'm going to shift the 2.2 schedule to
accommodate a rewrite of the field subsystem for clarity and reliance,
fixing the bug you reported but ensuring no regression on existing
fields.

What needs to be done for the 2.2 release, which would be a merging of
the development tip to stable, as well as updating modules etc in
various places:

 * Binary-diff testing mechanism (70% completed) needs to be finished.
 I would very much appreciate assistance with this, and I am more than
happy to give guidance and work on this together in IRC or over email.
 This is a very good potential first project for anybody reading who
wants to start developing!
 * Field info containers: this is one that I will likely take the lead
on, but I believe I have a good handle for how it goes.  That being
said, I would like to get some feedback before we go too far on how
the fields should be specified and how overrides should work.
 * GUI testing: If anyone out there would be willing or interested in
testing the GUI, this would be great.  I'm happy to send out
instructions on how to test it out.  Right now it's very functional,
has no dependencies outside some static (no compilation necessary)
files that come with the current yt install script, and it supports
interactive plotting, pylab, phase plots, evaluation, generic image
generation and so on, all over an SSH-tunneled web connection.

I'd very much appreciate any thoughts, comments, etc, and I'd love it
if people that are maybe interested in getting their feet wet with yt
development would stop by IRC or email yt-dev, so we can chat about
this sort of thing.  Working on testing is a pretty well-defined
problem, and it'd be a good way to familiarize yourself with the yt
codebase.

Anyway, I hope this resolution is alright.  Let me know what you
think; it's a tricky spot we're in, because the issue you're seeing is
actually caused by fixing a bug that showed up when we fixed another
bug, and reducing the complexity is the only sustainable path forward,
but it's also the most difficult.

Best wishes,

Matt

On Tue, May 24, 2011 at 10:39 AM, Luigi Iapichino
<luigi at ita.uni-heidelberg.de> wrote:
> Hi Matt,
>
> it looks like it is getting a bit complicated...
> Let's refer only to GasEnergy, for simplicity. Following your suggestion, I
> added to the file ~/.yt/my_plugins.py the lines
>
> def _convertEnergy(data):
>    return data.convert("x-velocity")**2.0
>
> def _GasEnergy(field, data):
>    return data["Gas_Energy"] / _convertEnergy(data)
> add_enzo_field("GasEnergy", function=_GasEnergy,
>           units=r"\rm{ergs}/\rm{g}", convert_function=_convertEnergy)
>
> (note the add_enzo_field, as you suggested). In this way, slicing GasEnergy
> finally worked, but >not< projecting it. I attach a simple script reproducing
> my error (assuming you have a test Enzo dataset with the field Gas_Energy in
> it).
> I hope that tracking this issue is useful for your development; you mentioned
> that future, major changes will address this problem anyway. If you prefer, I
> can simply go back to the stable branch for a quick fix, until some more
> efficient and general solution is found.
>
> Cheers,
>
>  Luigi
>
> On Tuesday 24 May 2011, Matthew Turk wrote:
>> Hi Luigi,
>>
>> What I think might be happening is that your field ("ThermalEnergy" is
>> the one that you can't access, right?) is being added to the universal
>> container, which is less specific than the Enzo-specific container;
>> ThermalEnergy is nominally defined in the Enzo-specific container, so
>> it wins.  Try changing add_field to add_enzo_field and see if that
>> helps out.  If you want to redefine a field that is *in* the data
>> output, that is trickier, as it requires also removing the item from
>> the list of 'in the data' fields with
>> "pf.h.field_list.pop(pf.h.field_list.index('whatever'))".
>>
>> The field stuff has grown to the point where it no longer scales in
>> the human dimension.  I think this will be a target of a cleanup in
>> the near future.  Britton and I have been working to expand testing
>> coverage and to move to a new scheme of binary-diff testing of
>> results; this will help us to refactor things like the field
>> containers.
>>
>> -Matt
>>
>> On Tue, May 24, 2011 at 5:36 AM, Luigi Iapichino
>>
>> <luigi at ita.uni-heidelberg.de> wrote:
>> > Hi Matt,
>> >
>> > thanks for the suggestions. I added to my file ~/.yt/my_plugins.py the
>> > following lines:
>> >
>> > def _ThermalEnergy(field, data):
>> >    if data.pf["HydroMethod"] == 2:
>> >        return data["Total_Energy"]
>> >    else:
>> >        if data.pf["DualEnergyFormalism"]:
>> >            return data["GasEnergy"]
>> >        else:
>> >            return data["Total_Energy"] - 0.5*(
>> >                   data["x-velocity"]**2.0
>> >                 + data["y-velocity"]**2.0
>> >                 + data["z-velocity"]**2.0 )
>> > add_field("ThermalEnergy", function=_ThermalEnergy,
>> >          units=r"\rm{ergs}/\rm{cm^3}")
>> >
>> > def _convertEnergy(data):
>> >    return data.convert("x-velocity")**2.0
>> >
>> > def _GasEnergy(field, data):
>> >     return data["Gas_Energy"] / _convertEnergy(data)
>> > add_field("GasEnergy", function=_GasEnergy,
>> >           units=r"\rm{ergs}/\rm{g}", convert_function=_convertEnergy)
>> >
>> >
>> > def _TotalEnergy(field, data):
>> >    return data["Total_Energy"] / _convertEnergy(data)
>> > add_field("TotalEnergy", function=_TotalEnergy,
>> >          display_name = "\mathrm{Total}\/\mathrm{Energy}",
>> >          units=r"\rm{ergs}/\rm{g}", convert_function=_convertEnergy)
>> >
>> > and I switched back to the unstable branch. Still, I have a problem (that
>> > I noticed also other times): what I put in my_plugins.py is added to yt,
>> > but it does not override the existing definitions. In my case, I get
>> > again an error, because yt takes the definition of ThermalEnergy from
>> > yt-unstable/src/yt-hg/yt/frontends/enzo/fields.py , rather than from
>> > my_plugins.py (I can even copy the definitions in my yt script, but it
>> > doesn't change the problem). Is there a way to arrange my script in such
>> > a way to give priority to the definitions in my_plugins.py?
>> >
>> > Cheers,
>> >
>> >  Luigi
>> >
>> > On Monday 23 May 2011, Matthew Turk wrote:
>> >> Hi Luigi,
>> >>
>> >> Yes -- in the past we mutually aliased the fields "GasEnergy" and
>> >> "Gas_Energy" to each other, so that if one was not found the other was
>> >> looked for.  Unfortunately, this aliasing caused infinite recursion
>> >> loops:
>> >>
>> >> https://bitbucket.org/yt_analysis/yt/changeset/258083307f14
>> >>
>> >> so we had to remove this dual-aliasing.  I had hoped it wouldn't cause
>> >> problems; I tried very hard with Sam to come up with a way around
>> >> this, so that we could leave the aliasing in.  Unfortunately without
>> >> reworking the field container, we could not.  I am hoping that by the
>> >> end of the calendar year we will be able to rework the field container
>> >> to be more flexible.  However, for now, it should suffice to add into
>> >> your plugins file the lines that were removed in the linked commit.
>> >> This also took place while we standardized in Enzo on GasEnergy rather
>> >> than Gas_Energy (although I personally preferred the hyphen :):
>> >>
>> >> https://groups.google.com/d/topic/enzo-dev/64j9OZEAzcU/discussion
>> >>
>> >> Sorry for the inconvenience -- let me know if the plugins fix doesn't
>> >> work and we'll figure something else out.
>> >>
>> >> -Matt
>> >>
>> >> On Mon, May 23, 2011 at 3:49 PM,  <luigi at ita.uni-heidelberg.de> wrote:
>> >> > Dear all,
>> >> >
>> >> > I want to point out that, in the unstable branch, any attempt of using
>> >> > the field SoundSpeed returns an error. I have the impression that yt
>> >> > looks for GasEnergy, but in my dataset I have Gas_Energy defined. In
>> >> > the stable branch everything works well. I hope it does not depend on
>> >> > the data I'm analysing (some old Enzo 1.0 output)...
>> >> > Cheers,
>> >> >
>> >> >  Luigi
>> >
>> > --
>> >
>> > ---------------------------------------------------------------
>> >
>> > Luigi Iapichino
>> > Zentrum fuer Astronomie der Universitaet Heidelberg
>> > Institut fuer Theoretische Astrophysik
>> > Albert-Ueberle-Str. 2, D-69120 Heidelberg, Germany
>> > Tel: +49 6221 548983, Fax: +49 6221 544221
>> > e-mail: luigi at ita.uni-heidelberg.de
>> > URL: http://www.ita.uni-heidelberg.de/~luigi/
>> > _______________________________________________
>> > yt-users mailing list
>> > yt-users at lists.spacepope.org
>> > http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
>>
>> _______________________________________________
>> yt-users mailing list
>> yt-users at lists.spacepope.org
>> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
>
>
>
> --
>
> ---------------------------------------------------------------
>
> Luigi Iapichino
> Zentrum fuer Astronomie der Universitaet Heidelberg
> Institut fuer Theoretische Astrophysik
> Albert-Ueberle-Str. 2, D-69120 Heidelberg, Germany
> Tel: +49 6221 548983, Fax: +49 6221 544221
> e-mail: luigi at ita.uni-heidelberg.de
> URL: http://www.ita.uni-heidelberg.de/~luigi/
>
> _______________________________________________
> yt-users mailing list
> yt-users at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
>
>



More information about the yt-users mailing list