[Yt-dev] Field definitions, derived fields, whats-in-a-file and the "deliberate_fields" branch

Wed Nov 9 12:46:38 PST 2011

Hi Jeff,

On Wed, Nov 9, 2011 at 3:25 PM, j s oishi <jsoishi at gmail.com> wrote:
> Hi Matt,
>
> This sounds like a much needed overhaul. However, I'm not quite clear
> on exactly what this will entail, or how it will work once
> implemented. Could you or Casey provide an example of a new field or
> two, demonstrating how these dictionaries, fallbacks, and Null
> functions work? I think this is likely a very simple thing, but I'm
> having trouble visualizing it.

Sure!  For the most part, the changes will *all* be internal.  The
user should see very little change, if any.  The idea is that if I ran
a simulation, or wrote a simulation code, that had the field "Phi" in
it, I would write something like:

add_field("Phi", function = lambda a, b: None, ...)

Now the process is much more explicit.  The problem with the above
statement is that it doesn't specify whether the field is known to
sometimes exist or whether it can be generated; worse, it doesn't make
clear when to generate the field and when to use the one on disk.
Furthermore, as it currently stands, if we generate this field and
then find one in the file with the same name, it's not always clear
how to set it up -- where do units come from, and so on.

So the new definition would look like:

add_enzo_field("Phi", function = NullFunc, ...)

This does two things.  It swaps out the "add_field" which added to a
global dictionary for one specific to the simulation frontend.  It
also uses NullFunc, which is an actual honest-to-gosh object with a
specific identity (i.e., we can test "function is NullFunc").  The old
system swapped things globally, shared state between unrelated
objects, and so on.  The basic problem was that it was clunky and the
derived / known fields were not clearly separated.

-Matt

>
> thanks,
>
> j
>
> On Wed, Nov 9, 2011 at 9:05 AM, Matthew Turk <matthewturk at gmail.com> wrote:
>> Hi all,
>>
>> Over the last couple months, Casey and I have been working -- on and
>> off! -- on a new branch of the code called "deliberate_fields."  This
>> branch will change, in a substantial but easy-to-update way, how
>> fields are handled in yt.
>>
>> I recognize this email is long.  But if you use non-standard fields, a
>> bunch of derived fields, unit modifications, any of that, it may
>> affect you.  So I *please* ask that you read it and, if you like,
>> contribute back to the discussion.
>>
>> This is one of the items I really want to have done for a hypothetical
>> 2.3 release.
>>
>> = Background =
>>
>> The way fields work currently was designed a bit haphazardly.  They
>> use FieldInfoContainers, objects which share state and which contain
>> unions of the known derived fields and the known IO-based fields.  One
>> of the problems with this is that the only thing that separates a
>> derived field from a known field is the function that generates the
>> field: the IO-based fields all use a lambda which returns None, and
>> the non-IO based fields return actual fields.  This is pretty
>> sub-optimal, and it actually lands us in trouble when (for instance)
>> we have fields wandering around named things like "Thermal_Energy" and
>> "ThermalEnergy"; the mechanism by which one is selected and the other
>> not is problematic, and to get around infinite recursion, hacks have
>> had to be applied.
>>
>> As it stands, to find a field, the shared-state "field info" on a
>> parameter file is queried; this then will try to check universal
>> fields.  But because of how the fields are stored, the field info
>> cascade can also operate in reverse.  The big problem is that the
>> field selection mechanism doesn't seem to have a bus factor >= 1.0.
>> And, it has a number of hacks to make it work with conflicting field
>> definitions and the like.
>>
>> Unfortunately, layering these hacks on top of each other makes it much
>> harder for other codes to be supported; translations are not reliable,
>> and sometimes cause too many levels of recursion to be added.
>> Something simpler is necessary.
>>
>> = What this does =
>>
>> Essentially, this creates multi-level, explicit fallbacks.  The field
>> info container, which was a bloated, weird shared state object, is now
>> simply a dictionary subclass with a "fallback" option.  When you
>> create one, you can either create it in isolation (with no fallback)
>> or with a fallback.  When you query it, if it does not have a field,
>> it checks its fallback.  There are, additionally, two new functions
>> for IO: the translation function and the null function.  The first is
>> to translate, for instance, "density" to "Density" and the second is
>> to indicate that a field is expected to be found in an output from the
>> simulation code.
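
A minimal sketch of the fallback lookup described above, written in
modern Python.  The class body and the TranslationFunc signature are
assumptions for illustration, not the implementation in the pull
request, and plain strings stand in for real field definitions.

```python
class FieldInfoContainer(dict):
    """Dictionary of field definitions with an optional fallback container."""

    def __init__(self, fallback=None):
        super().__init__()
        self.fallback = fallback

    def __missing__(self, key):
        # dict.__getitem__ calls this only when the key is not stored locally.
        if self.fallback is None:
            raise KeyError(key)
        return self.fallback[key]

    def __contains__(self, key):
        if dict.__contains__(self, key):
            return True
        return self.fallback is not None and key in self.fallback


def TranslationFunc(field_name):
    """Build a field function that just returns another field's data."""
    def _translate(field, data):
        return data[field_name]
    return _translate


universal = FieldInfoContainer()                 # created in isolation
universal["CoolingTime"] = "universal derived definition"

code_specific = FieldInfoContainer(fallback=universal)
code_specific["Phi"] = "code-specific definition"

print(code_specific["Phi"])          # found locally
print(code_specific["CoolingTime"])  # found through the fallback

# "density" simply forwards to whatever is stored as "Density".
density_alias = TranslationFunc("Density")
print(density_alias(None, {"Density": 1.0}))     # 1.0
```

A lookup that misses at every level still raises KeyError, so a
missing field fails loudly instead of silently returning None.
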
>>
>> There are now affiliated with each simulation code two field info
>> objects: the "known" fields, which may appear in files, and the
>> non-known (i.e., code-specific derived) fields.  These live as the
>> attributes _fieldinfo_known and _fieldinfo_fallback, respectively, on the
>> StaticOutput subclass corresponding to a simulation code.  When the
>> Hierarchy (not static output) is instantiated, the first step is to
>> create a new field_info object.  This has, as a fallback, the
>> _fieldinfo_fallback, which itself has as a fallback the
>> universally-known derived fields.  The hierarchy then queries the
>> output file for which fields are available.  This process then looks
>> for a corresponding field in _fieldinfo_known, and if it finds it, it
>> adds it to the field_info object, *overriding* any possible derived
>> fields.  (In this manner, for instance, yt will not recalculate a
>> "CoolingTime" field if one exists in the output.)
>>
>> = What it aims to do in the future =
>>
>> This will be utilized in four main ways:
>>
>> 1) Making it more clear which fields belong to which code, and which
>> come from disk and which are derived
>> 2) Help move IO into fields, to optimize for geometries and data containers
>> 3) Make units more clear and specific
>> 4) This is all designed around better supporting the GDF.
>>
>> = Where from here? =
>>
>> It would be hugely beneficial if you could test this and report back.
>> I have created a pull request:
>>
>> https://bitbucket.org/yt_analysis/yt/pull-request/27/field-overhaul-to-utilize-explicit
>>
>> This is by no means a settled matter; I think we need to have testing
>> on this, buy-in from developers and users, and to make sure that old
>> code doesn't break.  The test cases all pass for me for Enzo.
>>
>> Before this can be merged, I would hope we can get some testing from:
>>
>>  * Enzo
>>  * Nyx
>>  * FLASH
>>  * Orion
>>
>> and any other codes that can hear me.
>>
>> Thanks very much for your time; please let me know if you have any
>> questions, concerns, jokes, comments, improvements, CDs of your band,
>> suggestions, and so on.  For a change this major I'd like to keep
>> discussion on list, so the record of this is a bit more prominent.
>>
>> Best,
>>
>> Matt
>> _______________________________________________
>> Yt-dev mailing list
>> Yt-dev at lists.spacepope.org
>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>