[yt-dev] Turn data serialization off by default

Cameron Hummels chummels at gmail.com
Thu Jul 25 11:53:36 PDT 2013


+1.  Thank you, Nathan.


On Thu, Jul 25, 2013 at 9:03 AM, Nathan Goldbaum <nathan12343 at gmail.com> wrote:

> Hi, sorry I wasn't clear, pickling directly will work. With serialization
> off, pf.h.save_object and pf.h.load_object will return without doing
> anything.
>
>
> On Thu, Jul 25, 2013 at 9:00 AM, David Collins <dcollins4096 at gmail.com> wrote:
>
>> Thanks for the examples.
>>
>> I'm a little unclear about your last statement-- will pickling the
>> objects directly work with serialization off?
>>
>> Thanks!
>> d.
>>
>>
>> On Thu, Jul 25, 2013 at 12:55 AM, Nathan Goldbaum <goldbaum at ucolick.org> wrote:
>>
>>> Hey David,
>>>
>>> I don't think you can modify the ytcfg object after loading up yt, so
>>> your second example won't work.
>>>
>>> As for your first example, I think that's possible via pickling:
>>>
>>> import cPickle
>>> with open('data.pickle', 'wb') as pkl_file:
>>>     cPickle.dump(proj, pkl_file, protocol=-1)
>>>
>>> You can then load it later like so:
>>>
>>> with open('data.pickle', 'rb') as pkl_file:
>>>     proj = cPickle.load(pkl_file)
>>>
>>> You can do similar things using pf.h.save_object() and load_object(),
>>> but in a bit of a chicken and egg situation, you'll need serialization
>>> turned on in your config parameters for that to work.
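
For reference, a minimal sketch of the pickling round trip described above,
written for Python 3, where cPickle has been folded into the standard pickle
module. The dictionary is only a hypothetical stand-in for a yt projection
object, and the temporary path is an assumption made so the snippet can run
anywhere.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a yt projection object (not real yt data).
proj = {"field": "Density", "axis": 0, "values": [1.0, 2.5, 4.0]}

# Write to a scratch directory so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "data.pickle")

# dump() writes the object to the open file; dumps() would return bytes.
with open(path, "wb") as pkl_file:
    pickle.dump(proj, pkl_file, protocol=pickle.HIGHEST_PROTOCOL)

# Load it back later, independently of any serialization setting.
with open(path, "rb") as pkl_file:
    restored = pickle.load(pkl_file)
```

The restored object is an equal copy, not the same object, so it can be
loaded in a later session without recomputing anything.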
>>>
>>> -Nathan
>>>
>>>
>>> On Tue, Jul 23, 2013 at 8:03 PM, David Collins <dcollins4096 at gmail.com> wrote:
>>>
>>>>
>>>> I'm +1 on changing the default.  Thanks for making an announcement about
>>>> the change.
>>>>
>>>> How hard would it be to make an individual routine get serialized on
>>>> demand?  For instance,
>>>> proj = pf.h.proj( ... serialize = True)
>>>>
>>>> Or, would it work to do
>>>>
>>>> ytcfg['yt', 'serialize'] = 'True'
>>>> do stuff
>>>> ytcfg['yt', 'serialize'] = 'False'
>>>> ?
>>>>
>>>> d.
>>>>
>>>>
>>>>
>>>> On Tue, Jul 23, 2013 at 6:20 PM, j s oishi <jsoishi at gmail.com> wrote:
>>>>
>>>>> Oh god...+100000000000 <sound of coins dinging in 8 bit glory>
>>>>> On Jul 23, 2013 7:08 PM, "Matthew Turk" <matthewturk at gmail.com> wrote:
>>>>>
>>>>>> On Tue, Jul 23, 2013 at 3:27 PM, Nathan Goldbaum <nathan12343 at gmail.com> wrote:
>>>>>> > Hi all,
>>>>>> >
>>>>>> > I've just issued a PR that will hopefully fix a whole class of buggy
>>>>>> > behavior that both new and experienced yt users commonly run into.
>>>>>> > Specifically, I'd like it if we could turn off data serialization by
>>>>>> > default.  This changes a long-lived default value in yt's
>>>>>> > configuration, so I wanted to bring this change to the attention of
>>>>>> > both the yt user and developer community.
>>>>>> >
>>>>>> > What is data serialization?  Currently, yt will save the result of
>>>>>> > certain expensive calculations, including projections, the structure
>>>>>> > of the grid hierarchy, and the list of fields present in the data.
>>>>>> > While this does have the beneficial effect of saving time when a user
>>>>>> > needs to repeatedly calculate these quantities on the same dataset, it
>>>>>> > has a number of features which lead to buggy, annoying behavior.
>>>>>> >
>>>>>> > Specifically, if I am developing my simulation code or repeatedly
>>>>>> > restarting my code, searching for a way to grind past a code crash, I
>>>>>> > will quite often regenerate the same simulation output file over and
>>>>>> > over, changing a line of code or switching out the value of a
>>>>>> > parameter each time.
>>>>>> >
>>>>>> > If yt's data serialization is turned on, it's likely that yt's
>>>>>> > visualizations will correspond to old versions of the data file.
>>>>>> > Since only certain operations are serialized, it's also possible for
>>>>>> > yt to get into an inconsistent state - one operation will show the
>>>>>> > current data file, while another operation will show an old version.
>>>>>> >
>>>>>> > It's possible to fix a bug in your code, but because yt is still
>>>>>> > loading the old data, you won't be able to tell that your bug is fixed
>>>>>> > until you realize that you have .yt and .harrays files littering your
>>>>>> > filesystem.
>>>>>> >
>>>>>> > I've personally wasted a lot of time due to yt's serialization
>>>>>> > 'feature' and denizens of our IRC channel and mailing list can attest
>>>>>> > to how often new users run into this behavior as well.
>>>>>> >
>>>>>> > My pull request only turns off serialization by default; it doesn't
>>>>>> > disable the capability completely.  Once the pull request is merged
>>>>>> > in, you can turn on serialization either by adding an entry to your
>>>>>> > config file:
>>>>>> >
>>>>>> > $ cat ~/.yt/config
>>>>>> >
>>>>>> > [yt]
>>>>>> > serialize = True
>>>>>> >
>>>>>> > Or on a per-script basis:
>>>>>> >
>>>>>> > from yt.config import ytcfg
>>>>>> > ytcfg['yt', 'serialize'] = 'True'
>>>>>> > from yt.mods import *
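
As a rough illustration of how an INI-style entry like the one above
interacts with a default and a per-script override, here is a sketch using
Python's standard configparser module. It does not use yt's own ytcfg
object; the built-in default of 'False' simply mirrors the proposed change
and is an assumption for the example.

```python
import configparser

# Contents of a user config file, as in the [yt] example above.
CONFIG_TEXT = """
[yt]
serialize = True
"""

parser = configparser.ConfigParser()

# Assumed built-in default: serialization is off unless the user asks for it.
parser.read_dict({"yt": {"serialize": "False"}})
default_value = parser.getboolean("yt", "serialize")

# A user config file overrides the built-in default.
parser.read_string(CONFIG_TEXT)
from_file = parser.getboolean("yt", "serialize")

# A per-script override stores the value as a string, like the ytcfg example.
parser.set("yt", "serialize", "False")
after_override = parser.getboolean("yt", "serialize")
```

The later source wins in each case: the file overrides the default, and the
in-script assignment overrides the file.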
>>>>>> >
>>>>>> > The pull request is here:
>>>>>> > https://bitbucket.org/yt_analysis/yt/pull-request/558
>>>>>> >
>>>>>> > I know several of you are big fans of this feature, so if you object
>>>>>> > to this change please leave a comment on the pull request so we can
>>>>>> > figure out a way forward.
>>>>>>
>>>>>> I think this is long overdue, for all the reasons you list.
>>>>>> Auto-serialization treated a lot of symptoms that we have since
>>>>>> improved, or that we should address more directly -- speed of
>>>>>> hierarchy construction, saving data that we want to retain, and
>>>>>> detecting fields.
>>>>>>
>>>>>> +1!
>>>>>>
>>>>>> -Matt
>>>>>>
>>>>>> >
>>>>>> > -Nathan
>>>>>> >
>>>>>> > _______________________________________________
>>>>>> > yt-dev mailing list
>>>>>> > yt-dev at lists.spacepope.org
>>>>>> > http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> -- Sent from a computer.
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> -- Sent from a computer.
>>
>>
>>
>
>
>


-- 
Cameron Hummels
Postdoctoral Researcher
Steward Observatory
University of Arizona
http://chummels.org