[yt-dev] Turn data serialization off by default

Matthew Turk matthewturk at gmail.com
Thu Jul 25 09:03:43 PDT 2013


On Thu, Jul 25, 2013 at 9:00 AM, David Collins <dcollins4096 at gmail.com> wrote:
> Thanks for the examples.
>
> I'm a little unclear about your last statement-- will pickling the objects
> directly work with serialization off?

Pickling should, yes.  Calling save_data I think will not.

From your original examples, I think we absolutely need to move to the
place where we manually save projections -- I think that we all are in
support of this being available, it's just the default that's a
bummer.  :)
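
For what it's worth, manually saving a result could look roughly like the
sketch below. It assumes the object pickles cleanly. The dict is only a
stand-in for a projection, and save_result / load_result are hypothetical
helper names, not yt functions. It uses the Python 3 pickle module where
the examples in this thread use cPickle.

```python
import os
import pickle
import tempfile


def save_result(obj, path):
    """Write obj to path using the highest pickle protocol available."""
    with open(path, "wb") as pkl_file:
        pickle.dump(obj, pkl_file, protocol=pickle.HIGHEST_PROTOCOL)


def load_result(path):
    """Read back whatever object was pickled to path."""
    with open(path, "rb") as pkl_file:
        return pickle.load(pkl_file)


# Stand-in for a projection object (assumption: the real object is picklable).
fake_proj = {"axis": 0, "field": "Density", "values": [1.0, 2.5, 4.0]}

# Save explicitly, on demand, instead of relying on automatic serialization.
path = os.path.join(tempfile.mkdtemp(), "proj.pickle")
save_result(fake_proj, path)
restored = load_result(path)
```

The point of doing it this way is that nothing is written to disk unless
the script asks for it, so a regenerated output file can't be shadowed by
a stale saved copy.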

-Matt

>
> Thanks!
> d.
>
>
> On Thu, Jul 25, 2013 at 12:55 AM, Nathan Goldbaum <goldbaum at ucolick.org>
> wrote:
>>
>> Hey David,
>>
>> I don't think you can modify the ytcfg object after loading up yt, so your
>> second example won't work.
>>
>> As for your first example, I think that's possible via pickling:
>>
>> import cPickle
>>
>> with open('data.pickle', 'wb') as pkl_file:
>>     cPickle.dump(proj, pkl_file, protocol=-1)
>>
>> You can then load it later like so:
>>
>> with open('data.pickle', 'rb') as pkl_file:
>>     proj = cPickle.load(pkl_file)
>>
>> You can do similar things using pf.h.save_object() and load_object(), but
>> in a bit of a chicken and egg situation, you'll need serialization turned on
>> in your config parameters for that to work.
>>
>> -Nathan
>>
>>
>> On Tue, Jul 23, 2013 at 8:03 PM, David Collins <dcollins4096 at gmail.com>
>> wrote:
>>>
>>>
>>> I'm +1 on changing the default.  Thanks for making an announcement about
>>> the change.
>>>
>>> How hard would it be to make an individual routine get serialized on
>>> demand?  For instance,
>>> proj = pf.h.proj( ... serialize = True)
>>>
>>> Or, would it work to do
>>>
>>> ytcfg['yt', 'serialize'] = 'True'
>>> do stuff
>>> ytcfg['yt', 'serialize'] = 'False'
>>> ?
>>>
>>> d.
>>>
>>>
>>>
>>> On Tue, Jul 23, 2013 at 6:20 PM, j s oishi <jsoishi at gmail.com> wrote:
>>>>
>>>> Oh god...+100000000000 <sound of coins dinging in 8 bit glory>
>>>>
>>>> On Jul 23, 2013 7:08 PM, "Matthew Turk" <matthewturk at gmail.com> wrote:
>>>>>
>>>>> On Tue, Jul 23, 2013 at 3:27 PM, Nathan Goldbaum
>>>>> <nathan12343 at gmail.com> wrote:
>>>>> > Hi all,
>>>>> >
>>>>> > I've just issued a PR that will hopefully fix a whole class of buggy
>>>>> > behavior that both new and experienced yt users commonly run into.
>>>>> > Specifically, I'd like it if we could turn off data serialization by
>>>>> > default.  This changes a long-lived default value in yt's
>>>>> > configuration, so
>>>>> > I wanted to bring this change to the attention of both the yt user
>>>>> > and
>>>>> > developer community.
>>>>> >
>>>>> > What is data serialization?  Currently, yt will save the result of
>>>>> > certain
>>>>> > expensive calculations, including projections, the structure of the
>>>>> > grid
>>>>> > hierarchy, and the list of fields present in the data.  While this
>>>>> > does have
>>>>> > the beneficial effect of saving time when a user needs to
>>>>> > repetitively
>>>>> > calculate these quantities on the same dataset, it has a number of
>>>>> > features
>>>>> > which lead to buggy, annoying behavior.
>>>>> >
>>>>> > Specifically, if I am developing my simulation code or repeatedly
>>>>> > restarting
>>>>> > my code, searching for a way to grind past a code crash, I will quite
>>>>> > often
>>>>> > regenerate the same simulation output file over and over, changing a
>>>>> > line of
>>>>> > code or switching out the value of a parameter each time.
>>>>> >
>>>>> > If yt's data serialization is turned on, it's likely that yt's
>>>>> > visualizations will correspond to old versions of the data file.
>>>>> > Since only
>>>>> > certain operations are serialized, it's also possible for yt to get
>>>>> > into an
>>>>> > inconsistent state - one operation will show the current data file,
>>>>> > while
>>>>> > another operation will show an old version.
>>>>> >
>>>>> > It's possible to fix a bug in your code, but because yt is still
>>>>> > loading the
>>>>> > old data, you won't be able to tell that your bug is fixed until you
>>>>> > realize
>>>>> > that you have .yt and .harrays files littering your filesystem.
>>>>> >
>>>>> > I've personally wasted a lot of time due to yt's serialization
>>>>> > 'feature' and
>>>>> > denizens of our IRC channel and mailing list can attest to how often
>>>>> > new
>>>>> > users run into this behavior as well.
>>>>> >
>>>>> > My pull request only turns off serialization by default; it doesn't
>>>>> > disable
>>>>> > the capability completely.  Once the pull request is merged in, you
>>>>> > can turn
>>>>> > on serialization either by adding an entry to your config file:
>>>>> >
>>>>> > $ cat ~/.yt/config
>>>>> >
>>>>> > [yt]
>>>>> > serialize = True
>>>>> >
>>>>> > Or on a per-script basis:
>>>>> >
>>>>> > from yt.config import ytcfg
>>>>> > ytcfg['yt', 'serialize'] = 'True'
>>>>> > from yt.mods import *
>>>>> >
>>>>> > The pull request is here:
>>>>> > https://bitbucket.org/yt_analysis/yt/pull-request/558
>>>>> >
>>>>> > I know several of you are big fans of this feature, so if you object
>>>>> > to this
>>>>> > change please leave a comment on the pull request so we can figure
>>>>> > out a way
>>>>> > forward.
>>>>>
>>>>> I think this is long overdue, for all the reasons you list.
>>>>> Auto-serialization treated a lot of symptoms that we have since
>>>>> improved, or that we should address more directly -- speed of
>>>>> hierarchy construction, saving data that we want to retain, and
>>>>> detecting fields.
>>>>>
>>>>> +1!
>>>>>
>>>>> -Matt
>>>>>
>>>>> >
>>>>> > -Nathan
>>>>> >
>>>>> > _______________________________________________
>>>>> > yt-dev mailing list
>>>>> > yt-dev at lists.spacepope.org
>>>>> > http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>>>> >
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> -- Sent from a computer.
>>>
>>>
>>
>>
>>
>
>
>
>
>


