[yt-dev] Turn data serialization off by default

Nathan Goldbaum nathan12343 at gmail.com
Tue Jul 23 15:27:05 PDT 2013


Hi all,

I've just issued a PR that will hopefully fix a whole class of buggy
behavior that both new and experienced yt users commonly run into.
Specifically, I'd like it if we could turn off data serialization by
default.  This changes a long-standing default value in yt's configuration,
so I wanted to bring this change to the attention of both the yt user and
developer communities.

What is data serialization?  Currently, yt will save the result of certain
expensive calculations, including projections, the structure of the grid
hierarchy, and the list of fields present in the data.  While this does
save time when a user needs to repeatedly calculate these quantities on
the same dataset, it also has a number of side effects that lead to buggy,
annoying behavior.

Specifically, if I am developing my simulation code or repeatedly
restarting my code, searching for a way to grind past a code crash, I will
quite often regenerate the same simulation output file over and over,
changing a line of code or switching out the value of a parameter each
time.

If yt's data serialization is turned on, it's likely that yt's
visualizations will correspond to old versions of the data file.  Since
only certain operations are serialized, it's also possible for yt to get
into an inconsistent state - one operation will show the current data file,
while another operation will show an old version.

It's possible to fix a bug in your code, but because yt is still loading
the old data, you won't be able to tell that your bug is fixed until you
realize that you have .yt and .harrays files littering your filesystem.
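As a rough sketch of how one might check for these leftover files, the
snippet below walks a directory tree and lists anything ending in .yt or
.harrays.  It uses only the Python standard library and does not touch yt
itself; the function name find_serialized_files is hypothetical, and the
sketch assumes the two extensions mentioned above are the only ones of
interest.

```python
import os

# Extensions named in this message as yt's serialization leftovers.
STALE_SUFFIXES = (".yt", ".harrays")


def find_serialized_files(root):
    """Return sorted paths under root whose names end in .yt or .harrays.

    Hypothetical helper for illustration; not part of yt.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(STALE_SUFFIXES):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # List candidates under the current directory; review before deleting.
    for path in find_serialized_files("."):
        print(path)
```

The sketch only prints the matches, so you can decide which files to
remove by hand.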

I've personally wasted a lot of time due to yt's serialization 'feature',
and denizens of our IRC channel and mailing list can attest to how often
new users run into this behavior as well.

My pull request only turns off serialization by default; it doesn't
disable the capability completely.  Once the pull request is merged in, you
can turn on serialization either by adding an entry to your config file:

$ cat ~/.yt/config

[yt]
serialize = True

Or on a per-script basis:

from yt.config import ytcfg
# Enable serialization before yt.mods is imported.
ytcfg['yt', 'serialize'] = 'True'
from yt.mods import *
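
If you are unsure what your config file currently says, one way to check
is to parse it directly.  The sketch below is only an illustration: it uses
Python's standard configparser module rather than yt's own ytcfg object,
the function name serialization_enabled is hypothetical, and it assumes
the file follows the INI layout shown above.  A missing entry is treated
as off, matching the default proposed in the pull request.

```python
import configparser
import os


def serialization_enabled(config_text):
    """Return True if the [yt] section sets serialize to a true value.

    Hypothetical helper for illustration; not part of yt.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback=False mirrors the proposed default (serialization off)
    # when the section or the option is missing.
    return parser.getboolean("yt", "serialize", fallback=False)


if __name__ == "__main__":
    path = os.path.expanduser("~/.yt/config")
    if os.path.exists(path):
        with open(path) as handle:
            print(serialization_enabled(handle.read()))
    else:
        # No config file at all: nothing turns serialization on.
        print(False)
```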

The pull request is here:
https://bitbucket.org/yt_analysis/yt/pull-request/558

I know several of you are big fans of this feature, so if you object to
this change please leave a comment on the pull request so we can figure out
a way forward.

-Nathan