[yt-users] variance versus standard deviation

Nathan Goldbaum nathan12343 at gmail.com
Tue Jan 3 21:12:06 PST 2017


I agree it should be fixed (albeit with backward compatibility shims so
profile.variance still works and raises a deprecation warning so that
scripts don't break).
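
A minimal sketch of what such a shim could look like. This is not yt's actual code: the class name `ProfileSketch` and the attribute name `standard_deviation` are hypothetical, chosen only for illustration.

```python
import warnings


class ProfileSketch:
    """Hypothetical stand-in for a profile object (not yt's real class)."""

    def __init__(self, standard_deviation):
        # Correctly named attribute; the name is an assumption.
        self.standard_deviation = standard_deviation

    @property
    def variance(self):
        # The old name still returns the same values, so existing
        # scripts keep working, but each access emits a warning.
        warnings.warn(
            "'variance' holds the standard deviation; "
            "use 'standard_deviation' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.standard_deviation
```

With this pattern, old scripts that read `variance` get the same numbers as before, and the warning points them at the new name.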

If one of you could open a bug report about this that would be very much
appreciated.
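
A report of this kind might include a short numerical illustration of the distinction at issue. The sketch below uses made-up sample values and only the Python standard library. numpy's `np.var` and `np.std` follow the same population definitions by default.

```python
import math
import statistics

# Made-up samples for a single profile bin (illustrative values only).
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

variance = statistics.pvariance(samples)  # mean squared deviation
std_dev = statistics.pstdev(samples)      # square root of the variance

# The two quantities differ in value and in units (units squared
# for the variance), so a value named "variance" that has already
# been passed through a square root is really a standard deviation.
print(variance, std_dev)
```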

On Tue, Jan 3, 2017 at 11:07 PM Greg Bryan <gbryan at astro.columbia.edu>
wrote:

> I would argue that you should treat this like a bug and just fix it.  The
> definitions are very standard and this violates the rule of least surprise
> (it also conflicts with numpy usage).  This will break some existing
> scripts but, speaking from painful experience, it is better to correct
> obvious mistakes earlier rather than later…
>
> Cheers,
> Greg
>
> On Jan 3, 2017, at 5:52 PM, Nathan Goldbaum <nathan12343 at gmail.com> wrote:
>
> Ah, I see what you mean. Changing the name here would change the
> public-facing API, so I'm not sure there is a quick, easy fix here.
>
> If you'd like, feel free to open an issue about this so we don't lose
> track:
>
> https://bitbucket.org/yt_analysis/yt/issues/new
>
> On Tue, Jan 3, 2017 at 4:43 PM Andrew Cunningham <ajcunn at gmail.com> wrote:
>
> Nathan,
>
> Thanks for pointing out where to look for the variance computation.  I see
> in the profiles.py you linked to, line 192 in "_finalize_storage"
>
>        all_var = np.sqrt(all_var)
>
> which later is used to construct "self.variance".  So, after this line,
> "all_var" is really a standard deviation.
>
> Consequently, the example in the cookbook is correct.  The variance member
> of the profiles object is really the standard deviation.  There is no
> "bug" in the cookbook example.  Rather, this particular data member in the
> profile object is just misleadingly named.
>
> I don't currently have a dev version of yt set up.  I'm more of a humble
> user than agile developer into the yt innards.  I'd be happy to set that
> up and submit a pull request if that would be helpful, but I'm not sure
> what the "right" way to fix this is.
>
> One could remove the line:
>
> all_var = np.sqrt(all_var)
>
> so that the variance member is really the variance and update the cookbook
> example to take the square root inline when making the plot of mean and
> standard deviation.  Though, that would break user scripts already out in
> the wild that rely on "variance" really being the standard deviation.
>
> > Hi Andrew,
> >
> > I think the code is simply using the terms "variance" and "standard
> > deviation" interchangeably here.
> >
> > You can see how the `variance` array attached to the profile object is
> > calculated here:
> >
> > https://bitbucket.org/yt_analysis/yt/src/09f0ef297d7068078a021fc8290d9e3519baf82d/yt/data_objects/profiles.py?at=yt&fileviewer=file-view-default#profiles.py-183
> >
> > I haven't looked up the mathematical definitions in detail, but I believe
> > that this is the sample variance at each profile bin (*not* the standard
> > deviation of the mean).
> >
> > If you'd like, please feel free to send a pull request to update that
> > cookbook recipe to label the variance with "variance" instead of
> > "standard deviation". The code for the cookbook recipe is located here
> > in the repository:
> >
> > https://bitbucket.org/yt_analysis/yt/src/09f0ef297d7068078a021fc8290d9e3519baf82d/doc/source/cookbook/profile_with_variance.py?at=yt&fileviewer=file-view-default
>
> On Sat, 31 Dec 2016, Andrew Cunningham wrote:
>
> > In the following cookbook example:
> >
> > http://yt-project.org/doc/cookbook/simple_plots.html#profiles-with-variance-values
> >
> > the variance along the profile is extracted:
> >
> > variance = prof.variance['gas', 'velocity_magnitude'].value
> >
> > but is then labelled as the standard deviation when the plot is created:
> >
> > plt.loglog(radius, variance, label='Standard Deviation')
> >
> > So, does "prof.variance" return the standard deviation and not the
> > variance?  Or, is the plot mislabelled?  Should the plot instead be:
> >
> > plt.loglog(radius, np.sqrt(variance), label='Standard Deviation')
> >
> > _______________________________________________
> > yt-users mailing list
> > yt-users at lists.spacepope.org
> > http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
>