[yt-users] variance versus standard deviation

Greg Bryan gbryan at astro.columbia.edu
Tue Jan 3 21:04:34 PST 2017


I would argue that you should treat this like a bug and just fix it.  The definitions are very standard and this violates the rule of least surprise (it also conflicts with numpy usage).  This will break some existing scripts but, speaking from painful experience, it is better to correct obvious mistakes earlier rather than later…
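
For reference, here is a minimal sketch of the numpy convention mentioned above. The sample values are made up for illustration. In numpy, `np.var` returns the mean squared deviation from the mean and `np.std` returns its square root, so an array named "variance" would normally be expected to hold the former.

```python
import numpy as np

# Made-up sample values, chosen only for illustration.
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

variance = np.var(values)  # mean squared deviation from the mean
std_dev = np.std(values)   # square root of the variance

# The two quantities differ by a square root and carry different units.
print("variance:", variance)
print("standard deviation:", std_dev)
```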

Cheers,
Greg

> On Jan 3, 2017, at 5:52 PM, Nathan Goldbaum <nathan12343 at gmail.com> wrote:
> 
> Ah, I see what you mean. Changing the name here would change the public-facing API, so I'm not sure there is a quick, easy fix here.
> 
> If you'd like, feel free to open an issue about this so we don't lose track:
> 
> https://bitbucket.org/yt_analysis/yt/issues/new
> 
> On Tue, Jan 3, 2017 at 4:43 PM Andrew Cunningham <ajcunn at gmail.com> wrote:
> Nathan,
>
> Thanks for pointing out where to look for the variance computation.  I see in
> the profiles.py you linked to, line 192 in "_finalize_storage":
>
>        all_var = np.sqrt(all_var)
>
> which is later used to construct "self.variance".  So, after this line,
> "all_var" is really a standard deviation.
>
> Consequently, the example in the cookbook is correct.  The variance member
> of the profile object is really the standard deviation.  There is no
> "bug" in the cookbook example.  Rather, this particular data member in the
> profile object is just misleadingly named.
>
> I don't currently have a dev version of yt set up.  I'm more of a humble
> user than an agile developer versed in the yt innards.  I'd be happy to set
> that up and submit a pull request if that would be helpful, but I'm not sure
> what the "right" way to fix this is.
>
> One could remove the line:
>
> all_var = np.sqrt(all_var)
>
> so that the variance member is really the variance, and update the cookbook
> example to take the square root inline when making the plot of mean and
> standard deviation.  That, though, would break user scripts already out in
> the wild that rely on "variance" really being the standard deviation.
>
> > Hi Andrew,
> >
> > I think the code is simply using the terms "variance" and "standard
> > deviation" interchangeably here.
> >
> > You can see how the `variance` array attached to the profile object is
> > calculated here:
> >
> > https://bitbucket.org/yt_analysis/yt/src/09f0ef297d7068078a021fc8290d9e3519baf82d/yt/data_objects/profiles.py?at=yt&fileviewer=file-view-default#profiles.py-183
> >
> > I haven't looked up the mathematical definitions in detail, but I believe
> > that this is the sample variance at each profile bin (*not* the standard
> > deviation of the mean).
> >
> > If you'd like, please feel free to send a pull request to update that
> > cookbook recipe to label the variance with "variance" instead of
> > "standard deviation". The code for the cookbook recipe is located here
> > in the repository:
> >
> > https://bitbucket.org/yt_analysis/yt/src/09f0ef297d7068078a021fc8290d9e3519baf82d/doc/source/cookbook/profile_with_variance.py?at=yt&fileviewer=file-view-default
>
> On Sat, 31 Dec 2016, Andrew Cunningham wrote:
>
> > In the following cookbook example:
> >
> > http://yt-project.org/doc/cookbook/simple_plots.html#profiles-with-variance-values
> >
> > the variance along the profile is extracted:
> >
> > variance = prof.variance['gas', 'velocity_magnitude'].value
> >
> > but is then labelled as the standard deviation when the plot is created:
> >
> > plt.loglog(radius, variance, label='Standard Deviation')
> >
> > So, does "prof.variance" return the standard deviation and not the
> > variance?  Or, is the plot mislabelled?  Should the plot instead be:
> >
> > plt.loglog(radius, np.sqrt(variance), label='Standard Deviation')
> >
> _______________________________________________
> yt-users mailing list
> yt-users at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
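
To illustrate the computation discussed in this thread, here is a rough sketch in plain numpy of a weighted per-bin mean, variance, and standard deviation. It is not the yt implementation: the function name `binned_weighted_stats` and the sample arrays are hypothetical, and the binning approach is an assumption made for the example. The final step mirrors the `np.sqrt` call quoted above, which turns a variance into a standard deviation.

```python
import numpy as np

def binned_weighted_stats(x, values, weights, bin_edges):
    """Hypothetical helper: per-bin weighted mean, variance, and std dev."""
    # Assign each sample to a bin; samples outside the edges are dropped.
    idx = np.digitize(x, bin_edges) - 1
    nbins = len(bin_edges) - 1
    inside = (idx >= 0) & (idx < nbins)
    idx, values, weights = idx[inside], values[inside], weights[inside]

    # Accumulate the weighted sums needed for the first two moments.
    w_sum = np.bincount(idx, weights=weights, minlength=nbins)
    wv_sum = np.bincount(idx, weights=weights * values, minlength=nbins)
    wv2_sum = np.bincount(idx, weights=weights * values**2, minlength=nbins)

    # Empty bins have zero total weight and come out as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = wv_sum / w_sum
        variance = wv2_sum / w_sum - mean**2
    variance = np.clip(variance, 0.0, None)  # guard against round-off

    # The standard deviation is the square root of the variance.
    std_dev = np.sqrt(variance)
    return mean, variance, std_dev

# Made-up data: two samples fall in each of two bins.
x = np.array([0.5, 0.6, 1.5, 1.7])
v = np.array([1.0, 3.0, 10.0, 10.0])
w = np.ones_like(v)
mean, variance, std_dev = binned_weighted_stats(x, v, w, np.array([0.0, 1.0, 2.0]))
print(mean, variance, std_dev)
```

Returning the variance and the standard deviation under separate names avoids the ambiguity raised in the thread. A script that receives a standard deviation under the name "variance" can square it to recover the variance.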
