[yt-users] variance versus standard deviation

Andrew Cunningham ajcunn at gmail.com
Tue Jan 3 14:43:00 PST 2017


Nathan,

Thanks for pointing out where to look for the variance computation.  In 
the profiles.py you linked to, line 192 in "_finalize_storage" reads:

       all_var = np.sqrt(all_var)

and "all_var" is later used to construct "self.variance".  So, after this 
line, "all_var" is really a standard deviation.

Consequently, the example in the cookbook is correct.  The variance member 
of the profiles object is really the standard deviation.  There is no 
"bug" in the cookbook example.  Rather, this particular data member in the 
profile object is just misleadingly named.
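
To make the distinction concrete, here is a small NumPy sketch of what the 
square root does.  This is not the yt code itself; the sample values and 
weights are made up and simply stand in for the field values falling in one 
profile bin.

```python
import numpy as np

# Made-up samples and weights standing in for the field values in one bin.
values = np.array([1.0, 2.0, 4.0, 7.0])
weights = np.array([1.0, 1.0, 2.0, 1.0])

# Weighted mean and weighted variance of the bin.
mean = np.average(values, weights=weights)
variance = np.average((values - mean) ** 2, weights=weights)

# Taking the square root turns the variance into a standard deviation,
# which is the effect of the np.sqrt(all_var) line quoted above.
std_dev = np.sqrt(variance)
```

The standard deviation carries the same units as the field, while the 
variance carries those units squared, which is why the name matters.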

I don't currently have a dev version of yt set up.  I'm more of a humble 
user than an agile developer versed in the yt innards.  I'd be happy to 
set one up and submit a pull request if that would be helpful, but I'm not 
sure what the "right" way to fix this is.

One could remove the line:

all_var = np.sqrt(all_var)

so that the variance member is really the variance, and update the 
cookbook example to take the square root inline when plotting the mean and 
standard deviation.  However, that would break user scripts already out in 
the wild that rely on "variance" really being the standard deviation.
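
In the meantime, a script that needs the true variance can square what 
"prof.variance" returns.  A minimal sketch, assuming the current behaviour 
described above; the array below is a made-up stand-in, not actual yt 
output:

```python
import numpy as np

# Stand-in for prof.variance['gas', 'velocity_magnitude'].value, which
# (given the np.sqrt line) holds standard deviations.  Values are made up.
reported = np.array([3.0, 1.5, 0.5])

std_dev = reported              # already a standard deviation; plot as-is
true_variance = reported ** 2   # square to recover the actual variance
```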




> Hi Andrew,
> 
> I think the code is simply using the terms "variance" and "standard
> deviation" interchangeably here.
> 
> You can see how the `variance` array attached to the profile object is
> calculated here:
>
> https://bitbucket.org/yt_analysis/yt/src/09f0ef297d7068078a021fc8290d9e3519baf82d/yt/data_objects/profiles.py?at=yt&fileviewer=file-view-default#profiles.py-183
> 
> I haven't looked up the mathematical definitions in detail, but I believe
> that this is the sample variance at each profile bin (*not* the standard
> deviation of the mean).
> 
> If you'd like, please feel free to send a pull request to update that
> cookbook recipe to label the variance with "variance" instead of 
> "standard deviation". The code for the cookbook recipe is located here 
> in the repository:
> 
> https://bitbucket.org/yt_analysis/yt/src/09f0ef297d7068078a021fc8290d9e3519baf82d/doc/source/cookbook/profile_with_variance.py?at=yt&fileviewer=file-view-default





On Sat, 31 Dec 2016, Andrew Cunningham wrote:

> In the following cookbook example:
> 
> http://yt-project.org/doc/cookbook/simple_plots.html#profiles-with-variance-values
> 
> the variance along the profile is extracted:
> 
> variance = prof.variance['gas', 'velocity_magnitude'].value
> 
> but is then labelled as the standard deviation when the plot is created:
> 
> plt.loglog(radius, variance, label='Standard Deviation')
> 
> So, does "prof.variance" return the standard deviation and not the
> variance?  Or, is the plot mislabelled?  Should the plot instead be:
> 
> plt.loglog(radius, np.sqrt(variance), label='Standard Deviation')
> 
> 
>

