[yt-dev] round-off error in TotalQuantity

Douglas Harvey Rudd drudd at uchicago.edu
Wed Jul 10 11:26:54 PDT 2013


As a followup, I believe there is at least once instance where we would not want to use float64 as the accumulator: if the field itself is integer and the sum does not overflow.  Something like this might be better:

if 'int' in str(data[field].dtype) :
   totals.append(data[field].sum(dtype=np.int64))
elif 'float' in str(data[field].dtype) :
   totals.append(data[field].sum(dtype=np.float64))
else :
  totals.append(data[field].sum())

Using np.float64 in the case of integers as well would alleviate the overflow problem, but would lead to approximate results where the user might expect an exact answer.

Douglas Rudd
Scientific Computing Consultant
Research Computing Center
drudd at uchicago.edu<mailto:drudd at uchicago.edu>



On Jul 10, 2013, at 6:26 AM, Britton Smith <brittonsmith at gmail.com<mailto:brittonsmith at gmail.com>> wrote:

Hi Doug,

I don't seem any reason why we wouldn't want to fix this to always use higher precision.  Could you issue a PR?

Britton


On Wed, Jul 10, 2013 at 2:36 AM, Douglas Harvey Rudd <drudd at uchicago.edu<mailto:drudd at uchicago.edu>> wrote:
Hi all,

I'm working on getting the ARTIO frontend working with a script used by the AGORA collaboration to measure density profiles.  This script uses the "TotalQuantity" quantity to sum all particle masses within a given sphere, then differences a nested set of spheres to measure the mean density in spherical shells.

I noticed a 2% difference at large radius from my own profiling routines, and tracked it down to the precision of the sum, which used the same precision as the input variable, float32 for ARTIO. Since there are a large number of particles, summing 1e5 x 10^-7 floats led to O(1%) error.

While this is clearly not an ideal way to measure density profiles, it's probably a good idea to help avoid such issues in future by changing TotalQuantity (and other places, like _TotalMass) to use a 64-bit accumulator, e.g.
        totals.append(data[field].sum(dtype=np.float64))

The alternative is to promote all float32 input variables to float64, which is wasteful of memory, or otherwise warn the user that they are likely to encounter loss of precision.

Douglas Rudd
Scientific Computing Consultant
Research Computing Center
drudd at uchicago.edu<mailto:drudd at uchicago.edu>



_______________________________________________
yt-dev mailing list
yt-dev at lists.spacepope.org<mailto:yt-dev at lists.spacepope.org>
http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org

_______________________________________________
yt-dev mailing list
yt-dev at lists.spacepope.org<mailto:yt-dev at lists.spacepope.org>
http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.spacepope.org/pipermail/yt-dev-spacepope.org/attachments/20130710/02b79048/attachment.htm>


More information about the yt-dev mailing list