[yt-dev] Update on answer tests

Fri Jun 27 12:26:33 PDT 2014

Final update:

I have convinced myself that the new answers are higher fidelity.

 1) There are fewer unit conversion roundtrips.
 2) We are now able to directly alias position to coordinates, which
speeds things up and reduces memory overhead, as well as decreasing
the number of unit conversion roundtrips.
 3) The differences, when I reduce everything to the most basic
settings I can, are negligible.  This is primarily a result of using
rint inside the octree getter.
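
To illustrate point 3, here is a minimal sketch of why truncation is
more sensitive than rint when a scaled coordinate that should be a
whole number comes out one ULP low.  This is not the actual octree
getter; the function names are hypothetical, and Python's round()
stands in for C's rint (both round ties to even in the default
rounding mode).

```python
def index_by_truncation(scaled):
    """Convert a scaled coordinate to an integer by dropping the fraction."""
    return int(scaled)


def index_by_rint(scaled):
    """Convert by rounding to the nearest integer (ties to even)."""
    return round(scaled)


# The largest double below 3.0; doubles in [2, 4) are spaced 2**-51 apart.
just_below_three = 3.0 - 2.0**-51

print(index_by_truncation(3.0), index_by_truncation(just_below_three))  # 3 2
print(index_by_rint(3.0), index_by_rint(just_below_three))              # 3 3
```

Rounding does not remove the sensitivity altogether; it moves it from
whole numbers to half-integers, so it only helps when the values being
converted are expected to sit near whole numbers.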

So unless anyone has a strong objection, I'm ready to get on with my
life!  :)  So maybe we should regenerate the answer tests and move on?

-Matt

On Fri, Jun 27, 2014 at 9:26 AM, Matthew Turk <matthewturk at gmail.com> wrote:
> Hi all,
>
> I'm looking over the answer test failures which came from the change
> in output/input units.  I spent a good portion of yesterday and some
> of this morning digging into this, examining which mesh cells each
> particle got deposited in, and I think I've come up with a *reason*,
> if not an *answer*.
>
> What seems to be happening is that at some point as a result of the
> additional convert_to_units calls, there is a drift in particle
> positions at the scale of a few NULP.  Unfortunately, there are a
> handful of particles that this causes to shift between zones in the
> mesh of the octree -- not enough to change the octree structure, but
> enough to cause a difference.  I was able to reduce the number of
> differences by using IEEE754 rounding, rather than simple truncation,
> during cell assignment.
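
The drift described above can be reproduced with plain floats.  This is
a minimal sketch, not yt's unit machinery: roundtrip() is a
hypothetical helper and the factors are made up, chosen only because
the results are easy to check by hand.

```python
def roundtrip(value, factor):
    """Convert to another unit and straight back again."""
    return (value * factor) / factor


x = 0.1
y = roundtrip(x, 3.0)   # hypothetical conversion factor

print(x == y)           # False
print(y - x == 2.0**-56)  # True: the result is exactly one ULP above 0.1

# A power-of-two factor only changes the exponent, so nothing is lost.
print(roundtrip(x, 4.0) == x)  # True
```

A single roundtrip moves the value by at most an ULP or so, which is
why fewer conversions mean less accumulated drift.
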
>
> When the deposition is done, this is only a relative difference of
> ~1e-16, but when projected (and especially, when projected with a
> *weight*) this gets amplified to the point that it triggers our answer
> tests to fail.
>
> I was able to prove to myself that this is the case by comparing the
> results of truncating to float32 precision inside the get() function,
> which means all oct-identification for deposition occurs at the scale
> of 32 bits rather than 64.  I then compared the old results (with the
> truncation in precision) to the new results (with the same truncation
> in precision) and got identical results, all of which passed the
> answer test suite.  This doesn't solve the problem, but it points to a
> reason for it.
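
The float32 comparison works because doubles that differ by a few ULP
almost always collapse to the same float32 value.  A minimal sketch,
assuming nothing about the real get() function: to_float32() is a
hypothetical helper, and it rounds to the nearest float32 rather than
strictly truncating.

```python
import struct


def to_float32(value):
    """Round a Python float (a double) to the nearest float32 and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


a = 0.3
b = 0.3 + 2.0**-54   # the next double above 0.3

print(a == b)                          # False
print(to_float32(a) == to_float32(b))  # True
```
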
>
> Even with that reason, however, I'm really quite dissatisfied with the
> idea that we're introducing this jitter in the first place.  It seems,
> to my eye, to be coming from multiple calls to unit conversion and
> the like, each of which introduces slight differences.  I've attempted
> to reduce these calls in the PR.
>
> There was an additional issue, in that we were iterating over sets of
> files, which introduced the possibility of iteration order
> differences.  That's been addressed in an outstanding PR.
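
One common way to remove iteration order differences is to sort the
file list before looping over it.  The sketch below assumes that
approach; the message does not say how the PR handles it, and the
helper and file names are hypothetical.

```python
def ordered_files(filenames):
    """Return filenames in a fixed order, regardless of discovery order."""
    return sorted(filenames)


# Hypothetical file names, as two directory listings might return them.
listing_a = ["snap.2", "snap.0", "snap.1"]
listing_b = ["snap.1", "snap.2", "snap.0"]
print(ordered_files(listing_a) == ordered_files(listing_b))  # True

# Why order can matter: floating-point addition is not associative.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
```
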
>
> Anyway, I'll send an update once I've completely tracked this down.
>
> -Matt