[yt-dev] Should we stop "yield"ing tests?

Britton Smith brittonsmith at gmail.com
Fri Oct 9 00:41:21 PDT 2015


Hi Nathan,

*Spoiler alert* - I think you make a pretty strong case for ending the
practice of yielding tests.  In particular, if this makes it easier for
developers to debug failing or erroring tests, I'm definitely on board.  I
agree that we should be encouraging people to run tests locally.  I know I
should be doing more of that.

As for your points against doing this, I personally think they do not
outweigh the benefits.  The PR may be large, but it should be confined to
testing files and the changes should all be very straightforward.  Your
second point against is probably the most compelling, but if one were to
find oneself with a failing test, the best approach would be to iterate
locally, rerunning just that single test until it passes rather than
waiting on the full suite each time.

To conclude my own novella of a response to your Ulysses, I am +1 on this.

Britton

On Thu, Oct 8, 2015 at 5:11 PM, Nathan Goldbaum <nathan12343 at gmail.com>
wrote:

> Hi all,
>
> Sorry for the novel of an e-mail! I wanted to be as detailed as
> possible; I hope it isn't too much.
>
> Right now our test suite does a lot of test yielding. Under this paradigm,
> every yielded test counts as a "dot" in the nose testing output and
> contributes to the number of tests printed at the end of the test suite. In
> addition, yielding means that all of the tests yielded by a given test
> function are run to completion even if one or more of them has already
> failed.
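>
> To make the pattern concrete, here is a minimal sketch (a hypothetical
> example, not code from our suite); nose treats each yielded
> (callable, *args) tuple as its own test:
>
>     from numpy.testing import assert_equal
>
>     def test_squares():
>         for n in range(3):
>             # each yield becomes its own "dot" in nose's output and
>             # its own entry in the final test count
>             yield assert_equal, n * n, n ** 2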
>
> It's definitely nice to see that we're running thousands of tests, and I'd
> be happy to continue doing so, except I've learned that this approach adds
> some technical hurdles for people who actually need to deal with the test
> suite.
>
> Personally, I've noticed that it makes debugging failing tests much more
> annoying than it needs to be. Since a test is yielded, the traceback
> generated by a failing or erroring test ends up somewhere in nose rather
> than in the yt test suite function that actually yielded the test. This can
> make it difficult to determine where a failing test is coming from in the
> test suite, particularly if the test that gets yielded is just an assert.
>
> In addition, Kacper tells me that our practice of yielding tests makes it
> difficult to simplify our Jenkins setup since yielded tests are not
> parallel-safe.
>
> I'd like to propose a modification to the yt codebase and developer guide.
> Rather than encouraging that all asserts and test classes be yielded, we
> should instead *not* yield them, calling the assert functions and
> instantiating the test classes directly. To make that concrete, the
> following set of tests from the NMSU
> ART answer tests would go from looking like this:
>
> http://paste.yt-project.org/show/5944/
>
> to looking like this:
>
> http://paste.yt-project.org/show/5946/
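>
> In case those pastes go stale, the gist of the change is something like
> the following sketch (names and field list illustrative, not the exact
> paste contents):
>
>     from yt.utilities.answer_testing.framework import FieldValuesTest
>
>     _fields = ("density", "temperature")  # illustrative
>     d9p = "path/to/d9p_dataset"  # placeholder for the real dataset
>
>     # before: every check is handed back to nose one at a time
>     def test_d9p_yielding():
>         for field in _fields:
>             yield FieldValuesTest(d9p, field)
>
>     # after: the same checks are invoked directly; a failure raises
>     # right here, with this function in the traceback
>     def test_d9p_direct():
>         for field in _fields:
>             FieldValuesTest(d9p, field)()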
>
> Each individual assert and test class instantiation would no longer count
> as an individual test in nose's test statistics.  Instead, referring to the
> example above, the test_d9p function would be the only test reported by
> nose, even though the test function does many asserts and instantiates many
> test class instances.
>
> For me, the main win would be that it would be easier to determine which
> exact test failed, because the traceback reported by nose due to the test
> failure would include the line in the test file that produced a failing
> assert or caused an unhandled exception to be raised.
>
> To make that concrete, I've just made the following modification to the
> `test_dimensionless` function to force a test to fail:
>
> http://paste.yt-project.org/show/5947/
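>
> (Roughly speaking, the modification is a deliberately failing yielded
> assert, along these lines; a sketch, not the exact paste:)
>
>     from nose.tools import assert_true
>
>     def test_dimensionless():
>         # existing assertions elided; because the failing check below
>         # is yielded, nose reports it with a traceback that never
>         # mentions test_dimensionless itself
>         yield assert_true, False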
>
> Running `nosetests units` in the root of the repository, I get the
> following traceback:
>
> http://paste.yt-project.org/show/5948/
>
> Note how the test traceback does *not* include the `test_dimensionless`
> function.  If I instead make it so the failing test is not yielded:
>
> http://paste.yt-project.org/show/5949/
>
> I get a much nicer traceback:
>
> http://paste.yt-project.org/show/5950/
>
> I can see a few reasons why we might not want to do this.
>
> 1. This is an invasive, large-scale change. It might be automatable (I
> think I could write an Emacs macro to do this, for example), but it would
> in the end be difficult to review.
>
> 2. Since test functions would terminate on the first failure, it might
> lead to annoying debug cycles where one assert gets fixed but then the next
> assert fails, forcing people to wait for the full test suite to be rerun
> just to get to the next failing test.
>
> For the second point, I think we can remedy this just by improving our
> testing docs to encourage people to run tests locally as much as possible
> and also explain better how to run only a specific test function from the
> command line.
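>
> For instance, nose can select a single test function by name with
> something like `nosetests yt/units/tests/test_units.py:test_dimensionless`
> (the path here is illustrative).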
>
> If people are interested in making this modification globally, I think I
> could take it on as part of my general efforts to clean up the codebase.
>
> -Nathan
>