[yt-dev] Fwd: [yt_analysis/yt] Answer testing plugin for Nose (pull request #308)

Nathan Goldbaum nathan12343 at gmail.com
Thu Oct 18 19:48:57 PDT 2012


I'd just like to add my two cents about nose and testing in general.

This morning Ryan reported a bug in some new functionality I added last 
week.  The issue was due to my not allowing for completely general 
inputs into a derived field, which I think is a pretty common error.  
When yt handed the function unanticipated inputs, it barfed, and Ryan's 
analysis was stalled until we could find a fix.

Matt took it upon himself to write new tests that cover all of the 
different ways field access can work (i.e. flattened arrays of field 
values and field values covering a 3D grid).  This test was pretty 
trivial for Matt to write and, importantly, it was written with enough 
generality that it covers all of the possible field access methods for 
all of the universal fields.  You can see for yourself here: 
https://bitbucket.org/MatthewTurk/yt/src/4f48186e3644/yt/data_objects/tests/test_fields.py
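
For a rough idea of what such a test looks like, here's an illustrative 
sketch (not the actual contents of that file); it just exercises both 
access paths on a small fake dataset:

    from yt.testing import fake_random_pf

    def test_field_access():
        # a small, randomly-filled fake parameter file; no real data needed
        pf = fake_random_pf(16)
        # flattened (1D) field access through a data object
        dd = pf.h.all_data()
        flat = dd["Density"]
        # field access on an individual 3D grid
        g = pf.h.grids[0]
        cube = g["Density"]
        assert flat.ndim == 1
        assert cube.ndim == 3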

Nose has a nice feature that allows whoever is running the tests to be 
instantly dumped into a pdb session as soon as an error occurs.  This 
means my debugging workflow is now almost completely automated.  I tell 
nose to run the tests; when something breaks, I'm dropped into a pdb 
session where I can poke around and figure out why the error happened 
and how to fix it.  As soon as I fix the bug, I run nose again, and so 
on until all of the tests pass.
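
(Concretely, that loop is just something along the lines of

    nosetests --pdb yt/data_objects/tests/test_fields.py

run from the top of the source tree; the --pdb flag is what drops you 
into the debugger when an error comes up.)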

Even better, these tests are now run *every time* a commit happens in 
yt_analysis/yt.  If something breaks in the future, we'll know immediately.

This completely changes my workflow.  Now I can write a set of tests 
for functionality that doesn't exist yet, write some code that 
implements the functionality, and then iteratively run nose with --pdb, 
fixing bugs as they pop up.  Even better, if I add new tests and new 
functionality but break the old tests, I'll know instantly.

Writing tests takes time.  Writing code without tests takes more time, 
in the end.

This probably doesn't come as a revelation to most of the people reading 
this, but I thought I'd share my feelings from this morning for posterity.

-Nathan

On 10/18/12 7:35 PM, Matthew Turk wrote:
> Hi all,
>
> I'm really excited about this, but I would like to hear feedback as
> well as solicit help with it.  There are a lot of new tests we can
> write, particularly for frontend bits that have gotten us in the past.
>   We can also use Nose to measure performance over time, which would be
> a nice way of checking for regressions or improvements.
>
> As I note in the PR, I'd like to get a discussion going about this --
> any feedback would be very, very welcome.  Does this meet our needs
> for answer testing?  Will you be willing to write tests for a given
> frontend?  What else could be added or improved?
>
> I'd also like to suggest that we have a Hangout or IRC meeting to get
> some builds set up and actually try this out on a couple different
> machines.  My best times would be Tuesday at 4PM EST or Wednesday at
> 2PM EST.
>
> -Matt
>
> ---------- Forwarded message ----------
> From: Matthew Turk <pullrequests-noreply at bitbucket.org>
> Date: Thu, Oct 18, 2012 at 10:28 PM
> Subject: [yt_analysis/yt] Answer testing plugin for Nose (pull request #308)
>
>
> A new pull request has been opened by Matthew Turk.
>
> MatthewTurk/yt has changes to be pulled into yt_analysis/yt.
>
> https://bitbucket.org/yt_analysis/yt/pull-request/308/answer-testing-plugin-for-nose
>
> Title: Answer testing plugin for Nose
>
> This pull request includes an answer testing plugin for Nose, as well
> as a mechanism by which this plugin can be used to upload new results
> and compare existing results to a gold standard, stored in Amazon.
>
> ## How does Answer Testing work now?
>
> Currently, Answer Testing in yt works by running a completely
> home-grown test runner, discoverer, and storage system.  This works on
> a single parameter file at a time, and there is little flexibility in
> how the parameter files are tested.  For instance, you cannot select
> fields based on the code that generated the pf.  This catches many but
> not all errors, and can only test Enzo and FLASH.
>
> When a new set of "reliable" tests has been identified, it is tarred
> up and uploaded.  In practice no one has ever really used these, and
> it's difficult to run them unless you're on Matt's machine.
>
> ## What does this do?
>
> There are two ways in which this can function:
>
>   * Pull down results for a given parameter file and compare the
> locally-created results against them
>   * Run new results and upload those to S3
>
> These are not meant to co-exist.  In fact, the ideal method of
> operation is that when the answer tests are changed *intentionally*,
> new gold standards are generated and pushed to S3 by one of a trusted
> set of users.  (New users can be added, with the privs necessary to
> push a new set of tests.)
>
> This adds a new config option to `~/.yt/config` in the `[yt]` section:
> `test_data_dir`, which is where parameter files (such as
> "IsolatedGalaxy" and "DD0010" from yt's distribution) can be found.
> When the nosetests are run, any parameter files it finds in that
> directory will be used as answer testing input.
> `yt/frontends/enzo/tests/test_outputs.py` contains the Enzo frontend
> tests that rely on parameter files.  Note that right now, the standard
> AMR tests are quite extensive and generate a lot of data; I am still in
> the process of creating new tests to replicate the old answer tests,
> and also slimming them down for big datasets.
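>
> For reference, the relevant bit of `~/.yt/config` would look something
> like this (with the path pointing at wherever you keep your test data):
>
>     [yt]
>     test_data_dir = /path/to/yt_test_data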
>
> To run a comparison, you must first run "develop" (i.e., `python
> setup.py develop`) so that the new nose plugin becomes available.
> Then, in the yt directory,
>
> `nosetests --with-answer-testing frontends/enzo/ --answer-compare=gold001`
>
> To run a set of tests and *store* them:
>
> `nosetests --with-answer-testing frontends/enzo/ --answer-store
> --answer-name=gold001`
>
> Not only can we now run answer tests, but we also no longer have to
> manage the uploads (manually or otherwise); yt will do this for us,
> using boto.  Down the road we can swap out Amazon for any
> OpenStack-compliant cloud provider, such as SDSC's cloud.
>
> Additionally, we can now add answer testing of small data to Shining
> Panda.  In the future, we can add answer testing of large data with
> lower frequency, as well.
>
> ## What's Next?
>
> Because there's a lot to take in, I'd like to suggest this PR not be
> accepted as-is.  There are a few items that need to be done first:
>
>   * The developer community needs to be brought in on this; I would
> like to suggest either a hangout or an IRC meeting to discuss how this
> works.  I'd also encourage others to pull this PR, run the nosetests
> command that compares data, and figure out if they like how it looks.
>   * The old tests all need to be replicated.  This means things like
> projections turned into pixel buffers, field statistics (without
> storing the fields), and so on.
>   * Tests need to be added for other frontends.  I am currently working
> with other frontend maintainers to get data, but once we've gotten it,
> we need to add tests as is done for Enzo.  This means FLASH, Nyx,
> Orion, as well as any others that would like to be on the testing
> suite.
>
> I'd like to encourage specific comments on lines of code to be left
> here, as well as comments on the actual structure of the code, but
> I'll be forwarding this PR to yt-dev and asking for broader comments
> there.  I think that having a single, integrated testing system that
> can test a subset of parameter files (as well as auto-discover them)
> will be extremely valuable for ensuring maintainability.  I'm really
> excited about this.
>
> Changes to be pulled:
>
>
>
> --
> This is an issue notification from bitbucket.org.
> You are receiving this either because you are participating
> in a pull request, or you are following it.
> _______________________________________________
> yt-dev mailing list
> yt-dev at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org



