[yt-dev] Testing Infrastructure: Datasets for ART, Orion, FLASH, etc ...

Matthew Turk matthewturk at gmail.com
Fri Oct 12 15:07:45 PDT 2012


Hi Casey,

On Fri, Oct 12, 2012 at 3:02 PM, Casey W. Stark <caseywstark at gmail.com> wrote:
> Hey Matt.
>
> I would like to provide the data for Nyx. Not sure what sort of output would
> be useful though.

Two outputs, one small and one slightly less small, that are
self-contained.  Cosmology, with particles, would be ideal.

>
> So I knew of some of the tests you and Anthony added, but there are 500 unit
> tests now? Isn't that a bit strange?

Well, the tests I added also run projections and profiles on random
data in several different configurations:

* Lazy reading on/off
* 1, 2, 4, 8 grid patches of random data

Plus, each test also checks several different aspects.  So if you have
two different aspects of something being checked across four different
grid-patch layouts, that's eight tests already.  As an example, here's
the covering_grid test:

https://bitbucket.org/yt_analysis/yt/src/2d91e2e7f12a/yt/data_objects/tests/test_covering_grid.py?at=tip

You can see that they multiply fast -- this one tests a bunch of
sub-aspects of the covering grid, and each yield inside the iterator
adds a new test.  Quickly adds up!
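
To give a flavor of the pattern (just a minimal sketch, not the actual
covering_grid test), a nose-style generator test looks something like
this -- each yield gets collected and reported as its own test:

import numpy as np
from numpy.testing import assert_equal

# Illustrative only: every yield below is counted by nose as a
# separate test, so the combinations multiply quickly.
def check_shape(data, nprocs, lazy):
    # the shape shouldn't depend on the patch layout or lazy reading
    assert_equal(data.shape, (8, 8, 8))

def check_finite(data, nprocs, lazy):
    assert np.all(np.isfinite(data))

def test_fake_covering_grid():
    for lazy in (True, False):        # lazy reading on/off
        for nprocs in (1, 2, 4, 8):   # number of grid patches
            data = np.random.random((8, 8, 8))
            yield check_shape, data, nprocs, lazy
            yield check_finite, data, nprocs, lazy

That's sixteen reported tests (2 lazy settings x 4 patch counts x 2
checks) from a single function.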

-Matt

>
> - Casey
>
>
> On Fri, Oct 12, 2012 at 2:54 PM, Matthew Turk <matthewturk at gmail.com> wrote:
>>
>> Hi all,
>>
>> Today at UCSC, Nathan, Chris (Moody) and I sat down and went through
>> what we wanted to accomplish with testing.  This comes back to the
>> age-old dichotomy between unit testing and answer testing.  But what
>> it really boils down to, now that we've had the opportunity to think
>> about it, is the difference between testing components and
>> functionality versus testing frontends.
>>
>> So the idea here is:
>>
>> Unit tests => Cover individual units of the code, using either
>> manually inserted data values or randomly generated "parameter
>> files".  Stephen and I have written a bunch in the last couple of
>> days.  We have nearly 500, and they take < 1 minute to run.
>>
>> Frontend/Answer tests => Cover a large portion of high-level
>> functionality that touches a lot of the code, but do so by running
>> things like projections, profiles, etc. on actual data from actual
>> simulation codes, then comparing the results to stored reference
>> values.  Currently we have ~550 answer tests, and they run every 30
>> minutes on moving7_0010 (comes with yt) and once a day on
>> JHK-DD0030 (on yt-project.org/data/ as IsolatedGalaxy).  We do not
>> have automated FLASH testing.
>>
>> The next step is:
>>
>> 1) Getting a bunch of non-proprietary sets of data that are small
>> *and* medium, for each code base we want to test.  This data must be
>> non-proprietary!  For small, I would say they can be trivially small.
>> For medium, I'd prefer something in the 0.5-5 GB range on disk.  I
>> would think that GasSloshing and WindTunnel could work for FLASH.  But
>> we still need ART data (from Chris Moody), GDF or Piernik data (from
>> Kacper), Orion data (if possible), Nyx data (if possible).  I will
>> handle adding RAMSES data in the 3.0 branch.
>> 2) Getting a mechanism to run answer tests that isn't "Matt's
>> desktop."  I've emailed Shining Panda about this, but if they don't
>> have the ability to provide us with a FLOSS license, I think we can
>> identify some funding to do this.
>> 3) Have a mechanism to display and collate results.  ShiningPanda
>> would do this if we were on their systems.
>> 4) Make it much easier to flag individual tests as needing updates.  I
>> think the Data Hub will be the end place for this, but this is lower
>> priority.
>> 5) Migrate answer testing to use unit testing framework, as most of
>> what we've done there re-implements stuff that is in the unit testing
>> frameworks.  This will mean we can much more easily handle
>> test-discovery, which is a huge plus.
>>
>> Ultimately, the end product of all of this is a single test run
>> that does test discovery, loads up a bunch of different data
>> outputs, runs answer tests on all of them, runs the unit tests, and
>> so on.  I think the infrastructure just needs the last 25% of the
>> work to finish up.
>>
>> So: those of you out there who have access to any datasets of types
>> other than FLASH or Enzo, can you provide non-proprietary, medium-size
>> and small-size datasets?  I'd like to have two for every code base, at
>> least.
>>
>> So: those of you who want to help out, would you be interested in
>> looking at the answer_testing framework with me?  I am happy to
>> discuss, over email or IRC, converting it to the numpy testing
>> format, which will be much easier to maintain in the long run and
>> will get us a single testing system that works for everything.
>>
>> -Matt


