Hi Matt,<br><br>Your response here and the one to Casey answered my question.  It wasn't clear to me that testing the same function with different configurations counted as separate tests.  I understand now.  Thanks!<br>

<br>Britton<br><br><div class="gmail_quote">On Fri, Oct 12, 2012 at 6:09 PM, Matthew Turk <span dir="ltr"><<a href="mailto:matthewturk@gmail.com" target="_blank">matthewturk@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi Britton,<br>

<div class="im"><br>

On Fri, Oct 12, 2012 at 3:06 PM, Britton Smith <<a href="mailto:brittonsmith@gmail.com">brittonsmith@gmail.com</a>> wrote:<br>

</div><div class="im">> What is the number of tests of unique functionality?<br>

<br>

</div>I'm not sure I know what you mean?  For the most part, we're testing<br>

many aspects of individual components.  As an example, we now have a<br>

whole bunch of tests that address different aspects (each of which is<br>

relatively tricky and sensitive to changes in the code base) of<br>

covering grids, projections, profiles, and so on.  So I guess in a<br>

sense, we're really well-testing about 5 different pieces of the code<br>

in the unit tests.  In the answer tests we test a much broader section<br>

of the code, but it takes longer and requires reference data.<br>

<br>

-Matt<br>

<div class="HOEnZb"><div class="h5"><br>

><br>

><br>

> On Fri, Oct 12, 2012 at 6:04 PM, Nathan Goldbaum <<a href="mailto:nathan12343@gmail.com">nathan12343@gmail.com</a>><br>

> wrote:<br>

>><br>

>> In this terminology, each assert statement is a test.  It's quite easy to<br>

>> make dozens of new tests inside a couple of nested for loops.<br>

>><br>

>> On Oct 12, 2012, at 3:02 PM, Casey W. Stark wrote:<br>

>><br>

>> Hey Matt.<br>

>><br>

>> I would like to provide the data for Nyx. Not sure what sort of output<br>

>> would be useful though.<br>

>><br>

>> So I knew of some of the tests you and Anthony added, but there are 500<br>

>> unit tests now? Isn't that a bit strange?<br>

>><br>

>> - Casey<br>

>><br>

>><br>

>> On Fri, Oct 12, 2012 at 2:54 PM, Matthew Turk <<a href="mailto:matthewturk@gmail.com">matthewturk@gmail.com</a>><br>

>> wrote:<br>

>>><br>

>>> Hi all,<br>

>>><br>

>>> Today at UCSC, Nathan, Chris (Moody) and I sat down and went through<br>

>>> what we wanted to accomplish with testing.  This comes back to the<br>

>>> age-old dichotomy between unit testing and answer testing.  But what<br>

>>> this really comes back to, now that we had the opportunity to think<br>

>>> about it, is the difference between testing components and<br>

>>> functionality versus testing frontends.<br>

>>><br>

>>> So the idea here is:<br>

>>><br>

>>> Unit tests => Cover, using either manually inserted data values or<br>

>>> randomly generated "parameter files", individual units of the code.<br>

>>> Stephen and I have written a bunch in the last couple days.  We have<br>

>>> nearly 500, and they take < 1 minute to run.<br>

>>><br>

>>> Frontend/Answer tests => Cover a large portion of high-level<br>

>>> functionality that touches a lot of the code, but do so by running<br>

>>> things like projections, profiles, etc on actual data from actual<br>

>>> simulation codes, which then get compared to reference values that are<br>

>>> stored somewhere.  Currently we have ~550 answer tests, and they run<br>

>>> every 30 minutes on moving7_0010 (comes with yt) and once a day on<br>

>>> JHK-DD0030 (on <a href="http://yt-project.org/data/" target="_blank">yt-project.org/data/</a> as IsolatedGalaxy).  We do not<br>

>>> have automated FLASH testing.<br>

>>><br>

>>> The next step is:<br>

>>><br>

>>> 1) Getting a bunch of non-proprietary sets of data that are small<br>

>>> *and* medium, for each code base we want to test.  This data must be<br>

>>> non-proprietary!  For small, I would say they can be trivially small.<br>

>>> For medium, I'd prefer in the 0.5 - 5 GB range for size-on-disk.  I<br>

>>> would think that GasSloshing and WindTunnel could work for FLASH.  But<br>

>>> we still need ART data (from Chris Moody), GDF or Piernik data (from<br>

>>> Kacper), Orion data (if possible), Nyx data (if possible).  I will<br>

>>> handle adding RAMSES data in the 3.0 branch.<br>

>>> 2) Getting a mechanism to run answer tests that isn't "Matt's<br>

>>> desktop."  I've emailed ShiningPanda about this, but if they don't<br>

>>> have the ability to provide us with a FLOSS license, I think we can<br>

>>> identify some funding to do this.<br>

>>> 3) Have a mechanism to display and collate results.  ShiningPanda<br>

>>> would do this if we were on their systems.<br>

>>> 4) Make it much easier to flag individual tests as needing updates.  I<br>

>>> think the Data Hub will be the end place for this, but this is lower<br>

>>> priority.<br>

>>> 5) Migrate answer testing to use the unit testing framework, as most of<br>

>>> what we've done there re-implements stuff that is in the unit testing<br>

>>> frameworks.  This will mean we can much more easily handle<br>

>>> test-discovery, which is a huge plus.<br>

>>><br>

>>> Ultimately, the end product of all of this is that we should<br>

>>> eventually have a method for running a single set of tests that do<br>

>>> test discovery that loads up a bunch of different data outputs, runs<br>

>>> answer tests on all of them, runs the unit tests, etc etc.  I think it<br>

>>> just needs the last 25% to finish up the infrastructure.<br>

>>><br>

>>> So: those of you out there who have access to any datasets of types<br>

>>> other than FLASH or Enzo, can you provide non-proprietary, medium-size<br>

>>> and small-size datasets?  I'd like to have two for every code base, at<br>

>>> least.<br>

>>><br>

>>> So: those of you who want to help out, would you be interested in<br>

>>> looking at the answer_testing framework with me?  I am happy to<br>

>>> discuss it over email or IRC to convert it to the numpy testing<br>

>>> format, which will be much easier to maintain in the long run and make<br>

>>> it much easier to have a single testing system that works for<br>

>>> everything.<br>

>>><br>

>>> -Matt<br>

>>> _______________________________________________<br>

>>> yt-dev mailing list<br>

>>> <a href="mailto:yt-dev@lists.spacepope.org">yt-dev@lists.spacepope.org</a><br>

>>> <a href="http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org" target="_blank">http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org</a><br>

>><br>

>><br>


>><br>

>><br>

>><br>


>><br>

><br>

><br>


><br>


</div></div></blockquote></div><br>
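<br>
For anyone reading the thread later, here is a minimal sketch of how a couple of nested for loops turn one checked function into many counted tests.<br>
It assumes a nose-style generator test, where the runner reports each yielded check as its own test.<br>
The names mean, check_mean, mean_checks and run_generated are made up for illustration and are not taken from the yt test suite.<br>

```python
import math


def mean(values):
    """Toy function under test (hypothetical, not part of yt)."""
    return sum(values) / len(values)


def check_mean(n, scale):
    # One check: the mean of n copies of `scale` should equal `scale`.
    data = [scale] * n
    assert math.isclose(mean(data), scale)


def mean_checks():
    # Nose-style generator: every yielded (function, args...) tuple is
    # treated by the runner as a separate test, so 3 sizes x 4 scales
    # are counted as 12 tests of the same function.
    for n in (1, 8, 64):
        for scale in (0.5, 1.0, 2.0, 1e6):
            yield check_mean, n, scale


def run_generated(generator):
    # Tiny stand-in for a test runner: run each yielded check and
    # count how many were executed.
    count = 0
    for func, *args in generator():
        func(*args)
        count += 1
    return count


n_tests = run_generated(mean_checks)
print(n_tests)
```

Only one function is exercised here, yet the count comes out as one test per configuration, which is the distinction between "number of tests" and "pieces of functionality tested" discussed above.<br>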