[Yt-dev] A Mission Statement for yt

Matthew Turk matthewturk at gmail.com
Sun Jun 19 14:03:22 PDT 2011


I am also +1 on this draft.

On Sat, Jun 18, 2011 at 7:03 AM, Sam Skillman <samskillman at gmail.com> wrote:
> This looks great to me.  It's a good mix of what we can do now, what we want
> to do in the future, and the manner in which we'll get there.  +1
>
> On Fri, Jun 17, 2011 at 3:20 PM, j s oishi <jsoishi at gmail.com> wrote:
>>
>> This is looking awesome. Let me add a tiny bit of refinement:
>>
>> Draft 3
>> --------
>> The yt project aims to produce an integrated science environment
>> for collaboratively asking and answering astrophysical
>> questions. To do so, it will encompass the creation of initial
>> conditions, the execution of simulations, and the detailed
>> exploration and visualization of the resultant data.  It will
>> also provide a standard framework based on physical quantities
>> for interoperability between codes.
>>
>> Development of yt is driven by a commitment to Open Science
>> principles as manifested in participatory development,
>> reproducibility, documented and approachable code, a friendly and
>> helpful community of users and developers, and Free and Libre
>> Open Source Software.
>>
>>
>> > The yt project aims to produce an integrated science environment for
>> > collaboratively asking and answering astrophysical questions,
>> > encompassing the creation of initial conditions, the execution of
>> > simulations and the detailed exploration and visualization of the
>> > resultant data.  It will provide a standard framework based on
>> > physical quantities for interoperability between codes. Development
>> > of yt is driven by a commitment to Open Science principles as
>> > manifested in participatory development, reproducibility, documented
>> > and approachable code, a friendly and helpful community of users and
>> > developers, and Free and Libre Open Source Software.
>> >
>> >
>> >
>> > On 6/14/11 1:02 AM, Matthew Turk wrote:
>> >>
>> >> Hi all,
>> >>
>> >> tl;dr summary: New bullet points below, along with a first draft at a
>> >> proper prose solidification.  More comments still requested.
>> >>
>> >> Thanks everyone for your thoughtful responses.  Having this discussion
>> >> as a group is really the only way to have the outcome of it be
>> >> meaningful; I'm glad we could have as much of a discussion as we
>> >> already have, and I hope that moving forward we can keep talking about
>> >> this and refining it to sort of steer ourselves.
>> >>
>> >> In reading over your replies, it's become clear that the mission goal
>> >> bullet points in that original list were a bit ... well, shall we say,
>> >> under-ambitious?  So let's take the gloves off a bit.  Chris, your
>> >> comments in particular made me realize that my own feelings about this
>> >> project we've got going are a bit more ambitious; Brian, Britton and
>> >> Jeff, yours did as well.  And this didn't come through in the bullet
>> >> points, although it was alluded to.  I'll put my comments at the
>> >> bottom, along with an updated list of bullet points, after I respond
>> >> to a couple things that were brought up.
>> >>
>> >> On Mon, Jun 13, 2011 at 5:06 PM, j s oishi <jsoishi at gmail.com> wrote:
>> >>>
>> >>> Hi all,
>> >>>
>> >>> This is already a great discussion, and it seems to me that there are
>> >>> a lot of great ideas that I only want to echo. First, I think what
>> >>> Matt started with is a great foundation: yt should continue to be a
>> >>> collaborative, open-source tool for reproducible physical analysis of
>> >>> simulation data. I think the idea that not only is yt itself open
>> >>> source, but the *entire software stack upon which it rests* is also
>> >>> open source is a point worth emphasizing. If nothing else, a user gets
>> >>> not only yt but also an amazing toolkit for doing numerical
>> >>> computation free of cost and restrictive licenses.
>> >>
>> >> That is a very, very good point -- and the impetus for its initial
>> >> creation, actually.
>> >>
>> >>> Second, I think that Brian's point about data provenance and
>> >>> reproducibility of an entire project is really a direction I would
>> >>> love to see yt move in. yt should allow (and encourage!)
>> >>> reproducibility beyond analysis to include simulation initialization,
>> >>> runtime, and final, reduced data products. Furthermore, I believe it
>> >>> should be able to do this in a cross-code manner: imagine having a set
>> >>> of descriptions (perhaps in the form of yt scripts, perhaps in some
>> >>> other machine/human readable format) that describe initial conditions,
>> >>> runtime parameters, analysis outputs and data products that could be
>> >>> run on Enzo and Ramses. We could move beyond code comparison test
>> >>> problems to real inter-code reproducibility.
>> >>
>> >> Yes.  Yes, and more yes; I firmly believe in this.  Looking,
>> >> realistically, at where we are and where we are going, I believe this
>> >> is an utterly feasible goal, and the timescale is not terribly great
>> >> -- effort simply needs to be applied in that direction.
>> >>
>> >> We can have a longer discussion about this, but I think having this
>> >> item in the mission statement for now is sufficient.
>> >>
>> >>> Finally, I think that Britton is right that we should also continue to
>> >>> emphasize that yt is a tool for physical reasoning on simulation data,
>> >>> and that *it*, not *you the user*, makes all necessary manipulations to
>> >>> get simulation data to physical quantities.
>> >>>
>> >>> Thanks again for starting such an interesting discussion. I look
>> >>> forward to moving forward with yt.
>> >>>
>> >>> j
>> >>>
>> >>> On Mon, Jun 13, 2011 at 4:23 PM, Brian O'Shea <bwoshea at gmail.com> wrote:
>> >>>>
>> >>>> Hi Matt,
>> >>>>
>> >>>> This may not be something specifically for the mission statement
>> >>>> (depending on how wordy we want to get), but I'm very interested in
>> >>>> using yt (or something that encompasses yt) as a workflow tool so
>> >>>> that my simulations are completely reproducible.  What I could
>> >>>> imagine is something like this:
>> >>
>> >> I very much like the workflow you laid out, although I would contend
>> >> we should address more directly the task of running the simulation.
>> >> On some level, it becomes a realizable goal to execute the main loop
>> >> of the code in Python without any real overhead.  This will have the
>> >> side effect of providing much easier access to the data during the
>> >> course of the simulation.
>> >>
>> >> I would also scratch out "Enzo" and replace it with "Simulation Code"
>> >> -- while pragmatically I recognize your simulations will likely be
>> >> conducted in Enzo for the purposes of this provenance tracking, I feel
>> >> it should be said that for the mission statement I believe in a
>> >> code-neutral direction.
>> >>
>> >> One difficulty here is the idea of actually moving the data.  It is
>> >> not clear to me that moving data around in file systems is a
>> >> tractable, solvable problem.  That is a good thing to strive for, but
>> >> I personally can't wrap my head around it.  Stephen?  Britton?
>> >>
>> >>>> 1.  Generate initial conditions, cosmological or otherwise.  IC
>> >>>> parameter file goes into a database, along with details about the
>> >>>> code that's used to generate my ICs (inits/MUSIC/grafic hash,
>> >>>> outputs of make show-config and make show-flags, etc.)
>> >>>>
>> >>>> 2.  Run simulation.  Run-related and performance information is
>> >>>> collected in a database.  (What supercomputer?  How many CPUs?
>> >>>> Environment variables?  Which version of MPI?  What date(s) did the
>> >>>> job run on?  What nodes?  Copy of Enzo restart parameter files and
>> >>>> perhaps hierarchy files, for later query?)
>> >>>>
>> >>>> 3.  Back up Enzo data to mass storage (or perhaps some subset of the
>> >>>> data, depending on how big the sim is).  What directory is it in?
>> >>>> Should it be world-readable, group-readable, etc.?
>> >>>>
>> >>>> 4.  Do analysis.  Record all details of analysis and plot making, so
>> >>>> that I can go back and retrace all details.
>> >>>>
>> >>>> At that point, I would know _precisely_ how and with what
>> >>>> commands/code/parameter files/etc. the plots that are in my papers,
>> >>>> and everything leading up to them, were generated.  This helps when
>> >>>> you go back to deal with the referee ("how DID I make that stupid
>> >>>> plot?  What was sigma8 again?"), but also for reproducibility, since
>> >>>> in principle somebody could just go back, look at the database, and
>> >>>> be able to do precisely what I did.  Also, if somebody wanted to use
>> >>>> archival data - something we hope to do more of in the future, as
>> >>>> simulations grow in expense and complexity - there'd be no confusion
>> >>>> about the provenance of that data.
>> >>>>
>> >>>> If I had to sum this up in a sentence, it'd be "Transform yt into a
>> >>>> tool for easily and transparently tracking all aspects of simulation
>> >>>> generation, execution, and analysis for the purposes of
>> >>>> reproducibility."
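
The four-step workflow quoted above (initial conditions, simulation run, backup, analysis, each recorded in a database) could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a local SQLite file through Python's standard library, and the table layout, the stage names, and the helpers `open_log`, `record` and `history` are all hypothetical. None of this is part of yt or Enzo.

```python
import json
import sqlite3
import time

# Hypothetical stage names mirroring the four steps in the quoted workflow.
STAGES = ("initial_conditions", "simulation", "backup", "analysis")


def open_log(path=":memory:"):
    """Open (or create) a provenance database; in-memory by default."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS provenance ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " run_name TEXT NOT NULL,"
        " stage TEXT NOT NULL,"
        " recorded_at REAL NOT NULL,"
        " details TEXT NOT NULL)"
    )
    return conn


def record(conn, run_name, stage, **details):
    """Store one provenance entry; details are free-form and saved as JSON."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    conn.execute(
        "INSERT INTO provenance (run_name, stage, recorded_at, details)"
        " VALUES (?, ?, ?, ?)",
        (run_name, stage, time.time(), json.dumps(details, sort_keys=True)),
    )
    conn.commit()


def history(conn, run_name):
    """Return every (stage, details) pair for a run, in insertion order."""
    rows = conn.execute(
        "SELECT stage, details FROM provenance WHERE run_name = ? ORDER BY id",
        (run_name,),
    )
    return [(stage, json.loads(details)) for stage, details in rows]
```

A caller would invoke `record(conn, "my_run", "simulation", n_cpus=..., mpi_version=...)` at each step and later use `history(conn, "my_run")` to retrace how a plot was produced. Keeping the details as free-form JSON is a deliberate choice here, since the quoted list leaves open exactly which fields each stage should capture.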
>> >>
>> >> That's a great sentence.
>> >>
>> >> Britton: I like your additions very much.  It had completely slipped
>> >> my mind that one of the most useful features of yt is its physical and
>> >> geometric object selection.
>> >>
>> >> Chris: I don't think you are stepping outside the scope of what we
>> >> could generously call "The yt project" with what you mention.  There
>> >> are the technical goals, and the broader community goals.  The goals
>> >> of open science, reproducibility, and cultivating a community of
>> >> scientists willing to share scripts, analysis routines, and even
>> >> analysis modules are certainly part of what I think we are all
>> >> striving for.  And this goal isn't composed of just deploying
>> >> infrastructure, rolling it out, but also providing a welcoming and
>> >> friendly community of people willing to help.  For instance, it's
>> >> great that scripts written to generate phase plots from Orion outputs
>> >> can be used nearly unmodified on Enzo outputs.  (I remember when Jeff
>> >> worked so hard to make this so --
>> >> http://www.flickr.com/photos/matthewturk/2598141965/ )  But even more
>> >> than that, I think it's amazing that people are willing to share these
>> >> scripts.
>> >>
>> >> So yes, let's bake that right into the mission statement.
>> >>
>> >> As for your comments about the microphysical solvers, believe me when
>> >> I say they are not falling on deaf ears.  Moving toward an open
>> >> source, community-driven model for microphysical solvers is an issue
>> >> near and dear to my own heart, having spent several years of my life
>> >> writing a primordial chemistry solver.  I believe there is a place for
>> >> interfacing with that sort of project and endeavor inside yt -- in
>> >> particular, interfacing with specific APIs and so forth to seamlessly
>> >> calculate cooling times or EOS or opacities.  Let's revisit this issue
>> >> in the future.
>> >>
>> >> (Although, if we step back for a second and look at what's in yt ...
>> >> boundary condition calculations, cooling time calculations, gravity,
>> >> ... the mind does wander.)
>> >>
>> >> The revised bullet points I have:
>> >>
>> >> = What is the mission? =
>> >>  * To create a fun, community-led, open source tool for asking and
>> >> answering astrophysical questions through simulations, analysis and
>> >> visualization that allows one to ask astrophysical questions of
>> >> simulation data independent of the code used to produce that data.
>> >>  * To create a friendly, helpful community of scientists
>> >>  * To further the goals of Open Science
>> >>  * To construct an environment that encompasses the generation of
>> >> data, starting from initial conditions, through simulations, and
>> >> finally resulting in publication-quality plots
>> >>  * To create reproducible, cross-code questions and answers from
>> >> astrophysical data
>> >>  * To present simulation data in physical terms, rather than strictly
>> >> in simulation and data format terms
>> >>  * To construct a consistent language for asking questions of
>> >> simulation data from many sources
>> >>  * To encourage researchers to participate in constructing a community
>> >> code
>> >>  * To provide a place to create and share analysis codes, recipes, and
>> >> other things that can be helpful to others seeking to answer similar
>> >> scientific questions.
>> >>
>> >> The next step in this is to try to distill it down into a sentence or
>> >> two.  I've included my first pass at this.  Not all items have to be
>> >> included -- they can be shuffled off and left implicit in the proper
>> >> mission statement, but can show up in the broader directions.  The
>> >> ultimate goal of this is to provide both the short-form "elevator
>> >> pitch" and then augment that with what we could generously call
>> >> strategy documents.
>> >>
>> >> Draft 1:
>> >>
>> >> The yt project aims to produce an integrated science environment for
>> >> asking and answering astrophysical questions, encompassing the
>> >> creation of initial conditions, the execution of simulations and the
>> >> detailed exploration and visualization of the resultant data.
>> >> Development of yt is driven by a commitment to Open Science principles
>> >> as manifested in participatory development, reproducibility, a
>> >> friendly and helpful community of users and developers, and Free and
>> >> Libre Open Source Software.
>> >>
>> >> I'm not terribly satisfied with this draft.  I don't quite know how to
>> >> work in two things that I think should be stated -- that the end goal
>> >> is, ideally, a community project (whose bus factor is equal to the
>> >> number of users :) and that we want to focus on the physical
>> >> underpinnings of simulations when asking questions rather than, say,
>> >> the specifics of unformatted fortran or HDF5.  I think that the
>> >> broader focus (as an integrated science environment) comes across, but
>> >> the other core aspects are a bit underserved.
>> >>
>> >> Edits and suggestions?
>> >>
>> >> Thanks again, everyone.  I'm glad we're having this conversation.
>> >>
>> >> -Matt
>> >>
>> >>>> Anyway, maybe that's unrealistic, but it'd be awesome.  The few
>> >>>> workflow tools that I have been exposed to suffer from excessive
>> >>>> generality, and thus are a bit too cumbersome to be easy to use, and
>> >>>> thus too cumbersome to be actually used.
>> >>>>
>> >>>> --Brian
>> >>>>
>> >>>> On Mon, Jun 13, 2011 at 12:43 PM, Matthew Turk <matthewturk at gmail.com> wrote:
>> >>>>>
>> >>>>> Hi everyone,
>> >>>>>
>> >>>>> I hope you'll take the opportunity to read and respond to this
>> >>>>> email, even if you're not a heavy-developer, or even a heavy-user,
>> >>>>> of yt.  Your feedback and contributions would be greatly, greatly
>> >>>>> appreciated, particularly as this will help guide where yt
>> >>>>> development, community-building and (optimistically) use will go.
>> >>>>> I know that sometimes the signal-to-noise on the yt lists can be a
>> >>>>> bit low, but I think this is a particularly useful discussion to
>> >>>>> have.
>> >>>>>
>> >>>>> A few of us have been brainstorming, in person, in IRC, etc. about
>> >>>>> the direction yt has been going.  There are a number of reasons for
>> >>>>> doing this -- to provide focus, to provide an idea of the
>> >>>>> off-in-the-distance goal, and to have a public statement of what
>> >>>>> we're about, which shows ambition, concern for the values that go
>> >>>>> into a scientific code, and an interest in providing access to that
>> >>>>> code.  This boils down to coming up with a mission statement, which
>> >>>>> will help both focus our goals on what we want to provide, as well
>> >>>>> as describe those areas we do not want to provide.  Much of this is
>> >>>>> based on the contents of “The Art of Community” by Jono Bacon,
>> >>>>> specifically around page 71 in the PDF available on
>> >>>>> www.artofcommunityonline.org/get/ .
>> >>>>>
>> >>>>> “Mission statements are intended to be consistent and should rarely
>> >>>>> change, even if the tasks that achieve that mission change
>> >>>>> regularly.  When building your mission statement, always have its
>> >>>>> longevity in mind. Remember, your mission statement is your
>> >>>>> slam-dunking, audacious goal. For many communities these missions
>> >>>>> can take decades or even longer to achieve. Their purpose is to not
>> >>>>> only describe the finish line, but to help the community stay on
>> >>>>> track.”
>> >>>>>
>> >>>>> To develop a mission statement, which will act as a precursor to a
>> >>>>> strategic plan, we need to construct answers to three questions.
>> >>>>> These will provide the initial basis for a broader mission
>> >>>>> statement.  For reference, here are some “principles” we came up
>> >>>>> with several years ago:
>> >>>>>
>> >>>>> http://yt.enzotools.org/principles.html
>> >>>>>
>> >>>>> As I mentioned above, a few of us have been spitballing answers to
>> >>>>> these questions, and it has reached the point where we really need
>> >>>>> to bring this forward, to conduct these discussions in public, to
>> >>>>> bring some clarity and engagement to the process.  Ultimately, once
>> >>>>> we have sketched out a couple broad goals and bullet points, this
>> >>>>> can then be distilled into a short, pithy block of text that serves
>> >>>>> as a "Mission Statement."  Below are some potential bullet points,
>> >>>>> but I feel strongly that it's important that these get refined and
>> >>>>> discussed.
>> >>>>>
>> >>>>> = What is the mission? =
>> >>>>>  * To create a fun, community-led, open source tool for asking and
>> >>>>> answering astrophysical questions through simulations, analysis and
>> >>>>> visualization
>> >>>>>  * To create reproducible, cross-code questions and answers from
>> >>>>> astrophysical data
>> >>>>>  * To construct a consistent language for asking questions of
>> >>>>> simulation data from many sources
>> >>>>>  * To encourage researchers to participate in constructing a
>> >>>>> community code
>> >>>>>
>> >>>>> = What are the opportunities and areas of collaboration? =
>> >>>>>  * Development of new tools, new techniques, and adding support for
>> >>>>> new codes.
>> >>>>>  * Adding components to the GUI
>> >>>>>  * Providing outreach-capable frontends
>> >>>>>  * Improving visualization qualities
>> >>>>>  * Adding new methods of accessing data
>> >>>>>  * Performance analysis & optimization
>> >>>>>  * Deployment to new platforms
>> >>>>>  * Designing new web pages
>> >>>>>  * Writing documentation and recipes
>> >>>>>  * Spreading the word
>> >>>>>  * Support for Cartesian non-astrophysical simulations (weather,
>> >>>>> earthquakes)
>> >>>>>  * Extension to non-Cartesian coordinate systems
>> >>>>>  * Mentoring new developers
>> >>>>>
>> >>>>> = What are the skills required? =
>> >>>>>  * Thoughtful process
>> >>>>>  * Careful quality control
>> >>>>>  * Ability to communicate
>> >>>>>  * An investment in “the answer”
>> >>>>>  * Eagerness to participate in an open fashion
>> >>>>>
>> >>>>> What other bullets, ideas, inclinations do people have?  If we can
>> >>>>> start a discussion, maybe we can draft some text.  This would
>> >>>>> certainly help with focusing our strategies for presenting yt to
>> >>>>> others, directing our development in conjunction with our scientific
>> >>>>> goals, and collaborating as a community.
>> >>>>>
>> >>>>> Thanks very much for any thoughts,
>> >>>>>
>> >>>>> Matt
>> >>>>> _______________________________________________
>> >>>>> Yt-dev mailing list
>> >>>>> Yt-dev at lists.spacepope.org
>> >>>>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org


