[Yt-dev] A Mission Statement for yt

Sam Skillman samskillman at gmail.com
Sat Jun 18 07:03:15 PDT 2011


This looks great to me.  It's a good mix of what we can do now, what we want
to do in the future, and the manner in which we'll get there.  +1

On Fri, Jun 17, 2011 at 3:20 PM, j s oishi <jsoishi at gmail.com> wrote:

> This is looking awesome. Let me add a tiny bit of refinement:
>
> Draft 3
> --------
> The yt project aims to produce an integrated science environment
> for collaboratively asking and answering astrophysical
> questions. To do so, it will encompass the creation of initial
> conditions, the execution of simulations, and the detailed
> exploration and visualization of the resultant data.  It will
> also provide a standard framework based on physical quantities
> for interoperability between codes.
>
> Development of yt is driven by a commitment to Open Science
> principles as manifested in participatory development,
> reproducibility, documented and approachable code, a friendly and
> helpful community of users and developers, and Free and Libre
> Open Source Software.
>
>
> > The yt project aims to produce an integrated science environment for
> > collaboratively asking and answering astrophysical questions,
> > encompassing the creation of initial conditions, the execution of
> > simulations and the detailed exploration and visualization of the
> > resultant data.  It will provide a standard framework based on
> > physical quantities for interoperability between codes. Development
> > of yt is driven by a commitment to Open Science principles as
> > manifested in participatory development, reproducibility, documented
> > and approachable code, a friendly and helpful community of users and
> > developers, and Free and Libre Open Source Software.
> >
> >
> >
> > On 6/14/11 1:02 AM, Matthew Turk wrote:
> >>
> >> Hi all,
> >>
> >> tl;dr summary: New bullet points below, along with a first draft at a
> >> proper prose solidification.  More comments still requested.
> >>
> >> Thanks everyone for your thoughtful responses.  Having this discussion
> >> as a group is really the only way to have the outcome of it be
> >> meaningful; I'm glad we could have as much of a discussion as we
> >> already have, and I hope that moving forward we can keep talking about
> >> this and refining it to sort of steer ourselves.
> >>
> >> In reading over your replies, it's become clear that the mission goal
> >> bullet points in that original list were a bit ... well, shall we say,
> >> under-ambitious?  So let's take the gloves off a bit.  Chris, your
> >> comments in particular made me realize that my own feelings about this
> >> project we've got going are a bit more ambitious; Brian, Britton and
> >> Jeff, yours did as well.  And this didn't come through in the bullet
> >> points, although it was alluded to.  I'll put my comments at the
> >> bottom, along with an updated list of bullet points, after I respond
> >> to a couple things that were brought up.
> >>
> >> On Mon, Jun 13, 2011 at 5:06 PM, j s oishi<jsoishi at gmail.com>  wrote:
> >>>
> >>> Hi all,
> >>>
> >>> This is already a great discussion, and it seems to me that there are
> >>> a lot of great ideas that I only want to echo. First, I think what
> >>> Matt started with is a great foundation: yt should continue to be a
> >>> collaborative, open-source tool for reproducible physical analysis of
> >>> simulation data. I think the idea that not only is yt itself open
> >>> source, but the *entire software stack upon which it rests* is also
> >>> open source is a point worth emphasizing. If nothing else, a user gets
> >>> not only yt but also an amazing toolkit for doing numerical
> >>> computation free of cost and restrictive licenses.
> >>
> >> That is a very, very good point -- and the impetus for its initial
> >> creation, actually.
> >>
> >>> Second, I think that Brian's point about data provenance and
> >>> reproducibility of an entire project is really a direction I would
> >>> love to see yt move in. yt should allow (and encourage!)
> >>> reproducibility beyond analysis to include simulation initialization,
> >>> runtime, and final, reduced data products. Furthermore, I believe it
> >>> should be able to do this in a cross-code manner: imagine having a set
> >>> of descriptions (perhaps in the form of yt scripts, perhaps in some
> >>> other machine/human readable format) that describe initial conditions,
> >>> runtime parameters, analysis outputs and data products that could be
> >>> run on Enzo and Ramses. We could move beyond code comparison test
> >>> problems to real inter-code reproducibility.
> >>
> >> Yes.  Yes, and more yes; I firmly believe in this.  Looking,
> >> realistically, at where we are and where we are going, I believe this
> >> is an utterly feasible goal, and the timescale is not terribly great
> >> -- effort simply needs to be applied in that direction.
> >>
> >> We can have a longer discussion about this, but I think having this
> >> item in the mission statement for now is sufficient.
> >>
> >>> Finally, I think that Britton is right that we should also continue to
> >>> emphasize that yt is a tool for physical reasoning on simulation data,
> >>> and that *it*, not *you the user*, makes all necessary manipulations to
> >>> get simulation data to physical quantities.
> >>>
> >>> Thanks again for starting such an interesting discussion. I look
> >>> forward to moving forward with yt.
> >>>
> >>> j
> >>>
> >>> On Mon, Jun 13, 2011 at 4:23 PM, Brian O'Shea <bwoshea at gmail.com> wrote:
> >>>>
> >>>> Hi Matt,
> >>>>
> >>>> This may not be something specifically for the mission statement
> >>>> (depending on how wordy we want to get), but I'm very interested in
> >>>> using yt (or something that encompasses yt) as a workflow tool so
> >>>> that my simulations are completely reproducible.  What I could
> >>>> imagine is something like this:
> >>
> >> I very much like the workflow you laid out, although I would contend
> >> we should address more directly the task of running the simulation.
> >> On some level, it becomes a realizable goal to execute the main loop
> >> of the code in Python without any real overhead.  This will have the
> >> side effect of providing much easier access to the data during the
> >> course of the simulation.
> >>
> >> I would also scratch out "Enzo" and replace it with "Simulation Code"
> >> -- while pragmatically I recognize your simulations will likely be
> >> conducted in Enzo for the purposes of this provenance tracking, I feel
> >> it should be said that for the mission statement I believe in a
> >> code-neutral direction.
> >>
> >> One difficulty here is the idea of actually moving the data.  It is
> >> not clear to me that moving data around in file systems is a
> >> tractable, solvable problem.  That is a good thing to strive for, but
> >> I personally can't wrap my head around it.  Stephen?  Britton?
> >>
> >>>> 1.  Generate initial conditions, cosmological or otherwise.  IC
> >>>> parameter file goes into a database, along with details about the
> >>>> code that's used to generate my ICs (inits/MUSIC/grafic hash,
> >>>> outputs of make show-config and make show-flags, etc.)
> >>>>
> >>>> 2.  Run simulation.  Run-related and performance information is
> >>>> collected in a database.  (what supercomputer?  How many CPUs?
> >>>> Environmental variables?  Which version of MPI?  What date(s) did
> >>>> the job run on?  What nodes?  Copy of Enzo restart parameter files
> >>>> and perhaps hierarchy files, for later query?)
> >>>>
> >>>> 3.  Back up Enzo data to mass storage (or perhaps some subset of the
> >>>> data, depending on how big the sim is).  What directory is it in?
> >>>> Should it be world-readable, group-readable, etc.?
> >>>>
> >>>> 4.  Do analysis.  Record all details of analysis and plot making, so
> >>>> that I can go back and retrace all details.
> >>>>
> >>>> At that point, I would know _precisely_ how and with what
> >>>> commands/code/parameter files/etc. the plots that are in my papers,
> >>>> and everything leading up to them, were generated.  This helps when
> >>>> you go back to deal with the referee ("how DID I make that stupid
> >>>> plot?  What was sigma8 again?"), but also for reproducibility,
> >>>> since in principle somebody could just go back, look at the
> >>>> database, and be able to do precisely what I did.  Also, if
> >>>> somebody wanted to use archival data - something we hope to do more
> >>>> of in the future, as simulations grow in expense and complexity -
> >>>> there'd be no confusion about the provenance of that data.
> >>>>
> >>>> If I had to sum this up in a sentence, it'd be "Transform yt into a
> >>>> tool for easily and transparently tracking all aspects of
> >>>> simulation generation, execution, and analysis for the purposes of
> >>>> reproducibility."
> >>
> >> That's a great sentence.
> >>
> >> Britton: I like your additions very much.  It had completely slipped
> >> my mind that one of the most useful features of yt is its physical and
> >> geometric object selection.
> >>
> >> Chris: I don't think you are stepping outside the scope of what we
> >> could generously call "The yt project" with what you mention.  There
> >> are the technical goals, and the broader community goals.  The goals
> >> of open science, reproducibility, and cultivating a community of
> >> scientists willing to share scripts, analysis routines, and even
> >> analysis modules are certainly part of what I think we are all
> >> striving for.  And this goal isn't composed of just deploying
> >> infrastructure, rolling it out, but also providing a welcoming and
> >> friendly community of people willing to help.  For instance, it's
> >> great that scripts written to generate phase plots from Orion outputs
> >> can be used nearly unmodified on Enzo outputs.  (I remember when Jeff
> >> worked so hard to make this so --
> >> http://www.flickr.com/photos/matthewturk/2598141965/ )  But even more
> >> than that, I think it's amazing that people are willing to share these
> >> scripts.
> >>
> >> So yes, let's bake that right into the mission statement.
> >>
> >> As for your comments about the microphysical solvers, believe me when
> >> I say they are not falling on deaf ears.  Moving toward an open
> >> source, community-driven model for microphysical solvers is an issue
> >> near and dear to my own heart, having spent several years of my life
> >> writing a primordial chemistry solver.  I believe there is a place for
> >> interfacing with that sort of project and endeavor inside yt -- in
> >> particular, interfacing with specific APIs and so forth to seamlessly
> >> calculate cooling times or EOS or opacities.  Let's revisit this issue
> >> in the future.
> >>
> >> (Although, if we step back for a second and look at what's in yt ...
> >> boundary condition calculations, cooling time calculations, gravity,
> >> ... the mind does wander.)
> >>
> >> The revised bullet points I have:
> >>
> >> = What is the mission? =
> >>  * To create a fun, community-led, open source tool for asking and
> >> answering astrophysical questions through simulations, analysis and
> >> visualization that allows one to ask astrophysical questions of
> >> simulation data independent of the code used to produce that data.
> >>  * To create a friendly, helpful community of scientists
> >>  * To further the goals of Open Science
> >>  * To construct an environment that encompasses the generation of
> >> data, starting from initial conditions, through simulations, and
> >> finally resulting in publication-quality plots
> >>  * To create reproducible, cross-code questions and answers from
> >> astrophysical data
> >>  * To present simulation data in physical terms, rather than strictly
> >> in simulation and data format terms
> >>  * To construct a consistent language for asking questions of
> >> simulation data from many sources
> >>  * To encourage researchers to participate in constructing a community
> >> code
> >>  * To provide a place to create and share analysis codes, recipes, and
> >> other things that can be helpful to others seeking to answer similar
> >> scientific questions.
> >>
> >> The next step in this is to try to distill it down into a sentence or
> >> two.  I've included my first pass at this.  Not all items have to be
> >> included -- they can be shuffled off and left implicit in the proper
> >> mission statement, but can show up in the broader directions.  The
> >> ultimate goal of this is to provide both the short-form "elevator
> >> pitch" and then augment that with what we could generously call
> >> strategy documents.
> >>
> >> Draft 1:
> >>
> >> The yt project aims to produce an integrated science environment for
> >> asking and answering astrophysical questions, encompassing the
> >> creation of initial conditions, the execution of simulations and the
> >> detailed exploration and visualization of the resultant data.
> >> Development of yt is driven by a commitment to Open Science principles
> >> as manifested in participatory development, reproducibility, a
> >> friendly and helpful community of users and developers, and Free and
> >> Libre Open Source Software.
> >>
> >> I'm not terribly satisfied with this draft.  I don't quite know how to
> >> work in two things that I think should be stated -- that the end goal
> >> is, ideally, a community project (whose bus factor is equal to the
> >> number of users :) and that we want to focus on the physical
> >> underpinnings of simulations when asking questions rather than, say,
> >> the specifics of unformatted fortran or HDF5.  I think that the
> >> broader focus (as an integrated science environment) comes across, but
> >> the other core aspects are a bit underserved.
> >>
> >> Edits and suggestions?
> >>
> >> Thanks again, everyone.  I'm glad we're having this conversation.
> >>
> >> -Matt
> >>
> >>>> Anyway, maybe that's unrealistic, but it'd be awesome.  The few
> >>>> workflow tools that I have been exposed to suffer from excessive
> >>>> generality, and thus are a bit too cumbersome to be easy to use,
> >>>> and thus too cumbersome to be actually used.
> >>>>
> >>>> --Brian
> >>>>
> >>>> On Mon, Jun 13, 2011 at 12:43 PM, Matthew Turk <matthewturk at gmail.com> wrote:
> >>>>>
> >>>>> Hi everyone,
> >>>>>
> >>>>> I hope you'll take the opportunity to read and respond to this email,
> >>>>> even if you're not a heavy-developer, or even a heavy-user, of yt.
> >>>>> Your feedback and contributions would be greatly, greatly
> >>>>> appreciated, particularly as this will help guide where yt
> >>>>> development, community-building and (optimistically) use will go.
> >>>>> I know that sometimes the signal-to-noise on the yt lists can be a
> >>>>> bit low, but I think this is a particularly useful discussion to
> >>>>> have.
> >>>>>
> >>>>> A few of us have been brainstorming, in person, in IRC, etc., about
> >>>>> the direction yt has been going.  There are a number of reasons for
> >>>>> doing this -- to provide focus, to provide an idea of the
> >>>>> off-in-the-distance goal, and to have a public statement of what
> >>>>> we're about, which shows ambition, concern for the values that go
> >>>>> into a scientific code, and an interest in providing access to that
> >>>>> code.  This boils down to coming up with a mission statement, which
> >>>>> will help both focus our goals on what we want to provide, as well
> >>>>> as describe those areas we do not want to provide.  Much of this is
> >>>>> based on the contents of “The Art of Community” by Jono Bacon,
> >>>>> specifically around page 71 in the PDF available on
> >>>>> www.artofcommunityonline.org/get/ .
> >>>>>
> >>>>> “Mission statements are intended to be consistent and should rarely
> >>>>> change, even if the tasks that achieve that mission change regularly.
> >>>>> When building your mission statement, always have its longevity in
> >>>>> mind. Remember, your mission statement is your slam-dunking,
> >>>>> audacious goal. For many communities these missions can take
> >>>>> decades or even longer to achieve. Their purpose is to not only
> >>>>> describe the finish line, but to help the community stay on track.”
> >>>>>
> >>>>> To develop a mission statement, which will act as a precursor to a
> >>>>> strategic plan, we need to construct answers to three questions.
> >>>>> These will provide the initial basis for a broader mission statement.
> >>>>> For reference, here are some “principles” we came up with several
> >>>>> years ago:
> >>>>>
> >>>>> http://yt.enzotools.org/principles.html
> >>>>>
> >>>>> As I mentioned above, a few of us have been spitballing answers to
> >>>>> these questions, and it has reached the point where we really need to
> >>>>> bring this forward, to conduct these discussions in public, to bring
> >>>>> some clarity and engagement to the process.  Ultimately, once we have
> >>>>> sketched out a couple broad goals and bullet points, this can then be
> >>>>> distilled into a short, pithy block of text that serves as a "Mission
> >>>>> Statement."  Below are some potential bullet points, but I feel
> >>>>> strongly that it's important that these get refined and discussed.
> >>>>>
> >>>>> = What is the mission? =
> >>>>>  * To create a fun, community-led, open source tool for asking and
> >>>>> answering astrophysical questions through simulations, analysis and
> >>>>> visualization
> >>>>>  * To create reproducible, cross-code questions and answers from
> >>>>> astrophysical data
> >>>>>  * To construct a consistent language for asking questions of
> >>>>> simulation data from many sources
> >>>>>  * To encourage researchers to participate in constructing a
> >>>>> community code
> >>>>>
> >>>>> = What are the opportunities and areas of collaboration? =
> >>>>>  * Development of new tools, new techniques, and adding support for
> >>>>> new codes.
> >>>>>  * Adding components to the GUI
> >>>>>  * Providing outreach-capable frontends
> >>>>>  * Improving visualization qualities
> >>>>>  * Adding new methods of accessing data
> >>>>>  * Performance analysis & optimization
> >>>>>  * Deployment to new platforms
> >>>>>  * Designing new web pages
> >>>>>  * Writing documentation and recipes
> >>>>>  * Spreading the word
> >>>>>  * Support for Cartesian non-astrophysical simulations (weather,
> >>>>> earthquakes)
> >>>>>  * Extension to non-Cartesian coordinate systems
> >>>>>  * Mentoring new developers
> >>>>>
> >>>>> = What are the skills required? =
> >>>>>  * Thoughtful process
> >>>>>  * Careful quality control
> >>>>>  * Ability to communicate
> >>>>>  * An investment in “the answer”
> >>>>>  * Eagerness to participate in an open fashion
> >>>>>
> >>>>> What other bullets, ideas, inclinations do people have?  If we can
> >>>>> start a discussion, maybe we can draft some text.  This would
> >>>>> certainly help with focusing our strategies for presenting yt to
> >>>>> others, directing our development in conjunction with our scientific
> >>>>> goals, and collaborating as a community.
> >>>>>
> >>>>> Thanks very much for any thoughts,
> >>>>>
> >>>>> Matt
> >>>>> _______________________________________________
> >>>>> Yt-dev mailing list
> >>>>> Yt-dev at lists.spacepope.org
> >>>>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org