[Yt-dev] yt documentation, standards, implementation

Matthew Turk matthewturk at gmail.com
Fri Jun 4 09:25:42 PDT 2010


Hi Jeff, everyone else,

Okay.  I think we largely agree on a couple of things we want, and on
a couple of things that are problematic, which I'll try to summarize
here.

 * The Numpy docstring standard is useful.  Adhering to some variant
of this, possibly with a narrative string that includes variable
names, is a good idea.
 * Cross-referencing between objects is essential.  This means that we
need to have a canonical "DataSource" reference in docstrings for
everything that accepts a 1-, 2- or 3-D data source, for instance.
 * Implicit in Jeff's comment above about pf.h? not returning anything
useful is actually something that should be handled: overloading of
methods is hard to document.  As of right now, if you have a data
object like a sphere, it's completely unmentioned in the documentation
that you can do sphere["Density"] and get back Density.
 * The idea of starting points is super interesting, and one that I
think needs to be explored.
 * We need a Big List Of Functions, which might also include "Big List
Of Data Sources", "Big List Of Quantities", "Big List Of Fields" and
"Big List Of Callbacks."
 * The current mechanism for generating data sources
("pf.h.something(...)"), which made sense when it was first
implemented, doesn't necessarily help us with the problem of getting
info from docstrings.
 * We need to be more aggressive in documenting things from a
programmatic standpoint.  I'd like to quickly point out that Stephen
Skory has really been great with writing narrative documentation for
his contributions.
 * Single-page-per-function docs are fun, and I am going to try to
implement that immediately, so that we can all see how poorly set up
the docstrings are as of right now.  ;-)  Related to this, I'm going
to ditch the disqus comments at the bottom of every page of the docs
-- we've had zero so far, and the more I think about it the less
useful it seems to me even if people *were* to comment.
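
As a minimal sketch of what the docstring points above could look like
in practice: the class below is a hypothetical stand-in for a data
source (SphereSketch, its constructor arguments, and the "DataSource"
cross-reference target are assumptions for illustration, not yt's
actual API).  It uses Numpy-standard sections and documents the
overloaded item access directly on __getitem__.

```python
class SphereSketch:
    """
    Hypothetical stand-in for a 3-D data source (not yt's actual class).

    Parameters
    ----------
    center : sequence of float
        Center of the sphere, in code units.
    radius : float
        Radius of the sphere, in code units.
    fields : dict of str to list of float
        Field values keyed by field name, e.g. ``{"Density": [...]}``.

    See Also
    --------
    DataSource : placeholder name for a canonical data-source reference.
    """

    def __init__(self, center, radius, fields):
        self.center = tuple(float(c) for c in center)
        self.radius = float(radius)
        # Copy so that later changes to the caller's dict do not leak in.
        self._fields = dict(fields)

    def __getitem__(self, field):
        """
        Return the values of `field` inside this object.

        Parameters
        ----------
        field : str
            Name of the field, e.g. ``"Density"``.

        Returns
        -------
        list of float
            The stored values, so that ``sphere["Density"]`` works.
        """
        return self._fields[field]


# Usage: item access is now documented where help() can find it.
sp = SphereSketch([0.5, 0.5, 0.5], 0.1, {"Density": [1.0, 2.0]})
density = sp["Density"]
```

Because __getitem__ carries its own docstring, help(SphereSketch)
lists the item-access behavior next to the constructor instead of
leaving it unmentioned.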

Perhaps this idea of entry points and roads is a way to move forward: we can
provide entry points to objects, and then show people roads to other
objects.  I'm just about ready at this point to scrap most of the
documentation that I've written and start anew, but unfortunately
that's a daunting task.  Documenting is not only hard to sit down and
do ... but it's hard to get right.  I know that when I write
documentation, I bring with me my knowledge and preconceptions about
how things can and should be done, and that gets in the way of laying
down a thread that goes from "Simulation" to "Plots."

I forget whether it was Brian O'Shea or Britton Smith who said this,
but someone told me back in 2008, "There are two types of things
you can do with yt.  'Things you can do with yt', and 'Things you can
do with yt if you're Matt.'"  I'd thought we'd gotten away from that,
but the abject failure of the online documentation and the inability
of the existing narrative documentation to take a user and move them
from Beginner to (even) Intermediate has begun to indicate to me that
we have not.

If you go to

http://yt.enzotools.org/doc/

right now, the table of contents is ridiculously long, largely because
of the cookbook.  I think that it should be substantially stripped
down, and I'm going to propose a new table of contents, which I'll do
my best to flesh out.  (This will also be the outline of my workshop
tutorial, I think.)

 * Introduction (move Jeff's "Analysis Philosophy" section here)
 * Installing and Starting Up
 * Just Looking At Some Data: new and revised introduction, no real
advanced stuff, nothing but meaty walkthrough
 * Cookbook
 * Intermediate and Advanced yt: less narrative, more discussion of
some underlying components like profiles, projections, extracted
regions
 * Extensions: this section will probably be moved, as-is
 * Contributing Code
 * Asking for Help
 * FAQ
 * YT Methods
 * API Docs: this is where we will have the many-many-pages of API
docs, hopefully organized relatively well.  I'll have a stab at this
today or soon.
 * Changelog

Anyway, I think maybe that would be an improvement ... but I'm not
sure I am yet on the right track.  Any thoughts or suggestions?

Thanks, everyone, for your helpful ideas and discussion.  I think
these kinds of discussions are important, because they help us to
re-evaluate the public-facing side of a code we're all contributing to
and that we can all be proud of.

-Matt


On Thu, Jun 3, 2010 at 5:34 PM, j s oishi <jsoishi at gmail.com> wrote:
> Hi,
>
> I just want to briefly make a few extra comments, though I won't add
> much to the already excellent discussion. First, I vote for the Numpy
> standard as well. However, I think Dave raises a very important point:
> dealing with the multidimensional nature of OOP documentation will be
> the biggest challenge. I have recently gone through and documented a
> very large, extremely object oriented code base (entirely for my own
> benefit, as I was the only novice user at the time). I found one of
> the biggest challenges in OOP is the notion of multiple entry points
> to various objects and their connection to one another (inheritance,
> especially). I don't really have a solution to the problems Dave
> mentions, but perhaps if we are clear about return values and can
> provide some way of linking input parameters to their objects, that
> might help a lot toward an "if you like this, why don't you try that"
> style.
>
> As a side note, not only does help(pf.h) reveal little information
> about class relationships, but in iyt
>
> In [1]: pf.h?
>
> returns nothing even remotely useful. This is definitely something
> that needs resolution, as there is a growing number of interactive yt
> users coming from IDL-land who may think there is no on-line help if ?
> fails.
>
> Like Tom and Matt, I had a great experience with an early manual
> typeset with one thing per page. This is a great idea, and I think we
> can improve it by setting out some kind of entry point for large yt
> concepts. For example with matplotlib, I always start with something
> like
>
> from matplotlib.pyplot import figure
>
> fig = figure()
> ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])  # [left, bottom, width, height]
>
> and that axis object is my entry point to nearly the entire plot. If
> we had something that was somewhere in between an API doc and the
> manual at large that gave a few starting points and then linked to the
> API doc for that entry point, perhaps that might help some.
>
> j
