[Yt-dev] yt documentation, standards, implementation

Sam Skillman samskillman at gmail.com
Tue Jun 1 08:10:30 PDT 2010


Hi all,

First of all let me say that I'm a fan of moving to more complete
docstrings.  As Matt said, the help() or ? capability in python is really a
great feature that makes it easy to explore an analysis/computation package
and has been crucial for me with packages such as numpy, matplotlib, scipy,
etc.  I think that if we were to move to a standardized format for the
docstrings, both beginning and advanced users of yt would benefit.  For
example, if you pull up the help on pc.add_slice, you get:

In [4]: ?pc.add_slice
Type: instancemethod
Base Class: <type 'instancemethod'>
String Form: <bound method PlotCollection.add_slice of <yt.raven.PlotCollection.PlotCollection object at 0x1047dcdd0>>
Namespace: Interactive
File: /Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/yt-2.0dev-py2.6-macosx-10.6-i386.egg/yt/raven/PlotCollection.py
Definition: pc.add_slice(self, *args, **kwargs)
Docstring:
    Generate a slice through *field* along *axis*, optionally at
    [axis]=*coord*, with the *center* attribute given (some
    degeneracy with *coord*, but not complete), with *use_colorbar*
    specifying whether the plot is naked or not and optionally
    providing pre-existing Matplotlib *figure* and *axes* objects.
    *fig_size* in (height_inches, width_inches)

This is tricky because the signature has moved to *args, **kwargs, and the
docstring does not explain what those args and kwargs are.  Explicitly
listing the arguments, along with their types and a quick description,
would be invaluable to new users.
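To make the point concrete, here is a minimal sketch of how a bare
*args, **kwargs signature hides the arguments from the inline help, while
explicit parameters plus a NumPy-style docstring expose them.  Both
functions, their parameter names, types, and defaults are made up for
illustration; this is not the real PlotCollection.add_slice signature.  It
also assumes a current Python 3 (inspect.signature is not available in the
2.6 install shown above).

```python
import inspect


def add_slice_opaque(*args, **kwargs):
    """Generate a slice through *field* along *axis*."""
    # Hypothetical stand-in: the signature says nothing about the arguments.
    return args, kwargs


def add_slice_explicit(field, axis, coord=None, use_colorbar=True):
    """
    Generate a slice through a field along an axis.

    Parameters
    ----------
    field : str
        Name of the field to slice (assumed type, for illustration).
    axis : int
        Axis along which to slice: 0, 1, or 2 (assumed type).
    coord : float, optional
        Position of the slice along `axis`.
    use_colorbar : bool, optional
        Whether to draw a colorbar next to the plot.
    """
    # Hypothetical stand-in: return the arguments instead of plotting.
    return field, axis, coord, use_colorbar


# Compare what the interpreter can report about each function.
opaque_sig = str(inspect.signature(add_slice_opaque))
explicit_sig = str(inspect.signature(add_slice_explicit))
print(opaque_sig)
print(explicit_sig)
```

With the explicit version, help() shows both the argument names in the
signature and the per-argument descriptions from the docstring.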

As for the format of the docstrings, I am fine with the NumPy standard.
Whatever the standard becomes, I would suggest keeping a boilerplate
example somewhere like the doc folder, so that developers can copy it into
any new function and edit where necessary.  I'm thinking of something like:

    """
    DESCRIPTION

    CAVEATS

    Parameters
    ----------
    ARG : TYPE
        ARG_DESCRIPTION
    KWARG : TYPE, optional
        KWARG_DESCRIPTION

    Returns
    -------
    VAL : TYPE
        VAL_DESCRIPTION

    Raises
    ------
    ERROR
        ERROR_CASE.

    See Also
    --------
    OTHER_FUNCTIONS

    Examples
    --------
    >>> SETUP
    >>> EXECUTE
    >>> SHOW_RESULTS

    """

This should help reduce the overhead in properly documenting newly written
functions.
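As a rough sketch of what a filled-in copy of the boilerplate might look
like, here is a small function documented this way.  The function
clip_value is a made-up example and not part of yt.  Since the Examples
section uses the >>> prompt, the standard library's doctest module can run
it as a check that the documented output is still correct:

```python
import doctest


def clip_value(value, lower=0.0, upper=1.0):
    """
    Restrict a number to the closed interval [lower, upper].

    The bounds are not sorted; passing lower > upper raises an error.

    Parameters
    ----------
    value : float
        The number to clip.
    lower : float, optional
        Smallest value to return.  Default is 0.0.
    upper : float, optional
        Largest value to return.  Default is 1.0.

    Returns
    -------
    clipped : float
        `value` limited to the range [lower, upper].

    Raises
    ------
    ValueError
        If `lower` is greater than `upper`.

    See Also
    --------
    min, max

    Examples
    --------
    >>> clip_value(1.5)
    1.0
    >>> clip_value(-2.0, lower=-1.0)
    -1.0
    >>> clip_value(0.25)
    0.25

    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return max(lower, min(value, upper))


# Parse the docstring and run its Examples section as tests.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    clip_value.__doc__, {"clip_value": clip_value}, "clip_value", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

For a whole file, running "python -m doctest <file>" does the same thing
for every docstring in it.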

Finally, since I do seem to read the majority of commits coming in, I'd be
happy to be the guy to remind someone when their docstrings aren't up to
snuff.

Cheers,
Sam

On Sun, May 30, 2010 at 4:18 PM, Matthew Turk <matthewturk at gmail.com> wrote:

> Hi all,
>
> This last Friday I had a chance to talk to Tom Abel and Oliver Hahn
> (both CC'd on this message) about their experiences with using yt, and
> they brought up some points which I've now had a chance to think
> about, and which I find very interesting, certainly as something to
> discuss.  Here are my notes on it, along with a proposal for moving
> forward.
>
> As a quick note, what really hit home that we need better
> documentation was trying to make a thin projection.  The definition of
> what a 'source' could be wasn't there, there were no examples, and I
> had to go look at the source to figure out what the parameters were
> even called.  I think that's not ... good.
>
> Python Inline Documentation
> ===========================
>
> One of the coolest things about Python is the help() function, which
> prints out the function signature and the contents of the doc string.
> In the source code, the docstring is inline in the function, like so:
>
> def some_function(a, b, c):
>    """
>    This function does something.
>    """
>    return a+b+c
>
> The output of help(some_function) would look like this:
>
> >>> help(some_function)
> Help on function some_function in module __main__:
>
> some_function(a, b, c)
>    This function does something.
> >>>
>
> Generated Documentation
> =======================
>
> The yt docs are generated using an extension to Sphinx called autodoc.
> As you can see by going to the API docs and clicking "view source"
> (which, counterintuitively, displays the doc source and not the source
> code of the functions), at documentation build time it pulls all the
> docstrings from the source and renders them in the document.  Ideally,
> we would want something that renders nicely and also looks good in the
> inline help -- maximizing the detail without becoming cumbersome.
>
> Most of the functions in yt that have docstrings were written in a
> narrative style, with parameters inside asterisks, so that they would
> render nicely in the API docs:
>
>
> http://yt.enzotools.org/doc/modules/amrcode.html#yt-lagos-outputtypes-output-types
>
> But, it's becoming clear that perhaps this is not the best approach.
> I think a combination of narrative and explicit parameter declaration
> would be better.  The NumPy/SciPy projects have a CodingStandards
> description:
>
> http://projects.scipy.org/numpy/wiki/CodingStyleGuidelines
>
> that covers docstrings, with a very detailed example of a completely
> filled out docstring here:
>
> http://svn.scipy.org/svn/numpy/trunk/doc/example.py
>
> As an example, the 'tensorsolve' function is defined here:
>
> http://svn.scipy.org/svn/numpy/trunk/numpy/linalg/linalg.py
>
> and the API docs are here:
>
>
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.tensorsolve.html
>
> This looks great, I think.  yt is a bit more class-oriented than
> NumPy, but I believe that we should strive for a similar level of
> detail as well as a similar style: presenting parameters, what those
> parameters can be, and a brief word on the return type.
>
> Ideal Type Of Documentation
> ===========================
>
> A few weeks ago, Tom and I were chatting and he mentioned to me a
> Pascal manual.  In this manual, there was a single function on every
> page: a description, parameters (often repeated between functions, but
> explicitly listed for each), and an example.  My first Unix manual was
> exactly like this, and I remember it being one of the best sets of
> documentation I've ever used.  I believe this is the model NumPy and
> SciPy are striving for, as well.
>
> I think this is what yt should strive for, too.  One page per class or
> function, with a description, parameters, and examples -- just like
> mentioned above.  In doing so, I think that the online help -- which
> right now is sort of helpful, but not amazingly helpful -- would become
> much more useful.
>
> The fact that on the mailing lists we get questions asking us about
> fundamental operations in yt is, I think, an indictment of the way
> it's presented.  As the Enzo Workshop revs up, a couple of us will be
> writing talks about using Enzo, using yt, etc, and I think this is a
> time to harness that momentum to reorganize and rewrite some of the
> doc strings.  Of course, I would take the lead on the initial rewrite,
> as I'm the one who wrote all the bad docstrings.
>
> What does everyone think about this?
>
> Action Items
> ============
>
> (It wouldn't be a long email about procedures if we didn't use a
> buzzword like 'action items' :)
>
> Firstly: a vote and a request for comments.
>
> Do we want to agree on the NumPy standard for docstrings?  What does
> everyone think about this idea, of a set of docstring guidelines, and
> trying to focus on a better set of API documentation, to be used both
> in generated form and inline via help()?
>
> If we can agree on the NumPy standard, I believe that I should be able
> to convert most of the docstrings with some relative ease; it's mostly
> going to be a matter of typing, copy/pasting, etc.  I will copy a
> style guide into doc/, which will be largely taken from the NumPy
> style guide, but I will additionally add a document with examples for
> common strings: I would prefer we have a single, consistent manner for
> referring to things like AMR3DData as a source, for instance.  I will
> then go through and convert all the doc strings that I am familiar
> with.  This would leave us with three files:
>
>    * Example docstring, which can be read in verbatim and edited.
>    * List of yt idioms for cross-referencing and describing things.
>    * File describing this standard, largely pulling from the NumPy
> standard.
>
> The next question is, going forward, how do we ensure that docstrings
> are correctly included with new code?  I am more guilty of neglecting
> this than I would care to admit (I sometimes fall into the camp of
> thinking that functions with well-named parameters are
> self-documenting, which is probably a mistake!), but I think it would
> help to have someone agree to review incoming changesets for
> documentation updates, and then to email the committer if a changeset
> does not have a sufficient docstring.  My inclination is to suggest that
> someone who already reviews incoming changesets do this, which I think
> means either me, Sam or Stephen.  Sam, would you be willing to take this
> on?  It should be relatively straightforward.
>
> Additionally, would anyone volunteer to help me out with rewriting
> some of the existing docstrings?  In particular, for code you have
> contributed?
>
> The End
> =======
>
> I think that if we really take the docstrings seriously, then the
> documentation on the whole will vastly improve.  I am in the process
> of rewriting some sections, removing the old-style tutorial and trying
> to better walk the user through the process of getting up and running.
> The current documentation has a lot of information, but it's not very
> good at getting people up and running in anything other than the most
> simple manner.  I think that getting started on improving the
> docstrings will also help refocus efforts toward better documentation
> on the whole.  And, I'd like to end by admitting culpability for the
> sorry state of the docstrings we currently have.  But I think this
> might be good, in the long run, because it'll help out with getting us
> on track for a better code that's much easier to use!
>
> And finally, thanks to Tom and Oliver for taking the time to chat with
> me about this -- I really appreciate their thoughtful feedback on
> this.
>
> Best,
>
> Matt



-- 
Samuel W. Skillman
DOE Computational Science Graduate Fellow
Center for Astrophysics and Space Astronomy
University of Colorado at Boulder
samuel.skillman[at]colorado.edu