[yt-svn] commit/yt: 6 new changesets

commits-noreply at bitbucket.org
Wed Jul 30 08:41:24 PDT 2014


6 new commits in yt:

https://bitbucket.org/yt_analysis/yt/commits/b896ec2eda8b/
Changeset:   b896ec2eda8b
Branch:      yt-3.0
User:        chummels
Date:        2014-07-30 07:02:17
Summary:     Updating parallel docs.
Affected #:  1 file

diff -r d2c88a547e71b223491ca4a00fda7bc783eb2c3a -r b896ec2eda8b2bed2a2674d3473322c469aa7777 doc/source/analyzing/parallel_computation.rst
--- a/doc/source/analyzing/parallel_computation.rst
+++ b/doc/source/analyzing/parallel_computation.rst
@@ -1,7 +1,7 @@
 .. _parallel-computation:
 
-Parallel Computation With YT
-============================
+Parallel Computation With ``yt``
+================================
 
 ``yt`` has been instrumented with the ability to compute many -- most, even --
 quantities in parallel.  This utilizes the package 
@@ -15,23 +15,23 @@
 
 Currently, ``yt`` is able to perform the following actions in parallel:
 
- * Projections (:ref:`projection-plots`)
- * Slices (:ref:`slice-plots`)
- * Cutting planes (oblique slices) (:ref:`off-axis-slices`)
- * Derived Quantities (total mass, angular momentum, etc) (:ref:`creating_derived_quantities`,
-   :ref:`derived-quantities`)
- * 1-, 2-, and 3-D profiles (:ref:`generating-profiles-and-histograms`)
- * Halo finding (:ref:`halo_finding`)
- * Volume rendering (:ref:`volume_rendering`)
- * Isocontours & flux calculations (:ref:`extracting-isocontour-information`)
+* Projections (:ref:`projection-plots`)
+* Slices (:ref:`slice-plots`)
+* Cutting planes (oblique slices) (:ref:`off-axis-slices`)
+* Derived Quantities (total mass, angular momentum, etc) (:ref:`creating_derived_quantities`,
+  :ref:`derived-quantities`)
+* 1-, 2-, and 3-D profiles (:ref:`generating-profiles-and-histograms`)
+* Halo finding (:ref:`halo_finding`)
+* Volume rendering (:ref:`volume_rendering`)
+* Isocontours & flux calculations (:ref:`extracting-isocontour-information`)
 
 This list covers just about every action ``yt`` can take!  Additionally, almost all
 scripts will benefit from parallelization with minimal modification.  The goal
 of Parallel-``yt`` has been to retain API compatibility and abstract all
 parallelism.
 
-Setting Up Parallel YT
-----------------------
+Setting Up Parallel ``yt``
+--------------------------
 
 To run scripts in parallel, you must first install `mpi4py
 <http://code.google.com/p/mpi4py>`_ as well as an MPI library, if one is not
@@ -73,10 +73,10 @@
 work in parallel -- and no additional work is necessary to parallelize those
 processes.
 
-Running a ``yt`` script in parallel
+Running a ``yt`` Script in Parallel
 -----------------------------------
 
-Many basic ``yt`` operations will run in parallel if yt's parallelism is enabled at
+Many basic ``yt`` operations will run in parallel if ``yt``'s parallelism is enabled at
 startup.  For example, the following script finds the maximum density location
 in the simulation and then makes a plot of the projected density:
 
@@ -105,7 +105,7 @@
    If you run into problems, then you can use :ref:`remote-debugging` to examine
    what went wrong.
 
-Creating Parallel and Serial Sections in a script
+Creating Parallel and Serial Sections in a Script
 +++++++++++++++++++++++++++++++++++++++++++++++++
 
 Many ``yt`` operations will automatically run in parallel (see the next section for
@@ -162,7 +162,7 @@
 --------------------
 
 In order to divide up the work, ``yt`` will attempt to send different tasks to
-different processors.  However, to minimize inter-process communication, YT
+different processors.  However, to minimize inter-process communication, ``yt``
 will decompose the information in different ways based on the task.
 
 Spatial Decomposition
@@ -175,8 +175,8 @@
 
 The following operations use spatial decomposition:
 
-  * Halo finding
-  * Volume rendering
+* :ref:`halo_finding`
+* :ref:`volume_rendering`
 
 Grid Decomposition
 ++++++++++++++++++
@@ -188,27 +188,68 @@
 
 The following operations use chunk decomposition:
 
-  * Projections
-  * Slices
-  * Cutting planes
-  * Derived Quantities
-  * 1-, 2-, and 3-D profiles
-  * Isocontours & flux calculations
+* Projections (see :ref:`available-objects`)
+* Slices (see :ref:`available-objects`)
+* Cutting planes (see :ref:`available-objects`)
+* Derived Quantities (see :ref:`derived-quantities`)
+* 1-, 2-, and 3-D profiles (see :ref:`generating-profiles-and-histograms`)
+* Isocontours & flux calculations (see :ref:`surfaces`)
 
-Object-Based
-++++++++++++
+Parallelization over Multiple Objects and Datasets
+++++++++++++++++++++++++++++++++++++++++++++++++++
 
-In a fashion similar to grid decomposition, computation can be parallelized
-over objects. This is especially useful for
+If you have a set of computational steps that need to apply identically and 
+independently to several different objects or datasets, a so-called 
 `embarrassingly parallel <http://en.wikipedia.org/wiki/Embarrassingly_parallel>`_
-tasks where the items to be worked on can be split into separate chunks and
-saved to a list. The list is then split up and each MPI task performs parts of
-it independently.
+task, ``yt`` can do that easily.  See the sections below on 
+:ref:`parallelizing-your-analysis` and :ref:`parallel-time-series-analysis`.
+
+Use of ``piter()``
+^^^^^^^^^^^^^^^^^^
+
+If you use parallelism over objects or datasets, you will encounter
+the ``piter()`` function.  ``piter`` is a parallel iterator, which effectively
+doles out each item of a DatasetSeries object to a different processor.  In
+serial processing, you might iterate over a DatasetSeries by:
+
+.. code-block:: python
+
+    for dataset in dataset_series:
+        <process>
+
+But in parallel, you can use ``piter()`` to force each dataset to go to
+a different processor:
+
+.. code-block:: python
+
+    yt.enable_parallelism()
+    for dataset in dataset_series.piter():
+        <process>
+
+However, because things are being done in parallel, you cannot share other
+data structures in the processing loop to store the outputs of your *<processing>*
+step.  But ``piter()`` provides functionality for this.  You may define an
+empty dictionary and include it as the keyword argument ``storage`` to 
+``piter()`` and it will be able to store processed data from within the
+``piter`` loop as the ``sto`` object for use afterwards.  After the loop is 
+finished, the dictionary is re-aggregated from all of the processors, and you 
+can use the contents:
+
+.. code-block:: python
+
+    yt.enable_parallelism()
+    my_dictionary = {}
+    for sto, dataset in dataset_series.piter(storage=my_dictionary):
+        <process>
+        sto.result = <some information processed for this dataset>
+        sto.result_id = <some identifier for this dataset>
+
+    print my_dictionary
 
 .. _parallelizing-your-analysis:
 
-Parallelizing Your Analysis
----------------------------
+Parallelizing over Multiple Objects
+-----------------------------------
 
 It is easy within ``yt`` to parallelize a list of tasks, as long as those tasks
 are independent of one another. Using object-based parallelism, the function
@@ -279,8 +320,8 @@
 
 .. _parallel-time-series-analysis:
 
-Parallel Time Series Analysis
------------------------------
+Parallelization over Multiple Datasets (including Time Series)
+--------------------------------------------------------------
 
 The same ``parallel_objects`` machinery discussed above is turned on by
 default when using a :class:`~yt.data_objects.time_series.DatasetSeries` object
@@ -294,15 +335,23 @@
    import yt
    yt.enable_parallelism()
 
+   # Load all of the DD*/output_* files into a DatasetSeries object
+   # in this case it is a Time Series
    ts = yt.load("DD*/output_*")
 
+   # Define an empty storage dictionary for collecting information
+   # in parallel through processing
    storage = {}
 
+   # Use piter() to iterate over the time series, one proc per dataset
+   # and store the resulting information from each dataset in
+   # the storage dictionary
    for sto, ds in ts.piter(storage=storage):
        sphere = ds.sphere("max", (1.0, "pc"))
        sto.result = sphere.quantities.angular_momentum_vector()
        sto.result_id = str(ds)
 
+   # Print out the angular momentum vector for all of the datasets
    for L in sorted(storage.items()):
        print L
 
@@ -311,10 +360,10 @@
 processor.
 
 You can also request a fixed number of processors to calculate each
-angular momentum vector.  For example, this script will calculate each angular
-momentum vector using 4 workgroups, splitting up the pool available processors.
-Note that parallel=1 implies that the analysis will be run using 1 workgroup, 
-whereas parallel=True will run with Nprocs workgroups.
+angular momentum vector.  For example, the following script will calculate each 
+angular momentum vector using 4 workgroups, splitting up the pool of available 
+processors.  Note that parallel=1 implies that the analysis will be run using 
+1 workgroup, whereas parallel=True will run with Nprocs workgroups.
 
 .. code-block:: python
 
@@ -366,31 +415,31 @@
 two-dimensional representations of data.  All three have been parallelized in a
 chunk-based fashion.
 
- * Projections: projections are parallelized utilizing a quad-tree approach.
-   Data is loaded for each processor, typically by a process that consolidates
-   open/close/read operations, and each grid is then iterated over and cells are
-   deposited into a data structure that stores values corresponding to positions
-   in the two-dimensional plane.  This provides excellent load balancing, and in
-   serial is quite fast.  However, the operation by which quadtrees are joined
-   across processors scales poorly; while memory consumption scales well, the
-   time to completion does not.  As such, projections can often be done very
-   fast when operating only on a single processor!  The quadtree algorithm can
-   be used inline (and, indeed, it is for this reason that it is slow.)  It is
-   recommended that you attempt to project in serial before projecting in
-   parallel; even for the very largest datasets (Enzo 1024^3 root grid with 7
-   levels of refinement) in the absence of IO the quadtree algorithm takes only
-   three minutes or so on a decent processor.
+* **Projections**: projections are parallelized utilizing a quad-tree approach.
+  Data is loaded for each processor, typically by a process that consolidates
+  open/close/read operations, and each grid is then iterated over and cells are
+  deposited into a data structure that stores values corresponding to positions
+  in the two-dimensional plane.  This provides excellent load balancing, and in
+  serial is quite fast.  However, the operation by which quadtrees are joined
+  across processors scales poorly; while memory consumption scales well, the
+  time to completion does not.  As such, projections can often be done very
+  fast when operating only on a single processor!  The quadtree algorithm can
+  be used inline (and, indeed, it is for this reason that it is slow.)  It is
+  recommended that you attempt to project in serial before projecting in
+  parallel; even for the very largest datasets (Enzo 1024^3 root grid with 7
+  levels of refinement) in the absence of IO the quadtree algorithm takes only
+  three minutes or so on a decent processor.
 
- * Slices: to generate a slice, chunks that intersect a given slice are iterated
-   over and their finest-resolution cells are deposited.  The chunks are
-   decomposed via standard load balancing.  While this operation is parallel,
-   **it is almost never necessary to slice a dataset in parallel**, as all data is
-   loaded on demand anyway.  The slice operation has been parallelized so as to
-   enable slicing when running *in situ*.
+* **Slices**: to generate a slice, chunks that intersect a given slice are iterated
+  over and their finest-resolution cells are deposited.  The chunks are
+  decomposed via standard load balancing.  While this operation is parallel,
+  **it is almost never necessary to slice a dataset in parallel**, as all data is
+  loaded on demand anyway.  The slice operation has been parallelized so as to
+  enable slicing when running *in situ*.
 
- * Cutting planes: cutting planes are parallelized exactly as slices are.
-   However, in contrast to slices, because the data-selection operation can be
-   much more time consuming, cutting planes often benefit from parallelism.
+* **Cutting planes**: cutting planes are parallelized exactly as slices are.
+  However, in contrast to slices, because the data-selection operation can be
+  much more time consuming, cutting planes often benefit from parallelism.
 
 Object-Based
 ++++++++++++
@@ -437,6 +486,7 @@
 roughly 1 MB of memory per 5,000 particles, although recent work has improved
 this and the memory requirement is now smaller than this. But this is a good
 starting point for beginning to calculate the memory required for halo-finding.
+For more information, see :ref:`halo_finding`.
 
 **Volume Rendering**
 
@@ -449,81 +499,82 @@
 number of chunks.  In order to keep work distributed evenly, typically the
 number of processors should be no greater than one-eighth or one-quarter the
 number of processors that were used to produce the dataset.
+For more information, see :ref:`volume_rendering`.
 
 Additional Tips
 ---------------
 
-  * Don't be afraid to change how a parallel job is run. Change the
-    number of processors, or memory allocated, and see if things work better
-    or worse. After all, it's just a computer, it doesn't pass moral judgment!
+* Don't be afraid to change how a parallel job is run. Change the
+  number of processors, or memory allocated, and see if things work better
+  or worse. After all, it's just a computer, it doesn't pass moral judgment!
 
-  * Similarly, human time is more valuable than computer time. Try increasing
-    the number of processors, and see if the runtime drops significantly.
-    There will be a sweet spot between speed of run and the waiting time in
-    the job scheduler queue; it may be worth trying to find it.
+* Similarly, human time is more valuable than computer time. Try increasing
+  the number of processors, and see if the runtime drops significantly.
+  There will be a sweet spot between speed of run and the waiting time in
+  the job scheduler queue; it may be worth trying to find it.
 
-  * If you are using object-based parallelism but doing CPU-intensive computations
-    on each object, you may find that setting ``num_procs`` equal to the 
-    number of processors per compute node can lead to significant speedups.
-    By default, most mpi implementations will assign tasks to processors on a
-    'by-slot' basis, so this setting will tell ``yt`` to do computations on a single
-    object using only the processors on a single compute node.  A nice application
-    for this type of parallelism is calculating a list of derived quantities for 
-    a large number of simulation outputs.
+* If you are using object-based parallelism but doing CPU-intensive computations
+  on each object, you may find that setting ``num_procs`` equal to the 
+  number of processors per compute node can lead to significant speedups.
+  By default, most mpi implementations will assign tasks to processors on a
+  'by-slot' basis, so this setting will tell ``yt`` to do computations on a single
+  object using only the processors on a single compute node.  A nice application
+  for this type of parallelism is calculating a list of derived quantities for 
+  a large number of simulation outputs.
 
-  * It is impossible to tune a parallel operation without understanding what's
-    going on. Read the documentation, look at the underlying code, or talk to
-    other ``yt`` users. Get informed!
+* It is impossible to tune a parallel operation without understanding what's
+  going on. Read the documentation, look at the underlying code, or talk to
+  other ``yt`` users. Get informed!
     
-  * Sometimes it is difficult to know if a job is cpu, memory, or disk
-    intensive, especially if the parallel job utilizes several of the kinds of
-    parallelism discussed above. In this case, it may be worthwhile to put
-    some simple timers in your script (as below) around different parts.
+* Sometimes it is difficult to know if a job is cpu, memory, or disk
+  intensive, especially if the parallel job utilizes several of the kinds of
+  parallelism discussed above. In this case, it may be worthwhile to put
+  some simple timers in your script (as below) around different parts.
+  
+.. code-block:: python
     
-    .. code-block:: python
-    
-       import yt
-       import time
+   import yt
+   import time
 
-       yt.enable_parallelism()
+   yt.enable_parallelism()
 
-       ds = yt.load("DD0152")
-       t0 = time.time()
-       bigstuff, hugestuff = StuffFinder(ds)
-       BigHugeStuffParallelFunction(ds, bigstuff, hugestuff)
-       t1 = time.time()
-       for i in range(1000000):
-           tinystuff, ministuff = GetTinyMiniStuffOffDisk("in%06d.txt" % i)
-           array = TinyTeensyParallelFunction(ds, tinystuff, ministuff)
-           SaveTinyMiniStuffToDisk("out%06d.txt" % i, array)
-       t2 = time.time()
+   ds = yt.load("DD0152")
+   t0 = time.time()
+   bigstuff, hugestuff = StuffFinder(ds)
+   BigHugeStuffParallelFunction(ds, bigstuff, hugestuff)
+   t1 = time.time()
+   for i in range(1000000):
+       tinystuff, ministuff = GetTinyMiniStuffOffDisk("in%06d.txt" % i)
+       array = TinyTeensyParallelFunction(ds, tinystuff, ministuff)
+       SaveTinyMiniStuffToDisk("out%06d.txt" % i, array)
+   t2 = time.time()
+   
+   if yt.is_root():
+       print "BigStuff took %.5e sec, TinyStuff took %.5e sec" % (t1 - t0, t2 - t1)
+  
+* Remember that if the script handles disk IO explicitly, and does not use
+  a built-in ``yt`` function to write data to disk,
+  care must be taken to
+  avoid `race-conditions <http://en.wikipedia.org/wiki/Race_conditions>`_.
+  Be explicit about which MPI task writes to disk using a construction
+  something like this:
+  
+.. code-block:: python
        
-       if yt.is_root()
-           print "BigStuff took %.5e sec, TinyStuff took %.5e sec" % (t1 - t0, t2 - t1)
-  
-  * Remember that if the script handles disk IO explicitly, and does not use
-    a built-in ``yt`` function to write data to disk,
-    care must be taken to
-    avoid `race-conditions <http://en.wikipedia.org/wiki/Race_conditions>`_.
-    Be explicit about which MPI task writes to disk using a construction
-    something like this:
-    
-    .. code-block:: python
-       
-       if yt.is_root()
-           file = open("out.txt", "w")
-           file.write(stuff)
-           file.close()
+   if yt.is_root():
+       file = open("out.txt", "w")
+       file.write(stuff)
+       file.close()
 
-  * Many supercomputers allow users to ssh into the nodes that their job is
-    running on.
-    Many job schedulers send the names of the nodes that are
-    used in the notification emails, or a command like ``qstat -f NNNN``, where
-    ``NNNN`` is the job ID, will also show this information.
-    By ssh-ing into nodes, the memory usage of each task can be viewed in
-    real-time as the job runs (using ``top``, for example),
-    and can give valuable feedback about the
-    resources the task requires.
+* Many supercomputers allow users to ssh into the nodes that their job is
+  running on.
+  Many job schedulers send the names of the nodes that are
+  used in the notification emails, or a command like ``qstat -f NNNN``, where
+  ``NNNN`` is the job ID, will also show this information.
+  By ssh-ing into nodes, the memory usage of each task can be viewed in
+  real-time as the job runs (using ``top``, for example),
+  and can give valuable feedback about the
+  resources the task requires.
     
 An Advanced Worked Example
 --------------------------
@@ -532,19 +583,19 @@
 simulation.  This script was designed to analyze a set of 100 outputs on
 Gordon, running on 128 processors.  This script goes through five phases:
 
- #. Define a new derived field, which calculates the fraction of ionized
-    hydrogen as a function only of the total hydrogen density.
- #. Load a time series up, specifying ``parallel = 8``.  This means that it
-    will decompose into 8 jobs.  So if we ran on 128 processors, we would have
-    16 processors assigned to each output in the time series.
- #. Creating a big cube that will hold our results for this set of processors.
-    Note that this will be only for each output considered by this processor,
-    and this cube will not necessarily be filled in in every cell.
- #. For each output, distribute the grids to each of the sixteen processors
-    working on that output.  Each of these takes the max of the ionized
-    redshift in their zone versus the accumulation cube.
- #. Iterate over slabs and find the maximum redshift in each slab of our
-    accumulation cube.
+#. Define a new derived field, which calculates the fraction of ionized
+   hydrogen as a function only of the total hydrogen density.
+#. Load a time series up, specifying ``parallel = 8``.  This means that it
+   will decompose into 8 jobs.  So if we ran on 128 processors, we would have
+   16 processors assigned to each output in the time series.
+#. Create a big cube that will hold our results for this set of processors.
+   Note that this will be only for each output considered by this processor,
+   and this cube will not necessarily be filled in in every cell.
+#. For each output, distribute the grids to each of the sixteen processors
+   working on that output.  Each of these takes the max of the ionized
+   redshift in their zone versus the accumulation cube.
+#. Iterate over slabs and find the maximum redshift in each slab of our
+   accumulation cube.
 
 At the end, the root processor (of the global calculation) writes out an
 ionization cube that contains the redshift of first reionization for each zone
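
[Editor's note] The workgroup script referenced in the updated text above (the
``parallel = 4`` example) falls outside the hunks shown here.  As a rough sketch of
the pattern the prose describes -- not the actual script from the docs -- the pieces
visible in this diff (``yt.load`` on a glob, ``piter`` with a ``storage`` dictionary,
``yt.is_root()``) combine roughly as follows; passing ``parallel=4`` straight through
``yt.load`` (rather than to a ``DatasetSeries`` directly) is an assumption:

.. code-block:: python

    import yt
    yt.enable_parallelism()

    # parallel=4 requests 4 workgroups; whether the keyword is accepted here
    # or only by DatasetSeries directly is an assumption
    ts = yt.load("DD*/output_*", parallel=4)

    storage = {}
    for sto, ds in ts.piter(storage=storage):
        sphere = ds.sphere("max", (1.0, "pc"))
        sto.result = sphere.quantities.angular_momentum_vector()
        sto.result_id = str(ds)

    # storage is re-aggregated across processors after the loop
    if yt.is_root():
        for result_id, result in sorted(storage.items()):
            print result_id, result

As in the worked example above, the number of workgroups divides the available MPI
tasks; ``parallel = 8`` on 128 processors gives 16 tasks per output.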


https://bitbucket.org/yt_analysis/yt/commits/f01c599d02c4/
Changeset:   f01c599d02c4
Branch:      yt-3.0
User:        chummels
Date:        2014-07-30 16:29:03
Summary:     Correcting a header fix with unit conversions.
Affected #:  3 files

diff -r b896ec2eda8b2bed2a2674d3473322c469aa7777 -r f01c599d02c427a9685f5762aea09bd7faf53c02 doc/source/analyzing/analysis_modules/clump_finding.rst
--- a/doc/source/analyzing/analysis_modules/clump_finding.rst
+++ b/doc/source/analyzing/analysis_modules/clump_finding.rst
@@ -13,8 +13,8 @@
 the result of user-specified functions, such as checking for gravitational 
 boundedness.  A sample recipe can be found in :ref:`cookbook-find_clumps`.
 
-The clump finder requires a data container and a field over which the 
-contouring is to be performed.
+The clump finder requires a data object (see :ref:`data-objects`) and a field 
+over which the contouring is to be performed.
 
 .. code:: python
 

diff -r b896ec2eda8b2bed2a2674d3473322c469aa7777 -r f01c599d02c427a9685f5762aea09bd7faf53c02 doc/source/analyzing/index.rst
--- a/doc/source/analyzing/index.rst
+++ b/doc/source/analyzing/index.rst
@@ -13,5 +13,5 @@
    generating_processed_data
    time_series_analysis
    parallel_computation
+   analysis_modules/index
    external_analysis
-   analysis_modules/index

diff -r b896ec2eda8b2bed2a2674d3473322c469aa7777 -r f01c599d02c427a9685f5762aea09bd7faf53c02 doc/source/analyzing/units/fields_and_unit_conversion.rst
--- a/doc/source/analyzing/units/fields_and_unit_conversion.rst
+++ b/doc/source/analyzing/units/fields_and_unit_conversion.rst
@@ -1,7 +1,7 @@
 .. _data_selection_and_fields:
 
-Data selection and fields
-=========================
+Fields and Unit Conversion
+==========================
 
 .. notebook:: 2)_Fields_and_unit_conversion.ipynb
 


https://bitbucket.org/yt_analysis/yt/commits/2e0447cb0745/
Changeset:   2e0447cb0745
Branch:      yt-3.0
User:        chummels
Date:        2014-07-30 16:35:03
Summary:     Reworking parallel docs to address nathan's comments.
Affected #:  1 file

diff -r f01c599d02c427a9685f5762aea09bd7faf53c02 -r 2e0447cb07459383523da7206ce7b8a8366eeaed doc/source/analyzing/parallel_computation.rst
--- a/doc/source/analyzing/parallel_computation.rst
+++ b/doc/source/analyzing/parallel_computation.rst
@@ -226,14 +226,14 @@
     for dataset in dataset_series.piter():
         <process>
 
-However, because things are being done in parallel, you cannot share other
-data structures in the processing loop to store the outputs of your *<processing>*
-step.  But ``piter()`` provides functionality for this.  You may define an
-empty dictionary and include it as the keyword argument ``storage`` to 
-``piter()`` and it will be able to store processed data from within the
-``piter`` loop as the ``sto`` object for use afterwards.  After the loop is 
-finished, the dictionary is re-aggregated from all of the processors, and you 
-can use the contents:
+In order to store information from the parallel processing step in
+a data structure that exists on all of the processors operating in parallel,
+we offer the ``storage`` keyword in the ``piter`` function.
+You may define an empty dictionary and include it as the keyword argument 
+``storage`` to ``piter()``.  Then, during the processing step, you can access
+this dictionary as the ``sto`` object.  After the 
+loop is finished, the dictionary is re-aggregated from all of the processors, 
+and you can access the contents:
 
 .. code-block:: python
 


https://bitbucket.org/yt_analysis/yt/commits/5c0c25898880/
Changeset:   5c0c25898880
Branch:      yt-3.0
User:        chummels
Date:        2014-07-30 16:42:25
Summary:     Adding parallel cross-references through the sections of the docs which can use parallelism.
Affected #:  3 files

diff -r 2e0447cb07459383523da7206ce7b8a8366eeaed -r 5c0c2589888026a8061205b538462b81d5fc191f doc/source/visualizing/sketchfab.rst
--- a/doc/source/visualizing/sketchfab.rst
+++ b/doc/source/visualizing/sketchfab.rst
@@ -44,7 +44,8 @@
 :meth:`~yt.data_objects.data_containers.YTSelectionContainer3D.extract_isocontours`.  To
 calculate a flux, call
 :meth:`~yt.data_objects.data_containers.YTSelectionContainer3D.calculate_isocontour_flux`.
-both of these operations will run in parallel.
+Both of these operations will run in parallel.  For more information on enabling
+parallelism in ``yt``, see :ref:`parallel-computation`.
 
 Alternatively, you can make an object called ``YTSurfaceBase`` that makes
 this process much easier.  You can create one of these objects by specifying a

diff -r 2e0447cb07459383523da7206ce7b8a8366eeaed -r 5c0c2589888026a8061205b538462b81d5fc191f doc/source/visualizing/streamlines.rst
--- a/doc/source/visualizing/streamlines.rst
+++ b/doc/source/visualizing/streamlines.rst
@@ -111,4 +111,5 @@
 each processor has access to all of the streamlines through the use of
 a reduction operation.
 
-Parallel usage is specified using the standard ``--parallel`` flag.
+For more information on enabling parallelism in ``yt``, see 
+:ref:`parallel-computation`.

diff -r 2e0447cb07459383523da7206ce7b8a8366eeaed -r 5c0c2589888026a8061205b538462b81d5fc191f doc/source/visualizing/volume_rendering.rst
--- a/doc/source/visualizing/volume_rendering.rst
+++ b/doc/source/visualizing/volume_rendering.rst
@@ -290,6 +290,8 @@
     limiting the number of MPI tasks you can use.  This is also being addressed
     in current development by using image plane decomposition.
 
+For more information about enabling parallelism, see :ref:`parallel-computation`.
+
 OpenMP Parallelization
 ----------------------
 
@@ -329,6 +331,8 @@
     provide a good enough speedup by default that it is preferable to launching
     the MPI tasks.
 
+For more information about enabling parallelism, see :ref:`parallel-computation`.
+
 Opacity
 -------
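
[Editor's note] All three cross-references added in this changeset point back to the
same enablement pattern described in the parallel docs: call ``yt.enable_parallelism()``
at the top of the script and launch it under MPI.  A minimal, hedged illustration
using a projection (the dataset name is a placeholder, and guarding ``save()`` with
``yt.is_root()`` simply mirrors the disk-IO advice in the parallel docs):

.. code-block:: python

    # launched as an MPI job, e.g.:  mpirun -np 8 python projection_in_parallel.py
    import yt
    yt.enable_parallelism()

    ds = yt.load("RedshiftOutput0005")  # placeholder dataset name

    # the projection is distributed across the MPI tasks once parallelism
    # has been enabled; no other changes to the script are needed
    prj = yt.ProjectionPlot(ds, "x", "density")

    # write the image from the root task only, to avoid every task
    # writing the same file
    if yt.is_root():
        prj.save()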
 


https://bitbucket.org/yt_analysis/yt/commits/d332b6859d27/
Changeset:   d332b6859d27
Branch:      yt-3.0
User:        chummels
Date:        2014-07-30 17:30:17
Summary:     Adding cross-references through simple_plots cookbook recipes.
Affected #:  2 files

diff -r 5c0c2589888026a8061205b538462b81d5fc191f -r d332b6859d275b2dc93a4b8f50499b46b1553722 doc/source/cookbook/simple_plots.rst
--- a/doc/source/cookbook/simple_plots.rst
+++ b/doc/source/cookbook/simple_plots.rst
@@ -8,13 +8,8 @@
 Simple Slices
 ~~~~~~~~~~~~~
 
-This script shows the simplest way to make a slice through a dataset.
-
-Note that, by default,
-:class:`~yt.visualization.plot_window.SlicePlot` shifts the
-coordinates on the axes such that the origin is at the center of the
-slice.  To instead use the coordinates as defined in the dataset, use
-the optional argument: ``origin="native"``
+This script shows the simplest way to make a slice through a dataset.  See
+:ref:`slice-plots` for more information.
 
 .. yt_cookbook:: simple_slice.py
 
@@ -24,7 +19,8 @@
 This is the simplest way to make a projection through a dataset.  There are
 several different :ref:`projection-types`, but non-weighted line integrals
 and weighted line integrals are the two most common.  Here we create 
-density projections (non-weighted line integral):
+density projections (non-weighted line integral).  
+See :ref:`projection-plots` for more information.
 
 .. yt_cookbook:: simple_projection.py
 
@@ -32,7 +28,8 @@
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 And here we produce density-weighted temperature projections (weighted line 
-integral) for the same dataset as the non-weighted projections above:
+integral) for the same dataset as the non-weighted projections above.
+See :ref:`projection-plots` for more information.
 
 .. yt_cookbook:: simple_projection_weighted.py
 
@@ -42,6 +39,7 @@
 This demonstrates how to make a phase plot.  Phase plots can be thought of as
 two-dimensional histograms, where the value is either the weighted-average or
 the total accumulation in a cell.
+See :ref:`how-to-make-2d-profiles` for more information.
 
 .. yt_cookbook:: simple_phase.py
 
@@ -51,6 +49,7 @@
 Often, one wants to examine the distribution of one variable as a function of
 another.  This shows how to see the distribution of mass in a simulation, with
 respect to the total mass in the simulation.
+See :ref:`how-to-make-2d-profiles` for more information.
 
 .. yt_cookbook:: simple_pdf.py
 
@@ -60,6 +59,7 @@
 This is a "profile," which is a 1D histogram.  This can be thought of as either
 the total accumulation (when weight_field is set to ``None``) or the average 
 (when a weight_field is supplied.)
+See :ref:`how-to-make-1d-profiles` for more information.
 
 .. yt_cookbook:: simple_profile.py
 
@@ -67,6 +67,7 @@
 ~~~~~~~~~~~~~~~~~~~~~~
 
 This shows how to make a profile of a quantity with respect to the radius.
+See :ref:`how-to-make-1d-profiles` for more information.
 
 .. yt_cookbook:: simple_radial_profile.py
 
@@ -75,6 +76,7 @@
 
 This is a simple example of overplotting multiple 1D profiles from a number 
 of datasets to show how they evolve over time.
+See :ref:`how-to-make-1d-profiles` for more information.
 
 .. yt_cookbook:: time_series_profiles.py
 
@@ -86,6 +88,7 @@
 This shows how to plot the variance for a 1D profile.  In this example, we 
 manually create a 1D profile object, which gives us access to the variance 
 data.
+See :ref:`how-to-make-1d-profiles` for more information.
 
 .. yt_cookbook:: profile_with_variance.py
 
@@ -97,6 +100,7 @@
 :class:`~yt.visualization.plot_window.ProjectionPlot` some of the overhead of
 creating the data object can be reduced, and better performance squeezed out.
 This recipe shows how to add multiple fields to a single plot.
+See :ref:`slice-plots` and :ref:`projection-plots` for more information.
 
 .. yt_cookbook:: simple_slice_with_multiple_fields.py 
 
@@ -105,6 +109,7 @@
 
 One can create slices from any arbitrary angle, not just those aligned with
 the x,y,z axes.
+See :ref:`off-axis-slices` for more information.
 
 .. yt_cookbook:: simple_off_axis_slice.py
 
@@ -115,14 +120,18 @@
 
 Like off-axis slices, off-axis projections can be created from any arbitrary 
 viewing angle.
+See :ref:`off-axis-projections` for more information.
 
 .. yt_cookbook:: simple_off_axis_projection.py
 
+.. _cookbook-simple_volume_rendering:
+
 Simple Volume Rendering
 ~~~~~~~~~~~~~~~~~~~~~~~
 
 Volume renderings are 3D projections rendering isocontours in any arbitrary
 field (e.g. density, temperature, pressure, etc.)
+See :ref:`volume_rendering` for more information.
 
 .. yt_cookbook:: simple_volume_rendering.py
 
@@ -144,11 +153,10 @@
 cover normal use cases, sometimes more direct access to the underlying
 Matplotlib engine is necessary.  This recipe shows how to modify the plot
 window :class:`matplotlib.axes.Axes` object directly.
+See :ref:`matplotlib-customization` for more information.
 
 .. yt_cookbook:: simple_slice_matplotlib_example.py 
 
-.. _cookbook-simple_volume_rendering:
-
 Image Background Colors
 ~~~~~~~~~~~~~~~~~~~~~~~
 

diff -r 5c0c2589888026a8061205b538462b81d5fc191f -r d332b6859d275b2dc93a4b8f50499b46b1553722 doc/source/visualizing/plots.rst
--- a/doc/source/visualizing/plots.rst
+++ b/doc/source/visualizing/plots.rst
@@ -118,7 +118,13 @@
     yt.SlicePlot(ds, 'z', 'density', center=[0.2, 0.3, 0.8],
                  width = (10,'kpc')).save()
 
-The plot center is relative to the simulation coordinate system.  If supplied
+Note that, by default,
+:class:`~yt.visualization.plot_window.SlicePlot` shifts the
+coordinates on the axes such that the origin is at the center of the
+slice.  To instead use the coordinates as defined in the dataset, use
+the optional argument: ``origin="native"``
+
+If supplied
 without units, the center is assumed to be in code units.  Optionally, you can
 supply 'c' or 'm' for the center.  These two choices will center the plot on the
 center of the simulation box and the coordinate of the maximum density cell,
@@ -521,6 +527,8 @@
    slc.set_buff_size(1600)
    slc.save()
 
+.. _matplotlib-customization:
+
 Further customization via matplotlib
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
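
[Editor's note] The ``origin="native"`` note moved into ``plots.rst`` above can be
exercised with the same ``SlicePlot`` call shown in that file.  A small sketch; the
dataset path is a placeholder:

.. code-block:: python

    import yt

    ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")  # placeholder dataset

    # center and width follow the example in the updated plots.rst text
    slc = yt.SlicePlot(ds, 'z', 'density', center=[0.2, 0.3, 0.8], width=(10, 'kpc'))
    slc.save()

    # keep the dataset's own coordinates on the axes instead of recentering
    # on the slice
    slc_native = yt.SlicePlot(ds, 'z', 'density', origin="native")
    slc_native.save("native_origin")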
 


https://bitbucket.org/yt_analysis/yt/commits/7a4a92f88879/
Changeset:   7a4a92f88879
Branch:      yt-3.0
User:        chummels
Date:        2014-07-30 17:41:17
Summary:     Merged in chummels/yt/yt-3.0 (pull request #1103)

Updating parallel docs.
Affected #:  9 files

diff -r c259c6a5f8ac42834e77c4056bdb635de93620c8 -r 7a4a92f8887989509680ce553cb982ea205479de doc/source/analyzing/analysis_modules/clump_finding.rst
--- a/doc/source/analyzing/analysis_modules/clump_finding.rst
+++ b/doc/source/analyzing/analysis_modules/clump_finding.rst
@@ -13,8 +13,8 @@
 the result of user-specified functions, such as checking for gravitational 
 boundedness.  A sample recipe can be found in :ref:`cookbook-find_clumps`.
 
-The clump finder requires a data container and a field over which the 
-contouring is to be performed.
+The clump finder requires a data object (see :ref:`data-objects`) and a field 
+over which the contouring is to be performed.
 
 .. code:: python
 

diff -r c259c6a5f8ac42834e77c4056bdb635de93620c8 -r 7a4a92f8887989509680ce553cb982ea205479de doc/source/analyzing/index.rst
--- a/doc/source/analyzing/index.rst
+++ b/doc/source/analyzing/index.rst
@@ -13,5 +13,5 @@
    generating_processed_data
    time_series_analysis
    parallel_computation
+   analysis_modules/index
    external_analysis
-   analysis_modules/index

diff -r c259c6a5f8ac42834e77c4056bdb635de93620c8 -r 7a4a92f8887989509680ce553cb982ea205479de doc/source/analyzing/parallel_computation.rst
--- a/doc/source/analyzing/parallel_computation.rst
+++ b/doc/source/analyzing/parallel_computation.rst
@@ -1,7 +1,7 @@
 .. _parallel-computation:
 
-Parallel Computation With YT
-============================
+Parallel Computation With ``yt``
+================================
 
 ``yt`` has been instrumented with the ability to compute many -- most, even --
 quantities in parallel.  This utilizes the package 
@@ -15,23 +15,23 @@
 
 Currently, ``yt`` is able to perform the following actions in parallel:
 
- * Projections (:ref:`projection-plots`)
- * Slices (:ref:`slice-plots`)
- * Cutting planes (oblique slices) (:ref:`off-axis-slices`)
- * Derived Quantities (total mass, angular momentum, etc) (:ref:`creating_derived_quantities`,
-   :ref:`derived-quantities`)
- * 1-, 2-, and 3-D profiles (:ref:`generating-profiles-and-histograms`)
- * Halo finding (:ref:`halo_finding`)
- * Volume rendering (:ref:`volume_rendering`)
- * Isocontours & flux calculations (:ref:`extracting-isocontour-information`)
+* Projections (:ref:`projection-plots`)
+* Slices (:ref:`slice-plots`)
+* Cutting planes (oblique slices) (:ref:`off-axis-slices`)
+* Derived Quantities (total mass, angular momentum, etc) (:ref:`creating_derived_quantities`,
+  :ref:`derived-quantities`)
+* 1-, 2-, and 3-D profiles (:ref:`generating-profiles-and-histograms`)
+* Halo finding (:ref:`halo_finding`)
+* Volume rendering (:ref:`volume_rendering`)
+* Isocontours & flux calculations (:ref:`extracting-isocontour-information`)
 
 This list covers just about every action ``yt`` can take!  Additionally, almost all
 scripts will benefit from parallelization with minimal modification.  The goal
 of Parallel-``yt`` has been to retain API compatibility and abstract all
 parallelism.
 
-Setting Up Parallel YT
-----------------------
+Setting Up Parallel ``yt``
+--------------------------
 
 To run scripts in parallel, you must first install `mpi4py
 <http://code.google.com/p/mpi4py>`_ as well as an MPI library, if one is not
@@ -73,10 +73,10 @@
 work in parallel -- and no additional work is necessary to parallelize those
 processes.
 
-Running a ``yt`` script in parallel
+Running a ``yt`` Script in Parallel
 -----------------------------------
 
-Many basic ``yt`` operations will run in parallel if yt's parallelism is enabled at
+Many basic ``yt`` operations will run in parallel if ``yt``'s parallelism is enabled at
 startup.  For example, the following script finds the maximum density location
 in the simulation and then makes a plot of the projected density:
 
@@ -105,7 +105,7 @@
    If you run into problems, then you can use :ref:`remote-debugging` to examine
    what went wrong.
 
-Creating Parallel and Serial Sections in a script
+Creating Parallel and Serial Sections in a Script
 +++++++++++++++++++++++++++++++++++++++++++++++++
 
 Many ``yt`` operations will automatically run in parallel (see the next section for
@@ -162,7 +162,7 @@
 --------------------
 
 In order to divide up the work, ``yt`` will attempt to send different tasks to
-different processors.  However, to minimize inter-process communication, YT
+different processors.  However, to minimize inter-process communication, ``yt``
 will decompose the information in different ways based on the task.
 
 Spatial Decomposition
@@ -175,8 +175,8 @@
 
 The following operations use spatial decomposition:
 
-  * Halo finding
-  * Volume rendering
+* :ref:`halo_finding`
+* :ref:`volume_rendering`
 
 Grid Decomposition
 ++++++++++++++++++
@@ -188,27 +188,68 @@
 
 The following operations use chunk decomposition:
 
-  * Projections
-  * Slices
-  * Cutting planes
-  * Derived Quantities
-  * 1-, 2-, and 3-D profiles
-  * Isocontours & flux calculations
+* Projections (see :ref:`available-objects`)
+* Slices (see :ref:`available-objects`)
+* Cutting planes (see :ref:`available-objects`)
+* Derived Quantities (see :ref:`derived-quantities`)
+* 1-, 2-, and 3-D profiles (see :ref:`generating-profiles-and-histograms`)
+* Isocontours & flux calculations (see :ref:`surfaces`)
 
-Object-Based
-++++++++++++
+Parallelization over Multiple Objects and Datasets
+++++++++++++++++++++++++++++++++++++++++++++++++++
 
-In a fashion similar to grid decomposition, computation can be parallelized
-over objects. This is especially useful for
+If you have a set of computational steps that need to apply identically and 
+independently to several different objects or datasets, a so-called 
 `embarrassingly parallel <http://en.wikipedia.org/wiki/Embarrassingly_parallel>`_
-tasks where the items to be worked on can be split into separate chunks and
-saved to a list. The list is then split up and each MPI task performs parts of
-it independently.
+task, ``yt`` can do that easily.  See the sections below on 
+:ref:`parallelizing-your-analysis` and :ref:`parallel-time-series-analysis`.
+
+Use of ``piter()``
+^^^^^^^^^^^^^^^^^^
+
+If you use parallelism over objects or datasets, you will encounter
+the ``piter()`` function.  ``piter`` is a parallel iterator, which effectively
+doles out each item of a DatasetSeries object to a different processor.  In
+serial processing, you might iterate over a DatasetSeries by:
+
+.. code-block:: python
+
+    for dataset in dataset_series:
+        <process>
+
+But in parallel, you can use ``piter()`` to force each dataset to go to
+a different processor:
+
+.. code-block:: python
+
+    yt.enable_parallelism()
+    for dataset in dataset_series.piter():
+        <process>
+
+In order to store information from the parallel processing step in
+a data structure that exists on all of the processors operating in parallel,
+we offer the ``storage`` keyword in the ``piter`` function.
+You may define an empty dictionary and include it as the keyword argument 
+``storage`` to ``piter()``.  Then, during the processing step, you can access
+this dictionary as the ``sto`` object.  After the 
+loop is finished, the dictionary is re-aggregated from all of the processors, 
+and you can access the contents:
+
+.. code-block:: python
+
+    yt.enable_parallelism()
+    my_dictionary = {}
+    for sto, dataset in dataset_series.piter(storage=my_dictionary):
+        <process>
+        sto.result = <some information processed for this dataset>
+        sto.result_id = <some identifier for this dataset>
+
+    print my_dictionary
 
 .. _parallelizing-your-analysis:
 
-Parallelizing Your Analysis
----------------------------
+Parallelizing over Multiple Objects
+-----------------------------------
 
 It is easy within ``yt`` to parallelize a list of tasks, as long as those tasks
 are independent of one another. Using object-based parallelism, the function
@@ -279,8 +320,8 @@
 
 .. _parallel-time-series-analysis:
 
-Parallel Time Series Analysis
------------------------------
+Parallelization over Multiple Datasets (including Time Series)
+--------------------------------------------------------------
 
 The same ``parallel_objects`` machinery discussed above is turned on by
 default when using a :class:`~yt.data_objects.time_series.DatasetSeries` object
@@ -294,15 +335,23 @@
    import yt
    yt.enable_parallelism()
 
+   # Load all of the DD*/output_* files into a DatasetSeries object
+   # in this case it is a Time Series
    ts = yt.load("DD*/output_*")
 
+   # Define an empty storage dictionary for collecting information
+   # in parallel through processing
    storage = {}
 
+   # Use piter() to iterate over the time series, one proc per dataset
+   # and store the resulting information from each dataset in
+   # the storage dictionary
    for sto, ds in ts.piter(storage=storage):
        sphere = ds.sphere("max", (1.0, "pc"))
        sto.result = sphere.quantities.angular_momentum_vector()
        sto.result_id = str(ds)
 
+   # Print out the angular momentum vector for all of the datasets
    for L in sorted(storage.items()):
        print L
 
@@ -311,10 +360,10 @@
 processor.
 
 You can also request a fixed number of processors to calculate each
-angular momentum vector.  For example, this script will calculate each angular
-momentum vector using 4 workgroups, splitting up the pool available processors.
-Note that parallel=1 implies that the analysis will be run using 1 workgroup, 
-whereas parallel=True will run with Nprocs workgroups.
+angular momentum vector.  For example, the following script will calculate each 
+angular momentum vector using 4 workgroups, splitting up the pool of available 
+processors.  Note that parallel=1 implies that the analysis will be run using 
+1 workgroup, whereas parallel=True will run with Nprocs workgroups.
 
 .. code-block:: python
 
@@ -366,31 +415,31 @@
 two-dimensional representations of data.  All three have been parallelized in a
 chunk-based fashion.
 
- * Projections: projections are parallelized utilizing a quad-tree approach.
-   Data is loaded for each processor, typically by a process that consolidates
-   open/close/read operations, and each grid is then iterated over and cells are
-   deposited into a data structure that stores values corresponding to positions
-   in the two-dimensional plane.  This provides excellent load balancing, and in
-   serial is quite fast.  However, the operation by which quadtrees are joined
-   across processors scales poorly; while memory consumption scales well, the
-   time to completion does not.  As such, projections can often be done very
-   fast when operating only on a single processor!  The quadtree algorithm can
-   be used inline (and, indeed, it is for this reason that it is slow.)  It is
-   recommended that you attempt to project in serial before projecting in
-   parallel; even for the very largest datasets (Enzo 1024^3 root grid with 7
-   levels of refinement) in the absence of IO the quadtree algorithm takes only
-   three minutes or so on a decent processor.
+* **Projections**: projections are parallelized utilizing a quad-tree approach.
+  Data is loaded for each processor, typically by a process that consolidates
+  open/close/read operations, and each grid is then iterated over and cells are
+  deposited into a data structure that stores values corresponding to positions
+  in the two-dimensional plane.  This provides excellent load balancing, and in
+  serial is quite fast.  However, the operation by which quadtrees are joined
+  across processors scales poorly; while memory consumption scales well, the
+  time to completion does not.  As such, projections can often be done very
+  fast when operating only on a single processor!  The quadtree algorithm can
+  be used inline (and, indeed, it is for this reason that it is slow.)  It is
+  recommended that you attempt to project in serial before projecting in
+  parallel; even for the very largest datasets (Enzo 1024^3 root grid with 7
+  levels of refinement) in the absence of IO the quadtree algorithm takes only
+  three minutes or so on a decent processor.
 
- * Slices: to generate a slice, chunks that intersect a given slice are iterated
-   over and their finest-resolution cells are deposited.  The chunks are
-   decomposed via standard load balancing.  While this operation is parallel,
-   **it is almost never necessary to slice a dataset in parallel**, as all data is
-   loaded on demand anyway.  The slice operation has been parallelized so as to
-   enable slicing when running *in situ*.
+* **Slices**: to generate a slice, chunks that intersect a given slice are iterated
+  over and their finest-resolution cells are deposited.  The chunks are
+  decomposed via standard load balancing.  While this operation is parallel,
+  **it is almost never necessary to slice a dataset in parallel**, as all data is
+  loaded on demand anyway.  The slice operation has been parallelized so as to
+  enable slicing when running *in situ*.
 
- * Cutting planes: cutting planes are parallelized exactly as slices are.
-   However, in contrast to slices, because the data-selection operation can be
-   much more time consuming, cutting planes often benefit from parallelism.
+* **Cutting planes**: cutting planes are parallelized exactly as slices are.
+  However, in contrast to slices, because the data-selection operation can be
+  much more time consuming, cutting planes often benefit from parallelism.
 
 Object-Based
 ++++++++++++
@@ -437,6 +486,7 @@
 roughly 1 MB of memory per 5,000 particles, although recent work has improved
 this and the memory requirement is now smaller than this. But this is a good
 starting point for beginning to calculate the memory required for halo-finding.
+For more information, see :ref:`halo_finding`.
 
 **Volume Rendering**
 
@@ -449,81 +499,82 @@
 number of chunks.  In order to keep work distributed evenly, typically the
 number of processors should be no greater than one-eighth or one-quarter the
 number of processors that were used to produce the dataset.
+For more information, see :ref:`volume_rendering`.
 
 Additional Tips
 ---------------
 
-  * Don't be afraid to change how a parallel job is run. Change the
-    number of processors, or memory allocated, and see if things work better
-    or worse. After all, it's just a computer, it doesn't pass moral judgment!
+* Don't be afraid to change how a parallel job is run. Change the
+  number of processors, or memory allocated, and see if things work better
+  or worse. After all, it's just a computer, it doesn't pass moral judgment!
 
-  * Similarly, human time is more valuable than computer time. Try increasing
-    the number of processors, and see if the runtime drops significantly.
-    There will be a sweet spot between speed of run and the waiting time in
-    the job scheduler queue; it may be worth trying to find it.
+* Similarly, human time is more valuable than computer time. Try increasing
+  the number of processors, and see if the runtime drops significantly.
+  There will be a sweet spot between speed of run and the waiting time in
+  the job scheduler queue; it may be worth trying to find it.
 
-  * If you are using object-based parallelism but doing CPU-intensive computations
-    on each object, you may find that setting ``num_procs`` equal to the 
-    number of processors per compute node can lead to significant speedups.
-    By default, most mpi implementations will assign tasks to processors on a
-    'by-slot' basis, so this setting will tell ``yt`` to do computations on a single
-    object using only the processors on a single compute node.  A nice application
-    for this type of parallelism is calculating a list of derived quantities for 
-    a large number of simulation outputs.
+* If you are using object-based parallelism but doing CPU-intensive computations
+  on each object, you may find that setting ``num_procs`` equal to the 
+  number of processors per compute node can lead to significant speedups.
+  By default, most mpi implementations will assign tasks to processors on a
+  'by-slot' basis, so this setting will tell ``yt`` to do computations on a single
+  object using only the processors on a single compute node.  A nice application
+  for this type of parallelism is calculating a list of derived quantities for 
+  a large number of simulation outputs.
 
-  * It is impossible to tune a parallel operation without understanding what's
-    going on. Read the documentation, look at the underlying code, or talk to
-    other ``yt`` users. Get informed!
+* It is impossible to tune a parallel operation without understanding what's
+  going on. Read the documentation, look at the underlying code, or talk to
+  other ``yt`` users. Get informed!
     
-  * Sometimes it is difficult to know if a job is cpu, memory, or disk
-    intensive, especially if the parallel job utilizes several of the kinds of
-    parallelism discussed above. In this case, it may be worthwhile to put
-    some simple timers in your script (as below) around different parts.
+* Sometimes it is difficult to know if a job is cpu, memory, or disk
+  intensive, especially if the parallel job utilizes several of the kinds of
+  parallelism discussed above. In this case, it may be worthwhile to put
+  some simple timers in your script (as below) around different parts.
+  
+.. code-block:: python
     
-    .. code-block:: python
-    
-       import yt
-       import time
+   import yt
+   import time
 
-       yt.enable_parallelism()
+   yt.enable_parallelism()
 
-       ds = yt.load("DD0152")
-       t0 = time.time()
-       bigstuff, hugestuff = StuffFinder(ds)
-       BigHugeStuffParallelFunction(ds, bigstuff, hugestuff)
-       t1 = time.time()
-       for i in range(1000000):
-           tinystuff, ministuff = GetTinyMiniStuffOffDisk("in%06d.txt" % i)
-           array = TinyTeensyParallelFunction(ds, tinystuff, ministuff)
-           SaveTinyMiniStuffToDisk("out%06d.txt" % i, array)
-       t2 = time.time()
+   ds = yt.load("DD0152")
+   t0 = time.time()
+   bigstuff, hugestuff = StuffFinder(ds)
+   BigHugeStuffParallelFunction(ds, bigstuff, hugestuff)
+   t1 = time.time()
+   for i in range(1000000):
+       tinystuff, ministuff = GetTinyMiniStuffOffDisk("in%06d.txt" % i)
+       array = TinyTeensyParallelFunction(ds, tinystuff, ministuff)
+       SaveTinyMiniStuffToDisk("out%06d.txt" % i, array)
+   t2 = time.time()
+   
+   if yt.is_root():
+       print "BigStuff took %.5e sec, TinyStuff took %.5e sec" % (t1 - t0, t2 - t1)
+  
+* Remember that if the script handles disk IO explicitly, and does not use
+  a built-in ``yt`` function to write data to disk,
+  care must be taken to
+  avoid `race-conditions <http://en.wikipedia.org/wiki/Race_conditions>`_.
+  Be explicit about which MPI task writes to disk using a construction
+  something like this:
+  
+.. code-block:: python
        
-       if yt.is_root()
-           print "BigStuff took %.5e sec, TinyStuff took %.5e sec" % (t1 - t0, t2 - t1)
-  
-  * Remember that if the script handles disk IO explicitly, and does not use
-    a built-in ``yt`` function to write data to disk,
-    care must be taken to
-    avoid `race-conditions <http://en.wikipedia.org/wiki/Race_conditions>`_.
-    Be explicit about which MPI task writes to disk using a construction
-    something like this:
-    
-    .. code-block:: python
-       
-       if yt.is_root()
-           file = open("out.txt", "w")
-           file.write(stuff)
-           file.close()
+   if yt.is_root():
+       file = open("out.txt", "w")
+       file.write(stuff)
+       file.close()
 
-  * Many supercomputers allow users to ssh into the nodes that their job is
-    running on.
-    Many job schedulers send the names of the nodes that are
-    used in the notification emails, or a command like ``qstat -f NNNN``, where
-    ``NNNN`` is the job ID, will also show this information.
-    By ssh-ing into nodes, the memory usage of each task can be viewed in
-    real-time as the job runs (using ``top``, for example),
-    and can give valuable feedback about the
-    resources the task requires.
+* Many supercomputers allow users to ssh into the nodes that their job is
+  running on.
+  Many job schedulers send the names of the nodes that are
+  used in the notification emails, or a command like ``qstat -f NNNN``, where
+  ``NNNN`` is the job ID, will also show this information.
+  By ssh-ing into nodes, the memory usage of each task can be viewed in
+  real-time as the job runs (using ``top``, for example),
+  and can give valuable feedback about the
+  resources the task requires.
     
 An Advanced Worked Example
 --------------------------
@@ -532,19 +583,19 @@
 simulation.  This script was designed to analyze a set of 100 outputs on
 Gordon, running on 128 processors.  This script goes through five phases:
 
- #. Define a new derived field, which calculates the fraction of ionized
-    hydrogen as a function only of the total hydrogen density.
- #. Load a time series up, specifying ``parallel = 8``.  This means that it
-    will decompose into 8 jobs.  So if we ran on 128 processors, we would have
-    16 processors assigned to each output in the time series.
- #. Creating a big cube that will hold our results for this set of processors.
-    Note that this will be only for each output considered by this processor,
-    and this cube will not necessarily be filled in in every cell.
- #. For each output, distribute the grids to each of the sixteen processors
-    working on that output.  Each of these takes the max of the ionized
-    redshift in their zone versus the accumulation cube.
- #. Iterate over slabs and find the maximum redshift in each slab of our
-    accumulation cube.
+#. Define a new derived field, which calculates the fraction of ionized
+   hydrogen as a function only of the total hydrogen density.
+#. Load a time series up, specifying ``parallel = 8``.  This means that it
+   will decompose into 8 jobs.  So if we ran on 128 processors, we would have
+   16 processors assigned to each output in the time series.
+#. Create a big cube that will hold our results for this set of processors.
+   Note that this will be only for each output considered by this processor,
+   and this cube will not necessarily be filled in in every cell.
+#. For each output, distribute the grids to each of the sixteen processors
+   working on that output.  Each of these takes the max of the ionized
+   redshift in their zone versus the accumulation cube.
+#. Iterate over slabs and find the maximum redshift in each slab of our
+   accumulation cube.
 
 At the end, the root processor (of the global calculation) writes out an
 ionization cube that contains the redshift of first reionization for each zone
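The script itself is not reproduced in this hunk.  As a rough, non-authoritative
sketch of the parallel time-series pattern it is built on (the output names,
cube resolution, and loop body below are placeholders, not the real analysis):

.. code-block:: python

   import numpy as np
   import yt
   yt.enable_parallelism()

   # Placeholder output names; the real script analyzed 100 outputs on Gordon.
   outputs = ["DD%04i/DD%04i" % (i, i) for i in range(100)]

   # parallel=8 decomposes the series into 8 jobs, so on 128 processors each
   # output is handled by a group of 16 processors.
   ts = yt.DatasetSeries(outputs, parallel=8)

   # Accumulation cube for the outputs handled by this group of processors;
   # the resolution here is arbitrary.
   cube = np.zeros((64, 64, 64))

   for ds in ts.piter():
       # ... accumulate, e.g., the maximum ionization redshift per zone ...
       pass

   # The real script reduces the per-group cubes across MPI tasks before
   # writing; that step is omitted here.  Only the global root writes to disk.
   if yt.is_root():
       np.save("ionization_cube.npy", cube)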

diff -r c259c6a5f8ac42834e77c4056bdb635de93620c8 -r 7a4a92f8887989509680ce553cb982ea205479de doc/source/analyzing/units/fields_and_unit_conversion.rst
--- a/doc/source/analyzing/units/fields_and_unit_conversion.rst
+++ b/doc/source/analyzing/units/fields_and_unit_conversion.rst
@@ -1,7 +1,7 @@
 .. _data_selection_and_fields:
 
-Data selection and fields
-=========================
+Fields and Unit Conversion
+==========================
 
 .. notebook:: 2)_Fields_and_unit_conversion.ipynb
 

diff -r c259c6a5f8ac42834e77c4056bdb635de93620c8 -r 7a4a92f8887989509680ce553cb982ea205479de doc/source/cookbook/simple_plots.rst
--- a/doc/source/cookbook/simple_plots.rst
+++ b/doc/source/cookbook/simple_plots.rst
@@ -8,13 +8,8 @@
 Simple Slices
 ~~~~~~~~~~~~~
 
-This script shows the simplest way to make a slice through a dataset.
-
-Note that, by default,
-:class:`~yt.visualization.plot_window.SlicePlot` shifts the
-coordinates on the axes such that the origin is at the center of the
-slice.  To instead use the coordinates as defined in the dataset, use
-the optional argument: ``origin="native"``
+This script shows the simplest way to make a slice through a dataset.  See
+:ref:`slice-plots` for more information.
 
 .. yt_cookbook:: simple_slice.py
 
@@ -24,7 +19,8 @@
 This is the simplest way to make a projection through a dataset.  There are
 several different :ref:`projection-types`, but non-weighted line integrals
 and weighted line integrals are the two most common.  Here we create 
-density projections (non-weighted line integral):
+density projections (non-weighted line integral).  
+See :ref:`projection-plots` for more information.
 
 .. yt_cookbook:: simple_projection.py
 
@@ -32,7 +28,8 @@
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 And here we produce density-weighted temperature projections (weighted line 
-integral) for the same dataset as the non-weighted projections above:
+integral) for the same dataset as the non-weighted projections above.
+See :ref:`projection-plots` for more information.
 
 .. yt_cookbook:: simple_projection_weighted.py
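A minimal sketch of a weighted projection (the dataset path below is a
placeholder sample dataset, not necessarily the one used in the recipe):

.. code-block:: python

   import yt
   ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")  # placeholder path
   # Weight the line integral of temperature by density along the z axis.
   yt.ProjectionPlot(ds, "z", "temperature", weight_field="density").save()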
 
@@ -42,6 +39,7 @@
 This demonstrates how to make a phase plot.  Phase plots can be thought of as
 two-dimensional histograms, where the value is either the weighted-average or
 the total accumulation in a cell.
+See :ref:`how-to-make-2d-profiles` for more information.
 
 .. yt_cookbook:: simple_phase.py
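A minimal sketch of the idea, assuming a placeholder dataset path:

.. code-block:: python

   import yt
   ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")  # placeholder path
   ad = ds.all_data()
   # 2D histogram of temperature vs. density; with weight_field=None each
   # bin holds the total cell mass rather than a weighted average.
   yt.PhasePlot(ad, "density", "temperature", "cell_mass",
                weight_field=None).save()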
 
@@ -51,6 +49,7 @@
 Often, one wants to examine the distribution of one variable as a function of
 another.  This shows how to see the distribution of mass in a simulation, with
 respect to the total mass in the simulation.
+See :ref:`how-to-make-2d-profiles` for more information.
 
 .. yt_cookbook:: simple_pdf.py
 
@@ -60,6 +59,7 @@
 This is a "profile," which is a 1D histogram.  This can be thought of as either
 the total accumulation (when weight_field is set to ``None``) or the average 
 (when a weight_field is supplied.)
+See :ref:`how-to-make-1d-profiles` for more information.
 
 .. yt_cookbook:: simple_profile.py
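A minimal sketch of a 1D profile, assuming a placeholder dataset path:

.. code-block:: python

   import yt
   ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")  # placeholder path
   ad = ds.all_data()
   # weight_field=None makes this a total accumulation (mass per density bin);
   # supplying a weight field would give a weighted average instead.
   yt.ProfilePlot(ad, "density", "cell_mass", weight_field=None).save()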
 
@@ -67,6 +67,7 @@
 ~~~~~~~~~~~~~~~~~~~~~~
 
 This shows how to make a profile of a quantity with respect to the radius.
+See :ref:`how-to-make-1d-profiles` for more information.
 
 .. yt_cookbook:: simple_radial_profile.py
 
@@ -75,6 +76,7 @@
 
 This is a simple example of overplotting multiple 1D profiles from a number 
 of datasets to show how they evolve over time.
+See :ref:`how-to-make-1d-profiles` for more information.
 
 .. yt_cookbook:: time_series_profiles.py
 
@@ -86,6 +88,7 @@
 This shows how to plot the variance for a 1D profile.  In this example, we 
 manually create a 1D profile object, which gives us access to the variance 
 data.
+See :ref:`how-to-make-1d-profiles` for more information.
 
 .. yt_cookbook:: profile_with_variance.py
 
@@ -97,6 +100,7 @@
 :class:`~yt.visualization.plot_window.ProjectionPlot` some of the overhead of
 creating the data object can be reduced, and better performance squeezed out.
 This recipe shows how to add multiple fields to a single plot.
+See :ref:`slice-plots` and :ref:`projection-plots` for more information.
 
 .. yt_cookbook:: simple_slice_with_multiple_fields.py 
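For instance, passing a list of fields reuses a single underlying data object
(the dataset path is a placeholder):

.. code-block:: python

   import yt
   ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")  # placeholder path
   # One slice data object backs both field images.
   yt.SlicePlot(ds, "z", ["density", "temperature"]).save()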
 
@@ -105,6 +109,7 @@
 
 One can create slices from any arbitrary angle, not just those aligned with
 the x,y,z axes.
+See :ref:`off-axis-slices` for more information.
 
 .. yt_cookbook:: simple_off_axis_slice.py
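A small sketch of an off-axis slice (the normal vector and dataset path are
arbitrary assumptions):

.. code-block:: python

   import yt
   ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")  # placeholder path
   normal = [1.0, 1.0, 0.0]  # an arbitrary cutting-plane normal
   yt.OffAxisSlicePlot(ds, normal, "density").save()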
 
@@ -115,14 +120,18 @@
 
 Like off-axis slices, off-axis projections can be created from any arbitrary 
 viewing angle.
+See :ref:`off-axis-projections` for more information.
 
 .. yt_cookbook:: simple_off_axis_projection.py
 
+.. _cookbook-simple_volume_rendering:
+
 Simple Volume Rendering
 ~~~~~~~~~~~~~~~~~~~~~~~
 
 Volume renderings are 3D projections rendering isocontours in any arbitrary
 field (e.g. density, temperature, pressure, etc.)
+See :ref:`volume_rendering` for more information.
 
 .. yt_cookbook:: simple_volume_rendering.py
 
@@ -144,11 +153,10 @@
 cover normal use cases, sometimes more direct access to the underlying
 Matplotlib engine is necessary.  This recipe shows how to modify the plot
 window :class:`matplotlib.axes.Axes` object directly.
+See :ref:`matplotlib-customization` for more information.
 
 .. yt_cookbook:: simple_slice_matplotlib_example.py 
 
-.. _cookbook-simple_volume_rendering:
-
 Image Background Colors
 ~~~~~~~~~~~~~~~~~~~~~~~
 

diff -r c259c6a5f8ac42834e77c4056bdb635de93620c8 -r 7a4a92f8887989509680ce553cb982ea205479de doc/source/visualizing/plots.rst
--- a/doc/source/visualizing/plots.rst
+++ b/doc/source/visualizing/plots.rst
@@ -118,7 +118,13 @@
     yt.SlicePlot(ds, 'z', 'density', center=[0.2, 0.3, 0.8],
                  width = (10,'kpc')).save()
 
-The plot center is relative to the simulation coordinate system.  If supplied
+Note that, by default,
+:class:`~yt.visualization.plot_window.SlicePlot` shifts the
+coordinates on the axes such that the origin is at the center of the
+slice.  To instead use the coordinates as defined in the dataset, use
+the optional argument: ``origin="native"``
+
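For instance (the dataset path is a placeholder), to keep the dataset's own
coordinate values on the axes:

.. code-block:: python

   import yt
   ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")  # placeholder path
   yt.SlicePlot(ds, "z", "density", origin="native").save()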
+If supplied
 without units, the center is assumed to be in code units.  Optionally, you can
 supply 'c' or 'm' for the center.  These two choices will center the plot on the
 center of the simulation box and the coordinate of the maximum density cell,
@@ -521,6 +527,8 @@
    slc.set_buff_size(1600)
    slc.save()
 
+.. _matplotlib-customization:
+
 Further customization via matplotlib
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 

diff -r c259c6a5f8ac42834e77c4056bdb635de93620c8 -r 7a4a92f8887989509680ce553cb982ea205479de doc/source/visualizing/sketchfab.rst
--- a/doc/source/visualizing/sketchfab.rst
+++ b/doc/source/visualizing/sketchfab.rst
@@ -44,7 +44,8 @@
 :meth:`~yt.data_objects.data_containers.YTSelectionContainer3D.extract_isocontours`.  To
 calculate a flux, call
 :meth:`~yt.data_objects.data_containers.YTSelectionContainer3D.calculate_isocontour_flux`.
-both of these operations will run in parallel.
+Both of these operations will run in parallel.  For more information on enabling
+parallelism in ``yt``, see :ref:`parallel-computation`.
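As a rough sketch of these two calls (the data object, density threshold, and
velocity field names below are illustrative assumptions, not the documented
example):

.. code-block:: python

   import yt
   ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")  # placeholder path
   sp = ds.sphere("max", (10.0, "kpc"))

   # Extract the rho = 1e-27 g/cm**3 isosurface as triangles in an OBJ file.
   sp.extract_isocontours("density", 1e-27, "triangles.obj")

   # Integrate the velocity field through that same isosurface.
   flux = sp.calculate_isocontour_flux("density", 1e-27,
                                       "velocity_x", "velocity_y", "velocity_z")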
 
 Alternatively, you can make an object called ``YTSurfaceBase`` that makes
 this process much easier.  You can create one of these objects by specifying a

diff -r c259c6a5f8ac42834e77c4056bdb635de93620c8 -r 7a4a92f8887989509680ce553cb982ea205479de doc/source/visualizing/streamlines.rst
--- a/doc/source/visualizing/streamlines.rst
+++ b/doc/source/visualizing/streamlines.rst
@@ -111,4 +111,5 @@
 each processor has access to all of the streamlines through the use of
 a reduction operation.
 
-Parallel usage is specified using the standard ``--parallel`` flag.
+For more information on enabling parallelism in ``yt``, see 
+:ref:`parallel-computation`.

diff -r c259c6a5f8ac42834e77c4056bdb635de93620c8 -r 7a4a92f8887989509680ce553cb982ea205479de doc/source/visualizing/volume_rendering.rst
--- a/doc/source/visualizing/volume_rendering.rst
+++ b/doc/source/visualizing/volume_rendering.rst
@@ -290,6 +290,8 @@
     limiting the number of MPI tasks you can use.  This is also being addressed
     in current development by using image plane decomposition.
 
+For more information about enabling parallelism, see :ref:`parallel-computation`.
+
 OpenMP Parallelization
 ----------------------
 
@@ -329,6 +331,8 @@
     provide a good enough speedup by default that it is preferable to launching
     the MPI tasks.
 
+For more information about enabling parallelism, see :ref:`parallel-computation`.
+
 Opacity
 -------

Repository URL: https://bitbucket.org/yt_analysis/yt/

--

This is a commit notification from bitbucket.org. You are receiving
this because you have the service enabled, addressing the recipient of
this email.


