[yt-svn] commit/yt: MatthewTurk: Merged in brittonsmith/yt (pull request #2326)

commits-noreply at bitbucket.org commits-noreply at bitbucket.org
Wed Sep 7 11:44:47 PDT 2016


1 new commit in yt:

https://bitbucket.org/yt_analysis/yt/commits/ed78789acef4/
Changeset:   ed78789acef4
Branch:      yt
User:        MatthewTurk
Date:        2016-09-07 18:44:20+00:00
Summary:     Merged in brittonsmith/yt (pull request #2326)

Adding save_as_dataset for clump finder
Affected #:  13 files

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d doc/source/analyzing/analysis_modules/clump_finding.rst
--- a/doc/source/analyzing/analysis_modules/clump_finding.rst
+++ b/doc/source/analyzing/analysis_modules/clump_finding.rst
@@ -13,8 +13,14 @@
 the result of user-specified functions, such as checking for gravitational
 boundedness.  A sample recipe can be found in :ref:`cookbook-find_clumps`.
 
+Setting up the Clump Finder
+---------------------------
+
 The clump finder requires a data object (see :ref:`data-objects`) and a field
-over which the contouring is to be performed.
+over which the contouring is to be performed.  The data object is then used
+to create the initial
+:class:`~yt.analysis_modules.level_sets.clump_handling.Clump` object that
+acts as the base for clump finding.
 
 .. code:: python
 
@@ -28,11 +34,15 @@
 
    master_clump = Clump(data_source, ("gas", "density"))
 
+Clump Validators
+----------------
+
 At this point, every isolated contour will be considered a clump,
 whether this is physical or not.  Validator functions can be added to
 determine if an individual contour should be considered a real clump.
-These functions are specified with the ``Clump.add_validator`` function.
-Current, two validators exist: a minimum number of cells and gravitational
+These functions are specified with the
+:func:`~yt.analysis_modules.level_sets.clump_handling.Clump.add_validator`
+function.  Currently, two validators exist: a minimum number of cells and gravitational
 boundedness.
 
 .. code:: python
@@ -52,7 +62,8 @@
        return (clump["gas", "cell_mass"].sum() >= min_mass)
    add_validator("minimum_gas_mass", _minimum_gas_mass)
 
-The ``add_validator`` function adds the validator to a registry that can
+The :func:`~yt.analysis_modules.level_sets.clump_validators.add_validator`
+function adds the validator to a registry that can
 be accessed by the clump finder.  Then, the validator can be added to the
 clump finding just like the others.
 
@@ -60,9 +71,15 @@
 
    master_clump.add_validator("minimum_gas_mass", ds.quan(1.0, "Msun"))
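
The registry pattern described above can be sketched in a few lines of plain Python. This is an illustrative stand-in, not yt's actual internals (the real registry lives in ``clump_validators.py``); names here are hypothetical.

```python
# Minimal sketch of the validator-registry pattern: validators are
# registered under a name, then looked up and run against each clump.
validator_registry = {}

def add_validator(name, function):
    # Register a validator function under a lookup name.
    validator_registry[name] = function

def run_validators(clump, active):
    # A contour is kept as a clump only if every active validator passes.
    return all(validator_registry[name](clump) for name in active)

# Example: a "minimum cells" style check, with a dict standing in for a clump.
add_validator("min_cells", lambda clump, n=20: clump["n_cells"] >= n)
print(run_validators({"n_cells": 50}, ["min_cells"]))  # True
print(run_validators({"n_cells": 5}, ["min_cells"]))   # False
```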
 
-The clump finding algorithm accepts the ``Clump`` object, the initial minimum
-and maximum of the contouring field, and the step size.  The lower value of the
-contour finder will be continually multiplied by the step size.
+Running the Clump Finder
+------------------------
+
+Clump finding then proceeds by calling the
+:func:`~yt.analysis_modules.level_sets.clump_handling.find_clumps` function.
+This function accepts the
+:class:`~yt.analysis_modules.level_sets.clump_handling.Clump` object, the initial
+minimum and maximum of the contouring field, and the step size.  The lower value
+of the contour finder will be continually multiplied by the step size.
 
 .. code:: python
 
@@ -71,41 +88,27 @@
    step = 2.0
    find_clumps(master_clump, c_min, c_max, step)
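
The contour-level progression described above (the lower bound repeatedly multiplied by the step size until it passes the maximum) can be sketched as plain Python; the numeric values here are illustrative only.

```python
# Illustrative sketch of how contour levels advance during clump finding:
# start at c_min and multiply by the step size until c_max is exceeded.
def contour_levels(c_min, c_max, step):
    levels = []
    level = c_min
    while level < c_max:
        levels.append(level)
        level *= step
    return levels

print(contour_levels(1.0, 100.0, 2.0))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
```

A step size of 2.0 thus gives roughly log2(c_max / c_min) contouring passes.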
 
-After the clump finding has finished, the master clump will represent the top
-of a hierarchy of clumps.  The ``children`` attribute within a ``Clump`` object
-contains a list of all sub-clumps.  Each sub-clump is also a ``Clump`` object
-with its own ``children`` attribute, and so on.
+Calculating Clump Quantities
+----------------------------
 
-A number of helper routines exist for examining the clump hierarchy.
-
-.. code:: python
-
-   # Write a text file of the full hierarchy.
-   write_clump_index(master_clump, 0, "%s_clump_hierarchy.txt" % ds)
-
-   # Write a text file of only the leaf nodes.
-   write_clumps(master_clump,0, "%s_clumps.txt" % ds)
-
-   # Get a list of just the leaf nodes.
-   leaf_clumps = get_lowest_clumps(master_clump)
-
-``Clump`` objects can be used like all other data containers.
-
-.. code:: python
-
-   print(leaf_clumps[0]["gas", "density"])
-   print(leaf_clumps[0].quantities.total_mass())
-
-The writing functions will write out a series or properties about each
-clump by default.  Additional properties can be appended with the
-``Clump.add_info_item`` function.
+By default, a number of quantities will be calculated for each clump when the
+clump finding process has finished.  The default quantities are: ``total_cells``,
+``cell_mass``, ``mass_weighted_jeans_mass``, ``volume_weighted_jeans_mass``,
+``max_grid_level``, ``min_number_density``, and ``max_number_density``.
+Additional items can be added with the
+:func:`~yt.analysis_modules.level_sets.clump_handling.Clump.add_info_item`
+function.
 
 .. code:: python
 
    master_clump.add_info_item("total_cells")
 
 Just like the validators, custom info items can be added by defining functions
-that minimally accept a ``Clump`` object and return a string to be printed.
+that minimally accept a
+:class:`~yt.analysis_modules.level_sets.clump_handling.Clump` object and return
+a format string and the value to be printed.  These are then added to the list
+of available info items by calling
+:func:`~yt.analysis_modules.level_sets.clump_info_items.add_clump_info`:
 
 .. code:: python
 
@@ -121,10 +124,47 @@
 
    master_clump.add_info_item("mass_weighted_jeans_mass")
 
-By default, the following info items are activated: **total_cells**,
-**cell_mass**, **mass_weighted_jeans_mass**, **volume_weighted_jeans_mass**,
-**max_grid_level**, **min_number_density**, **max_number_density**, and
-**distance_to_main_clump**.
+Besides the quantities calculated by default, the following are available:
+``center_of_mass`` and ``distance_to_main_clump``.
+
+Working with Clumps
+-------------------
+
+After the clump finding has finished, the master clump will represent the top
+of a hierarchy of clumps.  The ``children`` attribute within a
+:class:`~yt.analysis_modules.level_sets.clump_handling.Clump` object
+contains a list of all sub-clumps.  Each sub-clump is also a
+:class:`~yt.analysis_modules.level_sets.clump_handling.Clump` object
+with its own ``children`` attribute, and so on.
+
+.. code:: python
+
+   print(master_clump["gas", "density"])
+   print(master_clump.children)
+   print(master_clump.children[0]["gas", "density"])
+
+The entire clump tree can be traversed with a simple loop:
+
+.. code:: python
+
+   for clump in master_clump:
+       print(clump.clump_id)
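
This loop works because ``Clump`` implements ``__iter__`` as a depth-first traversal: yield the node itself, then recurse into its children. The pattern, in minimal standalone form with a hypothetical ``Node`` class:

```python
# Minimal standalone version of the depth-first iteration pattern used by
# Clump.__iter__: yield the node itself, then descend into each child.
class Node:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

    def __iter__(self):
        yield self
        for child in self.children:
            # Recursively yield every node in the child's subtree.
            yield from child

root = Node("root", [Node("a", [Node("a1")]), Node("b")])
print([n.name for n in root])  # ['root', 'a', 'a1', 'b']
```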
+
+The :func:`~yt.analysis_modules.level_sets.clump_handling.get_lowest_clumps`
+function will return a list of the individual clumps that have no children
+of their own (the leaf clumps).
+
+.. code:: python
+
+   # Get a list of just the leaf nodes.
+   leaf_clumps = get_lowest_clumps(master_clump)
+
+   print(leaf_clumps[0]["gas", "density"])
+   print(leaf_clumps[0]["all", "particle_mass"])
+   print(leaf_clumps[0].quantities.total_mass())
+
+Visualizing Clumps
+------------------
 
 Clumps can be visualized using the ``annotate_clumps`` callback.
 
@@ -134,3 +174,44 @@
                            center='c', width=(20,'kpc'))
    prj.annotate_clumps(leaf_clumps)
    prj.save('clumps')
+
+Saving and Reloading Clump Data
+-------------------------------
+
+The clump tree can be saved as a reloadable dataset with the
+:func:`~yt.analysis_modules.level_sets.clump_handling.Clump.save_as_dataset`
+function.  This will save all info items that have been calculated as well as
+any field values specified with the *fields* keyword.  This function
+can be called for any clump in the tree, saving that clump and all those
+below it.
+
+.. code:: python
+
+   fn = master_clump.save_as_dataset(fields=["density", "particle_mass"])
+
+The clump tree can then be reloaded as a regular dataset.  The ``tree`` attribute
+associated with the dataset provides access to the clump tree.  The tree can be
+iterated over in the same fashion as the original tree.
+
+.. code:: python
+
+   ds_clumps = yt.load(fn)
+   for clump in ds_clumps.tree:
+       print(clump.clump_id)
+
+The ``leaves`` attribute returns a list of all leaf clumps.
+
+.. code:: python
+
+   print(ds_clumps.leaves)
+
+Info items for each clump can be accessed with the ``clump`` field type.  Gas
+or grid fields should be accessed using the ``grid`` field type and particle
+fields should be accessed using the specific particle type.
+
+.. code:: python
+
+   my_clump = ds_clumps.leaves[0]
+   print(my_clump["clump", "cell_mass"])
+   print(my_clump["grid", "density"])
+   print(my_clump["all", "particle_mass"])

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d doc/source/cookbook/find_clumps.py
--- a/doc/source/cookbook/find_clumps.py
+++ b/doc/source/cookbook/find_clumps.py
@@ -27,14 +27,14 @@
 # As many validators can be added as you want.
 master_clump.add_validator("min_cells", 20)
 
+# Calculate center of mass for all clumps.
+master_clump.add_info_item("center_of_mass")
+
 # Begin clump finding.
 find_clumps(master_clump, c_min, c_max, step)
 
-# Write out the full clump hierarchy.
-write_clump_index(master_clump, 0, "%s_clump_hierarchy.txt" % ds)
-
-# Write out only the leaf nodes of the hierarchy.
-write_clumps(master_clump,0, "%s_clumps.txt" % ds)
+# Save the clump tree as a reloadable dataset
+fn = master_clump.save_as_dataset(fields=["density", "particle_mass"])
 
 # We can traverse the clump hierarchy to get a list of all of the 'leaf' clumps
 leaf_clumps = get_lowest_clumps(master_clump)
@@ -46,5 +46,17 @@
 # Next we annotate the plot with contours on the borders of the clumps
 prj.annotate_clumps(leaf_clumps)
 
-# Lastly, we write the plot to disk.
+# Save the plot to disk.
 prj.save('clumps')
+
+# Reload the clump dataset.
+cds = yt.load(fn)
+
+# Query fields for clumps in the tree.
+print (cds.tree["clump", "center_of_mass"])
+print (cds.tree.children[0]["grid", "density"])
+print (cds.tree.children[1]["all", "particle_mass"])
+
+# Get all of the leaf clumps.
+print (cds.leaves)
+print (cds.leaves[0]["clump", "cell_mass"])

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d doc/source/reference/api/api.rst
--- a/doc/source/reference/api/api.rst
+++ b/doc/source/reference/api/api.rst
@@ -399,6 +399,8 @@
    ~yt.frontends.ytdata.data_structures.YTNonspatialHierarchy
    ~yt.frontends.ytdata.data_structures.YTNonspatialGrid
    ~yt.frontends.ytdata.data_structures.YTProfileDataset
+   ~yt.frontends.ytdata.data_structures.YTClumpTreeDataset
+   ~yt.frontends.ytdata.data_structures.YTClumpContainer
    ~yt.frontends.ytdata.fields.YTDataContainerFieldInfo
    ~yt.frontends.ytdata.fields.YTGridFieldInfo
    ~yt.frontends.ytdata.io.IOHandlerYTDataContainerHDF5
@@ -442,6 +444,26 @@
    ~yt.data_objects.profiles.ParticleProfile
    ~yt.data_objects.profiles.create_profile
 
+.. _clump_finding:
+
+Clump Finding
+^^^^^^^^^^^^^
+
+The ``Clump`` object and associated functions can be used for identification
+of topologically disconnected structures, i.e., clump finding.
+
+.. autosummary::
+   :toctree: generated/
+
+   ~yt.analysis_modules.level_sets.clump_handling.Clump
+   ~yt.analysis_modules.level_sets.clump_handling.Clump.add_info_item
+   ~yt.analysis_modules.level_sets.clump_handling.Clump.add_validator
+   ~yt.analysis_modules.level_sets.clump_handling.Clump.save_as_dataset
+   ~yt.analysis_modules.level_sets.clump_handling.find_clumps
+   ~yt.analysis_modules.level_sets.clump_handling.get_lowest_clumps
+   ~yt.analysis_modules.level_sets.clump_info_items.add_clump_info
+   ~yt.analysis_modules.level_sets.clump_validators.add_validator
+
 .. _halo_analysis_ref:
 
 Halo Analysis

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d yt/analysis_modules/level_sets/clump_handling.py
--- a/yt/analysis_modules/level_sets/clump_handling.py
+++ b/yt/analysis_modules/level_sets/clump_handling.py
@@ -13,14 +13,22 @@
 # The full license is in the file COPYING.txt, distributed with this software.
 #-----------------------------------------------------------------------------
 
-import copy
 import numpy as np
 import uuid
 
 from yt.fields.derived_field import \
     ValidateSpatial
-from yt.funcs import mylog, iterable
-from yt.extern.six import string_types
+from yt.frontends.ytdata.utilities import \
+    save_as_dataset
+from yt.funcs import \
+    deprecate, \
+    get_output_filename, \
+    iterable, \
+    mylog
+from yt.extern.six import \
+    string_types
+from yt.utilities.tree_container import \
+    TreeContainer
 
 from .clump_info_items import \
     clump_info_registry
@@ -46,28 +54,40 @@
                  display_field=False,
                  units='')
 
-class Clump(object):
+class Clump(TreeContainer):
     children = None
     def __init__(self, data, field, parent=None,
-                 clump_info=None, validators=None):
+                 clump_info=None, validators=None,
+                 base=None, contour_key=None,
+                 contour_id=None):
         self.data = data
         self.field = field
         self.parent = parent
         self.quantities = data.quantities
         self.min_val = self.data[field].min()
         self.max_val = self.data[field].max()
+        self.info = {}
+
+        # is this the parent clump?
+        if base is None:
+            base = self
+            self.total_clumps = 0
+            if clump_info is None:
+                self.set_default_clump_info()
+            else:
+                self.clump_info = clump_info
+
+        self.base = base
+        self.clump_id = self.base.total_clumps
+        self.base.total_clumps += 1
+        self.contour_key = contour_key
+        self.contour_id = contour_id
 
         if parent is not None:
             self.data.parent = self.parent.data
 
-        # List containing characteristics about clumps that are to be written 
-        # out by the write routines.
-        if clump_info is None:
-            self.set_default_clump_info()
-        else:
-            # Clump info will act the same if add_info_item is called 
-            # before or after clump finding.
-            self.clump_info = copy.deepcopy(clump_info)
 
         if validators is None:
             validators = []
@@ -125,10 +145,11 @@
         for child in self.children:
             child.clear_clump_info()
 
+    @deprecate("Clump.save_as_dataset")
     def write_info(self, level, f_ptr):
         "Writes information for clump using the list of items in clump_info."
 
-        for item in self.clump_info:
+        for item in self.base.clump_info:
             value = item(self)
             f_ptr.write("%s%s\n" % ('\t'*level, value))
 
@@ -159,8 +180,190 @@
                 # Using "ones" here will speed things up.
                 continue
             self.children.append(Clump(new_clump, self.field, parent=self,
-                                       clump_info=self.clump_info,
-                                       validators=self.validators))
+                                       validators=self.validators,
+                                       base=self.base,
+                                       contour_key=contour_key,
+                                       contour_id=cid))
+
+    def __iter__(self):
+        yield self
+        if self.children is None:
+            return
+        for child in self.children:
+            for a_node in child:
+                yield a_node
+
+    def save_as_dataset(self, filename=None, fields=None):
+        r"""Export clump tree to a reloadable yt dataset.
+
+        This function will take a clump object and output a dataset
+        containing the fields given in the ``fields`` list and all info
+        items.  The resulting dataset can be reloaded as a yt dataset.
+
+        Parameters
+        ----------
+        filename : str, optional
+            The name of the file to be written.  If None, the name
+            will be a combination of the original dataset and the clump
+            index.
+        fields : list of strings or tuples, optional
+            If this is supplied, it is the list of fields to be saved to
+            disk.
+
+        Returns
+        -------
+        filename : str
+            The name of the file that has been created.
+
+        Examples
+        --------
+
+        >>> import yt
+        >>> import numpy as np
+        >>> from yt.analysis_modules.level_sets.api import \
+        ...         Clump, find_clumps
+        >>> ds = yt.load("IsolatedGalaxy/galaxy0030/galaxy0030")
+        >>> data_source = ds.disk([0.5, 0.5, 0.5], [0., 0., 1.],
+        ...                       (8, 'kpc'), (1, 'kpc'))
+        >>> field = ("gas", "density")
+        >>> step = 2.0
+        >>> c_min = 10**np.floor(np.log10(data_source[field]).min()  )
+        >>> c_max = 10**np.floor(np.log10(data_source[field]).max()+1)
+        >>> master_clump = Clump(data_source, field)
+        >>> master_clump.add_info_item("center_of_mass")
+        >>> master_clump.add_validator("min_cells", 20)
+        >>> find_clumps(master_clump, c_min, c_max, step)
+        >>> fn = master_clump.save_as_dataset(fields=["density", "particle_mass"])
+        >>> new_ds = yt.load(fn)
+        >>> print(new_ds.tree["clump", "cell_mass"])
+        1296926163.91 Msun
+        >>> print(new_ds.tree["grid", "density"])
+        [  2.54398434e-26   2.46620353e-26   2.25120154e-26 ...,   1.12879234e-25
+           1.59561490e-25   1.09824903e-24] g/cm**3
+        >>> print(new_ds.tree["all", "particle_mass"])
+        [  4.25472446e+38   4.25472446e+38   4.25472446e+38 ...,   2.04238266e+38
+           2.04523901e+38   2.04770938e+38] g
+        >>> print(new_ds.tree.children[0]["clump", "cell_mass"])
+        909636495.312 Msun
+        >>> print(new_ds.leaves[0]["clump", "cell_mass"])
+        3756566.99809 Msun
+        >>> print(new_ds.leaves[0]["grid", "density"])
+        [  6.97820274e-24   6.58117370e-24   7.32046082e-24   6.76202430e-24
+           7.41184837e-24   6.76981480e-24   6.94287213e-24   6.56149658e-24
+           6.76584569e-24   6.94073710e-24   7.06713082e-24   7.22556526e-24
+           7.08338898e-24   6.78684331e-24   7.40647040e-24   7.03050456e-24
+           7.12438678e-24   6.56310217e-24   7.23201662e-24   7.17314333e-24] g/cm**3
+
+        """
+
+        ds = self.data.ds
+        keyword = "%s_clump_%d" % (str(ds), self.clump_id)
+        filename = get_output_filename(filename, keyword, ".h5")
+
+        # collect clump info fields
+        clump_info = dict([(ci.name, []) for ci in self.base.clump_info])
+        clump_info.update(
+            dict([(field, []) for field in ["clump_id", "parent_id",
+                                            "contour_key", "contour_id"]]))
+        for clump in self:
+            clump_info["clump_id"].append(clump.clump_id)
+            if clump.parent is None:
+                parent_id = -1
+            else:
+                parent_id = clump.parent.clump_id
+            clump_info["parent_id"].append(parent_id)
+
+            contour_key = clump.contour_key
+            if contour_key is None: contour_key = -1
+            clump_info["contour_key"].append(contour_key)
+            contour_id = clump.contour_id
+            if contour_id is None: contour_id = -1
+            clump_info["contour_id"].append(contour_id)
+
+            for ci in self.base.clump_info:
+                ci(clump)
+                clump_info[ci.name].append(clump.info[ci.name][1])
+        for ci in clump_info:
+            if hasattr(clump_info[ci][0], "units"):
+                clump_info[ci] = ds.arr(clump_info[ci])
+            else:
+                clump_info[ci] = np.array(clump_info[ci])
+
+        ftypes = dict([(ci, "clump") for ci in clump_info])
+
+        # collect data fields
+        if fields is not None:
+            contour_fields = \
+              [("index", "contours_%s" % ckey)
+               for ckey in np.unique(clump_info["contour_key"]) \
+               if str(ckey) != "-1"]
+
+            ptypes = []
+            field_data = {}
+            need_grid_positions = False
+            for f in self.base.data._determine_fields(fields) + contour_fields:
+                field_data[f] = self.base[f]
+                if ds.field_info[f].particle_type:
+                    if f[0] not in ptypes:
+                        ptypes.append(f[0])
+                    ftypes[f] = f[0]
+                else:
+                    need_grid_positions = True
+                    ftypes[f] = "grid"
+
+            if len(ptypes) > 0:
+                for ax in "xyz":
+                    for ptype in ptypes:
+                        p_field = (ptype, "particle_position_%s" % ax)
+                        if p_field in ds.field_info and \
+                          p_field not in field_data:
+                            ftypes[p_field] = p_field[0]
+                            field_data[p_field] = self.base[p_field]
+
+                for clump in self:
+                    if clump.contour_key is None:
+                        continue
+                    for ptype in ptypes:
+                        cfield = (ptype, "contours_%s" % clump.contour_key)
+                        if cfield not in field_data:
+                            field_data[cfield] = \
+                              clump.data._part_ind(ptype).astype(np.int64)
+                            ftypes[cfield] = ptype
+                        field_data[cfield][clump.data._part_ind(ptype)] = \
+                          clump.contour_id
+
+            if need_grid_positions:
+                for ax in "xyz":
+                    g_field = ("index", ax)
+                    if g_field in ds.field_info and \
+                      g_field not in field_data:
+                        field_data[g_field] = self.base[g_field]
+                        ftypes[g_field] = "grid"
+                    g_field = ("index", "d" + ax)
+                    if g_field in ds.field_info and \
+                      g_field not in field_data:
+                        ftypes[g_field] = "grid"
+                        field_data[g_field] = self.base[g_field]
+
+            if self.contour_key is not None:
+                cfilters = {}
+                for field in field_data:
+                    if ftypes[field] == "grid":
+                        ftype = "index"
+                    else:
+                        ftype = field[0]
+                    cfield = (ftype, "contours_%s" % self.contour_key)
+                    if cfield not in cfilters:
+                        cfilters[cfield] = field_data[cfield] == self.contour_id
+                    field_data[field] = field_data[field][cfilters[cfield]]
+
+        clump_info.update(field_data)
+        extra_attrs = {"data_type": "yt_clump_tree",
+                       "container_type": "yt_clump_tree"}
+        save_as_dataset(ds, filename, clump_info,
+                        field_types=ftypes,
+                        extra_attrs=extra_attrs)
+
+        return filename
 
     def pass_down(self,operation):
         """
@@ -270,6 +473,7 @@
 
     return clump_list
 
+@deprecate("Clump.save_as_dataset")
 def write_clump_index(clump, level, fh):
     top = False
     if isinstance(fh, string_types):
@@ -287,6 +491,7 @@
     if top:
         fh.close()
 
+@deprecate("Clump.save_as_dataset")
 def write_clumps(clump, level, fh):
     top = False
     if isinstance(fh, string_types):

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d yt/analysis_modules/level_sets/clump_info_items.py
--- a/yt/analysis_modules/level_sets/clump_info_items.py
+++ b/yt/analysis_modules/level_sets/clump_info_items.py
@@ -21,14 +21,15 @@
 clump_info_registry = OperatorRegistry()
 
 def add_clump_info(name, function):
-    clump_info_registry[name] = ClumpInfoCallback(function)
+    clump_info_registry[name] = ClumpInfoCallback(name, function)
 
 class ClumpInfoCallback(object):
     r"""
     A ClumpInfoCallback is a function that takes a clump, computes a 
     quantity, and returns a string to be printed out for writing clump info.
     """
-    def __init__(self, function, args=None, kwargs=None):
+    def __init__(self, name, function, args=None, kwargs=None):
+        self.name = name
         self.function = function
         self.args = args
         if self.args is None: self.args = []
@@ -36,43 +37,51 @@
         if self.kwargs is None: self.kwargs = {}
 
     def __call__(self, clump):
-        return self.function(clump, *self.args, **self.kwargs)
-    
+        if self.name not in clump.info:
+            clump.info[self.name] = self.function(clump, *self.args, **self.kwargs)
+        rv = clump.info[self.name]
+        return rv[0] % rv[1]
+
+def _center_of_mass(clump, units="code_length", **kwargs):
+    p = clump.quantities.center_of_mass(**kwargs)
+    return "Center of mass: %s.", p.to(units)
+add_clump_info("center_of_mass", _center_of_mass)
+
 def _total_cells(clump):
     n_cells = clump.data["index", "ones"].size
-    return "Cells: %d." % n_cells
+    return "Cells: %d.", n_cells
 add_clump_info("total_cells", _total_cells)
 
 def _cell_mass(clump):
     cell_mass = clump.data["gas", "cell_mass"].sum().in_units("Msun")
-    return "Mass: %e Msun." % cell_mass
+    return "Mass: %e Msun.", cell_mass
 add_clump_info("cell_mass", _cell_mass)
 
 def _mass_weighted_jeans_mass(clump):
     jeans_mass = clump.data.quantities.weighted_average_quantity(
         "jeans_mass", ("gas", "cell_mass")).in_units("Msun")
-    return "Jeans Mass (mass-weighted): %.6e Msolar." % jeans_mass
+    return "Jeans Mass (mass-weighted): %.6e Msolar.", jeans_mass
 add_clump_info("mass_weighted_jeans_mass", _mass_weighted_jeans_mass)
 
 def _volume_weighted_jeans_mass(clump):
     jeans_mass = clump.data.quantities.weighted_average_quantity(
         "jeans_mass", ("index", "cell_volume")).in_units("Msun")
-    return "Jeans Mass (volume-weighted): %.6e Msolar." % jeans_mass
+    return "Jeans Mass (volume-weighted): %.6e Msolar.", jeans_mass
 add_clump_info("volume_weighted_jeans_mass", _volume_weighted_jeans_mass)
 
 def _max_grid_level(clump):
     max_level = clump.data["index", "grid_level"].max()
-    return "Max grid level: %d." % max_level
+    return "Max grid level: %d.", max_level
 add_clump_info("max_grid_level", _max_grid_level)
 
 def _min_number_density(clump):
     min_n = clump.data["gas", "number_density"].min().in_units("cm**-3")
-    return "Min number density: %.6e cm^-3." % min_n
+    return "Min number density: %.6e cm^-3.", min_n
 add_clump_info("min_number_density", _min_number_density)
 
 def _max_number_density(clump):
     max_n = clump.data["gas", "number_density"].max().in_units("cm**-3")
-    return "Max number density: %.6e cm^-3." % max_n
+    return "Max number density: %.6e cm^-3.", max_n
 add_clump_info("max_number_density", _max_number_density)
 
 def _distance_to_main_clump(clump, units="pc"):
@@ -82,6 +91,7 @@
     master_com = clump.data.ds.arr(master.data.quantities.center_of_mass())
     my_com = clump.data.ds.arr(clump.data.quantities.center_of_mass())
     distance = np.sqrt(((master_com - my_com)**2).sum())
-    return "Distance from master center of mass: %.6e %s." % \
-      (distance.in_units(units), units)
+    return "Distance from master center of mass: %%.6e %s." % units, \
+      distance.in_units(units)
 add_clump_info("distance_to_main_clump", _distance_to_main_clump)

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d yt/analysis_modules/level_sets/clump_validators.py
--- a/yt/analysis_modules/level_sets/clump_validators.py
+++ b/yt/analysis_modules/level_sets/clump_validators.py
@@ -13,6 +13,8 @@
 # The full license is in the file COPYING.txt, distributed with this software.
 #-----------------------------------------------------------------------------
 
+import numpy as np
+
 from yt.utilities.lib.misc_utilities import \
     gravitational_binding_energy
 from yt.utilities.operator_registry import \
@@ -64,28 +66,30 @@
              (bulk_velocity[1] - clump["all", "particle_velocity_y"])**2 +
              (bulk_velocity[2] - clump["all", "particle_velocity_z"])**2)).sum()
 
+    if use_particles:
+        m = np.concatenate([clump["gas", "cell_mass"].in_cgs(),
+                            clump["all", "particle_mass"].in_cgs()])
+        px = np.concatenate([clump["index", "x"].in_cgs(),
+                             clump["all", "particle_position_x"].in_cgs()])
+        py = np.concatenate([clump["index", "y"].in_cgs(),
+                             clump["all", "particle_position_y"].in_cgs()])
+        pz = np.concatenate([clump["index", "z"].in_cgs(),
+                             clump["all", "particle_position_z"].in_cgs()])
+    else:
+        m = clump["gas", "cell_mass"].in_cgs()
+        px = clump["index", "x"].in_cgs()
+        py = clump["index", "y"].in_cgs()
+        pz = clump["index", "z"].in_cgs()
+
     potential = clump.data.ds.quan(G *
         gravitational_binding_energy(
-            clump["gas", "cell_mass"].in_cgs(),
-            clump["index", "x"].in_cgs(),
-            clump["index", "y"].in_cgs(),
-            clump["index", "z"].in_cgs(),
+            m, px, py, pz,
             truncate, (kinetic / G).in_cgs()),
-        kinetic.in_cgs().units)
-    
+            kinetic.in_cgs().units)
+
     if truncate and potential >= kinetic:
         return True
 
-    if use_particles:
-        potential += clump.data.ds.quan(G *
-            gravitational_binding_energy(
-                clump["all", "particle_mass"].in_cgs(),
-                clump["all", "particle_position_x"].in_cgs(),
-                clump["all", "particle_position_y"].in_cgs(),
-                clump["all", "particle_position_z"].in_cgs(),
-                truncate, ((kinetic - potential) / G).in_cgs()),
-        kinetic.in_cgs().units)
-
     return potential >= kinetic
 add_validator("gravitationally_bound", _gravitationally_bound)
 

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d yt/analysis_modules/level_sets/tests/test_clump_finding.py
--- a/yt/analysis_modules/level_sets/tests/test_clump_finding.py
+++ b/yt/analysis_modules/level_sets/tests/test_clump_finding.py
@@ -15,16 +15,25 @@
 #-----------------------------------------------------------------------------
 
 import numpy as np
+import os
+import shutil
+import tempfile
+
 
 from yt.analysis_modules.level_sets.api import \
     Clump, \
     find_clumps, \
     get_lowest_clumps
+from yt.convenience import \
+    load
 from yt.frontends.stream.api import \
     load_uniform_grid
 from yt.testing import \
     assert_array_equal, \
-    assert_equal
+    assert_equal, \
+    requires_file
+from yt.utilities.answer_testing.framework import \
+    data_dir_load
 
 def test_clump_finding():
     n_c = 8
@@ -63,7 +72,6 @@
     # two leaf clumps
     assert_equal(len(leaf_clumps), 2)
 
-
     # check some clump fields
     assert_equal(master_clump.children[0]["density"][0].size, 1)
     assert_equal(master_clump.children[0]["density"][0], ad["density"].max())
@@ -72,3 +80,58 @@
     assert_equal(master_clump.children[1]["density"][0].size, 1)
     assert_equal(master_clump.children[1]["density"][0], ad["density"].max())
     assert_equal(master_clump.children[1]["particle_mass"].size, 0)
+
+i30 = "IsolatedGalaxy/galaxy0030/galaxy0030"
+@requires_file(i30)
+def test_clump_tree_save():
+    tmpdir = tempfile.mkdtemp()
+    curdir = os.getcwd()
+    os.chdir(tmpdir)
+
+    ds = data_dir_load(i30)
+    data_source = ds.disk([0.5, 0.5, 0.5], [0., 0., 1.],
+                          (8, 'kpc'), (1, 'kpc'))
+
+    field = ("gas", "density")
+    step = 2.0
+    c_min = 10**np.floor(np.log10(data_source[field]).min()  )
+    c_max = 10**np.floor(np.log10(data_source[field]).max()+1)
+
+    master_clump = Clump(data_source, field)
+    master_clump.add_info_item("center_of_mass")
+    master_clump.add_validator("min_cells", 20)
+
+    find_clumps(master_clump, c_min, c_max, step)
+    leaf_clumps = get_lowest_clumps(master_clump)
+
+    fn = master_clump.save_as_dataset(fields=["density", "particle_mass"])
+    ds2 = load(fn)
+
+    # compare clumps in the tree
+    t1 = [c for c in master_clump]
+    t2 = [c for c in ds2.tree]
+    mt1 = ds.arr([c.info["cell_mass"][1] for c in t1])
+    mt2 = ds2.arr([c["clump", "cell_mass"] for c in t2])
+    it1 = np.argsort(mt1).d.astype(int)
+    it2 = np.argsort(mt2).d.astype(int)
+    assert_array_equal(mt1[it1], mt2[it2])
+
+    for i1, i2 in zip(it1, it2):
+        ct1 = t1[i1]
+        ct2 = t2[i2]
+        assert_array_equal(ct1["gas", "density"],
+                           ct2["grid", "density"])
+        assert_array_equal(ct1["all", "particle_mass"],
+                           ct2["all", "particle_mass"])
+
+    # compare leaf clumps
+    c1 = [c for c in leaf_clumps]
+    c2 = [c for c in ds2.leaves]
+    mc1 = ds.arr([c.info["cell_mass"][1] for c in c1])
+    mc2 = ds2.arr([c["clump", "cell_mass"] for c in c2])
+    ic1 = np.argsort(mc1).d.astype(int)
+    ic2 = np.argsort(mc2).d.astype(int)
+    assert_array_equal(mc1[ic1], mc2[ic2])
+
+    os.chdir(curdir)
+    shutil.rmtree(tmpdir)
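The setup and teardown in the test above (mkdtemp, chdir into it, rmtree on the way out) is a common isolation pattern for tests that write output files. It can be sketched as a reusable context manager; this is illustrative only, not part of this changeset:

```python
import os
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def in_tmpdir():
    # Create a scratch directory, chdir into it, and on exit always
    # restore the original working directory and remove the scratch,
    # even if the body raises.
    tmpdir = tempfile.mkdtemp()
    curdir = os.getcwd()
    os.chdir(tmpdir)
    try:
        yield tmpdir
    finally:
        os.chdir(curdir)
        shutil.rmtree(tmpdir)
```

Any files the body writes (for example, the output of save_as_dataset) land in the scratch directory and are cleaned up automatically.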

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d yt/data_objects/selection_data_containers.py
--- a/yt/data_objects/selection_data_containers.py
+++ b/yt/data_objects/selection_data_containers.py
@@ -783,6 +783,7 @@
         self.conditionals = ensure_list(conditionals)
         self.base_object = data_source
         self._selector = None
+        self._particle_mask = {}
         # Need to interpose for __getitem__, fwidth, fcoords, icoords, iwidth,
         # ires and get_data
 
@@ -805,7 +806,8 @@
             f = self.base_object[field]
             if f.shape != ind.shape:
                 parent = getattr(self, "parent", self.base_object)
-                self.field_data[field] = parent[field][self._part_ind]
+                self.field_data[field] = \
+                  parent[field][self._part_ind(field[0])]
             else:
                 self.field_data[field] = self.base_object[field][ind]
 
@@ -835,21 +837,22 @@
                 np.logical_and(res, ind, ind)
         return ind
 
-    _particle_mask = None
-    @property
-    def _part_ind(self):
-        if self._particle_mask is None:
+    def _part_ind(self, ptype):
+        if self._particle_mask.get(ptype) is None:
             parent = getattr(self, "parent", self.base_object)
             units = "code_length"
             mask = points_in_cells(
-                self["x"].to(units), self["y"].to(units),
-                self["z"].to(units), self["dx"].to(units),
-                self["dy"].to(units), self["dz"].to(units),
-                parent["particle_position_x"].to(units),
-                parent["particle_position_y"].to(units),
-                parent["particle_position_z"].to(units))
-            self._particle_mask = mask
-        return self._particle_mask
+                self[("index", "x")].to(units),
+                self[("index", "y")].to(units),
+                self[("index", "z")].to(units),
+                self[("index", "dx")].to(units),
+                self[("index", "dy")].to(units),
+                self[("index", "dz")].to(units),
+                parent[(ptype, "particle_position_x")].to(units),
+                parent[(ptype, "particle_position_y")].to(units),
+                parent[(ptype, "particle_position_z")].to(units))
+            self._particle_mask[ptype] = mask
+        return self._particle_mask[ptype]
 
     @property
     def icoords(self):
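The change above turns a single cached particle mask into a per-particle-type cache keyed on ptype. The memoization pattern, in isolation (ParticleMaskCache and compute_mask are hypothetical stand-ins for the data container and points_in_cells):

```python
class ParticleMaskCache(object):
    """Compute a selection mask once per particle type, then reuse it."""
    def __init__(self, compute_mask):
        self._compute = compute_mask   # stand-in for points_in_cells
        self._particle_mask = {}       # ptype -> cached mask

    def part_ind(self, ptype):
        # Only compute the mask the first time a ptype is requested.
        if self._particle_mask.get(ptype) is None:
            self._particle_mask[ptype] = self._compute(ptype)
        return self._particle_mask[ptype]
```

Caching per type matters here because different particle types can have different positions, so a single shared mask would be wrong for any type other than the first one queried.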

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d yt/frontends/ytdata/api.py
--- a/yt/frontends/ytdata/api.py
+++ b/yt/frontends/ytdata/api.py
@@ -23,7 +23,9 @@
     YTNonspatialDataset, \
     YTNonspatialHierarchy, \
     YTNonspatialGrid, \
-    YTProfileDataset
+    YTProfileDataset, \
+    YTClumpContainer, \
+    YTClumpTreeDataset
 
 from .io import \
     IOHandlerYTDataContainerHDF5, \

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d yt/frontends/ytdata/data_structures.py
--- a/yt/frontends/ytdata/data_structures.py
+++ b/yt/frontends/ytdata/data_structures.py
@@ -56,6 +56,8 @@
     _h5py as h5py
 from yt.utilities.parallel_tools.parallel_analysis_interface import \
     parallel_root_only
+from yt.utilities.tree_container import \
+    TreeContainer
 from yt.fields.field_exceptions import \
     NeedsGridType
 from yt.data_objects.data_containers import \
@@ -480,20 +482,37 @@
                 particles.append((ftype, fname))
             elif (ftype, fname) not in fluids:
                 fluids.append((ftype, fname))
+
         # The _read method will figure out which fields it needs to get from
         # disk, and return a dict of those fields along with the fields that
         # need to be generated.
         read_fluids, gen_fluids = self.index._read_fluid_fields(
                                         fluids, self, self._current_chunk)
         for f, v in read_fluids.items():
-            self.field_data[f] = self.ds.arr(v, input_units = finfos[f].units)
-            self.field_data[f].convert_to_units(finfos[f].output_units)
+            convert = True
+            if v.dtype != np.float64:
+                if finfos[f].units == "":
+                    self.field_data[f] = v
+                    convert = False
+                else:
+                    v = v.astype(np.float64)
+            if convert:
+                self.field_data[f] = self.ds.arr(v, input_units = finfos[f].units)
+                self.field_data[f].convert_to_units(finfos[f].output_units)
 
-        read_particles, gen_particles = self.index._read_particle_fields(
+        read_particles, gen_particles = self.index._read_fluid_fields(
                                         particles, self, self._current_chunk)
         for f, v in read_particles.items():
-            self.field_data[f] = self.ds.arr(v, input_units = finfos[f].units)
-            self.field_data[f].convert_to_units(finfos[f].output_units)
+            convert = True
+            if v.dtype != np.float64:
+                if finfos[f].units == "":
+                    self.field_data[f] = v
+                    convert = False
+                else:
+                    v = v.astype(np.float64)
+            if convert:
+                self.field_data[f] = self.ds.arr(v, input_units = finfos[f].units)
+                self.field_data[f].convert_to_units(finfos[f].output_units)
 
         fields_to_generate += gen_fluids + gen_particles
         self._generate_fields(fields_to_generate)
@@ -562,7 +581,7 @@
 
     def _setup_classes(self):
         # We don't allow geometric selection for non-spatial datasets
-        pass
+        self.objects = []
 
     @parallel_root_only
     def print_key_parameters(self):
@@ -685,3 +704,78 @@
             if data_type == "yt_profile":
                 return True
         return False
+
+class YTClumpContainer(TreeContainer):
+    def __init__(self, clump_id, global_id, parent_id,
+                 contour_key, contour_id, ds=None):
+        self.clump_id = clump_id
+        self.global_id = global_id
+        self.parent_id = parent_id
+        self.contour_key = contour_key
+        self.contour_id = contour_id
+        self.parent = None
+        self.ds = ds
+        TreeContainer.__init__(self)
+
+    def add_child(self, child):
+        if self.children is None:
+            self.children = []
+        self.children.append(child)
+        child.parent = self
+
+    def __repr__(self):
+        return "Clump[%d]" % self.clump_id
+
+    def __getitem__(self, field):
+        g = self.ds.data
+        f = g._determine_fields(field)[0]
+        if f[0] == "clump":
+            return g[f][self.global_id]
+        if self.contour_id == -1:
+            return g[f]
+        cfield = (f[0], "contours_%s" % self.contour_key.decode('utf-8'))
+        if f[0] == "grid":
+            return g[f][g[cfield] == self.contour_id]
+        return self.parent[f][g[cfield] == self.contour_id]
+
+class YTClumpTreeDataset(YTNonspatialDataset):
+    """Dataset for saved clump-finder data."""
+    def __init__(self, filename, unit_system="cgs"):
+        super(YTClumpTreeDataset, self).__init__(filename,
+                                                 unit_system=unit_system)
+        self._load_tree()
+
+    def _load_tree(self):
+        my_tree = {}
+        for i, clump_id in enumerate(self.data[("clump", "clump_id")]):
+            my_tree[clump_id] = YTClumpContainer(
+                clump_id, i, self.data["clump", "parent_id"][i],
+                self.data["clump", "contour_key"][i],
+                self.data["clump", "contour_id"][i], self)
+        for clump in my_tree.values():
+            if clump.parent_id == -1:
+                self.tree = clump
+            else:
+                parent = my_tree[clump.parent_id]
+                parent.add_child(clump)
+
+    _leaves = None
+    @property
+    def leaves(self):
+        if self._leaves is None:
+            self._leaves = []
+            for clump in self.tree:
+                if clump.children is None:
+                    self._leaves.append(clump)
+        return self._leaves
+
+    @classmethod
+    def _is_valid(self, *args, **kwargs):
+        if not args[0].endswith(".h5"): return False
+        with h5py.File(args[0], "r") as f:
+            data_type = parse_h5_attr(f, "data_type")
+            if data_type is None:
+                return False
+            if data_type == "yt_clump_tree":
+                return True
+        return False

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d yt/frontends/ytdata/io.py
--- a/yt/frontends/ytdata/io.py
+++ b/yt/frontends/ytdata/io.py
@@ -42,14 +42,13 @@
                 rv.update(gf)
             if len(rv) == len(fields): return rv
             f = h5py.File(u(g.filename), "r")
-            gds = f["data"]
             for field in fields:
                 if field in rv:
                     self._hits += 1
                     continue
                 self._misses += 1
                 ftype, fname = field
-                rv[(ftype, fname)] = gds[fname].value
+                rv[(ftype, fname)] = f[ftype][fname].value
             if self._cache_on:
                 for gid in rv:
                     self._cached_fields.setdefault(gid, {})

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d yt/frontends/ytdata/utilities.py
--- a/yt/frontends/ytdata/utilities.py
+++ b/yt/frontends/ytdata/utilities.py
@@ -136,6 +136,9 @@
             field_name = field[1]
         else:
             field_name = field
+        # thanks, python3
+        if data[field].dtype.kind == 'U':
+            data[field] = data[field].astype('|S40')
         _yt_array_hdf5(fh[field_type], field_name, data[field])
         if "num_elements" not in fh[field_type].attrs:
             fh[field_type].attrs["num_elements"] = data[field].size
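The dtype check above exists because h5py cannot store NumPy's variable-width unicode ('U') arrays, which is what Python 3 produces for string data; the commit converts them to fixed-width byte strings before writing. The conversion in isolation (assuming NumPy; the 40-character width matches the commit's choice):

```python
import numpy as np

def hdf5_safe(arr):
    # h5py rejects the 'U' (unicode) dtype, so fall back to
    # fixed-width byte strings before writing; other dtypes
    # pass through untouched.
    if arr.dtype.kind == 'U':
        return arr.astype('|S40')
    return arr
```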

diff -r efbc6a51b1322148be1f956bc2e3b1f73a5b9198 -r ed78789acef4d8e51ecb4e5a40ee55ad31508d5d yt/utilities/tree_container.py
--- /dev/null
+++ b/yt/utilities/tree_container.py
@@ -0,0 +1,33 @@
+"""
+TreeContainer class and member functions
+
+
+
+"""
+
+#-----------------------------------------------------------------------------
+# Copyright (c) 2016, yt Development Team.
+#
+# Distributed under the terms of the Modified BSD License.
+#
+# The full license is in the file COPYING.txt, distributed with this software.
+#-----------------------------------------------------------------------------
+
+class TreeContainer(object):
+    r"""A recursive data container for things like merger trees and
+    clump-finder trees.
+
+    """
+    _child_attr = "children"
+
+    def __init__(self):
+        setattr(self, self._child_attr, None)
+
+    def __iter__(self):
+        yield self
+        children = getattr(self, self._child_attr)
+        if children is None:
+            return
+        for child in children:
+            for a_node in child:
+                yield a_node
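TreeContainer's __iter__ yields the node itself and then recurses into its children, i.e. a pre-order depth-first traversal; this is what lets YTClumpTreeDataset.leaves filter the whole tree with a simple loop. A quick sketch exercising it (the Node subclass is hypothetical):

```python
class TreeContainer(object):
    # Mirrors the class added above: yield self, then recurse.
    _child_attr = "children"

    def __init__(self):
        setattr(self, self._child_attr, None)

    def __iter__(self):
        yield self
        children = getattr(self, self._child_attr)
        if children is None:
            return
        for child in children:
            for a_node in child:
                yield a_node

class Node(TreeContainer):
    def __init__(self, name):
        TreeContainer.__init__(self)
        self.name = name

    def add_child(self, child):
        if self.children is None:
            self.children = []
        self.children.append(child)

root = Node("root")
a = Node("a")
b = Node("b")
root.add_child(a)
root.add_child(b)
a.add_child(Node("a1"))
names = [n.name for n in root]   # pre-order: root, a, a1, b
```

Filtering the traversal on `children is None` reproduces the leaves property: only "a1" and "b" survive.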

Repository URL: https://bitbucket.org/yt_analysis/yt/

--

This is a commit notification from bitbucket.org. You are receiving
this because you have the service enabled, addressing the recipient of
this email.


