[yt-dev] [Bug] Halo Catalog units

Nathan Goldbaum nathan12343 at gmail.com
Mon Jun 13 09:40:19 PDT 2016


My apologies for sending you on a wild goose chase here! I should have
looked closer at the code before responding on the mailing list.

It looks like the issue you mailed about yesterday is due to a bug in yt's
unit system. I'm going to submit a patch shortly and will tag you.

On Mon, Jun 13, 2016 at 11:38 AM, Britton Smith <brittonsmith at gmail.com>
wrote:

> Hi Rasmi,
>
> My apologies for not responding earlier to this.  I have been away on
> holiday the last two weeks and didn't see this.
>
> Back when the HaloCatalog was first implemented, the .to_json and
> .from_json functions for saving and loading unit registries did not exist,
> or I was not aware of them.  To avoid confusion, I simply converted all
> fields to CGS for saving.  You can see this around line 435 (in the ._run
> function) of halo_catalog.py.  This is why code_length and CGS units are
> the same for HaloCatalogDatasets.
>
> I'm looking over your PR write now and will leave additional comments
> there.
>
> Britton
>
> On Thu, Jun 2, 2016 at 8:50 PM, Rasmi Elasmar <re2300 at columbia.edu> wrote:
>
>> Makes sense! PR submitted
>> <https://bitbucket.org/yt_analysis/yt/pull-requests/2208/added-halo-catalog-unit_registry-saving/diff>
>> -- let me know if that's good. Thanks for your help!
>>
>> On Thu, Jun 2, 2016 at 3:15 PM, Nathan Goldbaum <nathan12343 at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Jun 2, 2016 at 2:05 PM, Rasmi Elasmar <re2300 at columbia.edu>
>>> wrote:
>>>
>>>> Nathan,
>>>>
>>>> Sorry for the excess of questions. Should I implement the unit_registry
>>>> attr check here
>>>> <https://bitbucket.org/yt_analysis/yt/src/b8a09cd382dd34f386ce3634e7f78df3f5d9401d/yt/data_objects/static_output.py?at=yt&fileviewer=file-view-default#static_output.py-802>?
>>>> If so, how can I access the h5 dataset from there? If not, is here
>>>> <https://bitbucket.org/yt_analysis/yt/src/b8a09cd382dd34f386ce3634e7f78df3f5d9401d/yt/frontends/halo_catalog/data_structures.py?at=yt&fileviewer=file-view-default#data_structures.py-32>
>>>> better?
>>>>
>>>
>>> I would do it inside the `HaloCatalogDataset` class.
>>>
>>>
>>>> I don't know what the order of precedence is, so I don't want the
>>>> creation of the unit registry to be overwritten by the default one
>>>> _create_unit_registry call in the Dataset __init__. Is Particle File being
>>>> passed a Dataset that is already initiated with the default UnitRegistry?
>>>>
>>>
>>> You could add an implementation of _create_unit_regsitry to
>>> `HaloCatalogDataset` that supplements the implementation in the base class.
>>> First, call the superclass implementation, then if a registry is available
>>> in the output file, overwrite it inside `HaloCatalogDataset`. Something
>>> like this (using pseudocode here):
>>>
>>>     class HaloCatalogDataset(Dataset)
>>>
>>>     ... snip stuff that's already in this class ....
>>>
>>>         def _create_unit_registry(self):
>>>              super(HaloCatalogDataset, self)._create_unit_registry()
>>>              if registry in output_file:
>>>                   # replace the registry with what's in the output file
>>>
>>> -Nathan
>>>
>>>
>>>> On Thu, Jun 2, 2016 at 2:18 PM, Nathan Goldbaum <nathan12343 at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, Jun 2, 2016 at 1:16 PM, Rasmi Elasmar <re2300 at columbia.edu>
>>>>> wrote:
>>>>>
>>>>>> Hi Nathan,
>>>>>>
>>>>>> That approach sounds good. What do you think about this
>>>>>> implementation?
>>>>>>
>>>>>>> import h5py
>>>>>>> import yt
>>>>>>
>>>>>>
>>>>>>
>>>>>> ds = yt.load('RD0009/RD0009')
>>>>>>> hc_file = h5py.File('halo_catalogs/catalog/catalog.0.h5')
>>>>>>
>>>>>>
>>>>>>
>>>>>> # Include unit_registry as an attribute
>>>>>>> hc_file.attrs['unit_registry'] = ds.unit_registry.to_json()
>>>>>>
>>>>>>
>>>>>>
>>>>>> # Or as a string dataset
>>>>>>> unit_registry_json = ds.unit_registry.to_json()
>>>>>>> str_type = h5py.special_dtype(vlen=str)
>>>>>>> unit_registry_h5 = hcfile.create_dataset('unit_registry_json',
>>>>>>> shape=(1,), dtype=str_type)
>>>>>>> unit_registry_h5[:] = unit_registry_json
>>>>>>
>>>>>>
>>>>>>
>>>>>> hc_file.close()
>>>>>>
>>>>>>
>>>>>>
>>>>>> # Regenerate unit_registry when loading the HaloCatalog?
>>>>>>> halos_ds = yt.load('halo_catalogs/catalog/catalog.0.h5')
>>>>>>> hc_file = h5py.File('halo_catalogs/catalog/catalog.0.h5')
>>>>>>> unit_registry_json = hc_file.attrs['unit_registry']
>>>>>>> halos_ds.unit_registry =
>>>>>>> yt.units.unit_registry.UnitRegistry.from_json(unit_registry_json)
>>>>>>
>>>>>>
>>>>>>
>>>>>> I'm not sure if including the unit_registry as an attribute or string
>>>>>> dataset is a better idea -- it seems to be an attribute in other
>>>>>> parts of the codebase
>>>>>> <https://bitbucket.org/yt_analysis/yt/src/c117d528f18ea412f8b80e8522a0595444e20ae8/yt/units/yt_array.py?at=yt&fileviewer=file-view-default#yt_array.py-796>
>>>>>> .
>>>>>>
>>>>>
>>>>> Either should be fine. I've actually been meaning to update the
>>>>> snippet you linked to use the JSON method, since pickles aren't portable
>>>>> across python versions, but we need to be careful to maintain backward
>>>>> compatibility with older files.
>>>>>
>>>>>
>>>>>>
>>>>>> Then, how do we get it to take the place of the halos_ds
>>>>>> unit_registry? Should this be done on creation in HaloCatalogDataset
>>>>>> <https://bitbucket.org/yt_analysis/yt/src/b8a09cd382dd34f386ce3634e7f78df3f5d9401d/yt/frontends/halo_catalog/data_structures.py?at=yt&fileviewer=file-view-default#data_structures.py-40>?
>>>>>> (i.e., check if the h5 file has a unit_registry attr, if so, load that, if
>>>>>> not use the current defaults?)
>>>>>>
>>>>>>
>>>>> Sounds good! Looking forward to the pull request that implements this.
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Rasmi
>>>>>>
>>>>>> On Wed, Jun 1, 2016 at 1:01 PM, Nathan Goldbaum <
>>>>>> nathan12343 at gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 1, 2016 at 11:40 AM, Rasmi Elasmar <re2300 at columbia.edu>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Greg and I found a bug involving halo catalog unit handling:
>>>>>>>>
>>>>>>>> >>> halo.quantities['particle_position_x']
>>>>>>>>> 0.495074982741 cm
>>>>>>>>> >>> halo.quantities['particle_position_x'].in_units('code_length')
>>>>>>>>> 0.495074982741 code_length
>>>>>>>>> >>> halo.quantities['particle_position_x'].in_units('cm')
>>>>>>>>> 0.495074982741 cm
>>>>>>>>> >>> ds.unit_registry['code_length']
>>>>>>>>> (9.195880139956267e+25, (length))
>>>>>>>>> >>> halos_ds.unit_registry['code_length']
>>>>>>>>> (1.0, (length))
>>>>>>>>
>>>>>>>>
>>>>>>>> The halos_ds mixes up cm and code_length units when the HaloCatalog
>>>>>>>> object is created from a saved halo catalog. The halo catalog values are
>>>>>>>> saved in code_length, but the HaloCatalog object assumes they are in cm.
>>>>>>>>
>>>>>>>> Here
>>>>>>>> <https://bitbucket.org/yt_analysis/yt/src/b8a09cd382dd34f386ce3634e7f78df3f5d9401d/yt/analysis_modules/halo_analysis/halo_catalog.py?at=yt&fileviewer=file-view-default#halo_catalog.py-453>
>>>>>>>> is where the code_length units are written out after halo-finding is done
>>>>>>>> (this is confirmed with an h5ls).
>>>>>>>> Here is where the halos_ds
>>>>>>>> <https://bitbucket.org/yt_analysis/yt/src/b8a09cd382dd34f386ce3634e7f78df3f5d9401d/yt/frontends/halo_catalog/data_structures.py?at=yt&fileviewer=file-view-default#data_structures.py-78> for
>>>>>>>> the HaloCatalog is created. The length unit is set in cm -- the catalog is
>>>>>>>> assumed to be in cgs.
>>>>>>>> The HaloCatalog fields
>>>>>>>> <https://bitbucket.org/yt_analysis/yt/src/b8a09cd382dd34f386ce3634e7f78df3f5d9401d/yt/frontends/halo_catalog/fields.py?at=yt&fileviewer=file-view-default#fields.py-21>
>>>>>>>> also assume cgs.
>>>>>>>>
>>>>>>>> In theory, the HaloCatalog could just parse the code_length units
>>>>>>>> of the halos_ds, but this isn't necessarily known at the time of creation,
>>>>>>>> so the ideal fix may be to save the halo catalog length units in cm instead
>>>>>>>> of in code_length. Then the assumptions that are made about length being in
>>>>>>>> cm when creating a HaloCatalog object from a halo catalog would be correct.
>>>>>>>> Any thoughts on this approach or other approaches?
>>>>>>>>
>>>>>>>>
>>>>>>> It would probably be better just to save the unit registry into the
>>>>>>> hdf5 file. You might find the UnitRegistry.to_json() and
>>>>>>> UnitRegistry.from_json() to be useful here - the json data could be saved
>>>>>>> in the HDF5 output file as a string dataset.
>>>>>>>
>>>>>>>
>>>>>>> https://bitbucket.org/yt_analysis/yt/src/yt/yt/units/unit_registry.py?at=yt&fileviewer=file-view-default#unit_registry.py-120
>>>>>>>
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Rasmi
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> yt-dev mailing list
>>>>>>>> yt-dev at lists.spacepope.org
>>>>>>>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> yt-dev mailing list
>>>>>>> yt-dev at lists.spacepope.org
>>>>>>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> yt-dev mailing list
>>>>>> yt-dev at lists.spacepope.org
>>>>>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> yt-dev mailing list
>>>>> yt-dev at lists.spacepope.org
>>>>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> yt-dev mailing list
>>>> yt-dev at lists.spacepope.org
>>>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>>>
>>>>
>>>
>>> _______________________________________________
>>> yt-dev mailing list
>>> yt-dev at lists.spacepope.org
>>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>>
>>>
>>
>> _______________________________________________
>> yt-dev mailing list
>> yt-dev at lists.spacepope.org
>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>
>>
>
> _______________________________________________
> yt-dev mailing list
> yt-dev at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.spacepope.org/pipermail/yt-dev-spacepope.org/attachments/20160613/6a4e85a2/attachment.html>


More information about the yt-dev mailing list