[Yt-dev] errors in SpecificAngularMomentum

Wed Feb 9 07:29:35 PST 2011

Hi John,

(Sorry for the delay, I was away from email yesterday and that
fortuitously gave me time to compose my reply in my head a bit.)

I guess this isn't that surprising to me.  I have done a poor job of
handling vector fields in yt.  I think the AngularMomentum fields
actually date back to before the code was called 'yt,' and back in
those days you could even calculate the inertial tensor as a
fully-local field.

The problem that I ended up running into is exactly the problem you
describe.  The problem is basically what you point out:

1) When generating a covering grid, you end up with (I,J,K,N) where N
is the number of components to the vector or tensor field.
2) Concatenating and joining needs substantially more care, as do
operations like multiplication where broadcasting may result in
exceptionally large arrays if not handled correctly.
3) The weighting and averaging of fields like angular momentum often
have to be handled differently based on both the user and the type of
process being done; by making it epsilon harder, I guess I implicitly
made it less likely that averaging errors were the result of yt's
issues rather than the user.  I know, I know, this kind of decision is
a bit ... callous / inconsiderate?

To get around this, I actually added SpecificAngularMomentum[XYZ], and
that's what I've been using.  This made it more clear when you wanted
to generate, say, the angular momentum magnitude in a given radial
bin, where the square root sign is placed.

Anyway, those things aside, I do agree it would be nice to have a bit
of an easier mechanism for handling vector fields.  In fact, it would
be nice to have an easier and more standard mechanism for addressing
fields on the whole.  I think while we could fix this specific
problem, it would be better to hold off on addressing it directly
until we have both a better method for handling fields in general as
well as a better method for regression and answer testing.

There are a few ways we could address the issue of fields.

Right now fields are just numpy.ndarrays, and individual fields do not
know anything about themselves: there is such a thing as a
DerivedField object (line 220,
yt/data_objects/field_info_container.py), but this object only serves
as a value in a dictionary of fields and as a mechanism for generating
an numpy.ndarray; once the numpy.ndarray had been created, it no
longer retains any reference to the original field.  This is good and
bad; if we were to subclass ndarray to retain field name, units, etc,
then it would be bad for this to contain an incorrect reference once
any value had been changed.  (i.e., if we multiplied by 10 and the old
field name were to be retained.)  However, we *could* subclass ndarray
(http://docs.scipy.org/doc/numpy/user/basics.subclassing.html) to get
some measure of functionality.  If we were to subclass, then we would
be able to implement combination functions, etc etc, that operated in
a yt-specific way.

However, there *are* only a few places that we actually do
manipulations of the array content.  We could construct a new
utilities section for array handling; this could include the
concatenation of arrays inside AMRData objects, multiplication, and so
on.  This is a bit of a longer term process, and again would require
better testing mechanisms than we currently have in place.

(And while we're at it, I'd like to standardize on field names that
mean something and that are distinct from the fields in the file.
This code can be a bit fragile, with translation tables and so on.)

I guess that's kind of a winding response that ends with: right now,
vector/tensor fields can be generated, but are not reliable inputs to
derived quantities and parallelism.  However, you can get around this
with specific components.  And, in the future, we might be able to fix
this to make a general solution, but the code base isn't ready just
yet.

Best,

Matt

On Tue, Feb 8, 2011 at 9:02 AM, John Wise <jwise at astro.princeton.edu> wrote:
> Hi all,
>
> I was trying to calculate the spin parameter for a halo with the following simple script,
>
> center = [0.495013,     0.494046,     0.495160]
> radius = 100  # pc
> pf = load("DD0019/output_0019")
> sp = pf.h.sphere(center, radius/pf["pc"])
> gas_spin = sp.quantities["BaryonSpinParameter"]()
>
> and it doesn't work in parallel.  It gives the following traceback.
>
> ---
> Traceback (most recent call last):
>  File "spin.py", line 9, in <module>
>    gas_spin = sp.quantities["BaryonSpinParameter"]()
>  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/yt-2.1dev-py2.6-macosx-10.6-intel.egg/yt/data_objects/derived_quantities.py", line 85, in __call__
>    self.func(e, *args, **kwargs)
>  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/yt-2.1dev-py2.6-macosx-10.6-intel.egg/yt/data_objects/derived_quantities.py", line 237, in _BaryonSpinParameter
>    am = data["SpecificAngularMomentum"]*data["CellMassMsun"]
> ValueError: shape mismatch: objects cannot be broadcast to a single shape
> Traceback (most recent call last):
>  File "spin.py", line 9, in <module>
>    gas_spin = sp.quantities["BaryonSpinParameter"]()
>  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/yt-2.1dev-py2.6-macosx-10.6-intel.egg/yt/data_objects/derived_quantities.py", line 85, in __call__
>    self.func(e, *args, **kwargs)
>  File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/yt-2.1dev-py2.6-macosx-10.6-intel.egg/yt/data_objects/derived_quantities.py", line 237, in _BaryonSpinParameter
>    am = data["SpecificAngularMomentum"]*data["CellMassMsun"]
> ValueError: shape mismatch: objects cannot be broadcast to a single shape
> ---
>
>
> This happens on multiple systems.  I looked into this further, and I think the shapes of the angular momentum data are not being preserved in the MPI calls.  Also the shapes appear wrong in a serial calculation.  In serial, everything looks fine until data["SpecificAngularMomentum"] is returned.  Here are the shapes of the arrays.
>
> v_vec: (3,16,16,16)
> cross product of r_vec and v_vec: (3,16,16,16)
> and the returned array from _SpecificAngularMomentum has a shape of (3,0)!
>
> Then when I try to run it with 2 processors, the shapes change to
> v_vec: (3,4096)
> cross: (3,4096)
> _SpecificAngularMomentum result: (12288,)
>
> Can someone replicate this error?  Any ideas?
>
> Thanks,
> John
> _______________________________________________
> Yt-dev mailing list
> Yt-dev at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>