[Yt-dev] Fwd: ellipsoid part 2

Geoffrey So gsiisg at gmail.com
Fri Sep 9 19:13:54 PDT 2011


Matt, Stephen,

Thanks Matt for all the suggestions.  I've made most of the changes, but
some I wasn't quite sure what you meant:

1) Use booleans for parameters such as use_halos.  True and False, not 0 and
1.
- Done
2) Iterate over the halos:
- Done (I assume this is faster because it doesn't try to find which halo
every time I call it?)
3) Index variables should be clearly index variables; "halo" is an
object, but you use it as an index.
- Done, switched to calling halo_num to avoid ambiguity
4) cm => com
- Done
5) You set ax to be an array twice, which will double the memory
requirements.  I don't think ax needs to be an array at all.
- Switched to using list [], not sure why an array of arrays is better than
list of arrays, but I take your word for it
6) Line 109, just modify ax in place.  Fancy-indexing in numpy will (I
believe) implicitly make a copy when you modify:

In [1]: import numpy as na
In [2]: arr = na.random.random(100)
In [3]: ai = arr > 0.8
In [4]: bb = arr[ai]
In [5]: bb -= 1.0
In [6]: arr.max()
Out[6]: 0.99642788644174751
In [7]: bb.max()
Out[7]: -0.0035721135582524877

- Not quite sure was meant by the example

7) Line 113, use range or xrange, and use singular "axis" not "axes"
- Done
8) Line 114, you double the memory by making a new array again.
- Done, switched to list
9) Line 117, abs is a numpy function.
- Done, switched to na.abs()
10) Line 121, this makes four new arrays, each of which is the length
of the number of particles.  Construct a temporary array and do
in-place math.  You don't need "where" here; use booleans.
- Old code left from my using copy and paste from the testing phase.
Most things if used only once I avoid using memory by calling it a name
11) Your calculation of the inertial tensor is nice, but I would again
recommend that you avoid creating four temporary arrays when
calculating the magnitude of 'ax'.
- Not changed, since this is the fastest way I can think of.  Since I
realized the memory requirement of particles is much less than fields,
memory should not be the main concern here but speed.  But I'm all
ears if you or Stephen have a faster/efficient way.
12) I would encourage you to investigate the eigenvector code if you
suspect it gives LH system coordinates.
- Not done
13) "for iter in range(0, maxiterations)") should just be for iter in
xrange(maxiterations).  There is no need to specify 0.
- Done
14) Please use print without parenthesis.  Python 3 has transformed
print into a function, but the formatting rules are also different.
- Done
15) Avoid using where!
- Done
16) use .size instead of len() for arrays
- Done
17) Feel free to throw an error.  Define a subclass of Exception and
then raise an instance of it.
- Will look into how to write an error class in YT..., very useful I think
but want to see if the changes I have so far is adequate.

updated script:
http://paste.enzotools.org/show/1799/

So the ones I have trouble with are #6, 10, 11, all involving the phrase "in
place".  I had to wiki the phrase because I'm not familiar with it, and I
think you mean "input is overwritten by output" as per
http://en.wikipedia.org/wiki/In-place_algorithm

When I solve for the 3 different cases of boundary condition, I had to
double (1/3 axes * 3 cases = 1 extra) the particle position memory, because
I only check 1 axis at a time, so it's not as bad as if I do all 3 axes and
3 cases at once.  I can paste the code bit of "cases" into the next line
where it  is used twice by na.choose().  That should make the code "in
place" but it would be a lot less readable.  And since "cases" is used
twice, having it in memory also makes things easier.  I just wanted to
clarify why it is left the way it is, but if you still think the in place
method is better, I can make the change easily.

Another point which I didn't understand was the difference between using
array of array compare to list of arrays.  From what I read
http://stackoverflow.com/questions/176011/python-list-vs-array-when-to-use
people seem to prefer list over array, and use arrays only when the same
data type is absolutely require.  Another point is that arrays are more
memory efficient, because lists uses extra byte incase we want to append
things to it etc, but if we want to do math to it or modify it, it is better
use lists.  In my case I multiply the rotation matrix to the xys positions,
so I should use list because of doing math.  Is that the correct
understanding?

From
G.S.

---------- Forwarded message ----------
From: Matthew Turk <matthewturk at gmail.com>
Date: Tue, Sep 6, 2011 at 4:11 AM
Subject: Re: [Yt-dev] ellipsoid part 2
To: yt-dev at lists.spacepope.org


Hi Geoffrey,

Great to hear from you again on this project!  I have several comments
about the code interspersed; they are all relatively minor, but there
are several.

On Mon, Sep 5, 2011 at 10:23 PM, Geoffrey So <gsiisg at gmail.com> wrote:
> Hello Stephen,
> This is part 2 of the ellipsoid parameter finding.  The method is outlined
> mainly in the Dubinski and Carlberg 1991 paper.  But since we are doing
this
> in YT and we already know the particle list to begin with, I put in the
> option that the user can specify if they want all the particles in the
> region to be taken into account or only the particles found by the halo
> finder algorithm. This is an option that I think should be amended to the
> first ellipsoid parameter finder.

Is this the parameter halo_only?  I would like to note that
creation_time < 0 is not reliable, and will make your code
Enzo-specific.  I recommend for now using the boolean selector IsStar:

dm_index = dd["IsStar"]

Additionally, I would recommend not constructing a new array (which
will be slower than allowing the array selections to be left as is)
and to instead set ax (which should be renamed, as 'ax' is
idiomatically used in yt to reference an axis) to a list.

I would go a bit further and suggest you not use the parameter
halo_only as it is implicit in the halo finding.  If the halo finding
subselected on dark matter, then you would want that in the ellipsoid.
 If it didn't subselect, then you probably also don't want to
subselect in the ellipsoid finding.

> The script form is linked here:
> http://paste.enzotools.org/show/1782/
> Once I have written out the methodology for myself, I'll work on the YT
> documentation, will probably need some help in this area when it comes to
> it.

I have a few comments:

1) Use booleans for parameters such as use_halos.  True and False, not 0 and
1.
2) Iterate over the halos:

for halo in halos:
3) Index variables should be clearly index variables; "halo" is an
object, but you use it as an index.
4) cm => com
5) You set ax to be an array twice, which will double the memory
requirements.  I don't think ax needs to be an array at all.
6) Line 109, just modify ax in place.  Fancy-indexing in numpy will (I
believe) implicitly make a copy when you modify:

In [1]: import numpy as na
In [2]: arr = na.random.random(100)
In [3]: ai = arr > 0.8
In [4]: bb = arr[ai]
In [5]: bb -= 1.0
In [6]: arr.max()
Out[6]: 0.99642788644174751
In [7]: bb.max()
Out[7]: -0.0035721135582524877

7) Line 113, use range or xrange, and use singular "axis" not "axes"
8) Line 114, you double the memory by making a new array again.
9) Line 117, abs is a numpy function.
10) Line 121, this makes four new arrays, each of which is the length
of the number of particles.  Construct a temporary array and do
in-place math.  You don't need "where" here; use booleans.
11) Your calculation of the inertial tensor is nice, but I would again
recommend that you avoid creating four temporary arrays when
calculating the magnitude of 'ax'.
12) I would encourage you to investigate the eigenvector code if you
suspect it gives LH system coordinates.
13) "for iter in range(0, maxiterations)") should just be for iter in
xrange(maxiterations).  There is no need to specify 0.
14) Please use print without parenthesis.  Python 3 has transformed
print into a function, but the formatting rules are also different.
15) Avoid using where!
16) use .size instead of len() for arrays
17) Feel free to throw an error.  Define a subclass of Exception and
then raise an instance of it.

One fun thing would also be to plot projections of "Ones" using these
ellipsoids as sources.  That would help us see them better, too!

> Doron Lemez has used this method in his paper and helped me tested the
> accuracy of this script.  I was wondering do I need to ask him if he wants
> his name in the credits, or just mention his name as a special thanks?

Which credits do you mean?  The CREDITS file is reserved for code
contributions.  Feel free in your documentation to mention that he
assisted you, if you feel it is warranted.  Please be sure that your
documentation includes the ".. sectionauthor" directive to indicate
that you wrote it.

Please issue a pull request once you have included this routine into
the halo finding module; we can directly review it then.

Additionally, for the documentation modifications, I would encourage
you to fork yt-doc and make your changes in source/analysis_modules/
under the halo finding section.  Include a few (small file size)
images, too.

Thanks, GS!

-Matt

>  One
> thing he did in the paper that was especially useful is defined the
starting
> sphere to have 2x the virial radius.  Without that specification, I run
into
> non-convergence sometimes, where after many iterations I end up with no
> particles.  And I want to mention again that there's no guarantee that any
> or all the haloes will converge with this method.
> I tried to adhere to the PEP8 formatting, but old habits die hard so I'm
> sure I made mistakes in the formatting, let me know if you see them.
> From
> G.S.
> _______________________________________________
> Yt-dev mailing list
> Yt-dev at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>
>
_______________________________________________
Yt-dev mailing list
Yt-dev at lists.spacepope.org
http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.spacepope.org/pipermail/yt-dev-spacepope.org/attachments/20110909/352c0b9c/attachment.html>


More information about the yt-dev mailing list