[yt-users] memory usage of HOP halo finder

Matthew Turk matthewturk at gmail.com
Thu Feb 23 19:31:35 PST 2012


Hi Mike,

Thanks for letting me take a look at the data.  I have identified the
problem.  To convert from code units to physical units, yt calculates
the conversion factor.  However, it also batches the grids to convert.
To do so, it calculates -- in this case -- CellVolume for every grid.
(Your 512^3 topgrid exacerbates the problem.)  yt has functionality to
ensure that a grid which loads a supplemental field flushes that field
from memory once it has been used, but I did not use it here (this was
most definitely my fault), so the CellVolume fields were all retained.
As a result, CellVolume -- along with maybe one or two other fields --
was being generated and kept in memory for every grid.
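
As a rough illustration of the retention problem, here is a toy model.
It is not yt's actual code: the Grid class, its field_data cache, and
the total_volume and cached_cells helpers are all hypothetical names
made up for this sketch.  It only shows how a per-grid cache keeps
growing unless each supplemental field is flushed after use.

```python
# Toy model only: Grid, get_field, clear_field, total_volume and
# cached_cells are hypothetical names, not part of yt's API.


class Grid:
    def __init__(self, n_cells, cell_width):
        self.n_cells = n_cells
        self.cell_width = cell_width
        self.field_data = {}  # per-grid cache of generated fields

    def get_field(self, name):
        # Generate the field on first access and keep it in the cache.
        if name not in self.field_data:
            if name != "CellVolume":
                raise KeyError(name)
            self.field_data[name] = [self.cell_width ** 3] * self.n_cells
        return self.field_data[name]

    def clear_field(self, name):
        # Drop the cached field so its memory can be reclaimed.
        self.field_data.pop(name, None)


def total_volume(grids, flush):
    """Sum CellVolume over all grids, optionally flushing each grid."""
    total = 0.0
    for grid in grids:
        for volume in grid.get_field("CellVolume"):
            total += volume
        if flush:
            grid.clear_field("CellVolume")
    return total


def cached_cells(grids):
    """Count how many field values are still held in the grid caches."""
    return sum(len(f) for g in grids for f in g.field_data.values())


if __name__ == "__main__":
    grids = [Grid(8, 0.5) for _ in range(4)]
    total_volume(grids, flush=False)
    print("retained without flushing:", cached_cells(grids))
    total_volume(grids, flush=True)
    print("retained with flushing:", cached_cells(grids))
```

Without flushing, every grid still holds its CellVolume array after the
loop; with flushing, only one grid's array is alive at any time.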

Having fixed this, I see about what I would expect for memory use on
this dataset.

I've issued a pull request to fix this problem, and I would request
testing from both you and Stephen, as it touches the way particles are
read and converted.  I am leery of changes like this without a few
more sets of eyes.  Additionally, I have tested it, and while it gives
the same answer to very good precision, it differs enough (likely
because of concatenation order and FP roundoff; for moving7, the
relative difference in a sum is ~1e-8) that the gold standard will
have to be regenerated.
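
To show why concatenation order alone can change a sum, here is a
minimal sketch.  It uses made-up random values in double precision
rather than the moving7 data, so the size of the difference it reports
will not match the ~1e-8 figure above.  All function names are
hypothetical.  The explicit loop is deliberate, because the built-in
sum() in recent Python versions compensates for some of this roundoff.

```python
import random


def naive_sum(values):
    """Left-to-right floating-point accumulation, no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


def relative_difference(a, b):
    """Relative difference between two nonzero sums."""
    return abs(a - b) / max(abs(a), abs(b))


def make_chunks(n_chunks, chunk_size, seed=0):
    """Stand-in for per-grid particle arrays (values are made up)."""
    rng = random.Random(seed)
    return [
        [rng.uniform(1.0, 1e6) for _ in range(chunk_size)]
        for _ in range(n_chunks)
    ]


def sum_in_order(chunks, order):
    """Concatenate the chunks in the given order, then sum them."""
    flat = [v for i in order for v in chunks[i]]
    return naive_sum(flat)


if __name__ == "__main__":
    chunks = make_chunks(16, 1000)
    forward = sum_in_order(chunks, range(16))
    backward = sum_in_order(chunks, reversed(range(16)))
    print("relative difference:", relative_difference(forward, backward))

    # An extreme case where order changes the answer completely.
    print(naive_sum([1e16, 1.0, -1e16]), naive_sum([1e16, -1e16, 1.0]))
```

The two orderings add exactly the same numbers, so any difference in
the result comes purely from where the rounding happens.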

The PR is here:

https://bitbucket.org/yt_analysis/yt/pull-request/105/particle-io-fix

-Matt

On Wed, Feb 22, 2012 at 9:16 PM, Stephen Skory <s at skory.us> wrote:
> Hi Mike,
>
>> Yes, that does the trick. However,
>> self._data_source.quantities["TotalQuantity"]("ParticleMassMsun")
>> returns a list, so I needed to add a '[0]' in order to get just the
>> number.
>
> I'm glad it helped. I will make this change soon to the source. I
> always forget about that list part!
>
>> It's not immediately clear to me how to implement this fix for the
>> dm_only=True case, in which you only want the sum over DM particles.
>
> It may be possible to write a special field or something... I'll think about it.
>
>> Lastly, does the sub_mass calculation have to be done even when
>> subvolume is None and only a single processor is being used? It seems
>> in this case sub_mass = total_mass and the second calculation could be
>> skipped.
>
> I think you're right. I'll make this change too! Thanks for pointing this out.
>
> --
> Stephen Skory
> s at skory.us
> http://stephenskory.com/
> 510.621.3687 (google voice)
> _______________________________________________
> yt-users mailing list
> yt-users at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org


