[yt-dev] memory usage with --parallel and periodic radius

Michael Kuhlen mqk at astro.berkeley.edu
Wed Feb 27 22:28:56 PST 2013


Hi Nathan,

Thanks for that suggestion. I can confirm that adding the old radius fields
to my_plugins.py fixes the excessive memory issue. I've just filed an issue
about it.
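For anyone else who hits this, the pasted my_plugins.py (http://paste.yt-project.org/show/3212/) has the exact field definitions to drop in. For context on why the two Radius definitions can disagree at all, here is a tiny standalone illustration (the function names and `box` parameter are mine for illustration, not yt API): near a box edge, a plain Euclidean distance and a minimum-image periodic distance give very different answers.

```python
import numpy as np

# Illustration only: how a plain separation and a minimum-image (periodic)
# separation disagree near a box edge. Names are illustrative, not yt API.
def plain_dist(a, b):
    """Absolute separation along one axis, ignoring periodicity."""
    return np.abs(np.asarray(a) - np.asarray(b))

def minimum_image_dist(a, b, box=1.0):
    """Separation along one axis through the nearest periodic image."""
    d = plain_dist(a, b) % box
    return np.minimum(d, box - d)
```

In a unit box, points at 0.05 and 0.95 are 0.9 apart by plain distance but only 0.1 apart through the periodic boundary, which is the kind of discrepancy a radius field has to get right per dataset.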

Btw, I think the problem arises primarily from the use of the 'sphere' data
container in the halo profiling. The first time you access any dataset from
a sphere, AMRSphereBase._get_cut_mask() has to create a RadiusCode field
for the top grid, which in my case is 256^3. With your current _Radius
definition, you first create a position field with shape (3,256,256,256)
and then duplicate it to create the 'center' field. I haven't fully
grokked everything that periodic_dist() does, but I think it must be
possible to do it without creating this huge, redundant center field, and
possibly without creating one huge position field either. Unfortunately I
don't have the time right now to figure this out, especially since I can
override the radius field with the old definition, as you suggested.
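To make that suggestion a bit more concrete, here is a minimal sketch of a per-axis accumulation, assuming hypothetical names (three 1D coordinate arrays and a domain_width vector) rather than yt's actual internals. Each axis contributes its minimum-image displacement to a single r^2 accumulator via broadcasting, so the peak cost is roughly one full-size array plus a one-axis temporary, instead of the two (3,256,256,256) position and center arrays:

```python
import numpy as np

# Hypothetical sketch: accumulate a periodic r^2 field one axis at a time,
# never materializing a full (3, N, N, N) position array or a matching
# center array. Argument names are illustrative, not yt API.
def periodic_radius_squared(grid_axis_coords, center, domain_width):
    """grid_axis_coords: three 1D coordinate arrays (x, y, z cell centers).
    center: length-3 sequence. domain_width: length-3 box sizes."""
    nx, ny, nz = (len(a) for a in grid_axis_coords)
    r2 = np.zeros((nx, ny, nz))
    # reshape targets so each 1D axis broadcasts along its own dimension
    shapes = [(-1, 1, 1), (1, -1, 1), (1, 1, -1)]
    for i in range(3):
        # 1D minimum-image displacement along this axis
        d = np.abs(grid_axis_coords[i] - center[i]) % domain_width[i]
        d = np.minimum(d, domain_width[i] - d)
        # broadcast the squared 1D displacement into the 3D accumulator
        r2 += d.reshape(shapes[i]) ** 2
    return r2
```

Whether something like this can slot into periodic_dist() depends on how the position and center fields are consumed elsewhere, so take it as a starting point only.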

Mike


On Wed, Feb 27, 2013 at 9:00 PM, Nathan Goldbaum <goldbaum at ucolick.org> wrote:

> Hi Mike,
>
> Sorry to hear you're having issues with my changes.  Just to justify
> myself a little bit: the old way of generating the radius field silently
> produced incorrect results on non-periodic datasets.  yt is increasingly
> being used to examine these datasets and I wanted to make sure that the
> results of a simple radial profile analysis would be correct.
>
> That being said, the memory consumption is clearly not as good now.  When
> I wrote the new code to handle the radius fields, I didn't realize that the
> Radius field was so intimately tied to halo finding.
>
> A quick fix to allow you to continue doing your analyses would be to
> replace the Radius field with the old definition.  This can be done without
> committing any changes to the yt codebase by using the my_plugins.py file.
>  Just create a file called my_plugins.py, place it in the .yt folder that
> lives in your home directory, and enter the old Radius and ParticleRadius
> field definitions inside of it.  I've pasted an example my_plugins.py file
> that does this here: http://paste.yt-project.org/show/3212/
>
> Fields defined in my_plugins.py will override definitions in
> universal_fields.py, so if the new field definitions are the primary cause
> of the increased memory consumption you're seeing, this should fix it.
>
> This isn't a very good long term solution and I'd like to work with you
> and others who deal with large datasets to find a permanent solution.  A
> good first step would be to file an issue about this.  I don't have a lot
> of experience working with large datasets so any help figuring out how to
> reduce the memory needs of the Radius field would be appreciated.
>
> Cheers,
>
> Nathan
>
>
> On Wed, Feb 27, 2013 at 8:01 PM, Michael Kuhlen <mqk at astro.berkeley.edu> wrote:
>
>> > With the last good changeset the total memory usage never gets above
>> ~8.5GB (estimated from top).
>>
>> Sorry, forgot to say: the last good changeset is 41358eecdad5.
>>
>>
>> On Wed, Feb 27, 2013 at 7:58 PM, Michael Kuhlen <mqk at astro.berkeley.edu> wrote:
>>
>>> Hi all, and Nathan specifically,
>>>
>>> I've found that the changes having to do with periodic fields are
>>> causing excessive memory usage for me when doing parallel halo profiling,
>>> to the point where I can no longer do analysis that used to work fine on my
>>> 24GB memory workstation.
>>>
>>> A simple HaloProfiler script (http://paste.yt-project.org/show/3211/),
>>> run in parallel on 8 processors on the halo catalog from one of my
>>> cosmology simulations, now almost immediately runs out of memory,
>>> whereas it used to complete without problems.
>>>
>>> I tried to use hg bisect to find the problematic changeset but had to
>>> skip the testing of several revisions because of runtime yt errors, so in
>>> the end bisect only gave me a range of potentially bad revisions:
>>> http://paste.yt-project.org/show/3210/. As you can see, they're all
>>> related to the periodic radius mods from about a month ago.
>>>
>>> I tried replicating this with the standard Enzo_64 example dataset from
>>> http://yt-project.org/data/, but I guess that one is too small to
>>> produce this problem on my machine. If you want to see the problem for
>>> yourself, you can download this tarball
>>> (http://astro.berkeley.edu/~mqk/transfer/RD0003.tar, 4.5GB) and run this
>>> script (http://paste.yt-project.org/show/3211/) on it, like so:
>>>
>>> $ mpirun -np 8 python ./profile_halos.py --parallel
>>>
>>> When I run this with the current tip (ccfe34e70803) the total memory
>>> usage grows to >24GB. With the last good changeset the total memory usage
>>> never gets above ~8.5GB (estimated from top).
>>>
>>> It'd be great if we could find out what the problem is and fix it,
>>> because this is a major performance regression that is seriously
>>> impacting my ability to do analysis with the current yt tip. Let me
>>> know if I should file a BB issue about this, and/or if there's some
>>> way I can assist with fixing it.
>>>
>>> Cheers,
>>> Mike
>>>
>>> --
>>> *********************************************************************
>>> *                                                                   *
>>> *  Dr. Michael Kuhlen              Theoretical Astrophysics Center  *
>>> *  email: mqk at astro.berkeley.edu   UC Berkeley                      *
>>> *  cell phone: (831) 588-1468      B-116 Hearst Field Annex # 3411  *
>>> *  skype username: mikekuhlen      Berkeley, CA 94720               *
>>> *                                                                   *
>>> *********************************************************************
>>>
>>
>>
>>
>>
>> _______________________________________________
>> yt-dev mailing list
>> yt-dev at lists.spacepope.org
>> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>>
>>
>
>
>

