[yt-dev] Reducing memory usage in time series

John Wise jwise at physics.gatech.edu
Tue Feb 4 12:48:32 PST 2014


Hi Matt,

On 02/04/2014 01:44 PM, Matthew Turk wrote:
> Hi John,
>
> On Tue, Feb 4, 2014 at 1:21 PM, John Wise <jwise at physics.gatech.edu> wrote:
>> Hi all,
>>
>> I've been trying to run rockstar in a sizable Enzo simulation (150k grids)
>> with ~100 outputs, where it's running out of memory just loading the
>> hierarchies.  One hierarchy instance consumes almost 1GB!  I've found this
>> to be a problem not specific to rockstar but to time series objects.
>
> Hm.  How are you iterating over the parameter files?  With the time
> series we try to do a load/retain on demand system, where the
> parameter files and their hierarchies are only kept around as long as
> they need to be.  Devin and Hilary looked at how this worked with
> Rockstar, and I thought they concluded that it was okay.

I create the time series from a list of pfs because I want them sorted 
by time; I have both time-based and redshift-based outputs, so I can't 
rely on filename order.  Also, I've removed my call to the hierarchy 
destructor in the static_output destructor because it wasn't having any 
effect.
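
Schematically the list is built like this (the filename list is just a 
placeholder for my mix of DD*/RD* outputs):

from yt.mods import load

pfs = [load(fn) for fn in output_filenames]   # time- and z-based dumps
pfs.sort(key=lambda pf: pf.current_time)      # order by simulation time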

Then I iterate like the following...

from yt.mods import *   # provides TimeSeriesData and get_memory_usage

ts = TimeSeriesData(pfs)
for pf in ts:
    print "%20s (0): %d" % (pf, get_memory_usage())
    pf.h   # instantiate the hierarchy
    print "%20s (1): %d" % (pf, get_memory_usage())
    del pf
    print "%20s (2): %d" % ("", get_memory_usage())

which gives (memory usage in MB)

          output_0048 (0): 74
          output_0048 (1): 971
                      (2): 972
          output_0049 (0): 972
          output_0049 (1): 1705
                      (2): 1705
          output_0050 (0): 1705
          output_0050 (1): 2434
                      (2): 2434
          output_0051 (0): 2434
          output_0051 (1): 3169
                      (2): 3170

Is there a better way to iterate through the pfs?  Each iteration 
permanently adds ~730 MB, roughly one hierarchy's worth, and removing 
the "del pf" gives the same memory usage, fwiw.

However, if I replace "del pf" with "del pf._instantiated_hierarchy", 
using my destructor below, I see nearly stable memory usage:

          output_0048 (0): 74
          output_0048 (1): 971
                      (2): 963
          output_0049 (0): 963
          output_0049 (1): 978
                      (2): 978
          output_0050 (0): 978
          output_0050 (1): 978
                      (2): 978
          output_0051 (0): 978
          output_0051 (1): 982
                      (2): 983
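
So concretely, the stable version of the loop above is just:

ts = TimeSeriesData(pfs)
for pf in ts:
    pf.h                              # instantiate the hierarchy
    # ... per-output work goes here ...
    del pf._instantiated_hierarchy    # drops the hierarchy; my __del__
                                      # then frees the grid arrays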

I see stable behavior in rockstar if I add a "del 
pf._instantiated_hierarchy" at the end of rh_read_particles().

>>
>> My solution is to explicitly delete the hierarchy's metadata and grids.
>> Since I haven't contributed to yt-3.0 yet, I wanted to run this by everyone
>> before submitting a PR.
>>
>> My question is about coding style, in that I see very few __del__()
>> functions now.  In my working version, I've defined a __del__ function for
>> the grid_geometry_handler as
>>
>>      def __del__(self):
>>          del self.grid_dimensions
>>          del self.grid_left_edge
>>          del self.grid_right_edge
>>          del self.grid_levels
>>          del self.grid_particle_count
>>          del self.grids
>>
>> When I delete pf._instantiated_hierarchy after each loop of a time series
>> iterator, I don't see any excessive memory usage anymore.  It just reuses
>> the allocated memory from the previous iteration, which is totally fine by
>> me.  However, when I include this in a __del__ function for a static_output,
>> I still see excessive memory usage, which is bizarre to me.
>
> Hmm.
>
> I'm of two minds on this.  On the one hand, I am not really *opposed*
> to destructors, but I don't like that they are necessary.  Because the
> hierarchy is weirdly self-referential to the static output, this
> sometimes causes problems and the garbage collector doesn't pick it
> up.  However, when the parameter file is deallocated, it *should*
> deallocate all of the arrays.  Whether it does or not may be related
> to the system allocator, and whether it reuses the memory is
> potentially also related to that.  On the other hand, I'd rather fix
> the issue of having a separate index and static output object, and
> break the reference cycle between them.

I agree that destructors aren't the cleanest way to manage memory 
usage, because this should all be taken care of when the parameter file 
is deallocated.  I've tested this on a few systems (Mac OS X, my 
desktop with the v3.11 kernel, and the local cluster with the v2.6.32 
kernel), and the system allocator behaves similarly on all of them.
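
FWIW, I think part of what bites us here is that the Python 2 garbage 
collector refuses to collect any reference cycle containing an object 
that defines __del__; such objects just pile up in gc.garbage.  A toy 
example (nothing yt-specific):

import gc

class Leaky(object):
    def __del__(self):
        pass

obj = Leaky()
obj.me = obj        # a reference cycle through the instance itself
del obj
gc.collect()
print gc.garbage    # the Leaky instance sits here, never freed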

> So I guess where I fall down on this is: I'd like to fix the
> underlying issue, which is something I have been off-and-on working
> on.  But since you are measurably seeing improvement with this change,
> I'm okay with it going in.  But hopefully it will become obsolete
> eventually.  ;-)

I'm totally fine with this change being a temporary fix while the 
underlying problem is being solved.
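
For what it's worth, one way to break the cycle in the long run would 
be for the index to hold only a weak reference back to the static 
output, something like (purely illustrative, not the actual yt 
classes):

import weakref

class Hierarchy(object):
    def __init__(self, pf):
        # a weak proxy avoids a hard pf <-> hierarchy cycle, so plain
        # reference counting can free the pair without the cyclic GC
        self.pf = weakref.proxy(pf)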

> Incidentally, I would still like to see how the time series is
> iterating, and how the references pass through the system.
>
> A related paper you might find interesting:
> http://www.dlr.de/sc/en/Portaldata/15/Resources/dokumente/PyHPC2013/submissions/pyhpc2013_submission_6.pdf

Looks like a good read :)  Thanks for your input!

Cheers,
John

>>
>> Should I define a new routine in the grid_geometry_handler, something like
>> clear_hierarchy(), or keep the __del__ function?  I ask because I want to
>> keep in line with the overall structure of yt-3.0.  This could also be
>> included in the clear_all_data() call.
>>
>> What do people think the best approach would be?


-- 
John Wise
Assistant Professor of Physics
Center for Relativistic Astrophysics, Georgia Tech
http://cosmo.gatech.edu


