[yt-dev] Zombie jobs on eudora?

Nathan Goldbaum nathan12343 at gmail.com
Tue Jun 10 10:57:31 PDT 2014


On Tue, Jun 10, 2014 at 10:45 AM, Matthew Turk <matthewturk at gmail.com>
wrote:

> Hi Nathan,
>
> On Tue, Jun 10, 2014 at 12:43 PM, Nathan Goldbaum <nathan12343 at gmail.com>
> wrote:
> >
> >
> >
> > On Tue, Jun 10, 2014 at 6:09 AM, Matthew Turk <matthewturk at gmail.com>
> > wrote:
> >>
> >> Hi Nathan,
> >>
> >> On Mon, Jun 9, 2014 at 11:02 PM, Nathan Goldbaum <nathan12343 at gmail.com>
> >> wrote:
> >> > Hey all,
> >> >
> >> > I'm looking at a memory leak that Philip (cc'd) is seeing when
> >> > iterating over a long list of FLASH datasets.  Just as an example of
> >> > the type of behavior he is seeing - today he left his script running
> >> > and ended up consuming 300 GB of RAM on a viz node.
> >> >
> >> > FWIW, the dataset is not particularly large - ~300 outputs and ~100 MB
> >> > per output. These are also FLASH cylindrical coordinate simulations -
> >> > so perhaps this behavior will only occur in curvilinear geometries?
> >>
> >> Hm, I don't know about that.
> >>
> >> >
> >> > I've been playing with objgraph to try to understand what's happening.
> >> > Here's the script I've been using:
> >> > http://paste.yt-project.org/show/4762/
> >> >
> >> > Here's the output after one iteration of the for loop:
> >> > http://paste.yt-project.org/show/4761/
> >> >
> >> > It seems that for some reason a lot of data is not being garbage
> >> > collected.
> >> >
> >> > Could there be a reference counting bug somewhere down in a Cython
> >> > routine?
> >>
> >> Based on what you're running, the only Cython routines being called
> >> are likely in the selection system.
> >>
> >> > Objgraph is unable to find backreferences to root grid tiles in the
> >> > FLASH dataset, and all the other yt objects that I've looked at seem
> >> > to have backreference graphs that terminate at a FLASHGrid object
> >> > that represents a root grid tile in one of the datasets.  That's the
> >> > best guess I have - but definitely nothing conclusive.  I'd
> >> > appreciate any other ideas anyone else has to help debug this.
> >>
> >> I'm not entirely sure how to parse the output you've pasted, but I do
> >> have a thought.  If you have a reproducible case, I can test it
> >> myself.  I am wondering if this could be related to the way that grid
> >> masks are cached.  You should be able to test this by adding this line
> >> to _get_selector_mask in grid_patch.py, just before "return mask":
> >>
> >> self._last_mask = self._last_selector_id = None
> >>
> >> Something like this patch:
> >>
> >> http://paste.yt-project.org/show/4316/
> >
> >
> > Thanks for the code!  I will look into this today.
> >
> > Sorry for not explaining the random terminal output I pasted from
> > objgraph :/
> >
> > It's a list of objects created after yt operates on one dataset and after
> > the garbage collector is explicitly called. Each iteration of the loop
> > sees the creation of objects representing the FLASH grids, hierarchy, and
> > associated metadata.  With enough iterations this overhead from previous
> > loop iterations begins to dominate the total memory budget.
>
> The code snippet I sent might help reduce it, but I think it speaks to
> a deeper problem in that somehow the FLASH stuff isn't being GC'd
> anywhere.  It really ought to be.
>
> Can you try also doing:
>
> yt.frontends.flash.FLASHDataset._skip_cache = True
>

No effect, unfortunately.


> and seeing if that helps?
>
> >
> >>
> >>
> >>
> >> -Matt
> >>
> >> >
> >> > Thanks for your help in debugging this!
> >> >
> >> > -Nathan
> >> >
> >> _______________________________________________
> >> yt-dev mailing list
> >> yt-dev at lists.spacepope.org
> >> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
> >
> >
> >
>
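
To make the mask-caching idea from the quoted messages concrete, here is
a minimal sketch of how a per-grid "last mask" cache can keep data alive,
and what the suggested one-line reset changes. Grid and Mask are
hypothetical stand-ins written for this example; the attribute names
_last_mask and _last_selector_id come from the quoted line, but the rest
is an assumption and not the actual code in grid_patch.py.

```python
import gc
import weakref


class Mask:
    """Hypothetical stand-in for a boolean mask array."""

    def __init__(self, size):
        self.data = bytearray(size)


class Grid:
    """Hypothetical grid that remembers the last mask it computed."""

    def __init__(self, size, clear_cache=False):
        self.size = size
        self.clear_cache = clear_cache
        self._last_mask = None
        self._last_selector_id = None

    def _get_selector_mask(self, selector_id):
        if self._last_selector_id == selector_id:
            mask = self._last_mask  # reuse the cached mask
        else:
            mask = Mask(self.size)
            self._last_mask = mask
            self._last_selector_id = selector_id
        if self.clear_cache:
            # The reset suggested in the thread, placed just before the return.
            self._last_mask = self._last_selector_id = None
        return mask


def mask_outlives_caller(grid):
    """Return True if the mask is still alive after the caller drops it."""
    mask = grid._get_selector_mask(selector_id=1)
    ref = weakref.ref(mask)
    del mask
    gc.collect()
    return ref() is not None


cached = mask_outlives_caller(Grid(1024))
cleared = mask_outlives_caller(Grid(1024, clear_cache=True))
print(cached, cleared)
```

In this toy version the cached mask lives exactly as long as its grid, so
the reset only helps if the grids themselves are eventually released.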