[yt-users] matplotlib issue on kraken

Matthew Turk matthewturk at gmail.com
Wed Sep 5 12:55:22 PDT 2012


Hi Eric,

The tex.cache issue seems to be the big one here.  Can you try, in
serial, launching a single job that imports yt?  I think it just needs
to be bootstrapped once.
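
For context, the traceback ends in os.mkdir(texcache) raising "File
exists", which is what a create race between ranks would look like:
several processes find the directory missing and all try to create it.
A minimal sketch of a race-tolerant create, assuming that is the cause;
ensure_dir is a hypothetical helper, not matplotlib's or yt's code:

```python
import errno
import os


def ensure_dir(path):
    """Create path if needed; tolerate another process creating it first.

    Hypothetical helper for illustration only.
    """
    try:
        os.mkdir(path)
    except OSError as exc:
        # EEXIST with a directory present means another process won the
        # race, which is fine.  Anything else is a real error.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
    return path
```

Calling something like this once from a one-process job (or simply
importing yt there) should leave the directory in place before the
parallel job starts.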

-Matt

On Wed, Sep 5, 2012 at 3:50 PM, Eric Hallman <hallman at txcorp.com> wrote:
> Well, the earlier traceback I posted below is a good start.  Seriously, if I
> post the current error list it's going to be 10M of text.  I'll see what I
> can come up with in the short term and we can try later on IRC or something.
>
> Thanks
>
> Eric
>
> On Sep 5, 2012, at 3:45 PM, Nathan Goldbaum wrote:
>
> I'm sorry you're having so much trouble.  Unfortunately I'm probably not the
> best person to advise since I've never run jobs on Kraken.  Others on the
> list might be more helpful.
>
> One thing that would aid tracking down the problem is if you could paste the
> errors you're seeing somewhere so that one of us can take a look at it in
> detail.
>
> Cheers,
>
> Nathan
>
> On Sep 5, 2012, at 12:42 PM, Eric Hallman wrote:
>
> Nathan,
>   using pmods in the import leads to an error explosion in the matplotlib
> imports.  It's even worse than the original test.
>
> Eric
> On Sep 5, 2012, at 12:42 PM, Nathan Goldbaum wrote:
>
> Hi Eric,
>
> Exactly right.  This is a drop-in replacement for yt.mods on high-latency
> parallel filesystems (like Kraken, unfortunately).
>
> There's some discussion on the dev mailing list:
> http://lists.spacepope.org/htdig.cgi/yt-dev-spacepope.org/2012-January/001760.html
>
> Unfortunately this isn't covered in the docs (except for a note in the
> changelog) but it should be in there.
>
> Cheers,
>
> Nathan
>
> On Sep 5, 2012, at 9:40 AM, Eric Hallman wrote:
>
> Nathan,
>   I've been off yt for a while, so I'm unaware of pmods.  It's specific to
> parallel, I'm guessing?
>
> Eric
> On Sep 5, 2012, at 12:38 PM, Nathan Goldbaum wrote:
>
> Hi Eric,
>
> Have you tried from yt.pmods import * instead of the normal yt.mods?
>
> Cheers,
>
> Nathan
>
> On Sep 5, 2012, at 9:37 AM, Eric Hallman wrote:
>
> Hey everyone,
>
> This issue seems like one I've had before, but I searched the lists and
> didn't find this exact issue.
>
>
> On batch jobs on kraken, attempting to do halo finding, I get an almost
> immediate crash (with an eternal hang until the time limit is reached) due
> to matplotlib.  I've been unable to reproduce it in the interactive queue on
> kraken, which is frustrating.  I'm hoping someone has seen it and can
> comment.
>
>
> This is with yt/dev on Kraken, and I set the MPLCONFIGDIR environment
> variable like so:
>
> export MPLCONFIGDIR=${PBS_O_WORKDIR}/.matplotlib/
> [ ! -d ${MPLCONFIGDIR} ] && mkdir ${MPLCONFIGDIR}
>
> because otherwise it fails immediately with permission errors.
>
>
> Anyway, it's a simple script, launched with
>
> aprun -n 12 python halo_finding.py --parallel
>
> but the details of the script are not too important, since the job fails
> as soon as yt is imported, like so:
>
>
> Traceback (most recent call last):
>   File "halo_finding.py", line 1, in <module>
>     from yt.mods import *
>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/yt-2.4dev-py2.7-linux-x86_64.egg/yt/mods.py", line 115, in <module>
>     from yt.visualization.api import \
>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/yt-2.4dev-py2.7-linux-x86_64.egg/yt/visualization/api.py", line 34, in <module>
>     from plot_collection import \
>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/yt-2.4dev-py2.7-linux-x86_64.egg/yt/visualization/plot_collection.py", line 26, in <module>
>     from matplotlib import figure
>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/matplotlib/figure.py", line 18, in <module>
>     from axes import Axes, SubplotBase, subplot_class_factory
>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/matplotlib/axes.py", line 18, in <module>
>     import matplotlib.contour as mcontour
>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/matplotlib/contour.py", line 21, in <module>
>     import matplotlib.texmanager as texmanager
>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/matplotlib/texmanager.py", line 72, in <module>
>     class TexManager:
>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/matplotlib/texmanager.py", line 92, in TexManager
>     os.mkdir(texcache)
> OSError: [Errno 17] File exists: '/lustre/scratch/hallman/gigaCubes/run1024/.matplotlib/tex.cache'
>
>
> In each case, I have deleted tex.cache before I restart, thinking an old
> version persisted there, but the same error happens.  The most irritating
> thing is that the job does not kick out of the batch system, so the time
> continues to run on however many processors you have until the limit is
> reached (eternal hang!).
>
>
> I hope this is something obvious and I'm just dumb.  Let me know.
>
>
> Eric
>
> --
> Eric Hallman
> Tech-X Corporation               hallman at txcorp.com
> 5621 Arapahoe Ave, Suite A       Phone: (720) 254-5833
> Boulder, CO 80303                Fax:   (303) 448-7756
> --
>
> _______________________________________________
> yt-users mailing list
> yt-users at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
>
>


