[yt-users] matplotlib issue on kraken

Eric Hallman hallman at txcorp.com
Wed Sep 5 13:09:41 PDT 2012


Matt,
  I will try to narrow it down.  So far it seems to be in the "from matplotlib import figure" call.  I'll isolate it and see what is going on.
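
For reference, a minimal sketch of what such an isolation script could look like. The module list is taken from the traceback quoted further down; the helper name try_imports is hypothetical and is not part of yt or matplotlib. It imports each module on its own and records the traceback instead of stopping at the first failure.

```python
import importlib
import traceback

# Innermost module first, following the traceback quoted in this thread.
MODULES = [
    "matplotlib.texmanager",
    "matplotlib.contour",
    "matplotlib.axes",
    "matplotlib.figure",
]


def try_imports(names):
    """Import each module in turn; map name -> None (ok) or traceback text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception:
            # Keep going so that every failing import is reported.
            results[name] = traceback.format_exc()
    return results


if __name__ == "__main__":
    for name, err in try_imports(MODULES).items():
        print("%s: %s" % (name, "ok" if err is None else "FAILED"))
        if err is not None:
            print(err)
```

Launched the same way as the failing job (aprun with the same number of processes), this should show whether the error appears without yt in the picture at all.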

Thanks

Eric
On Sep 5, 2012, at 4:07 PM, Matthew Turk wrote:

> That is weird, Eric.  I think you can possibly disable usetex to get
> rid of the tex.cache file, but I'm not entirely sure.  This might be
> an issue to isolate, by removing yt.mods and importing the individual
> matplotlib items that throw the error in a script, and then raise it
> with matplotlib-users.
> 
> -Matt
> 
> On Wed, Sep 5, 2012 at 3:59 PM, Eric Hallman <hallman at txcorp.com> wrote:
>> Matt,
>>  it's weird, but in serial or in parallel on a single node I see no
>> problems.  I run interactively on one node and it skips right through this
>> part.  I don't get it.
>> 
>> 
>> On Sep 5, 2012, at 3:55 PM, Matthew Turk wrote:
>> 
>> Hi Eric,
>> 
>> The tex.cache issue seems to be the big one here.  Can you try, in
>> serial, launching a single job that imports yt?  I think it just needs
>> to be bootstrapped once.
>> 
>> -Matt
>> 
>> On Wed, Sep 5, 2012 at 3:50 PM, Eric Hallman <hallman at txcorp.com> wrote:
>> 
>> Well the earlier traceback I posted below is a good start.  Seriously if I
>> post the current error list it's going to be 10M of text.  I'll see what I
>> can come up with in the short term and we can try later on IRC or something.
>> 
>> 
>> Thanks
>> 
>> 
>> Eric
>> 
>> 
>> On Sep 5, 2012, at 3:45 PM, Nathan Goldbaum wrote:
>> 
>> 
>> I'm sorry you're having so much trouble.  Unfortunately I'm probably not the
>> best person to advise since I've never run jobs on Kraken.  Others on the
>> list might be more helpful.
>> 
>> One thing that would aid tracking down the problem is if you could paste the
>> errors you're seeing somewhere so that one of us can take a look at it in
>> detail.
>> 
>> 
>> Cheers,
>> 
>> 
>> Nathan
>> 
>> 
>> On Sep 5, 2012, at 12:42 PM, Eric Hallman wrote:
>> 
>> 
>> Nathan,
>> 
>> using pmods in the import leads to an error explosion in the matplotlib
>> imports.  It's even worse than the original test.
>> 
>> 
>> Eric
>> 
>> On Sep 5, 2012, at 12:42 PM, Nathan Goldbaum wrote:
>> 
>> 
>> Hi Eric,
>> 
>> 
>> Exactly right.  This is a drop-in replacement for yt.mods on high-latency
>> parallel filesystems (like Kraken, unfortunately).
>> 
>> There's some discussion on the dev mailing list:
>> 
>> http://lists.spacepope.org/htdig.cgi/yt-dev-spacepope.org/2012-January/001760.html
>> 
>> Unfortunately this isn't covered in the docs (except for a note in the
>> changelog) but it should be in there.
>> 
>> 
>> Cheers,
>> 
>> 
>> Nathan
>> 
>> 
>> On Sep 5, 2012, at 9:40 AM, Eric Hallman wrote:
>> 
>> 
>> Nathan,
>> 
>> I've been off yt for a while, so I'm unaware of pmods.  It's specific to
>> parallel, I'm guessing?
>> 
>> 
>> Eric
>> 
>> On Sep 5, 2012, at 12:38 PM, Nathan Goldbaum wrote:
>> 
>> 
>> Hi Eric,
>> 
>> 
>> Have you tried "from yt.pmods import *" instead of the normal yt.mods?
>> 
>> 
>> Cheers,
>> 
>> 
>> Nathan
>> 
>> 
>> On Sep 5, 2012, at 9:37 AM, Eric Hallman wrote:
>> 
>> 
>> Hey everyone,
>> 
>> 
>> so this issue seems like one I've had before, but I searched the lists and
>> don't find this exact issue.
>> 
>> On batch jobs on kraken, attempting to do halo finding, I get an almost
>> immediate crash (with an eternal hang until the time limit is reached) due
>> to matplotlib.  I've been unable to reproduce it in the interactive queue on
>> kraken, which is frustrating.  I'm hoping someone has seen it and can
>> comment.
>> 
>> 
>> 
>> This is with yt/dev on kraken, and I set the MPLCONFIGDIR environment
>> variable like so:
>> 
>> export MPLCONFIGDIR=${PBS_O_WORKDIR}/.matplotlib/
>> [ ! -d ${MPLCONFIGDIR} ] && mkdir ${MPLCONFIGDIR}
>> 
>> because if you don't, it fails immediately with permission issues.
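
A possible variation on the MPLCONFIGDIR setting above, sketched under assumptions and not something tried in this thread: give every process its own config directory, so that no two processes share a tex.cache path. The helper name use_private_mplconfigdir is hypothetical. The base directory comes from PBS_O_WORKDIR as in the job script, with the system temp directory as a fallback. It has to run before anything imports matplotlib.

```python
import os
import tempfile


def use_private_mplconfigdir(base):
    """Point MPLCONFIGDIR at a fresh directory used only by this process."""
    # mkdtemp picks a unique name, so concurrent callers never collide.
    private = tempfile.mkdtemp(prefix="mplconfig-", dir=base)
    os.environ["MPLCONFIGDIR"] = private
    return private


# Fall back to the system temp dir when not running under PBS.
base = os.environ.get("PBS_O_WORKDIR", tempfile.gettempdir())
config_dir = use_private_mplconfigdir(base)

# Only after this point would the script do: from yt.mods import *
```

The cost is one leftover directory per process per run, so some cleanup at the end of the job would be needed.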
>> 
>> 
>> 
>> Anyway, it's a simple script and call
>> 
>> aprun -n 12 python halo_finding.py --parallel
>> 
>> but the details of the script are not too important, since the job fails
>> when yt is imported, like so:
>> 
>> 
>> 
>> Traceback (most recent call last):
>>   File "halo_finding.py", line 1, in <module>
>>     from yt.mods import *
>>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/yt-2.4dev-py2.7-linux-x86_64.egg/yt/mods.py", line 115, in <module>
>>     from yt.visualization.api import \
>>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/yt-2.4dev-py2.7-linux-x86_64.egg/yt/visualization/api.py", line 34, in <module>
>>     from plot_collection import \
>>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/yt-2.4dev-py2.7-linux-x86_64.egg/yt/visualization/plot_collection.py", line 26, in <module>
>>     from matplotlib import figure
>>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/matplotlib/figure.py", line 18, in <module>
>>     from axes import Axes, SubplotBase, subplot_class_factory
>>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/matplotlib/axes.py", line 18, in <module>
>>     import matplotlib.contour as mcontour
>>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/matplotlib/contour.py", line 21, in <module>
>>     import matplotlib.texmanager as texmanager
>>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/matplotlib/texmanager.py", line 72, in <module>
>>     class TexManager:
>>   File "/lustre/scratch/proj/sw/yt/dev/lib/python2.7/site-packages/matplotlib/texmanager.py", line 92, in TexManager
>>     os.mkdir(texcache)
>> OSError: [Errno 17] File exists: '/lustre/scratch/hallman/gigaCubes/run1024/.matplotlib/tex.cache'
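
Errno 17 (EEXIST) from os.mkdir means the path already existed at the moment of the call. One possible reading, not confirmed anywhere in this thread, is a race: several processes start together, each finds tex.cache missing, and all but one then fail in os.mkdir. The sketch below shows a create-if-missing helper that tolerates losing that race. The name ensure_dir is hypothetical; it is not matplotlib's code and not a fix that was applied here.

```python
import errno
import os
import tempfile


def ensure_dir(path):
    """Create path if needed; accept that another process may create it first."""
    try:
        os.mkdir(path)
    except OSError as exc:
        # EEXIST is harmless as long as what exists is really a directory.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
    return path


# Demonstration in a scratch directory (stand-in for the real config dir).
scratch = tempfile.mkdtemp()
cache = os.path.join(scratch, "tex.cache")
ensure_dir(cache)
ensure_dir(cache)  # the second call finds the directory and does not raise
```

Attempting the mkdir and handling EEXIST closes the window that exists between a separate existence check and the mkdir call.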
>> 
>> 
>> 
>> In each case, I have deleted tex.cache before I restart, thinking an old
>> version persisted there, but the same error happens.  The most irritating
>> thing is that the job does not kick out of the batch system, so the time
>> continues to run on however many processors you have until the limit is
>> reached (eternal hang!).
>> 
>> I hope this is something obvious and I'm just dumb.  Let me know.
>> 
>> 
>> 
>> Eric
>> 
>> 
>> --
>> 
>> 
>> Eric Hallman
>> 
>> 
>> Tech-X Corporation               hallman at txcorp.com
>> 
>> 
>> 5621 Arapahoe Ave, Suite A       Phone: (720) 254-5833
>> 
>> 
>> Boulder, CO 80303                Fax:   (303) 448-7756
>> 
>> 
>> --
>> 
>> 
>> 
>> 
>> 
>> 
>> _______________________________________________
>> 
>> 
>> yt-users mailing list
>> 
>> 
>> yt-users at lists.spacepope.org
>> 
>> 
>> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
>> 

-- 
Eric Hallman
Tech-X Corporation               hallman at txcorp.com
5621 Arapahoe Ave, Suite A       Phone: (720) 254-5833
Boulder, CO 80303                Fax:   (303) 448-7756
--



