[yt-users] error while trying to find abundances through checkpoint files

Nathan Goldbaum nathan12343 at gmail.com
Tue Oct 18 13:08:33 PDT 2016


Hi Tazkera,

When I tried googling your error message last night, I found that it's
associated with more than one MPI process trying to access an HDF5 file at
the same time, presumably using a serial version of the HDF5 library.

I just did a quick test and I'm unable to reproduce the error on my laptop.
Unfortunately, I don't have access to Stampede, so I can't try it there.

Can you share exactly which h5py and HDF5 library versions you're using?
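
One way to collect that information is sketched below. It is only a minimal
example, not something taken from your script: the function name
library_versions is made up for illustration, and it assumes h5py is
importable from the same Python environment that your job uses. It reports
the h5py version, the HDF5 version h5py was built against, and whether that
build has MPI support enabled.

```python
import sys


def library_versions():
    """Collect version strings that are useful in a bug report.

    Hypothetical helper for illustration; it is not part of yt or h5py.
    """
    info = {"python": sys.version.split()[0]}
    try:
        import h5py
    except ImportError:
        # h5py is not importable from this environment.
        info["h5py"] = None
        return info
    info["h5py"] = h5py.version.version
    # Version of the HDF5 C library that h5py was built against.
    info["hdf5"] = h5py.version.hdf5_version
    # True only if h5py was built with MPI (parallel HDF5) support.
    info["h5py_mpi"] = bool(h5py.get_config().mpi)
    return info


if __name__ == "__main__":
    for name, value in sorted(library_versions().items()):
        print("%s: %s" % (name, value))
```

Running it from inside the batch job, rather than on a login node, should show
the libraries that the MPI processes actually pick up.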

-Nathan

On Tue, Oct 18, 2016 at 3:05 PM, tazkera haque <h.tazkera at gmail.com> wrote:

> Hi Nathan,
>
> Sorry to bother you again, but the problem persists even when working
> from the $WORK directory. I got the same error message again this morning
> with a different script. Do you see anything wrong with the script I
> attached? I have used it for a long time now without any errors.
>
> Thanks
>
> On Tue, Oct 18, 2016 at 2:34 AM, tazkera haque <h.tazkera at gmail.com>
> wrote:
>
>> Hi Nathan,
>>
>> I figured out what was going wrong: I submitted my job script from the
>> $SCRATCH folder. Apparently jobs can only be submitted from the $WORK
>> folder on Stampede. Thanks for your prompt response, though.
>>
>> On Tue, Oct 18, 2016 at 1:22 AM, tazkera haque <h.tazkera at gmail.com>
>> wrote:
>>
>>> Hi Nathan,
>>>
>>> I tried with one file in my IPython notebook, and it seems to work there.
>>>
>>> On Tue, Oct 18, 2016 at 12:54 AM, tazkera haque <h.tazkera at gmail.com>
>>> wrote:
>>>
>>>> Yes, it's being run in parallel. I haven't checked with one core yet;
>>>> I will let you know what happens when I do.
>>>>
>>>> On Tue, Oct 18, 2016 at 12:50 AM, Nathan Goldbaum <
>>>> nathan12343 at gmail.com> wrote:
>>>>
>>>>> Is the script being run in parallel? If so, does it crash if you run
>>>>> it on only one core?
>>>>>
>>>>> Nathan
>>>>>
>>>>>
>>>>> On Monday, October 17, 2016, tazkera haque <h.tazkera at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi people,
>>>>>>
>>>>>> I am using yt 3.3.1 and submitting my SLURM script to Stampede.
>>>>>> I was using this script to find abundances of C, O, etc. from
>>>>>> FLASH checkpoint files. My script worked fine with the old yt (3.1),
>>>>>> but today it suddenly crashed and returned the following error:
>>>>>>
>>>>>> yt : [INFO     ] 2016-10-17 23:22:24,295 Parameters: current_time              = 28.1847530806
>>>>>> yt : [INFO     ] 2016-10-17 23:22:24,295 Parameters: domain_dimensions         = [128 128 128]
>>>>>> yt : [INFO     ] 2016-10-17 23:22:24,296 Parameters: domain_left_edge          = [ -2.80000000e+10  -2.80000000e+10  -2.80000000e+10]
>>>>>> yt : [INFO     ] 2016-10-17 23:22:24,296 Parameters: domain_right_edge         = [  2.80000000e+10   2.80000000e+10   2.80000000e+10]
>>>>>> yt : [INFO     ] 2016-10-17 23:22:24,296 Parameters: cosmological_simulation   = 0.0
>>>>>> Executin lessg abundance.py
>>>>>> Traceback (most recent call last):
>>>>>>   File "abundance2.py", line 304, in <module>
>>>>>>     main(chkFilenames_own)
>>>>>>   File "abundance2.py", line 59, in main
>>>>>>     pf = yt.load(filenames[n])
>>>>>>   File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/yt/convenience.py", line 79, in load
>>>>>>     if c._is_valid(*args, **kwargs): candidates.append(n)
>>>>>>   File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/yt/frontends/flash/data_structures.py", line 478, in _is_valid
>>>>>>     if "bounding box" not in fileh["/"].keys() \
>>>>>>   File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/h5py/_hl/base.py", line 368, in keys
>>>>>>     return list(self)
>>>>>>   File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2696)
>>>>>>   File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2654)
>>>>>>   File "/work/03858/thaque56/sw/yt-new-3.3/yt-conda/lib/python2.7/site-packages/h5py/_hl/group.py", line 303, in __len__
>>>>>>     return self.id.get_num_objs()
>>>>>>   File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2696)
>>>>>>   File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2654)
>>>>>>   File "h5py/h5g.pyx", line 321, in h5py.h5g.GroupID.get_num_objs (/home/ilan/minonda/conda-bld/work/h5py/h5g.c:4194)
>>>>>> RuntimeError: Can't determine (Bad symbol table node signature)
>>>>>> [c560-102.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 0, pid: 22137) exited with status 1
>>>>>> TACC: MPI job exited with code: 1
>>>>>>
>>>>>> TACC: Shutdown complete. Exiting.
>>>>>>
>>>>>> I was wondering whether there is something wrong with my code or with
>>>>>> the new yt. I am also attaching my code for you to look at. Thanks in
>>>>>> advance.
>>>>>>
>>>>>> Best
>>>>>> Tazkera
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> yt-users mailing list
>>>>> yt-users at lists.spacepope.org
>>>>> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
>>>>>
>>>>>
>>>>
>>>
>>
>