[yt-users] curious warning message in parallelHOP

Sam Skillman samskillman at gmail.com
Wed May 9 20:43:01 PDT 2012


Just for reference, I've gotten this warning a bunch recently and just
blatantly ignored it.  I'm +0 on changing the default to make it go away,
mostly because of the confusion that it can cause.  I'm not sure if there
are any actual downsides to the current behavior.
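
For anyone skimming the thread, here is a minimal sketch of the fallback
Matt proposes in the quoted message below.  It is an illustration under
stated assumptions, not yt's actual get_memory_usage: the allow_fork
argument is a hypothetical stand-in for the config parameter he mentions,
and using ru_maxrss (peak rather than current memory) is a simplification
chosen to keep the example short.

```python
import os
import subprocess
import sys

try:
    import resource  # not available on every platform (e.g. Windows)
except ImportError:
    resource = None


def get_memory_usage(allow_fork=False):
    """Return this process's memory use in MB, or 0.0 if it is unknown.

    Hypothetical sketch: with allow_fork=False (the default here) no child
    process is ever created, so Open MPI has nothing to warn about.
    """
    if resource is not None:
        # ru_maxrss is the *peak* resident set size,
        # reported in kilobytes on Linux and in bytes on macOS.
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        if peak > 0:
            scale = 1024.0 ** 2 if sys.platform == "darwin" else 1024.0
            return peak / scale
    if allow_fork:
        # Opt-in only: running "ps" creates a child process, which is
        # the kind of fork() call that triggers the Open MPI warning.
        try:
            out = subprocess.check_output(
                ["ps", "-o", "rss=", "-p", str(os.getpid())]
            )
            return float(out.decode().strip()) / 1024.0
        except (OSError, subprocess.CalledProcessError, ValueError):
            pass
    # Default fallback: report 0 rather than fork.
    return 0.0
```

Calling get_memory_usage() with no arguments never forks; a caller that
really wants the ps-based number has to ask for it with allow_fork=True.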

Sam

On Wed, May 9, 2012 at 7:13 PM, Geoffrey So <gsiisg at gmail.com> wrote:

> I think it would be nice to keep track of memory, but better to do it
> consistently across platforms.  If this is causing problems running YT then
> I support killing it, but if it's just an annoying warning message, then I
> don't really care; it can easily be ignored.  I am not sure whether it is
> related to my jobs always dying or getting stuck, though.  Oftentimes it's
> the hardware or file system that is broken and not YT's fault at all.
>
> From
> G.S.
>
>
> On Wed, May 9, 2012 at 5:54 PM, Britton Smith <brittonsmith at gmail.com> wrote:
>
>> I made the change.  It was PR #35.  People were on board with it back
>> when it was accepted, but if it's forking, then I agree we should just get
>> rid of it.  After that was put in, another issue crept up: it was found
>> that the command it calls doesn't even return the same format
>> on every system.  I guess that's the risk you run when making system
>> calls.  I support killing it.
>>
>> Britton
>>
>>
>> On Wed, May 9, 2012 at 8:37 PM, John Wise <jwise at physics.gatech.edu> wrote:
>>
>>> Hi Matt and Geoffrey,
>>>
>>> If you're talking to this John, I didn't make that change.  I think
>>> Britton made that change in changeset ab0e7664f75d.
>>>
>>> But if popen() creates forks, I'd -1 on this being default, also because
>>> it causes problems in some MPI implementations, as Geoffrey has found out
>>> the hard way!
>>>
>>> John
>>>
>>>
>>> On 05/09/2012 08:23 PM, Matthew Turk wrote:
>>>
>>>> Hi Geoffrey,
>>>>
>>>> Grepping through the yt codebase, I think the most likely culprit is
>>>> that get_memory_usage (which I believe gets called often during
>>>> parallel HOP) falls back on calling popen, which results in a fork.  I
>>>> think John might have added this.  I am -1 on this being default
>>>> behavior, and +1 on changing fallback (by default) to just return 0 if
>>>> it can't get memory information from the resource module, and maybe
>>>> governing this with a config parameter.  John, what do you think?
>>>>
>>>> -Matt
>>>>
>>>> On Wed, May 9, 2012 at 7:46 PM, Geoffrey So <gsiisg at gmail.com> wrote:
>>>>
>>>>> I got this warning message when running parallel HOP.  I guess it was
>>>>> there before, but I never noticed it because it's close to the top of
>>>>> the logs and I usually look at the bottom.  Is there anything I need to
>>>>> worry about?
>>>>>
>>>>> --------------------------------------------------------------------------
>>>>> An MPI process has executed an operation involving a call to the
>>>>> "fork()" system call to create a child process.  Open MPI is currently
>>>>> operating in a condition that could result in memory corruption or
>>>>> other system errors; your MPI job may hang, crash, or produce silent
>>>>> data corruption.  The use of fork() (or system() or other calls that
>>>>> create child processes) is strongly discouraged.
>>>>>
>>>>> The process that invoked fork was:
>>>>>
>>>>>   Local host:          lens51 (PID 6818)
>>>>>   MPI_COMM_WORLD rank: 323
>>>>>
>>>>> If you are *absolutely sure* that your application will successfully
>>>>> and correctly survive a call to fork(), you may disable this warning
>>>>> by setting the mpi_warn_on_fork MCA parameter to 0.
>>>>> --------------------------------------------------------------------------
>>>>> [lens6:19012] 447 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
>>>>> [lens6:19012] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>>>> [lens6:19012] 32 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
>>>>> [lens6:19012] 32 more processes have sent help message help-mpi-runtime.txt / mpi_init:warn-fork
>>>>>
>>>>> From
>>>>> G.S.
>>>>>
>>>>> _______________________________________________
>>>>> yt-users mailing list
>>>>> yt-users at lists.spacepope.org
>>>>> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
>>>>>
>>>>
>>>
>>> --
>>> John Wise
>>> Assistant Professor of Physics
>>> Center for Relativistic Astrophysics, Georgia Tech
>>>
>>>
>>
>>
>>
>>
>
>
>