[yt-dev] Fwd: [mpi4py] Fwd: [Numpy-discussion] Improving Python+MPI import performance

Mon Jan 16 10:33:38 PST 2012

Hi Britton,

A side effect of importing yt.pmods is that you also get the
context manager, so you can still do:

from yt.pmods import *
with mpi_import():
    from my_analysis import *
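
If a function form is still wanted, it could probably be a thin wrapper
around the context manager. Below is a minimal sketch, written for
Python 3 and not taken from yt. mpi_import_stub is a hypothetical
stand-in that only records the requested module names and does no MPI
communication (the real mpi_import from MPI_Import.py is what would be
used in practice), and parallel_import is just the name from Britton's
example, not an existing yt function.

```python
import builtins
from contextlib import contextmanager

# Hypothetical stand-in for yt.pmods.mpi_import. It records which modules
# are requested while the block is active; it performs no MPI calls.
imported_names = []


@contextmanager
def mpi_import_stub():
    original_import = builtins.__import__

    def recording_import(name, *args, **kwargs):
        imported_names.append(name)
        return original_import(name, *args, **kwargs)

    builtins.__import__ = recording_import
    try:
        yield
    finally:
        # Always restore the normal import function, even on error.
        builtins.__import__ = original_import


def parallel_import(statement, namespace):
    """Execute an import statement inside the context manager.

    Names bound by the statement land in `namespace`; a script would
    typically pass globals().
    """
    with mpi_import_stub():
        exec(statement, namespace)


# Usage: import into an explicit namespace dictionary.
ns = {}
parallel_import("from json import dumps", ns)
```

The caller has to pass the namespace explicitly because the helper
cannot bind names in the calling script on its own, which is one reason
the with: form may end up being the simpler thing to document.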

-Matt

On Mon, Jan 16, 2012 at 1:31 PM, Britton Smith <brittonsmith at gmail.com> wrote:
> Matt is right about the perils of putting any mpi imports in a try block.
> Systems like Ranger will fail in a way that is not catchable by python when
> trying to import mpi4py and not running in parallel.  I think the yt.pmods
> solution is the best for now.  Since the imports problem really only gets
> serious for more than about 100 cores, I think it's ok to impose some
> additional requirements of understanding on the user if they're going to run
> jobs that large.
> Would it be possible to add some sort of helper function such that you could
> do something like the following in a script:
> from yt.pmods import *
> parallel_import("from my_analysis import *")
>
> That would be helpful.
> Britton
>
>
> On Mon, Jan 16, 2012 at 1:20 PM, Matthew Turk <matthewturk at gmail.com> wrote:
>>
>> Hi all,
>>
>> Okay, seems like there is some confusion.  My response was in
>> reference to Britton's question, which I thought was "are there any
>> side effects of [your fix for recursive imports] on the operation [of
>> the import hack for MPI]?"  There are not.
>>
>> My original statement, which Stephen disagrees with, is that we should
>> require an explicit change on the part of the user before we (on their
>> behalf) fundamentally modify the way the base functionality of
>> 'import' works for all Python modules.  I have several motivations for
>> this:
>>
>>  * The import problem is generic for shared filesystems being accessed
>> in parallel, but is only crippling at relatively large core counts on
>> particularly large Lustre systems, compared to what most users
>> utilize.  This is a -1 for global application of the MPI_Import fix.
>>  * The change of yt.mods to yt.pmods is not an invasive change,
>> although I too do not like having different behavior for running in
>> parallel.  However, we do expect a number of things from users that
>> run in parallel: an understanding of the resources they are to
>> allocate, the setup of the queue script, and a recognition of which
>> activities will parallelize and which will not.  I still think it is
>> not the best solution, but I believe it is non-invasive.
>>  * Every time we add on an additional non-sanitized import, we take a
>> big performance hit.  It is in our best interest for this to all occur
>> at the outermost level.
>>  * Detecting whether we are running in parallel is not trivial.
>>  * On some machines, specifically SGI, if you run a script in parallel
>> that contains a try/except block for importing MPI and it was not
>> launched with MPI, it will die unceremoniously.  We cannot rely on a
>> try/except of importing MPI.
>>
>> All these things combined lead me to believe that we should
>> not attempt to guess *for* the user.
>>
>> My proposed change, of adding yt.pmods, would consist of a new file
>> (yt/pmods.py) that contains the full contents of MPI_Import.py and
>> that, at the end, performs this operation:
>>
>> with mpi_import():
>>    from yt.mods import *
>>
>> What this would result in is a nearly self-contained script that
>> returned to the user the contents of yt.mods; there'd be no
>> duplication.  An alternate solution, which I am not terribly keen on,
>> would be to put manual context startup/shutdown inside yt.mods if
>> startup_tasks.parallel_enabled is true.  I feel like this would result
>> in a lot of unnecessary side effects.  However, with yt.pmods, while
>> the yt imports would all be included, any additional imports would
>> still need sanitation inside the users' script.
>>
>> The fallback would be to require users to use the with: statement
>> themselves.
>>
>> -Matt
>>
>> On Mon, Jan 16, 2012 at 11:08 AM, j s oishi <jsoishi at gmail.com> wrote:
>> >> I am of the opinion we can do this with an alternate, parallel import
>> >> that
>> >> would be compatible with yt.mods. Something like yt.pmods. What do you
>> >> think? What should usage of this be?
>> >
>> > Not sure I understand this. How is yt.pmods compatible with yt.mods?
>> > Do you mean both would have the same effect, but yt.pmods would use
>> > the new mechanism for loading rather than the current standard one
>> > which would be retained by yt.mods? If so, that sounds like an OK idea
>> > to me, though if there is no side effect, it could lead to people
>> > forgetting to sub in pmods, and then being stuck with bad, old
>> > performance. Given that this will be most important where even a
>> > single forgetful job could cost substantial allocation usage, maybe we
>> > should think about making it automatic?
>> >
>> > Or perhaps I am misunderstanding...
>> > _______________________________________________
>> > yt-dev mailing list
>> > yt-dev at lists.spacepope.org
>> > http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org