[yt-users] Parallelism in yt Applied to Large Datasets

Jason Galyardt jason.galyardt at gmail.com
Wed Dec 6 12:17:54 PST 2017


Hi all,

I am currently using dataset parallelization, like so:

import yt
yt.enable_parallelism()

ts = yt.load("/path/mydata_hdf5_plt*")
results = {}
for sto, ds in ts.piter(storage=results):
    # processing on each dataset using:
    #   cut regions
    #   center of mass calculations
    #   weighted averages
    #   projections
    sto.result_id = str(ds)
    sto.result = None  # placeholder for the per-dataset quantities above

if yt.is_root():
    # process aggregate results collected in `results`
    pass

I'll try Britton's suggestion of specifying the number of parallel jobs to be
smaller than the total number of processors available, so that each dataset is
handled by a work group of several processors.
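
For reference, here is a rough, untested sketch of what I'm planning. If I
understand the docs correctly, the parallel keyword of DatasetSeries controls
the number of work groups; the value 4, the field names, and the way I store
results below are just placeholders:

import yt
yt.enable_parallelism()

# With, say, 16 MPI ranks and parallel=4, each dataset should be handled
# by a work group of 4 ranks.
ts = yt.DatasetSeries("/path/mydata_hdf5_plt*", parallel=4)
results = {}
for sto, ds in ts.piter(storage=results):
    ad = ds.all_data()
    com = ad.quantities.center_of_mass(use_gas=True)
    avg_temp = ad.quantities.weighted_average_quantity(
        ("gas", "temperature"), ("gas", "cell_mass"))
    sto.result_id = str(ds)
    sto.result = (com, avg_temp)

if yt.is_root():
    # results maps each result_id to the tuple stored above
    for name, (com, avg_temp) in sorted(results.items()):
        print("%s %s %s" % (name, com, avg_temp))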

Also, it may or may not be relevant, but the cluster I'm using does not
support the MPICH2 build that conda installed as a dependency of mpi4py,
because that build lacks InfiniBand support. I have removed the mpi4py and
mpich2 conda packages and reinstalled mpi4py against the OpenMPI
implementation:

conda remove mpi4py
conda remove mpich2
conda install -c mpi4py mpi4py

To check that openmpi is now installed, you can do the following (an
asterisk will appear next to any installed packages):

conda search -c mpi4py mpi4py
conda search -c mpi4py openmpi

On my cluster, these commands show that I have the following packages
installed:

mpi4py
*  2.0.0            py27_openmpi_2  mpi4py

openmpi
*  1.10.2                        1  mpi4py
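
Another sanity check that might help (assuming mpi4py imports cleanly in the
job environment and the MPI build is recent enough to provide
MPI_Get_library_version) is asking mpi4py which MPI library it was actually
built against:

python -c "from mpi4py import MPI; print(MPI.Get_library_version())"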


Is it possible that this is causing my RAM consumption problems?

Thanks for the help,
Jason



On Wed, Dec 6, 2017 at 2:45 PM, <yt-users-request at lists.spacepope.org>
wrote:

> Today's Topics:
>
>    1. Parallelism in yt Applied to Large Datasets (Jason Galyardt)
>    2. Re: Parallelism in yt Applied to Large Datasets (Nathan Goldbaum)
>    3. Re: Parallelism in yt Applied to Large Datasets (Scott Feister)
>    4. Re: Parallelism in yt Applied to Large Datasets (Britton Smith)
>
>
> ---------- Forwarded message ----------
> From: Jason Galyardt <jason.galyardt at gmail.com>
> To: Discussion of the yt analysis package <yt-users at lists.spacepope.org>
> Date: Wed, 6 Dec 2017 09:08:29 -0500
> Subject: [yt-users] Parallelism in yt Applied to Large Datasets
> Hi yt Folks,
>
> I've written a script that uses a yt DatasetSeries object to analyze a
> time series dataset generated by FLASH. It worked beautifully, until I
> tried to run it on a new cluster with significantly larger HDF5 files (4 GB
> to greater than 8 GB per file). Now, while running the script, the RAM
> usage just grows and grows until the OS kills the job.
>
> It seems to me that I need to use domain decomposition to process these
> large files. So, my question to the group is this: is it possible to use
> both domain decomposition *and* parallel time series processing in a single
> script? This would require that yt be able to subdivide the available MPI
> processors into a number of work groups, each work group handling a single
> input file.
>
> Cheers,
> Jason
>
> ------
> Jason Galyardt
> University of Georgia
>
>
>
> ---------- Forwarded message ----------
> From: Nathan Goldbaum <nathan12343 at gmail.com>
> To: Discussion of the yt analysis package <yt-users at lists.spacepope.org>
> Date: Wed, 06 Dec 2017 14:25:01 +0000
> Subject: Re: [yt-users] Parallelism in yt Applied to Large Datasets
> That depends on what sort of analysis you are doing. Not all tasks in yt
> are parallel-aware.
>
>
> ---------- Forwarded message ----------
> From: Scott Feister <sfeister at gmail.com>
> To: Discussion of the yt analysis package <yt-users at lists.spacepope.org>
> Date: Wed, 6 Dec 2017 11:30:18 -0800
> Subject: Re: [yt-users] Parallelism in yt Applied to Large Datasets
> Hi Jason,
>
> I don't know how to do both domain and time decomposition in yt, but I
> have been doing time-series analysis in yt of some fairly massive FLASH
> HDF5 outputs (~20 GB each) without a problem. If you'd like to share the
> script with me (you can send to feister at flash.uchicago.edu), I can take a
> look and see if I notice anything particularly wasting RAM. Maybe there's a
> simpler solution than resorting to domain decomposition!
>
> Best,
>
> Scott
>
>
> Scott Feister, Ph.D.
> Postdoctoral Researcher, Flash Center for Computational Science
> University of Chicago, Department of Astronomy and Astrophysics
>
>
> ---------- Forwarded message ----------
> From: Britton Smith <brittonsmith at gmail.com>
> To: Discussion of the yt analysis package <yt-users at lists.spacepope.org>
> Date: Wed, 6 Dec 2017 11:45:10 -0800
> Subject: Re: [yt-users] Parallelism in yt Applied to Large Datasets
> Hi Scott,
>
> yt can do the multi-level parallelism you're talking about, i.e.,
> parallelism over multiple datasets and in the operations on a single
> dataset.  I would start by looking here:
> http://yt-project.org/docs/dev/analyzing/parallel_computation.html#parallelization-over-multiple-objects-and-datasets
>
> Namely, have a look at the use of "piter" when looping over the
> DatasetSeries.  With that function, you can specify the number of jobs (the
> njobs keyword) to be a number less than the total number of processors you
> have available.  This will give you work groups with multiple processors
> for each dataset.  Then, as long as the operations you're trying to do have
> been parallelized, things will just work, i.e., that operation will employ
> all the cores of that work group.
>
> If you need to do some custom parallelization at the dataset level, I also
> suggest having a look at the parallel_objects command:
> http://yt-project.org/docs/dev/analyzing/parallel_computation.html#parallelizing-over-multiple-objects
>
> This has a similar structure to piter, only it is a more general looping
> construct that allows you to split the iterations of a loop across
> separate processors or work groups.  parallel_objects is also nestable, so
> you can have nested loops that break the work down further at each level.
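>
> As a rough, untested sketch (the file names and fields are placeholders),
> the outer loop over datasets might look something like this:
>
> import yt
> yt.enable_parallelism()
>
> fns = ["mydata_hdf5_plt_cnt_0000", "mydata_hdf5_plt_cnt_0001"]
> # njobs=2 splits the available MPI ranks into 2 work groups, one per file.
> for fn in yt.parallel_objects(fns, njobs=2):
>     ds = yt.load(fn)
>     ad = ds.all_data()
>     # Parallel-aware operations below use all the ranks in this work group;
>     # further parallel_objects loops could be nested here to subdivide it.
>     com = ad.quantities.center_of_mass(use_gas=True)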
>
> I hope this helps.  Please, feel free to come back if you have more
> specific questions on parallelizing your analysis.
>
> Britton
>