[yt-users] Parallelism in yt Applied to Large Datasets

Britton Smith brittonsmith at gmail.com
Wed Dec 6 11:45:10 PST 2017


Hi Scott,

yt can do the multi-level parallelism you're talking about, i.e.,
parallelism over multiple datasets and in the operations on a single
dataset.  I would start by looking here:
http://yt-project.org/docs/dev/analyzing/parallel_computation.html#parallelization-over-multiple-objects-and-datasets

Namely, have a look at the use of "piter" when looping over the
DatasetSeries.  With that function, you can set the number of jobs (the
njobs keyword) to a number smaller than the total number of processors
you have available.  This will give you work groups with multiple
processors for each dataset.  Then, as long as the operations you're
trying to do have been parallelized, things will just work, i.e., each
operation will employ all the cores of its work group.

If you need to do some custom parallelization at the dataset level, I also
suggest having a look at the parallel_objects command:
http://yt-project.org/docs/dev/analyzing/parallel_computation.html#parallelizing-over-multiple-objects

This has a similar structure to piter, but it is a more general looping
construct that lets you split the iterations of a loop across separate
processors or work groups.  parallel_objects is also nestable, so you
can have nested loops that keep subdividing the work further.

I hope this helps.  Please feel free to come back if you have more
specific questions on parallelizing your analysis.

Britton



On Wed, Dec 6, 2017 at 11:30 AM, Scott Feister <sfeister at gmail.com> wrote:

> Hi Jason,
>
> I don't know how to do both domain and time decomposition in yt, but I
> have been doing time-series analysis in yt of some fairly massive FLASH
> HDF5 outputs (~20 GB each) without a problem. If you'd like to share the
> script with me (you can send to feister at flash.uchicago.edu), I can take a
> look and see if I notice anything particularly wasting RAM. Maybe there's a
> simpler solution than resorting to domain decomposition!
>
> Best,
>
> Scott
>
>
> Scott Feister, Ph.D.
> Postdoctoral Researcher, Flash Center for Computational Science
> University of Chicago, Department of Astronomy and Astrophysics
>
> On Wed, Dec 6, 2017 at 6:25 AM, Nathan Goldbaum <nathan12343 at gmail.com>
> wrote:
>
>> That depends on what sort of analysis you are doing. Not all tasks in yt
>> are parallel-aware.
>>
>> On Wed, Dec 6, 2017 at 8:08 AM Jason Galyardt <jason.galyardt at gmail.com>
>> wrote:
>>
>>> Hi yt Folks,
>>>
>>> I've written a script that uses a yt DatasetSeries object to analyze a
>>> time series dataset generated by FLASH. It worked beautifully, until I
>>> tried to run it on a new cluster with significantly larger HDF5 files (4 GB
>>> to greater than 8 GB per file). Now, while running the script, the RAM
>>> usage just grows and grows until the OS kills the job.
>>>
>>> It seems to me that I need to use domain decomposition to process these
>>> large files. So, my question to the group is this: is it possible to use
>>> both domain decomposition *and* parallel time series processing in a single
>>> script? This would require that yt be able to subdivide the available MPI
>>> processors into a number of work groups, each work group handling a single
>>> input file.
>>>
>>> Cheers,
>>> Jason
>>>
>>> ------
>>> Jason Galyardt
>>> University of Georgia
>>>
>>> _______________________________________________
>>> yt-users mailing list
>>> yt-users at lists.spacepope.org
>>> http://lists.spacepope.org/listinfo.cgi/yt-users-spacepope.org
>>>
>>
>