[yt-dev] Debugging parallel objects?

Nathan Goldbaum nathan12343 at gmail.com
Wed Oct 3 13:48:25 PDT 2012


Hi all,

I have a time series script that iterates over a bunch of simulation outputs, does some analysis, and then dumps a pickle with the results of the analysis.  This script works correctly when I loop over the outputs like so:

for pf in ts:
	do stuff

But when I use the piter functionality to loop over the outputs in parallel:

for sto,pf in ts.piter(storage=sto):
	do the same stuff

the script sometimes hangs when I run on more than one processor.  It's difficult to find exactly where and why it hangs since the run is distributed, so I'd like to be able to reproduce the error reliably and track down the cause.

I'm curious whether anyone has tips for debugging parallel operations in yt.  I'm not very familiar with the internals of the parallel_objects machinery, so pointers to likely places to check or set breakpoints would be very helpful.  Also, are there parallel debugging tools for Python?
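
To make the question concrete, here is a minimal sketch of the kind of tool that would help: each process periodically writes its own stack trace to a per-process file, so a hang shows up as the same frame repeated in one file. It uses only the standard-library faulthandler module (Python 3.3 and later) and is not part of yt. The helper name enable_hang_diagnostics, the log file naming, and the rank argument are all hypothetical; under MPI the rank would presumably come from the communicator.

```python
import faulthandler
import os


def enable_hang_diagnostics(log_dir, rank, timeout=300):
    """Periodically dump this process's stack traces to a per-rank file.

    `rank` is any integer that identifies the process (hypothetical here;
    under MPI it would typically be the communicator rank).
    Returns the log path and the open file object, which must stay open
    for as long as the dumps are wanted.
    """
    path = os.path.join(log_dir, "hang_rank%04d.log" % rank)
    log = open(path, "w")
    # Every `timeout` seconds, write the traceback of all threads to `log`
    # until cancel_dump_traceback_later() is called.
    faulthandler.dump_traceback_later(timeout, repeat=True, file=log)
    return path, log


def disable_hang_diagnostics(log):
    """Stop the periodic dumps and close the log file."""
    faulthandler.cancel_dump_traceback_later()
    log.close()
```

Calling enable_hang_diagnostics once at the top of the script on every process, before the piter loop, would leave one log file per process. A file whose repeated dumps all stop on the same line would point at where that process is stuck.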

Thanks for your help,

Nathan

