[Yt-dev] quick question on particle IO

Matthew Turk matthewturk at gmail.com
Tue Oct 18 19:59:37 PDT 2011


Geoffrey,

Parallel HOP definitely does not attempt to load all of the particles
simultaneously on all processors.  This is covered in the method
papers for both p-hop and yt, in the yt documentation, in the source
code, and, I believe, on the yt-users mailing list a couple of times
in discussions of estimating resource usage in p-hop.
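
For concreteness, here is a minimal sketch of a parallel HOP run in
the yt 2.x style.  The dataset name is a placeholder, and the script
is meant to be launched under MPI, e.g. "mpirun -np 8 python
find_halos.py --parallel":

    # Minimal sketch of a parallel HOP run (yt 2.x API); the dataset
    # name is a placeholder for your own data.
    from yt.mods import *

    pf = load("RedshiftOutput0035")
    # Each MPI task reads and analyzes only the particles in its own
    # subvolume (plus padding), not the full particle set.
    halos = parallelHF(pf)
    # Writing halo particles to disk is an explicit, separate step.
    halos.dump("MergerHalos")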

The struggles you have been having with Nautilus may in fact be a yt
problem, an application-of-yt problem, a software problem on
Nautilus, or even (if Nautilus is being exposed to an excessive number
of cosmic rays, for instance) a hardware problem.  To debug exactly
what is going on, it would be productive for you to provide us with
the following:

1) What are you attempting to do, precisely?
2) What type of data, and what size of data, are you applying this to?
3) What is the version of yt you are using (changeset hash)?
4) How are you launching yt?
5) What is the memory available to each individual process? (The
sketch after this list shows one way to check #3 and #5 from inside a
run.)
6) Under what circumstances does yt crash?
7) How does yt report this crash to you, and is it deterministic?
8) What have you attempted?  How did it change #6 and #7?
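
As a starting point for #3 and #5, something along these lines, run
under MPI, will report per-task information.  This is only a sketch:
it assumes mpi4py is available and that yt exposes a version string;
for the exact changeset hash, "hg identify" in your yt source
checkout is the reliable route:

    # Hedged sketch: report per-task yt version and peak memory use.
    # yt.__version__ is assumed to hold the release string; use
    # "hg identify" in the yt source checkout for the changeset hash.
    import resource
    import yt
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    # ru_maxrss is reported in kilobytes on Linux.
    peak_mb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0
    print "task %d of %d: yt %s, peak RSS %.1f MB" % \
        (comm.rank, comm.size, yt.__version__, peak_mb)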

We're interested in ensuring that yt functions well on Nautilus, and
that it can successfully find halos, analyze data, and so on.
However, right now it feels like we're being given about 10% of a bug
report, and that is regrettably not enough to properly diagnose and
repair the problem.

Thanks,

Matt

On Tue, Oct 18, 2011 at 7:51 PM, Geoffrey So <gsiisg at gmail.com> wrote:
> Ah yes, I think that answers our question.
> We were worried that all the particles were being read in by each
> processor (which I told him I didn't think was the case, or it would
> have crashed my smaller 800^3 cube long ago), but I wanted to get the
> answer from the pros.
> Thanks!
> From
> G.S.
>
> On Tue, Oct 18, 2011 at 4:21 PM, Stephen Skory <s at skory.us> wrote:
>>
>> Geoffrey,
>>
>> > "Is the particle IO in YT that calls h5py spawned by multiple processors
>> > or is it doing it serially?"
>>
>> For your purposes, h5py is only used to *write* particle data to disk
>> after the halos have been found (if you are saving them to disk, which
>> you must do explicitly, of course). In that case, it opens one file
>> per MPI task using h5py.
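
To make that concrete, a minimal sketch of the one-file-per-MPI-task
write pattern, assuming mpi4py and h5py; the file name and field below
are placeholders, not yt's actual halo output layout:

    # Sketch of the one-file-per-MPI-task write pattern; the file name
    # and dataset are placeholders, not yt's actual output format.
    import numpy as np
    import h5py
    from mpi4py import MPI

    rank = MPI.COMM_WORLD.rank
    # Each task writes only the halo particles it owns, so no two
    # tasks ever touch the same HDF5 file.
    with h5py.File("MergerHalos_%04d.h5" % rank, "w") as f:
        f.create_dataset("particle_position_x", data=np.random.random(100))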
>>
>> I'm guessing that they're actually concerned about reading particle
>> data, because that is more disk intensive. Reading is done with
>> functions written in C, not h5py. Here each MPI task does its own
>> reading, and may open multiple files to retrieve the particle data it
>> needs, depending on the layout of grids in the .cpuNNNN files.
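
Schematically, the read side looks like the sketch below.  The grid
records and raw byte reads are hypothetical stand-ins for yt's C
reader functions; the point is only the per-task, group-by-file access
pattern:

    # Schematic of per-task particle reading: each MPI task opens only
    # the .cpuNNNN files that hold grids in its subvolume.  The grid
    # dicts and raw reads stand in for yt's actual C readers.
    from collections import defaultdict

    def read_particles_for_task(my_grids):
        # Group this task's grids by the file that stores them, so each
        # .cpuNNNN file is opened once rather than once per grid.
        by_file = defaultdict(list)
        for grid in my_grids:
            by_file[grid["filename"]].append(grid)
        chunks = []
        for fn, grids in sorted(by_file.items()):
            with open(fn, "rb") as f:
                for g in grids:
                    f.seek(g["offset"])
                    chunks.append(f.read(g["nbytes"]))
        return chunks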
>>
>> Does that help?
>>
>> --
>> Stephen Skory
>> s at skory.us
>> http://stephenskory.com/
>> 510.621.3687 (google voice)