[Yt-dev] 1024^3 amr hop problem

Stephen Skory stephenskory at yahoo.com
Thu Apr 30 08:44:16 PDT 2009


Matt,

> I think this is the right location for this -- I believe it's a bug,
> and we should be addressing it as such.  Are you using the old version
> of HOP, without the patch I constructed to do single-array addressing
> of particles?

I am not, but it shouldn't matter. Going by the output logs, it's crashing after the first read of particles for the unpadded regions, but before HOP gets called by any thread. It's crashing when it's reading in the particle_position_* fields for the first time.

> Can you tell me a bit more about the results of running top?  How much
> free memory was there?  Is there any reason to believe that the
> distribution of particles would be enough to compensate for this and
> blow out the ram on another node? 

I ran top during a run with 64 threads, one thread per node. The python process itself maxed out at nearly 20% of the machine before it crashed. The memory-used line for the whole node showed quite a bit more in use, nearly half the node. I'd be surprised if an uneven distribution of particles was blowing out another node, but I'm not certain. I've also run up to 256 threads, two per node, which would give each process 2x the memory compared with the run I watched with top above, and I got the same error message at about the same place.
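
For a rough sense of scale, here is a back-of-envelope sketch of how much memory the particle_position_* fields alone would need per task. It assumes 1024^3 particles, float64 positions, and a perfectly even split across tasks; none of those numbers come from the logs, and the function name is only for illustration.

```python
# Back-of-envelope estimate only. Particle count and dtype are assumptions,
# not values taken from the run described above.
N_PARTICLES = 1024**3   # assumed: one particle per root-grid cell
BYTES_PER_VALUE = 8     # assumed: float64 positions
N_FIELDS = 3            # particle_position_x, _y, _z


def position_bytes_per_task(n_tasks, n_particles=N_PARTICLES):
    """Bytes needed for the three position fields with an even split."""
    return n_particles * N_FIELDS * BYTES_PER_VALUE / n_tasks


for n_tasks in (64, 256):
    mib = position_bytes_per_task(n_tasks) / 2**20
    print(f"{n_tasks:4d} tasks: {mib:.0f} MiB per task")
```

Under those assumptions the position fields come to 384 MiB per task at 64 tasks and 96 MiB at 256. This ignores padding, temporary copies, and every other field, so it is a lower bound and does not by itself explain the crash.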

> Please let us know what happens on Kraken.  This is unacceptable and
> we need to fix it.

I'll let you know.

 _______________________________________________________
sskory at physics.ucsd.edu           o__  Stephen Skory
http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student
________________________________(_)_\(_)_______________


