[Yt-dev] 1024^3 amr hop problem

Matthew Turk matthewturk at gmail.com
Thu Apr 30 08:48:38 PDT 2009


> I am not, but it shouldn't matter. Following the output logs, it's crashing after the first read of particles for the unpadded regions, but before HOP gets called by any thread. It's crashing when it's reading in the particle_position_* fields for the first time.

I'd say run a test problem.  Write a script that partitions the
hierarchy.  Read in a single position field.  Then, copy it a few
times (my_array.copy()) to see if it dies.  I have never seen the
error you are seeing, which is why I am inclined to think it's a
memory issue.  But I'm not sure.
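
A minimal sketch of the copy test, in case it helps.  The random array
is only a stand-in for a single position field read from one partition
of the hierarchy, and copy_until_failure and n_particles are
hypothetical names chosen for illustration, not anything in yt.

```python
import numpy as np


def copy_until_failure(field, n_copies=4):
    """Copy `field` up to n_copies times, reporting memory held after each copy."""
    copies = []
    for i in range(n_copies):
        try:
            copies.append(field.copy())
        except MemoryError:
            # Flush so the message is not lost if the process dies right after.
            print(f"copy {i + 1} failed with MemoryError", flush=True)
            break
        # Original array plus every copy made so far, in MB.
        held_mb = (len(copies) + 1) * field.nbytes / 1024.0**2
        print(f"copy {i + 1} ok, about {held_mb:.1f} MB held", flush=True)
    return copies


# Stand-in for one particle position field. In the real test this would be
# the array read from a single partition of the hierarchy.
n_particles = 1_000_000  # hypothetical; scale toward the per-process count
particle_position_x = np.random.random(n_particles)

copies = copy_until_failure(particle_position_x, n_copies=4)
```

If this runs cleanly at the real per-process particle count, memory
pressure from the read itself becomes a less likely explanation.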

> I ran top when I was doing 1 thread per node with 64 threads. The python process itself maxed out at nearly 20% of the machine before it crashed. The "memory used" line for the whole node was showing quite a bit more used, nearly 1/2 the node. I'd be surprised if the uneven distribution of particles was blowing out another node, but I'm not certain. I've run up to 256 threads, two per node, which would give each process 2x the memory as the time I watched with top above, and I got the same error message at about the same place.

Okay.  That's very good to know.  So maybe it's not a memory issue --
but the specific error is unclear to me.  Since we can localize it to
that point, maybe you should add a dir() call on the object as
well as some print statements to see how far it gets.
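
Something along these lines would do for the instrumentation.  It is
only a sketch: checkpoint and FakeRegion are hypothetical names, and
FakeRegion stands in for whatever object is doing the read.  The
output is flushed on every call because buffered output can be lost
when a process crashes.

```python
def checkpoint(label, obj=None, rank=0):
    """Print a flushed progress marker and, optionally, an object's public attributes."""
    print(f"[rank {rank}] reached: {label}", flush=True)
    names = []
    if obj is not None:
        # dir() lists attribute names; drop the private and dunder ones.
        names = [name for name in dir(obj) if not name.startswith("_")]
        print(f"[rank {rank}] {type(obj).__name__} attributes: {names}", flush=True)
    return names


class FakeRegion:
    """Hypothetical stand-in for the object being inspected."""

    def __init__(self):
        self.left_edge = (0.0, 0.0, 0.0)
        self.right_edge = (1.0, 1.0, 1.0)


region = FakeRegion()
checkpoint("before particle read", region)
# ... the particle_position_* read would go here ...
names = checkpoint("after particle read", region)
```

The last marker printed before the crash shows how far each process
got, and the attribute list shows what the object actually held at
that point.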

-Matt


