[yt-dev] A set of benchmarks
Matthew Turk
matthewturk at gmail.com
Fri May 6 08:58:10 PDT 2016
Hi Jonah,
Thanks for supplying these. I've reviewed them but am still processing.
On Tue, May 3, 2016 at 10:48 PM, Jonah Miller <
jonah.maxwell.miller at gmail.com> wrote:
> Hi yt-dev,
>
> I am developing a frontend for Einstein Toolkit for yt and in the process
> I generated some crude, preliminary benchmarks which I thought I would
> share in case anybody is interested.
>
> I performed three tests:
>
> 1. I just load the dataset and calculate (say) the maximum of some
> quantity on each grid.
> 2. I load in the dataset and calculate the maximum of the magnitude of
> a gradient on each grid. This requires the generation of ghost zones at
> grid boundaries.
> 3. I load in the dataset and perform a volume rendering. I make a
> "movie" with 4 frames where I rotate around the volume.
>
> Salient details:
>
> - I performed these tests with the attached scripts. I turned OpenMP
> and MPI off (set OMP_NUM_THREADS=1).
> - The domain sizes were 64^3, 128^3, 256^3, and 512^3.
> - All four datasets have only one refinement level. However, the 512^3
> dataset has multiple grids on that refinement level.
> - The size of a dataset ranges from about 10 MB for the 64^3 set to
> 10 GB for the 512^3 set.
> - All tests were run on a single modern workstation: 2.6 GHz clock
> with AVX2 instructions and a 60 MB L3 cache.
>
> Results:
>
> - Reading in the data is (for the datasets I tested) extremely fast.
> The entire 512^3 dataset takes only about 20 seconds to read in.
>
>
That's good news!
>
> - Generating the first frame in a volume render is extremely slow: on
> the order of 4 minutes for the 512^3 dataset. After the first frame is
> produced, new frames are fast, even with OpenMP off. With one OpenMP
> thread, a new frame takes on the order of tens of seconds.
>
>
That sounds about right; my guess is that this is the kd-tree construction,
which also generates the vertex-centered data.
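For reference, vertex-centering builds an (N+1)^3 array from an N^3 grid by averaging the 2^3 cell-centered values surrounding each vertex, so every cell is touched eight times per field. A minimal numpy sketch of that averaging (not yt's actual implementation, just an illustration of the cost):

```python
import numpy as np

def vertex_center(cc):
    """Average the 8 cell-centered values around each interior vertex.

    cc : (nx, ny, nz) cell-centered array (ghost zones assumed already
    filled, so the grid edges are covered).
    Returns an (nx-1, ny-1, nz-1) vertex-centered array.
    """
    vc = np.zeros(tuple(s - 1 for s in cc.shape))
    nx, ny, nz = vc.shape
    # One shifted add per corner of the surrounding 2x2x2 cell block.
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                vc += cc[dx:dx + nx, dy:dy + ny, dz:dz + nz]
    return vc / 8.0
```

Doing this once per field, per grid, up front is consistent with an expensive first frame followed by cheap subsequent frames.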
>
> - Generating ghost zones is very fast for datasets with only one grid.
> It is incredibly slow for datasets with multiple grids, dominating the run
> time.
>
>
Sounds about right.
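For context, the multi-grid slowdown is plausible because filling ghost zones means, for each grid, finding every other grid that overlaps its padded bounding box and copying data out of the overlap. A toy 1D sketch of that lookup, with hypothetical names (the real machinery also handles interpolation across refinement levels):

```python
import numpy as np

def fill_ghost_zones(grids, starts, n_ghost=1):
    """Pad each 1D grid with n_ghost cells copied from its neighbors.

    grids  : list of 1D numpy arrays laid out on a shared global index line
    starts : global starting index of each grid
    Ghost cells with no neighboring grid keep a zero fill (a stand-in
    for boundary conditions).
    """
    padded = []
    for g, s in zip(grids, starts):
        out = np.zeros(g.size + 2 * n_ghost)
        out[n_ghost:-n_ghost] = g
        # Search every other grid for overlap with the padded region;
        # this all-pairs search is what grows costly with many grids.
        for h, t in zip(grids, starts):
            if h is g:
                continue
            lo = max(s - n_ghost, t)
            hi = min(s + g.size + n_ghost, t + h.size)
            if lo < hi:
                out[lo - (s - n_ghost):hi - (s - n_ghost)] = h[lo - t:hi - t]
        padded.append(out)
    return padded
```

With one grid there are no neighbors to search, which matches the observation that the single-grid case is fast.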
>
> - I attach a plot comparing the three tests.
>
> Naively, it seems to me that there must be several sources of overhead
> when I perform volume rendering that are significantly more costly than
> simply reading in the data (which seems to be fast). Clearly one of these
> is the generation of the ghost zones, and another is the actual ray tracing
> (although the ray tracing itself seems to be quite fast). However, I'm not
> sure that these two operations alone explain the cost of the volume
> rendering.
>
I think that the cost is likely dominated by the vertex-centering. It may
be possible to overload get_vertex_centered_data for your subclass of
AMRGridPatch to make this faster based on your data format. I am not sure
that the ray casting itself is a huge cost; there are likely Python
operations, like writing the image, that make up a non-negligible
fraction of that time.
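As a rough sketch of what such an overload might look like: the class below is a self-contained stand-in, not yt's real grid class, and the one-argument signature is simplified (check the actual method in yt's grid patch source before copying it). The point is just that caching the vertex-centered result per field lets frames 2..N skip the expensive averaging:

```python
import numpy as np

class EinsteinToolkitGrid:
    """Stand-in for a frontend's AMRGridPatch subclass (hypothetical)."""

    def __init__(self, data):
        self._data = data       # dict of cell-centered fields, ghosts included
        self._vc_cache = {}     # field name -> vertex-centered array

    def get_vertex_centered_data(self, field):
        """Return vertex-centered data, cached so repeated renders
        skip the averaging step after the first frame."""
        if field not in self._vc_cache:
            cc = self._data[field]
            nx, ny, nz = (s - 1 for s in cc.shape)
            vc = np.zeros((nx, ny, nz))
            for dx in (0, 1):
                for dy in (0, 1):
                    for dz in (0, 1):
                        vc += cc[dx:dx + nx, dy:dy + ny, dz:dz + nz]
            self._vc_cache[field] = vc / 8.0
        return self._vc_cache[field]
```

If the Einstein Toolkit format already stores vertex data, or stores ghost zones on disk, the overload could read those directly instead of recomputing them.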
> I'd love to know other people's experience with benchmarking. Do these
> costs seem normal, up to an order of magnitude or so? Would you have any
> insight into what contributes to the cost of generating the first frame
> when volume rendering?
>
> Thanks very much!
>
> Best,
> Jonah Miller
>
> _______________________________________________
> yt-dev mailing list
> yt-dev at lists.spacepope.org
> http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org
>
>