[yt-users] h5py execution time

Stephen Skory stephenskory at yahoo.com
Fri Aug 13 16:02:38 PDT 2010


JC,

> In my experience, interpreted languages are slow with loops.  If you use 
> the array syntax in numpy, all of the loops are done in C.  The same 
> thing happens in IDL, as well, where you dearly try to avoid loops. 
> This page helped me long ago in IDL...


John is right about interpreted languages. To expand on the point about Numpy 
and C, if you have a Numpy array (which is what h5py outputs when you read data 
out of a file), and you operate on it using Numpy functions, the work is done in 
compiled C code rather than in the Python interpreter, which makes it much 
faster. Here is some example code that does the same thing twice, once with a 
Python loop and once with Numpy:

----------

import time, math
import numpy as np

n = 10**7  # array length; an integer, since newer Numpy rejects the float 1e7
a = np.random.random(n).astype('float64')
b = np.empty(n, dtype='float64')
c = np.empty(n, dtype='float64')

# Square each element one at a time in a Python loop.
t0 = time.time()
for i, item in enumerate(a):
    b[i] = math.pow(item, 2.)
t1 = time.time()
print('Python loop takes %f seconds' % (t1-t0))

# Square the whole array in a single Numpy call, writing the result into c.
t0 = time.time()
np.power(a, 2., c)
t1 = time.time()
print('Numpy C loop takes %f seconds' % (t1-t0))

print('The arrays are the same? %s' % (b == c).all())

-----------

Note that I had to enforce 64-bit floats so that the answers match up. I get 
this when I run it:

python timeme.py
Python loop takes 9.354742 seconds
Numpy C loop takes 0.584433 seconds
The arrays are the same? True

I think the advantages are obvious: the Numpy version is about 16 times faster 
here. You had a triple loop in your original code, so the slowdown from looping 
in Python is likely even worse than this.
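
As a rough sketch of how a triple loop can collapse into array syntax, here is 
a made-up example. This is not your code: the array shape and the arithmetic 
(multiply by 2, add 1) are assumptions chosen only for illustration, and the 
random array stands in for whatever you would read out of the file with h5py.

```python
import numpy as np

# Hypothetical stand-in for a 3D dataset read with h5py; the shape is made up.
nx, ny, nz = 20, 30, 40
data = np.random.random((nx, ny, nz))

# Triple Python loop: every element passes through the interpreter.
looped = np.empty_like(data)
for i in range(nx):
    for j in range(ny):
        for k in range(nz):
            looped[i, j, k] = data[i, j, k] * 2.0 + 1.0

# The same operation in array syntax: all three loops run in compiled C.
vectorized = data * 2.0 + 1.0

print('The arrays are the same? %s' % np.array_equal(looped, vectorized))
```

If your loop body only depends on the element at [i, j, k], the array version 
is usually a drop-in replacement. If it depends on neighboring elements, you 
would need slicing instead, which this sketch does not cover.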

I hope this helps!

 _______________________________________________________
sskory at physics.ucsd.edu o__ Stephen Skory
http://physics.ucsd.edu/~sskory/ _.>/ _Graduate Student
________________________________(_)_\(_)_______________



