Performance of the zeros function in NumPy
I just noticed that numpy's zeros function has a strange behavior:
%timeit np.zeros((1000, 1000))
1.06 ms ± 29.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit np.zeros((5000, 5000))
4 µs ± 66 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
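On the other hand, ones seems to behave normally. Does anyone know why initializing a small numpy array with the zeros function takes more time than initializing a large one?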
(Python 3.5, numpy 1.11)
-
This looks like calloc hitting a threshold above which it requests already-zeroed memory from the operating system and does not need to zero it manually. Looking through the source code, numpy.zeros eventually delegates to calloc to acquire a zeroed block of memory. Compare it with numpy.empty, which performs no initialization:
In [15]: %timeit np.zeros((5000, 5000))
The slowest run took 12.65 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 10 µs per loop
In [16]: %timeit np.empty((5000, 5000))
The slowest run took 5.05 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 10.3 µs per loop
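You can see that np.zeros has no initialization overhead in the 5000x5000 case. In fact, the operating system does not even "really" allocate that memory until you try to access it. A request for terabytes of array succeeds on a machine that has nowhere near terabytes to spare: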
In [23]: x = np.zeros(2**40) # No MemoryError!
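If you want to check where the cost goes on your own machine, here is a rough sketch that times np.zeros alone against np.zeros followed by a write to every element. The helper names (best_time, allocate_only, allocate_and_touch) and the size N are just illustrative choices of mine, not anything from numpy. The numbers will depend on your OS, allocator and numpy version, so treat it as an experiment rather than a benchmark.

```python
import time

import numpy as np

N = 2_000_000  # hypothetical size: about 16 MB of float64


def best_time(fn, repeats=3):
    """Return the smallest wall-clock time (seconds) over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def allocate_only():
    # Only asks for zeroed memory; nothing is written by this function.
    return np.zeros(N)


def allocate_and_touch():
    # Writing every element forces the OS to actually back the pages.
    a = np.zeros(N)
    a += 1.0
    return a


results = {
    "allocate_only": best_time(allocate_only),
    "allocate_and_touch": best_time(allocate_and_touch),
}

for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.1f} µs")
```

If the allocation really is lazy on your system, allocate_and_touch should be where most of the time shows up, because that is the first point at which the memory is actually used.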