Python

Is summing with a for loop faster than with reduce?

Posted on 2021-01-29 16:53:36

I wanted to see how much faster reduce is than a for loop for simple numeric operations. Here is what I found (using the standard timeit library):

In [54]: print(setup)
from operator import add, iadd
r = range(100)

In [55]: print(stmt1)    
c = 0
for i in r:
    c+=i

In [56]: timeit(stmt1, setup)
Out[56]: 8.948904991149902
In [58]: print(stmt3)    
reduce(add, r)

In [59]: timeit(stmt3, setup)
Out[59]: 13.316915035247803

Looking a bit closer:

In [68]: timeit("1+2", setup)
Out[68]: 0.04145693778991699

In [69]: timeit("add(1,2)", setup)
Out[69]: 0.22807812690734863

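What is going on here? Obviously reduce loops faster than for does, but the function call seems to dominate. Shouldn't the reduce version run almost entirely in C? Using iadd(c, i) in the for loop version makes it run in about 24 seconds. Why is operator.add so much slower than +? I was under the impression that + and operator.add run the same C code (I checked to make sure operator.add was not just calling + in Python or anything like that).

By the way, just using sum runs in about 2.3 seconds.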

In [70]: print(sys.version)
2.7.1 (r271:86882M, Nov 30 2010, 09:39:13) 
[GCC 4.0.1 (Apple Inc. build 5494)]
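
For readers on Python 3, the measurements above can be approximated with a sketch like the following. It is an adaptation, not the original session: reduce is imported from functools (it is no longer a builtin), each statement is wrapped in a function (the names loop_sum, reduce_sum and builtin_sum are made up for illustration), and the repetition count of 100,000 is an arbitrary choice, so the absolute numbers will not match the ones quoted above.

```python
from functools import reduce  # reduce is not a builtin in Python 3
from operator import add
from timeit import timeit

r = range(100)


def loop_sum():
    # Plain for loop; the addition is executed inline by the interpreter.
    c = 0
    for i in r:
        c += i
    return c


def reduce_sum():
    # reduce drives the loop in C but calls add() for each step.
    return reduce(add, r)


def builtin_sum():
    # sum() performs both the loop and the addition in C.
    return sum(r)


if __name__ == "__main__":
    for fn in (loop_sum, reduce_sum, builtin_sum):
        seconds = timeit(fn, number=100_000)  # arbitrary repetition count
        print(f"{fn.__name__:12s} {seconds:.3f} s")
```

All three functions compute the same value, so any difference in the timings comes from how the loop and the addition are dispatched.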

1 answer

面试哥 2021-01-29


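reduce(add, r) has to call the add() function once for every element after the first (99 calls for range(100)), so the function call overhead adds up. reduce uses PyEval_CallObject to call add on each iteration: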

for (;;) {
    ...
    if (result == NULL)
        result = op2;
    else {
        /* here it is filling a tuple to pass the previous result and the
           next value from range(100) into func add(): */
        PyTuple_SetItem(args, 0, result);
        PyTuple_SetItem(args, 1, op2);
        if ((result = PyEval_CallObject(func, args)) == NULL)
            goto Fail;
    }
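
The control flow of that C excerpt can be modelled in Python to make the per-element call visible. This is a simplified sketch under stated assumptions, not CPython's implementation: reduce_model, counting_add and call_log are hypothetical names, and a fresh argument tuple is built on every step, whereas the C code fills an existing args tuple.

```python
from operator import add

_MISSING = object()  # sentinel standing in for "result == NULL" in the C code


def reduce_model(func, iterable):
    """Simplified Python model of the reduce loop shown above."""
    result = _MISSING
    for op2 in iterable:
        if result is _MISSING:
            result = op2            # first element seeds the result, no call
        else:
            args = (result, op2)    # argument tuple for this step
            result = func(*args)    # one full function call per element
    if result is _MISSING:
        raise TypeError("reduce_model() of empty iterable")
    return result


call_log = []  # records one entry per call to counting_add


def counting_add(a, b):
    call_log.append((a, b))
    return add(a, b)


total = reduce_model(counting_add, range(100))
print(total, len(call_log))
```

The for loop version pays for none of these calls, because c += i is executed directly by the interpreter.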

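Update: a reply to the question in the comments.

When you type 1 + 2 in Python source code, the bytecode compiler performs the addition itself and replaces the expression with 3: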

f1 = lambda: 1 + 2
c1 = byteplay.Code.from_code(f1.func_code)
print c1.code

1           1 LOAD_CONST           3
            2 RETURN_VALUE
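
The same check can be done without byteplay by using the standard dis module. The sketch below assumes CPython 3; the exact opcode names differ between versions, so it only tests whether any BINARY_* instruction is left in the compiled lambda.

```python
import dis

f1 = lambda: 1 + 2

# Collect the opcode names the compiler produced for the lambda body.
opnames = [ins.opname for ins in dis.get_instructions(f1)]
print(opnames)

# If the compiler folded the constant, no addition is left for run time.
has_binary_op = any(name.startswith("BINARY_") for name in opnames)
print("addition left for run time:", has_binary_op)
```

This is why timing "1+2" says little about the cost of +: the statement being timed only loads a constant.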

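If you add two variables, a + b, the compiler generates bytecode that loads the two variables and performs a BINARY_ADD, which is much faster than calling a function to do the addition: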

f2 = lambda a, b: a + b
c2 = byteplay.Code.from_code(f2.func_code)
print c2.code

1           1 LOAD_FAST            a
            2 LOAD_FAST            b
            3 BINARY_ADD           
            4 RETURN_VALUE
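
A fairer version of the 1+2 versus add(1,2) comparison uses variables, so that the addition cannot be folded at compile time. The following is a sketch assuming CPython 3, where the add instruction is named BINARY_ADD or BINARY_OP depending on the version; the repetition count is arbitrary and no particular timing result is implied.

```python
import dis
from timeit import timeit

f2 = lambda a, b: a + b

# With variables the addition survives compilation as a BINARY_* instruction.
binary_ops = [ins.opname for ins in dis.get_instructions(f2)
              if ins.opname.startswith("BINARY_")]
print(binary_ops)

# Time the inline operator against the function call on the same operands.
setup = "from operator import add; a = 1; b = 2"
t_operator = timeit("a + b", setup, number=200_000)
t_function = timeit("add(a, b)", setup, number=200_000)
print(f"a + b     {t_operator:.4f} s")
print(f"add(a, b) {t_function:.4f} s")
```

Any remaining gap between the two timings reflects the cost of looking up and calling add, not the addition itself.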
