Lambda vs. list comprehension performance
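I recently posted a question that used a lambda function, and in a reply someone mentioned that lambda is going out of favor and that list comprehensions should be used instead. I am relatively new to Python. I ran a simple test: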
import time
S=[x for x in range(1000000)]
T=[y**2 for y in range(300)]
#
#
time1 = time.time()
N=[x for x in S for y in T if x==y]
time2 = time.time()
print 'time diff [x for x in S for y in T if x==y]=', time2-time1
#print N
#
#
time1 = time.time()
N=filter(lambda x:x in S,T)
time2 = time.time()
print 'time diff filter(lambda x:x in S,T)=', time2-time1
#print N
#
#
#http://snipt.net/voyeg3r/python-intersect-lists/
time1 = time.time()
N = [val for val in S if val in T]
time2 = time.time()
print 'time diff [val for val in S if val in T]=', time2-time1
#print N
#
#
time1 = time.time()
N= list(set(S) & set(T))
time2 = time.time()
print 'time diff list(set(S) & set(T))=', time2-time1
#print N #the results will be unordered as compared to the other ways!!!
#
#
time1 = time.time()
N=[]
for x in S:
    for y in T:
        if x==y:
            N.append(x)
time2 = time.time()
print 'time diff using traditional for loop', time2-time1
#print N
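They all print the same N, so I commented out the print statements (except that the set-based method returns the values in no particular order), but the resulting time differences were interesting over repeated tests, as this one example shows: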
time diff [x for x in S for y in T if x==y]= 54.875
time diff filter(lambda x:x in S,T)= 0.391000032425
time diff [val for val in S if val in T]= 12.6089999676
time diff list(set(S) & set(T))= 0.125
time diff using traditional for loop 54.7970001698
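So while I find list comprehensions easier to read on the whole, there seem to be some performance issues, at least in this example.
So, two questions: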
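- Why is lambda etc. being pushed aside?
- For the list comprehension approach, is there a more efficient implementation, and how would you know it is more efficient without testing? I mean, lambda/map/filter was supposed to be less efficient because of the extra function calls, yet here it appears to be more efficient.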
Paul
-
Your tests are doing very different things. With S being 1M elements and T being 300:
[x for x in S for y in T if x==y]= 54.875
This option does 300M equality comparisons.
filter(lambda x:x in S,T)= 0.391000032425
This option does 300 linear searches through S.
[val for val in S if val in T]= 12.6089999676
This option does 1M linear searches through T.
list(set(S) & set(T))= 0.125
This option does two set constructions and one set intersection.
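As a rough sketch of where those counts come from, the snippet below tallies how many list elements each of the first three options inspects, using the sizes and contents from the question. It assumes that a list membership test (`in`) scans from the front and stops at the first match, and it only counts elements inspected, not time. The variable names are mine, not the question's.

```python
# Sizes and contents taken from the question.
len_S = 1000000                      # S = [0, 1, ..., 999999], so S[i] == i
T = [y ** 2 for y in range(300)]     # 300 distinct squares, all present in S

# [x for x in S for y in T if x==y]: every (x, y) pair is compared once.
nested = len_S * len(T)

# filter(lambda x: x in S, T): one search of S per element of T.
# Because S[i] == i, finding x inspects x + 1 elements before stopping.
filter_search = sum(x + 1 for x in T)

# [val for val in S if val in T]: one search of T per element of S.
# A miss inspects all of T; a hit on T[i] inspects i + 1 elements.
hits = sum(i + 1 for i in range(len(T)))
misses = (len_S - len(T)) * len(T)
comp_search = hits + misses

print(nested, filter_search, comp_search)
```

Under these assumptions the third option inspects about as many elements as the first, but its inner loop runs inside the `in` operator rather than as interpreted Python, which is one plausible reason it was measured as faster.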
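The differences in performance between these options have much more to do with the algorithm each one uses than with any difference between list comprehensions and lambda.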
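To compare the two styles fairly, both should run the same algorithm. Here is a minimal sketch, written for Python 3 (where `filter` returns an iterator, hence the `list(...)` call), that uses a set for the membership test in both versions. The helper names `with_comprehension` and `with_filter_lambda` and the smaller size of S are my own choices for illustration, and the timings will vary by machine.

```python
import timeit

S = list(range(100000))              # smaller than the question's 1M to keep the run short
T = [y ** 2 for y in range(300)]
T_set = set(T)                       # built once; set membership is O(1) on average

def with_comprehension():
    # Keeps the order of S, unlike list(set(S) & set(T)).
    return [val for val in S if val in T_set]

def with_filter_lambda():
    # Same algorithm, expressed with filter and a lambda.
    return list(filter(lambda val: val in T_set, S))

# Both styles must agree before their timings mean anything.
assert with_comprehension() == with_filter_lambda()

print("comprehension:", timeit.timeit(with_comprehension, number=20))
print("filter+lambda:", timeit.timeit(with_filter_lambda, number=20))
```

With the algorithm held constant, whatever gap remains between the two timings is the part that can actually be attributed to comprehension versus lambda.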