1 Answer
Use drop_duplicates:
pd.concat([df1, df2]).drop_duplicates(keep=False)
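As a minimal sketch of the one-liner above, assuming two small example frames (the values are made up for illustration) in which neither frame repeats a row internally:

```python
import pandas as pd

# Hypothetical example data: no row is repeated within either frame.
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [2, 3, 4]})
df2 = pd.DataFrame({'A': [1], 'B': [2]})

# Stack both frames, then drop every row that occurs more than once.
# keep=False removes all copies of a duplicated row, not just the extras.
result = pd.concat([df1, df2]).drop_duplicates(keep=False)
print(result)
```

Here the row (1, 2) occurs in both frames, so both copies are dropped and only the rows unique to df1 remain.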
Update:
The method above only works when neither DataFrame contains duplicate rows of its own. For example:
df1 = pd.DataFrame({'A': [1, 2, 3, 3], 'B': [2, 3, 4, 4]})
df2 = pd.DataFrame({'A': [1], 'B': [2]})
With these frames it produces the output below, which is wrong.
Wrong output:
pd.concat([df1, df2]).drop_duplicates(keep=False)
Out[655]:
   A  B
1  2  3
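To see why rows go missing, it can help to inspect the duplicate mask directly. The sketch below assumes the same df1 and df2 as in the example and uses DataFrame.duplicated, which flags the rows that drop_duplicates would remove:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 3], 'B': [2, 3, 4, 4]})
df2 = pd.DataFrame({'A': [1], 'B': [2]})

combined = pd.concat([df1, df2])

# keep=False marks every member of a duplicated group as True.
mask = combined.duplicated(keep=False)
print(combined.assign(is_duplicate=mask))
```

The two (3, 4) rows are flagged because they duplicate each other inside df1, even though df2 contains no such row.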
Correct output:
Out[656]:
   A  B
1  2  3
2  3  4
3  3  4
How can we get this result?
Method 1: Using isin with tuple
df1[~df1.apply(tuple, axis=1).isin(df2.apply(tuple, axis=1))]
Out[657]:
   A  B
1  2  3
2  3  4
3  3  4
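The same approach can be spelled out step by step. This is only a sketch under the assumption that both frames have the same columns in the same order; the names keys1, keys2, and in_df2 are introduced here for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 3], 'B': [2, 3, 4, 4]})
df2 = pd.DataFrame({'A': [1], 'B': [2]})

# Turn each row into a tuple so a whole row can be compared as one value.
keys1 = df1.apply(tuple, axis=1)
keys2 = df2.apply(tuple, axis=1)

# True where the df1 row also appears somewhere in df2.
in_df2 = keys1.isin(keys2)

# Keep only the df1 rows that are absent from df2.
result = df1[~in_df2]
print(result)
```

Because the mask is computed per row of df1, repeated rows inside df1 are kept and the original index is preserved.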
Method 2: merge with indicator
df1.merge(df2, indicator=True, how='left').loc[lambda x: x['_merge'] != 'both']
Out[421]:
   A  B     _merge
1  2  3  left_only
2  3  4  left_only
3  3  4  left_only
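If the helper column is not wanted in the final result, it can be dropped afterwards. A minimal sketch, assuming the same df1 and df2 and a merge on all shared columns (the default when no key is given):

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 3], 'B': [2, 3, 4, 4]})
df2 = pd.DataFrame({'A': [1], 'B': [2]})

# indicator=True adds a '_merge' column saying where each row was found.
merged = df1.merge(df2, how='left', indicator=True)

# Keep rows found only in df1, then remove the helper column.
result = merged.loc[merged['_merge'] == 'left_only'].drop(columns='_merge')
print(result)
```

Note that merge builds a new index for its result, so the labels shown are positions in the merged frame rather than the original df1 labels (they happen to coincide in this example).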