Python pandas: split a column of lists into multiple columns
Posted on 2021-02-02 23:22:52
I have a pandas DataFrame with one column that looks like this:
In [207]:df2.teams
Out[207]:
0 [SF, NYG]
1 [SF, NYG]
2 [SF, NYG]
3 [SF, NYG]
4 [SF, NYG]
5 [SF, NYG]
6 [SF, NYG]
7 [SF, NYG]
I need to split this column of lists into two columns, team1 and team2, using pandas.
1 answer
You can pass a list of lists to the DataFrame constructor. To get that list, convert the column to a NumPy array with values and then call tolist:

import pandas as pd

d1 = {'teams': [['SF', 'NYG'], ['SF', 'NYG'], ['SF', 'NYG'],
                ['SF', 'NYG'], ['SF', 'NYG'], ['SF', 'NYG'], ['SF', 'NYG']]}
df2 = pd.DataFrame(d1)
print (df2)
       teams
0  [SF, NYG]
1  [SF, NYG]
2  [SF, NYG]
3  [SF, NYG]
4  [SF, NYG]
5  [SF, NYG]
6  [SF, NYG]

df2[['team1','team2']] = pd.DataFrame(df2.teams.values.tolist(), index= df2.index)
print (df2)
       teams team1 team2
0  [SF, NYG]    SF   NYG
1  [SF, NYG]    SF   NYG
2  [SF, NYG]    SF   NYG
3  [SF, NYG]    SF   NYG
4  [SF, NYG]    SF   NYG
5  [SF, NYG]    SF   NYG
6  [SF, NYG]    SF   NYG
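The index= argument in the assignment matters when the frame does not have the default 0..n-1 index, because pandas aligns assigned columns on index labels. A minimal sketch of that point, assuming a hypothetical frame df with labels 10, 20, 30 and made-up team values (none of this is from the question's data):

```python
import pandas as pd

# Hypothetical frame with a non-default index (labels 10, 20, 30).
df = pd.DataFrame(
    {"teams": [["SF", "NYG"], ["DAL", "PHI"], ["GB", "CHI"]]},
    index=[10, 20, 30],
)

# Passing index=df.index keeps the split rows on the original labels,
# so the column assignment lines up row for row.
split = pd.DataFrame(
    df["teams"].tolist(), index=df.index, columns=["team1", "team2"]
)
df[["team1", "team2"]] = split

# Without index=, the new frame is labelled 0, 1, 2, which does not
# match 10, 20, 30; assigning it back would not line up with df.
misaligned = pd.DataFrame(df["teams"].tolist(), columns=["team1", "team2"])

print(df)
print(list(misaligned.index))
```

If the frame already has a default RangeIndex, as in the question, both forms give the same labels and the argument is harmless.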
To build a new DataFrame instead:
df3 = pd.DataFrame(df2['teams'].values.tolist(), columns=['team1','team2'])
print (df3)
  team1 team2
0    SF   NYG
1    SF   NYG
2    SF   NYG
3    SF   NYG
4    SF   NYG
5    SF   NYG
6    SF   NYG
The apply(pd.Series) solution is very slow by comparison:
# 7k rows
df2 = pd.concat([df2]*1000).reset_index(drop=True)

In [89]: %timeit df2['teams'].apply(pd.Series)
1 loop, best of 3: 1.15 s per loop

In [90]: %timeit pd.DataFrame(df2['teams'].values.tolist(), columns=['team1','team2'])
1000 loops, best of 3: 820 µs per loop
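The %timeit magic only works inside IPython. As a rough sketch, the same comparison can be run as a plain script with the standard-library timeit module. The helper names split_with_constructor and split_with_apply, the row count, and the repeat count are assumptions chosen for illustration; actual timings depend on the machine and the pandas version.

```python
import timeit

import pandas as pd

# Assumed sample data: the 7 rows from the question repeated 200 times.
df = pd.DataFrame({"teams": [["SF", "NYG"]] * 7})
df = pd.concat([df] * 200).reset_index(drop=True)


def split_with_constructor(frame):
    # Build the new frame directly from a list of lists.
    return pd.DataFrame(
        frame["teams"].tolist(), index=frame.index, columns=["team1", "team2"]
    )


def split_with_apply(frame):
    # Build one Series per row, which is what makes this approach slow.
    out = frame["teams"].apply(pd.Series)
    out.columns = ["team1", "team2"]
    return out


fast = split_with_constructor(df)
slow = split_with_apply(df)
same = fast.values.tolist() == slow.values.tolist()

repeats = 5  # assumed repeat count, kept small so the script finishes quickly
t_fast = timeit.timeit(lambda: split_with_constructor(df), number=repeats)
t_slow = timeit.timeit(lambda: split_with_apply(df), number=repeats)

print(f"same result: {same}")
print(f"constructor:      {t_fast / repeats:.6f} s per call")
print(f"apply(pd.Series): {t_slow / repeats:.6f} s per call")
```

Checking that both approaches return the same values before timing them guards against comparing two functions that do different work.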