How do I skip lines before the header with csv.DictReader?

Posted 2021-01-29 14:58:45

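I want csv.DictReader to infer the field names from the file. The documentation says:
"If the fieldnames parameter is omitted, the values in the first row of the csvfile will be used as the fieldnames." In my case, however, the first row contains a title and the second row contains the field names.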

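I can't simply call next(reader) to skip a row, as suggested in "Python 3.2 skip a line in csv.DictReader", because the field names are assigned when the reader is initialized (or else I'm doing it wrong).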
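
A minimal sketch of the behaviour described above, using a made-up three-line sample rather than the real file: by the time next(reader) returns, the title line has already been taken as the field names, and the row that gets skipped is the real header.

```python
import csv
import io

# Hypothetical sample (not the real file): title line, real header, one data row.
sample = "Title v1,,\nEntity,Theme,Code\nAmusement park,LX,2260009\n"

reader = csv.DictReader(io.StringIO(sample))
first = next(reader)  # line 1 becomes the field names, line 2 comes back as data

print(reader.fieldnames)  # ['Title v1', '', '']
print(first)              # the real header row, keyed by the title line
```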

csvfile (exported from Excel 2010, original source):

CanVec v1.1.0,,,,,,,,,^M
Entity,Attributes combination,"Specification Code
Point","Specification Code
Line","Specification Code
Area",Generic Code,Theme,"GML - Entity name
Shape - File name
Point","GML - Entity name
Shape - File name
Line","GML - Entity name
Shape - File name
Area"^M
Amusement park,Amusement park,,,2260012,2260009,LX,,,LX_2260009_2^M
Auto wrecker,Auto wrecker,,,2360012,2360009,IC,,,IC_2360009_2^M

My code:

f = open(entities_table,'rb')
try:
    dialect = csv.Sniffer().sniff(f.read(1024))
    f.seek(0)

    reader = csv.DictReader(f, dialect=dialect)
    print 'I think the field names are:\n%s\n' % (reader.fieldnames)

    i = 0
    for row in reader:
        if i < 20:
            print row
            i = i + 1

finally:
    f.close()

Current result:

I think the field names are:
['CanVec v1.1.0', '', '', '', '', '', '', '', '', '']

Desired result:

I think the field names are:
['Entity','Attributes combination','"Specification Code Point"',...snip]

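I realize it would be convenient to just delete the first row and carry on, but I'm trying to read the data in place as much as possible and keep manual intervention to a minimum.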

1 Answer
  • 面试哥
    面试哥 2021-01-29

    I used islice from itertools. My header was on the last line of a large preamble. I skipped over the preamble and used the header line as the field names:

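    import csv
    from itertools import islice

    with open(file, "r", newline="") as f:
        # Pass over the preamble, counting lines until the header is found
        n = 0
        for line in f:
            n += 1
            if 'same_field_name' in line:  # line with field names was found
                h = next(csv.reader([line]))  # parse the header into field names
                break
        # Rewind, then skip the n lines already seen (preamble plus header)
        f.seek(0)
        reader = csv.DictReader(islice(f, n, None), fieldnames=h)
        # iterate over reader here, while the file is still open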
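
    For a file shaped like the one in the question, where only a single title line precedes the header, a simpler variant may be enough: advance the file object past the title before constructing the reader, and let csv.DictReader infer the field names itself. This is only a sketch. The sample data and the helper name read_after_title are made up for illustration, and it assumes the lines to skip contain no quoted line breaks.

```python
import csv
import io

# Hypothetical sample shaped like the question's file: a title line,
# a header row with a quoted multi-line field, then one data row.
sample = (
    'CanVec v1.1.0,,\n'
    'Entity,"Specification Code\nPoint",Theme\n'
    'Amusement park,2260012,LX\n'
)

def read_after_title(f, skip=1):
    """Skip `skip` physical lines, then let DictReader infer the field names."""
    for _ in range(skip):
        next(f)
    return csv.DictReader(f)

reader = read_after_title(io.StringIO(sample))
print(reader.fieldnames)
rows = list(reader)
print(rows[0]["Theme"])
```

    Because the csv module parses the header itself here, a quoted header field that spans several lines stays in one piece, which a plain line.split(',') would not manage.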
    

