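Reading ASCII delimited text with the csv module?
ASCII delimited text, which you may or may not have heard of, has the nice advantage of using non-keyboard characters (the unit separator, chr(31), and the record separator, chr(30)) to separate fields and lines.
Writing it out is simple: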
import csv

with open('ascii_delim.adt', 'w') as f:
    writer = csv.writer(f, delimiter=chr(31), lineterminator=chr(30))
    writer.writerow(('Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue'))
    writer.writerow(('Sir Galahad of Camelot', 'I seek the Grail', 'blue... no yellow!'))

And, sure enough, the data gets dumped correctly. When reading, however, lineterminator does nothing, and if I try:
open('ascii_delim.adt', newline=chr(30))
it raises ValueError: illegal newline value:
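For context, the newline argument only accepts None, '', '\n', '\r', and '\r\n'. A minimal sketch to confirm that, assuming Python 3; the helper name newline_is_accepted is made up for illustration, and it probes io.TextIOWrapper (the text layer that open() builds in text mode) over an in-memory buffer instead of touching a real file:

```python
import io


def newline_is_accepted(value):
    """Return True if the text layer behind open() accepts this newline value.

    Hypothetical helper: it wraps an empty in-memory byte stream, so no file
    on disk is needed for the check.
    """
    try:
        io.TextIOWrapper(io.BytesIO(b""), newline=value)
    except ValueError:
        return False
    return True


# The five documented values, followed by the ASCII record separator.
for candidate in (None, "", "\n", "\r", "\r\n", chr(30)):
    print(repr(candidate), newline_is_accepted(candidate))
```

Only the last candidate, chr(30), should be rejected, which matches the error above.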
So how do I read an ASCII delimited file? Am I relegated to doing line.split(chr(30))?
-
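You can do it by effectively translating the end-of-line characters in the file into the newline characters that csv.reader is hard-coded to recognize:

import csv

# Write two records using the ASCII unit (31) and record (30) separators.
with open('ascii_delim.adt', 'w') as f:
    writer = csv.writer(f, delimiter=chr(31), lineterminator=chr(30))
    writer.writerow(('Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue'))
    writer.writerow(('Sir Galahad of Camelot', 'I seek the Grail', 'blue... no yellow!'))

def readlines(f, newline='\n'):
    """Yield lines from f, replacing each `newline` character with '\n'."""
    while True:
        line = []
        while True:
            ch = f.read(1)
            if ch == '':  # end of file?
                if line:  # emit a final record that lacks a terminator
                    yield ''.join(line)
                return
            elif ch == newline:  # end of line?
                line.append('\n')
                break
            line.append(ch)
        yield ''.join(line)

# Open in text mode with newline='' so nothing is translated while reading.
with open('ascii_delim.adt', newline='') as f:
    reader = csv.reader(readlines(f, newline=chr(30)), delimiter=chr(31))
    for row in reader:
        print(row)
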
Output:

['Sir Lancelot of Camelot', 'To seek the Holy Grail', 'blue']
['Sir Galahad of Camelot', 'I seek the Grail', 'blue... no yellow!']
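
Reading one character at a time can be slow on large files. A possible variation, offered only as a sketch, reads fixed-size chunks and splits them on the record separator. It relies on csv.reader accepting any iterable of strings, so no '\n' has to be appended. The names iter_records, FIELD_SEP, RECORD_SEP and the chunk_size default are illustrative assumptions, not part of the csv module, and an in-memory io.StringIO stands in for the file:

```python
import csv
import io

FIELD_SEP = chr(31)   # ASCII unit separator
RECORD_SEP = chr(30)  # ASCII record separator


def iter_records(f, sep=RECORD_SEP, chunk_size=4096):
    """Yield each sep-delimited record from the text stream f.

    Hypothetical helper: reads chunk_size characters at a time and keeps the
    unfinished tail of each chunk until its separator arrives.
    """
    pending = ""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:  # end of stream
            break
        pending += chunk
        # Everything before the last separator is complete; the rest waits.
        *complete, pending = pending.split(sep)
        yield from complete
    if pending:  # final record without a trailing separator
        yield pending


# Build the same two records in memory instead of reading a file.
sample = RECORD_SEP.join([
    FIELD_SEP.join(["Sir Lancelot of Camelot", "To seek the Holy Grail", "blue"]),
    FIELD_SEP.join(["Sir Galahad of Camelot", "I seek the Grail", "blue... no yellow!"]),
]) + RECORD_SEP

# A tiny chunk size is used here only to exercise records spanning chunks.
rows = list(csv.reader(iter_records(io.StringIO(sample), chunk_size=8),
                       delimiter=FIELD_SEP))
for row in rows:
    print(row)
```

With a real file, the stream would be opened with open('ascii_delim.adt', newline='') and passed to iter_records in place of the StringIO object.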