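How do I split a text file into multiple text files using Python?
I have a text file with the content shown below. I want to split this file into multiple files (1.txt, 2.txt, 3.txt, …). Each new output file should look like the examples below. The code I tried does not split the input file correctly. How can I split the input file into multiple files?
My code: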
#!/usr/bin/python

with open("input.txt", "r") as f:
    a1=[]
    a2=[]
    a3=[]
    for line in f:
        if not line.strip() or line.startswith('A') or line.startswith('$$'): continue
        row = line.split()
        a1.append(str(row[0]))
        a2.append(float(row[1]))
        a3.append(float(row[2]))
f = open('1.txt','a')
f = open('2.txt','a')
f = open('3.txt','a')
f.write(str(a1))
f.close()
Input file:
A
x
k
..
$$
A
z
m
..
$$
A
B
l
..
$$
Desired output 1.txt:
A
x
k
..
$$
Desired output 2.txt:
A
z
m
..
$$
Desired output 3.txt:
A
B
l
..
$$
-
Try the re.findall() function:
import re

with open('input.txt', 'r') as f:
    data = f.read()

found = re.findall(r'\n*(A.*?\n\$\$)\n*', data, re.M | re.S)

[open(str(i)+'.txt', 'w').write(found[i-1]) for i in range(1, len(found)+1)]
A more compact version that writes only the first 3 occurrences:
import re

found = re.findall(r'\n*(A.*?\n\$\$)\n*', open('input.txt', 'r').read(), re.M | re.S)

[open(str(found.index(f)+1)+'.txt', 'w').write(f) for f in found[:3]]
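Some explanation:
found = re.findall(r'\n*(A.*?\n\$\$)\n*', data, re.M | re.S)
finds every occurrence that matches the given RegEx and puts them into a list called found. With re.S the dot also matches newlines, and the non-greedy .*? stops at the first line that holds $$, so each list element is one complete block from A to $$.
[open(str(found.index(f)+1)+'.txt', 'w').write(f) for f in found]
iterates over all elements of the list found (using a list comprehension), creates a text file for each element (named "<index of the element + 1>.txt"), and writes that element (the occurrence) to the file. Note that found.index(f) returns the position of the first equal element, so two identical blocks would be written to the same file name.
Another version, without RegEx: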
blocks_to_read = 3
blk_begin = 'A'
blk_end = '$$'

with open('input.txt', 'r') as f:
    fn = 1               # number of the next output file
    data = []            # lines of the block being collected
    write_block = False  # True while inside an A ... $$ block
    for line in f:
        if fn > blocks_to_read:
            break
        line = line.strip()
        if line == blk_begin:
            write_block = True
        if write_block:
            data.append(line)
        if line == blk_end:
            # End of block: write it out and reset for the next one.
            write_block = False
            with open(str(fn) + '.txt', 'w') as fout:
                fout.write('\n'.join(data))
            data = []
            fn += 1
PS: Personally I don't like this version; I would use the RegEx one.
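For what it's worth, the RegEx approach can also be written without a list comprehension for side effects, so that every file is closed explicitly and the file number comes from enumerate rather than found.index. The following is only a sketch: the names BLOCK_RE, split_blocks and write_blocks are made up for illustration, and the pattern is a variation on the one above that assumes each block starts on a line that is exactly A and ends on a line that is exactly $$.

```python
import re
from pathlib import Path

# Assumption: a block starts at a line that is exactly "A" and ends at the
# next line that is exactly "$$" (^ and $ match at line boundaries under re.M,
# and re.S lets "." span newlines).
BLOCK_RE = re.compile(r'^A$.*?^\$\$$', re.M | re.S)


def split_blocks(text):
    """Return every A ... $$ block found in text, in order."""
    return BLOCK_RE.findall(text)


def write_blocks(blocks, out_dir='.', limit=None):
    """Write blocks to 1.txt, 2.txt, ... in out_dir; return the paths written."""
    paths = []
    # limit=None slices the whole list; limit=3 keeps only the first 3 blocks.
    for i, block in enumerate(blocks[:limit], start=1):
        path = Path(out_dir) / f'{i}.txt'
        path.write_text(block + '\n')
        paths.append(path)
    return paths
```

With input.txt in the current directory, something like write_blocks(split_blocks(Path('input.txt').read_text()), limit=3) should produce 1.txt, 2.txt and 3.txt.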