Regular expressions in Python: is it possible to get the match, the replacement, and the final string?
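To perform a regular expression substitution, you need to supply three things:
- the match pattern
- the replacement pattern
- the original string
There are three things the regex engine finds that are of interest to me:
- the matched string
- the replacement string
- the final processed string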
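When using re.sub, the final string is what is returned. But is it possible to access the other two things, the matched string and the replacement string?
Here is an example: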
import re

orig = "This is the original string."
matchpat = "(orig.*?l)"
replacepat = "not the \\1"
final = re.sub(matchpat, replacepat, orig)
print(final)
# This is the not the original string.
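The matched string is "original" and the replacement string is "not the original". Is there a way to get them? I am writing a script to search and replace across many files, and I want it to print out what it finds and replaces without printing the entire line.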
-
import re

class Replacement(object):
    def __init__(self, replacement):
        self.replacement = replacement
        self.matched = None
        self.replaced = None

    def __call__(self, match):
        # re.sub calls this once per match and substitutes the return value
        self.matched = match.group(0)
        self.replaced = match.expand(self.replacement)
        return self.replaced

>>> repl = Replacement('not the \\1')
>>> re.sub('(orig.*?l)', repl, 'This is the original string.')
'This is the not the original string.'
>>> repl.matched
'original'
>>> repl.replaced
'not the original'
Edit: As @FJ pointed out, the above only remembers the last match/replacement. This version handles multiple occurrences:
import re

class Replacement(object):
    def __init__(self, replacement):
        self.replacement = replacement
        self.occurrences = []

    def __call__(self, match):
        # Record every (matched, replaced) pair instead of only the last one
        matched = match.group(0)
        replaced = match.expand(self.replacement)
        self.occurrences.append((matched, replaced))
        return replaced

>>> repl = Replacement('[\\1]')
>>> re.sub(r'\s(\d)', repl, '1 2 3')
'1[2][3]'
>>> for matched, replaced in repl.occurrences:
...     print(matched, '=>', replaced)
...
 2 => [2]
 3 => [3]

Note that each matched string here includes the leading whitespace consumed by \s, which is why the printed lines start with a space.
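If you would rather not keep a class around, the same idea can probably be wrapped in a small helper that returns the final string together with the list of pairs. This is only a sketch: the name sub_with_log is made up for illustration and is not part of the re module. It relies on re.sub accepting a callable and on match.expand, as in the class above.

```python
import re

def sub_with_log(pattern, replacement, text):
    """Hypothetical helper: return (final_string, [(matched, replaced), ...])."""
    occurrences = []

    def record(match):
        # Expand backreferences such as \1 against this particular match
        replaced = match.expand(replacement)
        occurrences.append((match.group(0), replaced))
        return replaced

    final = re.sub(pattern, record, text)
    return final, occurrences

final, occurrences = sub_with_log(
    r"(orig.*?l)", r"not the \1", "This is the original string."
)
print(final)
for matched, replaced in occurrences:
    # repr() makes leading or trailing whitespace in a match visible
    print(repr(matched), "=>", repr(replaced))
```

For a script that processes many files, calling a helper like this once per line (or per file) would give you the pairs to print without printing the whole line.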