Using a regular expression to comma-separate large numbers in the South Asian numbering system

Posted on 2021-01-29 14:57:12

I am trying to find a regular expression that comma-separates large numbers according to the South Asian numbering system.

Some examples:

  • 1,000,000 (Arabic) is 10,00,000 (Indian/Hindi/South Asian)
  • 1,000,000,000 (Arabic) is 100,00,00,000 (Indian/H/SA)

The comma pattern repeats every 7 digits. For example: 1,00,00,000,00,00,000
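
To make the grouping rule concrete, here is a minimal non-regex sketch in Python. It assumes the rule as stated above: counting from the right, the groups have sizes 3, 2, 2, and that cycle repeats every 7 digits. The function name `group_south_asian` is hypothetical and exists only for this illustration.

```python
from itertools import cycle


def group_south_asian(digits: str) -> str:
    """Group a digit string from the right in sizes 3, 2, 2, repeating."""
    groups = []
    end = len(digits)
    # Assumed rule: group sizes cycle 3, 2, 2 (7 digits per full cycle).
    for size in cycle((3, 2, 2)):
        if end <= 0:
            break
        start = max(end - size, 0)
        groups.append(digits[start:end])
        end = start
    # Groups were collected right to left, so reverse before joining.
    return ",".join(reversed(groups))


print(group_south_asian("1000000"))      # 10,00,000
print(group_south_asian("1000000000"))   # 100,00,00,000
```

This only serves as a reference for what the regex is expected to produce; it takes a string of digits and does not handle signs or decimal points.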

From the book Mastering Regular Expressions by Friedl, I have the following regular expression for the Arabic numbering system:

r'(?<=\d)(?=(\d{3})+(?!\d))'
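
As a quick check of how the pattern quoted above behaves with `re.sub()`, here is a small sketch. The names `WESTERN_GROUPING` and `group_western` are hypothetical and introduced only for this example.

```python
import re

# The pattern quoted above: a position preceded by a digit and followed
# by a whole number of 3-digit groups, with no further digit after them.
WESTERN_GROUPING = re.compile(r"(?<=\d)(?=(\d{3})+(?!\d))")


def group_western(digits: str) -> str:
    """Insert a comma at every zero-width position the pattern matches."""
    return WESTERN_GROUPING.sub(",", digits)


print(group_western("1000000"))     # 1,000,000
print(group_western("1234567890"))  # 1,234,567,890
```

The pattern matches empty positions rather than characters, so substituting "," inserts commas without consuming any digits.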

For the Indian numbering system, I came up with the following expression, but it does not work for numbers with more than 8 digits:

r'(?<=\d)(?=(((\d{2}){0,2}\d{3})(?=\b)))'

With the pattern above, I get 100000000,00,00,000
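
The output above can be reproduced with the short sketch below, using a 16-digit input. One reading of the failure: the lookahead only accepts 3, 5 or 7 digits before the word boundary, so no comma can ever be placed more than 7 digits from the end. The name `ATTEMPT` is hypothetical.

```python
import re

# The attempted pattern from the question, copied unchanged.
ATTEMPT = r"(?<=\d)(?=(((\d{2}){0,2}\d{3})(?=\b)))"

sixteen_digits = "1" + "0" * 15
# Commas appear only within the last 7 digits; the leading 9 stay ungrouped.
print(re.sub(ATTEMPT, ",", sixteen_digits))  # 100000000,00,00,000
```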

I am using Python's re module (re.sub()). Any ideas?

1 Answer
  • 面试哥 2021-01-29

    Try this:

    (?<=\d)(?=(\d{2}){0,2}\d{3}(\d{7})*(?!\d))
    

    For example:

    >>> import re
    >>> inp = ["1" + "0"*i for i in range(20)]
    >>> [re.sub(r"(?<=\d)(?=(\d{2}){0,2}\d{3}(\d{7})*(?!\d))", ",", i) 
         for i in inp]
    ['1', '10', '100', '1,000', '10,000', '1,00,000', '10,00,000', '1,00,00,000', 
     '10,00,00,000', '100,00,00,000', '1,000,00,00,000', '10,000,00,00,000', 
     '1,00,000,00,00,000', '10,00,000,00,00,000', '1,00,00,000,00,00,000', 
     '10,00,00,000,00,00,000', '100,00,00,000,00,00,000', 
     '1,000,00,00,000,00,00,000', '10,000,00,00,000,00,00,000',
     '1,00,000,00,00,000,00,00,000']
    

    The same regular expression with comments:

    import re

    subject = "1000000000"  # sample input, assumed here for illustration
    result = re.sub(
        r"""(?x)       # Enable verbose mode (comments)
        (?<=\d)        # Assert that we're not at the start of the number.
        (?=            # Assert that it's possible to match:
         (\d{2}){0,2}  # 0, 2 or 4 digits,
         \d{3}         # followed by 3 digits,
         (\d{7})*      # followed by 0, 7, 14, 21 ... digits,
         (?!\d)        # and no more digits after that.
        )              # End of lookahead assertion.""", 
        ",", subject)
    print(result)  # 100,00,00,000
    
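
For reuse, the pattern from this answer could be wrapped in a small helper, sketched below. The names `SOUTH_ASIAN_GROUPING` and `format_south_asian` are hypothetical, and the helper assumes integer input only: a string containing a decimal point would also get commas in its fractional part.

```python
import re

# The pattern from the answer, compiled once for repeated use.
SOUTH_ASIAN_GROUPING = re.compile(r"(?<=\d)(?=(\d{2}){0,2}\d{3}(\d{7})*(?!\d))")


def format_south_asian(n: int) -> str:
    """Format an integer with 3,2,2 grouping that repeats every 7 digits."""
    # The lookbehind requires a preceding digit, so a leading minus sign
    # is never followed directly by a comma.
    return SOUTH_ASIAN_GROUPING.sub(",", str(n))


print(format_south_asian(1000000))    # 10,00,000
print(format_south_asian(10**9))      # 100,00,00,000
print(format_south_asian(-1000000))   # -10,00,000
```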

