normalize.py source file

python
Project: tashaphyne  Author: linuxscout
def normalize_lamalef(text):
    """Normalize Lam Alef ligatures into two letters (LAM and ALEF),
    and return the resulting text.
    Some systems represent the Lam Alef ligature as a single letter;
    this function converts it into two letters.
    The ligatures converted into LAM and ALEF are:
        - LAM_ALEF, LAM_ALEF_HAMZA_ABOVE, LAM_ALEF_HAMZA_BELOW,
          LAM_ALEF_MADDA_ABOVE

    Example:
        >>> text=u"????? ???? ???????"
        >>> normalize_lamalef(text)
        ????? ???? ???????

    @param text: Arabic text.
    @type text: unicode.
    @return: the converted text.
    @rtype: unicode.
    """
    # Every ligature matched by LAMALEFAT_PAT is replaced by the pair LAM + ALEF.
    return arabconst.LAMALEFAT_PAT.sub(
        u'%s%s' % (arabconst.LAM, arabconst.ALEF), text)
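
The function above depends on an `arabconst` module that this excerpt does not show. The following is a minimal, self-contained sketch of the same idea, assuming the constants map to the standard Unicode code points for LAM, ALEF, and the four isolated-form Lam Alef ligatures named in the docstring. The constant definitions and the helper name `normalize_lamalef_sketch` are illustrative assumptions here, not the library's actual source.

```python
import re

# Assumed values: standard Unicode code points, written as escapes
# so the example survives encoding problems.
LAM = u"\u0644"                   # ARABIC LETTER LAM
ALEF = u"\u0627"                  # ARABIC LETTER ALEF
LAM_ALEF = u"\ufefb"              # ligature LAM WITH ALEF (isolated form)
LAM_ALEF_HAMZA_ABOVE = u"\ufef7"  # ligature LAM WITH ALEF WITH HAMZA ABOVE
LAM_ALEF_HAMZA_BELOW = u"\ufef9"  # ligature LAM WITH ALEF WITH HAMZA BELOW
LAM_ALEF_MADDA_ABOVE = u"\ufef5"  # ligature LAM WITH ALEF WITH MADDA ABOVE

# Hypothetical stand-in for arabconst.LAMALEFAT_PAT: a character class
# that matches any one of the four ligatures.
LAMALEFAT_PAT = re.compile(u"[" + u"".join([
    LAM_ALEF,
    LAM_ALEF_HAMZA_ABOVE,
    LAM_ALEF_HAMZA_BELOW,
    LAM_ALEF_MADDA_ABOVE,
]) + u"]")


def normalize_lamalef_sketch(text):
    """Replace each Lam Alef ligature with the two letters LAM + ALEF."""
    return LAMALEFAT_PAT.sub(LAM + ALEF, text)


# One ligature followed by MEEM becomes LAM, ALEF, MEEM.
sample = u"\ufefb\u0645"
result = normalize_lamalef_sketch(sample)
print(result == u"\u0644\u0627\u0645")
```

Because every match is replaced by the same plain LAM + ALEF pair, any hamza or madda carried by the ligature is dropped in this sketch, which mirrors the fixed replacement string used in the function above.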

#--------------------------------------