tokenize.py source code

python

Project: CVProject  Author: hieuxinhe94
def __call__(self, seq):
    """Yield space-joined n-grams of orders min_order..max_order from seq."""
    tokens = seq.split()
    min_order = self.min_order
    max_order = self.max_order
    for start in range(len(tokens)):
        # Near the end of the sequence only the shorter n-grams still fit,
        # so cap the longest order at the number of tokens that remain.
        longest = min(max_order, len(tokens) - start)
        for n in range(min_order, longest + 1):
            # Emit the n-gram of order n that begins at this token.
            yield ' '.join(tokens[start:start + n])
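
The method above is shown without its class, so the following is only a minimal sketch of how such a tokenizer might be used. The class name `NGramTokenizer` and its constructor are hypothetical and not taken from CVProject; the sketch assumes Python 3, whitespace-separated input, and that every n-gram of each order should be emitted, including the shorter ones at the end of the sequence.

```python
class NGramTokenizer:
    """Hypothetical wrapper class; the original class definition is not shown."""

    def __init__(self, min_order=1, max_order=2):
        # Assumed constructor: stores the smallest and largest n-gram order.
        self.min_order = min_order
        self.max_order = max_order

    def __call__(self, seq):
        tokens = seq.split()
        for start in range(len(tokens)):
            # Longest n-gram that still fits when starting at this token.
            longest = min(self.max_order, len(tokens) - start)
            for n in range(self.min_order, longest + 1):
                yield ' '.join(tokens[start:start + n])


tokenizer = NGramTokenizer(min_order=1, max_order=2)
ngrams = list(tokenizer("the quick brown fox"))
print(ngrams)
# ['the', 'the quick', 'quick', 'quick brown', 'brown', 'brown fox', 'fox']
```

Because `__call__` is a generator, the n-grams are produced lazily; wrapping the call in `list()` is only needed when all of them are wanted at once.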