NLTK: chunk grammar does not split on commas

Posted on 2021-01-29 14:59:40

import nltk  # needed for nltk.RegexpParser below
from nltk.chunk.util import tagstr2tree
from nltk import word_tokenize, pos_tag
text = "John Rose Center is very beautiful place and i want to go there with Barbara Palvin. Also there are stores like Adidas ,Nike ,Reebok Center."
tagged_text = pos_tag(text.split())

grammar = "NP:{<NNP>+}"

cp = nltk.RegexpParser(grammar)
result = cp.parse(tagged_text)

print(result)

Output:

(S
  (NP John/NNP Rose/NNP Center/NNP)
  is/VBZ
  very/RB
  beautiful/JJ
  place/NN
  and/CC
  i/NN
  want/VBP
  to/TO
  go/VB
  there/RB
  with/IN
  (NP Barbara/NNP Palvin./NNP)
  Also/RB
  there/EX
  are/VBP
  stores/NNS
  like/IN
  (NP Adidas/NNP ,Nike/NNP ,Reebok/NNP Center./NNP))

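The grammar I use for chunking only matches NNP tags, but words that are separated by commas still end up in the same chunk (on the same line). The output I want is: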

(S
  (NP John/NNP Rose/NNP Center/NNP)
  is/VBZ
  very/RB
  beautiful/JJ
  place/NN
  and/CC
  i/NN
  want/VBP
  to/TO
  go/VB
  there/RB
  with/IN
  (NP Barbara/NNP Palvin./NNP)
  Also/RB
  there/EX
  are/VBP
  stores/NNS
  like/IN
  (NP Adidas,/NNP)
  (NP Nike,/NNP)
  (NP Reebok/NNP Center./NNP))

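What should I write in "grammar =", or is there a way to edit the output so that it looks like the one above? As you can see, I only parse proper nouns for my named-entity project. Please help.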

1 answer
  • 面试哥 2021-01-29

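    Use word_tokenize(string) instead of string.split(). split() only breaks on whitespace, so punctuation stays attached to the neighboring word (",Nike", "Center."), whereas word_tokenize emits commas and periods as separate tokens. Those tokens get their own tags and therefore fall outside the <NNP>+ chunks.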
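
To see why whitespace splitting glues the commas onto the names, here is a minimal sketch in plain Python. The `simple_tokenize` helper is hypothetical and only a rough regex stand-in for a punctuation-aware tokenizer; it is not the algorithm that `word_tokenize` uses, but it shows the same effect on this sentence.

```python
import re

text = "Also there are stores like Adidas ,Nike ,Reebok Center."

# str.split() only breaks on whitespace, so punctuation stays attached
# to the neighbouring word.
whitespace_tokens = text.split()

def simple_tokenize(s):
    """Hypothetical helper (not NLTK's algorithm): runs of word characters,
    or any single character that is neither whitespace nor a word character."""
    return re.findall(r"\w+|[^\w\s]", s)

punct_tokens = simple_tokenize(text)

print(whitespace_tokens)  # commas and the period stay glued to the words
print(punct_tokens)       # commas and the period become separate tokens
```

With the punctuation split off, a tagger can label "," and "." separately, and a run of NNP tags is interrupted wherever a comma occurs.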

    >>> import nltk
    >>> from nltk.chunk.util import tagstr2tree
    >>> from nltk import word_tokenize, pos_tag
    >>> text = "John Rose Center is very beautiful place and i want to go there with Barbara Palvin. Also there are stores like Adidas ,Nike ,Reebok Center."
    >>> tagged_text = pos_tag(word_tokenize(text))
    >>> 
    >>> grammar = "NP:{<NNP>+}"
    >>> 
    >>> cp = nltk.RegexpParser(grammar)
    >>> result = cp.parse(tagged_text)
    >>> 
    >>> print(result)
    (S
      (NP John/NNP Rose/NNP Center/NNP)
      is/VBZ
      very/RB
      beautiful/JJ
      place/NN
      and/CC
      i/NN
      want/VBP
      to/TO
      go/VB
      there/RB
      with/IN
      (NP Barbara/NNP Palvin/NNP)
      ./.
      Also/RB
      there/EX
      are/VBP
      stores/NNS
      like/IN
      (NP Adidas/NNP)
      ,/,
      (NP Nike/NNP)
      ,/,
      (NP Reebok/NNP Center/NNP)
      ./.)
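
Since the question mentions a named-entity project, the NP chunks can also be collected from the resulting tree as plain strings. The following is a sketch only: the tagged tokens are copied by hand from the tail of the output above so that no tokenizer or tagger data is needed, and `np_phrases` is a hypothetical helper name, not part of NLTK.

```python
import nltk

# Tagged tokens copied from the end of the output above (an assumption made
# so this sketch runs without downloading tokenizer or tagger models).
tagged = [
    ("stores", "NNS"), ("like", "IN"),
    ("Adidas", "NNP"), (",", ","),
    ("Nike", "NNP"), (",", ","),
    ("Reebok", "NNP"), ("Center", "NNP"), (".", "."),
]

cp = nltk.RegexpParser("NP: {<NNP>+}")
tree = cp.parse(tagged)

def np_phrases(tree):
    """Hypothetical helper: join the words of every NP subtree into one string."""
    return [
        " ".join(word for word, _tag in subtree.leaves())
        for subtree in tree.subtrees(filter=lambda t: t.label() == "NP")
    ]

names = np_phrases(tree)
print(names)
```

Because the commas are tagged as "," rather than NNP, each brand name lands in its own NP chunk, while "Reebok Center" stays together as one phrase.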
    

