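@Override
public Multiset<String> tokenizeToMultiset(String input) {
    // tokenizeToList is not reused here on purpose. Collapsing duplicate
    // tokens into counted multiset entries after each stage means every
    // distinct token is tokenized only once by the next stage.
    Multiset<String> tokens = HashMultiset.create(input.length());
    tokens.add(input);
    Multiset<String> newTokens = HashMultiset.create(input.length());
    for (Tokenizer t : tokenizers) {
        // Visit each distinct token once and carry its count forward.
        // Iterating the multiset directly would visit every occurrence.
        for (Multiset.Entry<String> entry : tokens.entrySet()) {
            for (String piece : t.tokenizeToList(entry.getElement())) {
                newTokens.add(piece, entry.getCount());
            }
        }
        // Swap the two multisets so the cleared one is reused as the buffer.
        Multiset<String> swap = tokens;
        tokens = newTokens;
        newTokens = swap;
        newTokens.clear();
    }
    return tokens;
}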
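The staged tokenization in the method above can be illustrated without Guava. The following is only a minimal sketch under stated assumptions: a plain `Map<String, Integer>` stands in for `Multiset<String>`, and the class name `ChainedTokenizerSketch`, the `stages` parameter, and the two example splitters are hypothetical names that do not come from the original project. It shows how each distinct token can be split once per stage while its count is carried forward.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ChainedTokenizerSketch {

    // Hypothetical stand-in for tokenizeToMultiset: a Map of token -> count
    // plays the role of Guava's Multiset<String>.
    public static Map<String, Integer> tokenize(
            String input, List<Function<String, List<String>>> stages) {
        Map<String, Integer> tokens = new LinkedHashMap<>();
        tokens.put(input, 1);
        for (Function<String, List<String>> stage : stages) {
            Map<String, Integer> next = new LinkedHashMap<>();
            for (Map.Entry<String, Integer> entry : tokens.entrySet()) {
                // Each distinct token is split once; every resulting piece
                // inherits the count of the token it came from.
                for (String piece : stage.apply(entry.getKey())) {
                    next.merge(piece, entry.getValue(), Integer::sum);
                }
            }
            tokens = next;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Two illustrative stages: split on spaces, then on hyphens.
        List<Function<String, List<String>>> stages = Arrays.asList(
                s -> Arrays.asList(s.split(" ")),
                s -> Arrays.asList(s.split("-")));
        System.out.println(tokenize("a-b a-b c", stages)); // prints {a=2, b=2, c=1}
    }
}
```

In this sketch the duplicate `a-b` is split by the second stage only once, yet both `a` and `b` still end up with a count of 2.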
Tokenizers.java source code
java
Project: linkifier