VPMapOnlyMapper.java source code

Project: PigSPARQL
@Override
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
    String[] parsedTriple = rdfParser.parseTriple(value.toString());
    if (parsedTriple != null) {
        // Convert literals to Pig types, if possible
        parsedTriple[2] = Util.toPigTypes(parsedTriple[2]);
        // Use Predicate for Vertical Partitioning
        multipleOutputs.write(NullWritable.get(), new Text(parsedTriple[0] + "\t" + parsedTriple[2]),
                Util.generateFileName(parsedTriple[1]));
        // Write all parsed triples also to "inputData" for queries where Predicate is not known
        multipleOutputs.write(NullWritable.get(), new Text(parsedTriple[0] + "\t" + parsedTriple[1] + "\t" + parsedTriple[2]),
                Util.generateFileName("inputData"));
        context.getCounter("RDF Dataset Properties", VALID_TRIPLES).increment(1);
    } else {
        if (value.getLength() == 0 || value.toString().startsWith("@")) {
            System.out.println("IGNORING: " + value);
            context.getCounter("RDF Dataset Properties", IGNORED_LINES).increment(1);
        } else {
            System.out.println("DISCARDED: " + value);
            context.getCounter("RDF Dataset Properties", INVALID_TRIPLES).increment(1);
        }
    }
}
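
The mapper above vertically partitions an RDF dataset: each valid triple is written as a (subject, object) pair to an output named after its predicate, and also as a full triple to "inputData". The following is a minimal in-memory sketch of that idea without Hadoop. The class name VerticalPartitionSketch and the whitespace-based parseTriple are assumptions made for illustration; they are not PigSPARQL's rdfParser or Util, and the sketch skips the Pig type conversion and file name generation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class VerticalPartitionSketch {

    // Hypothetical stand-in for rdfParser.parseTriple: returns {subject, predicate, object},
    // or null for empty lines, "@" directives, and lines without a terminating dot.
    static String[] parseTriple(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("@") || !trimmed.endsWith(".")) {
            return null;
        }
        // Drop the terminating dot, then split into at most three fields
        // so that an object containing spaces stays in one piece.
        String body = trimmed.substring(0, trimmed.length() - 1).trim();
        String[] parts = body.split("\\s+", 3);
        return parts.length == 3 ? parts : null;
    }

    // Groups "subject<TAB>object" rows by predicate and keeps every full
    // triple under the key "inputData", mirroring the two writes in the mapper.
    static Map<String, List<String>> partition(List<String> lines) {
        Map<String, List<String>> tables = new LinkedHashMap<>();
        for (String line : lines) {
            String[] t = parseTriple(line);
            if (t == null) {
                continue; // the real mapper counts these as ignored or invalid
            }
            tables.computeIfAbsent(t[1], k -> new ArrayList<>())
                  .add(t[0] + "\t" + t[2]);
            tables.computeIfAbsent("inputData", k -> new ArrayList<>())
                  .add(t[0] + "\t" + t[1] + "\t" + t[2]);
        }
        return tables;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "<s1> <p1> <o1> .",
                "@prefix ex: <http://example.org/> .",
                "<s2> <p2> \"some literal\" .");
        // Print one block per partition: the key, then its rows.
        partition(lines).forEach((name, rows) -> {
            System.out.println(name);
            rows.forEach(row -> System.out.println("  " + row));
        });
    }
}
```

Storing one two-column table per predicate lets a query with a bound predicate read only that table, while the "inputData" copy covers queries where the predicate is a variable.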