matrix_multiply.py source file

python

Project: Scalable-Matrix-Multiplication-on-Apache-Spark  Author: Abhishek-Arora
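import sys

from pyspark import SparkConf, SparkContext


def main():
    # Command-line arguments: path of the input matrix and of the output file.
    input_path = sys.argv[1]
    output = sys.argv[2]

    conf = SparkConf().setAppName('Matrix Multiplication')
    sc = SparkContext(conf=conf)
    # Requires Spark 1.5.1 or later (lexicographic string comparison).
    assert sc.version >= '1.5.1'

    # One RDD element per matrix row, split into its space-separated entries.
    rows = sc.textFile(input_path).map(lambda line: line.split(' ')).cache()
    ncol = len(rows.take(1)[0])
    # permutation and add_tuples are not part of this excerpt.
    intermediateResult = rows.map(permutation).reduce(add_tuples)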

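    # Regroup the flat result into rows of ncol elements each.
    result = [intermediateResult[x:x+ncol] for x in range(0, len(intermediateResult), ncol)]

    # Write the matrix on the driver, one space-separated row per line.
    with open(output, 'w') as outputFile:
        for row in result:
            for element in row:
                outputFile.write(str(element) + ' ')
            outputFile.write('\n')

    # Alternative: write the result through Spark instead.
    # outputResult = sc.parallelize(result).coalesce(1)
    # outputResult.saveAsTextFile(output)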
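
The listing calls `permutation` and `add_tuples` without showing them. The sketch below is one plausible reading, not the author's code: it assumes the job computes the product of the transposed input matrix with the input matrix by summing, over all rows, the outer product of each row with itself. The bodies of both helpers and the `gram_matrix` wrapper are hypothetical, and `functools.reduce` stands in for the Spark `map`/`reduce` pair so the example runs without a cluster.

```python
from functools import reduce


def permutation(row):
    """Hypothetical helper: flattened outer product of one row with itself."""
    values = [float(v) for v in row]
    return tuple(a * b for a in values for b in values)


def add_tuples(left, right):
    """Hypothetical helper: element-wise sum of two equal-length tuples."""
    return tuple(l + r for l, r in zip(left, right))


def gram_matrix(rows):
    """Local stand-in for rows.map(permutation).reduce(add_tuples)."""
    ncol = len(rows[0])
    flat = reduce(add_tuples, map(permutation, rows))
    # Regroup the flat tuple of ncol * ncol sums into ncol rows.
    return [list(flat[x:x + ncol]) for x in range(0, len(flat), ncol)]


# Rows arrive as lists of strings, as they would after line.split(' ').
sample = [['1', '2'], ['3', '4']]
print(gram_matrix(sample))  # [[10.0, 14.0], [14.0, 20.0]]
```

Under this assumption each row contributes `ncol * ncol` products, which is why the flat result can be cut into slices of length `ncol` to recover a square matrix.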