tp3_solutions.py source file

python

Project: TPs  Author: DataMiningP7
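from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def try_kmeans(X):
    """ Run the K-Means algorithm on X with different values of K, and return
    the one that gives the best silhouette score.

    Args:
        X: the TF-IDF matrix where each row represents a document and each
           column represents a word, typically obtained by running
           transform_text() from TP2.

    Returns:
        The value of K in [2, 20] with the highest silhouette score.
    """
    best_k = 1
    best_score = -1  # silhouette scores lie in [-1, 1]

    for k in range(2, 20+1):
        model = KMeans(n_clusters=k)
        labels = model.fit_predict(X)
        # Score in the original feature space. model.transform(X) maps to a
        # K-dimensional cluster-distance space that changes with K, so scores
        # computed there are not comparable across values of K.
        score = silhouette_score(X, labels)

        print(k, "->", score)
        if score > best_score:
            best_k = k
            best_score = score

    print("The best K is", best_k)
    return best_k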
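
The selection loop above can be tried on toy data where the right answer is known in advance. The following is a minimal sketch, not part of the original TP: the helper name `silhouette_by_k`, the fixed `seed`, the `n_init=10` setting, and the synthetic blobs from `make_blobs` are all assumptions made for illustration. It scores each K on the original feature space and picks the K with the highest silhouette score.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score


def silhouette_by_k(X, k_values, seed=0):
    """Hypothetical helper: map each K to its silhouette score on X."""
    scores = {}
    for k in k_values:
        # Fixed seed and explicit n_init so repeated runs give the same labels.
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return scores


# Toy data (an assumption): three tight, well-separated groups in 2D.
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X_toy, _ = make_blobs(n_samples=150, centers=centers,
                      cluster_std=0.5, random_state=0)

scores = silhouette_by_k(X_toy, range(2, 7))
best_k = max(scores, key=scores.get)  # K with the highest silhouette score
print("The best K is", best_k)
```

With real TF-IDF input the scores are usually much lower and flatter than on this toy data, so printing the whole `scores` dictionary tends to be more informative than the single best K.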


# Ex3