isp_data_pollution.py 文件源码

python
阅读 37 收藏 0 点赞 0 评论 0

项目:isp-data-pollution 作者: essandess 项目源码 文件源码
def draw_domain(self,log_sampling=False):
        """ Draw a single, random domain. """
        domain = None
        domain_array = np.array([dmn for dmn in self.domain_links])
        domain_count = np.array([len(self.domain_links[domain_array[k]]) for k in range(domain_array.shape[0])])
        p = np.array([np.float(c) for c in domain_count])
        count_total = p.sum()
        if log_sampling:  # log-sampling [log(x+1)] to bias lower count domains
            p = np.fromiter((np.log1p(x) for x in p), dtype=p.dtype)
        if count_total > 0:
            p = p/p.sum()
            cnts = npr.multinomial(1, pvals=p)
            k = int(np.nonzero(cnts)[0])
            domain = domain_array[k]
        return domain
评论列表
文章目录


问题


面经


文章

微信
公众号

扫码关注公众号