metrics.py source file

Project: lens  Author: ASIDataScience
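import numpy as np
from scipy import signal

# DENSITY_N and _compute_gaussian_kernel are not shown in this excerpt.


def _compute_smoothed_histogram2d(values,
                                  bandwidth,
                                  coord_ranges,
                                  logtrans=(False, False)):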
    """Approximate 2-D density estimation.

    Estimate 2-D probability densities at evenly-spaced grid points,
    for specified data. This method is based on creating a 2-D histogram of
    data points quantised with respect to evenly-spaced grid points.
    Probability densities are then estimated at the grid points by convolving
    the obtained histogram with a Gaussian kernel.

    Parameters
    ----------
    values : np.array (N,2)
        A 2-D array containing the data for which to perform density
        estimation. Successive data points are indexed by the first axis in the
        array. The second axis indexes x and y coordinates of data points
        (values[:,0] and values[:,1] respectively).
    bandwidth : array-like (2,)
        The desired KDE bandwidths for x and y axes. (When log-transformation
        of data is desired, bandwidths should be specified in log-space.)
    coord_ranges : array-like (2,2)
        Minimum and maximum values of the x and y coordinates on which to
        evaluate the smoothed histogram.
    logtrans : array-like (2,)
        A 2-element boolean array specifying whether or not to log-transform
        the x or y coordinates of the data before performing density
        estimation.

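    Returns
    -------
    x_edges : np.array (DENSITY_N,)
        Lower bin edges along the x axis.
    y_edges : np.array (DENSITY_N,)
        Lower bin edges along the y axis.
    pdf : np.array (DENSITY_N, DENSITY_N)
        Estimated probability densities on the grid. The first axis indexes
        y bins and the second x bins, because the histogram is transposed
        before smoothing.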
    """
    bin_edges = []
    bedge_range = []
    for minmax, lt in zip(coord_ranges, logtrans):
        if lt:
            ber = [np.log10(extreme) for extreme in minmax]
            bin_edges.append(np.logspace(*ber, num=DENSITY_N + 1))
            bedge_range.append(ber[1] - ber[0])
        else:
            bin_edges.append(np.linspace(*minmax, num=DENSITY_N + 1))
            bedge_range.append(minmax[1] - minmax[0])

    # Bin the observations
    H = np.histogram2d(values[:, 0], values[:, 1], bins=bin_edges)[0]

    relative_bw = [bw / berange for bw, berange in zip(bandwidth, bedge_range)]
    K = _compute_gaussian_kernel(H.shape, relative_bw)

    pdf = signal.fftconvolve(H.T, K, mode='same')

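    # Normalize pdf so that it integrates to one over the bin-center grid
    # (NumPy 2.0 deprecates np.trapz in favor of np.trapezoid)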
    bin_centers = [edges[:-1] + np.diff(edges) / 2. for edges in bin_edges]
    pdf /= np.trapz(np.trapz(pdf, bin_centers[1]), bin_centers[0])

    # Return lower bin edges and density
    return bin_edges[0][:-1], bin_edges[1][:-1], pdf
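
The approach described in the docstring (bin the data on a regular grid, then convolve the histogram with a Gaussian kernel) can be sketched in a self-contained form. This is only an illustration under stated assumptions, not the lens implementation: `GRID_N`, `gaussian_kernel`, and `smoothed_histogram2d` are hypothetical names, the kernel is a guess at what `_compute_gaussian_kernel` might do, log-transformation is omitted, the histogram is not transposed, and normalization uses a simple Riemann sum rather than `np.trapz`.

```python
import numpy as np
from scipy import signal

GRID_N = 50  # hypothetical stand-in for the module's DENSITY_N


def gaussian_kernel(shape, relative_bw):
    """Hypothetical stand-in for _compute_gaussian_kernel.

    relative_bw holds each axis's bandwidth as a fraction of the grid range.
    """
    axes = []
    for n, rbw in zip(shape, relative_bw):
        sigma = rbw * n  # bandwidth expressed in bins
        # Center on the index that fftconvolve(mode="same") treats as the
        # origin, so the smoothed output is not shifted.
        offsets = np.arange(n) - (n - 1) // 2
        axes.append(np.exp(-0.5 * (offsets / sigma) ** 2))
    kernel = np.outer(axes[0], axes[1])
    return kernel / kernel.sum()


def smoothed_histogram2d(values, bandwidth, coord_ranges):
    """Histogram-plus-convolution density estimate on a linear grid."""
    edges = [np.linspace(lo, hi, GRID_N + 1) for lo, hi in coord_ranges]
    spans = [hi - lo for lo, hi in coord_ranges]

    # Bin the observations; hist[i, j] counts points in x bin i, y bin j.
    hist = np.histogram2d(values[:, 0], values[:, 1], bins=edges)[0]

    # Convert bandwidths from data units to fractions of the grid range.
    relative_bw = [bw / span for bw, span in zip(bandwidth, spans)]
    kernel = gaussian_kernel(hist.shape, relative_bw)

    density = signal.fftconvolve(hist, kernel, mode="same")
    density = np.clip(density, 0.0, None)  # remove tiny negative FFT round-off

    # Normalize so that the density sums to one over the grid cells.
    cell_area = (spans[0] / GRID_N) * (spans[1] / GRID_N)
    density /= density.sum() * cell_area

    centers = [e[:-1] + np.diff(e) / 2.0 for e in edges]
    return centers[0], centers[1], density


rng = np.random.default_rng(0)
samples = rng.normal(size=(2000, 2))
x_c, y_c, density = smoothed_histogram2d(
    samples, bandwidth=(0.3, 0.3), coord_ranges=[(-5, 5), (-5, 5)]
)
```

Unlike the function above, this sketch returns bin centers rather than lower bin edges, and `density[i, j]` refers to x bin `i` and y bin `j`.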