data_set_gen.py 文件源码

python
阅读 27 收藏 0 点赞 0 评论 0

项目:Multi-channel-speech-extraction-using-DNN 作者: zhr1201 项目源码 文件源码
def transform(audio_data, save_image_path, nFFT=256, overlap=0.75):
    '''audio_data: signals to convert
    save_image_path: path to store the image file'''
    # spectrogram
    freq_data = stft(audio_data, nFFT, overlap)
    freq_data = np.maximum(np.abs(freq_data),
                           np.max(np.abs(freq_data)) / 10000)
    log_freq_data = 20. * np.log10(freq_data / 1e-4)
    N_samples = log_freq_data.shape[0]
    # log_freq_data = np.maximum(log_freq_data, max_m - 70)
    # print(np.max(np.max(log_freq_data)))
    # print(np.min(np.min(log_freq_data)))
    log_freq_data = np.round(log_freq_data)
    log_freq_data = np.transpose(log_freq_data)
    # ipdb.set_trace()

    assert np.max(np.max(log_freq_data)) < 256, 'spectrogram value too large'
    # save the image
    spec_imag = Image.fromarray(log_freq_data)
    spec_imag = spec_imag.convert('RGB')
    spec_imag.save(save_image_path)
    return N_samples
评论列表
文章目录


问题


面经


文章

微信
公众号

扫码关注公众号