Python

How do I make a Matplotlib / Pandas bar chart look like a hist chart?

Posted on 2021-01-29 14:09:46

Plotting differences between `bar` and `hist`

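Given some data in a pandas.Series, rv, there is a difference between:

Calling hist directly on the data to plot it
Calculating the histogram results (with numpy.histogram) and then plotting them with bar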
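
As a minimal sketch of what the second route hands to `bar`, the snippet below calls `numpy.histogram` on some stand-in data. The seed, sample size, and bin count are assumptions made for illustration and are not taken from the question.

```python
import numpy as np

# Stand-in data; the seed is fixed only to make the sketch reproducible.
rng = np.random.default_rng(0)
samples = rng.standard_normal(1000)

# density=True scales the heights so that the bar areas sum to 1.
counts, edges = np.histogram(samples, bins=10, density=True)

# One height per bin, but one more edge than there are bins.
print(len(counts), len(edges))

# Total area under the bars (height times bin width).
area = np.sum(counts * np.diff(edges))
print(area)
```

Because there is one more edge than there are heights, the edges have to be reduced to one x value per bar (for example the bin centers) before a bar plot can use them.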

Example data generation

%matplotlib inline

import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib
matplotlib.rcParams['figure.figsize'] = (12.0, 8.0)
matplotlib.style.use('ggplot')

# Setup size and distribution
size = 50000
distribution = stats.norm()

# Create random data
rv = pd.Series(distribution.rvs(size=size))
# Get sane start and end points of distribution
start = distribution.ppf(0.01)
end = distribution.ppf(0.99)

# Build PDF and turn into pandas Series
x = np.linspace(start, end, size)
y = distribution.pdf(x)
pdf = pd.Series(y, x)

# Get normalized histogram of random data
y, x = np.histogram(rv, bins=50, density=True)
# Use the bin centers (midpoints of adjacent edges) as x values
x = (x[:-1] + x[1:]) / 2.0
hist = pd.Series(y, x)

`hist()` plotting

ax = pdf.plot(lw=2, label='PDF', legend=True)
rv.plot(kind='hist', bins=50, density=True, alpha=0.5, label='Random Samples', legend=True, ax=ax)

`bar()` plotting

ax = pdf.plot(lw=2, label='PDF', legend=True)
hist.plot(kind='bar', alpha=0.5, label='Random Samples', legend=True, ax=ax)

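How can the `bar` plot be made to look like the `hist` plot?

The use case is needing to save only the histogrammed data to use and plot later (it is usually smaller than the original data).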
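
A minimal sketch of that use case, assuming the histogram is stored as CSV: the `density` and `bin_center` names, the in-memory buffer, and the seed are illustrative choices, not part of the question.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the original data (50000 samples, as in the question).
rng = np.random.default_rng(0)
samples = rng.standard_normal(50000)

# Reduce the samples to 50 bin centers and 50 normalized heights.
counts, edges = np.histogram(samples, bins=50, density=True)
centers = (edges[:-1] + edges[1:]) / 2.0
hist = pd.Series(counts, index=centers, name="density")

# Write to an in-memory buffer; a file path would work the same way.
buffer = io.StringIO()
hist.to_csv(buffer, index_label="bin_center")

# Later: reload only the histogram, without the original samples.
buffer.seek(0)
restored = pd.read_csv(buffer, index_col="bin_center")["density"]

print(len(samples), len(restored))
```

Only the 50 rows are kept, so the reloaded Series is all that is available when the plot is drawn later.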

1 Answer

Answered by 面试哥 on 2021-01-29

Bar plotting differences

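Getting a bar plot that looks like the hist plot requires changing some of bar's default behavior.

Force bar to plot against the actual x data by passing both x (hist.index) and y (hist.values). By default, bar plots the y data at arbitrary positions and uses the x data only as tick labels.
Set the width parameter to match the actual step size of the x data (the default is 0.8).
Set the align parameter to 'center'.
Set the axis legend manually.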

These changes have to be made through matplotlib's bar() called on the axis (ax) rather than pandas's bar() called on the data (hist).
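
The difference in x placement can be checked directly on a toy Series. This is a sketch under assumptions: the three-value `hist`, the `Agg` backend, and the variable names are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Toy histogram: heights indexed by bin centers (illustrative values).
hist = pd.Series([0.1, 0.4, 0.1], index=[-1.0, 0.0, 1.0])

fig, (ax_pandas, ax_mpl) = plt.subplots(1, 2)

# pandas: bars sit at positions 0, 1, 2; the index is used only for tick labels.
hist.plot(kind="bar", ax=ax_pandas)
pandas_centers = [p.get_x() + p.get_width() / 2 for p in ax_pandas.patches]

# matplotlib: bars sit at the real x values, one step wide.
step = hist.index[1] - hist.index[0]
ax_mpl.bar(hist.index, hist.values, width=step, align="center")
mpl_centers = [p.get_x() + p.get_width() / 2 for p in ax_mpl.patches]

print(pandas_centers)
print(mpl_centers)
```

A curve such as the PDF, whose x values run from roughly -2.3 to 2.3, lines up with the second set of bars but not with the first.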

Plotting example

%matplotlib inline

import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib
matplotlib.rcParams['figure.figsize'] = (12.0, 8.0)
matplotlib.style.use('ggplot')

# Setup size and distribution
size = 50000
distribution = stats.norm()

# Create random data
rv = pd.Series(distribution.rvs(size=size))
# Get sane start and end points of distribution
start = distribution.ppf(0.01)
end = distribution.ppf(0.99)

# Build PDF and turn into pandas Series
x = np.linspace(start, end, size)
y = distribution.pdf(x)
pdf = pd.Series(y, x)

# Get histogram of random data
y, x = np.histogram(rv, bins=50, density=True)
# Use the bin centers (midpoints of adjacent edges) as x values
x = (x[:-1] + x[1:]) / 2.0
hist = pd.Series(y, x)

# Plot previously histogrammed data
ax = pdf.plot(lw=2, label='PDF', legend=True)
# Bar width is the step between adjacent bin centers (always positive)
w = hist.index[1] - hist.index[0]
ax.bar(hist.index, hist.values, width=w, alpha=0.5, align='center')
ax.legend(['PDF', 'Random Samples'])
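
A possible variation, not part of the answer above: if the bin edges are kept instead of the centers, the bars can be anchored at their left edges with `align='edge'` and given per-bin widths from `np.diff`, which also covers unequal bins. The edge values, seed, and `Agg` backend below are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Stand-in data; the seed is fixed only for reproducibility.
rng = np.random.default_rng(0)
samples = rng.standard_normal(50000)

# Unequal bin edges, chosen only to show per-bin widths.
edges = np.array([-4.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 4.0])
counts, edges = np.histogram(samples, bins=edges, density=True)

fig, ax = plt.subplots()
# Each bar starts at its left edge and spans exactly its own bin.
bars = ax.bar(edges[:-1], counts, width=np.diff(edges),
              align="edge", alpha=0.5, label="Random Samples")
ax.legend()

lefts = [p.get_x() for p in bars]
widths = [p.get_width() for p in bars]
print(lefts)
print(widths)
```

The trade-off is storing one extra number (the final edge) in exchange for not having to infer the width from the spacing of the centers.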