How to make a Matplotlib / Pandas bar chart look like a hist chart?
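Difference between plotting with bar and with hist
Given some data in a pandas.Series, rv, there is a difference between:
- calling hist directly on the data to plot it
- calculating the histogram results (with numpy.histogram) and then plotting them with bar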
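To make the second route concrete, here is a minimal sketch of what numpy.histogram hands back. The sample data, the seed, and the variable names (samples, heights, edges, centers) are illustrative assumptions, not part of the question.

```python
import numpy as np

# Illustrative data only: 1,000 draws from a standard normal with a fixed seed
rng = np.random.default_rng(0)
samples = rng.standard_normal(1000)

# density=True scales the heights so that the histogram integrates to 1
heights, edges = np.histogram(samples, bins=10, density=True)

# Bin widths and midpoints; a bar plot needs these for its x positions
widths = np.diff(edges)
centers = (edges[:-1] + edges[1:]) / 2

# For a density histogram, heights times widths sum to 1
area = float(np.sum(heights * widths))
```

Note that numpy.histogram returns one more edge than it returns heights, which is why the edges have to be turned into centers (or left edges) before they can serve as x values for bar.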
Example data generation
%matplotlib inline
import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib
matplotlib.rcParams['figure.figsize'] = (12.0, 8.0)
matplotlib.style.use('ggplot')
# Setup size and distribution
size = 50000
distribution = stats.norm()
# Create random data
rv = pd.Series(distribution.rvs(size=size))
# Get sane start and end points of distribution
start = distribution.ppf(0.01)
end = distribution.ppf(0.99)
# Build PDF and turn into pandas Series
x = np.linspace(start, end, size)
y = distribution.pdf(x)
pdf = pd.Series(y, x)
# Get histogram of random data
y, x = np.histogram(rv, bins=50, density=True)
# Correct bin edge placement
x = [(a+x[i+1])/2.0 for i,a in enumerate(x[0:-1])]
hist = pd.Series(y, x)
hist() plotting
ax = pdf.plot(lw=2, label='PDF', legend=True)
rv.plot(kind='hist', bins=50, density=True, alpha=0.5, label='Random Samples', legend=True, ax=ax)
bar() plotting
ax = pdf.plot(lw=2, label='PDF', legend=True)
hist.plot(kind='bar', alpha=0.5, label='Random Samples', legend=True, ax=ax)
How can the bar plot be made to look like the hist plot?
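The use case is that only the histogrammed data needs to be saved for later use and plotting (it is usually much smaller than the original data).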
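A minimal sketch of that use case, assuming the histogram is kept as a pandas Series and stored as CSV. The in-memory buffer, the column names bin_center and density, and the seed are hypothetical choices for illustration; a real file path would work the same way.

```python
import io

import numpy as np
import pandas as pd

# Illustrative raw data: 50,000 draws from a standard normal
rng = np.random.default_rng(0)
raw = rng.standard_normal(50000)

# Reduce the raw data to 50 (bin center, density) pairs
heights, edges = np.histogram(raw, bins=50, density=True)
centers = (edges[:-1] + edges[1:]) / 2
hist = pd.Series(heights, index=centers, name="density")

# Save only the histogram; an in-memory buffer stands in for a file here
buffer = io.StringIO()
hist.to_csv(buffer, index_label="bin_center")

# Later: read the histogram back without needing the raw samples
buffer.seek(0)
restored = pd.read_csv(buffer, index_col="bin_center")["density"]
```

The restored Series has 50 rows instead of 50,000, and its index and values are what a later bar plot would consume.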
-
Bar plotting differences
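Getting a bar plot that looks like the hist plot requires overriding some of bar's default behavior:
- Force bar to use the actual x data for the plotting range by passing both x (hist.index) and y (hist.values). By default, bar plots the y data against an arbitrary range and uses the x data as labels.
- Set the width parameter relative to the actual step size of the x data (the default is 0.8).
- Set the align parameter to 'center'.
- Manually set the axis legend.
These changes have to be made through matplotlib's bar() called on the axis (ax) rather than pandas's bar() called on the data (hist).
Example plotting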
%matplotlib inline
import numpy as np
import pandas as pd
import scipy.stats as stats
import matplotlib
matplotlib.rcParams['figure.figsize'] = (12.0, 8.0)
matplotlib.style.use('ggplot')
# Setup size and distribution
size = 50000
distribution = stats.norm()
# Create random data
rv = pd.Series(distribution.rvs(size=size))
# Get sane start and end points of distribution
start = distribution.ppf(0.01)
end = distribution.ppf(0.99)
# Build PDF and turn into pandas Series
x = np.linspace(start, end, size)
y = distribution.pdf(x)
pdf = pd.Series(y, x)
# Get histogram of random data
y, x = np.histogram(rv, bins=50, density=True)
# Replace the bin edges with bin centers
x = [(a+x[i+1])/2.0 for i,a in enumerate(x[0:-1])]
hist = pd.Series(y, x)
# Plot previously histogrammed data
ax = pdf.plot(lw=2, label='PDF', legend=True)
# Bar width is the spacing between adjacent bin centers
w = hist.index[1] - hist.index[0]
ax.bar(hist.index, hist.values, width=w, alpha=0.5, align='center')
ax.legend(['PDF', 'Random Samples'])
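As a variation on the approach above (not taken from the answer itself), the bin edges can be passed to matplotlib directly, with each bar anchored at its left edge and given its own width. This sketch assumes the edges from numpy.histogram are still available; the Agg backend, the seed, and the variable names are illustrative choices so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: 50,000 draws from a standard normal
rng = np.random.default_rng(0)
samples = rng.standard_normal(50000)
heights, edges = np.histogram(samples, bins=50, density=True)

fig, ax = plt.subplots()
# Anchor every bar at its left bin edge and give it the full bin width,
# so adjacent bars touch the way they do in a hist plot
bars = ax.bar(edges[:-1], heights, width=np.diff(edges),
              align="edge", alpha=0.5, label="Random Samples")
ax.legend()
```

Using align='edge' with per-bar widths also copes with unevenly spaced bins, whereas a single width computed from the first two centers assumes uniform spacing.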