Function reference

    Help on function bar in module plotly.express._chart_types:

    bar(data_frame=None, x=None, y=None, color=None, facet_row=None, facet_col=None, facet_col_wrap=0, hover_name=None, hover_data=None, custom_data=None, text=None, error_x=None, error_x_minus=None, error_y=None, error_y_minus=None, animation_frame=None, animation_group=None, category_orders={}, labels={}, color_discrete_sequence=None, color_discrete_map={}, color_continuous_scale=None, range_color=None, color_continuous_midpoint=None, opacity=None, orientation='v', barmode='relative', log_x=False, log_y=False, range_x=None, range_y=None, title=None, template=None, width=None, height=None)
        In a bar plot, each row of `data_frame` is represented as a rectangular
        mark.

        Parameters
        ----------
        data_frame: DataFrame or array-like or dict
            This argument needs to be passed for column names (and not keyword
            names) to be used. Array-like and dict are tranformed internally to a
            pandas DataFrame. Optional: if missing, a DataFrame gets constructed
            under the hood using the other arguments.
        x: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            position marks along the x axis in cartesian coordinates. For
            horizontal histograms, these values are used as inputs to `histfunc`.
        y: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            position marks along the y axis in cartesian coordinates. For vertical
            histograms, these values are used as inputs to `histfunc`.
        color: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            assign color to marks.
        facet_row: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            assign marks to facetted subplots in the vertical direction.
        facet_col: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            assign marks to facetted subplots in the horizontal direction.
        facet_col_wrap: int
            Maximum number of facet columns. Wraps the column variable at this
            width, so that the column facets span multiple rows. Ignored if 0, and
            forced to 0 if `facet_row` or a `marginal` is set.
        hover_name: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like appear in bold
            in the hover tooltip.
        hover_data: list of str or int, or Series or array-like, or dict
            Either a list of names of columns in `data_frame`, or pandas Series, or
            array_like objects or a dict with column names as keys, with values
            True (for default formatting) False (in order to remove this column
            from hover information), or a formatting string, for example ':.3f' or
            '|%a' or list-like data to appear in the hover tooltip or tuples with a
            bool or formatting string as first element, and list-like data to
            appear in hover as second element Values from these columns appear as
            extra data in the hover tooltip.
        custom_data: list of str or int, or Series or array-like
            Either names of columns in `data_frame`, or pandas Series, or
            array_like objects Values from these columns are extra data, to be used
            in widgets or Dash callbacks for example. This data is not user-visible
            but is included in events emitted by the figure (lasso selection etc.)
        text: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like appear in the
            figure as text labels.
        error_x: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            size x-axis error bars. If `error_x_minus` is `None`, error bars will
            be symmetrical, otherwise `error_x` is used for the positive direction
            only.
        error_x_minus: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            size x-axis error bars in the negative direction. Ignored if `error_x`
            is `None`.
        error_y: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            size y-axis error bars. If `error_y_minus` is `None`, error bars will
            be symmetrical, otherwise `error_y` is used for the positive direction
            only.
        error_y_minus: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            size y-axis error bars in the negative direction. Ignored if `error_y`
            is `None`.
        animation_frame: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            assign marks to animation frames.
        animation_group: str or int or Series or array-like
            Either a name of a column in `data_frame`, or a pandas Series or
            array_like object. Values from this column or array_like are used to
            provide object-constancy across animation frames: rows with matching
            `animation_group`s will be treated as if they describe the same object
            in each frame.
        category_orders: dict with str keys and list of str values (default `{}`)
            By default, in Python 3.6+, the order of categorical values in axes,
            legends and facets depends on the order in which these values are first
            encountered in `data_frame` (and no order is guaranteed by default in
            Python below 3.6). This parameter is used to force a specific ordering
            of values per column. The keys of this dict should correspond to column
            names, and the values should be lists of strings corresponding to the
            specific display order desired.
        labels: dict with str keys and str values (default `{}`)
            By default, column names are used in the figure for axis titles, legend
            entries and hovers. This parameter allows this to be overridden. The
            keys of this dict should correspond to column names, and the values
            should correspond to the desired label to be displayed.
        color_discrete_sequence: list of str
            Strings should define valid CSS-colors. When `color` is set and the
            values in the corresponding column are not numeric, values in that
            column are assigned colors by cycling through `color_discrete_sequence`
            in the order described in `category_orders`, unless the value of
            `color` is a key in `color_discrete_map`. Various useful color
            sequences are available in the `plotly.express.colors` submodules,
            specifically `plotly.express.colors.qualitative`.
        color_discrete_map: dict with str keys and str values (default `{}`)
            String values should define valid CSS-colors Used to override
            `color_discrete_sequence` to assign a specific colors to marks
            corresponding with specific values. Keys in `color_discrete_map` should
            be values in the column denoted by `color`.
        color_continuous_scale: list of str
            Strings should define valid CSS-colors This list is used to build a
            continuous color scale when the column denoted by `color` contains
            numeric data. Various useful color scales are available in the
            `plotly.express.colors` submodules, specifically
            `plotly.express.colors.sequential`, `plotly.express.colors.diverging`
            and `plotly.express.colors.cyclical`.
        range_color: list of two numbers
            If provided, overrides auto-scaling on the continuous color scale.
        color_continuous_midpoint: number (default `None`)
            If set, computes the bounds of the continuous color scale to have the
            desired midpoint. Setting this value is recommended when using
            `plotly.express.colors.diverging` color scales as the inputs to
            `color_continuous_scale`.
        opacity: float
            Value between 0 and 1. Sets the opacity for markers.
        orientation: str (default `'v'`)
            One of `'h'` for horizontal or `'v'` for vertical)
        barmode: str (default `'relative'`)
            One of `'group'`, `'overlay'` or `'relative'` In `'relative'` mode,
            bars are stacked above zero for positive values and below zero for
            negative values. In `'overlay'` mode, bars are drawn on top of one
            another. In `'group'` mode, bars are placed beside each other.
        log_x: boolean (default `False`)
            If `True`, the x-axis is log-scaled in cartesian coordinates.
        log_y: boolean (default `False`)
            If `True`, the y-axis is log-scaled in cartesian coordinates.
        range_x: list of two numbers
            If provided, overrides auto-scaling on the x-axis in cartesian
            coordinates.
        range_y: list of two numbers
            If provided, overrides auto-scaling on the y-axis in cartesian
            coordinates.
        title: str
            The figure title.
        template: str or dict or plotly.graph_objects.layout.Template instance
            The figure template name (must be a key in plotly.io.templates) or
            definition.
        width: int (default `None`)
            The figure width in pixels.
        height: int (default `None`)
            The figure height in pixels.

        Returns
        -------
        plotly.graph_objects.Figure

The simplest bar chart

Source code

    # Import the plotly express package
    import plotly.express as px
    # Load the data
    data_canada = px.data.gapminder().query("country == 'Canada'")
    # Draw the figure
    fig = px.bar(data_canada, x='year', y='pop')
    # Display the figure
    fig.show()

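My analysis

This is the simplest possible workflow: import the package, load the data, draw the figure, and finally display it. The data is the Gapminder sample dataset that ships with plotly (`px.data.gapminder()`), so it is read from the installed package rather than downloaded. In essence it is just a DataFrame in the standard pandas format. A preview of this dataset is shown below.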
image.png
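The data can be exported directly with the following commands:

    import plotly.express as px
    # Load the data for Canada
    data_canada = px.data.gapminder().query("country == 'Canada'")
    # Save the DataFrame as a CSV file in the current directory
    data_canada.to_csv("data_canada.csv")
    # Draw and display the figure
    fig = px.bar(data_canada, x='year', y='pop')
    fig.show()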

The result looks like this:
figure.png
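The last step is saving the figure, which is just as convenient:

    import plotly.express as px
    # Load the data for Canada
    data_canada = px.data.gapminder().query("country == 'Canada'")
    # Save the DataFrame as a CSV file
    data_canada.to_csv("test.csv")
    # Draw and display the figure
    fig = px.bar(data_canada, x='year', y='pop')
    fig.show()
    # Save the figure as a static image (requires an export engine such as kaleido)
    fig.write_image('figure.svg')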

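`fig.write_image` can export images in several formats (png, jpg, svg, pdf, eps), including vector graphics, which covers most needs. Static image export depends on an additional export engine, such as the `kaleido` package, being installed.
Overall, plotly makes the path from plotting to saving very convenient, and it fits well with data that has been cleaned in pandas.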

Custom parameters

Source code

    import plotly.express as px
    data = px.data.gapminder()
    data_canada = data[data.country == 'Canada']
    fig = px.bar(data_canada, x='year', y='pop',
                 hover_data=['lifeExp', 'gdpPercap'], color='lifeExp',
                 labels={'pop':'population of Canada'}, height=400)
    fig.show()

Figure

newplot.png

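Interpretation

The bar colors are determined by the `color` parameter, which in this figure is the `lifeExp` column.
`hover_data` applies to the interactive web view: when you hover over a bar, the tooltip also shows these two values. They are not visible in an exported static image.
`labels` sets the display names used for the axis titles, legend entries, and hover text. Here the `pop` column is displayed under the longer name `population of Canada`.
`width` and `height` set the width and height of the figure in pixels. This example sets only `height`.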

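Stacked bar chart

When many y values share the same x value, the bars are stacked on top of each other, giving a stacked bar chart.

Source code

    import plotly.express as px
    df = px.data.tips()
    fig = px.bar(df, x="sex", y="total_bill", color='time')
    fig.show()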

This again uses one of the official sample datasets. You can inspect the data in the same way as in the example above.

Figure

newplot (1).png

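Interpretation

This is really an extreme case in which a large number of y values share a single x value. Real data usually does not look like this.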

Grouped bar chart

Source code

    # Change the default stacking
    import plotly.express as px
    # Load the tips sample dataset used in the previous example
    df = px.data.tips()
    fig = px.bar(df, x="sex", y="total_bill", color='smoker', barmode='group',
                 height=400)
    fig.show()

Figure

newplot.png

Interpretation

The main change here is the `barmode` parameter, which determines how bars that share an x value are laid out.