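Python Visualization

Academic papers often use a correlation matrix to show the relationship between each pair of variables in a dataset. It gives a quick overview of the data and is commonly used in exploratory analysis.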
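To make the idea concrete before looking at plotting packages, here is a minimal sketch of what a correlation matrix contains. It uses only the Python standard library; the helper names `pearson` and `correlation_matrix` and the toy data are made up for illustration and do not come from any package discussed in this article.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return {name: {name: r}} covering every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up toy data, for illustration only
data = {
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],  # rises in step with x
    "z": [5, 4, 3, 2, 1],   # falls as x rises
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, [round(row[other], 2) for other in data])
```

Each cell holds the coefficient for one pair of variables, so the matrix is symmetric and its diagonal is 1. The plots in the rest of the article are different ways of coloring and labeling these cells.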

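# Plotting correlation matrices in R

Many R visualization packages can draw correlation matrix plots, including R-ggcorrplot, R-ggstatsplot, and R-corrplot.

## R-ggcorrplot

R-ggcorrplot is an extension package for ggplot2, so it is introduced first.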

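1. Official site: the R-ggcorrplot site is at https://rpkgs.datanovia.com/ggcorrplot/
2. Examples: R-ggcorrplot provides two main functions, ggcorrplot(), which draws the plot, and cor_pmat(), which computes the matrix of correlation p-values. The examples below also customize the theme and other details.

### Example 1: defaults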

```r
library(tidyverse)
library(ggtext)
library(hrbrthemes)
library(wesanderson)
library(LaCroixColoR)
library(ggsci)
library(ggcorrplot)

# Correlation matrix (rounded) and the matching p-value matrix for mtcars
data(mtcars)
corr <- round(cor(mtcars), 1)
p.mat <- cor_pmat(mtcars)
# Colors for the low, mid, and high ends of the correlation scale
colors <- c("#B2182B", "white", "#4D4D4D")

plot01 <- ggcorrplot(corr, colors = colors,
                     ggtheme = hrbrthemes::theme_ipsum(base_family = "Roboto Condensed"))

# Add titles and style them with ggtext::element_markdown()
plot01_cus <- plot01 +
  labs(x = "", y = "",
       title = "Example of ggcorrplot charts makes",
       subtitle = "processed charts with ggcorrplot()",
       caption = "Visualization by DataCharm") +
  theme(
    plot.title = element_markdown(hjust = 0.5, vjust = .5, color = "black",
                                  size = 25, margin = margin(t = 1, b = 12)),
    plot.subtitle = element_markdown(hjust = 0, vjust = .5, size = 20),
    plot.caption = element_markdown(face = "bold", size = 15))
```

![2021-05-27-16-58-35-160536.png](https://cdn.nlark.com/yuque/0/2021/png/396745/1622106377947-14a3e6f2-100f-4f05-b0a7-561558c3c56c.png#align=left&display=inline&height=810&id=ue471babb&margin=%5Bobject%20Object%5D&name=2021-05-27-16-58-35-160536.png&originHeight=810&originWidth=1080&size=2629534&status=done&style=shadow&width=1080)

Example01 of ggcorrplot
### Example 2: circles, lower triangle

```r
# Draw circles, print the coefficients, and keep only the lower triangle
plot02 <- ggcorrplot(corr, colors = colors,
                     method = "circle",
                     outline.color = "black",
                     lab = TRUE,
                     type = "lower",
                     lab_size = 4,
                     ggtheme = hrbrthemes::theme_ipsum(base_family = "Roboto Condensed"))

plot02_cus <- plot02 +
  labs(x = "", y = "",
       title = "Example of <span style='color:#D20F26'>ggcorrplot charts makes</span>",
       subtitle = "processed charts with <span style='color:#1A73E8'>ggcorrplot()</span>",
       caption = "Visualization by <span style='color:#0057FF'>DataCharm</span>") +
  #hrbrthemes::theme_ipsum(base_family = "Roboto Condensed") +
  theme(
    plot.title = element_markdown(hjust = 0.5, vjust = .5, color = "black",
                                  size = 25, margin = margin(t = 1, b = 12)),
    plot.subtitle = element_markdown(hjust = 0, vjust = .5, size = 20),
    plot.caption = element_markdown(face = "bold", size = 15))
```

Figure: Example02 of ggcorrplot

### Example 3: upper triangle

```r
# Upper triangle only; p.mat lets ggcorrplot mark non-significant coefficients
plot03 <- ggcorrplot(cor(mtcars), colors = colors,
                     outline.color = "black",
                     lab = TRUE,
                     type = "upper",
                     p.mat = p.mat,
                     digits = 2,
                     ggtheme = hrbrthemes::theme_ipsum(base_family = "Roboto Condensed"))

plot03_cus <- plot03 +
  labs(x = "", y = "",
       title = "Example of <span style='color:#D20F26'>ggcorrplot charts makes</span>",
       subtitle = "processed charts with <span style='color:#1A73E8'>ggcorrplot()</span>",
       caption = "Visualization by <span style='color:#0057FF'>DataCharm</span>") +
  #hrbrthemes::theme_ipsum(base_family = "Roboto Condensed") +
  theme(
    plot.title = element_markdown(hjust = 0.5, vjust = .5, color = "black",
                                  size = 25, margin = margin(t = 1, b = 12)),
    plot.subtitle = element_markdown(hjust = 0, vjust = .5, size = 20),
    plot.caption = element_markdown(face = "bold", size = 15))
```

Figure: Example03 of ggcorrplot

That covers the basics of plotting with ggcorrplot; the examples above use most of its important parameters.
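For readers who work in Python, the p-value matrix passed as p.mat above can be approximated with SciPy. The sketch below is an assumption-laden illustration, not a port of cor_pmat(): the function name `cor_pmat_sketch`, the choice to leave the diagonal at 0, and the random test data are all made up for this example. It relies on `scipy.stats.pearsonr`, which returns the coefficient and a two-sided p-value.

```python
import numpy as np
from scipy import stats


def cor_pmat_sketch(data):
    """P-values of pairwise Pearson correlation tests between columns.

    `data` is a 2-D array with observations in rows and variables in columns.
    The name and layout are illustrative only.
    """
    data = np.asarray(data, dtype=float)
    n_vars = data.shape[1]
    pmat = np.zeros((n_vars, n_vars))  # diagonal is left at 0 in this sketch
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            _, p = stats.pearsonr(data[:, i], data[:, j])
            pmat[i, j] = pmat[j, i] = p  # the test is symmetric in i and j
    return pmat


# Random illustration data: column 1 depends on column 0, column 2 is independent
rng = np.random.default_rng(0)
x = rng.normal(size=50)
data = np.column_stack([x, 2 * x + 0.1 * rng.normal(size=50), rng.normal(size=50)])

pmat = cor_pmat_sketch(data)
print(np.round(pmat, 3))
```

A small p-value for a pair suggests the correlation is unlikely to be zero; plots such as Example 3 use that information to flag coefficients that do not pass a chosen significance level.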

## R-ggstatsplot

This section covers the package's ggcorrmat() plotting function through the following examples.

### Example 1: basic usage

```r
library(ggstatsplot)

# ggcorrmat() computes the correlations from the raw data frame itself
ggstatsplot01 <- ggcorrmat(
  data = mtcars,
  colors = c("#B2182B", "white", "#4D4D4D"),
  title = "Correlalogram Example of ggstatsplot charts makes",
  subtitle = "processed charts with ggcorrmat()",
  caption = "Visualization by DataCharm",
  ggtheme = hrbrthemes::theme_ipsum(base_family = "Roboto Condensed")
) +
  theme(
    plot.title = element_text(hjust = 0.5, vjust = .5, color = "black",
                              size = 18, margin = margin(t = 1, b = 12)),
    plot.subtitle = element_text(hjust = 0, vjust = .5, size = 16),
    plot.caption = element_text(face = "bold", size = 12))
```

Figure: Example01 of ggstatsplot

### Example 2: customization

```r
# ggcorrplot.args forwards extra styling options to the underlying ggcorrplot() call
ggstatsplot02 <- ggcorrmat(
  data = mtcars,
  matrix.type = "upper",
  ggcorrplot.args = list(lab_col = "black", lab_size = 4, tl.srt = 90,
                         pch.col = "red", pch.cex = 10),
  title = "Correlalogram Example of ggstatsplot charts makes",
  subtitle = "Processed charts with ggcorrmat()",
  caption = "Visualization by DataCharm",
  ggtheme = hrbrthemes::theme_ipsum(base_family = "Roboto Condensed")
) +
  theme(
    plot.title = element_text(hjust = 0.5, vjust = .5, color = "black",
                              size = 18, margin = margin(t = 1, b = 12)),
    plot.subtitle = element_text(hjust = 0, vjust = .5, size = 16),
    plot.caption = element_text(face = "bold", size = 12))
```

Figure: Example02 of ggstatsplot

ggstatsplot can also draw correlation matrices for grouped data, through its grouped_ggcorrmat() function.
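To illustrate what a grouped correlation matrix means, here is a rough Python sketch using pandas. It is not how ggstatsplot works internally; the data frame, the column names `group`, `x`, and `y`, and the random values are all invented for this example. The point is only that each group gets its own matrix.

```python
import numpy as np
import pandas as pd

# Invented data: two groups of 30 rows each
rng = np.random.default_rng(1)
n = 30
df = pd.DataFrame({
    "group": ["a"] * n + ["b"] * n,
    "x": rng.normal(size=2 * n),
})
# y follows x in group "a" and runs against x in group "b"
sign = np.where(df["group"] == "a", 1.0, -1.0)
df["y"] = sign * df["x"] + 0.1 * rng.normal(size=2 * n)

# One correlation matrix per group; the result is indexed by (group, variable)
grouped_corr = df.groupby("group")[["x", "y"]].corr()
print(grouped_corr.round(2))

# Pull out the matrix for a single group
corr_a = grouped_corr.loc["a"]
corr_b = grouped_corr.loc["b"]
```

Computed over the whole table, the two opposite relationships would largely cancel out, which is why splitting by group before computing the matrix can matter.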

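## R-corrplot

With the ggplot2-based approaches covered, this section turns to the R-corrplot package.

1. Official site: usage of R-corrplot is documented at https://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html
2. Examples: R-corrplot has its own plotting syntax, so only two short examples are shown here for comparison; see the official site for more.

### Example 1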

```r
library(corrplot)

# Correlation matrix to plot
M <- cor(mtcars)

# Set the font for base graphics; opar keeps the previous settings
opar <- par(family = "Roboto Condensed")
col1 <- colorRampPalette(c("#B2182B", "white", "#4D4D4D"))
corrplot(M, type = "upper", method = "ellipse", col = col1(100),
         order = "hclust", addrect = 2,
         tl.col = "black", tl.srt = 45)
# Title and caption are added as margin text
mtext(text = "Example Of Corrplot", side = 1, line = -4,
      col = "black", font = 4, adj = 0.05, cex = 2)
mtext(text = "Visualization by DataCharm", side = 1,
      line = -1, col = "black", font = 3, adj = 0.05, cex = 1)
```

Figure: Example01 of corrplot

### Example 2: mixed style

```r
# Ellipses in the lower triangle, circles in the upper triangle
corrplot.mixed(M, lower = "ellipse", upper = "circle",
               tl.col = "black", tl.srt = 45)
```

Figure: Example02 of corrplot

# Plotting correlation matrices in Python

After the R approaches, here is a brief look at drawing the same kind of plot in Python, shown directly as an example:
```python
from string import ascii_letters
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme(style="white")

# Generate a random dataset with 26 columns named A to Z
rs = np.random.RandomState(33)
d = pd.DataFrame(data=rs.normal(size=(100, 26)),
                 columns=list(ascii_letters[26:]))

# Compute the correlation matrix
corr = d.corr()

# Mask the upper triangle (including the diagonal)
mask = np.triu(np.ones_like(corr, dtype=bool))

f, ax = plt.subplots(figsize=(11, 9))

# Diverging colormap centered on zero
cmap = sns.diverging_palette(230, 20, as_cmap=True)

# Draw the heatmap with the mask applied
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3, center=0,
            square=True, linewidths=.5, cbar_kws={"shrink": .5})
plt.show()
```

Figure: Example of seaborn.heatmap
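The mask is the step in the seaborn example that decides which cells are drawn, so it may help to see it on a small matrix. The sketch below uses made-up values in a 3 x 3 array; the variable names `visible` and `mask_keep_diag` are illustrative only. seaborn.heatmap leaves a cell blank wherever the mask is True.

```python
import numpy as np

# A small symmetric matrix with made-up values, for illustration only
corr = np.array([
    [1.0, 0.8, -0.3],
    [0.8, 1.0, 0.1],
    [-0.3, 0.1, 1.0],
])

# True on and above the diagonal: those cells would be hidden
mask = np.triu(np.ones_like(corr, dtype=bool))

# Same idea, but k=1 starts one step above the diagonal, so the diagonal stays visible
mask_keep_diag = np.triu(np.ones_like(corr, dtype=bool), k=1)

# Cells that remain after masking (hidden cells shown as NaN)
visible = np.where(mask, np.nan, corr)

print(mask)
print(mask_keep_diag)
print(visible)
```

Because a correlation matrix is symmetric, hiding one triangle removes only duplicated information. Whether to keep the diagonal is a matter of taste, since its values are always 1.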