使用readxl包导入excel文件
- excel_sheet()
- read_excel()
  - 几个参数
另外一种导入Excel 方式：gdata包
- gdata 和readxl 包对比
- read.xls() 导入文件
打通excel和R的包：XLConnect
使用XLConnect 修改数据

使用readxl包导入excel文件

主要包含两个函数excel_sheets()，read_excel()

excel_sheet()

用于提取excel 中的表单

# Load the readxl package
library(readxl)
# Print the names of all worksheets
excel_sheets("urbanpop.xlsx")

用于读取excel 表单中的信息到R

read_excel()

# The readxl package is already loaded
# Read the sheets, one by one
pop_1 <- read_excel("urbanpop.xlsx", sheet = 1)
pop_2 <- read_excel("urbanpop.xlsx", sheet = 2)
pop_3 <- read_excel("urbanpop.xlsx", sheet = 3)
# Put pop_1, pop_2 and pop_3 in a list: pop_list
pop_list <- list(pop_1, pop_2, pop_3)
# Display the structure of pop_list
str(pop_list)

通过lapply 函数可以直接将提取的表单传递给read_excel() 函数。

# The readxl package is already loaded
# Read all Excel sheets with lapply(): pop_list
pop_list <- lapply(excel_sheets('urbanpop.xlsx'), read_excel, path = "urbanpop.xlsx")
# Display the structure of pop_list
str(pop_list)

几个参数

默认参数设置为

01. 读取Excel 文件 - 图2

col_types

可以通过向量进行赋值，如text, blank, numeric, date 等。

sheet

选择Excel表格中选定的表单。

skip

类似之前readr包提及的skip。用于跳过某些行内容。

col_names

默认下col_names 值为TRUE，即函数不会自动命名。可以通过赋值或改为FALSE的方式，自定义命名或依靠函数自动命名。

这里可以使用一个小技巧，通过paste() 批量连接信息。

paste(“a”, 0:10)，即代表生成 “a0”, “a1”…”a10”

# The readxl package is already loaded
# Import the first Excel sheet of urbanpop_nonames.xlsx (R gives names): pop_a
pop_a <- read_excel("urbanpop_nonames.xlsx", col_names = FALSE)
# Import the first Excel sheet of urbanpop_nonames.xlsx (specify col_names): pop_b
cols <- c("country", paste0("year_", 1960:1966))
pop_b <- read_excel("urbanpop_nonames.xlsx", col_names = cols)
# Print the summary of pop_a
summary(pop_a)
# Print the summary of pop_b
summary(pop_b)

另外一种导入Excel 方式：gdata包

gdata 原理：

01. 读取Excel 文件 - 图3

gdata 和readxl 包对比

01. 读取Excel 文件 - 图4

主要因为readxl 包还在发展，很多功能不完善，而且可能语法会变换。

因此选择gdata 这个成熟的包学习，会更加保险一些。

read.xls() 导入文件

# Import the second sheet of urbanpop.xls: urban_pop
urban_pop <- read.xls("urbanpop.xls", sheet = "1967-1974")

通过cbind() 可以添加data.frame或matrix 等信息
data_frame[-1]，可以去除第一列的信息。
na.omit可以用来除去data.frame 中的NA 信息。

例子

# Add code to import data from all three sheets in urbanpop.xls
path <- "urbanpop.xls"
urban_sheet1 <- read.xls(path, sheet = 1, stringsAsFactors = FALSE)
urban_sheet2 <- read.xls(path, sheet = 2, stringsAsFactors = FALSE)
urban_sheet3 <- read.xls(path, sheet = 3, stringsAsFactors = FALSE)
# Extend the cbind() call to include urban_sheet3: urban
urban <- cbind(urban_sheet1, urban_sheet2[-1], urban_sheet3[-1])
# Remove all rows with NAs from urban: urban_clean
urban_clean <- na.omit(urban)
# Print out a summary of urban_clean
summary(urban_clean)

打通excel和R的包：XLConnect

一个应用了Java的包（安装可能需要java 环境）。

几乎可以实现使用R代码进行所有excel 可以进行的操作。

loadWorkbook()

加载excel 的表格。功能是创建在R中创建一个workbook，用于连接excel文件和R工作区。可以将其赋值给一个变量。

# Load the XLConnect package
library(XLConnect)
# Build connection to urbanpop.xlsx: my_book
my_book <- loadWorkbook("urbanpop.xlsx")

getsheet()

用于列出excel 文件中的所有列表

getSheets(my_book)

readWorksheet()

读取表格信息。

readWorksheet 一般有四个参数。object 为表格对象，一般为需先经过loadWorkbook() 处理；sheet 表示表格信息，startCol 表示开始的行数，endCol 表示结束的行数。

# XLConnect is already available
# Build connection to urbanpop.xlsx
my_book <- loadWorkbook("urbanpop.xlsx")
# Import columns 3, 4, and 5 from second sheet in my_book: urbanpop_sel
urbanpop_sel <- readWorksheet(my_book, sheet = 2, startCol = 3, endCol = 5)
# Import first column from second sheet in my_book: countries
countries <- readWorksheet(my_book, sheet = 2, startCol = 1, endCol = 1)
# cbind() urbanpop_sel and countries together: selectioncbind(urbanpop_sel, countries)
selection <- cbind(countries, urbanpop_sel)

使用XLConnect 修改数据

createSheet()

createSheet(object, name = )

创建一个空的表格

# Add a worksheet to my_book, named "data_summary"
createSheet(my_book, name = "data_summary")

writeWorksheet()

将新的表格信息写入到某个表格中。

writeWorksheet(object, new_object, sheet = )

# Add data in summ to "data_summary" sheet
writeWorksheet(my_book, summ, sheet = "data_summary")

saveWorkbook()

所有的编辑结束之后需要使用该函数进行文件的保存。（类似于进行excel操作后得保存文件，否则所有内容都付之东流了。）

saveWorkbook(object, flie = )

# Save workbook as summary.xlsx
saveWorkbook(my_book, file = "summary.xlsx")

renameSheet()

对表格进行重命名

renameSheet(object, 'old_name', 'new_name' )

# Rename "data_summary" sheet to "summary"
renameSheet(my_book,sheet = 4, "summary")

removeSheet()

移除整个表格

`removeSheet(object, sheet = )

# Remove the fourth sheet
removeSheet(my_book, sheet = 4)