⭐️⭐️⭐️⭐️⭐️ (Preparation before hands-on practice)
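Figures and code in this post come from the 生信技能树 training course. Thanks to all the teachers for their teaching and preparatory work.
I. Reading a table file in: producing a data frame (PS: reading a table file into R produces a data frame, and changes to the data frame are not synced back to the file). When saving results, do not reuse the original file's name, so the original file is not overwritten.
1. Common table file extension: csv (for plain-text files the extension itself carries no meaning; it does not change the content)
(1) Can be opened with Sublime Text (recommended)
(2) Reading with R: the result can be assigned to a variable (check that the format after reading is what we want). Options are the read functions (the standard way) and the rio package (friendly to beginners and the lazy ⭐️⭐️⭐️)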
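ex1 = read.table("ex1.txt") # read a txt file
ex1 = read.table("ex1.txt", header = TRUE) # treat the first row as column names
ex2 = read.csv("ex2.csv") # read a csv file
ex2 = read.csv("ex2.csv",
               row.names = 1,       # use the first column as row names
               check.names = FALSE) # keep column names exactly as in the file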
# rio: a must-have import/export package for the lazy ⭐️⭐️⭐️
library(rio)
#import()
x <- import("mtcars.csv")
# when a file has no extension, the format can be specified explicitly
a=import("mtcars1", format = "csv")
#Importing Data Lists
b=import_list("list.xlsx")
#export
export(mtcars, "mtcars.csv")
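When a read fails (the result does not come out in the shape we need), some arguments have to be specified explicitly, such as header, row.names and check.names in the read.table/read.csv calls above.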
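A small sketch of how those arguments change the result. The file contents and the gene/sample names are made up for illustration and are not the course data:

```r
# Hypothetical example data: a small csv whose first column holds gene names
# and whose column names contain a character ("-") that R would normally alter.
tmp <- tempfile(fileext = ".csv")
writeLines(c("gene,sample-1,sample-2",
             "g1,1.5,2.5",
             "g2,3.0,4.0"), tmp)

# Defaults: gene names stay an ordinary column, and "-" is rewritten to "."
d1 <- read.csv(tmp)
colnames(d1)

# Explicit arguments: first column becomes row names, column names kept as-is
d2 <- read.csv(tmp, row.names = 1, check.names = FALSE)
colnames(d2)
rownames(d2)
```

Comparing colnames(d1) with colnames(d2) right after reading is a quick way to confirm the data frame has the layout you expect before going further.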
About the rio package: The idea behind rio is to simplify the process of importing data into R and exporting data from R. rio aims to unify data I/O (importing and exporting) into two simple functions: import() and export() so that beginners (and experienced R users) never have to think twice (or even once) about the best way to read and write R data. The core advantage of rio is that it makes assumptions that the user is probably willing to make, for example that a file's extension indicates its type. By taking away the need to manually match a file type (which a beginner may not recognize) to a particular import or export function, rio allows almost all common data formats to be read with the same function.
# Data import: rio allows you to import files in almost any format using one, typically single-argument, function. import() infers the file format from the file's extension and calls the appropriate data import function for you, returning a simple data.frame. This works for any of the formats rio supports.
# Missing extension: If for some reason a file does not have an extension, or has a file extension that does not match its actual type, you can manually specify a file format to override the format inference step.
# Importing lists (Excel workbooks): Sometimes you may have multiple data files that you want to import. import() only ever returns a single data frame, but import_list() can be used to import a vector of file names into R. This works even if the files are different formats. Similarly, some single-file formats (e.g. Excel Workbooks, Zip directories, HTML files, etc.) can contain multiple data sets.
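The import()/import_list() distinction described above can be tried with a sketch like the following. It assumes the rio package is installed, and the file names are invented for this example:

```r
# Assumes rio is installed; file names below are made up for this sketch.
if (requireNamespace("rio", quietly = TRUE)) {
  f_csv <- file.path(tempdir(), "cars_a.csv")
  f_tsv <- file.path(tempdir(), "cars_b.tsv")

  # export() picks the writer from the file extension
  rio::export(head(mtcars, 3), f_csv)
  rio::export(tail(mtcars, 3), f_tsv)

  # import() returns a single data frame per call
  one <- rio::import(f_csv)

  # import_list() takes a vector of file names; the formats may differ
  many <- rio::import_list(c(f_csv, f_tsv))
  length(many)
}
```

The result of import_list() is a list of data frames, so individual tables are picked out with many[[1]], many[[2]], and so on.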
(3) Exporting a data frame (csv/txt format)
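A minimal sketch of exporting with base R, using the built-in mtcars data as a stand-in and writing to a temporary directory. The output file names are hypothetical:

```r
# Built-in data set used purely as example data
df <- head(mtcars, 3)

out_csv <- file.path(tempdir(), "df_out.csv")  # hypothetical output names
out_txt <- file.path(tempdir(), "df_out.txt")

# csv: comma-separated, row names written as the first column
write.csv(df, out_csv)

# txt: tab-separated, no quotes around strings;
# col.names = NA leaves a blank header cell above the row names
write.table(df, out_txt, sep = "\t", quote = FALSE, col.names = NA)

# Read the csv back to check that the row names survived the round trip
back <- read.csv(out_csv, row.names = 1, check.names = FALSE)
identical(rownames(df), rownames(back))
```

Writing to a new file name rather than the name of the input file keeps the original table untouched.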
(4) R's own data format: Rdata
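A short sketch of saving and loading an Rdata file. The objects and the file name are invented for illustration:

```r
# Example objects (made up for this sketch)
x <- data.frame(id = 1:3, value = c(2.5, 3.5, 4.5))
y <- c("a", "b")

rdata_file <- file.path(tempdir(), "example.Rdata")  # hypothetical file name
save(x, y, file = rdata_file)  # store both objects together with their names

rm(x, y)                       # remove them from the workspace
loaded <- load(rdata_file)     # restores x and y under their original names
loaded                         # load() returns the names of the restored objects
```

Unlike read.csv(), load() does not need its result assigned: the objects reappear under the names they were saved with, which also means they can silently replace existing variables of the same name.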
II. Revisiting a mistake: always keep an eye on data types
# Load y.Rdata and compute the mean of the gene1 column
load("y.Rdata")
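a = mean(y[, 1]) # check the type of y (a matrix): every element is character
Warning message:
In mean.default(y[, 1]) : argument is not numeric or logical: returning NA
a = mean(as.numeric(y[, 1])) # convert to numeric first, then take the mean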
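The same situation can be reproduced without y.Rdata. The matrix below is a hypothetical stand-in for y: a matrix holds a single element type, so one text column is enough to make every entry character.

```r
# Hypothetical stand-in for y: a matrix whose entries are all stored as text
y <- matrix(c("1", "2", "3", "a", "b", "c"), ncol = 2,
            dimnames = list(NULL, c("gene1", "label")))

class(y)   # what kind of object y is
typeof(y)  # the storage type of its elements
str(y)     # compact overview of structure and type

# Convert the column to numeric before doing arithmetic on it
gene1 <- as.numeric(y[, "gene1"])
mean(gene1)
```

Running class(), typeof() or str() right after load() shows the type before any calculation is attempted.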