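Introduction
Data normalization and standardization are both forms of scaling, usually referred to as Normalization or Standardization. This post records R implementations of several scaling methods. More posts are available at https://zouhua.top/.
Scaling implementations in R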
- Median scale normalization
- Robust scale normalization
- Unit scale normalization
- z-scale normalization
- Min-Max normalization
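As a quick reference, the five methods can be written as formulas. This is a sketch read off the R code in this post, not a canonical definition: the symbols $x$ (a feature vector), $v$ (a sample vector), $S$, $Q_1$ and $Q_3$ are notation introduced here for illustration. $\operatorname{MAD}$ refers to what R's `mad()` returns by default, which is $1.4826 \times \operatorname{median}(|x - \operatorname{median}(x)|)$, and $Q_1$, $Q_3$ are the 25% and 75% quantiles from `quantile()`.

```latex
\begin{aligned}
\text{Median scale:}\quad & x_i' = \frac{x_i - \operatorname{median}(x)}{\operatorname{MAD}(x)} \\
\text{Robust scale:}\quad & x_i' = \frac{x_i - \operatorname{mean}(S)}{\operatorname{sd}(S)}, \qquad S = \{\, x_j : Q_1 < x_j < Q_3 \,\} \\
\text{Unit scale:}\quad & v_i' = \frac{v_i}{\sqrt{\sum_{j=1}^{n} v_j^2}} \\
\text{z-scale:}\quad & x_i' = \frac{x_i - \operatorname{mean}(x)}{\operatorname{sd}(x)} \\
\text{Min-Max:}\quad & x_i' = \frac{x_i - \min(x)}{\max(x) - \min(x)}
\end{aligned}
```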
    # method1: Median scale normalization
    MDA_fun <- function(features){
      # x for features X = (x1, x2, ..., xn)
      value <- as.numeric(features)
      d_mad <- mad(value)   # median absolute deviation (default constant 1.4826)
      x_scale <- (value - median(value))/d_mad
      return(x_scale)
    }
    # dat: numeric matrix with features in rows and samples in columns (defined elsewhere)
    # apply() over rows returns one column per feature, so the rows of the result are samples
    dat_s1_MDA <- apply(dat, 1, MDA_fun)
    rownames(dat_s1_MDA) <- colnames(dat)

    # method2: Robust scale normalization
    Robust_fun <- function(features){
      # x for features X = (x1, x2, ..., xn)
      value <- as.numeric(features)
      q_value <- as.numeric(quantile(value))   # 0%, 25%, 50%, 75%, 100%
      # keep only the values strictly between the 25% and 75% quantiles
      remain_value <- value[value > q_value[2] & value < q_value[4]]
      mean_value <- mean(remain_value)
      sd_value <- sd(remain_value)
      x_scale <- (value - mean_value)/sd_value
      return(x_scale)
    }

    # method3: Unit scale normalization
    Unit_fun <- function(samples){
      # v for samples v = (v1, v2, ..., vn)
      value <- as.numeric(samples)
      x_scale <- value / sqrt(sum(value^2))
      return(x_scale)
    }

    # method4: z-scale normalization
    Zscore_fun <- function(features){
      # x for features X = (x1, x2, ..., xn)
      value <- as.numeric(features)
      mean_value <- mean(value)
      sd_value <- sd(value)
      x_scale <- (value - mean_value)/sd_value
      return(x_scale)
    }

    # method5: Min-Max normalization
    Min_Max_fun <- function(features){
      # x for features X = (x1, x2, ..., xn)
      value <- as.numeric(features)
      min_value <- min(value)
      max_value <- max(value)
      x_scale <- (value - min_value)/(max_value - min_value)
      return(x_scale)
    }
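To get a feel for why one might pick the median-based version over the z-scale, the sketch below compares the two on a small made-up vector with one extreme value. The vector `x` and the helper names `z_scaled`, `med_scaled`, `range_z` and `range_med` are hypothetical and exist only for this illustration. The formulas are written inline rather than calling the functions above, so the snippet runs on its own.

```r
# Hypothetical feature vector with one extreme value (illustration only)
x <- c(10, 11, 12, 13, 14, 100)

# z-scale: centre on the mean, divide by the standard deviation
z_scaled <- (x - mean(x)) / sd(x)

# Median scale: centre on the median, divide by the MAD
# (mad() applies its default consistency constant 1.4826)
med_scaled <- (x - median(x)) / mad(x)

# Spread of the five ordinary values after each scaling
range_z   <- diff(range(z_scaled[1:5]))
range_med <- diff(range(med_scaled[1:5]))

print(round(z_scaled, 3))
print(round(med_scaled, 3))
print(c(z = range_z, median = range_med))
```

In this toy case the extreme value inflates the mean and standard deviation, so the ordinary values end up squeezed together under the z-scale, while the median and MAD are barely affected. Real data may behave differently, so treat this as a sanity check rather than a rule.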
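Methods 1, 2, 4, and 5 scale by subtracting one statistic and then dividing by another, whereas method 3 divides a vector by its own length (its Euclidean norm). The first group is typically applied to row vectors (features) and method 3 to column vectors (samples), though this is not a strict rule.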
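The row versus column distinction can be sketched with `apply()`. The example below assumes a matrix with features in rows and samples in columns. The matrix `toy` and the compact helpers `zscore` and `unit_scale` are hypothetical stand-ins that restate the z-scale and unit scale formulas so the snippet is self-contained.

```r
# Hypothetical toy data: 3 features (rows) x 4 samples (columns)
toy <- matrix(c( 1,  2,  3,  4,
                10, 20, 30, 50,
                 5,  5,  6,  8),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("feature", 1:3), paste0("sample", 1:4)))

# Compact restatements of the z-scale and unit scale formulas
zscore     <- function(x) (x - mean(x)) / sd(x)
unit_scale <- function(v) v / sqrt(sum(v^2))

# Feature-wise scaling: apply() over rows returns one column per feature,
# so the result is samples x features; t() restores the original layout
toy_z <- t(apply(toy, 1, zscore))

# Sample-wise scaling: apply() over columns keeps the original layout
toy_unit <- apply(toy, 2, unit_scale)

# Sanity checks: each feature has mean ~0 and sd 1, each sample has norm 1
print(round(rowMeans(toy_z), 10))
print(apply(toy_z, 1, sd))
print(sqrt(colSums(toy_unit^2)))
```

For the z-scale specifically, base R's `scale()` works column-wise, so `t(scale(t(toy)))` should give the same values as `toy_z`.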
