Java 8 Stream Operations - Java Basics

Tricky points
- flatMap
Quick tips
Static factory methods
Creating streams
Stream operations
Stream collectors (Collectors)

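Tricky points

flatMap

The main difference from map: the function passed to map turns each element into one value of any type, whereas the function passed to flatMap must return a Stream for each element.
flatMap then merges those per-element streams into a single stream, and from there on the usual stream operations apply.

Example 1:
A UserInfo object has several fields, one of which is a List of cars, so each user owns multiple cars.
To collect every Audi ("aodi") among all the cars of all users, the code can be written as follows: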

List<Car> car1 = Lists.list(Car.builder().name("aodi").username("tom").build(),
                            Car.builder().name("baoma").username("tom2").build());
List<Car> car2 = Lists.list(Car.builder().name("benchi").username("tom3").build(),
                            Car.builder().name("benchi").username("tom4").build());
List<Car> car3 = Lists.list(Car.builder().name("aodi").username("tom5").build(),
                            Car.builder().name("tuguan").username("tom6").build());
List<UserInfo> list = Lists.newArrayList();
list.add(UserInfo.builder().cars(car1).build());
list.add(UserInfo.builder().cars(car2).build());
list.add(UserInfo.builder().cars(car3).build());
List<Car> carList = list.stream()
    .flatMap(u -> u.getCars().stream()) // note: the lambda must return a Stream
    .filter(c -> c.getName().equalsIgnoreCase("aodi"))
    .collect(Collectors.toList());
carList.forEach(System.out::println);
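
The snippet above relies on builder-style Car and UserInfo classes and a third-party Lists helper that this page does not show. As a minimal self-contained sketch of the same idea, assuming hypothetical plain-Java stand-ins for those two classes (the class and method names below are made up for illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FlatMapCarsDemo {
    // Hypothetical stand-in for the Car class used above.
    static final class Car {
        final String name;
        final String username;
        Car(String name, String username) { this.name = name; this.username = username; }
    }

    // Hypothetical stand-in for the UserInfo class used above.
    static final class UserInfo {
        final List<Car> cars;
        UserInfo(List<Car> cars) { this.cars = cars; }
    }

    // Same sample data as the example above.
    static List<UserInfo> sampleUsers() {
        return Arrays.asList(
                new UserInfo(Arrays.asList(new Car("aodi", "tom"), new Car("baoma", "tom2"))),
                new UserInfo(Arrays.asList(new Car("benchi", "tom3"), new Car("benchi", "tom4"))),
                new UserInfo(Arrays.asList(new Car("aodi", "tom5"), new Car("tuguan", "tom6"))));
    }

    // Returns the owners of every car named "aodi", across all users.
    static List<String> audiOwners(List<UserInfo> users) {
        return users.stream()
                .flatMap(u -> u.cars.stream())   // Stream<UserInfo> -> Stream<Car>
                .filter(c -> c.name.equalsIgnoreCase("aodi"))
                .map(c -> c.username)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(audiOwners(sampleUsers())); // [tom, tom5]
    }
}
```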

Example 2:
Writing the steps out separately makes it easier to see that the function passed to flatMap has to return a Stream:

String[] strings = new String[]{"hello", "world"};
Stream<String[]> stream = Arrays.stream(strings).map(s -> s.split(""));
stream.flatMap(s -> Arrays.stream(s)).forEach(System.out::print);

// Output:
helloworld

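Quick tips

General guidelines

Given a collection:
To apply a function to every element of the collection -> use map
To filter elements -> use filter
To pick the one element with a particular property (for example, the youngest person) -> use reduce with a method reference, or Stream.min/max
To work on an Integer (Double, Long) field of the elements -> use mapToInt (mapToDouble, mapToLong)
To reduce or summarize the collection into a single value, or to group or partition it -> use the Collectors factory methods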
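
The guidelines above can be sketched in code. This is only an illustration: the Person class, its fields, and the sample data are assumptions made up here, not part of the original notes.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class StreamTipsDemo {
    // Hypothetical element type, used only for this illustration.
    static final class Person {
        private final String name;
        private final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    static final List<Person> PEOPLE = Arrays.asList(
            new Person("ann", 31), new Person("bob", 17), new Person("cid", 25));

    // map: apply a function to every element
    static List<String> names() {
        return PEOPLE.stream().map(Person::getName).collect(Collectors.toList());
    }

    // filter: keep only the elements that match a condition
    static List<String> adultNames() {
        return PEOPLE.stream().filter(p -> p.getAge() >= 18)
                .map(Person::getName).collect(Collectors.toList());
    }

    // min with a comparator: pick the element with the smallest age
    static Optional<Person> youngest() {
        return PEOPLE.stream().min(Comparator.comparingInt(Person::getAge));
    }

    // mapToInt: work on an int property without boxing
    static int totalAge() {
        return PEOPLE.stream().mapToInt(Person::getAge).sum();
    }

    public static void main(String[] args) {
        System.out.println(names());                    // [ann, bob, cid]
        System.out.println(adultNames());               // [ann, cid]
        System.out.println(youngest().get().getName()); // bob
        System.out.println(totalAge());                 // 73
    }
}
```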

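Handy helpers

sorted(), distinct(), min(), max(), skip(), limit(), anyMatch() and similar Stream methods, along with sum() on the primitive streams (IntStream, LongStream, DoubleStream), are convenient helpers. They cover needs common enough to be built into the standard library; several of them (min, max, sum, count) are in fact special cases of reduce.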
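
To make the link to reduce concrete, here is a small sketch that computes the same results twice, once with reduce and once with the built-in helper. The class and method names are made up for the illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class ReduceEquivalentsDemo {
    static final List<Integer> NUMBERS = Arrays.asList(4, 5, 6, 7, 8);

    // Sum written as a reduction with identity 0 ...
    static int sumViaReduce() {
        return NUMBERS.stream().reduce(0, Integer::sum);
    }

    // ... and the same sum via the IntStream helper.
    static int sumViaIntStream() {
        return NUMBERS.stream().mapToInt(Integer::intValue).sum();
    }

    // Maximum as a reduction (no identity, so the result is an Optional) ...
    static Optional<Integer> maxViaReduce() {
        return NUMBERS.stream().reduce(Integer::max);
    }

    // ... and via Stream.max with the natural ordering.
    static Optional<Integer> maxViaMax() {
        return NUMBERS.stream().max(Comparator.naturalOrder());
    }

    // Counting as a reduction: map every element to 1 and add them up.
    static long countViaReduce() {
        return NUMBERS.stream().map(x -> 1L).reduce(0L, Long::sum);
    }

    public static void main(String[] args) {
        System.out.println(sumViaReduce() + " " + sumViaIntStream());   // 30 30
        System.out.println(maxViaReduce() + " " + maxViaMax());         // Optional[8] Optional[8]
        System.out.println(countViaReduce() + " " + NUMBERS.stream().count()); // 5 5
    }
}
```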

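flatMap

flatMap flattens several streams into a single stream. It is similar to map, except that the function given to map returns an ordinary value for each element, while the function given to flatMap returns a Stream for each element; flatMap then merges those streams into one.
For example: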

String[] strings = new String[]{"hello", "world"};
Stream<String[]> stream = Arrays.stream(strings).map(s -> s.split(""));
stream.flatMap(s -> Arrays.stream(s)).forEach(System.out::print);

// Output:
helloworld

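Analysis: what are the "several streams" and the "single stream" here? map applies split to each string in the array, producing a Stream<String[]> with two elements: the array h e l l o and the array w o r l d, five entries each. Inside flatMap, Arrays.stream turns each array into its own stream, and flatMap concatenates the two into one stream of ten single-character strings, which the later operations then work on.
Note: the two inner streams exist only because each string was split. Without the split there would be just one stream with two elements, hello and world.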
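
Counting the elements at each stage makes the difference visible. The sketch below uses made-up class and method names; the pipeline is the same as in the example above.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class MapVsFlatMapDemo {
    // After map(split) the stream holds one String[] per word.
    static long countAfterMap(String[] words) {
        return Arrays.stream(words).map(w -> w.split("")).count();
    }

    // After flatMap the arrays are flattened into single-character strings.
    static long countAfterFlatMap(String[] words) {
        return Arrays.stream(words)
                .map(w -> w.split(""))
                .flatMap(Arrays::stream)
                .count();
    }

    // The flattened characters, joined back together.
    static String flattened(String[] words) {
        return Arrays.stream(words)
                .map(w -> w.split(""))
                .flatMap(Arrays::stream)
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        String[] words = {"hello", "world"};
        System.out.println(countAfterMap(words));     // 2
        System.out.println(countAfterFlatMap(words)); // 10
        System.out.println(flattened(words));         // helloworld
    }
}
```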

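Static factory methods

Static factory methods of the Collectors class (each one creates a ready-made collector):

Factory method	Return type	Used for
toList	List<T>	Collects all items of the stream into a List
toSet	Set<T>	Collects all items of the stream into a Set, dropping duplicates
toCollection	Collection<T>	Collects all items of the stream into the collection created by the given supplier
counting	Long	Counts the items in the stream
summingInt	Integer	Sums an int property of the items in the stream
averagingInt	Double	Computes the average of an int property of the items in the stream
summarizingInt	IntSummaryStatistics	Collects statistics on an int property of the items: maximum, minimum, sum, average, and count
joining	String	Concatenates the items of the stream into one string (the items must be CharSequence values, so map to String first if needed)
maxBy	Optional<T>	An Optional wrapping the maximum element according to the given comparator, or Optional.empty() if the stream is empty
minBy	Optional<T>	An Optional wrapping the minimum element according to the given comparator, or Optional.empty() if the stream is empty
reducing	Type produced by the reduction	Starts from an initial value used as the accumulator and combines it with the stream's items one by one using a BinaryOperator, reducing the stream to a single value
collectingAndThen	Type returned by the transformation function	Wraps another collector and applies a transformation function to its result
groupingBy	Map<K, List<T>>	Groups the items by the value of one of their properties and uses those values as the keys of the resulting Map
partitioningBy	Map<Boolean, List<T>>	Partitions the items according to the result of applying a predicate to each of them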
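
A few of the factory methods in the table are not exercised elsewhere in these notes. The following sketch, with made-up class and method names, shows toCollection, collectingAndThen, and the three-argument form of joining:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

final class CollectorFactoriesDemo {
    static final List<String> WORDS = Arrays.asList("stream", "java", "hello", "java");

    // toCollection: the supplier decides the concrete collection (here a sorted set).
    static TreeSet<String> sortedUnique() {
        return WORDS.stream().collect(Collectors.toCollection(TreeSet::new));
    }

    // collectingAndThen: collect to a list, then wrap it as read-only.
    static List<String> readOnlyCopy() {
        return WORDS.stream().collect(
                Collectors.collectingAndThen(Collectors.toList(), Collections::unmodifiableList));
    }

    // joining with delimiter, prefix, and suffix.
    static String bracketed() {
        return WORDS.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(sortedUnique()); // [hello, java, stream]
        System.out.println(readOnlyCopy()); // [stream, java, hello, java]
        System.out.println(bracketed());    // [stream, java, hello, java]
    }
}
```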

Creating streams

1) From a collection

List<String> list = Arrays.asList("hello", "java", "stream");
list.stream().forEach(System.out::println);

2) From values

Stream.of("hello", "java", "stream").forEach(System.out::println);

3) From an array

String[] strings = new String[]{"hello", "java", "stream"};
Arrays.stream(strings).forEach(System.out::println);

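4) From a file

// Files.lines keeps the file open until the stream is closed, so use try-with-resources (it may throw IOException)
Path path = Paths.get("D:\\workspace\\demo\\src\\main\\java\\com\\example\\demo\\java8\\StreamDemo.java");
try (Stream<String> lines = Files.lines(path)) {
    lines.forEach(System.out::println);
}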

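5) Stream.iterate: an infinite stream that supports the usual operations

// Infinite stream: without a limit, a terminal operation such as forEach never finishes
Stream<Integer> iterate = Stream.iterate(0, n -> n + 2);
// Take the first ten elements
Stream<Integer> iterate2 = Stream.iterate(0, n -> n + 2).limit(10);

iterate2.forEach(System.out::println);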

6) Stream.generate; the primitive streams (IntStream and so on) offer the same method

// Infinite stream
Stream<Double> generate = Stream.generate(Math::random);
// Take the first ten elements
Stream<Double> generate2 = Stream.generate(Math::random).limit(10);

generate2.forEach(System.out::println);
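
The primitive variants mentioned in item 6 look like this. A minimal sketch with made-up class and method names; IntStream.iterate and DoubleStream.generate are the primitive counterparts of the two methods above.

```java
import java.util.Arrays;
import java.util.stream.DoubleStream;
import java.util.stream.IntStream;

final class PrimitiveGeneratorsDemo {
    // IntStream.iterate: 0, 2, 4, ... truncated to the first n values.
    static int[] firstEvens(int n) {
        return IntStream.iterate(0, i -> i + 2).limit(n).toArray();
    }

    // DoubleStream.generate: n random values in [0, 1).
    static double[] randoms(int n) {
        return DoubleStream.generate(Math::random).limit(n).toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(firstEvens(5))); // [0, 2, 4, 6, 8]
        System.out.println(randoms(3).length);              // 3
    }
}
```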

Stream operations

1) filter, distinct, skip, limit, map

List<Integer> integers = Arrays.asList(1, 2, 3, 4, 5, 5, 5, 6, 6);

// filter: keep the even numbers
List<Integer> collect = integers.stream().filter((x) -> x % 2 == 0).collect(Collectors.toList());
System.out.println(collect);  // [2, 4, 6, 6]

// distinct: remove duplicates
List<Integer> collect1 = integers.stream().distinct().collect(Collectors.toList());
System.out.println(collect1);  // [1, 2, 3, 4, 5, 6]

// skip: skip the first 5 elements
List<Integer> collect2 = integers.stream().skip(5).collect(Collectors.toList());
System.out.println(collect2);  // [5, 5, 6, 6]

// limit: take the first 5 elements
List<Integer> collect3 = integers.stream().limit(5).collect(Collectors.toList());
System.out.println(collect3);   // [1, 2, 3, 4, 5]

// map
List<Integer> collect4 = integers.stream().map(i -> i * 2).collect(Collectors.toList());
System.out.println(collect4);   // [2, 4, 6, 8, 10, 10, 10, 12, 12]

2) map and flatMap: transforming elements

List<User> users = Arrays.asList(
        new User("zhangsan", true, 23, SexEnum.BOY, "123456"),
        new User("lisi", false, 18, SexEnum.GIRL, "123456"),
        new User("wangwu", true, 25, SexEnum.BOY, "123456"),
        new User("xiaoqiang", false, 23, SexEnum.GIRL, "123456")
);
// map
List<String> collect = users.stream().map(User::getUsername).collect(Collectors.toList());
System.out.println(collect);   // [zhangsan, lisi, wangwu, xiaoqiang]


String[] strings = new String[]{"hello", "world"};
// map: {'h','e','l','l','o'},{'w','o','r','l','d'}
Stream<String[]> stream = Arrays.stream(strings).map(i -> i.split(""));
// flatmap: 'h','e','l','l','o','w','o','r','l','d'
Stream<String> stringStream = stream.flatMap(s -> Arrays.stream(s));
stringStream.distinct().forEach(System.out::println);

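flatMap usage: call a stream-producing method (here Arrays.stream) to turn each array or list into a Stream, and let flatMap take care of the rest. Like map, flatMap takes a Function; the only difference is that the function's return type must be a Stream.
(In the example above, the lambda passed to flatMap returns Arrays.stream(s), that is, a Stream.)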

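3) match: test whether elements satisfy a condition

List<Integer> numbers = Arrays.asList(4, 5, 6, 7, 8);
// A stream can be consumed only once, so each check below opens a new stream
// allMatch: every element satisfies the condition
boolean all1 = numbers.stream().allMatch(i -> i > 3);  // true
boolean all2 = numbers.stream().allMatch(i -> i > 7);  // false
// anyMatch: at least one element satisfies the condition
boolean any = numbers.stream().anyMatch(i -> i > 6);   // true
// noneMatch: no element satisfies the condition
boolean none = numbers.stream().noneMatch(i -> i < 0);   // true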

4) find: look up an element

List<Integer> numbers = Arrays.asList(4, 5, 6, 7, 8);

// Take any matching element (which one is not guaranteed; a sequential stream usually yields the first)
Optional<Integer> any = numbers.stream().filter(i -> i % 2 == 0).findAny();
System.out.println(any.get());    // typically 4

// Take the first matching element
Optional<Integer> first = numbers.stream().filter(i -> i % 2 == 0).findFirst();
System.out.println(first.get());  // 4
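
Calling get() on an empty Optional throws NoSuchElementException, so when a match is not guaranteed it is usually safer to supply a fallback. A small sketch, with class and method names invented for the illustration:

```java
import java.util.Arrays;
import java.util.List;

final class FindDemo {
    // Returns the first even number, or the fallback when there is none.
    static int firstEvenOrDefault(List<Integer> numbers, int fallback) {
        return numbers.stream()
                .filter(i -> i % 2 == 0)
                .findFirst()
                .orElse(fallback);
    }

    // Optional.map transforms the value only if one is present.
    static String describeFirstEven(List<Integer> numbers) {
        return numbers.stream()
                .filter(i -> i % 2 == 0)
                .findFirst()
                .map(i -> "first even: " + i)
                .orElse("no even number");
    }

    public static void main(String[] args) {
        System.out.println(firstEvenOrDefault(Arrays.asList(4, 5, 6, 7, 8), -1)); // 4
        System.out.println(firstEvenOrDefault(Arrays.asList(1, 3), -1));          // -1
        System.out.println(describeFirstEven(Arrays.asList(1, 3)));               // no even number
    }
}
```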

5) reduce: aggregate the elements into one value, such as sum, min, max

List<Integer> numbers = Arrays.asList(4, 5, 6, 7, 8);

// Sum with a lambda (a new stream per reduction, since a stream cannot be reused)
Optional<Integer> sum = numbers.stream().reduce((i, j) -> i + j);
System.out.println(sum.get());   // 30

// Sum with a method reference
Optional<Integer> sum2 = numbers.stream().reduce(Integer::sum);
System.out.println(sum2.get()); // 30

// Maximum with a method reference
Optional<Integer> max = numbers.stream().reduce(Integer::max);
System.out.println(max.get());   // 8

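6) Numeric streams

Java 8 introduces three primitive-specialized stream interfaces, IntStream, LongStream, and DoubleStream, whose elements are int, long, and double respectively, which avoids the hidden cost of boxing. Each interface adds methods for common numeric reductions, such as sum and max, and offers a way to convert back to a stream of objects when needed. Keep in mind that these specializations exist not because streams are complicated but because of the overhead of boxing, that is, the efficiency gap between int and Integer.

List<Integer> numbers = Arrays.asList(4, 5, 6, 7, 8);

// long and double work the same way
int sum = numbers.stream().mapToInt(i -> i.intValue()).filter(i -> i > 6).sum();
System.out.println(sum);  // 15

// Convert the numeric stream back to an object stream (a new stream is needed, since a stream cannot be reused)
Stream<Integer> boxed = numbers.stream().mapToInt(i -> i.intValue()).filter(i -> i > 6).boxed();
boxed.forEach(System.out::println);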
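
Besides sum, the numeric streams return primitive Optional types from reductions such as max and average, because the stream may be empty. A brief sketch with made-up class and method names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalDouble;
import java.util.OptionalInt;
import java.util.stream.Collectors;

final class NumericStreamDemo {
    // max on an IntStream yields OptionalInt (empty for an empty stream).
    static OptionalInt max(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).max();
    }

    // average yields OptionalDouble for the same reason.
    static OptionalDouble average(List<Integer> numbers) {
        return numbers.stream().mapToInt(Integer::intValue).average();
    }

    // mapToObj converts each int back into an object of any type.
    static List<String> labels(List<Integer> numbers) {
        return numbers.stream()
                .mapToInt(Integer::intValue)
                .mapToObj(i -> "n" + i)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(4, 5, 6, 7, 8);
        System.out.println(max(numbers).getAsInt());        // 8
        System.out.println(average(numbers).getAsDouble()); // 6.0
        System.out.println(labels(numbers));                // [n4, n5, n6, n7, n8]
    }
}
```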

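7) Numeric ranges

Ranges of numbers come up often, for example all numbers between 1 and 100. Java 8 provides two static methods on IntStream and LongStream for this: range and rangeClosed. Both take the start value as the first argument and the end value as the second; range excludes the end value, while rangeClosed includes it.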

IntStream intStream = IntStream.range(1, 100).filter(i -> i % 2 == 0);
System.out.println(intStream.count());  // 49

IntStream intStream2 = IntStream.rangeClosed(1, 100).filter(i -> i % 2 == 0);
System.out.println(intStream2.count());  // 50

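Stream collectors (Collectors)

Understanding collectors

1) Choose the best solution for the situation (from the book Java 8 in Action)
Functional programming in general, and the functional-style API added to the Collections framework in Java 8 in particular, often offers several ways to perform the same operation. Collectors are somewhat more complex to use than the methods available directly on the Stream interface, but in exchange they provide a higher level of abstraction and generalization and are easier to reuse and customize.
The book's advice is to explore as many solutions as possible for the problem at hand, and then, among those that are general enough, always pick the most specialized one. That is usually the best choice for both readability and performance. For example, to compute the total calories of a menu, the book prefers the variant that maps the stream to an IntStream and sums it: it is the most concise and probably the most readable, and it likely performs best because IntStream avoids auto-unboxing, the implicit conversion from Integer to int, which serves no purpose here.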
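
The alternatives being compared can be sketched as follows. The cut-down Dish class and the three-dish menu are assumptions made for this illustration only; the full Dish class appears in the next section.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class TotalCaloriesDemo {
    // Hypothetical cut-down Dish, just enough for this comparison.
    static final class Dish {
        private final String name;
        private final int calories;
        Dish(String name, int calories) { this.name = name; this.calories = calories; }
        int getCalories() { return calories; }
    }

    static final List<Dish> MENU = Arrays.asList(
            new Dish("pork", 800), new Dish("rice", 350), new Dish("salmon", 450));

    // 1. Most general: the reducing collector.
    static int viaReducingCollector() {
        return MENU.stream().collect(Collectors.reducing(0, Dish::getCalories, Integer::sum));
    }

    // 2. map to Integer, then reduce on the stream itself.
    static int viaMapReduce() {
        return MENU.stream().map(Dish::getCalories).reduce(0, Integer::sum);
    }

    // 3. Most specialized: map to an IntStream and sum, with no boxing involved.
    static int viaIntStream() {
        return MENU.stream().mapToInt(Dish::getCalories).sum();
    }

    public static void main(String[] args) {
        System.out.println(viaReducingCollector()); // 1600
        System.out.println(viaMapReduce());         // 1600
        System.out.println(viaIntStream());         // 1600
    }
}
```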

Preparing the data

1) The Dish class

public class Dish {
    private String name;
    private boolean vegetarian;  // whether the dish is vegetarian
    private int calories;  // calories
    private Type type;

    // all-args constructor, getters/setters, toString...

    public enum Type {
        MEAT, FISH, OTHER
    }
}

2) Sample data

List<Dish> menu = Arrays.asList(
                new Dish("pork", false, 800, Dish.Type.MEAT),
                new Dish("beef", false, 700, Dish.Type.MEAT),
                new Dish("chicken", false, 400, Dish.Type.MEAT),
                new Dish("french fries", true, 530, Dish.Type.OTHER),
                new Dish("rice", true, 350, Dish.Type.OTHER),
                new Dish("season fruit", true, 120, Dish.Type.OTHER),
                new Dish("pizza", true, 550, Dish.Type.OTHER),
                new Dish("prawns", false, 300, Dish.Type.FISH),
                new Dish("salmon", false, 450, Dish.Type.FISH)
        );

Reducing to a single value

// Dish with the most calories
Optional<Dish> collect = menu.stream().collect(Collectors.maxBy(Comparator.comparingInt(Dish::getCalories)));
System.out.println(collect); // Optional[Dish{name='pork', vegetarian=false, calories=800, type=MEAT}]

// Dish with the fewest calories
Optional<Dish> collect1 = menu.stream().collect(Collectors.minBy(Comparator.comparingInt(Dish::getCalories)));
System.out.println(collect1); // Optional[Dish{name='season fruit', vegetarian=true, calories=120, type=OTHER}]

Summarizing

// Sum
Integer collect = menu.stream().collect(Collectors.summingInt(Dish::getCalories));
System.out.println(collect);  // 4200

// Average: averagingInt, averagingDouble ...
Double collect1 = menu.stream().collect(Collectors.averagingInt(Dish::getCalories));
System.out.println(collect1);  // 466.6666666666667

// summarizingInt, summarizingDouble ...
IntSummaryStatistics collect2 = menu.stream().collect(Collectors.summarizingInt(Dish::getCalories));
System.out.println(collect2);  // IntSummaryStatistics{count=9, sum=4200, min=120, average=466.666667, max=800}

Joining strings

// Join without a delimiter: everything runs together and is hard to read
String collect = menu.stream().map(Dish::getName).collect(Collectors.joining());
System.out.println(collect); // porkbeefchickenfrench friesriceseason fruitpizzaprawnssalmon

// Join with a delimiter: much easier to read
String collect1 = menu.stream().map(Dish::getName).collect(Collectors.joining(", "));
System.out.println(collect1); // pork, beef, chicken, french fries, rice, season fruit, pizza, prawns, salmon

Grouping elements

1) Single-level grouping

// Group by the Type enum
/*
{FISH=[Dish{name='prawns', vegetarian=false, calories=300, type=FISH}, Dish{name='salmon', vegetarian=false, calories=450, type=FISH}],
 MEAT=[Dish{name='pork', vegetarian=false, calories=800, type=MEAT}, Dish{name='beef', vegetarian=false, calories=700, type=MEAT}, Dish{name='chicken', vegetarian=false, calories=400, type=MEAT}],
 OTHER=[Dish{name='french fries', vegetarian=true, calories=530, type=OTHER}, Dish{name='rice', vegetarian=true, calories=350, type=OTHER}, Dish{name='season fruit', vegetarian=true, calories=120, type=OTHER}, Dish{name='pizza', vegetarian=true, calories=550, type=OTHER}]}
 */
Map<Dish.Type, List<Dish>> collect = menu.stream().collect(Collectors.groupingBy(Dish::getType));
System.out.println(collect);

// Dishes with at most 400 calories are DIET, at most 700 are NORMAL, and above 700 are FAT (CaloricLevel is an enum with the constants DIET, NORMAL, FAT)
/*
{FAT=[Dish{name='pork', vegetarian=false, calories=800, type=MEAT}],
 DIET=[Dish{name='chicken', vegetarian=false, calories=400, type=MEAT}, Dish{name='rice', vegetarian=true, calories=350, type=OTHER}, Dish{name='season fruit', vegetarian=true, calories=120, type=OTHER}, Dish{name='prawns', vegetarian=false, calories=300, type=FISH}],
 NORMAL=[Dish{name='beef', vegetarian=false, calories=700, type=MEAT}, Dish{name='french fries', vegetarian=true, calories=530, type=OTHER}, Dish{name='pizza', vegetarian=true, calories=550, type=OTHER}, Dish{name='salmon', vegetarian=false, calories=450, type=FISH}]}
 */
Map<CaloricLevel, List<Dish>> collect1 = menu.stream().collect(Collectors.groupingBy(dish -> {
    if (dish.getCalories() <= 400) return CaloricLevel.DIET;
    else if (dish.getCalories() <= 700) return CaloricLevel.NORMAL;
    else return CaloricLevel.FAT;
}));
System.out.println(collect1);

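2) Multi-level grouping
For multi-level grouping, use the two-argument version of the Collectors.groupingBy factory method, which takes a second argument of type Collector in addition to the classification function. To group on two levels, pass an inner groupingBy to the outer groupingBy, defining a second-level criterion for classifying the items.
For example, group by Type first and then by calorie level: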

/*
{MEAT={FAT=[Dish{name='pork', vegetarian=false, calories=800, type=MEAT}], DIET=[Dish{name='chicken', vegetarian=false, calories=400, type=MEAT}], NORMAL=[Dish{name='beef', vegetarian=false, calories=700, type=MEAT}]},
 FISH={DIET=[Dish{name='prawns', vegetarian=false, calories=300, type=FISH}], NORMAL=[Dish{name='salmon', vegetarian=false, calories=450, type=FISH}]},
 OTHER={DIET=[Dish{name='rice', vegetarian=true, calories=350, type=OTHER}, Dish{name='season fruit', vegetarian=true, calories=120, type=OTHER}], NORMAL=[Dish{name='french fries', vegetarian=true, calories=530, type=OTHER}, Dish{name='pizza', vegetarian=true, calories=550, type=OTHER}]}}
 */
Map<Dish.Type, Map<CaloricLevel, List<Dish>>> collect = menu.stream().collect(Collectors.groupingBy(Dish::getType,
        Collectors.groupingBy(dish -> {
            if (dish.getCalories() <= 400) return CaloricLevel.DIET;
            else if (dish.getCalories() <= 700) return CaloricLevel.NORMAL;
            else return CaloricLevel.FAT;
        })
));

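3) Multi-level grouping -> collecting data in subgroups (1)
The second collector passed to the first groupingBy can be of any kind, not just another groupingBy. For example, to count the dishes of each type on the menu, pass the counting collector as the second argument of groupingBy: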

// {FISH=2, OTHER=4, MEAT=3}
Map<Dish.Type, Long> collect = menu.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.counting()));
System.out.println(collect);

Note: the single-argument groupingBy(f), where f is the classification function, is shorthand for groupingBy(f, toList()).
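
The equivalence in the note can be checked directly. The sketch below groups a made-up list of words by their first letter; the class, method names, and data are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GroupingDownstreamDemo {
    static final List<String> WORDS =
            Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");

    // Single-argument form: values are collected into lists.
    static Map<Character, List<String>> byInitial() {
        return WORDS.stream().collect(Collectors.groupingBy(w -> w.charAt(0)));
    }

    // Explicit two-argument form with toList() as the downstream collector.
    static Map<Character, List<String>> byInitialExplicit() {
        return WORDS.stream().collect(
                Collectors.groupingBy(w -> w.charAt(0), Collectors.toList()));
    }

    // Swapping the downstream collector changes the value type of the map.
    static Map<Character, Long> countByInitial() {
        return WORDS.stream().collect(
                Collectors.groupingBy(w -> w.charAt(0), Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(byInitial().equals(byInitialExplicit())); // true
        System.out.println(countByInitial().get('a'));               // 2
    }
}
```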

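4) Multi-level grouping -> collecting data in subgroups (2) -> adapting the collector result to a different type
Find the highest-calorie dish of each type: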

/*
{OTHER=Optional[Dish{name='pizza', vegetarian=true, calories=550, type=OTHER}], 
 FISH=Optional[Dish{name='salmon', vegetarian=false, calories=450, type=FISH}], 
 MEAT=Optional[Dish{name='pork', vegetarian=false, calories=800, type=MEAT}]}
 */
Map<Dish.Type, Optional<Dish>> collect = menu.stream().collect(Collectors.groupingBy(Dish::getType, Collectors.maxBy(Comparator.comparingInt(Dish::getCalories))));
System.out.println(collect);

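Note: as the result shows, every value is wrapped in an Optional. The wrapper adds nothing here, because groupingBy only creates a key once a group has at least one element, so the Optional is never empty. It can therefore be stripped.
Adapting the collector result to a different type: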

/*
{OTHER=Dish{name='pizza', vegetarian=true, calories=550, type=OTHER},
MEAT=Dish{name='pork', vegetarian=false, calories=800, type=MEAT},
FISH=Dish{name='salmon', vegetarian=false, calories=450, type=FISH}}
 */
Map<Dish.Type, Dish> collect = menu.stream().collect(Collectors.groupingBy(Dish::getType,
        Collectors.collectingAndThen(Collectors.maxBy(Comparator.comparingInt(Dish::getCalories)), Optional::get)
));
System.out.println(collect);

5) More examples
Group by Type and compute the total calories of each group:

// {MEAT=1900, FISH=750, OTHER=1550}
Map<Dish.Type, Integer> collect = menu.stream().collect(Collectors.groupingBy(Dish::getType,
        Collectors.summingInt(Dish::getCalories)));
System.out.println(collect);

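Another collector commonly used with groupingBy is the one produced by the mapping method. It takes two arguments: a function that transforms the elements of the stream, and a collector that accumulates the transformed results. Its purpose is to apply a mapping function to each input element before accumulation, so that a collector accepting elements of one type can be adapted to objects of another type. For example: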

// {MEAT=[NORMAL, FAT, DIET], FISH=[NORMAL, DIET], OTHER=[NORMAL, DIET]}
Map<Dish.Type, Set<CaloricLevel>> collect = menu.stream().collect(Collectors.groupingBy(Dish::getType,
        Collectors.mapping(dish -> {
            if (dish.getCalories() <= 400) return CaloricLevel.DIET;
            else if (dish.getCalories() <= 700) return CaloricLevel.NORMAL;
            else return CaloricLevel.FAT;
        }, Collectors.toSet())
));
System.out.println(collect);

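This result makes the choice easy. If you want fish and are on a diet, a suitable dish is easy to find; if you are very hungry and want plenty of calories, the meat section of the menu has you covered.
Note: in the previous example there is no guarantee about which Set implementation is returned. Using toCollection gives more control; for example, pass a constructor reference to request a HashSet: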

Map<Dish.Type, Set<CaloricLevel>> collect = menu.stream().collect(Collectors.groupingBy(Dish::getType,
        Collectors.mapping(dish -> {
            if (dish.getCalories() <= 400) return CaloricLevel.DIET;
            else if (dish.getCalories() <= 700) return CaloricLevel.NORMAL;
            else return CaloricLevel.FAT;
        }, Collectors.toCollection(HashSet::new))
));

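Partitioning elements

1) Partitioning is a special case of grouping in which a predicate (a function returning a boolean) serves as the classification function, called the partitioning function. The function must return a boolean; anything else fails to compile.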

/*
{false=[Dish{name='pork', vegetarian=false, calories=800, type=MEAT}, Dish{name='beef', vegetarian=false, calories=700, type=MEAT}, Dish{name='chicken', vegetarian=false, calories=400, type=MEAT}, Dish{name='prawns', vegetarian=false, calories=300, type=FISH}, Dish{name='salmon', vegetarian=false, calories=450, type=FISH}],
 true=[Dish{name='french fries', vegetarian=true, calories=530, type=OTHER}, Dish{name='rice', vegetarian=true, calories=350, type=OTHER}, Dish{name='season fruit', vegetarian=true, calories=120, type=OTHER}, Dish{name='pizza', vegetarian=true, calories=550, type=OTHER}]}
 */
Map<Boolean, List<Dish>> collect = menu.stream().collect(Collectors.partitioningBy(Dish::isVegetarian));
System.out.println(collect);

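2) The advantage of partitioning is that it keeps both lists of stream elements, those for which the partitioning function returns true and those for which it returns false. It also accepts a collector as a second argument: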

/*
{false={FISH=[Dish{name='prawns', vegetarian=false, calories=300, type=FISH},Dish{name='salmon', vegetarian=false, calories=450, type=FISH}],
        MEAT=[Dish{name='pork', vegetarian=false, calories=800, type=MEAT}, Dish{name='beef', vegetarian=false, calories=700, type=MEAT}, Dish{name='chicken', vegetarian=false, calories=400, type=MEAT}]},
 true={OTHER=[Dish{name='french fries', vegetarian=true, calories=530, type=OTHER}, Dish{name='rice', vegetarian=true, calories=350, type=OTHER}, Dish{name='season fruit', vegetarian=true, calories=120, type=OTHER}, Dish{name='pizza', vegetarian=true, calories=550, type=OTHER}]}}
 */
Map<Boolean, Map<Dish.Type, List<Dish>>> collect = menu.stream().collect(Collectors.partitioningBy(Dish::isVegetarian,
        Collectors.groupingBy(Dish::getType)));
System.out.println(collect);
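
One consequence of keeping both lists is that the resulting map has entries for both true and false even when one side is empty, unlike groupingBy, which only creates keys for groups that have elements. A small sketch with made-up class and method names, partitioning integers into even and odd:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PartitioningDemo {
    // Partition into even (true) and odd (false) numbers.
    static Map<Boolean, List<Integer>> evenOdd(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.partitioningBy(x -> x % 2 == 0));
    }

    // Same partition, with counting as the downstream collector.
    static Map<Boolean, Long> countEvenOdd(List<Integer> numbers) {
        return numbers.stream().collect(
                Collectors.partitioningBy(x -> x % 2 == 0, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Integer> odds = Arrays.asList(1, 3, 5);
        System.out.println(evenOdd(odds));      // {false=[1, 3, 5], true=[]}
        System.out.println(countEvenOdd(odds)); // {false=3, true=0}
    }
}
```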