刷题记录 - 随机算法 - 《学习笔记》

710. 黑名单中的随机数

给定一个包含 [0，n) 中不重复整数的黑名单 blacklist ，写一个函数从 [0, n) 中返回一个不在 blacklist 中的随机整数。
对它进行优化使其尽量少调用系统方法 Math.random() 。
提示:

1 <= n <= 1000000000
0 <= blacklist.length < min(100000, N)
[0, n) 不包含 n ，详细参见 interval notation) 。

示例 1：
输入： [“Solution”,”pick”,”pick”,”pick”] [[1,[]],[],[],[]] 输出：[null,0,0,0]
示例 2：
输入： [“Solution”,”pick”,”pick”,”pick”] [[2,[]],[],[],[]] 输出：[null,1,1,1]
示例 3：
输入： [“Solution”,”pick”,”pick”,”pick”] [[3,[1]],[],[],[]] 输出：[null,0,0,2]
示例 4：
输入： [“Solution”,”pick”,”pick”,”pick”] [[4,[2]],[],[],[]] 输出：[null,1,3,1]
输入语法说明：
输入是两个列表：调用成员函数名和调用的参数。Solution的构造函数有两个参数，n 和黑名单 blacklist。pick 没有参数，输入参数是一个列表，即使参数为空，也会输入一个 [] 空列表。

思路：
方法1：维护一个白名单列表
会超时，空间也会爆掉
方法2：重映射
长度为n中选数去重黑名单，等价于从长度为len = n - blackList.length中选数，可能len中仍有黑名单中的数，需将其映射为[len, n)中的白名单数。

// 方法2
class Solution {
    Map<Integer, Integer> map = new HashMap<>();
    int len;
    public Solution(int n, int[] blacklist) {
        Arrays.sort(blacklist);
        this.len = n - blacklist.length;
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < blacklist.length; i++) {
            set.add(blacklist[i]);
        }
        for (int i = 0, j = n - 1; i < blacklist.length && blacklist[i] < len; i++) {
            while (set.contains(j))
                j--;
            map.put(blacklist[i], j--);
        }
        // System.out.println(map);
    }
    Random r = new Random();
    public int pick() {
        int idx = r.nextInt(len);
        return map.getOrDefault(idx, idx);
    }
}
/**
 * Your Solution object will be instantiated and called as such:
 * Solution obj = new Solution(n, blacklist);
 * int param_1 = obj.pick();
 */

方法3：二分
想象一个白名单，长度为len = n - blackList.length
随机生成一个数idx ∈ [0, len)，考虑这个下标对应白名单中的哪个数。
由于有黑名单的存在idx并不是白名单中下标为idx对应的数，需要根据黑名单找到这个数。
把黑名单中的数从小到大排序，根据每个数及其下标能确定其前面有几个白名单中的数。
举个例子：blackList：[2, 5, 7]，2前面有两个白名单数0 1，5前面有4个白名单数0, 1, 3, 4，故计算方式为blackList[i] - i，且其为非递减序列。
二分找到小于等于idx的第一个blackList[i] - i。
实际上找的是目标数字的区间
若blackList[l] - i > idx,返回idx
否则，返回idx + l + 1意思是黑名单中下标小于l的数中有l + 1个数是属于黑名单的，故应在idx的基础上加上l + 1得到实际应返回的白名单中对应的数字。

class Solution {
    int[] a;
    int len;
    public Solution(int n, int[] blacklist) {
        len = n - blacklist.length;
        a = blacklist;
        Arrays.sort(a);
    }
    Random r = new Random();
    public int pick() {
        int idx = r.nextInt(len);
        int l = 0, r = a.length - 1;
        while (l < r) {
            int mid = l + r + 1 >> 1;
            if (a[mid] - mid > idx)
                r = mid - 1;
            else
                l = mid;
        }
        return l == r && a[l] - l <= idx ? idx + l + 1 : idx;
    }
}
/**
 * Your Solution object will be instantiated and called as such:
 * Solution obj = new Solution(n, blacklist);
 * int param_1 = obj.pick();
 */

注意blackList为空的情况！

528. 按权重随机选择

给你一个 下标从 0 开始 的正整数数组 w ，其中 w[i] 代表第 i 个下标的权重。
请你实现一个函数 pickIndex ，它可以 随机地 从范围 [0, w.length - 1] 内（含 0 和 w.length - 1）选出并返回一个下标。选取下标 i 的概率为 w[i] / sum(w) 。

例如，对于 w = [1, 3]，挑选下标 0 的概率为 1 / (1 + 3) = 0.25 （即，25%），而选取下标 1 的概率为 3 / (1 + 3) = 0.75（即，75%）。

示例 1：
输入： [“Solution”,”pickIndex”] [[[1]],[]] 输出： [null,0] 解释： Solution solution = new Solution([1]); solution.pickIndex(); // 返回 0，因为数组中只有一个元素，所以唯一的选择是返回下标 0。
示例 2：
输入： [“Solution”,”pickIndex”,”pickIndex”,”pickIndex”,”pickIndex”,”pickIndex”] [[[1,3]],[],[],[],[],[]] 输出： [null,1,1,1,1,0] 解释： Solution solution = new Solution([1, 3]); solution.pickIndex(); // 返回 1，返回下标 1，返回该下标概率为 3/4 。 solution.pickIndex(); // 返回 1 solution.pickIndex(); // 返回 1 solution.pickIndex(); // 返回 1 solution.pickIndex(); // 返回 0，返回下标 0，返回该下标概率为 1/4 。由于这是一个随机问题，允许多个答案，因此下列输出都可以被认为是正确的: [null,1,1,1,1,0] [null,1,1,1,1,1] [null,1,1,1,0,0] [null,1,1,1,0,1] [null,1,0,1,0,0] …… 诸若此类。

提示：

1 <= w.length <= 104
1 <= w[i] <= 105
pickIndex 将被调用不超过 104 次

思路：
方法1：
不同权重的数返回的概率不同，故可维护一个前缀和，总和为sum，
在([0, sum) + 1)中选择一个随机数idx，二分搜索返回第一个大于等于idx的数的下标

class Solution {
    int[] pre;
    int total;
    public Solution(int[] w) {
        pre = new int[w.length];
        pre[0] = w[0];
        for (int i = 1; i < w.length; i++)
            pre[i] = pre[i - 1] + w[i];
        total = pre[pre.length - 1];
    }
    public int pickIndex() {
        int target = (int)(Math.random() * total) + 1;
        return binarySearch(target);
    }
    private int binarySearch(int target) {
        int l = 0, r = pre.length - 1;
        while (l < r) {
            int mid = l + (r - l >> 1);
            if (pre[mid] < target)
                l = mid + 1;
            else
                r = mid;
        }
        return l;
    }
}
/**
 * Your Solution object will be instantiated and called as such:
 * Solution obj = new Solution(w);
 * int param_1 = obj.pickIndex();
 */

洗牌算法

水塘抽样

问题描述：
给定一个输入序列，长度未知，遍历一遍该序列，并随机返回序列中的某k个数(k < =n)，要求每个数被返回的概率相同，求最终的序列。
思路：
求解过程与序列长度无关，但还要等概率返回。
自己肯定想不出来，直接说正确做法了。
k = 1时

遍历输入序列
根据当前数的下标（从0开始）idx选择随机数x ∈ [0, idx + 1)
若x = 0，将返回值替换为a[idx]
返回返回值

k > 1 时

初始选定前k个数作为返回值，用一个长度为k的数组c记录
从下标为k开始遍历
若当前数的下标为idx，选择随机数x ∈ [0, idx + 1)，若x < k，用a[idx]替换c[x]
遍历完后返回c

证明：
k = 1时
考虑第一个数不被替换的概率
第一个数不被替换，说明遍历后边的数时选择的随机数都不为0，假设当前数下标为k, 选择随机数不为0的概率是k / k + 1
累乘
随机算法 - 图1

k > 1时

类似的操作

:::tips 水塘抽样算法常被用于总数不确定的问题中 :::

382. 链表随机节点

给定一个单链表，随机选择链表的一个节点，并返回相应的节点值。保证每个节点被选的概率一样。
进阶:
如果链表十分大且长度未知，如何解决这个问题？你能否使用常数级空间复杂度实现？
示例:
// 初始化一个单链表 [1,2,3]. ListNode head = new ListNode(1); head.next = new ListNode(2); head.next.next = new ListNode(3); Solution solution = new Solution(head); // getRandom()方法应随机返回1,2,3中的一个，保证每个元素被返回的概率相等。 solution.getRandom();

class Solution {
    ListNode root;
    public Solution(ListNode head) {
        root = head;
    }
    public int getRandom() {
        int res = -1, cnt = 0;
        ListNode cur = root;
        Random r = new Random();
        while (cur != null) {
            cnt++;
            int idx = r.nextInt(cnt);
            if (idx == 0)
                res = cur.val;
            cur = cur.next;
        }
        return res;
    }
}

398. 随机数索引

给定一个可能含有重复元素的整数数组，要求随机输出给定的数字的索引。您可以假设给定的数字一定存在于数组中。
注意：
数组大小可能非常大。使用太多额外空间的解决方案将不会通过测试。
示例:
int[] nums = new int[] {1,2,3,3,3}; Solution solution = new Solution(nums); // pick(3) 应该返回索引 2,3 或者 4。每个索引的返回概率应该相等。 solution.pick(3); // pick(1) 应该返回 0。因为只有nums[0]等于1。 solution.pick(1);

class Solution {
    int[] a;
    public Solution(int[] nums) {
        a = nums;
    }
    public int pick(int target) {
        int res = -1;
        int cnt = 0;
        Random r = new Random();
        for (int i = 0; i < a.length; i++) {
            if (a[i] == target) {
                cnt++;
                if (r.nextInt(cnt) == 0)
                    res = i;
            }
        }
        return res;
    }
}

470. 用 Rand7() 实现 Rand10()

给定方法 rand7 可生成 [1,7] 范围内的均匀随机整数，试写一个方法 rand10 生成 [1,10] 范围内的均匀随机整数。
你只能调用 rand7() 且不能调用其他方法。请不要使用系统的 Math.random() 方法。
每个测试用例将有一个内部参数 n，即你实现的函数 rand10() 在测试时将被调用的次数。请注意，这不是传递给 rand10() 的参数。

示例 1:
输入: 1 输出: [2]
示例 2:
输入: 2 输出: [2,8]
示例 3:
输入: 3 输出: [3,8,10]

提示:

1 <= n <= 105

进阶:

rand7()调用次数的期望值是多少 ?
你能否尽量少调用 rand7() ?

思路：
rand7函数只能生成1-7，七个数不能满足要求，故考虑使用两次rand7组成一个7进制数，范围是1-49
即x = (rand7() - 1) * 7 + rand7()
然后删除一些使其成为10的整数倍。可以考虑删除41-49，只保留1-40，然后返回模10的结果即可
期望调用次数为2 * 1 / (40 / 49) = 49 / 20

/**
 * The rand7() API is already defined in the parent class SolBase.
 * public int rand7();
 * @return a random integer in the range 1 to 7
 */
class Solution extends SolBase {
    public int rand10() {
        int x = (rand7() - 1) * 7 + rand7();
        if (x > 40) return rand10();
        return (x - 1) % 10 + 1;
    }
}

学习笔记

随机算法

497. 非重叠矩形中的随机点