在mysql中带了随机取数据的函数,在mysql中我们会有rand()函数,很多朋友都会直接使用,如果几百条数据肯定没事,如果几万或百万时你会发现,直接使用是错误的。下面我来介绍随机取数据一些优化方法。

    1. --
    2. SELECT * FROM table_name ORDER BY rand() LIMIT 5;
    3. -- 较慢
    4. SELECT * FROM `table`
    5. WHERE id >= (SELECT floor( RAND() * ((SELECT MAX(id) FROM `table`)-(SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)))
    6. ORDER BY id LIMIT 1;
    7. -- `table 有 id 字段
    8. SELECT *
    9. FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * ((SELECT MAX(id) FROM `table`)-(SELECT MIN(id) FROM `table`))+(SELECT MIN(id) FROM `table`)) AS id) AS t2
    10. WHERE t1.id >= t2.id
    11. ORDER BY t1.id LIMIT 1;
    12. -- 快 `table 没有有 id 字段
    13. select * from (select @rownum:=@rownum + 1 as id,value from `table`,(select @rownum:=0) as a) as t1 join (
    14. select round( rand() * (
    15. (select max(b.id) from (select @rownum_max:=@rownum_max + 1 as id,value from `table`,(select @rownum_max:=0) as a) as b ) -
    16. (select min(b.id) from (select @rownum_min:=@rownum_min + 1 as id,value from `table`,(select @rownum_min:=0) as a) as b )
    17. )) +
    18. (select min(b.id) from (select @rownum_min1:=@rownum_min1 + 1 as id,value from `table`,(select @rownum_min1:=0) as a) as b ) as id
    19. ) as t2
    20. on t1.id>= t2.id
    21. order by t1.id limit 1

    rand在手册里是这么说的:
    RAND()
    RAND(N)
    返回在范围0到1.0内的随机浮点值。如果一个整数参数N被指定,它被用作种子值。

    1. mysql> select RAND();
    2. -> 0.5925
    3. mysql> select RAND(20);
    4. -> 0.1811
    5. mysql> select RAND(20);
    6. -> 0.1811
    7. mysql> select RAND();
    8. -> 0.2079
    9. mysql> select RAND();
    10. -> 0.7888

    你不能在一个ORDER BY子句用RAND()值使用列,因为ORDER BY将重复计算列多次。然而在MySQL3.23中,你可以做: SELECT FROM table_name ORDER BY RAND(),这是有利于得到一个来自SELECT FROM table1,table2 WHERE a=b AND c基本上都是查询max(id) * rand()来随机获取数据。

    1. SELECT *
    2. FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `table`)) AS id) AS t2
    3. WHERE t1.id >= t2.id
    4. ORDER BY t1.id ASC LIMIT 5;

    但是这样会产生连续的5条记录。解决办法只能是每次查询一条,查询5次。即便如此也值得,因为15万条的表,查询只需要0.01秒不到。

    完整查询语句是:

    1. -- 耗时:0.147433
    2. SELECT * FROM `table`
    3. WHERE id >= (SELECT floor( RAND() * ((SELECT MAX(id) FROM `table`)-(SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)))
    4. ORDER BY id LIMIT 1;
    5. -- 耗时:0.015130
    6. SELECT *
    7. FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * ((SELECT MAX(id) FROM `table`)-(SELECT MIN(id) FROM `table`))+(SELECT MIN(id) FROM `table`)) AS id) AS t2
    8. WHERE t1.id >= t2.id
    9. ORDER BY t1.id LIMIT 1;

    执行效率需要0.02 sec.可惜的是,只有mysql 4.1.*以上才支持这样的子查询.