MySQL 8.0 窗口函数实战:12 个业务场景的高效查询写法

MySQL 8.0 窗口函数实战:12 个业务场景高效查询指南

窗口函数(Window Functions)是 MySQL 8.0 的核心增强功能,支持对分组数据执行计算而不聚合结果集。以下是 12 个典型业务场景的实战写法,附 SQL 示例和解析:


场景 1:销售排名

需求:为每个销售员的销售额生成部门内排名

SELECT 
    salesperson_id,
    department,
    sales_amount,
    RANK() OVER (PARTITION BY department ORDER BY sales_amount DESC) AS dept_rank
FROM sales_records;

说明

  • PARTITION BY department 按部门分组
  • RANK() 根据销售额降序排名,同金额并列名次(如 1,1,3)

场景 2:移动平均值

需求:计算股票 7 日移动平均价格

SELECT 
    trade_date,
    stock_price,
    AVG(stock_price) OVER (
        ORDER BY trade_date 
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS moving_avg_7d
FROM stock_daily;

关键点

  • ROWS BETWEEN 6 PRECEDING AND CURRENT ROW 定义滑动窗口为最近 7 天
  • 适用于时间序列平滑处理

场景 3:累计占比

需求:计算每个产品销售额占总销售额的累计百分比

SELECT 
    product_id,
    sales_amount,
    SUM(sales_amount) OVER (ORDER BY sales_amount DESC) / 
    SUM(sales_amount) OVER () * 100 AS cumulative_percent
FROM product_sales;

效果:输出按销售额降序排列的累计占比曲线


场景 4:同环比分析

需求:计算月度销售额环比增长率

SELECT 
    month,
    sales_amount,
    (sales_amount - LAG(sales_amount, 1) OVER (ORDER BY month)) / 
    LAG(sales_amount, 1) OVER (ORDER BY month) * 100 AS growth_rate
FROM monthly_sales;

函数解析

  • LAG(column, N) 获取前第 N 行的数据
  • 分母用上月数据,分子为本月增量

场景 5:分组 Top-N

需求:筛选每个品类销量前 3 的产品

WITH ranked_products AS (
    SELECT 
        product_id,
        category,
        sales_volume,
        DENSE_RANK() OVER (PARTITION BY category ORDER BY sales_volume DESC) AS rank_in_cat
    FROM products
)
SELECT * FROM ranked_products WHERE rank_in_cat <= 3;

技巧

  • 使用 DENSE_RANK() 避免名次断层(如 1,2,2,3)
  • CTE(公用表表达式)简化嵌套查询

场景 6:首尾记录对比

需求:比较用户首次和最后一次登录地点是否相同

SELECT 
    user_id,
    FIRST_VALUE(login_city) OVER (PARTITION BY user_id ORDER BY login_time) AS first_city,
    LAST_VALUE(login_city) OVER (PARTITION BY user_id ORDER BY login_time 
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_city
FROM user_logins;

注意LAST_VALUE() 需指定完整窗口范围,否则默认到当前行


场景 7:连续活跃检测

需求:标记连续 5 天登录的用户

SELECT 
    user_id,
    login_date,
    login_date - INTERVAL (DENSE_RANK() OVER (PARTITION BY user_id ORDER BY login_date)) DAY AS grp,
    COUNT(*) OVER (PARTITION BY user_id, login_date - INTERVAL (DENSE_RANK() OVER (PARTITION BY user_id ORDER BY login_date)) DAY) AS consecutive_days
FROM logins;

原理

  • 利用 DENSE_RANK() 生成序列号
  • 日期减序列号得到连续登录的组标识(grp)
  • 组内计数 >5 即达标

场景 8:差值填充

需求:用前一个有效值填充缺失的库存数据

SELECT 
    date,
    inventory,
    COALESCE(inventory, 
        LAST_VALUE(inventory) IGNORE NULLS OVER (ORDER BY date)) AS filled_inventory
FROM daily_inventory;

函数

  • LAST_VALUE(...) IGNORE NULLS 跳过空值取最近有效值
  • COALESCE() 处理首行为空的情况

场景 9:分位数分析

需求:计算员工薪资的部门四分位数

SELECT 
    employee_id,
    department,
    salary,
    NTILE(4) OVER (PARTITION BY department ORDER BY salary) AS quartile
FROM employees;

输出:quartile=1 代表最低 25% 薪资,4 代表最高 25%


场景 10:会话分割

需求:将用户事件按 30 分钟超时分割会话

SELECT 
    user_id,
    event_time,
    SUM(is_new_session) OVER (ORDER BY user_id, event_time) AS session_id
FROM (
    SELECT *,
        CASE WHEN TIMESTAMPDIFF(MINUTE, 
                LAG(event_time) OVER (PARTITION BY user_id ORDER BY event_time), 
                event_time) > 30 
            THEN 1 ELSE 0 END AS is_new_session
    FROM user_events
) t;

逻辑

  • 计算相邻事件时间差 >30 分钟则标记新会话(is_new_session=1)
  • 累加标记值生成会话 ID

场景 11:层级累加

需求:计算部门树形结构的预算累计值(从根节点向下)

SELECT 
    dept_id,
    parent_id,
    budget,
    SUM(budget) OVER (
        PARTITION BY root_id 
        ORDER BY dept_level 
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS cumulative_budget
FROM (
    SELECT *, 
        CONNECT_BY_ROOT dept_id AS root_id,
        LEVEL AS dept_level
    FROM departments
    START WITH parent_id IS NULL
    CONNECT BY parent_id = PRIOR dept_id
) t;

组合技

  • 递归 CTE(或 CONNECT BY)生成树形层级
  • 窗口函数按层级顺序累加

场景 12:异常波动检测

需求:标记日销售额超过历史均值 ±2 标准差的日期

SELECT 
    sale_date,
    daily_sales,
    CASE WHEN ABS(daily_sales - avg_sales) > 2 * std_dev THEN 1 ELSE 0 END AS is_anomaly
FROM (
    SELECT *,
        AVG(daily_sales) OVER () AS avg_sales,
        STDDEV(daily_sales) OVER () AS std_dev
    FROM daily_sales
) t;

统计方法

  • 窗口函数计算全局均值(avg_sales)和标准差(std_dev)
  • ABS() 判断偏离程度

窗口函数核心语法总结

组件 说明
PARTITION BY 定义分组列(类似 GROUP BY,但不聚合)
ORDER BY 指定排序顺序(决定计算顺序)
窗口帧
ROWS N PRECEDING 包含前 N 行
RANGE INTERVAL 按值范围定义窗口(如时间区间)
常用函数
ROW_NUMBER() 行号(无并列)
LEAD()/LAG() 访问偏移行数据
PERCENT_RANK() 百分比排名

最佳实践

  1. 在子查询中预先过滤数据,减少窗口计算量
  2. 对大型数据集,用 INDEX 优化 PARTITION BYORDER BY 涉及的列
  3. 避免在窗口帧中使用 UNBOUNDED FOLLOWING(可能导致全表扫描)
转载请说明出处内容投诉
CSS教程网 » MySQL 8.0 窗口函数实战:12 个业务场景的高效查询写法

发表评论

欢迎 访客 发表评论

一个令你着迷的主题!

查看演示 官网购买