Oracle data transform using analytic lag row_number case
So I needed to transform this source data:
| Event Category | Start Week | End Week |
|---|---|---|
| LEC | 1 | 1 |
| LEC | 1 | 11 |
| LEC | 1 | 11 |
| LEC | 3 | 3 |
| SEM | 2 | 2 |
| SEM | 4 | 6 |
| SEM | 9 | 9 |
| TUT | 1 | 12 |
| TUT | 4 | 6 |
Into this output format:
| Event Category | Number of Occurrences | Start Week | End Week |
|---|---|---|---|
| LEC | 3 | 1 | 1 |
| LEC | 2 | 2 | 2 |
| LEC | 3 | 3 | 3 |
| LEC | 2 | 4 | 11 |
| SEM | 1 | 2 | 2 |
| SEM | 1 | 4 | 6 |
| SEM | 1 | 9 | 9 |
| TUT | 1 | 1 | 3 |
| TUT | 2 | 4 | 6 |
| TUT | 1 | 7 | 12 |
Seems simple, but required a combination of CASE, LAG, and ROW_NUMBER – as Tom Kyte steps through in detail in this article from 2004.
Here is the solution:
SELECT q5.event_category , q5.num , MIN ( q5.week ) AS start_week , MAX ( q5.week ) AS end_week
FROM (
SELECT q4.week , q4.event_category , q4.num
, MAX ( q4.rn ) OVER ( PARTITION BY q4.event_category ORDER BY q4.week ) AS max_rn
FROM (
SELECT q3.week , q3.event_category , q3.num
, CASE
WHEN q3.num != q3.num_prev THEN q3.row_num
WHEN q3.week != q3.week_prev + 1 THEN q3.row_num
END AS rn -- mark the beginning of each group
FROM (
SELECT q2.week , q2.event_category , q2.num
, LAG ( q2.num , 1 , 0 ) OVER ( PARTITION BY q2.event_category ORDER BY q2.week ) AS num_prev
, LAG ( q2.week , 1 , 0 ) OVER ( PARTITION BY q2.event_category ORDER BY q2.week ) AS week_prev
, ROW_NUMBER ( ) OVER ( PARTITION BY q2.event_category ORDER BY q2.week ) AS row_num
FROM (
SELECT t.event_category , q1.week , COUNT(*) as num
FROM timetable t
JOIN (
SELECT LEVEL AS week
FROM DUAL
CONNECT BY LEVEL <= 52
) q1
ON q1.week BETWEEN t.startw AND t.endw
GROUP BY t.event_category , q1.week
) q2
) q3
) q4
) q5
GROUP BY q5.event_category , q5.num , q5.max_rn
ORDER BY q5.event_category , MIN ( q5.week )
;
Sample data:
CREATE TABLE timetable ( startw NUMBER , endw NUMBER , event_category VARCHAR2(3) ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 11 ,'LEC' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 11 ,'LEC' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 1 , 'LEC' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 3 , 3 , 'LEC' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 2 , 2 , 'SEM' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 4 , 6 , 'SEM' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 9 , 9 , 'SEM' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 12 , 'TUT' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 4 , 6 , 'TUT' ) ;
Leave a Reply