Oracle data transform using analytic lag row_number case
So I needed to transform this source data:
Event Category | Start Week | End Week |
---|---|---|
LEC | 1 | 1 |
LEC | 1 | 11 |
LEC | 1 | 11 |
LEC | 3 | 3 |
SEM | 2 | 2 |
SEM | 4 | 6 |
SEM | 9 | 9 |
TUT | 1 | 12 |
TUT | 4 | 6 |
Into this output format:
Event Category | Number of Occurrences | Start Week | End Week |
---|---|---|---|
LEC | 3 | 1 | 1 |
LEC | 2 | 2 | 2 |
LEC | 3 | 3 | 3 |
LEC | 2 | 4 | 11 |
SEM | 1 | 2 | 2 |
SEM | 1 | 4 | 6 |
SEM | 1 | 9 | 9 |
TUT | 1 | 1 | 3 |
TUT | 2 | 4 | 6 |
TUT | 1 | 7 | 12 |
Seems simple, but required a combination of CASE, LAG, and ROW_NUMBER – as Tom Kyte steps through in detail in this article from 2004.
Here is the solution:
SELECT q5.event_category , q5.num , MIN ( q5.week ) AS start_week , MAX ( q5.week ) AS end_week FROM ( SELECT q4.week , q4.event_category , q4.num , MAX ( q4.rn ) OVER ( PARTITION BY q4.event_category ORDER BY q4.week ) AS max_rn FROM ( SELECT q3.week , q3.event_category , q3.num , CASE WHEN q3.num != q3.num_prev THEN q3.row_num WHEN q3.week != q3.week_prev + 1 THEN q3.row_num END AS rn -- mark the beginning of each group FROM ( SELECT q2.week , q2.event_category , q2.num , LAG ( q2.num , 1 , 0 ) OVER ( PARTITION BY q2.event_category ORDER BY q2.week ) AS num_prev , LAG ( q2.week , 1 , 0 ) OVER ( PARTITION BY q2.event_category ORDER BY q2.week ) AS week_prev , ROW_NUMBER ( ) OVER ( PARTITION BY q2.event_category ORDER BY q2.week ) AS row_num FROM ( SELECT t.event_category , q1.week , COUNT(*) as num FROM timetable t JOIN ( SELECT LEVEL AS week FROM DUAL CONNECT BY LEVEL <= 52 ) q1 ON q1.week BETWEEN t.startw AND t.endw GROUP BY t.event_category , q1.week ) q2 ) q3 ) q4 ) q5 GROUP BY q5.event_category , q5.num , q5.max_rn ORDER BY q5.event_category , MIN ( q5.week ) ;
Sample data:
CREATE TABLE timetable ( startw NUMBER , endw NUMBER , event_category VARCHAR2(3) ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 11 ,'LEC' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 11 ,'LEC' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 1 , 'LEC' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 3 , 3 , 'LEC' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 2 , 2 , 'SEM' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 4 , 6 , 'SEM' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 9 , 9 , 'SEM' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 12 , 'TUT' ) ; INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 4 , 6 , 'TUT' ) ;
Leave a Reply