Oracle data transform using analytic lag row_number case

So I needed to transform this source data:

Event CategoryStart WeekEnd Week
LEC11
LEC111
LEC111
LEC33
SEM22
SEM46
SEM99
TUT112
TUT46

Into this output format:

Event CategoryNumber of OccurrencesStart WeekEnd Week
LEC311
LEC222
LEC333
LEC2411
SEM122
SEM146
SEM199
TUT113
TUT246
TUT1712

Seems simple, but required a combination of CASE, LAG, and ROW_NUMBER – as Tom Kyte steps through in detail in this article from 2004.

Here is the solution:

SELECT q5.event_category , q5.num , MIN ( q5.week ) AS start_week , MAX ( q5.week ) AS end_week
  FROM (
       SELECT q4.week , q4.event_category , q4.num
            , MAX ( q4.rn ) OVER ( PARTITION BY q4.event_category ORDER BY q4.week ) AS max_rn
        FROM (
             SELECT q3.week , q3.event_category , q3.num
                  , CASE
                       WHEN q3.num != q3.num_prev THEN q3.row_num
                       WHEN q3.week != q3.week_prev + 1 THEN q3.row_num
                       END AS rn  -- mark the beginning of each group
               FROM (
                    SELECT q2.week , q2.event_category , q2.num
                         , LAG  ( q2.num  , 1 , 0 ) OVER ( PARTITION BY q2.event_category ORDER BY q2.week ) AS num_prev
                         , LAG  ( q2.week , 1 , 0 ) OVER ( PARTITION BY q2.event_category ORDER BY q2.week ) AS week_prev
                         , ROW_NUMBER ( ) OVER ( PARTITION BY q2.event_category ORDER BY q2.week ) AS row_num
                      FROM (
                           SELECT t.event_category , q1.week , COUNT(*) as num
                             FROM timetable t
                             JOIN (
                                  SELECT LEVEL AS week
                                    FROM DUAL
                                 CONNECT BY LEVEL <= 52
                                  ) q1
                               ON q1.week BETWEEN t.startw AND t.endw
                            GROUP BY t.event_category , q1.week
                           ) q2
                    ) q3
             ) q4
       ) q5
 GROUP BY q5.event_category , q5.num , q5.max_rn
 ORDER BY q5.event_category , MIN ( q5.week )
;

Sample data:

CREATE TABLE timetable ( startw NUMBER , endw NUMBER , event_category VARCHAR2(3) ) ;
INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 11  ,'LEC' ) ;
INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 11  ,'LEC' ) ;
INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 1 , 'LEC' ) ;
INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 3 , 3 , 'LEC' ) ;
INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 2 , 2 , 'SEM' ) ;
INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 4 , 6 , 'SEM' ) ;
INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 9 , 9 , 'SEM' ) ;
INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 1 , 12 , 'TUT' ) ;
INSERT INTO timetable ( startw , endw , event_category ) VALUES ( 4 , 6 , 'TUT' ) ;
December 17, 2014

Leave a Reply

Your email address will not be published. Required fields are marked *