Oracle SQL – Identify sequential value ranges

This is easy to do with a technique called Tabibitosan.

What this technique does is compare the positions of each group’s rows to the overall set of rows, in order to work out if rows in the same group are next to each other or not.

E.g., with your example data, this looks like:

WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
                    SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 6 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 7 ID, 'Kevin' NAME, 'Production' department FROM dual)
-- end of mimicking your table with data in it. See the SQL below:
SELECT ID,
       NAME,
       department,
       row_number() OVER (ORDER BY ID) overall_rn,
       row_number() OVER (PARTITION BY department ORDER BY ID) department_rn,
       row_number() OVER (ORDER BY ID) - row_number() OVER (PARTITION BY department ORDER BY ID) grp
FROM   your_table;

        ID NAME    DEPARTMENT OVERALL_RN DEPARTMENT_RN        GRP
---------- ------- ---------- ---------- ------------- ----------
         1 Michael Marketing           1             1          0
         2 Alex    Marketing           2             2          0
         3 Tom     Marketing           3             3          0
         4 John    Sales               4             1          3
         5 Brad    Marketing           5             4          1
         6 Leo     Marketing           6             5          1
         7 Kevin   Production          7             1          6

Here, I’ve given all the rows across the entire set of data a row number in ascending id order (the overall_rn column), and I’ve given the rows in each department a row number (the department_rn column), again in ascending id order.

Now that I’ve done that, we can subtract one from the other (the grp column).

Notice how the number in the grp column remains the same for deparment rows that are next to each other, but it changes each time there’s a gap.

E.g. for the Marketing department, rows 1-3 are next to each other and have grp = 0, but the 4th Marketing row is actually on the 5th row of the overall results set, so it now has a different grp number. Since the 5th marketing row is on the 6th row of the overall set, it has the same grp number as the 4th marketing row, so we know they’re next to each other.

Once we have that grp information, it’s a simple matter of doing an aggregate query grouping on both the department and our new grp column, using min and max to find the start and end ids:

WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
                    SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 6 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 7 ID, 'Kevin' NAME, 'Production' department FROM dual)
-- end of mimicking your table with data in it. See the SQL below:
SELECT department,
       MIN(ID) start_id,
       MAX(ID) end_id
FROM   (SELECT ID,
               NAME,
               department,
               row_number() OVER (ORDER BY ID) - row_number() OVER (PARTITION BY department ORDER BY ID) grp
        FROM   your_table)
GROUP BY department, grp;

DEPARTMENT   START_ID     END_ID
---------- ---------- ----------
Marketing           1          3
Marketing           5          6
Sales               4          4
Production          7          7

N.B., I’ve assumed that gaps in the id columns aren’t important (i.e. if there was no row for id = 6 (so Leo and Kevin’s ids were 7 and 8 respectively), then Leo and Brad would still appear in the same group, with a start id = 5 and end id = 7.

If gaps in the id columns count as indicating a new group, then you could just use the id to label the overall set of rows (i.e. no need to caluclate the overall_rn; just use the id column instead).

That means your query would become:

WITH your_table AS (SELECT 1 ID, 'Michael' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 2 ID, 'Alex' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 3 ID, 'Tom' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 4 ID, 'John' NAME, 'Sales' department FROM dual UNION ALL
                    SELECT 5 ID, 'Brad' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 7 ID, 'Leo' NAME, 'Marketing' department FROM dual UNION ALL
                    SELECT 8 ID, 'Kevin' NAME, 'Production' department FROM dual)
-- end of mimicking your table with data in it. See the SQL below:
SELECT department,
       MIN(ID) start_id,
       MAX(ID) end_id
FROM   (SELECT ID,
               NAME,
               department,
               ID - row_number() OVER (PARTITION BY department ORDER BY ID) grp
        FROM   your_table)
GROUP BY department, grp;

DEPARTMENT   START_ID     END_ID
---------- ---------- ----------
Marketing           1          3
Sales               4          4
Marketing           5          5
Marketing           7          7
Production          8          8

Leave a Comment