Thursday, July 3, 2014

Uses of Partition By in Oracle 11g



Partition By

Partition By is well documented, and there are plenty of sites which explain it in more technical terms.  For anyone new to Oracle, or SQL, I would imagine Partition By is fairly obscure and seemingly untouchable.  I am going to lay out some examples that show some ordinary and unique uses of the key word.

Total Hours and % of Hours

Say you work in a college and want to produce data on students and the % of credits they have within a particular data set.  Take this example: 

Example 1 

Select

      StudID,
      Hours,
      Count(*)   Over (Partition By 'x') Total_Records,
      Sum(Hours) Over (Partition By 'x') Total_Hours,
    Hours / Sum(Hours) Over (Partition By 'x') Prct_of_Hours

From
(
    Select '001' StudID, 12 Hours From dual UNION ALL
    Select '002' StudID, 18 Hours From dual UNION ALL
    Select '003' StudID,  9 Hours From dual UNION ALL
    Select '005' StudID,  3 Hours From dual
) 

Output

ID
Hours
Total
Records
Total
Hours
Avg
Hours
% of
Hours
1
12
4
42
10.5
0.29
2
18
4
42
10.5
0.43
3
9
4
42
10.5
0.21
5
3
4
42
10.5
0.07



It is an obscure use of Partition By that will get you the data set count of records by partitioning on ‘x’, which will effectively include the whole group.   

Students & Avg Age per Course

A more conventional example would be pulling a list of all the course records for a semester and noting how many records there was for the CourseID, and have that count listed on each record.  You might also want to include the Student ID, Age, and Average Class Age.  Showing the student's age compared to the average age may give indication that the student is in the wrong class, is very advanced, or may need additional help.

Note that this may not necessarily be the total number of students enrolled, depending on your system, but could be with some massaging. 

Example 2


Select

    CourseID,

    StudID,
    Count(*) Over (Partition By CourseID) CRN_Records,
    Age,
    Round(Sum(Age) Over (Partition By CourseID) / Count(*) Over (Partition By CourseID), 1) Avg_Age
From
(
    Select '7777' CourseID, '001' StudID, '18' Age From dual UNION ALL
    Select '7777' CourseID, '002' StudID, '21' Age From dual UNION ALL
    Select '7777' CourseID, '003' StudID, '22' Age From dual UNION ALL
    Select '4444' CourseID, '004' StudID, '29' Age From dual UNION ALL
    Select '4444' CourseID, '005' StudID, '31' Age From dual UNION ALL
    Select '4444' CourseID, '006' StudID, '28' Age From dual UNION ALL
    Select '4444' CourseID, '007' StudID, '20' Age From dual UNION ALL
    Select '9999' CourseID, '008' StudID, '18' Age From dual UNION ALL
    Select '9999' CourseID, '009' StudID, '18' Age From dual
)
 


Output (notice that student 007 is an outlier as far as age goes):

COURSE
ID
STUD
ID
CourseID
Records
Age
Avg
Age
4444
005
4
31
27
4444
007
4
20
27
4444
006
4
28
27
4444
004
4
29
27
7777
003
3
22
20.3
7777
002
3
21
20.3
7777
001
3
18
20.3
9999
009
2
18
18
9999
008
2
18
18

Checking for Duplicates

Another unique use of partition by is duplicate checking.  Often times you're get students whose ID's (or specifically SSN) are duplicated in the system, and the records need to be merged.  Partition by can help with this by counting the number of records that appear with the same SSN, but with out the need of a group by.

Example 3


Select

    SSN,

    StudID,
    StudName,
  --Note: 'Order By' is required when using the Row_Number() function
    Count(*) Over (Partition By SSN) as SSN_Dup_Count,
    Row_Number() Over (Partition By SSN Order By SSN)
      || ' of ' || Count(*) Over (Partition By SSN) as SSN_Count
From
    (    Select '123' SSN, '001' StudID, 'John Smith'      StudName From dual UNION ALL
        Select '123' SSN, '002' StudID, 'Jon Smith'       StudName From dual UNION ALL
        Select '234' SSN, '003' StudID, 'Sam Jones'       StudName From dual UNION ALL
        Select '234' SSN, '004' StudID, 'Samantha Jones'  StudName From dual
    )

Output

SSN
STUD
ID
STUDNAME
SSN DUP COUNT
SSN COUNT
123
1
John Smith
2
1 of 2
123
2
Jon Smith
2
2 of 2
234
3
Sam Jones
2
1 of 2
234
4
Samantha Jones
2
2 of 2