web analytics

How To Retrieve TOP-N Rows from a Table in SQL?

Options

codeling 1599 - 6654
@2016-01-13 19:19:28

Top-N queries provide a method for limiting the number of rows returned from ordered sets of data. it is one of most commonly used queries in Web applications, such as it can be used to show query results split in pages. The postings in this topic show you how to do this in different databases.

Retrieving Top-N rows in Oracle

In a top-N query, you are generally interested in retrieving just the first N rows (the top N rows) of a big table with an hundreds of thousands or more rows.

Generally, there are two ways to approach this:

  • Have the client application run that query and fetch just the first N rows.
  • Use that query as an inline view, and use ROWNUM to limit the results, as in SELECT * FROM (your_query_here) WHERE ROWNUM <= N.

Two reasns make the second approach is superior to the first one. The first reason is that it requires less work by the client, because the database takes care of limiting the result set. The more important reason is the special processing the database can do to give you just the top N rows. 

Let's walk through the two approaches with a sample query:

select *
from t
order by unindexed_column;

Now, assume that T is a big table, with more than one million records, and each record is "fat" - say, 100 or more bytes. Also assume that UNINDEXED_COLUMN is, as its name implies, a column that is not indexed. And assume that you are interested in getting just the first 10 rows. Oracle Database would do the following:

1. Run a full-table scan on T.
2. Sort T by UNINDEXED_COLUMN. This is a full sort.
3. Presumably run out of sort area memory and need to swap temporary extents to disk.
4. Merge the temporary extents back to get the first 10 records when they are requested.
5. Clean up (release) the temporary extents as you are finished with them.

Now, that is a lot of I/O. Oracle Database has most likely copied the entire table into TEMP and written it out, just to get the first 10 rows.

Next, let's look at what Oracle Database can do conceptually with a top-N query:

select *

  from 

  ( select * 

    from t 

    order by unindexed_column

  )

  where ROWNUM < :N;

In this case, Oracle Database will take these steps:

1. Run a full-table scan on T, as before (you cannot avoid this step).
2. In an array of :N elements (presumably in memory this time), sort only :N rows.

The first N rows will populate this array of rows in sorted order. When the N +1 row is fetched, it will be compared to the last row in the array. If it would go into slot N +1 in the array, it gets thrown out. Otherwise, it is added to this array and sorted and one of the existing rows is discarded. Your sort area holds N rows maximum, so instead of sorting one million rows, you sort N rows.

This seemingly small detail of using an array concept and sorting just N rows can lead to huge gains in performance and resource usage. It takes a lot less RAM to sort 10 rows than it does to sort one million rows.

@2016-01-13 19:32:53

Returning TOP-N Rows in SQL Server

In Micrsoft SQL Server, you can use the the following SQL statement to retrieve the top-n rows in a table:

SELECT     TOP 10 *
FROM       Products
ORDER BY UnitPrice DESC

In the above statement, the query will return the first 10 products with the highest unit price.

Moreover, You can use TOP with PERCENT to obtains the top n percent of rows in a table in SQL Server:

SELECT     TOP 10 PERCENT *
FROM       Products
ORDER BY UnitPrice DESC

In the above statement, the query will return the first 10 percent of rows with its unit price sorted in descending order.

@2016-01-13 20:05:55

Retrieving TOP-N Rows in MySQL

In MySQL, you can use the the following SQL statement to retrieve the top-n rows in a table:

SELECT * FROM employee order by salary desc limit 10;

In the above statement, the query will return the first 10 employees with the highest salary.

Moreover, MySQL have a cool feature that will let you return an arbitrary range of rows (eg return rows 10-20). This is very handy for displaying pages of records:

SELECT * FROM employee order by salary desc limit 10 offset 20;
@2016-01-16 14:52:02

Retrieving TOP-N Rows in IBM DB2

DB2 provides a mechanism for limiting the records returned from an SQL query with the FETCH FIRST command. Using the FETCH FIRST syntax improves the performance of queries when not all results are required. Listed below are some examples explaining how to use FETCH FIRST.

The the following SQL statement shows you how to use FETCH FIRST command to retrieve the top-n rows in a DB2 table:

SELECT * FROM employee order by salary desc fetch first 10 rows only;

In the above statement, the query will return the first 10 employees with the highest salary.

Comments

You must Sign In to comment on this topic.


© 2024 Digcode.com