How to understand oracle Analysis function 07/16 Update SLTechnology News&Howtos

How to understand oracle Analysis function

2025-07-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >

Shulou(Shulou.com)06/01 Report--

This article shows you how to understand the oracle analysis function, the content is concise and easy to understand, can definitely brighten your eyes, through the detailed introduction of this article, I hope you can get something.

Oracle development

1. Brief introduction of Oracle analysis function:

In the daily production environment, we are more exposed to OLTP systems (that is, Online Transaction Process), which are characterized by real-time requirements, or at least certain requirements for the response time; secondly, the business logic of these systems is generally complex and may need to go through many operations. For example, we often come into contact with the electronic mall.

In addition to these systems, there is a system called OLAP (Online Aanalyse Process), which is generally used for system decision-making. It is usually associated with the concepts of data warehouse, data analysis, data mining and so on. These systems are characterized by a large amount of data, low requirements for real-time response or do not pay attention to these requirements at all, mainly query and statistical operations.

Let's take a look at some typical examples:

① finds the top 10 employees in each sales region in the previous year

② finds customers who accounted for more than 20% of the total orders in the previous year by region.

③ finds the area of the department with the worst sales in the previous year

④ finds the best and worst selling products in the previous year

If we look at the above examples, we can see that these queries are somewhat different from the queries we encounter on a daily basis, including:

① requires different levels of aggregation of the same data.

② needs to compare multiple pieces of data with the same data multiple times in the table.

③ requires additional filtering operations on the sorted result set

Parse function syntax:

FUNCTION_NAME (,...)

OVER

()

Example:

Sum (sal) over (partition by deptno order by ename) new_alias

Sum is the name of the function

(sal) is the parameter of the analysis function. Each function has three parameters, and the parameter can be an expression, for example: sum (sal+comm)

Over is a keyword that identifies the parsing function, otherwise the query parser cannot distinguish between the sum () aggregate function and the sum () parsing function.

Partition by deptno is an optional partition clause. If there are no partition clauses, then all result sets can be treated as a single theater.

Order by ename is an optional order by clause, which is required by some functions and not by others. Functions that rely on sorted data, such as LAG and LEAD to access the first and last rows of the result set, must be used, while other functions, such as AVG, are not needed. This clause is mandatory when any sorted windowing function is used, specifying how the data in a set of parsing functions is sorted.

1) FUNCTION clause

ORACLE provides 26 analysis functions, which are divided into 5 categories by function.

Classification of analytical functions

Ranking function: used to find the top N queries

Windowing function: used to calculate different accumulations, such as SUM,COUNT,AVG,MIN,MAX, acting on a window of data

Example:

Sum (t.sal) over (order by t.deptnoret.ename) running_total

Sum (t.sal) over (partition by t.deptno order by t.ename) department_total

Reporting function: acts on all columns on a partition or group with the same name as the windowing function

Example:

Sum (t.sal) over () running_total2

Sum (t.sal) over (partition by t.deptno) department_total2

The key difference between tabulation and windowing functions is the lack of an ORDER BY clause on the OVER statement!

LAG,LEAD functions: these functions allow you to retrieve values forward or backward in the result set, and they are very useful in order to avoid self-joining of data.

VAR_POP,VAR_SAMP,STDEV_POPE and linear attenuation function: calculate the statistics of any unsorted partition

2) PARTITION clause

Partition by expression (that is, grouping), if the partition clause is omitted, the entire result set is treated as a single group

3) ORDER BY clause

The existence of ORDER BY in the analysis function will add a default windowing clause, which means that the collection of rows used in the calculation is the current row and all the previous rows in the current partition, and when there is no ORDER BY, the default window is all partitions after the Order by clause can be added nulls last, such as: order by comm desc nulls last means to ignore the row that comm column is empty when sorting.

4) WINDOWING clause

Used to define the collection of rows on which the parsing function will operate

The Windowing clause gives a way to define a changing or fixed data window, and the parsing function will manipulate the data

The default window is a fixed window, which only starts on the first line of a group and continues to the current line. To use the window, you must use the ORDER BY clause

Windows can be established according to two criteria: the range of data values (RANGES) or the row offset from the current row.

5) Rang window

Range 5 preceding: a sliding window will be generated that owns a collection of the first five rows of the current row in the group

ANGE windows work only for NUMBERS and DATES, because it is not possible to add or subtract N cells from VARCHAR2

Another limitation is that there can be only one column in ORDER BY, so the range is actually one-dimensional and cannot be in N-dimensional space.

Example:

Avg (t.sal) over (order by t.hiredate asc range 100preceding) Statistics the average wage in the first 100 days

6) Row window

With ROW partitions, there are no restrictions like RANGE partitions, data can be of any type, and ORDER BY can include many columns

7) Specifying window

UNBOUNDED PRECEDING: this window starts at each line of the current partition and ends at the current line being processed

CURRENT ROW: this window starts (and ends) on the current line

Numeric Expression PRECEDING: for this window, start with the line of the numeric expression (Numeric Expression) before the current line, and for RANGE, start with the value of the current row whose sequence value is less than that of the numeric expression.

Numeric Expression FOLLOWING: this window terminates (or starts) the line after the current line Numeric Expression line, and starts (or terminates) when the sequence value is greater than the range of the current line Numeric Expression line

Range between 100 preceding and 100 following: current line before 100, after before and after 100

Note: parsing functions allow you to sort and filter a data set, which SQL has never been able to achieve. Except for the final Order by clause, the parsing function is the last set of operations executed in the query, so you cannot use the parsing function directly in the predicate, that is, you cannot use the where or having clause on it!

2. A simple example of Oracle analysis function:

Let's look at the application of the analysis function through a practical example: look for customers whose total orders accounted for more than 20% of the total orders in the previous year by region.

[1] Test environment:

SQL > desc orders_tmp

Name Null? Type

CUST_NBR NOT NULL NUMBER (5)

REGION_ID NOT NULL NUMBER (5)

SALESPERSON_ID NOT NULL NUMBER (5)

YEAR NOT NULL NUMBER (4)

MONTH NOT NULL NUMBER (2)

TOT_ORDERS NOT NULL NUMBER (7)

TOT_SALES NOT NULL NUMBER (11pr 2)

[2] Test data:

SQL > select * from orders_tmp

CUST_NBR REGION_ID SALESPERSON_ID YEAR MONTH TOT_ORDERS TOT_SALES

11 7 11 2001 7 2 12204

4 5 4 2001 10 2 37802

7 6 7 2001 2 3 3750

10 6 8 2001 1 2 21691

10 6 7 2001 2 3 42624

15 7 12 2000 5 6 24

12 7 9 2000 6 2 50658

1 5 2 2000 3 2 44494

1 5 1 2000 9 2 74864

2 5 4 2000 3 2 35060

2 54 2000 4 4 6454

2 5 1 2000 10 4 35580

4 5 4 2000 12 2 39190

13 rows selected.

[3] Test statement:

SQL > select o.cust_nbr customer

2 o.region_id region

3 sum (o.tot_sales) cust_sales

4 sum (sum (o.tot_sales)) over (partition by o.region_id) region_sales

5 from orders_tmp o

6 where o.year = 2001

7 group by o.region_id, o.cust_nbr

CUSTOMER REGION CUST_SALES REGION_SALES

4 5 37802 37802

7 6 3750 68065

10 6 64315 68065

11 7 12204 12204

3. Analysis function OVER parsing:

Notice the green highlight above. Group by's intention is clear: group data by region ID and customers, so what's the use of Over? If we only need to count the total amount of orders for each customer in each region, then all we need is group by o.regionclassiidbook.custroomnbr. But we also want to show the total order amount for the customer's region on each line, which is different from the previous one: we need to accumulate by region on the basis of the previous grouping. It is clear that group by and sum cannot do this (because the level of aggregation operations is different, the former is for a customer, and the latter is for a group of customers).

This is what the over function does! Its purpose is to tell the SQL engine to partition the data by region and then accumulate the total order amount for each customer in each region (sum (sum (o.tot_sales).

Now that we know the total amount of orders for each customer and its corresponding region in 2001, here is the screening of major customers whose total personal orders account for more than 20% of the total regional orders.

SQL > select *

2 from (select o.cust_nbr customer

3 o.region_id region

4 sum (o.tot_sales) cust_sales

5 sum (sum (o.tot_sales)) over (partition by o.region_id) region_sales

6 from orders_tmp o

7 where o.year = 2001

8 group by o.region_id, o.cust_nbr) all_sales

9 where all_sales.cust_sales > all_sales.region_sales * 0.2

CUSTOMER REGION CUST_SALES REGION_SALES

4 5 37802 37802

10 6 64315 68065

11 7 12204 12204

SQL >

Now we already know who these big clients are! Oh, but that's not enough. what if we want to know the proportion of orders for each major customer? Take a look at the SQL statement below. All you need is a simple Round function.

SQL > select all_sales.*

2 100 * round (cust_sales / region_sales, 2) | |'% 'Percent

3 from (select o.cust_nbr customer

4 o.region_id region

5 sum (o.tot_sales) cust_sales

6 sum (sum (o.tot_sales)) over (partition by o.region_id) region_sales

7 from orders_tmp o

8 where o.year = 2001

9 group by o.region_id, o.cust_nbr) all_sales

10 where all_sales.cust_sales > all_sales.region_sales * 0.2

CUSTOMER REGION CUST_SALES REGION_SALES PERCENT

4 5 37802 37802

10 6 64315 68065 94%

11 7 12204 12204

SQL >

Summary:

The ① Over function indicates which fields to analyze, followed by Partition by to group the data. Note that Partition by can have multiple fields.

② Over function can be collocated with other aggregate functions and analysis functions, and play a different role. For example, SUM here, as well as Rank,Dense_rank and so on.

Oracle Development Project: analysis function 2 (Rank, Dense_rank, row_number)

Catalogue

1. Use rownum to record rankings

two。 Use analysis functions to rank records

3. Use analysis functions to rank records in groups

First, use rownum to record rankings:

In the previous article, "Oracle Development Topics: analysis functions," we learned about the basic application of analysis functions. Now let's consider the following questions:

① ranks all customers by total order

② ranks by region and total customer orders

③ finds the top 13 customers with total orders

④ finds the customers with the highest and lowest total orders.

⑤ identifies the top 25% of customers with total orders

According to the idea of the first article, we can only count the data of each group. if we need to rank, then we just need to simply add rownum, right? Let's put it into practice whether the fact is as simple as imagined.

[1] Test environment:

SQL > desc user_order

Name Null? Type

REGION_ID NUMBER (2)

CUSTOMER_ID NUMBER (2)

CUSTOMER_SALES NUMBER

[2] Test data:

SQL > select * from user_order order by customer_sales

REGION_ID CUSTOMER_ID CUSTOMER_SALES

51 151162

10 29 903383

6 7 971585

10 28 986964

9 21 1020541

9 22 1036146

8 16 1068467

6 8 1141638

5 3 1161286

5 5 1169926

8 19 1174421

7 12 1182275

7 11 1190421

6 10 1196748

6 9 1208959

10 30 1216858

5 2 1224992

9 24 1224992

9 23 1224992

8 18 1253840

7 15 1255591

7 13 1310434

10 27 1322747

8 20 1413722

6 6 1788836

10 26 1808949

5 4 1878275

7 14 1929774

8 17 1944281

9 25 2232703

30 rows selected.

Note that there are three records with the same total order. If we need to screen the top 12 customers now, what will happen if we use rownum?

SQL > select rownum, t.*

2 from (select *

3 from user_order

4 order by customer_sales desc) t

5 where rownum select region_id, customer_id, sum (customer_sales) total

2 rank () over (order by sum (customer_sales) desc) rank

3 dense_rank () over (order by sum (customer_sales) desc) dense_rank

4 row_number () over (order by sum (customer_sales) desc) row_number

5 from user_order

6 group by region_id, customer_id

REGION_ID CUSTOMER_ID TOTAL RANK DENSE_RANK ROW_NUMBER

8 18 1253840 11 11 11

5 2 1224992 12 12 12

9 23 1224992 12 12 13

9 24 1224992 12 12 14

10 30 1216858 15 13 15

30 rows selected.

Please note the green highlight above, which vividly demonstrates three different ranking strategies:

For the first same record by ①, the ranking of the three functions is the same: 12

When the second same record appears in ②, Rank and Dense_rank still give the same ranking 12, while row_number increases to 13, and so on to the third same record.

When the ③ rankings proceed to the next different record, you can see that the Rank function vacates the 1313 14 ranking between 12 and 15, because these two rankings are actually occupied by the second and third same records. While Dense_rank increases sequentially. The row_number function is also incremented sequentially.

Comparing the above three different strategies, we have to decide according to the needs of our customers when we choose:

① if the customer only needs a specified number of records, then it is easiest to adopt row_number, but there is a risk of missing records

② if customers need all the records that reach the ranking level, then rank or dense_rank is a good choice. As for which one to choose depends on the needs of the customer, choose dense_rank or get the largest record

Third, use the analysis function to rank the records in groups:

The above ranking is ranked by the total amount of orders, and now it goes a step further: what if it is ranked for the total amount of orders in each region? This means another grouping operation: grouping records by region and then ranking them. Fortunately, Oracle also provides such support, and all we need to do is add a grouping clause before order by in the over function: partition by region_id.

SQL > select region_id, customer_id

Sum (customer_sales) total

2 rank () over (partition by region_id

Order by sum (customer_sales) desc) rank

3 dense_rank () over (partition by region_id

Order by sum (customer_sales) desc) dense_rank

4 row_number () over (partition by region_id

Order by sum (customer_sales) desc) row_number

5 from user_order

6 group by region_id, customer_id

REGION_ID CUSTOMER_ID TOTAL RANK DENSE_RANK ROW_NUMBER

5 4 1878275 1 1 1

5 2 1224992 22 2

5 5 1169926 3 3 3

6 6 1788836 1 1 1

6 9 1208959 2 2 2

6 10 1196748 3 3 3

30 rows selected.

Now the rankings we see will be based on each region, not all regions! The role of the Partition by clause in the permutation function is to divide a result set into parts so that the permutation function can be applied to each subset.

Two of the five problems we mentioned above have been solved (1, 2), and the remaining 3 problems (Top/Bottom NMagic FirstLast, NTile) will be explained in the next article.

Oracle Development Project: analysis function 3 (Top/Bottom N, First/Last, NTile)

Catalogue

1. Permutations with null values

2.Top/Bottom N query

3.First/Last ranking query

4. Query by hierarchy

First, the arrangement with null values:

In the previous article "Oracle Development Project: analysis function 2 (Rank, Dense_rank, row_number)", we already knew how to arrange a batch of records in full and in groups. What if the arranged data contains null values?

SQL > select region_id, customer_id

2 sum (customer_sales) cust_sales

3 sum (sum (customer_sales)) over (partition by region_id) ran_total

4 rank () over (partition by region_id

5 order by sum (customer_sales) desc) rank

6 from user_order

7 group by region_id, customer_id

REGION_ID CUSTOMER_ID CUST_SALES RAN_TOTAL RANK

10 31 6238901 1

10 26 1808949 6238901 2

10 27 1322747 6238901 3

10 30 1216858 6238901 4

10 28 986964 6238901 5

10 29 903383 6238901 6

We can see that there is a record with the CUST_TOTAL field value of NULL, but it is actually ranked first! Obviously, it doesn't make sense. So let's readjust our ranking strategy and take a look at the following sentence:

SQL > select region_id, customer_id

2 sum (customer_sales) cust_total

3 sum (sum (customer_sales)) over (partition by region_id) reg_total

4 rank () over (partition by region_id

Order by sum (customer_sales) desc NULLS LAST) rank

5 from user_order

6 group by region_id, customer_id

REGION_ID CUSTOMER_ID CUST_TOTAL REG_TOTAL RANK

10 26 1808949 6238901 1

10 27 1322747 6238901 2

10 30 1216858 6238901 3

10 28 986964 6238901 4

10 29 903383 6238901 5

10 31 6238901 6

In the green highlight, NULLS LAST/FIRST tells Oracle to let the null value rank last and first.

Note that it is NULLS, not NULL.

2. Top/Bottom N query:

In daily work and production, we often encounter such inquiries: find out the top 5 order customers, find out the top 10 sales staff, and so on. Now this is a very simple problem for us. Let's demonstrate it with a practical example:

[1] identify the top 3 major customers with total orders:

SQL > select *

SQL > from (select region_id

SQL > customer_id

SQL > sum (customer_sales) cust_total

SQL > rank () over (order by sum (customer_sales) desc NULLS LAST) rank

SQL > from user_order

SQL > group by region_id, customer_id)

SQL > where rank

[2] identify the top 3 major customers in each region:

SQL > select *

2 from (select region_id

3 customer_id

4 sum (customer_sales) cust_total

5 sum (sum (customer_sales)) over (partition by region_id) reg_total

6 rank () over (partition by region_id

Order by sum (customer_sales) desc NULLS LAST) rank

7 from user_order

8 group by region_id, customer_id)

9 where rank select min (customer_id)

2 keep (dense_rank first order by sum (customer_sales) desc) first

3 min (customer_id)

4 keep (dense_rank last order by sum (customer_sales) desc) last

5 from user_order

6 group by customer_id

FIRST LAST

31 1

Here are a few things that seem confusing:

① Why do you use the min function here?

What is this thing called ② Keep?

What does ③ fist/last do?

What's the difference between ④ dense_rank and dense_rank ()? can I change it to rank?

First, answer the first question: the purpose of the min function is to ensure that a unique record is returned when there is more than one First/Last. What will happen if we get rid of it?

SQL > select keep (dense_rank first order by sum (customer_sales) desc) first

2 keep (dense_rank last order by sum (customer_sales) desc) last

3 from user_order

4 group by customer_id

Select keep (dense_rank first order by sum (customer_sales) desc) first

ERROR at line 1:

ORA-00907: missing right parenthesis

Let's take a look at the second question: what is keep for? From the above results, we already know that Oracle only "retains" 2 pieces of data for ranking results, which is the role of keep. Tell Oracle to keep only records that meet the keep criteria.

So what is a qualified record? This is the third question. Dense_rank is the strategy that tells Oracle to arrange, and first/last tells the final filter criteria.

Question 4: what if we replace dense_rank with rank?

SQL > select min (region_id)

2 keep (rank first order by sum (customer_sales) desc) first

3 min (region_id)

4 keep (rank last order by sum (customer_sales) desc) last

5 from user_order

6 group by region_id

Select min (region_id)

ERROR at line 1:

ORA-02000: missing DENSE_RANK

4. Query by hierarchy:

Now we have seen how to get Top/Bottom N, the first and last record, through Oracle's analysis function. Sometimes we get a demand like this: find out the customers who are in the top 1 to 5 in total order volume.

Familiar, isn't it? We will immediately think of the method mentioned in the second point, but the rank function only ranks for us and does not know the relative position of each ranking in the overall ranking. At this time, we introduce another analysis function, NTile. Let's take the above requirements as an example:

SQL > select region_id

2 customer_id

3 ntile (5) over (order by sum (customer_sales) desc) til

4 from user_order

5 group by region_id, customer_id

REGION_ID CUSTOMER_ID TILE

10 31 1

9 25 1

10 26 1

6 6 1

8 18 2

5 2 2

9 23 3

6 9 3

7 11 3

5 3 4

6 8 4

8 16 4

6 7 5

10 29 5

5 1 5

The Ntil function calculates the proportion of the ranking of each record in the recordset, and we see that all records are divided into five levels, so if we only need the first one to 5 records, we only need to intercept records with a TILE value of 1. If we need the top 25% of the records (that is, 1 Universe 4), then we only need to set ntile (4).

Oracle Development Project: window function

1. Brief introduction of window function

two。 Window function example-full Statistics

3. Window function advance-rolling statistics (cumulative / mean)

4. Window function advance-statistics based on time range

5. Window function advance-first_value/last_value

6. Window function advance-compare adjacent records

1. Brief introduction of window function:

The analysis functions we have learned so far are particularly useful in calculating / counting data over a period of time, but what if the calculation / statistics need to be done along with each record in the recordset? To give some examples:

① lists the total amount of orders for each month and for the whole year

② lists the total monthly orders and the total orders up to the current month

③ lists the total orders for the last month, the current month, and the next month, as well as the total orders for the whole year.

④ lists daily turnover and total turnover over the past week

⑤ lists daily turnover and average daily turnover over the past week

Looking back at the analysis functions we introduced earlier, we will find that these requirements are somewhat different from the previous ones: the analysis functions we introduced earlier are used to calculate / count a clear phase / recordset, while there are some requirements, such as 2, that need to be counted as you traverse each record in the recordset.

In other words, statistics occur not only once, but many times. Statistics do not occur after the formation of the recordset, but during the formation of the recordset.

This is the application of the window function we are going to introduce this time. It applies to the following situations:

① specifies a batch of records: for example, from the current record to the end of the last record in a part

② by specifying a time interval: for example, the first 30 days before the trading day

③ specifies a range value: for example, all records that account for 5% of the current total transaction volume

Second, window function example-full statistics:

Let's take a look at the application of the window function by taking the demand: listing the total amount of orders per month and the total amount of orders for the whole year as an example.

[1] Test environment:

SQL > desc orders

Is the name empty? Types

MONTH NUMBER (2)

TOT_SALES NUMBER

SQL >

[2] Test data:

SQL > select * from orders

MONTH TOT_SALES

1 610697

2 428676

3 637031

4 541146

5 592935

6 501485

7 606914

8 460520

9 392898

10 510117

11 532889

12 492458

12 rows have been selected.

[3] Test statement:

Recall that in the previous article "Oracle Development Project: analysis function (OVER)", we used sum (sum (tot_sales)) over (partition by region_id) to count the total orders for each partition. Now we want to count not only each partition, but all the partitions. Partition by region_id no longer works here.

Oracle provides a clause for this situation: rows between... Preceding and... Following . Literally guess what it means: all records before XXX and after XXX, how to verify the actual situation through examples:

SQL > select month

2 sum (tot_sales) month_sales

3 sum (sum (tot_sales)) over (order by month

4 rows between unbounded preceding and unbounded following) total_sales

5 from orders

6 group by month

MONTH MONTH_SALES TOTAL_SALES

1 610697 6307766

2 428676 6307766

3 637031 6307766

4 541146 6307766

5 592935 6307766

6 501485 6307766

7 606914 6307766

8 460520 6307766

9 392898 6307766

10 510117 6307766

11 532889 6307766

12 492458 6307766

12 rows have been selected.

The green highlight code plays a key role here, telling oracle to count monthly sales from the first record to the last record. This statistic has been performed 12 times during the formation of the recordset, which is quite time-consuming! But at least we solved the problem.

Unbounded preceding and unbouned following means for the first and last records of all current records, that is, all records in the table. What if we specify directly from the beginning to the end of the first record? Look at the following results:

SQL > select month

2 sum (tot_sales) month_sales

3 sum (sum (tot_sales)) over (order by month

4 rows between 1 preceding and unbounded following) all_sales

5 from orders

6 group by month

MONTH MONTH_SALES ALL_SALES

1 610697 6307766

2 428676 6307766

3 637031 5697069

4 541146 5268393

5 592935 4631362

6 501485 4090216

7 606914 3497281

8 460520 2995796

9 392898 2388882

10 510117 1928362

11 532889 1535464

12 492458 1025347

12 rows have been selected.

It is obvious that this sentence is wrong. In this case, actual 1 does not mean to start with the first record, but refers to the previous record of the current record. The modifier in front of preceding tells the window the number of records referenced when the function is executed, just as unbounded tells oracle that no matter what the current record is, as long as there are many records in front of it, they are included in the scope of statistics.

Third, window function advance-rolling statistics (cumulative / mean):

Consider the second requirement mentioned earlier: list the total monthly orders and the total orders up to the current month. In other words, the record for February should show the sum of the total orders for that month and the total orders for January and February. The total amount of orders for that month and the sum of the total orders for February and March should be displayed in March, and so on.

It is obvious that this demand needs to count the sum of the total orders in the N months when counting the total orders in the N months. Think about the above sentence, if only we could replace and unbounded following with logic that represents the current month! Fortunately, Oracle takes our need into account, so we just need to change the statement slightly to: curreent row.

SQL > select month

2 sum (tot_sales) month_sales

3 sum (sum (tot_sales)) over (order by month

4 rows between unbounded preceding and current row) current_total_sales

5 from orders

6 group by month

MONTH MONTH_SALES CURRENT_TOTAL_SALES

1 610697 610697

2 428676 1039373

3 637031 1676404

4 541146 2217550

5 592935 2810485

6 501485 3311970

7 606914 3918884

8 460520 4379404

9 392898 4772302

10 510117 5282419

11 532889 5815308

12 492458 6307766

12 rows have been selected.

Now we can get the total rolling sales! The following statistics look more perfect, showing all the data we need:

SQL > select month

2 sum (tot_sales) month_sales

3 sum (sum (tot_sales)) over (order by month

4 rows between unbounded preceding and current row) current_total_sales

5 sum (sum (tot_sales)) over (order by month

6 rows between unbounded preceding and unbounded following) total_sales

7 from orders

8 group by month

MONTH MONTH_SALES CURRENT_TOTAL_SALES TOTAL_SALES

1 610697 610697 6307766

2 428676 1039373 6307766

3 637031 1676404 6307766

4 541146 2217550 6307766

5 592935 2810485 6307766

6 501485 3311970 6307766

7 606914 3918884 6307766

8 460520 4379404 6307766

9 392898 4772302 6307766

10 510117 5282419 6307766

11 532889 5815308 6307766

12 492458 6307766 6307766

12 rows have been selected.

In some sales reports, we often see the need for averaging, sometimes averaging data for the whole year, sometimes averaging all data so far. It's simple, just put:

Sum (sum (tot_sales)) can be replaced with avg (sum (tot_sales)).

Fourth, the window function is advanced-according to the time range:

As we said earlier, the window function applies not only to the specified recordset for statistics, but also to the specified range for statistics. For example, the following SQL statement counts the sales for the day and the estimated sales within five days:

Select trunc (order_dt) day

Sum (sale_price) daily_sales

Avg (sum (sale_price)) over (order by trunc (order_dt))

Range between interval'2' day preceding

And interval'2' day following) five_day_avg

From cust_order

Where sale_price is not null

And order_dt between to_date ('01Mujulmi 2001')

And to_date ('31Mujulmi 2001')

To count a specified range, Oracle uses the keywords range and interval to specify a range. The above example tells Oracle to find records within the first two days and the next two days of the current date and count their sales averages.

Fifth, the window function advanced-first_value/last_value:

Oracle provides two additional functions, first_value and last_value, to find the first and last records in the window recordset. Suppose our report needs to show the sales of the current month, the previous month, and the next month, as well as the average sales every 3 months, these two functions come in handy.

Select month

First_value (sum (tot_sales)) over (order by month

Rows between 1 preceding and 1 following) prev_month

Sum (tot_sales) monthly_sales

Last_value (sum (tot_sales)) over (order by month

Rows between 1 preceding and 1 following) next_month

Avg (sum (tot_sales)) over (order by month

Rows between 1 preceding and 1 following) rolling_avg

From orders

Where year = 2001

And region_id = 6

Group by month

Order by month

First of all, let's take a look: rows between 1 preceding and 1 following tells Oracle to find and count within the range of the first and the latter of the current records, while first_value and last_value find the first and third records respectively in these three records, so that we can easily get the sales records and averages of the adjacent three months!

6. Window function advance-compare adjacent records:

Through the study of the fifth part, we know how to use the window function to display adjacent records. Now what should we do if we want to display the sales of the current month and the sales of the previous month each time?

From the introduction of the fifth part, we can know that we can do it with first_value (sum (tot_sales) over (order by month rows between 1 preceding and 0 following)). In fact, Oracle has an easier way for us to compare two records, which is the lag function.

The leg function, similar to the preceding and following clauses, can be applied by the relative position of the current record, especially useful when comparing two adjacent records in the same adjacent recordset.

Select month

Sum (tot_sales) monthly_sales

Lag (sum (tot_sales), 1) over (order by month) prev_month_sales

From orders

Where year = 2001

And region_id = 6

Group by month

Order by month

The 1 in lag (sum (tot_sales), 1) is based on January.

Report function: a special project of Oracle development

1. Brief introduction of report function

2.RATIO_TO_REPORT function

1. Brief introduction of report function:

Reviewing the previous section on full statistics in "Oracle Development Topics: window functions", we used the Oracle:

Sum (sum (tot_sales)) over (order by month rows between unbounded preceding and unbounded following)

To calculate the total amount of orders for the whole year, this function is executed every time a record is retrieved during the formation of the recordset, and it executes a total of 12 times. This is very time-consuming. In fact, we have an easier way:

SQL > select month

2 sum (tot_sales) month_sales

3 sum (sum (tot_sales)) over (order by month

4 rows between unbounded preceding and unbounded following) win_sales

5 sum (sum (tot_sales)) over () rpt_sales

6 from orders

7 group by month

MONTH MONTH_SALES WINDOW_SALES REPORT_SALES

1 610697 6307766 6307766

2 428676 6307766 6307766

3 637031 6307766 6307766

4 541146 6307766 6307766

5 592935 6307766 6307766

6 501485 6307766 6307766

7 606914 6307766 6307766

8 460520 6307766 6307766

9 392898 6307766 6307766

10 510117 6307766 6307766

11 532889 6307766 6307766

12 492458 6307766 6307766

12 rows have been selected.

The empty parentheses of the over function indicate that all records of the recordset should be included in the statistical range, and if partition by is used, partition first, and then count each partition in turn.

2. RATIO_TO_REPORT function:

The report function (window function) is especially suitable for situations where detailed data and statistics need to be displayed at the same time in the report. For example, there is often a need in a sales report to list the total monthly sales in the previous year, the sales at the end of the year, and the proportion of monthly sales to total annual sales:

Method ①:

Select all_sales.*

100 * round (cust_sales / region_sales, 2) | |'% 'Percent

From (select o.cust_nbr customer

O.region_id region

Sum (o.tot_sales) cust_sales

Sum (sum (o.tot_sales)) over (partition by o.region_id) region_sales

From orders_tmp o

Where o.year = 2001

Group by o.region_id, o.cust_nbr) all_sales

Where all_sales.cust_sales > all_sales.region_sales * 0.2

This is a stupid method and the easiest to understand.

Method ②:

Select region_id, salesperson_id

Sum (tot_sales) sp_sales

Round (sum (tot_sales) / sum (sum (tot_sales))

Over (partition by region_id), 2) percent_of_region

From orders

Where year = 2001

Group by region_id, salesperson_id

Order by region_id, salesperson_id

Method ③

Select region_id, salesperson_id

Sum (tot_sales) sp_sales

Round (ratio_to_report (sum (tot_sales)

Over (partition by region_id), 2) sp_ratio

From orders

Where year = 2001

Group by region_id, salesperson_id

Order by region_id, salesperson_id

The Ratio_to_report function provided by Oracle allows us to calculate the percentage of each record in its corresponding recordset or its subset.

Oracle Development Project: summary of Analysis functions

This is a summary of all previous articles on parsing functions:

I. in terms of statistics:

Sum () Over ([Partition by] [Order by])

Rows Between Preceding And Following)

Sum () Over ([Partition by] [Order by])

Rows Between Preceding And Current Row)

Sum () Over ([Partition by] [Order by])

Range Between Interval 'Day' Preceding

And Interval 'Day' Following)

For more information, please refer to "Oracle Development Project: analysis function (OVER)" and "Oracle Development Project: window function".

2. Arrangement:

Rank () Over ([Partition by] [Order by] [Nulls First/Last])

Dense_rank () Over ([Patition by] [Order by] [Nulls First/Last])

Row_number () Over ([Partitionby] [Order by] [Nulls First/Last])

Ntile () Over ([Partition by] [Order by])

For details, please refer to "Oracle Development Project: analysis function 2"

Third, the maximum / minimum value search:

Min () / Max () Keep (Dense_rank First/Last [Partition by] [Order by])

For details, please refer to "Oracle Development Project: analysis function 3"

Fourth, the search of the first record / the last record:

First_value / Last_value (Sum () Over ([Patition by] [Order by])

Rows Between Preceding And Following))

For more information, please refer to "Oracle Development Project: window functions"

5. Comparison between adjacent records:

Lag (Sum (), 1) Over ([Patition by] [Order by])

For details, please refer to "Oracle Development Project: report function"

Oracle Development Project: 26 Analytical functions

Reference: "Mastering Oracle SQL" (By Alan Beaulieu, Sanjay Mishra O'Reilly June 2004-00632-2)

PLSQL development notes and summary

* *

Basic structure of PLSQL

* *

Basic data type variable

1. Basic data type

Number digital type

Int integer type

Pls_integer integer type, an error occurs when an overflow occurs

Binary_integer integer type, representing signed integers

Char fixed-length characters with a maximum of 255characters

Varchar2 variable length character type, up to 2000 characters

Long variable length character type, longest 2GB

Date date type

Boolean Boolean type (one of TRUE, FALSE and NULL)

Some of the data types used in PL/SQL and those used in Oracle databases have exactly the same meaning, while others have different meanings.

two。 The definition method of basic data type variables

Variable name type identifier [not null]: = value

Declare

Age number (3): = 26;-length 3, initial value 26

Begin

Commit

End

Where the syntax format of the constant is defined:

Constant name constant type identifier [not null]: = value

Declare

Pi constant number (9): = 3.1415926 pi is a numeric constant with a length of 9 and an initial value of 3.1415926

Begin

Commit

End

Expression.

Variables and constants often need to form various expressions for operation. Here are the operation rules of common expressions in PL/SQL.

1. Numerical expression

Numerical expressions in PL/SQL programs are composed of numeric constants, variables, functions and arithmetic operators. The arithmetic operators that can be used include + (addition),-(subtraction), * (multiplication) and / (division).

) and * (multiplicative), etc.

Execute the following PL/SQL program in the command window, which defines an integer variable named result and calculates the value of 10'3'4-20'5'2. The theoretical result should be 27.

Set serveroutput on

Declare

Result integer

Begin

Result:=10+3*4-20 years 5 years 2

Dbms_output.put_line ('result of operation is:' | | to_char (result))

End

The output of the dbms_output.put_line function can only be a string, so the to_char function is used to convert the numeric result to a character.

two。 Character expression

Character expressions consist of character-type constants, variables, functions, and character operators, and the only character operator that can be used is the concatenation operator "| |".

3. Relational expression

Relational expressions consist of character expressions or numeric expressions and relational operators, and the following nine relational operators can be used.

< 小于 >

Greater than

= equals (not assignment operator: =)

Like is similar to

In is in. In

= greater than or equal to

! = not equal to or

Between is in. Between

Expressions on both sides of relational expression operators must have the same data type.

4. Logical expression

Logical expressions consist of logical constants, variables, functions, and logical operators. Common logical operators include the following three types.

NOT: logical not

OR: logical or

AND: logic vs.

The priority of operation is NOT, AND, and OR.

PLSQL function

PL/SQL programs provide many functions for extension. In addition to the functions available in the standard SQL language, the following three most common data type conversion functions are available.

To_char: converts other types of data to character types.

To_date: converts other types of data to date types.

To_number: converts other types of data to numeric values.

Continue to add..

System output printing

Use pl/sql to print a sentence on the database server:

Set serveroutput on-- sets the database output, which is turned off by default and needs to be reset each time the window is reopened.

BEGIN

DBMS_OUTPUT.PUT_LINE ('Hello PL/SQL')

END

Case insensitive in pl/sql programs (print declared variables)

Set serveroutput on

DECLARE

V_char varchar2 (20): ='a'

V_char1 varchar2 (20): ='b'

BEGIN

DBMS_OUTPUT.PUT_LINE (v_char)

DBMS_OUTPUT.PUT_LINE (v_char1)

END

Pl statement block is the smallest programming block in pl/sql, in which begin end can be nested.

Begin

Dbms_output.put_line ('Hello World')

Dbms_output.put_line ('2: 3'| | (2: 3))

Dbms_output.put_line ('what''s')

End

Variable declaration in PL/SQL

All variables must be declared in declare, which is not allowed in the program.

The default value of a variable that is not initialized is null, and the null is invisible on the screen. Naming conventions: variables in PL/SQL usually start with v _ (equivalent to the custom of defining variables in the as and begin areas of stored procedures).

Note that number can also store decimals, up to 38 digits, so it is recommended that all integers be saved in binary_integer in the future.

Long is a character type and the boolean type cannot be printed.

Standard variable types: numbers, characters, time, Boolean.

Declare

V_number1 number

V_number2 number (3Pol 2)

V_number3 binary_integer: = 1

V_name varchar2 (20): = 'kettas'

V_date date: = sysdate

V_long long: ='ni hao'

Veterb boolean: = true

Begin

If (v_number1 is null) then

Dbms_output.put_line ('hello')

End if

Dbms_output.put_line (v_number1)

Dbms_output.put_line (v_number2)

Dbms_output.put_line (v_number3)

Dbms_output.put_line (v_name)

Dbms_output.put_line (v_date)

Dbms_output.put_line (v_long)

-- dbms_output.put_line (vroomb);-- execute the ORACLE prompt "incorrect number or type of parameters when calling 'PUT_LINE'"

End

Note:

With regard to the meaning of the two numbers in parentheses in the declaration number (4pj3), the first number is called precision, and the latter is called scale.

Scale:

When the scale is positive, it indicates the number of digits rounded to the decimal point

When the scale is negative, it indicates the number of digits rounded to the decimal point

Precision:

From the non-zero at the front of the number to the exact position of the scale

V_Number number (4jue 3): = 123.12312

1. Rounding to the scale to get 123.123

2. Determine the position where the scale is accurate to 123123, with an accuracy of 6 digits (. Symbols do not count)

2. Judge the upper limit of 6-digit (> 4) accuracy according to the accuracy-the error report cannot be stored.

Number (3Maxime 3): = 44445

1. Round off according to scale-3 to get 44000

2. Move the decimal point forward by 3 digits 44. This position is the position to which the scale is accurate.

3. Judge 2 digits according to the accuracy (10 then)

Exit

End if

V_i:=v_i+1

Dbms_output.put_line ('hehe')

End loop

Dbms_output.put_line ('over')

End

Simplified writing method of loop

Declare

Veteri binary_integer: = 0

Begin

Loop

Exit when Veteri > 10

Veteri: = v_i+1

Dbms_output.put_line ('hehe')

End loop

Dbms_output.put_line ('over')

End

While cycle

Declare

V_i binary_integer:=0

Begin

While v_iv_id

Begin

V_id:=10

Open c_student

Close c_student

End

The third way of defining cursors, cursors with parameters, is the most frequently used.

Declare

Cursor c_student (v_id binary_integer) is select * from book where id > v_id

Begin

Open c_student (10)

Close c_student

End

When using cursors, don't forget to turn off the cursors.

Declare

V_student book%rowtype

Cursor c_student (v_id binary_integer) is select * from book where id > v_id

Begin

Open c_student (10)

Fetch c_student into v_student

Close c_student

Dbms_output.put_line (v_student.name)

End

How to traverse the cursor fetch

The property of the cursor found,%notfound,%isopen,%rowcount.

% found: if the previous fetch statement returns a row of data,% found returns true, and reports an ORA-1001 exception if used on a cursor that is not open.

% notfound, as opposed to% found behavior.

% isopen to determine whether the cursor is open.

% rowcount: the pointer displacement of the current cursor, the number of rows of data retrieved by the current position cursor, referenced if not opened, and returns ORA-1001.

Note:

The usage of no_data_found is different from that of% notfound. The summary is as follows.

1) SELECT. . . INTO statement triggers no_data_found

2) trigger% notfound when an explicit cursor (static and dynamic) where clause is not found

3) trigger sql%notfound when the where clause of the UPDATE or DELETE statement is not found

4) in the cursor Fetch loop, use% notfound or% found to determine the exit condition of the loop, not no_data_found.

Here are a few examples:

Create table BOOK

(

ID VARCHAR2 (10) not null

BOOKNAME VARCHAR2 (10) not null

PRICE VARCHAR2 (10) not null

CID VARCHAR2 (10) not null

);

-- insert

Create or replace procedure say_hello (

I_name in varchar2

O_result_msg out varchar2

)

V_price varchar2 (100)

E_myException exception

Begin

Insert into book (id,bookname,price) values (1pm 2pm 3)

O_result_msg: = 'success'

Exception

When others then

Rollback

O_result_msg: = substr (sqlerrm, 1200)

End

-- update or delete

Create or replace procedure say_hello (

I_name in varchar2

O_result_msg out varchar2

)

V_price varchar2 (100)

E_myException exception

Begin

Update book set price = '55' where bookname = i_name

Delete from book where bookname = i_name

If sql%notfound then

Raise e_myException

End if

/ *

If sql%rowcount = 0 then-- 2

Raise e_myException

End if

, /

O_result_msg: = 'success'

Exception

When e_myException then

Rollback

O_result_msg: = 'update or delete dail'

End

-- select

Create or replace procedure say_hello (

I_name in varchar2

O_result_msg out varchar2

)

V_price varchar2 (100)

E_myException exception

Begin

Select price into v_price from book where bookname = i_name

O_result_msg: = 'success'

Exception

When no_data_found then

Rollback

O_result_msg: = 'select into dail'

End

Traversing cursors in loop mode

Declare

V_bookname varchar2 (100)

Cursor c_book (i_id number) is select bookname from book where id = i_id

Begin

Open c_book (i_id)

Loop

Fetch c_book into v_bookname

Exit when c_student%notfound

Update book set price = '33' where bookname = v_bookname

End Loop

Close c_book

End

Declare

V_bookname varchar2 (100)

Cursor c_book (i_id number) is select bookname from book where id = i_id

Begin

Open c_book (i_id)

Fetch c_book into v_bookname

While c_book%Found

Loop

Update book set price = '33' where bookname = v_bookname

Fetch c_book into v_bookname

End Loop

Close c_book

End

The while loop traverses the cursor. Note that the first time the cursor is opened, the fetch,%found is null, and the loop cannot be entered.

Solution: while nvl (cymbidium) loop

Declare

V_bookname varchar2 (100)

Cursor c_book (i_id number) is select bookname from book where id = i_id

Begin

Open c_book (i_id)

While nvl (croupbook% magic true)-or this way: while c_book%found is null or c_book%found loop

Fetch c_book into v_bookname

Update book set price = '33' where bookname = v_bookname

End Loop

Close c_book

End

For loop traversal is the simplest and most frequently used, and there is no need to declare Close cursors and fetch operations (without opening and closing cursors to achieve the most efficient way to traverse cursors)

Declare

Cursor c_book (i_id number) is select bookname from book where id = i_id

Begin

For cur in c_book (i_id)-- pass the input parameter i_id directly to cursor

Loop

Update book set price = '53' where bookname = cur.bookname

End loop

End

For goto examples, it is generally not recommended to use goto, which will change the structure of the program.

Declare

I number:=0

Begin

If iTunes 0 then

Goto hello

End if

Begin

Dbms_output.put_line ('hello')

Goto over

End

Begin

Dbms_output.put_line ('world')

Goto over

End

Dbms_output.put_line ('over')

End

* *

Oracle stored procedure

* *

When talking about some rules in the writing of a stored procedure, take a look at the rules that execute it, and execute the stored procedure sp_get_product_prompt in the command window

Set serveroutput on

Var ret1 varchar2 (200)

Var ret2 varchar2 (200)

Exec sp_get_product_prompt (83 Ret1);-- or execute

Print ret1

Print ret2

Set serveroutput on

Declare

Ret1 varchar2 (200)

Ret2 varchar2 (200)

Begin

Sp_get_product_prompt (83 ret 1 ret 2)

Dbms_output.put_line (ret1)

Dbms_output.put_line (ret2)

End

Stored procedure input parameters, regardless of type, the default value is null, input parameters and output parameters cannot have length, in which the keyword as can be replaced by is, and the variables in the stored procedure are declared between as and begin. At the same time, other stored procedures can be called in the stored procedure. If you want to ensure that the transaction between stored procedures will not be affected, it can be defined as an autonomous transaction.

Create or replace procedure say_hello (

V_name in varchar2

V_flag number

O_ret out number

)

Begin

If v_name is null and v_flag is null then-- both v_name and v_flag are equal to null

O_ret: = 10

Else

O_ret: = 100

End if

End

Give the default value when the input parameter is null

Create or replace procedure say_hello (

I_name in varchar2

I_flag number

O_ret out number

)

V_name varchar2 (100)

Begin

If i_name is null then

V_name: ='0'

Else

V_name: = i_name

End if

Insert into phone (.., wname..,) values (.., vault _ name _)

End

Or directly call the nvl function in the insert statement to assign the default value

Insert into phone (.., wname..,) values (.., nvl);-if''is written as'', then the value of vname in insert is still''equivalent to a null value.

Stored procedure with one parameter

Input parameter in, input parameter cannot be assigned: =, but it can be assigned to the variable defined later in as

Input parameter in, which can be used as a variable to judge the condition.

If you don't write it by default, it is in.

The stored procedure is not overloaded, and the parameterized say_hello replaces the existing nonparametric say _ hello.

Create or replace procedure say_hello (v_name in varchar2)

Begin

The stored procedure input parameter v_name cannot be used as the assignment target.

Dbms_output.put_line ('hello' | | v_name)

End

The input parameters of stored procedure are used as variables to judge the condition.

Create or replace procedure say_hello (

I_opFlag in number

)

V_name varchar2 (100)

Begin

If i_opFlag = 1 then

V_name: ='0'

Else

V_name: = ''

End if

Dbms_output.put_line ('hello' | | v_name)

End

Use the variables defined in the stored procedure to handle the null values of the input parameters:

Create or replace procedure say_hello (

I_name in varchar2

)

V_name varchar2 (100)

Begin

If i_name is null then

V_name: ='0'

Else

V_name: = isimplnamebomachi-assign input to the definition variable

End if

Dbms_output.put_line ('hello' | | v_name)

End

Stored procedure with multiple parameters

Create or replace procedure say_hello (

V_first_name in varchar2

V_last_name in varchar2)

Begin

End

Out output parameters, which are used to assign values to one or more variables using stored procedures, similar to return values

Create or replace procedure say_hello (

V_name in varchar2

V_content out varchar2

)

Begin

Vastly contentlyVaritieshello` | v_name

End

Call:

Declare

V_con varchar2 (200)

V_in varchar2 (20): = 'wang'

Begin

Say_hello (VenerinMagnevicicon)

Dbms_output.put_line (v_con)

End

In out parameter, which is both assigned and taken

Create or replace procedure say_hello (v_name in out varchar2)

Begin

V_name:='hi'| | v_name

End

Call:

Declare

V_inout varchar2 (20): = 'wangsu'

Begin

Say_hello (v_inout)

Dbms_output.put_line (v_inout)

End

Assign default values to input parameters of stored procedures

Create or replace procedure say_hello (

V_name varchar2 default 'susu'

V_content varchar2 default 'hello'

)

Begin

Dbms_output.put_line (v_name | |''| | v_content)

End

Call: (it is better to call by specifying the name of the formal parameter)

Begin

Say_hello ()

End

Begin

Say_hello ('cheng')

End

Begin

Say_hello (vault name = > 'cheng')

End

* *

Function in PLSQL

* *

The difference between FUNCTION and PROCEDURE

1. The function has a return value, but the procedure does not.

2. The function call is in an expression, and the procedure is used as a statement of the pl/sql program

Procedures and functions are stored in the database in compiled form, and functions can have no parameters or have multiple parameters and a return value. Process

There are zero or more parameters and no return value. Functions and procedures can receive or return zero or more values, functions and procedures through a parameter list

The main difference is not in the return value, but in the way they are called. The procedure is called as a separate execution statement, and the function is expressed legally.

The mode of calling

Create or replace function func (v_name in varchar2)

Return varchar2

Begin

Return (v_name | | 'hello')

End

Call:

Declare

V_name varchar2 (20)

Begin

V_name:=func ('cheng')

Dbms_output.put_line (v_name)

End

Functions with out arguments

Create or replace function func (

V_name in varchar2

V_content out varchar2

)

Return varchar2

Begin

V_content:=v_name | | 'hello'

Return v_content

End

Call:

Declare

V_name varchar2 (20)

V_name1 varchar2 (20)

Begin

V_name1:=func ('susu',v_name);-- returns the vname value

Dbms_output.put_line (v_name1);-- print func results

Dbms_output.put_line (v_name);-- print v_name results

End

Functions with in out arguments

Create or replace function func (

V_name in out varchar2)

Return varchar2

Begin

V_name:=v_name | | 'hello'

Return 'cheng'

End

Call:

Declare

V_inout varchar2 (20): = 'world'

V_ret varchar2 (20)

Begin

V_ret:=func (v_inout);-- returns the value of the call vinout (as an output parameter)

Dbms_output.put_line (v_ret);-- print func results

Dbms_output.put_line (v_inout);-- returns v_name result

End

Part of the content is referenced to CSDN: http://download.csdn.net/source/405714

The content of the article continues to be updated.. You are welcome to give us your advice.

Brief introduction of Analysis function

Note: n represents numeric type, C represents character type, D represents date type, [] indicates that inner parameters can be ignored, and fmt represents format.

The parsing function calculates group by-based columns, and the rows queried in groups are called "window", and the comparisons are redefined for each row during execution according to over (). The range of rows that perform the calculation is determined by reference to "current row". This point must be understood clearly. It is the principle of analyzing functions to generate data. If this is blurred, then you may not be so handy when applying analysis functions.

Parsing functions are very similar to the aggregate functions discussed in the previous section, except that they return multiple rows for each grouping sequence. In the example in this section, the two functions will be applied at the same time to better reflect the difference between them. Through the exercises in this chapter, I believe you will notice that some of the aggregate functions and analysis functions are the same command, and this is indeed the case. If you distinguish from the grammatical format, the aggregate function without over () is the aggregate function, and the over () is the parsing function:)

It is important to note that in addition to the operation of the order by clause, the parsing function will be executed last in the SQL statement. Therefore, parsing functions can only be applied to select columns or order by clauses (remember, don't throw them into where, group by, having, and so on). Therefore, the results of functions with the same name as aggregate functions and analysis functions may be different because of the different operational logic here.

At the same time, some parsing functions support distinct when selecting columns. If you specify this parameter, you can only specify the partition clause in the over condition, not the order by clause.

The grammatical structure of the analysis function is more complex, but most functions have the same grammatical structure, so it is introduced uniformly before, and the function grammatical structure is not explained too much in the subsequent introduction of a single function.

Basically all analysis functions are in this format:

Function name ([parameter]) OVER (analytic_clause)

Analytic_clause contains: [partition clause] [order clause [window clause]]

L Partition clause: Partition by exp1 [, exp2]...

Partition has nothing to say, has few powerful parameters, and is mainly used for grouping, which can be understood as group by in select. However, it does not conflict with the group by clause followed by the select statement.

L Order clause: Order by exp1 [ASC | desc] [, exp2 [asc | desc]]. [nulls first | last]. Some functions support the window clause.

The parameters of Order by are basically the same as those of order by in select. Let's just take it that way. Nulls first | last is used to define the position of nulls in the packet sequence. We know that the definition of null in oracle is unknown, so nulls always comes first when ordery is default. The nulls first | last parameter can come in handy if you want to control a column with a value of null.

L Window clause: En, post a picture

It looks complex, but it is actually simple, and the probability of application is quite low, so I won't introduce it in detail.

L AVG ([DISTINCT | ALL] expr) OVER (analytic_clause) calculates the average.

For example:

-- aggregate function

SELECT col, AVG (value) FROM tmp1 GROUP BY col ORDER BY col

-- Analysis function

SELECT col, AVG (value) OVER (PARTITION BY col ORDER BY col)

FROM tmp1

ORDER BY col

L SUM ([DISTINCT | ALL] expr) OVER (analytic_clause)

For example:

-- aggregate function

SELECT col, sum (value) FROM tmp1 GROUP BY col ORDER BY col

-- Analysis function

SELECT col, sum (value) OVER (PARTITION BY col ORDER BY col)

FROM tmp1

ORDER BY col

L COUNT ({* | [DISTINCT | ALL] expr}) OVER (analytic_clause) queries the number of rows in each group sequence.

For example:

-- the number of grouped queries col

SELECT col,count (0) over (partition by col order by col) ct FROM tmp1

L FIRST () fetches the first row from the collection returned by DENSE_RANK.

For example:

-- aggregate function

SELECT col

MIN (value) KEEP (DENSE_RANK FIRST ORDER BY col) "Min Value"

MAX (value) KEEP (DENSE_RANK LAST ORDER BY col) "Max Value"

FROM tmp1

GROUP BY col

-- Analysis function

SELECT col

MIN (value) KEEP (DENSE_RANK FIRST ORDER BY col) OVER (PARTITION BY col)

MAX (value) KEEP (DENSE_RANK LAST ORDER BY col) OVER (PARTITION BY col)

FROM tmp1

ORDER BY col

You can see that the results are basically similar, but the result of ex1 is the column after group by, while ex2 has a return for every row.

L LAST () is the same as above and will not be detailed.

For example, see the above example.

L FIRST_VALUE (col) OVER (analytic_clause) returns the first record queried by the over () condition

For example:

Insert into tmp1 values ('test6','287')

SELECT col

FIRST_VALUE (value) over (partition by col order by value) "First"

LAST_VALUE (value) over (partition by col order by value) "Last"

FROM tmp1

L LAST_VALUE (col) OVER (analytic_clause) returns the last record queried by the over () condition

For example, see the above example.

L LAG (col [, n] [, n]) over ([partition_clause] order_by_clause) lag is an interesting function that returns the value of the N1 row before the specified column col (N2 if the previous N1 row has been out of the reference range, N2 is returned by default). If N1 is not specified, the default value is 1.

For example:

SELECT col

Value

LAG (value) over (order by value) "Lag"

LEAD (value) over (order by value) "Lead"

FROM tmp1

L LEAD (col [, n] [, n]) over ([partition_clause] order_by_clause) is the opposite of the above function, which returns the value of row N1 after the specified column col.

For example, see the above example

L MAX (col) OVER (analytic_clause) gets the maximum value in the packet sequence.

For example:

-- aggregate function

SELECT col

Max (value) "Max"

Min (value) "Min"

FROM tmp1

GROUP BY col

-- Analysis function

SELECT col

Value

Max (value) over (partition by col order by value) "Max"

Min (value) over (partition by col order by value) "Min"

FROM tmp1

L MIN (col) OVER (analytic_clause) gets the minimum value in the packet sequence.

For example, see the above example.

L RANK () OVER ([partition_clause] order_by_clause) has been introduced in the previous aggregate functions of RANK and DENSE_RANK. No nonsense here. Let's just look at the example.

For example:

Insert into tmp1 values ('test2',120)

SELECT col

Value

RANK () OVER (order by value) "RANK"

DENSE_RANK () OVER (order by value) "DENSE_RANK"

ROW_NUMBER () OVER (order by value) "ROW_NUMBER"

FROM tmp1

L DENSE_RANK () OVER ([partition_clause] order_by_clause)

For example, see the above example.

L ROW_NUMBER () OVER ([partition_clause] order_by_clause) this function needs to say two more sentences, through the above comparison, I believe you should be able to see some clues. As mentioned earlier, if dense_rank encounters a duplicate value in a column when sorting, the sequence value of the row of the duplicate value is the same, and the sequence value of the subsequent sequence value is still increasing, while rank is that the sequence value of the row of the duplicate value is the same, but the sequence value after that increases from + the number of duplicate rows, while row_number always increases the sequence value regardless of whether there are duplicate rows or not.

For example, see the above example.

L CUME_DIST () OVER ([partition_clause] order_by_clause) returns the relative position of the row in the grouping sequence, with a return value between 0 and 1. Note that if the column of the order by is desc, the largest row in the group returns the column value 1, and if the order by is asc, the smallest row in the group returns the column value 1.

For example: SELECT col, value, CUME_DIST () OVER (ORDER BY value DESC) FROM tmp1

L NTILE (n) OVER ([partition_clause] order_by_clause)

Ntile is an interesting statistical function. It will group records according to the number of groups you specify (n).

For example: SELECT t. Tile (5) over (order by value desc) FROM tmp1 t

L PERCENT_RANK () OVER ([partition_clause] order_by_clause) is similar to CUME_DIST in that this function returns the relative position of each line in the packet sequence. The return value is also between 0 and 1, but the starting value is always 0 and the ending value is always 1.

For example: SELECT col, value, PERCENT_RANK () OVER (ORDER BY value) FROM tmp1

L PERCENTILE_CONT (n) WITHIN GROUP (ORDER BY col [DESC | ASC]) OVER (partition_clause)

The function of this function is exactly the same as that described in the previous aggregate function, except that one is the aggregate function and the other is the analysis function.

For example:

-- aggregate function

SELECT col, max (value), min (value), sum (value)

PERCENTILE_CONT (0.5) WITHIN GROUP (ORDER BY value) a

PERCENTILE_CONT (0.8) WITHIN GROUP (ORDER BY value) b

FROM TMP1

Group by col

-- Analysis function

SELECT col

Value

Sum (value) over (partition by col) "Sum"

PERCENTILE_CONT (0.5) WITHIN GROUP (ORDER BY value) OVER (PARTITION BY col) "CONTa"

PERCENTILE_CONT (0.8) WITHIN GROUP (ORDER BY value) OVER (PARTITION BY col) "CONTb"

FROM TMP1

L PERCENTILE_DISC (n) WITHIN GROUP (ORDER BY col [DESC | ASC]) OVER (partition_clause)

The function of this function is exactly the same as that described in the previous aggregate function, except that one is the aggregate function and the other is the analysis function.

For example:

-- aggregate function

SELECT col, max (value), min (value), sum (value)

PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY value) a

PERCENTILE_DISC (0.8) WITHIN GROUP (ORDER BY value) b

FROM TMP1

Group by col

-- Analysis function

SELECT col

Value

Sum (value) over (partition by col) "Sum"

PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY value) OVER (PARTITION BY col) "CONTa"

PERCENTILE_DISC (0.8) WITHIN GROUP (ORDER BY value) OVER (PARTITION BY col) "CONTb"

FROM TMP1

L RATIO_TO_REPORT (col) over ([partition_clause]) this function calculates the percentage of the col column values of this row in the packet sequence sum (col). If the col column is empty, a null value is returned.

For example:

SELECT col, value

RATIO_TO_REPORT (value) OVER (PARTITION BY col) "RATIO_TO_REPORT"

FROM TMP1

L STDDEV ([distinct | all] col) OVER (analytic_clause) returns the standard deviation of the column.

For example:

-- aggregate function

SELECT col, STDDEV (value) FROM TMP1 GROUP BY col

-- Analysis function

SELECT col, value

STDDEV (value) OVER (PARTITION BY col ORDER BY value) "STDDEV"

FROM TMP1

L STDDEV_SAMP (col) OVER (analytic_clause) function is the same as above, except that if the packet sequence has only one line, the STDDEV_SAMP function returns a null value and STDDEV returns 0.

For example:

-- aggregate function

SELECT col, STDDEV (value), STDDEV_SAMP (value) FROM TMP1 GROUP BY col

-- Analysis function

SELECT col, value

STDDEV (value) OVER (PARTITION BY col ORDER BY value) "STDDEV"

STDDEV_SAMP (value) OVER (PARTITION BY col ORDER BY value) "STDDEV_SAMP"

FROM TMP1

L STDDEV_POP (col) OVER (analytic_clause) returns the overall standard deviation of the grouped sequence

For example:

-- aggregate function

SELECT col, STDDEV_POP (value) FROM TMP1 GROUP BY col

-- Analysis function

SELECT col, value

STDDEV_POP (value) OVER (PARTITION BY col ORDER BY value) "STDDEV_POP"

FROM TMP1

L VAR_POP (col) OVER (analytic_clause) returns the total variance of the packet sequence, and VAR_POP calculates as follows: (SUM (expr2)-SUM (expr) 2 / COUNT (expr)) / COUNT (expr)

For example:

-- aggregate function

SELECT col, VAR_POP (value) FROM TMP1 GROUP BY col

-- Analysis function

SELECT col, value

VAR_POP (value) OVER (PARTITION BY col ORDER BY value) "VAR_POP"

FROM TMP1

L VAR_SAMP (col) OVER (analytic_clause) is similar to the above, this function returns the sample variance of the grouping sequence, and its formula is: (SUM (expr2)-SUM (expr) 2 / COUNT (expr)) / (COUNT (expr)-1)

For example:

-- aggregate function

SELECT col, VAR_SAMP (value) FROM TMP1 GROUP BY col

-- Analysis function

SELECT col, value

VAR_SAMP (value) OVER (PARTITION BY col ORDER BY value) "VAR_SAMP"

FROM TMP1

L VARIANCE (col) OVER (analytic_clause) this function returns the variance of the grouping sequence, which is calculated by Oracle as follows:

Returns 0 if the number of lines in the expression is 1, and VAR_SAMP if the number of lines in the expression is greater than 1

For example:

-- aggregate function

SELECT col, VAR_SAMP (value), VARIANCE (value) FROM TMP1 GROUP BY col

-- Analysis function

SELECT col, value

VAR_SAMP (value) OVER (PARTITION BY col ORDER BY value) "VAR_SAMP"

VARIANCE (value) OVER (PARTITION BY col ORDER BY value) "VARIANCE"

FROM TMP1

The above content is how to understand the oracle analysis function, have you learned the knowledge or skills? If you want to learn more skills or enrich your knowledge reserve, you are welcome to follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.