In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >
Share
Shulou(Shulou.com)06/01 Report--
This article shows you how to understand the oracle analysis function, the content is concise and easy to understand, can definitely brighten your eyes, through the detailed introduction of this article, I hope you can get something.
Oracle development
1. Brief introduction of Oracle analysis function:
In the daily production environment, we are more exposed to OLTP systems (that is, Online Transaction Process), which are characterized by real-time requirements, or at least certain requirements for the response time; secondly, the business logic of these systems is generally complex and may need to go through many operations. For example, we often come into contact with the electronic mall.
In addition to these systems, there is a system called OLAP (Online Aanalyse Process), which is generally used for system decision-making. It is usually associated with the concepts of data warehouse, data analysis, data mining and so on. These systems are characterized by a large amount of data, low requirements for real-time response or do not pay attention to these requirements at all, mainly query and statistical operations.
Let's take a look at some typical examples:
① finds the top 10 employees in each sales region in the previous year
② finds customers who accounted for more than 20% of the total orders in the previous year by region.
③ finds the area of the department with the worst sales in the previous year
④ finds the best and worst selling products in the previous year
If we look at the above examples, we can see that these queries are somewhat different from the queries we encounter on a daily basis, including:
① requires different levels of aggregation of the same data.
② needs to compare multiple pieces of data with the same data multiple times in the table.
③ requires additional filtering operations on the sorted result set
Parse function syntax:
FUNCTION_NAME (,...)
OVER
()
Example:
Sum (sal) over (partition by deptno order by ename) new_alias
Sum is the name of the function
(sal) is the parameter of the analysis function. Each function has three parameters, and the parameter can be an expression, for example: sum (sal+comm)
Over is a keyword that identifies the parsing function, otherwise the query parser cannot distinguish between the sum () aggregate function and the sum () parsing function.
Partition by deptno is an optional partition clause. If there are no partition clauses, then all result sets can be treated as a single theater.
Order by ename is an optional order by clause, which is required by some functions and not by others. Functions that rely on sorted data, such as LAG and LEAD to access the first and last rows of the result set, must be used, while other functions, such as AVG, are not needed. This clause is mandatory when any sorted windowing function is used, specifying how the data in a set of parsing functions is sorted.
1) FUNCTION clause
ORACLE provides 26 analysis functions, which are divided into 5 categories by function.
Classification of analytical functions
Ranking function: used to find the top N queries
Windowing function: used to calculate different accumulations, such as SUM,COUNT,AVG,MIN,MAX, acting on a window of data
Example:
Sum (t.sal) over (order by t.deptnoret.ename) running_total
Sum (t.sal) over (partition by t.deptno order by t.ename) department_total
Reporting function: acts on all columns on a partition or group with the same name as the windowing function
Example:
Sum (t.sal) over () running_total2
Sum (t.sal) over (partition by t.deptno) department_total2
The key difference between tabulation and windowing functions is the lack of an ORDER BY clause on the OVER statement!
LAG,LEAD functions: these functions allow you to retrieve values forward or backward in the result set, and they are very useful in order to avoid self-joining of data.
VAR_POP,VAR_SAMP,STDEV_POPE and linear attenuation function: calculate the statistics of any unsorted partition
2) PARTITION clause
Partition by expression (that is, grouping), if the partition clause is omitted, the entire result set is treated as a single group
3) ORDER BY clause
The existence of ORDER BY in the analysis function will add a default windowing clause, which means that the collection of rows used in the calculation is the current row and all the previous rows in the current partition, and when there is no ORDER BY, the default window is all partitions after the Order by clause can be added nulls last, such as: order by comm desc nulls last means to ignore the row that comm column is empty when sorting.
4) WINDOWING clause
Used to define the collection of rows on which the parsing function will operate
The Windowing clause gives a way to define a changing or fixed data window, and the parsing function will manipulate the data
The default window is a fixed window, which only starts on the first line of a group and continues to the current line. To use the window, you must use the ORDER BY clause
Windows can be established according to two criteria: the range of data values (RANGES) or the row offset from the current row.
5) Rang window
Range 5 preceding: a sliding window will be generated that owns a collection of the first five rows of the current row in the group
ANGE windows work only for NUMBERS and DATES, because it is not possible to add or subtract N cells from VARCHAR2
Another limitation is that there can be only one column in ORDER BY, so the range is actually one-dimensional and cannot be in N-dimensional space.
Example:
Avg (t.sal) over (order by t.hiredate asc range 100preceding) Statistics the average wage in the first 100 days
6) Row window
With ROW partitions, there are no restrictions like RANGE partitions, data can be of any type, and ORDER BY can include many columns
7) Specifying window
UNBOUNDED PRECEDING: this window starts at each line of the current partition and ends at the current line being processed
CURRENT ROW: this window starts (and ends) on the current line
Numeric Expression PRECEDING: for this window, start with the line of the numeric expression (Numeric Expression) before the current line, and for RANGE, start with the value of the current row whose sequence value is less than that of the numeric expression.
Numeric Expression FOLLOWING: this window terminates (or starts) the line after the current line Numeric Expression line, and starts (or terminates) when the sequence value is greater than the range of the current line Numeric Expression line
Range between 100 preceding and 100 following: current line before 100, after before and after 100
Note: parsing functions allow you to sort and filter a data set, which SQL has never been able to achieve. Except for the final Order by clause, the parsing function is the last set of operations executed in the query, so you cannot use the parsing function directly in the predicate, that is, you cannot use the where or having clause on it!
2. A simple example of Oracle analysis function:
Let's look at the application of the analysis function through a practical example: look for customers whose total orders accounted for more than 20% of the total orders in the previous year by region.
[1] Test environment:
SQL > desc orders_tmp
Name Null? Type
-
CUST_NBR NOT NULL NUMBER (5)
REGION_ID NOT NULL NUMBER (5)
SALESPERSON_ID NOT NULL NUMBER (5)
YEAR NOT NULL NUMBER (4)
MONTH NOT NULL NUMBER (2)
TOT_ORDERS NOT NULL NUMBER (7)
TOT_SALES NOT NULL NUMBER (11pr 2)
[2] Test data:
SQL > select * from orders_tmp
CUST_NBR REGION_ID SALESPERSON_ID YEAR MONTH TOT_ORDERS TOT_SALES
11 7 11 2001 7 2 12204
4 5 4 2001 10 2 37802
7 6 7 2001 2 3 3750
10 6 8 2001 1 2 21691
10 6 7 2001 2 3 42624
15 7 12 2000 5 6 24
12 7 9 2000 6 2 50658
1 5 2 2000 3 2 44494
1 5 1 2000 9 2 74864
2 5 4 2000 3 2 35060
2 54 2000 4 4 6454
2 5 1 2000 10 4 35580
4 5 4 2000 12 2 39190
13 rows selected.
[3] Test statement:
SQL > select o.cust_nbr customer
2 o.region_id region
3 sum (o.tot_sales) cust_sales
4 sum (sum (o.tot_sales)) over (partition by o.region_id) region_sales
5 from orders_tmp o
6 where o.year = 2001
7 group by o.region_id, o.cust_nbr
CUSTOMER REGION CUST_SALES REGION_SALES
--
4 5 37802 37802
7 6 3750 68065
10 6 64315 68065
11 7 12204 12204
3. Analysis function OVER parsing:
Notice the green highlight above. Group by's intention is clear: group data by region ID and customers, so what's the use of Over? If we only need to count the total amount of orders for each customer in each region, then all we need is group by o.regionclassiidbook.custroomnbr. But we also want to show the total order amount for the customer's region on each line, which is different from the previous one: we need to accumulate by region on the basis of the previous grouping. It is clear that group by and sum cannot do this (because the level of aggregation operations is different, the former is for a customer, and the latter is for a group of customers).
This is what the over function does! Its purpose is to tell the SQL engine to partition the data by region and then accumulate the total order amount for each customer in each region (sum (sum (o.tot_sales).
Now that we know the total amount of orders for each customer and its corresponding region in 2001, here is the screening of major customers whose total personal orders account for more than 20% of the total regional orders.
SQL > select *
2 from (select o.cust_nbr customer
3 o.region_id region
4 sum (o.tot_sales) cust_sales
5 sum (sum (o.tot_sales)) over (partition by o.region_id) region_sales
6 from orders_tmp o
7 where o.year = 2001
8 group by o.region_id, o.cust_nbr) all_sales
9 where all_sales.cust_sales > all_sales.region_sales * 0.2
CUSTOMER REGION CUST_SALES REGION_SALES
--
4 5 37802 37802
10 6 64315 68065
11 7 12204 12204
SQL >
Now we already know who these big clients are! Oh, but that's not enough. what if we want to know the proportion of orders for each major customer? Take a look at the SQL statement below. All you need is a simple Round function.
SQL > select all_sales.*
2 100 * round (cust_sales / region_sales, 2) | |'% 'Percent
3 from (select o.cust_nbr customer
4 o.region_id region
5 sum (o.tot_sales) cust_sales
6 sum (sum (o.tot_sales)) over (partition by o.region_id) region_sales
7 from orders_tmp o
8 where o.year = 2001
9 group by o.region_id, o.cust_nbr) all_sales
10 where all_sales.cust_sales > all_sales.region_sales * 0.2
CUSTOMER REGION CUST_SALES REGION_SALES PERCENT
4 5 37802 37802
10 6 64315 68065 94%
11 7 12204 12204
SQL >
Summary:
The ① Over function indicates which fields to analyze, followed by Partition by to group the data. Note that Partition by can have multiple fields.
② Over function can be collocated with other aggregate functions and analysis functions, and play a different role. For example, SUM here, as well as Rank,Dense_rank and so on.
Oracle Development Project: analysis function 2 (Rank, Dense_rank, row_number)
Catalogue
=
1. Use rownum to record rankings
two。 Use analysis functions to rank records
3. Use analysis functions to rank records in groups
First, use rownum to record rankings:
In the previous article, "Oracle Development Topics: analysis functions," we learned about the basic application of analysis functions. Now let's consider the following questions:
① ranks all customers by total order
② ranks by region and total customer orders
③ finds the top 13 customers with total orders
④ finds the customers with the highest and lowest total orders.
⑤ identifies the top 25% of customers with total orders
According to the idea of the first article, we can only count the data of each group. if we need to rank, then we just need to simply add rownum, right? Let's put it into practice whether the fact is as simple as imagined.
[1] Test environment:
SQL > desc user_order
Name Null? Type
-
REGION_ID NUMBER (2)
CUSTOMER_ID NUMBER (2)
CUSTOMER_SALES NUMBER
[2] Test data:
SQL > select * from user_order order by customer_sales
REGION_ID CUSTOMER_ID CUSTOMER_SALES
51 151162
10 29 903383
6 7 971585
10 28 986964
9 21 1020541
9 22 1036146
8 16 1068467
6 8 1141638
5 3 1161286
5 5 1169926
8 19 1174421
7 12 1182275
7 11 1190421
6 10 1196748
6 9 1208959
10 30 1216858
5 2 1224992
9 24 1224992
9 23 1224992
8 18 1253840
7 15 1255591
7 13 1310434
10 27 1322747
8 20 1413722
6 6 1788836
10 26 1808949
5 4 1878275
7 14 1929774
8 17 1944281
9 25 2232703
30 rows selected.
Note that there are three records with the same total order. If we need to screen the top 12 customers now, what will happen if we use rownum?
SQL > select rownum, t.*
2 from (select *
3 from user_order
4 order by customer_sales desc) t
5 where rownum select region_id, customer_id, sum (customer_sales) total
2 rank () over (order by sum (customer_sales) desc) rank
3 dense_rank () over (order by sum (customer_sales) desc) dense_rank
4 row_number () over (order by sum (customer_sales) desc) row_number
5 from user_order
6 group by region_id, customer_id
REGION_ID CUSTOMER_ID TOTAL RANK DENSE_RANK ROW_NUMBER
--
8 18 1253840 11 11 11
5 2 1224992 12 12 12
9 23 1224992 12 12 13
9 24 1224992 12 12 14
10 30 1216858 15 13 15
30 rows selected.
Please note the green highlight above, which vividly demonstrates three different ranking strategies:
For the first same record by ①, the ranking of the three functions is the same: 12
When the second same record appears in ②, Rank and Dense_rank still give the same ranking 12, while row_number increases to 13, and so on to the third same record.
When the ③ rankings proceed to the next different record, you can see that the Rank function vacates the 1313 14 ranking between 12 and 15, because these two rankings are actually occupied by the second and third same records. While Dense_rank increases sequentially. The row_number function is also incremented sequentially.
Comparing the above three different strategies, we have to decide according to the needs of our customers when we choose:
① if the customer only needs a specified number of records, then it is easiest to adopt row_number, but there is a risk of missing records
② if customers need all the records that reach the ranking level, then rank or dense_rank is a good choice. As for which one to choose depends on the needs of the customer, choose dense_rank or get the largest record
Third, use the analysis function to rank the records in groups:
The above ranking is ranked by the total amount of orders, and now it goes a step further: what if it is ranked for the total amount of orders in each region? This means another grouping operation: grouping records by region and then ranking them. Fortunately, Oracle also provides such support, and all we need to do is add a grouping clause before order by in the over function: partition by region_id.
SQL > select region_id, customer_id
Sum (customer_sales) total
2 rank () over (partition by region_id
Order by sum (customer_sales) desc) rank
3 dense_rank () over (partition by region_id
Order by sum (customer_sales) desc) dense_rank
4 row_number () over (partition by region_id
Order by sum (customer_sales) desc) row_number
5 from user_order
6 group by region_id, customer_id
REGION_ID CUSTOMER_ID TOTAL RANK DENSE_RANK ROW_NUMBER
--
5 4 1878275 1 1 1
5 2 1224992 22 2
5 5 1169926 3 3 3
6 6 1788836 1 1 1
6 9 1208959 2 2 2
6 10 1196748 3 3 3
30 rows selected.
Now the rankings we see will be based on each region, not all regions! The role of the Partition by clause in the permutation function is to divide a result set into parts so that the permutation function can be applied to each subset.
Two of the five problems we mentioned above have been solved (1, 2), and the remaining 3 problems (Top/Bottom NMagic FirstLast, NTile) will be explained in the next article.
Oracle Development Project: analysis function 3 (Top/Bottom N, First/Last, NTile)
Catalogue
=
1. Permutations with null values
2.Top/Bottom N query
3.First/Last ranking query
4. Query by hierarchy
First, the arrangement with null values:
In the previous article "Oracle Development Project: analysis function 2 (Rank, Dense_rank, row_number)", we already knew how to arrange a batch of records in full and in groups. What if the arranged data contains null values?
SQL > select region_id, customer_id
2 sum (customer_sales) cust_sales
3 sum (sum (customer_sales)) over (partition by region_id) ran_total
4 rank () over (partition by region_id
5 order by sum (customer_sales) desc) rank
6 from user_order
7 group by region_id, customer_id
REGION_ID CUSTOMER_ID CUST_SALES RAN_TOTAL RANK
--
10 31 6238901 1
10 26 1808949 6238901 2
10 27 1322747 6238901 3
10 30 1216858 6238901 4
10 28 986964 6238901 5
10 29 903383 6238901 6
We can see that there is a record with the CUST_TOTAL field value of NULL, but it is actually ranked first! Obviously, it doesn't make sense. So let's readjust our ranking strategy and take a look at the following sentence:
SQL > select region_id, customer_id
2 sum (customer_sales) cust_total
3 sum (sum (customer_sales)) over (partition by region_id) reg_total
4 rank () over (partition by region_id
Order by sum (customer_sales) desc NULLS LAST) rank
5 from user_order
6 group by region_id, customer_id
REGION_ID CUSTOMER_ID CUST_TOTAL REG_TOTAL RANK
--
10 26 1808949 6238901 1
10 27 1322747 6238901 2
10 30 1216858 6238901 3
10 28 986964 6238901 4
10 29 903383 6238901 5
10 31 6238901 6
In the green highlight, NULLS LAST/FIRST tells Oracle to let the null value rank last and first.
Note that it is NULLS, not NULL.
2. Top/Bottom N query:
In daily work and production, we often encounter such inquiries: find out the top 5 order customers, find out the top 10 sales staff, and so on. Now this is a very simple problem for us. Let's demonstrate it with a practical example:
[1] identify the top 3 major customers with total orders:
SQL > select *
SQL > from (select region_id
SQL > customer_id
SQL > sum (customer_sales) cust_total
SQL > rank () over (order by sum (customer_sales) desc NULLS LAST) rank
SQL > from user_order
SQL > group by region_id, customer_id)
SQL > where rank
[2] identify the top 3 major customers in each region:
SQL > select *
2 from (select region_id
3 customer_id
4 sum (customer_sales) cust_total
5 sum (sum (customer_sales)) over (partition by region_id) reg_total
6 rank () over (partition by region_id
Order by sum (customer_sales) desc NULLS LAST) rank
7 from user_order
8 group by region_id, customer_id)
9 where rank select min (customer_id)
2 keep (dense_rank first order by sum (customer_sales) desc) first
3 min (customer_id)
4 keep (dense_rank last order by sum (customer_sales) desc) last
5 from user_order
6 group by customer_id
FIRST LAST
--
31 1
Here are a few things that seem confusing:
① Why do you use the min function here?
What is this thing called ② Keep?
What does ③ fist/last do?
What's the difference between ④ dense_rank and dense_rank ()? can I change it to rank?
First, answer the first question: the purpose of the min function is to ensure that a unique record is returned when there is more than one First/Last. What will happen if we get rid of it?
SQL > select keep (dense_rank first order by sum (customer_sales) desc) first
2 keep (dense_rank last order by sum (customer_sales) desc) last
3 from user_order
4 group by customer_id
Select keep (dense_rank first order by sum (customer_sales) desc) first
*
ERROR at line 1:
ORA-00907: missing right parenthesis
Let's take a look at the second question: what is keep for? From the above results, we already know that Oracle only "retains" 2 pieces of data for ranking results, which is the role of keep. Tell Oracle to keep only records that meet the keep criteria.
So what is a qualified record? This is the third question. Dense_rank is the strategy that tells Oracle to arrange, and first/last tells the final filter criteria.
Question 4: what if we replace dense_rank with rank?
SQL > select min (region_id)
2 keep (rank first order by sum (customer_sales) desc) first
3 min (region_id)
4 keep (rank last order by sum (customer_sales) desc) last
5 from user_order
6 group by region_id
Select min (region_id)
*
ERROR at line 1:
ORA-02000: missing DENSE_RANK
4. Query by hierarchy:
Now we have seen how to get Top/Bottom N, the first and last record, through Oracle's analysis function. Sometimes we get a demand like this: find out the customers who are in the top 1 to 5 in total order volume.
Familiar, isn't it? We will immediately think of the method mentioned in the second point, but the rank function only ranks for us and does not know the relative position of each ranking in the overall ranking. At this time, we introduce another analysis function, NTile. Let's take the above requirements as an example:
SQL > select region_id
2 customer_id
3 ntile (5) over (order by sum (customer_sales) desc) til
4 from user_order
5 group by region_id, customer_id
REGION_ID CUSTOMER_ID TILE
10 31 1
9 25 1
10 26 1
6 6 1
8 18 2
5 2 2
9 23 3
6 9 3
7 11 3
5 3 4
6 8 4
8 16 4
6 7 5
10 29 5
5 1 5
The Ntil function calculates the proportion of the ranking of each record in the recordset, and we see that all records are divided into five levels, so if we only need the first one to 5 records, we only need to intercept records with a TILE value of 1. If we need the top 25% of the records (that is, 1 Universe 4), then we only need to set ntile (4).
Oracle Development Project: window function
1. Brief introduction of window function
two。 Window function example-full Statistics
3. Window function advance-rolling statistics (cumulative / mean)
4. Window function advance-statistics based on time range
5. Window function advance-first_value/last_value
6. Window function advance-compare adjacent records
1. Brief introduction of window function:
The analysis functions we have learned so far are particularly useful in calculating / counting data over a period of time, but what if the calculation / statistics need to be done along with each record in the recordset? To give some examples:
① lists the total amount of orders for each month and for the whole year
② lists the total monthly orders and the total orders up to the current month
③ lists the total orders for the last month, the current month, and the next month, as well as the total orders for the whole year.
④ lists daily turnover and total turnover over the past week
⑤ lists daily turnover and average daily turnover over the past week
Looking back at the analysis functions we introduced earlier, we will find that these requirements are somewhat different from the previous ones: the analysis functions we introduced earlier are used to calculate / count a clear phase / recordset, while there are some requirements, such as 2, that need to be counted as you traverse each record in the recordset.
In other words, statistics occur not only once, but many times. Statistics do not occur after the formation of the recordset, but during the formation of the recordset.
This is the application of the window function we are going to introduce this time. It applies to the following situations:
① specifies a batch of records: for example, from the current record to the end of the last record in a part
② by specifying a time interval: for example, the first 30 days before the trading day
③ specifies a range value: for example, all records that account for 5% of the current total transaction volume
Second, window function example-full statistics:
Let's take a look at the application of the window function by taking the demand: listing the total amount of orders per month and the total amount of orders for the whole year as an example.
[1] Test environment:
SQL > desc orders
Is the name empty? Types
-
MONTH NUMBER (2)
TOT_SALES NUMBER
SQL >
[2] Test data:
SQL > select * from orders
MONTH TOT_SALES
--
1 610697
2 428676
3 637031
4 541146
5 592935
6 501485
7 606914
8 460520
9 392898
10 510117
11 532889
12 492458
12 rows have been selected.
[3] Test statement:
Recall that in the previous article "Oracle Development Project: analysis function (OVER)", we used sum (sum (tot_sales)) over (partition by region_id) to count the total orders for each partition. Now we want to count not only each partition, but all the partitions. Partition by region_id no longer works here.
Oracle provides a clause for this situation: rows between... Preceding and... Following . Literally guess what it means: all records before XXX and after XXX, how to verify the actual situation through examples:
SQL > select month
2 sum (tot_sales) month_sales
3 sum (sum (tot_sales)) over (order by month
4 rows between unbounded preceding and unbounded following) total_sales
5 from orders
6 group by month
MONTH MONTH_SALES TOTAL_SALES
1 610697 6307766
2 428676 6307766
3 637031 6307766
4 541146 6307766
5 592935 6307766
6 501485 6307766
7 606914 6307766
8 460520 6307766
9 392898 6307766
10 510117 6307766
11 532889 6307766
12 492458 6307766
12 rows have been selected.
The green highlight code plays a key role here, telling oracle to count monthly sales from the first record to the last record. This statistic has been performed 12 times during the formation of the recordset, which is quite time-consuming! But at least we solved the problem.
Unbounded preceding and unbouned following means for the first and last records of all current records, that is, all records in the table. What if we specify directly from the beginning to the end of the first record? Look at the following results:
SQL > select month
2 sum (tot_sales) month_sales
3 sum (sum (tot_sales)) over (order by month
4 rows between 1 preceding and unbounded following) all_sales
5 from orders
6 group by month
MONTH MONTH_SALES ALL_SALES
1 610697 6307766
2 428676 6307766
3 637031 5697069
4 541146 5268393
5 592935 4631362
6 501485 4090216
7 606914 3497281
8 460520 2995796
9 392898 2388882
10 510117 1928362
11 532889 1535464
12 492458 1025347
12 rows have been selected.
It is obvious that this sentence is wrong. In this case, actual 1 does not mean to start with the first record, but refers to the previous record of the current record. The modifier in front of preceding tells the window the number of records referenced when the function is executed, just as unbounded tells oracle that no matter what the current record is, as long as there are many records in front of it, they are included in the scope of statistics.
Third, window function advance-rolling statistics (cumulative / mean):
Consider the second requirement mentioned earlier: list the total monthly orders and the total orders up to the current month. In other words, the record for February should show the sum of the total orders for that month and the total orders for January and February. The total amount of orders for that month and the sum of the total orders for February and March should be displayed in March, and so on.
It is obvious that this demand needs to count the sum of the total orders in the N months when counting the total orders in the N months. Think about the above sentence, if only we could replace and unbounded following with logic that represents the current month! Fortunately, Oracle takes our need into account, so we just need to change the statement slightly to: curreent row.
SQL > select month
2 sum (tot_sales) month_sales
3 sum (sum (tot_sales)) over (order by month
4 rows between unbounded preceding and current row) current_total_sales
5 from orders
6 group by month
MONTH MONTH_SALES CURRENT_TOTAL_SALES
1 610697 610697
2 428676 1039373
3 637031 1676404
4 541146 2217550
5 592935 2810485
6 501485 3311970
7 606914 3918884
8 460520 4379404
9 392898 4772302
10 510117 5282419
11 532889 5815308
12 492458 6307766
12 rows have been selected.
Now we can get the total rolling sales! The following statistics look more perfect, showing all the data we need:
SQL > select month
2 sum (tot_sales) month_sales
3 sum (sum (tot_sales)) over (order by month
4 rows between unbounded preceding and current row) current_total_sales
5 sum (sum (tot_sales)) over (order by month
6 rows between unbounded preceding and unbounded following) total_sales
7 from orders
8 group by month
MONTH MONTH_SALES CURRENT_TOTAL_SALES TOTAL_SALES
--
1 610697 610697 6307766
2 428676 1039373 6307766
3 637031 1676404 6307766
4 541146 2217550 6307766
5 592935 2810485 6307766
6 501485 3311970 6307766
7 606914 3918884 6307766
8 460520 4379404 6307766
9 392898 4772302 6307766
10 510117 5282419 6307766
11 532889 5815308 6307766
12 492458 6307766 6307766
12 rows have been selected.
In some sales reports, we often see the need for averaging, sometimes averaging data for the whole year, sometimes averaging all data so far. It's simple, just put:
Sum (sum (tot_sales)) can be replaced with avg (sum (tot_sales)).
Fourth, the window function is advanced-according to the time range:
As we said earlier, the window function applies not only to the specified recordset for statistics, but also to the specified range for statistics. For example, the following SQL statement counts the sales for the day and the estimated sales within five days:
Select trunc (order_dt) day
Sum (sale_price) daily_sales
Avg (sum (sale_price)) over (order by trunc (order_dt))
Range between interval'2' day preceding
And interval'2' day following) five_day_avg
From cust_order
Where sale_price is not null
And order_dt between to_date ('01Mujulmi 2001')
And to_date ('31Mujulmi 2001')
To count a specified range, Oracle uses the keywords range and interval to specify a range. The above example tells Oracle to find records within the first two days and the next two days of the current date and count their sales averages.
Fifth, the window function advanced-first_value/last_value:
Oracle provides two additional functions, first_value and last_value, to find the first and last records in the window recordset. Suppose our report needs to show the sales of the current month, the previous month, and the next month, as well as the average sales every 3 months, these two functions come in handy.
Select month
First_value (sum (tot_sales)) over (order by month
Rows between 1 preceding and 1 following) prev_month
Sum (tot_sales) monthly_sales
Last_value (sum (tot_sales)) over (order by month
Rows between 1 preceding and 1 following) next_month
Avg (sum (tot_sales)) over (order by month
Rows between 1 preceding and 1 following) rolling_avg
From orders
Where year = 2001
And region_id = 6
Group by month
Order by month
First of all, let's take a look: rows between 1 preceding and 1 following tells Oracle to find and count within the range of the first and the latter of the current records, while first_value and last_value find the first and third records respectively in these three records, so that we can easily get the sales records and averages of the adjacent three months!
6. Window function advance-compare adjacent records:
Through the study of the fifth part, we know how to use the window function to display adjacent records. Now what should we do if we want to display the sales of the current month and the sales of the previous month each time?
From the introduction of the fifth part, we can know that we can do it with first_value (sum (tot_sales) over (order by month rows between 1 preceding and 0 following)). In fact, Oracle has an easier way for us to compare two records, which is the lag function.
The leg function, similar to the preceding and following clauses, can be applied by the relative position of the current record, especially useful when comparing two adjacent records in the same adjacent recordset.
Select month
Sum (tot_sales) monthly_sales
Lag (sum (tot_sales), 1) over (order by month) prev_month_sales
From orders
Where year = 2001
And region_id = 6
Group by month
Order by month
The 1 in lag (sum (tot_sales), 1) is based on January.
Report function: a special project of Oracle development
1. Brief introduction of report function
2.RATIO_TO_REPORT function
1. Brief introduction of report function:
Reviewing the previous section on full statistics in "Oracle Development Topics: window functions", we used the Oracle:
Sum (sum (tot_sales)) over (order by month rows between unbounded preceding and unbounded following)
To calculate the total amount of orders for the whole year, this function is executed every time a record is retrieved during the formation of the recordset, and it executes a total of 12 times. This is very time-consuming. In fact, we have an easier way:
SQL > select month
2 sum (tot_sales) month_sales
3 sum (sum (tot_sales)) over (order by month
4 rows between unbounded preceding and unbounded following) win_sales
5 sum (sum (tot_sales)) over () rpt_sales
6 from orders
7 group by month
MONTH MONTH_SALES WINDOW_SALES REPORT_SALES
--
1 610697 6307766 6307766
2 428676 6307766 6307766
3 637031 6307766 6307766
4 541146 6307766 6307766
5 592935 6307766 6307766
6 501485 6307766 6307766
7 606914 6307766 6307766
8 460520 6307766 6307766
9 392898 6307766 6307766
10 510117 6307766 6307766
11 532889 6307766 6307766
12 492458 6307766 6307766
12 rows have been selected.
The empty parentheses of the over function indicate that all records of the recordset should be included in the statistical range, and if partition by is used, partition first, and then count each partition in turn.
2. RATIO_TO_REPORT function:
The report function (window function) is especially suitable for situations where detailed data and statistics need to be displayed at the same time in the report. For example, there is often a need in a sales report to list the total monthly sales in the previous year, the sales at the end of the year, and the proportion of monthly sales to total annual sales:
Method ①:
Select all_sales.*
100 * round (cust_sales / region_sales, 2) | |'% 'Percent
From (select o.cust_nbr customer
O.region_id region
Sum (o.tot_sales) cust_sales
Sum (sum (o.tot_sales)) over (partition by o.region_id) region_sales
From orders_tmp o
Where o.year = 2001
Group by o.region_id, o.cust_nbr) all_sales
Where all_sales.cust_sales > all_sales.region_sales * 0.2
This is a stupid method and the easiest to understand.
Method ②:
Select region_id, salesperson_id
Sum (tot_sales) sp_sales
Round (sum (tot_sales) / sum (sum (tot_sales))
Over (partition by region_id), 2) percent_of_region
From orders
Where year = 2001
Group by region_id, salesperson_id
Order by region_id, salesperson_id
Method ③
Select region_id, salesperson_id
Sum (tot_sales) sp_sales
Round (ratio_to_report (sum (tot_sales)
Over (partition by region_id), 2) sp_ratio
From orders
Where year = 2001
Group by region_id, salesperson_id
Order by region_id, salesperson_id
The Ratio_to_report function provided by Oracle allows us to calculate the percentage of each record in its corresponding recordset or its subset.
Oracle Development Project: summary of Analysis functions
This is a summary of all previous articles on parsing functions:
I. in terms of statistics:
Sum () Over ([Partition by] [Order by])
Sum () Over ([Partition by] [Order by])
Rows Between Preceding And Following)
Sum () Over ([Partition by] [Order by])
Rows Between Preceding And Current Row)
Sum () Over ([Partition by] [Order by])
Range Between Interval 'Day' Preceding
And Interval 'Day' Following)
For more information, please refer to "Oracle Development Project: analysis function (OVER)" and "Oracle Development Project: window function".
2. Arrangement:
Rank () Over ([Partition by] [Order by] [Nulls First/Last])
Dense_rank () Over ([Patition by] [Order by] [Nulls First/Last])
Row_number () Over ([Partitionby] [Order by] [Nulls First/Last])
Ntile () Over ([Partition by] [Order by])
For details, please refer to "Oracle Development Project: analysis function 2"
Third, the maximum / minimum value search:
Min () / Max () Keep (Dense_rank First/Last [Partition by] [Order by])
For details, please refer to "Oracle Development Project: analysis function 3"
Fourth, the search of the first record / the last record:
First_value / Last_value (Sum () Over ([Patition by] [Order by])
Rows Between Preceding And Following))
For more information, please refer to "Oracle Development Project: window functions"
5. Comparison between adjacent records:
Lag (Sum (), 1) Over ([Patition by] [Order by])
For details, please refer to "Oracle Development Project: report function"
Oracle Development Project: 26 Analytical functions
Reference: "Mastering Oracle SQL" (By Alan Beaulieu, Sanjay Mishra O'Reilly June 2004-00632-2)
PLSQL development notes and summary
* *
Basic structure of PLSQL
* *
Basic data type variable
1. Basic data type
Number digital type
Int integer type
Pls_integer integer type, an error occurs when an overflow occurs
Binary_integer integer type, representing signed integers
Char fixed-length characters with a maximum of 255characters
Varchar2 variable length character type, up to 2000 characters
Long variable length character type, longest 2GB
Date date type
Boolean Boolean type (one of TRUE, FALSE and NULL)
Some of the data types used in PL/SQL and those used in Oracle databases have exactly the same meaning, while others have different meanings.
two。 The definition method of basic data type variables
Variable name type identifier [not null]: = value
Declare
Age number (3): = 26;-length 3, initial value 26
Begin
Commit
End
Where the syntax format of the constant is defined:
Constant name constant type identifier [not null]: = value
Declare
Pi constant number (9): = 3.1415926 pi is a numeric constant with a length of 9 and an initial value of 3.1415926
Begin
Commit
End
Expression.
Variables and constants often need to form various expressions for operation. Here are the operation rules of common expressions in PL/SQL.
1. Numerical expression
Numerical expressions in PL/SQL programs are composed of numeric constants, variables, functions and arithmetic operators. The arithmetic operators that can be used include + (addition),-(subtraction), * (multiplication) and / (division).
) and * (multiplicative), etc.
Execute the following PL/SQL program in the command window, which defines an integer variable named result and calculates the value of 10'3'4-20'5'2. The theoretical result should be 27.
--
Set serveroutput on
Declare
Result integer
Begin
Result:=10+3*4-20 years 5 years 2
Dbms_output.put_line ('result of operation is:' | | to_char (result))
End
--
The output of the dbms_output.put_line function can only be a string, so the to_char function is used to convert the numeric result to a character.
two。 Character expression
Character expressions consist of character-type constants, variables, functions, and character operators, and the only character operator that can be used is the concatenation operator "| |".
3. Relational expression
Relational expressions consist of character expressions or numeric expressions and relational operators, and the following nine relational operators can be used.
< 小于 >Greater than
= equals (not assignment operator: =)
Like is similar to
In is in. In
= greater than or equal to
! = not equal to or
Between is in. Between
Expressions on both sides of relational expression operators must have the same data type.
4. Logical expression
Logical expressions consist of logical constants, variables, functions, and logical operators. Common logical operators include the following three types.
NOT: logical not
OR: logical or
AND: logic vs.
The priority of operation is NOT, AND, and OR.
PLSQL function
PL/SQL programs provide many functions for extension. In addition to the functions available in the standard SQL language, the following three most common data type conversion functions are available.
To_char: converts other types of data to character types.
To_date: converts other types of data to date types.
To_number: converts other types of data to numeric values.
Continue to add..
System output printing
Use pl/sql to print a sentence on the database server:
Set serveroutput on-- sets the database output, which is turned off by default and needs to be reset each time the window is reopened.
BEGIN
DBMS_OUTPUT.PUT_LINE ('Hello PL/SQL')
END
Case insensitive in pl/sql programs (print declared variables)
--
Set serveroutput on
DECLARE
V_char varchar2 (20): ='a'
V_char1 varchar2 (20): ='b'
BEGIN
DBMS_OUTPUT.PUT_LINE (v_char)
DBMS_OUTPUT.PUT_LINE (v_char1)
END
Pl statement block is the smallest programming block in pl/sql, in which begin end can be nested.
Begin
Dbms_output.put_line ('Hello World')
Dbms_output.put_line ('2: 3'| | (2: 3))
Dbms_output.put_line ('what''s')
End
--
Variable declaration in PL/SQL
All variables must be declared in declare, which is not allowed in the program.
The default value of a variable that is not initialized is null, and the null is invisible on the screen. Naming conventions: variables in PL/SQL usually start with v _ (equivalent to the custom of defining variables in the as and begin areas of stored procedures).
Note that number can also store decimals, up to 38 digits, so it is recommended that all integers be saved in binary_integer in the future.
Long is a character type and the boolean type cannot be printed.
Standard variable types: numbers, characters, time, Boolean.
--
Declare
V_number1 number
V_number2 number (3Pol 2)
V_number3 binary_integer: = 1
V_name varchar2 (20): = 'kettas'
V_date date: = sysdate
V_long long: ='ni hao'
Veterb boolean: = true
Begin
If (v_number1 is null) then
Dbms_output.put_line ('hello')
End if
Dbms_output.put_line (v_number1)
Dbms_output.put_line (v_number2)
Dbms_output.put_line (v_number3)
Dbms_output.put_line (v_name)
Dbms_output.put_line (v_date)
Dbms_output.put_line (v_long)
-- dbms_output.put_line (vroomb);-- execute the ORACLE prompt "incorrect number or type of parameters when calling 'PUT_LINE'"
End
--
Note:
With regard to the meaning of the two numbers in parentheses in the declaration number (4pj3), the first number is called precision, and the latter is called scale.
Scale:
When the scale is positive, it indicates the number of digits rounded to the decimal point
When the scale is negative, it indicates the number of digits rounded to the decimal point
Precision:
From the non-zero at the front of the number to the exact position of the scale
V_Number number (4jue 3): = 123.12312
1. Rounding to the scale to get 123.123
2. Determine the position where the scale is accurate to 123123, with an accuracy of 6 digits (. Symbols do not count)
2. Judge the upper limit of 6-digit (> 4) accuracy according to the accuracy-the error report cannot be stored.
Number (3Maxime 3): = 44445
1. Round off according to scale-3 to get 44000
2. Move the decimal point forward by 3 digits 44. This position is the position to which the scale is accurate.
3. Judge 2 digits according to the accuracy (10 then)
Exit
End if
V_i:=v_i+1
Dbms_output.put_line ('hehe')
End loop
Dbms_output.put_line ('over')
End
Simplified writing method of loop
Declare
Veteri binary_integer: = 0
Begin
Loop
Exit when Veteri > 10
Veteri: = v_i+1
Dbms_output.put_line ('hehe')
End loop
Dbms_output.put_line ('over')
End
While cycle
Declare
V_i binary_integer:=0
Begin
While v_iv_id
Begin
V_id:=10
Open c_student
Close c_student
End
The third way of defining cursors, cursors with parameters, is the most frequently used.
Declare
Cursor c_student (v_id binary_integer) is select * from book where id > v_id
Begin
Open c_student (10)
Close c_student
End
When using cursors, don't forget to turn off the cursors.
Declare
V_student book%rowtype
Cursor c_student (v_id binary_integer) is select * from book where id > v_id
Begin
Open c_student (10)
Fetch c_student into v_student
Close c_student
Dbms_output.put_line (v_student.name)
End
How to traverse the cursor fetch
The property of the cursor found,%notfound,%isopen,%rowcount.
% found: if the previous fetch statement returns a row of data,% found returns true, and reports an ORA-1001 exception if used on a cursor that is not open.
% notfound, as opposed to% found behavior.
% isopen to determine whether the cursor is open.
% rowcount: the pointer displacement of the current cursor, the number of rows of data retrieved by the current position cursor, referenced if not opened, and returns ORA-1001.
Note:
The usage of no_data_found is different from that of% notfound. The summary is as follows.
1) SELECT. . . INTO statement triggers no_data_found
2) trigger% notfound when an explicit cursor (static and dynamic) where clause is not found
3) trigger sql%notfound when the where clause of the UPDATE or DELETE statement is not found
4) in the cursor Fetch loop, use% notfound or% found to determine the exit condition of the loop, not no_data_found.
Here are a few examples:
Create table BOOK
(
ID VARCHAR2 (10) not null
BOOKNAME VARCHAR2 (10) not null
PRICE VARCHAR2 (10) not null
CID VARCHAR2 (10) not null
);
-- insert
Create or replace procedure say_hello (
I_name in varchar2
O_result_msg out varchar2
)
As
V_price varchar2 (100)
E_myException exception
Begin
Insert into book (id,bookname,price) values (1pm 2pm 3)
O_result_msg: = 'success'
Exception
When others then
Rollback
O_result_msg: = substr (sqlerrm, 1200)
End
-- update or delete
Create or replace procedure say_hello (
I_name in varchar2
O_result_msg out varchar2
)
As
V_price varchar2 (100)
E_myException exception
Begin
Update book set price = '55' where bookname = i_name
Delete from book where bookname = i_name
If sql%notfound then
Raise e_myException
End if
/ *
If sql%rowcount = 0 then-- 2
Raise e_myException
End if
, /
O_result_msg: = 'success'
Exception
When e_myException then
Rollback
O_result_msg: = 'update or delete dail'
End
-- select
Create or replace procedure say_hello (
I_name in varchar2
O_result_msg out varchar2
)
As
V_price varchar2 (100)
E_myException exception
Begin
Select price into v_price from book where bookname = i_name
O_result_msg: = 'success'
Exception
When no_data_found then
Rollback
O_result_msg: = 'select into dail'
End
Traversing cursors in loop mode
Declare
V_bookname varchar2 (100)
Cursor c_book (i_id number) is select bookname from book where id = i_id
Begin
Open c_book (i_id)
Loop
Fetch c_book into v_bookname
Exit when c_student%notfound
Update book set price = '33' where bookname = v_bookname
End Loop
Close c_book
End
Or
Declare
V_bookname varchar2 (100)
Cursor c_book (i_id number) is select bookname from book where id = i_id
Begin
Open c_book (i_id)
Fetch c_book into v_bookname
While c_book%Found
Loop
Update book set price = '33' where bookname = v_bookname
Fetch c_book into v_bookname
End Loop
Close c_book
End
The while loop traverses the cursor. Note that the first time the cursor is opened, the fetch,%found is null, and the loop cannot be entered.
Solution: while nvl (cymbidium) loop
Declare
V_bookname varchar2 (100)
Cursor c_book (i_id number) is select bookname from book where id = i_id
Begin
Open c_book (i_id)
While nvl (croupbook% magic true)-or this way: while c_book%found is null or c_book%found loop
Fetch c_book into v_bookname
Update book set price = '33' where bookname = v_bookname
End Loop
Close c_book
End
For loop traversal is the simplest and most frequently used, and there is no need to declare Close cursors and fetch operations (without opening and closing cursors to achieve the most efficient way to traverse cursors)
Declare
Cursor c_book (i_id number) is select bookname from book where id = i_id
Begin
For cur in c_book (i_id)-- pass the input parameter i_id directly to cursor
Loop
Update book set price = '53' where bookname = cur.bookname
End loop
End
For goto examples, it is generally not recommended to use goto, which will change the structure of the program.
Declare
I number:=0
Begin
If iTunes 0 then
Goto hello
End if
Begin
Dbms_output.put_line ('hello')
Goto over
End
Begin
Dbms_output.put_line ('world')
Goto over
End
Dbms_output.put_line ('over')
End
* *
Oracle stored procedure
* *
When talking about some rules in the writing of a stored procedure, take a look at the rules that execute it, and execute the stored procedure sp_get_product_prompt in the command window
Set serveroutput on
Var ret1 varchar2 (200)
Var ret2 varchar2 (200)
Exec sp_get_product_prompt (83 Ret1);-- or execute
Print ret1
Print ret2
Or
Set serveroutput on
Declare
Ret1 varchar2 (200)
Ret2 varchar2 (200)
Begin
Sp_get_product_prompt (83 ret 1 ret 2)
Dbms_output.put_line (ret1)
Dbms_output.put_line (ret2)
End
Stored procedure input parameters, regardless of type, the default value is null, input parameters and output parameters cannot have length, in which the keyword as can be replaced by is, and the variables in the stored procedure are declared between as and begin. At the same time, other stored procedures can be called in the stored procedure. If you want to ensure that the transaction between stored procedures will not be affected, it can be defined as an autonomous transaction.
Create or replace procedure say_hello (
V_name in varchar2
V_flag number
O_ret out number
)
As
Begin
If v_name is null and v_flag is null then-- both v_name and v_flag are equal to null
O_ret: = 10
Else
O_ret: = 100
End if
End
Give the default value when the input parameter is null
Create or replace procedure say_hello (
I_name in varchar2
I_flag number
O_ret out number
)
As
V_name varchar2 (100)
Begin
If i_name is null then
V_name: ='0'
Else
V_name: = i_name
End if
Insert into phone (.., wname..,) values (.., vault _ name _)
End
Or directly call the nvl function in the insert statement to assign the default value
Insert into phone (.., wname..,) values (.., nvl);-if''is written as'', then the value of vname in insert is still''equivalent to a null value.
Stored procedure with one parameter
Input parameter in, input parameter cannot be assigned: =, but it can be assigned to the variable defined later in as
Input parameter in, which can be used as a variable to judge the condition.
If you don't write it by default, it is in.
The stored procedure is not overloaded, and the parameterized say_hello replaces the existing nonparametric say _ hello.
Create or replace procedure say_hello (v_name in varchar2)
As
Begin
The stored procedure input parameter v_name cannot be used as the assignment target.
Dbms_output.put_line ('hello' | | v_name)
End
The input parameters of stored procedure are used as variables to judge the condition.
Create or replace procedure say_hello (
I_opFlag in number
)
As
V_name varchar2 (100)
Begin
If i_opFlag = 1 then
V_name: ='0'
Else
V_name: = ''
End if
Dbms_output.put_line ('hello' | | v_name)
End
Use the variables defined in the stored procedure to handle the null values of the input parameters:
Create or replace procedure say_hello (
I_name in varchar2
)
As
V_name varchar2 (100)
Begin
If i_name is null then
V_name: ='0'
Else
V_name: = isimplnamebomachi-assign input to the definition variable
End if
Dbms_output.put_line ('hello' | | v_name)
End
Stored procedure with multiple parameters
Create or replace procedure say_hello (
V_first_name in varchar2
V_last_name in varchar2)
As
Begin
Dbms_output.put_line ('hello' | | v_first_name | |'. | | v_last_name)
End
Out output parameters, which are used to assign values to one or more variables using stored procedures, similar to return values
Create or replace procedure say_hello (
V_name in varchar2
V_content out varchar2
)
Begin
Vastly contentlyVaritieshello` | v_name
End
Call:
Declare
V_con varchar2 (200)
V_in varchar2 (20): = 'wang'
Begin
Say_hello (VenerinMagnevicicon)
Dbms_output.put_line (v_con)
End
In out parameter, which is both assigned and taken
Create or replace procedure say_hello (v_name in out varchar2)
As
Begin
V_name:='hi'| | v_name
End
Call:
Declare
V_inout varchar2 (20): = 'wangsu'
Begin
Say_hello (v_inout)
Dbms_output.put_line (v_inout)
End
Assign default values to input parameters of stored procedures
Create or replace procedure say_hello (
V_name varchar2 default 'susu'
V_content varchar2 default 'hello'
)
As
Begin
Dbms_output.put_line (v_name | |''| | v_content)
End
Call: (it is better to call by specifying the name of the formal parameter)
Begin
Say_hello ()
End
Or
Begin
Say_hello ('cheng')
End
Or
Begin
Say_hello (vault name = > 'cheng')
End
* *
Function in PLSQL
* *
The difference between FUNCTION and PROCEDURE
1. The function has a return value, but the procedure does not.
2. The function call is in an expression, and the procedure is used as a statement of the pl/sql program
Procedures and functions are stored in the database in compiled form, and functions can have no parameters or have multiple parameters and a return value. Process
There are zero or more parameters and no return value. Functions and procedures can receive or return zero or more values, functions and procedures through a parameter list
The main difference is not in the return value, but in the way they are called. The procedure is called as a separate execution statement, and the function is expressed legally.
The mode of calling
Create or replace function func (v_name in varchar2)
Return varchar2
Is
Begin
Return (v_name | | 'hello')
End
Call:
Declare
V_name varchar2 (20)
Begin
V_name:=func ('cheng')
Dbms_output.put_line (v_name)
End
Functions with out arguments
Create or replace function func (
V_name in varchar2
V_content out varchar2
)
Return varchar2
Is
Begin
V_content:=v_name | | 'hello'
Return v_content
End
Call:
Declare
V_name varchar2 (20)
V_name1 varchar2 (20)
Begin
V_name1:=func ('susu',v_name);-- returns the vname value
Dbms_output.put_line (v_name1);-- print func results
Dbms_output.put_line (v_name);-- print v_name results
End
Functions with in out arguments
Create or replace function func (
V_name in out varchar2)
Return varchar2
Is
Begin
V_name:=v_name | | 'hello'
Return 'cheng'
End
Call:
Declare
V_inout varchar2 (20): = 'world'
V_ret varchar2 (20)
Begin
V_ret:=func (v_inout);-- returns the value of the call vinout (as an output parameter)
Dbms_output.put_line (v_ret);-- print func results
Dbms_output.put_line (v_inout);-- returns v_name result
End
Part of the content is referenced to CSDN: http://download.csdn.net/source/405714
The content of the article continues to be updated.. You are welcome to give us your advice.
Brief introduction of Analysis function
Note: n represents numeric type, C represents character type, D represents date type, [] indicates that inner parameters can be ignored, and fmt represents format.
The parsing function calculates group by-based columns, and the rows queried in groups are called "window", and the comparisons are redefined for each row during execution according to over (). The range of rows that perform the calculation is determined by reference to "current row". This point must be understood clearly. It is the principle of analyzing functions to generate data. If this is blurred, then you may not be so handy when applying analysis functions.
Parsing functions are very similar to the aggregate functions discussed in the previous section, except that they return multiple rows for each grouping sequence. In the example in this section, the two functions will be applied at the same time to better reflect the difference between them. Through the exercises in this chapter, I believe you will notice that some of the aggregate functions and analysis functions are the same command, and this is indeed the case. If you distinguish from the grammatical format, the aggregate function without over () is the aggregate function, and the over () is the parsing function:)
It is important to note that in addition to the operation of the order by clause, the parsing function will be executed last in the SQL statement. Therefore, parsing functions can only be applied to select columns or order by clauses (remember, don't throw them into where, group by, having, and so on). Therefore, the results of functions with the same name as aggregate functions and analysis functions may be different because of the different operational logic here.
At the same time, some parsing functions support distinct when selecting columns. If you specify this parameter, you can only specify the partition clause in the over condition, not the order by clause.
The grammatical structure of the analysis function is more complex, but most functions have the same grammatical structure, so it is introduced uniformly before, and the function grammatical structure is not explained too much in the subsequent introduction of a single function.
Basically all analysis functions are in this format:
Function name ([parameter]) OVER (analytic_clause)
Analytic_clause contains: [partition clause] [order clause [window clause]]
L Partition clause: Partition by exp1 [, exp2]...
Partition has nothing to say, has few powerful parameters, and is mainly used for grouping, which can be understood as group by in select. However, it does not conflict with the group by clause followed by the select statement.
L Order clause: Order by exp1 [ASC | desc] [, exp2 [asc | desc]]. [nulls first | last]. Some functions support the window clause.
The parameters of Order by are basically the same as those of order by in select. Let's just take it that way. Nulls first | last is used to define the position of nulls in the packet sequence. We know that the definition of null in oracle is unknown, so nulls always comes first when ordery is default. The nulls first | last parameter can come in handy if you want to control a column with a value of null.
L Window clause: En, post a picture
It looks complex, but it is actually simple, and the probability of application is quite low, so I won't introduce it in detail.
L AVG ([DISTINCT | ALL] expr) OVER (analytic_clause) calculates the average.
For example:
-- aggregate function
SELECT col, AVG (value) FROM tmp1 GROUP BY col ORDER BY col
-- Analysis function
SELECT col, AVG (value) OVER (PARTITION BY col ORDER BY col)
FROM tmp1
ORDER BY col
L SUM ([DISTINCT | ALL] expr) OVER (analytic_clause)
For example:
-- aggregate function
SELECT col, sum (value) FROM tmp1 GROUP BY col ORDER BY col
-- Analysis function
SELECT col, sum (value) OVER (PARTITION BY col ORDER BY col)
FROM tmp1
ORDER BY col
L COUNT ({* | [DISTINCT | ALL] expr}) OVER (analytic_clause) queries the number of rows in each group sequence.
For example:
-- the number of grouped queries col
SELECT col,count (0) over (partition by col order by col) ct FROM tmp1
L FIRST () fetches the first row from the collection returned by DENSE_RANK.
For example:
-- aggregate function
SELECT col
MIN (value) KEEP (DENSE_RANK FIRST ORDER BY col) "Min Value"
MAX (value) KEEP (DENSE_RANK LAST ORDER BY col) "Max Value"
FROM tmp1
GROUP BY col
-- Analysis function
SELECT col
MIN (value) KEEP (DENSE_RANK FIRST ORDER BY col) OVER (PARTITION BY col)
MAX (value) KEEP (DENSE_RANK LAST ORDER BY col) OVER (PARTITION BY col)
FROM tmp1
ORDER BY col
You can see that the results are basically similar, but the result of ex1 is the column after group by, while ex2 has a return for every row.
L LAST () is the same as above and will not be detailed.
For example, see the above example.
L FIRST_VALUE (col) OVER (analytic_clause) returns the first record queried by the over () condition
For example:
Insert into tmp1 values ('test6','287')
SELECT col
FIRST_VALUE (value) over (partition by col order by value) "First"
LAST_VALUE (value) over (partition by col order by value) "Last"
FROM tmp1
L LAST_VALUE (col) OVER (analytic_clause) returns the last record queried by the over () condition
For example, see the above example.
L LAG (col [, n] [, n]) over ([partition_clause] order_by_clause) lag is an interesting function that returns the value of the N1 row before the specified column col (N2 if the previous N1 row has been out of the reference range, N2 is returned by default). If N1 is not specified, the default value is 1.
For example:
SELECT col
Value
LAG (value) over (order by value) "Lag"
LEAD (value) over (order by value) "Lead"
FROM tmp1
L LEAD (col [, n] [, n]) over ([partition_clause] order_by_clause) is the opposite of the above function, which returns the value of row N1 after the specified column col.
For example, see the above example
L MAX (col) OVER (analytic_clause) gets the maximum value in the packet sequence.
For example:
-- aggregate function
SELECT col
Max (value) "Max"
Min (value) "Min"
FROM tmp1
GROUP BY col
-- Analysis function
SELECT col
Value
Max (value) over (partition by col order by value) "Max"
Min (value) over (partition by col order by value) "Min"
FROM tmp1
L MIN (col) OVER (analytic_clause) gets the minimum value in the packet sequence.
For example, see the above example.
L RANK () OVER ([partition_clause] order_by_clause) has been introduced in the previous aggregate functions of RANK and DENSE_RANK. No nonsense here. Let's just look at the example.
For example:
Insert into tmp1 values ('test2',120)
SELECT col
Value
RANK () OVER (order by value) "RANK"
DENSE_RANK () OVER (order by value) "DENSE_RANK"
ROW_NUMBER () OVER (order by value) "ROW_NUMBER"
FROM tmp1
L DENSE_RANK () OVER ([partition_clause] order_by_clause)
For example, see the above example.
L ROW_NUMBER () OVER ([partition_clause] order_by_clause) this function needs to say two more sentences, through the above comparison, I believe you should be able to see some clues. As mentioned earlier, if dense_rank encounters a duplicate value in a column when sorting, the sequence value of the row of the duplicate value is the same, and the sequence value of the subsequent sequence value is still increasing, while rank is that the sequence value of the row of the duplicate value is the same, but the sequence value after that increases from + the number of duplicate rows, while row_number always increases the sequence value regardless of whether there are duplicate rows or not.
For example, see the above example.
L CUME_DIST () OVER ([partition_clause] order_by_clause) returns the relative position of the row in the grouping sequence, with a return value between 0 and 1. Note that if the column of the order by is desc, the largest row in the group returns the column value 1, and if the order by is asc, the smallest row in the group returns the column value 1.
For example: SELECT col, value, CUME_DIST () OVER (ORDER BY value DESC) FROM tmp1
L NTILE (n) OVER ([partition_clause] order_by_clause)
Ntile is an interesting statistical function. It will group records according to the number of groups you specify (n).
For example: SELECT t. Tile (5) over (order by value desc) FROM tmp1 t
L PERCENT_RANK () OVER ([partition_clause] order_by_clause) is similar to CUME_DIST in that this function returns the relative position of each line in the packet sequence. The return value is also between 0 and 1, but the starting value is always 0 and the ending value is always 1.
For example: SELECT col, value, PERCENT_RANK () OVER (ORDER BY value) FROM tmp1
L PERCENTILE_CONT (n) WITHIN GROUP (ORDER BY col [DESC | ASC]) OVER (partition_clause)
The function of this function is exactly the same as that described in the previous aggregate function, except that one is the aggregate function and the other is the analysis function.
For example:
-- aggregate function
SELECT col, max (value), min (value), sum (value)
PERCENTILE_CONT (0.5) WITHIN GROUP (ORDER BY value) a
PERCENTILE_CONT (0.8) WITHIN GROUP (ORDER BY value) b
FROM TMP1
Group by col
-- Analysis function
SELECT col
Value
Sum (value) over (partition by col) "Sum"
PERCENTILE_CONT (0.5) WITHIN GROUP (ORDER BY value) OVER (PARTITION BY col) "CONTa"
PERCENTILE_CONT (0.8) WITHIN GROUP (ORDER BY value) OVER (PARTITION BY col) "CONTb"
FROM TMP1
L PERCENTILE_DISC (n) WITHIN GROUP (ORDER BY col [DESC | ASC]) OVER (partition_clause)
The function of this function is exactly the same as that described in the previous aggregate function, except that one is the aggregate function and the other is the analysis function.
For example:
-- aggregate function
SELECT col, max (value), min (value), sum (value)
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY value) a
PERCENTILE_DISC (0.8) WITHIN GROUP (ORDER BY value) b
FROM TMP1
Group by col
-- Analysis function
SELECT col
Value
Sum (value) over (partition by col) "Sum"
PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY value) OVER (PARTITION BY col) "CONTa"
PERCENTILE_DISC (0.8) WITHIN GROUP (ORDER BY value) OVER (PARTITION BY col) "CONTb"
FROM TMP1
L RATIO_TO_REPORT (col) over ([partition_clause]) this function calculates the percentage of the col column values of this row in the packet sequence sum (col). If the col column is empty, a null value is returned.
For example:
SELECT col, value
RATIO_TO_REPORT (value) OVER (PARTITION BY col) "RATIO_TO_REPORT"
FROM TMP1
L STDDEV ([distinct | all] col) OVER (analytic_clause) returns the standard deviation of the column.
For example:
-- aggregate function
SELECT col, STDDEV (value) FROM TMP1 GROUP BY col
-- Analysis function
SELECT col, value
STDDEV (value) OVER (PARTITION BY col ORDER BY value) "STDDEV"
FROM TMP1
L STDDEV_SAMP (col) OVER (analytic_clause) function is the same as above, except that if the packet sequence has only one line, the STDDEV_SAMP function returns a null value and STDDEV returns 0.
For example:
-- aggregate function
SELECT col, STDDEV (value), STDDEV_SAMP (value) FROM TMP1 GROUP BY col
-- Analysis function
SELECT col, value
STDDEV (value) OVER (PARTITION BY col ORDER BY value) "STDDEV"
STDDEV_SAMP (value) OVER (PARTITION BY col ORDER BY value) "STDDEV_SAMP"
FROM TMP1
L STDDEV_POP (col) OVER (analytic_clause) returns the overall standard deviation of the grouped sequence
For example:
-- aggregate function
SELECT col, STDDEV_POP (value) FROM TMP1 GROUP BY col
-- Analysis function
SELECT col, value
STDDEV_POP (value) OVER (PARTITION BY col ORDER BY value) "STDDEV_POP"
FROM TMP1
L VAR_POP (col) OVER (analytic_clause) returns the total variance of the packet sequence, and VAR_POP calculates as follows: (SUM (expr2)-SUM (expr) 2 / COUNT (expr)) / COUNT (expr)
For example:
-- aggregate function
SELECT col, VAR_POP (value) FROM TMP1 GROUP BY col
-- Analysis function
SELECT col, value
VAR_POP (value) OVER (PARTITION BY col ORDER BY value) "VAR_POP"
FROM TMP1
L VAR_SAMP (col) OVER (analytic_clause) is similar to the above, this function returns the sample variance of the grouping sequence, and its formula is: (SUM (expr2)-SUM (expr) 2 / COUNT (expr)) / (COUNT (expr)-1)
For example:
-- aggregate function
SELECT col, VAR_SAMP (value) FROM TMP1 GROUP BY col
-- Analysis function
SELECT col, value
VAR_SAMP (value) OVER (PARTITION BY col ORDER BY value) "VAR_SAMP"
FROM TMP1
L VARIANCE (col) OVER (analytic_clause) this function returns the variance of the grouping sequence, which is calculated by Oracle as follows:
Returns 0 if the number of lines in the expression is 1, and VAR_SAMP if the number of lines in the expression is greater than 1
For example:
-- aggregate function
SELECT col, VAR_SAMP (value), VARIANCE (value) FROM TMP1 GROUP BY col
-- Analysis function
SELECT col, value
VAR_SAMP (value) OVER (PARTITION BY col ORDER BY value) "VAR_SAMP"
VARIANCE (value) OVER (PARTITION BY col ORDER BY value) "VARIANCE"
FROM TMP1
The above content is how to understand the oracle analysis function, have you learned the knowledge or skills? If you want to learn more skills or enrich your knowledge reserve, you are welcome to follow the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.