In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article shows you how to use regular expressions in oracle10g database, the content is concise and easy to understand, can definitely make your eyes bright, through the detailed introduction of this article, I hope you can get something.
If there is a string aabcd and a (b | c) d search is specified, the search will be followed by b or c, followed by an of d.
Regular expression:'a (b | c) d'
Matching string: 'aabcd'aabcd indicates result * search an and match successfully * search b but failed mismatch * search c failed, reset and continue search mismatch * search an and match successfully * search b and succeed Remember c as a selection match * search d but fail to match * search as the last memory option c failed, reset and continue to search for mismatch * search a failed, reset and continue search mismatch * search a failed, reset and continue search mismatch * search a failed, reset and continue search mismatch
A (b | c) d does not match the given string aabcd.
To implement regular expression support in SQL or PL/SQL, a new set of functions is needed. These functions are:
The function name indicates that REGEXP_LIKE is similar to the LIKE operator, but performs regular expression matching instead of simple pattern matching REGEXP_INSTR searches a given string for a regular expression pattern and returns the location of the match. REGEXP_REPLACE searches for a regular expression pattern and replaces it with a replacement string REGEXP_SUBSTR searches for a regular expression pattern in a given string and returns a matching substring
Metacharacter
Metacharacters are characters with special meaning, such as wildcard characters, repeating characters, mismatched characters, or a range of characters.
You can use multiple predefined metacharacter symbols in patterns that match the function.
Symbol description *
Match zero or more matches
| |
The selective operator used to specify selective matches
^ / $
Match the beginning and end of a line
[]
A square bracket expression used to match a list (to match any expression in the list)
[^ exp]
If the delimited character is inside the square brackets, the expression is rejected.
{m}
Exact matching m times
{m,n}
Match at least m times, but not more than n times
[::]
Specify a character class and match any character in that class
\
It can have four different meanings: (1) to represent itself; (2) to reference the next character; (3) to introduce an operator; and (4) to do nothing.
+
Match one or more matches
?
Match zero or one match
.
Match any character in the supported character set (except NULL)
()
Group expressions (as a subexpression)
\ n
Backward reference expression
[=]
Specify equivalence class
[..]
Specify a comparison element (such as a multi-character element)
Use basic search
The following example demonstrates the use of a regular expression function. Perform the following steps:
1.
In the terminal window, change to the / home/oracle/wkdir directory and start SQL*Plus.
Connect to Oracle using the user ID oe/oe and password oe/oe.
Cd / home/oracle/wkdirsqlplus oe/oe
two。
Check the syntax of the REGEXP_LIKE function:
REGEXP_LIKE (srcstr, pattern [, match_option])
In this function:
Srcstr: search valu
Pattern: regular expression
Match_option: the option to change the default match. You can contain one or more of the following values:
"c": use case-sensitive matching (default)
"I": use case-sensitive matching
"n": an operator that allows you to match any character
"m": treats the source string as multiple lines
To find all products whose names contain SSP/S, SSP/V, SSS/V, or SSS/S in the PRODUCT_NAME column of the PRODUCT_INFORMATION table, execute the following script:
@ relike.sql
The relike.sql script contains the following SQL:
SELECT product_nameFROM oe.product_informationWHERE regexp_like (product_name,'SS [PS] / [VS]')
3.
The REGEXP_INSTR function returns the position of the given pattern in the string. Check the syntax:
REGEXP_INSTR (srcstr, pattern [, position [, occurrence [, return_option [, match_option])
In this function:
Position: search start location
Occurrence: the match to search for
Return_option: indicates the beginning or end of the match
Match_option: the option to change the default match. You can contain one or more of the following values:
"c": use case-sensitive matching (default)
"I": use case-sensitive matching
"n": an operator that allows you to match any character
"m": treats the source string as multiple lines
To search the product name to determine the location of the first non-alphabetic character, whether it is uppercase or lowercase, execute the following script:
@ reinstr.sql
The reinstr.sql script contains the following SQL:
COLUMN non_alpha FORMAT 9999999999
SELECT product_name, REGEXP_INSTR (product_name,'[^ [: alpha:]]') non_alpha
FROM oe.product_information
Note that [^ [:]] represents a character class and matches any character in that class; [: alpha:] matches any alphabetic character. In this example, you will reject the expression by using ^.
4.
The REGEXP_SUBSTR function returns the given string based on the pattern of the match. Check the syntax:
REGEXP_SUBSTR (srcstr, pattern [, position [, occurrence [, match_option])
In this function:
Position: search start location
Occurrence: the match to search for
Match_option: the option to change the default match. You can contain one or more of the following values:
"c": use case-sensitive matching (default)
"I": use case-sensitive matching
"n": an operator that allows you to match any character
"m": treats the source string as multiple lines
To extract e-mail names from the CUSTOMERS table, extract only the e-mail names of customers located in Switzerland. To do this, return the contents of the CUST_EMAIL column (the NLS_TERRITORY before the customer's @ symbol equals Switzerland). Execute the following script:
@ resubstr.sql
The resubstr.sql script contains the following SQL:
SELECT REGEXP_SUBSTR (cust_email,'[^ @] +')
FROM oe.customers
WHERE nls_territory = 'SWITZERLAND'
Notice that in this example, the result returns the first substring without the @ symbol.
5.
The EGEXP_REPLACE function returns the "replaced" substring in the given string. Check the syntax:
REGEXP_REPLACE (srcstr, pattern [, replacestr [, position [, occurrence [, match_option])
In this function:
Position: search start location
Occurrence: the match to search for
Replacestr: string that replaces the pattern
Match_option: the option to change the default match. You can contain one or more of the following values:
"c": use case-sensitive matching (default)
"I": use case-sensitive matching
"n": an operator that allows you to match any character
"m": treats the source string as multiple lines
To return the information in the CATALOG_URL column of the PRODUCT_INFORMATION table, you can perform a full scan of the column. But this will result in hundreds of rows being returned because it lists specific HTML page locations in multiple directory fields. In this example, however, you only want to find the individual domain names themselves, not the low-level pages they contain. To find domain names that do not contain all unnecessary information, use the REGEXP_REPLACE function. Execute the following script:
@ rereplace.sql
The rereplace.sql script contains the following SQL:
SELECT UNIQUE REGEXP_REPLACE (catalog_url, 'http://([^/]+).*','\ 1')
FROM oe.product_information
The following is an introduction to how to handle strings:
Http:// the expression first looks for the string literal; there are no special metacharacters here. The expression then searches for a series of characters (as long as they are not slashes (/)). . * the expression ends when you use this part to traverse the rest of the string. \ 1 the matching expression is replaced with a backward reference 1, which is anything that matches in the first set of parentheses.
Use multilingual capabilities
Regular expression functions support multilingual functionality and can be used in locale-sensitive applications. To combine regular expressions with the NLS language feature of Oracle, perform the following steps:
1.
Execute the following script to find the product description in Portuguese:
@ multiport.sql
The multiport.sql script contains the following SQL:
SELECT regexp_substr (to_char (translated_name),'^ [a Murz] +') FROM oe.product_descriptionsWHERE language_id = 'PT'AND translated_name like' G%'
Notice that the data is not displayed.
^ is outside the square brackets, which means that you will search for any string or substring that begins with any character, from a to z.
two。
Execute the same query, but this time use a case-sensitive "I". Execute the following script:
@ multiport2.sql
The multiport2.sql script contains the following SQL:
SELECT regexp_substr (to_char (translated_name),'^ [a Murz] +', 1, 1,'i') FROM oe.product_descriptionsWHERE language_id = 'PT'AND translated_name like' G%'
3.
Because the returned string is truncated when a non-English character is encountered, the result is still incomplete. This is because the range [a _ MZ] is sensitive to NLS_LANGUAGE. Therefore, you need to set the NLS_LANGUAGE parameter accordingly to return the full result. Execute the following query:
@ multiport3.sql
The multiport3.sql script contains the following SQL:
ALTER SESSION SET NLS_LANGUAGE=PORTUGUESE;SELECT regexp_substr (to_char (translated_name),'^ [a Murz] +', 1, 1,'i') FROM oe.product_descriptionsWHERE language_id = 'PT'AND translated_name like' G%'
4.
The final step is to review the results in English and Portuguese to ensure that the translation has been completed. Execute the following script:
@ multiport4.sql
The multiport4.sql script contains the following SQL:
SELECT REGEXP_SUBSTR (i.product_name,'^ [amurz] +', 1, 1,'i') |'='
| | regexp_substr (to_char (d.translated_name),'^ [a murz] +', 1, 1,'i') |
FROM oe.product_descriptions d, oe.product_information i
WHERE d.language_id = 'PT'
AND d.translated_name like'G%'
AND i.product_id = d.product_id; ALTER SESSION SET NLS_LANGUAGE=AMERICAN
Return to the list of topics
Regular expressions and check constraints
To use regular expressions in check constraints, perform the following steps:
1.
Adding a check constraint to the CUST_EMAIL column of the CUSTOMERS table ensures that only strings containing the @ symbol are accepted. Execute the following script:
@ chkemail.sql
The chkemail.sql script contains the following SQL:
INSERT INTO customers VALUES, cust_address_typ ('1003 Canyon Road','87501','Santa Fe','NM','US'), phone_list_typ (' + 1 505 243 4144'), 'us','AMERICA','100',' ChrisP+creme.com', 149, null, null
Null)
Because no validation was performed, an e-mail address that did not contain the @ symbol was accepted. Perform a rollback before proceeding to the next step.
ROLLBACK
two。
Enforce the constraint by executing the following script:
@ chkemail2.sql
The chkemail2.sql script contains the following SQL:
ALTER TABLE customersADD CONSTRAINT cust_email_addrCHECK (REGEXP_LIKE (cust_email,'@')) NOVALIDATE
3.
Execute @ chkemail.sql to test the constraint again.
Because the e-mail address does not contain the required symbol, it violates the check constraint. The NOVALIDATE clause ensures that existing data is not checked.
4.
Delete the constraint by executing the following script:
@ chkemail3.sql
The chkemail3.sql script contains the following SQL:
ALTER TABLE customers DROP CONSTRAINT cust_email_addr
Backward reference
A useful feature of regular expressions is the ability to store subexpressions for later reuse; this is also known as a backward reference (which is summarized in Table 10). It allows complex substitution functions, such as swapping patterns in new locations or displaying repeated words or letters. The matching portion of the subexpression is saved in a temporary buffer. The buffer is numbered from left to right and uses the
The\ digit symbol is accessed, where digit is a number between 1 and 9 that matches the digit subexpression, which is displayed in a set of parentheses.
The following example shows the conversion of the name Ellen Hildi Smith to Smith Ellen Hildi by referencing each subexpression by number.
SELECT REGEXP_REPLACE (
'Ellen Hildi Smith'
'(. *)',\ 3,\ 1\ 2')
FROM dual
REGEXP_REPLACE ('EL
-
Smith, Ellen Hildi
The SQL statement shows three separate subexpressions enclosed in parentheses. Each individual subexpression contains a matching metacharacter (. Followed by the * metacharacter, indicating that any character (except the newline character) must match zero or more times Spaces separate subexpressions, and spaces must match. Parentheses create a subexpression that gets a value and can be referenced with\ digit. The first subexpression is assigned\ 1, the second\ 2, and so on. These backward references are used in the last argument (\ 3,\ 1\ 2) of this function, which effectively returns replacement substrings and arranges them in the desired format (including commas and spaces). Table
11 details the components of the regular expression.
Backward references are useful for replacing, formatting, and replacing values, and you can use them to find adjacent values. The following example shows the use of the REGEP_SUBSTR function to find any repeated alphanumeric values separated by spaces. The results displayed give a substring that identifies the recurring word is.
SELECT REGEXP_SUBSTR (
'The final test is is the implementation'
'([[: alnum:]] +) ([[: space:]] +)\ 1') AS substr
FROM dual
SUBSTR
-
Is is
Match parameter option
You may have noticed that regular expression operators and functions contain an optional matching parameter. This parameter controls whether case sensitivity, newline matching, and multiline input are preserved.
Practical Application of regular expression
You can use regular expressions not only in queues, but also anywhere you use SQL operators or functions, such as in the PL/SQL language. You can write triggers that take advantage of regular expressions to validate, generate, or extract values.
The following example demonstrates how you can apply the REGEXP_LIKE operator to data validation in a column check constraint. It verifies the correct social security number format when inserted or updated. Social security numbers in formats such as 12345-6789 and 123456789 are acceptable values for such column constraints. Valid data must start with three numbers, followed by a hyphen, plus two digits and a hyphen, and finally four digits. The other expression allows only nine consecutive numbers. The vertical bar symbol (|) separates the options.
ALTER TABLE students ADD CONSTRAINT stud_ssn_ck CHECK (REGEXP_LIKE (ssn,'^ ([[: digit:]] {3}-[[: digit:]] {2}-[[: digit:]] {4} | [[: digit:]] {9}) $'))
The beginning or end characters indicated by ^ and $are not acceptable. Make sure your regular expression is not split into multiple lines or contains any unnecessary spaces, unless you want it to be formatted and matched accordingly. Table
12 illustrates the various components of the regular expression example.
The next step is to visit the Oracle Database 10g page: / global/cn/products/database/oracle10g/index.html
Compare regular expressions with existing features
Regular expressions have several advantages over the common LIKE operators and INSTR, SUBSTR, and REPLACE functions. These traditional SQL functions are not convenient for pattern matching. Only the LIKE operator matches by using the% and _ characters, but LIKE does not support repetition of expressions, complex alternations, character ranges, character lists, POSIX character classes, and so on. In addition, the new regular expression function allows you to detect recurring words and pattern exchanges. The examples here give you an overview of the field of regular expressions and how you can use them in your application.
Really enrich your toolkit
Because regular expressions help solve complex problems, they are very powerful. Some functions of regular expressions are difficult to emulate with traditional SQL functions. When you learn about the basic building blocks of this slightly mysterious language, regular expressions will be an integral part of your toolkit (not only in SQL but also in other programming languages). Although trial and error are sometimes necessary to make your patterns right, there is no doubt about the simplicity and power of regular expressions.
Alice Rischert (ar280@yahoo.com) is chairman of database application development and design in the Department of computer Technology and applications at Columbia University. She wrote Oracle.
SQL interaction Manual version 2 (Prentice Hall,2002) and upcoming Oracle
SQL example (Prentice Hall,2003). Rischert has more than 15 years of experience as a database designer, DBA, and project manager at Fortune 100 companies, and she has been using Oracle products since Oracle version 5.
Table 1: positioning metacharacters
The metacharacter description ^ positions the expression to the beginning of the line $to the end of the line
Table 2: quantifier or repetition operator
Quantifier description * match 0 or more times? Match 0 or 1 time + match 1 or more times {m} exactly m times {m,} at least m times {m, n} at least m times but not more than n times
Table 3: predefined POSIX character classes
Character class description [: alpha:] alphabetic character [: lower:] lowercase alphabetic character [: upper:] uppercase alphanumeric character [: digit:] numeric [: alnum:] alphanumeric character [: space:] white space character (printing prohibited), such as carriage return, line feed, vertical tab and feed [: punct:] punctuation character [: cntrl:] control character (printing prohibited) [: print:] printable character
Table 4: substitution matching and grouping of expressions
Metacharacter description | substitution delimited substitution option, usually used with the grouping operator () to group subexpressions into a replacement unit, quantifier unit, or backward reference unit (see "backward reference" section) [char] character list represents a character list; most metacharacters in a character list (except for character classes, ^ and-metacharacters) are understood as text
Table 5:REGEXP_LIKE operator
Syntax states that REGEXP_LIKE (source_string, pattern [, match_parameter]) source_string supports character data types (CHAR, VARCHAR2, CLOB, NCHAR, NVARCHAR2 and NCLOB, but does not include LONG). The pattern parameter is another name for a regular expression. Match_parameter allows optional parameters such as handling newline characters, preserving multiline formatting, and providing case-sensitive control.
Table 6:REGEXP_INSTR function
Syntax description REGEXP_INSTR (source_string, pattern [, start_position [, occurrence [, return_option [, match_parameter]) this function looks for pattern and returns the first location of the pattern. Feel free to specify the start_position you want to start the search for. The occurrence parameter defaults to 1 unless you specify that you want to find a pattern that appears next. The default value of return_option is 0, which returns the starting position of the pattern, and a value of 1 returns the starting position of the next character that meets the matching criteria.
Table 7: description of a 5-digit plus 4-digit zip code expression
Syntax description must match blank [: digit:] POSIX numeric class] the end of the {5} character list repeats exactly 5 times (the beginning of the subexpression-a literal hyphen) Because it is not a range metacharacter in a character list [: digit:] POSIX [: digit:] class [beginning of character list] end of character list {4} character list repeats exactly 4 times) closing parentheses, ending subexpression? The quantifier matches the subexpression of the grouping 0 or 1, so that the 4-bit code can choose the $positioning metacharacter to indicate the end of the line.
Table 8:REGEXP_SUBSTR function
Syntax description REGEXP_SUBSTR (source_string, pattern [, position [, occurrence [, match_parameter]) the REGEXP_SUBSTR function returns a substring that matches the pattern.
Table 9: REGEXP_REPLACE function
Syntax description REGEXP_REPLACE (source_string, pattern [, replace_string [, position [, occurrence, [match_parameter]) this function replaces the matching pattern with a specified replace_string, allowing complex "search and replace" operations.
Table 10: backward reference metacharacters
The metacharacter indicates that the\ digit backslash is followed by a number between 1 and 9, which matches the digit subexpression enclosed in parentheses. Note: backslashes have another meaning in regular expressions, depending on the context, and may also represent Escape characters.
Table 11: description of pattern exchange regular expressions
Regular expression project description (the beginning of the first subexpression. Match any single character except newline character * repeat operator, match the previous. Metacharacters 0 to n times) the end of the first subexpression; the matching result is obtained in\ 1 (in this case, the result is Ellen. The white space that must exist (the beginning of the second subexpression. Matches any single character except the newline character * repeat operator, matching the previous. Metacharacter 0 to n times) the end of the second subexpression; the matching result is obtained in\ 2 (in this case, the result is Hildi. Blank (the beginning of the third subexpression. Match any single character except newline character * repeat operator, match the previous. Metacharacter 0 to n times) the end of the third subexpression; the matching result is obtained in\ 3 (in this case, the result is Smith. )
Table 12: description of regular expressions for social security numbers
The regular expression project states the first character of the ^ line (regular expressions cannot have any leading characters until they match. ) (start subexpression and list alternate options separated by | metacharacters [: digit:] POSIX numeric class] end of character list {3} character list repeats exactly 3 times-hyphen [beginning of character list [: digit:] POSIX numeric class] character list ends {2} character list repeats exactly 2 times-another hyphen [character list] The end of the {4} character list at the beginning of [: digit:] POSIX numeric class] character list is repeated exactly 4 times | replace metacharacters End the first option and start the next replacement expression [: digit:] POSIX numeric class] the end of the character list {9} repeats exactly 9 times) ending parentheses, ending the subexpression group $positioning metacharacters used for replacement, indicating the end of the line There are no extra characters to match the pattern. The above is how to use regular expressions in the oracle10g database. Have you learned any knowledge or skills? If you want to learn more skills or enrich your knowledge reserve, you are welcome to follow the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.