Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to solve the problem of character processing similar to regular expressions in SqlServer

2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

Editor to share with you how to solve SqlServer character processing problems similar to regular expressions, I believe most people do not know much about it, so share this article for your reference, I hope you can learn a lot after reading this article, let's go to know it!

SQL Serve provides simple character fuzzy matching features, such as like and patindex, but it is not enough for some character processing scenarios. Several daily problems encountered are:

1. How many times does the same character / string appear?

two。 The position of the nth occurrence of the same character

3. Multiple identical characters are consecutive and merged into one character

4. Whether it is a valid IP/ ID card number / mobile phone number, etc.

one。 How many times does the same character / string appear?

The same character can be calculated by replacing it with an empty string

Declare @ text varchar (1000) declare @ str varchar (10) set @ text = 'ABCBDBE'set @ str =' B'select len (@ text)-len (replace (@ text,@str,''))

The same string is still replaced, because it is multiple characters, method 1 needs to be divided once after replacement, and method 2 does not need to add a character when replacing.

Method 1declare @ text varchar (1000) declare @ str varchar (10) set @ text = 'ABBBCBBBDBBBE'set @ str =' BBB'select (len (@ text)-len (replace (@ text,@str,'') / len (@ str)-method 2declare @ text varchar (1000) declare @ str varchar (10) set @ text = 'ABBBCBBBDBBBE'set @ str =' BBB'select len (replace (@ text,@str,@str+'_'))-len (@ text)

two。 The same character / string, the position of the Nth occurrence

The function of SQL SERVER to locate the character position is CHARINDEX:

CHARINDEX (expressionToFind, expressionToSearch [, start_location])

You can start the search from the specified location, but you can't take the location where the Nth occurrence occurs. You need to write your own SQL to supplement it. There are several ways of thinking:

1. Custom function that adds a count to the charindex at a time in the loop until N

If object_id ('NthChar','FN') is not null drop function NthcharGOcreate function NthChar (@ source_string as nvarchar (4000), @ sub_string as nvarchar (1024), @ nth as int) returns int as begin declare @ postion int declare @ count int set @ postion = CHARINDEX (@ sub_string) @ source_string) set @ count = 0 while @ postion > 0 begin set @ count = @ count + 1 if @ count = @ nth begin break end set @ postion = CHARINDEX (@ sub_string, @ source_string, @ postion + 1) End return @ postion end GO--select dbo.NthChar ('abcabc','abc',2)-- 4

two。 Through CTE, treat the entire table field operation processed, adding a count to charindex each time recursively until N

If object_id ('tempdb..#T') is not null drop table # Tcreate table # T (source_string nvarchar (4000)) insert into # T values (N' We') insert into # T values (N' me') declare @ sub_string nvarchar (1024) declare @ nth intset @ sub_string = N' we 'set @ nth = 2 With T (source_string, starts, pos, nth) as (select source_string, 1, charindex (@ sub_string, source_string), 1 from # t union all select source_string, pos + 1, charindex (@ sub_string, source_string, pos + 1), nth+1 from Twhere pos > 0) select source_string, pos, nthfrom Twhere pos 0 and nth = @ nthorder by source_string, starts--source_string pos nth-- We 3 2

3. With the help of digital table (tally table), to do charindex at different starting points, you need to construct a digital table first.

-- numbers/tally tableIF EXISTS (select * from dbo.sysobjects where id = object_id (N' [dbo]. [Numbers]') and OBJECTPROPERTY (id, NumberIsUserTable`) = 1) DROP TABLE dbo.Numbers--= Create and populate the Tally table on the fly SELECT TOP 1000000 IDENTITY (int,1,1) AS number INTO dbo.Numbers FROM master.dbo.syscolumns sc1 Master.dbo.syscolumns sc2--= Add a Primary Key to maximize performance ALTER TABLE dbo.Numbers ADD CONSTRAINT PK_numbers_number PRIMARY KEY CLUSTERED (number)-- = create a digital table above Allow the general public to use it GRANT SELECT ON dbo.Numbers TO PUBLIC-- once You don't need to create DECLARE @ source_string nvarchar (4000), @ sub_string nvarchar (1024), @ nth intSET @ source_string = 'abcabcvvvvabc'SET @ sub_string =' abc'SET @ nth = 2 every time WITH T AS (SELECT ROW_NUMBER () OVER (ORDER BY number) AS nth, number AS [Position In String] FROM dbo.Numbers n WHERE n.number 0) begin select @ str=replace (@ str,'','') endselect @ str

two。 Split the string according to the space

For each segment of the split string trim or replace, connect with a space, which is a bit tedious. No code example is written. For more information on how to split the string, please refer to "the position of the Nth occurrence".

four。 Whether it is a valid IP/ ID card number / mobile phone number, etc.

Strings such as IP/ ID card number / mobile phone number often have their own specific rules. It is possible to judge bit by bit or segment by segment through substring, but the performance of SQL statements is often poor, so it is recommended to try regular functions, see below.

five。 Regular expression function

1. Oracle

Starting with 10g, you can use regular expressions in queries, which are implemented through functions that support regular expressions:

Oracle 10 gREGEXP_LIKEREGEXP_REPLACEREGEXP_INSTRREGEXP_SUBSTROracle 11g (New) REGEXP_COUNT

Oracle uses the REGEXP function to deal with the above problems:

(1) how many times does the same character / string appear?

Select length (regexp_replace, [^ -]',') from dual;select REGEXP_COUNT ('123,345-566) from dual;-- Oracle 11g

(2) the position of the nth occurrence of the same character / string

There is no need for regularization, and the instr of ORACLE can find the location directly:

Instr ('source_string','sub_string' [, n] [, m])

N indicates that the search starts with the nth character, the default value is 1, m indicates the m occurrence, and the default value is 1.

Select instr ('abcdefghijkabc','abc', 1,2) position from dual

(3) multiple identical characters are consecutive and merged into one character

Select regexp_replace (trim ('agc f f'),'\ s beautiful from dual')

(4) whether it is a valid IP/ ID card number / mobile phone number, etc.

-whether it is a valid IPWITH IPAS (SELECT '10.20.30.40' ip_address FROM dual UNION ALLSELECT' a.b.c.d'ip_address FROM dual UNION ALLSELECT '256.123.0.254' ip_address FROM dual UNION ALLSELECT' 255.255.255.255' ip_address FROM dual) SELECT * FROM IPWHERE REGEXP_LIKE (ip_address) '^ (([0-9] {1} | [0-9] [0-9] | 1 [0-9] [0-9] | 2 [0-4] [0-9] | 25 [0-5])\.) {3} ([0-9] {1} | [0-9] [0-9] | 1 [0-9] [0-9] | 2 [0-4] [0-9] | 25 [0-5]) $') -- whether it is a valid ID card / mobile phone number. There is no example yet.

2. SQL Server

Currently, the latest version is SQL Server 2017, and there is no support for REGEXP function, which needs to be extended by general CLR. Implement REG_REPLACE for CLR as follows:

-1. Open CLR EXEC sp_configure 'show advanced options',' 1'GORECONFIGUREGOEXEC sp_configure 'clr enabled',' 1'GORECONFIGUREGOEXEC sp_configure 'show advanced options',' 0 destroy go

two。 Create Assembly

-3. Create the CLR function CREATE FUNCTION [dbo]. [regex_replace] (@ input [nvarchar] (4000), @ pattern [nvarchar] (4000), @ replacement [nvarchar] (4000)) RETURNS [nvarchar] (4000) WITH EXECUTE AS CALLER, RETURNS NULL ON NULL INPUTAS EXTERNAL NAME [RegexUtility] .RegexUtility. [RegexReplaceDefault] GO--4. Use regex_replace to replace multiple spaces with a single space select dbo.regex_replace ('agc f f','\\ white space')

Note: implement more REGEXP functions through CLR, which can be developed by yourself if you have high-level language development capabilities, or you can directly use some open source contributions, such as http://devnambi.com/2016/sql-server-regex/

The above is all the content of the article "how to solve the character processing problem of SqlServer similar to regular expressions". Thank you for reading! I believe we all have a certain understanding, hope to share the content to help you, if you want to learn more knowledge, welcome to follow the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report