Practical Application of gawk gsub function

Updated 2025-03-30. From: SLTechnology News&Howtos, Shulou (Shulou.com), 06/01

This article walks through a practical application of the gawk gsub function. The method is simple, fast and practical.

During a data-cleaning task, you may need to find rows in two tables whose fields all have the same values. The usual approach is an EXISTS query, similar to:

select *
  from a
 where exists (select 1
                 from b
                where a.col1 = b.col1
                  and a.col2 = b.col2)

But the trouble here is that there are too many columns to match:

a.INFOCODE,a.SOURCENAME,a.SOURCETYPE,a.PUBLISHTYPE,a.NOTICEDATE,a.ENDDATE,a.NOTICETITLE,a.LANGUAGE,a.IMPORTLEVEL,a.SOURCEURL,a.ATTACHTYPE,a.ATTACHNAME,a.ATTACHSIZE,a.FORM,a.ACCESSORYNUM,a.NOTICESTATE,a.PUBLISHDATE,a.FILENUMBER

Linux text-processing tools can solve this problem.

First, put the column list in a text file (named 1 here):

root@bd-dev-mingshuo-183:/tmp# more 1

a.INFOCODE,a.SOURCENAME,a.SOURCETYPE,a.PUBLISHTYPE,a.NOTICEDATE,a.ENDDATE,a.NOTICETITLE,a.LANGUAGE,a.IMPORTLEVEL,a.SOURCEURL,a.ATTACHTYPE,a.ATTACHNAME,a.ATTACHSIZE,a.FORM,a.ACCESSORYNUM,a.NOTICESTATE,a.PUBLISHDATE,a.FILENUMBER

This is where the gsub function in gawk comes in.

gsub finds every match of the regular expression in the target string and replaces it, which is equivalent to sed's 's///g'.

The syntax is as follows:

gsub(regular expression, substitution string, target string)

The third parameter is the string to be processed, the first parameter is the pattern to match, and every match is replaced with the second parameter. If the third parameter is omitted, gsub works on $0.
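To make the three parameters concrete, here is a minimal sketch using a made-up input string (not the article's data). It relies on two properties of gsub: it returns the number of substitutions it made, and it modifies the target in place. The commands call plain awk so that they run where gawk is not installed; gawk can be substituted without any other change.

```shell
# Hypothetical sample input, not taken from the article's column list.
line='a.COL1,a.COL2,a.COL3'

# gsub returns the number of replacements and edits the target in place.
count_and_result=$(printf '%s\n' "$line" | awk '{ n = gsub(/,/, ";", $0); print n, $0 }')
echo "$count_and_result"

# With the third argument omitted, gsub works on $0.
default_target=$(printf '%s\n' "$line" | awk '{ gsub(/,/, " "); print }')
echo "$default_target"

# The target can also be an ordinary variable; $0 is then left untouched.
var_target=$(printf '%s\n' "$line" | awk '{ s = $0; gsub(/a\./, "b.", s); print s; print $0 }')
echo "$var_target"
```

The first command prints 2 followed by the rewritten line, because the sample contains two commas. The last command prints the modified copy and then the original line, which shows that only the named target changes.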

Turn the single line of text into multiple lines:

root@bd-dev-mingshuo-183:/tmp# more 1 | gawk 'gsub(/,/, "\n", $0)'

a.INFOCODE
a.SOURCENAME
a.SOURCETYPE
a.PUBLISHTYPE
a.NOTICEDATE
a.ENDDATE
a.NOTICETITLE
a.LANGUAGE
a.IMPORTLEVEL
a.SOURCEURL
a.ATTACHTYPE
a.ATTACHNAME
a.ATTACHSIZE
a.FORM
a.ACCESSORYNUM
a.NOTICESTATE
a.PUBLISHDATE
a.FILENUMBER

Next, copy each column so that it appears on both sides of an equals sign:

root@bd-dev-mingshuo-183:/tmp# more 1 | gawk 'gsub(/,/, "\n", $0)' | gawk -F '\n' '{print "on", $0, "=", $0, "and"}'

on a.INFOCODE = a.INFOCODE and
on a.SOURCENAME = a.SOURCENAME and
on a.SOURCETYPE = a.SOURCETYPE and
on a.PUBLISHTYPE = a.PUBLISHTYPE and
on a.NOTICEDATE = a.NOTICEDATE and
on a.ENDDATE = a.ENDDATE and
on a.NOTICETITLE = a.NOTICETITLE and
on a.LANGUAGE = a.LANGUAGE and
on a.IMPORTLEVEL = a.IMPORTLEVEL and
on a.SOURCEURL = a.SOURCEURL and
on a.ATTACHTYPE = a.ATTACHTYPE and
on a.ATTACHNAME = a.ATTACHNAME and
on a.ATTACHSIZE = a.ATTACHSIZE and
on a.FORM = a.FORM and
on a.ACCESSORYNUM = a.ACCESSORYNUM and
on a.NOTICESTATE = a.NOTICESTATE and
on a.PUBLISHDATE = a.PUBLISHDATE and
on a.FILENUMBER = a.FILENUMBER and
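The pipeline needs a second gawk because gsub only rewrites the text of the current record. The first process still sees a single record; the embedded newlines become separate records only when another process reads that output. The sketch below uses a shortened sample list (an assumption, to keep the output small) and counts records with NR at each stage. It calls plain awk, for which gawk can be substituted.

```shell
cols='a.INFOCODE,a.SOURCENAME,a.SOURCETYPE'   # shortened sample list (assumption)

# Stage 1: one input line is one record, however many newlines gsub inserts.
stage1=$(printf '%s\n' "$cols" | awk '{ gsub(/,/, "\n", $0) } END { print NR }')

# Stage 2: a second awk reads the rewritten text and sees one record per column.
stage2=$(printf '%s\n' "$cols" | awk 'gsub(/,/, "\n", $0)' | awk 'END { print NR }')

echo "records in stage 1: $stage1"
echo "records in stage 2: $stage2"
```

Stage 1 reports 1 record and stage 2 reports 3. Note also that the article's form uses gsub as the pattern with no action, so a line is printed only when gsub returns a non-zero count; an input line without any comma would be silently dropped.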

Finally, replace the a. prefix on the right-hand side with b.:

root@bd-dev-mingshuo-183:/tmp# more 1 | gawk 'gsub(/,/, "\n", $0)' | gawk -F '\n' '{print $0, "=", $0, "and"}' | sed 's/= a/= b/'

a.INFOCODE = b.INFOCODE and
a.SOURCENAME = b.SOURCENAME and
a.SOURCETYPE = b.SOURCETYPE and
a.PUBLISHTYPE = b.PUBLISHTYPE and
a.NOTICEDATE = b.NOTICEDATE and
a.ENDDATE = b.ENDDATE and
a.NOTICETITLE = b.NOTICETITLE and
a.LANGUAGE = b.LANGUAGE and
a.IMPORTLEVEL = b.IMPORTLEVEL and
a.SOURCEURL = b.SOURCEURL and
a.ATTACHTYPE = b.ATTACHTYPE and
a.ATTACHNAME = b.ATTACHNAME and
a.ATTACHSIZE = b.ATTACHSIZE and
a.FORM = b.FORM and
a.ACCESSORYNUM = b.ACCESSORYNUM and
a.NOTICESTATE = b.NOTICESTATE and
a.PUBLISHDATE = b.PUBLISHDATE and
a.FILENUMBER = b.FILENUMBER and
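The same conditions can also be produced in a single awk process, which avoids the trailing "and" that otherwise has to be deleted by hand from the last line of the output above. This is a sketch of an alternative, not the article's method: it uses split() rather than gsub to break the line into columns, and sub() to swap the a. prefix for b. on the right-hand side. The function name build_conditions and the shortened sample list are made up for illustration.

```shell
# build_conditions is a hypothetical helper: it reads a comma-separated column
# list on stdin and prints one "a.X = b.X" condition per line.
build_conditions() {
    awk '{
        n = split($0, cols, ",")          # cols[1..n] holds the column names
        for (i = 1; i <= n; i++) {
            right = cols[i]
            sub(/^a\./, "b.", right)      # a.INFOCODE -> b.INFOCODE
            # join conditions with "and", but not after the last one
            printf "%s = %s%s\n", cols[i], right, (i < n ? " and" : "")
        }
    }'
}

# Shortened sample list (assumption); the real list lives in the file named 1.
conditions=$(printf '%s\n' 'a.INFOCODE,a.SOURCENAME,a.SOURCETYPE' | build_conditions)
echo "$conditions"
```

Against the article's file the call would be build_conditions < 1, and the output can be pasted directly after the "where" inside the EXISTS subquery.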

The process itself is simple; the point is how gsub is applied in gawk and the overall approach to the problem.