Sharing skills of dealing with complex data by Excel 04/25 Update SLTechnology News&Howtos

Sharing skills of dealing with complex data by Excel

2025-04-25 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > IT Information >

Shulou(Shulou.com)11/24 Report--

The title of the original text: "the Excel catalog, which is used every day, still hides this magic skill that you don't know?" "

Hello, everyone. I am a farmer, a farmer who specializes in difficult and complicated numbers.

Yesterday, my colleague suddenly threw me such a data.

Let me divide it into four columns like this.

Can this still baffle me as a little data expert?

Don't talk too much nonsense, try it.

When you get such data, the first thing that comes to mind in the minds of most friends must be the [breakdown] function of Excel.

However, the actual operation found that neither the separator nor the fixed width can do anything about it.

So ~

Ahem! Back to the point, before we formally deal with this data, let's analyze the characteristics of the current data.

Data analysis We look at the data and find that there are the following characteristics:

First, the red column in the first column is full of numbers, and the blue column in the second column is full of text.

The sorting function in Excel requires an identifier, such as a space.

So, all we have to do is insert an identifier between the number and the text.

Secondly, the blue column of the second column and the red column of the third column have the same characteristics as the previous columns, except that the words and numbers have changed places.

Finally, let's look at the third red column and the fourth green column. We found that both columns were numbers.

However, after careful analysis, you will find that the fourth column, the green column, has a decimal point, and the decimal point is preceded by a number.

In this way, can't we add the decimal point to the previous digit for feature recognition?

Well, now that we have all the ideas, what are we going to do next?

Data processing for data processing, we can use Word wildcards to help.

For example, if you want to search the words at the beginning of the word "farmer" and write in [Nong [Yi-tortoise] {1,}], you can match with farmers, rural areas, agriculture, agricultural economy, rural cooperatives, farmland.

If you write [Nong *], only the word "Nong" will be found.

This is actually done using regular expressions.

But what is a regular expression?

Is equivalent to the sublimation of wildcards.

Popular understanding is the super-matching pattern of text data, specifying the desired text in a more concise way. Similar to the one-to-many pattern.

Among them, one is the general pattern composed of metacharacters, and most of them are the text data in accordance with the pattern.

Just like: fruit (the regular expression you wrote) includes apples, pears, bananas, cherries, grapefruit. (matches the text data on. )

Since we are dealing with text data, we can't just confine ourselves to Excel. After all, all three musketeers of Office have their own skills.

Among them, Word is the leader of the three Musketeers in the text field.

❶ Word data processing ① pastes the data into Word first, and selects "selective paste"-[keep text only] during the operation.

Add a separator between the ② serial number and the department.

Press the [Ctrl+H] shortcut key to enter the replacement page and enter: [([0-9]) ([1-turtle])] in the search content.

[([0-9])]:

Select any number from 0 to 9 and set it as a group.

[(1-turtle])]:

Select any Chinese character in the document and set it as a group.

[([0-9]) ([1-turtle])]:

Only numbers in front and Chinese characters in back can be selected and divided into two groups at the same time.

Enter: [\ 1 |\ 2] in the replacement content

[1] represents any number found in the first set of [0-9] patterns; [2] represents any text found in the second set of [one-tortoise] patterns; and adds a delimiter [|] between the two sets of data.

After that, click "more > >", and then check "use wildcard", otherwise, the search will fail.

Because wildcards are special characters in text values that replace unknown characters, it is easy to find data that matches based on a particular pattern.

Take [*?] as an example, checking the wildcard will make Word realize that [*?] is no longer an ordinary character, but a special matching symbol.

Take a wave of moving pictures.

Add a separator between ③ department and workload.

Press the [Ctrl+F] shortcut key to enter the replacement page, and enter: [(1-turtle]) ([0-9])] in the search content.

[(1-turtle])]:

Select any Chinese character in the document and set it as a group.

[([0-9])]:

Select any number from 0 to 9 and set it as a group.

[(1-tortoise]) ([0-9])]:

Only Chinese characters in front and numbers in back can be selected and divided into two groups at the same time.

Enter: [\ 1 |\ 2] in the replacement content

[\ 1] represents any text found in the first set of [one-tortoise] mode

[\ 2] represents any number found in the second set of [0-9] patterns

And add a delimiter [|] between the two sets of data.

Add a separator between the ④ workload and the work cycle.

Press the [Ctrl+F] shortcut to enter the replacement page and enter: [([0-9]) ([0-9]) (.)] in the search content.

[([0-9])]:

Select any number from 0 to 9 and set it as a group.

[(.)]:

Select the symbol [.] and set it as a group.

[([0-9]) ([0-9]) (.)]:

Only the first is a number, the second is a number, and the third is a period, which can be divided into three groups.

Enter: [\ 1 |\ 2\ 3] in the replacement content

[\ 1] represents any number found in the first set of [[0-9]] patterns

[\ 2] represents any number found in the second set of [[0-9]] patterns.

[\ 3] represents the period in the third group, and adds a delimiter [|] between the first and second groups.

After the above settings are set, the data can be glued back to Excel for processing.

▋ Excel data processing ⑤ is broken down using Excel.

After the delimiter is added, you can easily use the [separate] function to break down the data.

Then use the fixed width of [separate] to deal with the following names.

Finally, we got the specification data like this.

The summary of knowledge must be kept in mind and don't worry when you get the data.

First of all, the characteristics of the data should be analyzed, and the corresponding software should be used for preliminary processing according to the characteristics of the data.

Word+Excel processing:

❶ analyzes the data and glues into the Word

❷ builds a matching model for regular expressions, such as [([- turtle]) ([0-9])]

❸ check [use wildcards] and group

❹ adds delimited identity

❺ glued back to Excel [separate] treatment.

PS: each symbol in the wildcard is in English under half a corner!

In fact, in addition to the above treatment, there is also a PQ method. Due to the limited space, I will briefly talk about the next steps. Interested students can try it by themselves.

PQ processing:

❶ select any cell in the data area, click "data Tab"-"get data"-"from File"-Select "from folder"

❷ split-[by number of characters]

❸ split-- [by conversion from non-numeric to numeric]

❹ split-[by conversion from numeric to non-numeric].

This article comes from the official account of Wechat: Akiba Excel (ID:excel100), author: farmer

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.