How to implement KMP algorithm in Java 04/19 Update SLTechnology News&Howtos

How to implement KMP algorithm in Java

2025-04-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Shulou(Shulou.com)05/31 Report--

This article mainly introduces the relevant knowledge of "how to realize the KMP algorithm in Java". The editor shows you the operation process through the actual case. The operation method is simple, fast and practical. I hope this article "how to realize the KMP algorithm in Java" can help you solve the problem.

Graphic illustration

The idea of kmp algorithm is similar to that of bm algorithm. As mentioned earlier, there is a concept of a good suffix in the bm algorithm, while there is a concept of a good prefix in kmp. Let's look at the following example.

Looking at the example above, the abcde that has been matched is called a good prefix, and a does not match the subsequent bcde, so there is no need to slide directly to e again.

What if there are matching characters in the prefix?

Looking at the example above, if we slide directly after a good prefix, we will overslide and miss the matching string. So how do we slide reasonably according to a good prefix?

In fact, it is to see whether the prefix and suffix of the current good prefix match, find the longest matching length, and slide directly. Since we can find the longest matching length more than once, we can initialize an array and save the longest matching length under the current good prefix, and then our next array will come out.

We define a next array that represents the length of the longest matching substring of the prefix and suffix under the current good prefix. This longest matching length indicates that the substring has been matched before, and there is no need to match again. The match starts directly from the next character of the substring.

Whether we need to match every character every time we calculate next [I], and whether we can deduce from next [I-1] to reduce unnecessary comparisons.

With this in mind, let's take a look at the following steps:

Suppose next [I-1] = k-1

If modelStr [k] = modelStr [I], then next [I] = k

If modelStr [k]! = modelStr [I], can we directly assume that next [I] = next [I-1]?

Through the above example, we can see very clearly that next [I]! = Next [I-1], then when modelString [k]! = modelString [I], we know next [0], next [1]... Next [I-1], how to push out next [I]?

Suppose modelStr [x... I] is the longest suffix substring that the prefix suffix can match, then the longest matching prefix substring is modelStr [0... IMurx]

When we are looking for the longest matching string, the second longest matching string in front of him (excluding the current I), that is, modelStr [x... IMur1] should have been solved before, so we only need to find a matching string that has been solved, assuming that the prefix substring is modelStr [0... I-x-1], the suffix substring is modelStr [x... IMur1], and modelStr [iMurx] = = modelStr [I], this prefix suffix substring is the secondary prefix substring, plus the current character is the longest matching prefix suffix substring.

Code implementation

First of all, the most important next array in the kmp algorithm, this array marks the number of characters of the longest prefix suffix matching substring up to the current subscript. In the kmp algorithm, if a prefix is a good prefix, that is, matching with the pattern string prefix, we can use certain techniques to slide forward more than one character, see the previous explanation. We don't know in advance which prefixes are good, and the matching process occurs more than once, so we initially call an initialization method to initialize the next array.

1. If the next character of the longest prefix substring of the previous character = = the current character, the longest prefix substring of the previous character can be directly added to the current character

two。 If not, you need to find that the next character of the longest prefix substring that exists before is equal to that of the current substring, and then set the longest prefix suffix substring of the current substring.

Int [] next; / * initialize the next array * @ param modelStr * / public void init (char [] modelStr) {/ / first compute the next array / / traversal modelStr, and the traversed characters form a string next = new int [modelStr.length]; int start = 0; while (start)

< modelStr.length) { next[start] = this.recursion(start, modelStr); ++ start; } } /** * * @param i 当前遍历到的字符 * @return */ private int recursion(int i, char[] modelStr) { //next记录的是个数，不是下标 if (0 == i) { return 0; } int last = next[i -1]; //没有匹配的,直接判断第一个是否匹配 if (0 == last) { if (modelStr[last] == modelStr[i]) { return 1; } return 0; } //如果last不为0，有值，可以作为最长匹配的前缀 if (modelStr[last] == modelStr[i]) { return next[i - 1] + 1; } //当next[i-1]对应的子串的下一个值与modelStr不匹配时，需要找到当前要找的最长匹配子串的次长子串 //依据就是次长子串对应的子串的下一个字符==modelStr[i]; int tempIndex = i; while (tempIndex >

0) {last = next [tempIndex-1]; / / find the matching substring if (modelStr [last] = = modelStr [I]) where the first next character is the current character; {return last + 1;}-tempIndex;} return 0;}

Then start using the next array to match, match from the first character to find the first mismatched character, and then determine whether it is a complete match and return directly, no, to determine whether the first character does not match, but to match directly behind. If you have a good prefix, you take advantage of the next array, and you know from the next array where you can start the match, and you don't have to match the previous one.

Public int kmp (char [] mainStr, char [] modelStr) {/ / start matching int I = 0, j = 0; while (I + modelStr.length)

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.