Using TensorFlow to implement a simple CAPTCHA recognition process

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

In this article, we use TensorFlow to build a deep learning model that recognizes graphic (image) CAPTCHAs. We first train the model on labeled data and then use the trained model to recognize CAPTCHAs.

1. CAPTCHA preparation

Here we use Python's captcha library to generate the CAPTCHAs. It is not installed by default, so we need to install it first, together with the pillow library.

Once installed, we can generate a simple graphic CAPTCHA with the following code
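A minimal sketch of such a snippet, assuming the `ImageCaptcha` class from the captcha package and `Image.open` from pillow. The function name `make_captcha_image` is an illustrative choice, not a name taken from the article:

```python
def make_captcha_image(text):
    """Render `text` as a graphic CAPTCHA and return it as a Pillow image."""
    # Imported inside the function so the module still loads when the
    # optional third-party packages are not installed.
    from captcha.image import ImageCaptcha
    from PIL import Image

    generator = ImageCaptcha()            # default canvas: 160 px wide, 60 px high
    png_bytes = generator.generate(text)  # in-memory PNG (a BytesIO object)
    return Image.open(png_bytes)
```

Calling `make_captcha_image("1234").show()` should open a picture whose distorted characters read 1234.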

We can see that the text in the picture is exactly what we defined, so we now have a picture together with its true text. We can use this to generate a batch of training data and test data.

2. Preprocessing

Data must be preprocessed before training. We first define the text content of each CAPTCHA to be generated, which gives us the label, and then use that text to generate the CAPTCHA image, which gives us the input data x. Before that, we need to define the vocabulary. A vocabulary of uppercase letters, lowercase letters, and digits is relatively large: with four characters per CAPTCHA, the total number of possible combinations is (26 + 26 + 10)^4 = 14776336, which is a lot to train on. To simplify, we train only on purely numeric CAPTCHAs, so the number of combinations becomes 10^4 = 10000, which is far smaller.

So here we first define a vocabulary and its length variable:
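A minimal sketch of these definitions. `VOCAB`, `CAPTCHA_LENGTH`, and `VOCAB_LENGTH` are the names used in this article; `full_combinations` and `digit_combinations` are illustrative names added here to check the arithmetic:

```python
# Vocabulary: the ten digit characters '0' through '9'.
VOCAB = [str(d) for d in range(10)]
CAPTCHA_LENGTH = 4          # characters per CAPTCHA
VOCAB_LENGTH = len(VOCAB)   # size of the vocabulary

# Number of distinct CAPTCHAs for each choice of vocabulary.
full_combinations = (26 + 26 + 10) ** CAPTCHA_LENGTH   # both letter cases plus digits
digit_combinations = VOCAB_LENGTH ** CAPTCHA_LENGTH    # digits only

print(full_combinations)    # 14776336
print(digit_combinations)   # 10000
```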

Here VOCAB is the content of the vocabulary, that is, the 10 digits from 0 to 9. CAPTCHA_LENGTH is the number of characters in a CAPTCHA, which is 4. The length of the vocabulary is the length of VOCAB, that is, 10.

Next, we define a method for generating CAPTCHA data. The process is similar to the above, except that here we convert the returned data into a NumPy array:

Calling this method returns a NumPy array that holds the RGB values of every pixel in the CAPTCHA image. Let's call the method to try it:
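A minimal sketch of such a method, assuming the captcha package's `ImageCaptcha` with its default canvas size. The name `generate_captcha` and the explicit conversion to RGB are assumptions made for this sketch:

```python
import numpy as np


def generate_captcha(captcha_text):
    """Render `captcha_text` and return the picture as a NumPy array."""
    # Imported inside the function so the module still loads when the
    # optional third-party packages are not installed.
    from captcha.image import ImageCaptcha
    from PIL import Image

    png_bytes = ImageCaptcha().generate(captcha_text)
    image = Image.open(png_bytes).convert("RGB")  # force three colour channels
    return np.array(image)  # layout: (height, width, channels), dtype uint8
```

With the default canvas, `generate_captcha("1234").shape` should be `(60, 160, 3)`.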

The contents are as follows:

You can see that its shape is (60, 160, 3): the CAPTCHA image is 60 pixels high and 160 pixels wide, and each pixel has an RGB value, which is the last dimension.

Next, we need to define the label. Since we train a deep learning model, it is best to use One-Hot encoding for the label data. Each character is encoded as a vector of vocabulary length in which only the position of that character's index in the vocabulary is set to 1. For the CAPTCHA text 1234, the four vectors are concatenated, so the total length is 4 × 10 = 40. Let's write a program that converts between One-Hot encoding and text:
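A minimal sketch of the two conversions, using the method names `text2vec` and `vec2text` from this article. The function bodies and the `ValueError` on a wrong-length input are assumptions:

```python
import numpy as np

VOCAB = [str(d) for d in range(10)]
CAPTCHA_LENGTH = 4
VOCAB_LENGTH = len(VOCAB)


def text2vec(text):
    """Encode CAPTCHA text as a flat One-Hot vector of length 40."""
    if len(text) != CAPTCHA_LENGTH:
        raise ValueError(f"expected {CAPTCHA_LENGTH} characters, got {len(text)}")
    vector = np.zeros(CAPTCHA_LENGTH * VOCAB_LENGTH)
    for position, char in enumerate(text):
        # Each character owns its own block of VOCAB_LENGTH slots.
        vector[position * VOCAB_LENGTH + VOCAB.index(char)] = 1
    return vector


def vec2text(vector):
    """Decode a flat One-Hot (or score) vector back into text."""
    blocks = np.asarray(vector).reshape(CAPTCHA_LENGTH, VOCAB_LENGTH)
    return "".join(VOCAB[i] for i in blocks.argmax(axis=1))


vector = text2vec("1234")
print(vector.reshape(CAPTCHA_LENGTH, VOCAB_LENGTH))  # one row per character
print(vec2text(vector))  # 1234
```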

Here, the text2vec() method converts the real text into One-Hot encoding, and the vec2text() method converts the One-Hot encoding back into the real text.

For example, we can call these two methods to convert the text 1234 to One-Hot encoding and then convert it back:

In this way, we can convert between text and One-Hot encoding.

Then we can construct a batch of data. The x data is the NumPy array of the CAPTCHA image, and the y data is the One-Hot encoding of the CAPTCHA text. The generated content is as follows:

Here we define a get_random_text() method that randomly generates CAPTCHA text. We use this text to generate the corresponding x and y data and then write the data to a pickle file, which completes the preprocessing.
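A minimal sketch of this step. The helpers `render_captcha` and `generate_data`, the `render` parameter, and the choice to store x and y together in one pickle file are assumptions for illustration:

```python
import pickle
import random

import numpy as np

VOCAB = [str(d) for d in range(10)]
CAPTCHA_LENGTH = 4
VOCAB_LENGTH = len(VOCAB)


def get_random_text():
    """Return CAPTCHA_LENGTH characters drawn at random from VOCAB."""
    return "".join(random.choice(VOCAB) for _ in range(CAPTCHA_LENGTH))


def text2vec(text):
    """Encode text as a flat One-Hot vector."""
    vector = np.zeros(CAPTCHA_LENGTH * VOCAB_LENGTH)
    for position, char in enumerate(text):
        vector[position * VOCAB_LENGTH + VOCAB.index(char)] = 1
    return vector


def render_captcha(text):
    """Assumed renderer: the captcha package plus pillow."""
    from captcha.image import ImageCaptcha
    from PIL import Image
    return np.array(Image.open(ImageCaptcha().generate(text)).convert("RGB"))


def generate_data(count, path, render=render_captcha):
    """Build `count` (image, label) pairs and pickle them to `path`."""
    texts = [get_random_text() for _ in range(count)]
    data_x = np.stack([render(text) for text in texts])
    data_y = np.stack([text2vec(text) for text in texts])
    with open(path, "wb") as f:
        pickle.dump((data_x, data_y), f)
    return data_x, data_y
```

For example, `generate_data(10000, "data.pkl")` would write 10000 samples. The file name is hypothetical.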

3. Build a model

With the data ready, let's start building the model. Here we use the train_test_split() method to divide the data into three parts: the training set, the development set, and the validation set:
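Since scikit-learn's `train_test_split` produces a two-way split, one way to get three parts is to call it twice. A minimal sketch, in which the 60/20/20 ratio, the seed, and the function name `split_three_ways` are assumptions:

```python
from sklearn.model_selection import train_test_split


def split_three_ways(data_x, data_y, seed=40):
    """Split into training (60%), development (20%), and validation (20%) parts."""
    # First call: hold out 40% of the samples.
    train_x, rest_x, train_y, rest_y = train_test_split(
        data_x, data_y, test_size=0.4, random_state=seed)
    # Second call: cut the held-out part in half.
    dev_x, val_x, dev_y, val_y = train_test_split(
        rest_x, rest_y, test_size=0.5, random_state=seed)
    return (train_x, train_y), (dev_x, dev_y), (val_x, val_y)
```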

Next, we use three datasets to build three Dataset objects:

Then initialize an iterator and bind it to the dataset:
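A minimal sketch of building the Dataset objects and binding one reinitializable iterator to each of them in turn. It assumes TensorFlow's TF1-style graph API, reached through `tf.compat.v1`. The function name `first_batches` and the batch size are illustrative:

```python
def first_batches(splits, batch_size=2):
    """Read the first batch of every split through one shared iterator.

    `splits` maps a split name to an (x, y) pair of NumPy arrays.
    """
    import tensorflow as tf  # heavy import, kept local to the sketch
    tf1 = tf.compat.v1

    shapes = {}
    with tf.Graph().as_default():
        datasets = {
            name: tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size)
            for name, (x, y) in splits.items()
        }
        any_dataset = next(iter(datasets.values()))
        # The iterator is defined by structure only, so it can be bound to
        # any dataset with matching types and shapes.
        iterator = tf1.data.Iterator.from_structure(
            tf1.data.get_output_types(any_dataset),
            tf1.data.get_output_shapes(any_dataset))
        initializers = {name: iterator.make_initializer(dataset)
                        for name, dataset in datasets.items()}
        batch_x, batch_y = iterator.get_next()

        with tf1.Session() as sess:
            for name, initializer in initializers.items():
                sess.run(initializer)  # bind the iterator to this split
                x_value, y_value = sess.run([batch_x, batch_y])
                shapes[name] = (x_value.shape, y_value.shape)
    return shapes
```

Running an initializer switches the data source without rebuilding the rest of the graph, which is why one iterator can serve all three splits.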

Next comes the key part. Here we build the network from three convolutional layers followed by two fully connected layers. To keep the code short, we use TensorFlow's layers module directly:

Here the convolution kernel size is 3, the padding mode is SAME, and the activation function is ReLU.
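A minimal sketch of a network with this layout. It uses `tf.keras.layers` as a stand-in for the `tf.layers` module, which recent TensorFlow releases no longer ship. The filter counts, the max-pooling after each convolution, and the 1024-unit hidden layer are assumptions:

```python
def build_model(height=60, width=160, channels=3, n_classes=40):
    """Three convolutional blocks followed by two fully connected layers."""
    import tensorflow as tf  # heavy import, kept local to the sketch
    layers = tf.keras.layers

    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(height, width, channels)))
    for filters in (32, 64, 128):
        # Kernel size 3, SAME padding, ReLU activation.
        model.add(layers.Conv2D(filters, kernel_size=3, padding="same",
                                activation="relu"))
        model.add(layers.MaxPooling2D(pool_size=2, padding="same"))
    model.add(layers.Flatten())
    model.add(layers.Dense(1024, activation="relu"))  # first fully connected layer
    model.add(layers.Dense(n_classes))                # second one outputs raw logits
    return model
```

With `n_classes` set to CAPTCHA_LENGTH * VOCAB_LENGTH = 40, the output has one score per slot of the label vector.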

After the fully connected layers, the shape of y becomes [batch_size, n_classes], while our label is the concatenation of CAPTCHA_LENGTH One-Hot vectors. We want to use cross entropy as the loss, but when computing cross entropy, the elements in the last dimension of the labels parameter must sum to 1; otherwise the gradient computation will be wrong. For more information, see the official TensorFlow documentation:

https://www.tensorflow.org/api_docs/python/tf/nn/softmax_cross_entropy_with_logits

Because the label is composed of CAPTCHA_LENGTH One-Hot vectors, its elements sum to CAPTCHA_LENGTH. We therefore need to reshape it so that the elements in the last dimension sum to 1:

In this way, the last dimension has length VOCAB_LENGTH and is a single One-Hot vector, so its elements must sum to 1.
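The effect of the reshape can be checked with a small NumPy stand-in. This is an illustration of the shapes only, not the TensorFlow code itself, and the variable names are illustrative:

```python
import numpy as np

CAPTCHA_LENGTH = 4
VOCAB_LENGTH = 10

# Flat label for the text "1234": four One-Hot blocks laid end to end.
flat_label = np.zeros((1, CAPTCHA_LENGTH * VOCAB_LENGTH))
for position, digit in enumerate([1, 2, 3, 4]):
    flat_label[0, position * VOCAB_LENGTH + digit] = 1

print(flat_label.sum(axis=-1))   # [4.]  (sums to CAPTCHA_LENGTH, not 1)

# One row per character: shape (batch, CAPTCHA_LENGTH, VOCAB_LENGTH).
label = flat_label.reshape(-1, CAPTCHA_LENGTH, VOCAB_LENGTH)
print(label.sum(axis=-1))        # [[1. 1. 1. 1.]]
```

The same reshape has to be applied to the network output so that both tensors line up character by character.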

The loss and accuracy can then be computed:

Then we carry out the training:

Here we first run train_initializer to bind the iterator to the training Dataset, and then run train_op to obtain loss, acc, gstep, and so on, and print them.
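To make the two quantities concrete, here is a NumPy sketch of softmax cross entropy and accuracy on the reshaped tensors. It is a stand-in for the TensorFlow ops, and measuring accuracy per character rather than per whole CAPTCHA is an assumption:

```python
import numpy as np

CAPTCHA_LENGTH = 4
VOCAB_LENGTH = 10


def loss_and_accuracy(logits, labels):
    """Mean softmax cross entropy and per-character accuracy.

    Both inputs have shape (batch, CAPTCHA_LENGTH * VOCAB_LENGTH).
    """
    logits = logits.reshape(-1, CAPTCHA_LENGTH, VOCAB_LENGTH)
    labels = labels.reshape(-1, CAPTCHA_LENGTH, VOCAB_LENGTH)

    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    loss = -(labels * log_probs).sum(axis=-1).mean()
    accuracy = (logits.argmax(axis=-1) == labels.argmax(axis=-1)).mean()
    return loss, accuracy
```

A network that outputs all zeros gives a loss of ln(10) ≈ 2.30, the value for a uniform guess over ten digits.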

4. Training

Run the training process, and the results are similar to the following:

5. Testing

We can also save the model every few epochs during training:

Of course, you can also save only the model with the highest accuracy on the validation set.

When validating, we can reload the saved model and then run the validation:
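A minimal sketch of saving and reloading with `tf.compat.v1.train.Saver`, assuming the TF1-style graph API. A single counter variable stands in for the model, and the function name `save_and_reload`, the temporary directory, and the checkpoint prefix are illustrative:

```python
import os
import tempfile


def save_and_reload(epochs=4, save_every=2):
    """Checkpoint a toy variable every `save_every` epochs, then restore it."""
    import tensorflow as tf  # heavy import, kept local to the sketch
    tf1 = tf.compat.v1

    checkpoint_dir = tempfile.mkdtemp()
    prefix = os.path.join(checkpoint_dir, "model")

    with tf.Graph().as_default():
        counter = tf1.get_variable(
            "counter", shape=[], initializer=tf1.zeros_initializer())
        increment = counter.assign_add(1.0)
        saver = tf1.train.Saver()
        with tf1.Session() as sess:
            sess.run(tf1.global_variables_initializer())
            for epoch in range(1, epochs + 1):
                sess.run(increment)  # stands in for one epoch of training
                if epoch % save_every == 0:
                    saver.save(sess, prefix, global_step=epoch)

    # Rebuild the same graph, then load the newest checkpoint into it.
    with tf.Graph().as_default():
        counter = tf1.get_variable(
            "counter", shape=[], initializer=tf1.zeros_initializer())
        saver = tf1.train.Saver()
        with tf1.Session() as sess:
            saver.restore(sess, tf1.train.latest_checkpoint(checkpoint_dir))
            return float(sess.run(counter))
```

No variable initializer is run in the second session, so the value returned there comes from the checkpoint.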
