How to implement handwritten digit recognition in SPL


This article describes how to implement handwritten digit recognition in SPL. It is quite practical, so it is shared here as a reference; follow along to have a look.

Recognizing handwritten Arabic numerals is very simple for humans, but it is still somewhat complicated for a program.

However, with the popularity of machine learning, it is no longer hard to write a program that recognizes handwritten digits in a dozen or so lines of code. There are plenty of machine learning frameworks that can be used directly, such as TensorFlow and Caffe, with ready-made packages available for Python, so a dozen lines of code are enough.

What I want to do, however, is to implement such a program entirely from scratch, without any third-party libraries. The point of doing it by hand is to gain a deeper understanding of how machine learning works.

1 Model implementation

1.1 Principle

Those who are familiar with neural network regression algorithms can skip this section.

I learned some basic concepts and decided to use a regression algorithm. First, download the famous MNIST dataset, which contains 60,000 training samples and 10,000 test samples. Each digit image is a 28×28 grayscale image, so the input can be viewed as a 28×28 matrix, or as 784 pixel values.

A model is defined here to judge whether a picture shows a given digit. Each model consists of one weight per input plus an intercept (the bias), followed by a normalization. The model's expression:

Out5 = sigmoid(X0*W0 + X1*W1 + ... + X783*W783 + bias)

X0 to X783 are the 784 inputs, W0 to W783 are the 784 weights, and bias is a constant. The sigmoid function squeezes any real number into the (0,1) interval, which serves as normalization.
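
For readers who want to try this formula outside esProc, here is a minimal Python sketch of the same scoring function (the names sigmoid and model_output are mine, not from the original code):

    import math

    def sigmoid(z):
        # squeeze any real number into the (0, 1) interval
        return 1.0 / (1.0 + math.exp(-z))

    def model_output(x, w, bias):
        # x: 784 pixel values, w: 784 weights, bias: a constant
        return sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + bias)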

For example, take the set of weights and bias that judges the digit 5. We expect its output to be 1 when the picture is a 5 and 0 when it is not. Training is then the process of computing, for each sample, the gap between Out5 and the correct value (0 or 1), and adjusting the weights and the bias according to that gap. Put as a formula, we are trying to drive (Out5 - correct value) toward zero, which is what minimizing the loss means.

By the same token, there are 10 models for the 10 digits, each judging a different digit. After training, an incoming picture is run through all 10 models, and whichever model's output is closest to 1 determines which digit the picture is taken to be.
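
As a sketch of what one training step looks like, the update below follows the gradient used later in the SPL script (cell B9): for a squared-error loss through a sigmoid, the per-sample gradient is (out - y) * out * (1 - out). It reuses model_output from the sketch above; v is the learning rate:

    def train_step(x, w, bias, y, v):
        # forward pass
        out = model_output(x, w, bias)
        # gradient of the squared error through the sigmoid
        g = (out - y) * out * (1.0 - out)
        # adjust each weight in proportion to its input
        for i in range(len(w)):
            w[i] -= v * g * x[i]
        # return the adjusted bias: bias = train_step(x, w, bias, y, v)
        return bias - v * g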

1.2 Training

Following this idea, the implementation is coded in the aggregator's SPL (Structured Processing Language):

A1   =file("train-imgs.btx").cursor@bi()
A2   >x=[], wei=[], bia=[], v=0.0625
A3   for 10
B3   >wei.insert(0,[to(28*28).(0)]), bia.insert(0,0.01)
A4   for 50000
B4   >label=A1.fetch(1)(1)
B5   >y=to(10).(0), y(label+1)=1, x=[]
B6   >x.insert(0,A1.fetch(28*28))
C6   >x=x.(~/255)
B7   =wei.(~**x).(~.sum())++bia
B8   =B7.(1/(1+exp(-~)))
B9   =(B8--y)**(B8.(1-~))**B8
B10  for 10
C10  >wei(B10)=wei(B10)--x.(~*v*B9(B10)), bia(B10)=bia(B10)-v*B9(B10)
A11  >file("MNIST model.btx").export@b(wei), file("MNIST model.btx").export@ba(bia)

There is no need to look elsewhere: this is all the code for training the model, and no third-party libraries are used. Let's walk through it:

A1, import the MNIST training samples with a cursor. This is a format I converted, which the aggregator can access directly.

A2, define variables: input x, weights wei, training speed v, and so on.

A3, B3, initialize the 10 groups of model parameters (each group is 784 weights + 1 bias).

A4, loop over 50,000 samples for training; the 10 models are trained at the same time.

B4, take out the label, i.e. which digit this picture is.

B5, compute the 10 correct outputs and save them in variable y.

B6, take the 784 pixels of this picture as input; C6, divide each input by 255 for normalization.

B7, compute X0*W0 + X1*W1 + ... + X783*W783 + bias.

B8, compute sigmoid(B7).

B9, compute the gradient, i.e. the partial derivatives, from B8.

B10, C10, adjust the parameters of the 10 models according to the values in B9.

A11, after training, save the model to a file.
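
As a cross-check, here is a rough NumPy equivalent of the same training loop. It assumes the samples are already loaded as an images array of shape (50000, 784) with values scaled to [0, 1] and a labels array of digits 0-9; the loading code and array names are mine, not part of the original article:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train(images, labels, v=0.0625):
        wei = np.zeros((10, 784))               # 10 models x 784 weights (A3, B3)
        bia = np.full(10, 0.01)                 # 10 biases initialized to 0.01
        for x, label in zip(images, labels):    # loop over samples (A4)
            y = np.zeros(10)
            y[label] = 1.0                      # the 10 correct outputs (B5)
            out = sigmoid(wei @ x + bia)        # B7 and B8
            g = (out - y) * (1.0 - out) * out   # gradient (B9)
            wei -= v * np.outer(g, x)           # adjust the parameters (B10, C10)
            bia -= v * g
        return wei, bia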

1.3 Test

To test the model's success rate, write a test program in SPL:

A1   =file("MNIST model.btx").cursor@bi()
B1   =[0,1,2,3,4,5,6,7,8,9]
A2   >wei=A1.fetch(10), bia=A1.fetch(10)
A3   >cnt=0
A4   =file("test-imgs.btx").cursor@bi()
A5   for 10000
B5   >label=A4.fetch(1)(1)
B6   >x=[]
B7   >x.insert(0,A4.fetch(28*28))
C7   >x=x.(~/255)
B8   =wei.(~**x).(~.sum())++bia
B9   =B8.(round(1/(1+exp(-~)),2))
B10  =B9.pmax()
B11  if label==B1(B10)
C11  >cnt=cnt+1
A12  =A1.close()
A13  =output(cnt/100)

Running the test, the accuracy reached 91.1%. I am quite satisfied with this result; after all, this is only a single-layer model, and the single-layer model I built with TensorFlow also gets just over 91%. Let's walk through the code:

A1, import the model file.

A2, extract the model into variables.

A3, initialize the counter (used to compute the success rate).

A4, import the MNIST test samples; this file format is also one I converted.

A5, loop over 10,000 samples for testing.

B5, take out the label.

B6, clear the input.

B7, take the 784 pixels of this picture as input; C7, divide each input by 255 for normalization.

B8, compute X0*W0 + X1*W1 + ... + X783*W783 + bias.

B9, compute sigmoid(B8).

B10, get the position of the maximum output, i.e. the most likely digit.

B11, C11, if the prediction is correct, increment the counter.

A12, A13, when the test is over, close the file and output the accuracy.
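
Again as a cross-check, a rough NumPy equivalent of the test loop, reusing sigmoid and the wei/bia arrays from the training sketch above (the test images and labels arrays are assumed to be loaded the same way):

    def test(images, labels, wei, bia):
        cnt = 0
        for x, label in zip(images, labels):
            out = sigmoid(wei @ x + bia)          # the 10 model outputs (B8, B9)
            if int(np.argmax(out)) == label:      # position of the maximum (B10, B11)
                cnt += 1
        return cnt / len(labels)                  # accuracy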

1.4 Optimization

The optimization here is not about further improving accuracy but about speeding up training. Readers who want higher accuracy can try these methods (a short sketch after the list illustrates items 2 and 3):

1. Add a convolution layer

2. Do not use a fixed learning rate; let it decay as training progresses.

3. Do not initialize the weights to all zeros; draw them from a normal distribution.
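
For illustration, items 2 and 3 might look like this in the NumPy sketch above; the decay schedule and the 0.01 standard deviation are arbitrary choices of mine, not taken from the article:

    import numpy as np

    rng = np.random.default_rng(0)
    wei = rng.normal(0.0, 0.01, size=(10, 784))   # item 3: normally distributed init
    bia = np.full(10, 0.01)
    v0 = 0.0625
    for step, (x, label) in enumerate(zip(images, labels)):
        v = v0 / (1.0 + step / 10000.0)           # item 2: learning rate decays over time
        y = np.zeros(10)
        y[label] = 1.0
        out = 1.0 / (1.0 + np.exp(-(wei @ x + bia)))
        g = (out - y) * (1.0 - out) * out
        wei -= v * np.outer(g, x)
        bia -= v * g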

I don't think simply chasing accuracy makes much sense, because some pictures in the MNIST dataset are inherently problematic; even a person cannot always tell which digit was written. I used the aggregator to display several of the misclassified pictures, and all of them are written very irregularly; the picture below is hard to recognize as a 2.

The key point here is improving training speed, which can be done with parallelism or a cluster. Parallelism in SPL is easy: just use the fork keyword and rework the code above a little.

A1   =file("train-imgs.btx").cursor@bi()
A2   >x=[], wei=[], bia=[], v=0.0625
B2   >mode=to(0,9)
A3   >wei=to(28*28).(0), bia=0.01
A4   fork mode
B4   =A1.cursor()
B5   for 50000
C5   >label=B4.fetch(1)(1)
D5   >y=1, x=[]
C6   if label!=A4
D6   >y=0
C7   >x.insert(0,B4.fetch(28*28))
D7   >x=x.(~/255)
C8   =(wei**x).sum()+bia
C9   =1/(1+exp(-C8))
C10  =(C9-y)*(1-C9)*C9
C11  >wei=wei--x.(~*v*C10), bia=bia-v*C10
B12  return wei, bia
A13  =movefile(file("MNIST model.btx"))
A14  for 10
B14  >file("MNIST model.btx").export@ba([A4(A14)(1)])
A15  for 10
B15  >file("MNIST model.btx").export@ba([A4(A15)(2)])

With parallelism, training time is cut almost in half, and the code barely changes.
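
The fork here runs one digit's model per task. A comparable, though not identical, approach in Python is to train each of the 10 models in its own process; below is a minimal sketch with the multiprocessing module, assuming the samples are stored in .npy files whose names are made up for the example:

    import numpy as np
    from multiprocessing import Pool

    def train_one(args):
        # train the model for a single digit, as each fork branch does
        digit, images, labels = args
        wei, bia, v = np.zeros(784), 0.01, 0.0625
        for x, label in zip(images, labels):
            y = 1.0 if label == digit else 0.0
            out = 1.0 / (1.0 + np.exp(-(wei @ x + bia)))
            g = (out - y) * (1.0 - out) * out
            wei -= v * g * x
            bia -= v * g
        return wei, bia

    if __name__ == "__main__":
        images = np.load("train-imgs.npy")    # hypothetical converted files
        labels = np.load("train-labels.npy")
        with Pool(10) as p:                   # one process per digit model
            models = p.map(train_one, [(d, images, labels) for d in range(10)])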

2 Why the SPL language?

SPL may feel a little awkward at first, but the more you use it, the more convenient it becomes:

1. It supports set operations. For example, the member-wise multiplication of 784 inputs by 784 weights in the example is written directly with **; in Java or C you would have to implement it yourself.

2. Data input and output are very convenient; files are easy to read and write.

3. Debugging is very convenient: all variables are visible, which is easier than in Python.

4. Calculation can be done step by step, and after a change you do not have to start over. Java and C cannot do this; Python can, but not as conveniently. In the aggregator you just click the cell you want and execute it.

5. Parallelism and clustering are easy to implement, without much development effort.

6. It supports both calling and being called. The aggregator can call third-party Java libraries, and Java can call aggregator code; for example, the code above could be called from Java to automatically fill in digit CAPTCHAs.

A programming language like this is well suited to mathematical calculation.

Thank you for reading! This is the end of the article on how to implement handwritten digit recognition in SPL. I hope it has been of some help; if you think the article is good, share it so more people can see it!
