How to implement image character recognition in Python

2025-07-15 Update From: SLTechnology News&Howtos, Shulou (Shulou.com), 06/02 report

This article shows how to implement optical character recognition (OCR) on images in Python. The content is simple and clearly organized, and I hope it answers your questions.

preface

What is OCR?

Optical character recognition (OCR) refers to the process of analyzing and recognizing image files of text data to obtain text and layout information. In short, text data in images is detected and the content of the text is identified.

So what are the application scenarios?

In fact, OCR shows up everywhere in daily life: reading ID cards to enter personal information during the epidemic, recognizing vehicle license plates, autonomous driving, and so on. Machine learning is playing an increasingly important role in our lives and is no longer a mystery.

What is the technical approach behind OCR?

An OCR pipeline runs as follows: input -> image preprocessing -> text detection -> text recognition -> output.
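As a rough illustration of that flow, the sketch below chains the stages as plain functions. Everything in it is hypothetical: the function names, the box format, and the toy "image" (a list of strings) are stand-ins so the example runs without any OCR library. The point is only that detection finds where the text is and recognition then reads each detected region.

```python
def run_ocr_pipeline(image, preprocess, detect, recognize):
    """Chain the stages: preprocessing -> text detection -> text recognition."""
    cleaned = preprocess(image)
    boxes = detect(cleaned)  # where is the text?
    return [(box, recognize(cleaned, box)) for box in boxes]  # what does it say?


# Toy stand-ins (assumptions for illustration only): the "image" is a list of
# strings and every non-blank row counts as one text region.
def toy_preprocess(image):
    return [row.rstrip() for row in image]


def toy_detect(image):
    # Box format here is (left, top, right, bottom), chosen for this sketch.
    return [(0, y, len(row), y + 1) for y, row in enumerate(image) if row.strip()]


def toy_recognize(image, box):
    left, top, right, _bottom = box
    return image[top][left:right]


toy_image = ["hello   ", "", "world "]
print(run_ocr_pipeline(toy_image, toy_preprocess, toy_detect, toy_recognize))
# prints [((0, 0, 5, 1), 'hello'), ((0, 2, 5, 3), 'world')]
```

In a real OCR library the detection and recognition stages are neural network models, but the hand-off between them has the same shape.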

Below I walk through how I tried out and verified the project, in the order I worked through it.

project structure

First let's look at the structure of the project.

The project ships with Chinese documentation, which is very convenient. Open it and follow the official instructions to get started.

environment deployment

Open README.md. According to the tutorial in the documentation, the first step is to set up the environment.

Because there is a lot of material, I summarize it here so that you can get started directly.

1. Install Anaconda and create a virtual environment

For this step you can refer to my other article, which is very detailed: Python Machine Learning Chapter 1 Environmental Configuration Diagram Flow

The official documentation uses a Python 3.8 virtual environment, so we create one too. Open Anaconda Prompt.

Enter command:

conda create -n paddle_env python=3.8

Activate the environment:

conda activate paddle_env

2. Download the dependency packages

paddlepaddle installation

pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple

layoutparser installation

pip3 install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl

Shapely installation. The wheel file has to be downloaded manually first; download address: Shapely download address

I chose the build for Python 3.8 on 64-bit Windows.

Installation command:

pip install Shapely-1.8.0-cp38-cp38-win_amd64.whl

paddleocr installation

pip install paddleocr -i https://mirror.baidu.com/pypi/simple

OK, there is quite a lot to set up, but once everything is installed we can start using it.
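Before moving on, it can help to confirm that the packages are visible inside the activated environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the import names in `REQUIRED` are the ones I assume these packages expose (`paddlepaddle` is imported as `paddle`, `Shapely` as `shapely`).

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Assumed import names for the packages installed above.
REQUIRED = ["paddle", "paddleocr", "shapely", "layoutparser"]

missing = missing_packages(REQUIRED)
print("all packages found" if not missing else f"missing: {missing}")
```

`find_spec` only looks a top-level module up without importing it, so the check is fast, but it does not prove that the package actually imports without errors.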

test code

The official documentation offers two modes: running from the command line and calling from code. To make the configuration easier to see, I use the code approach here.

Prepare a picture with text

The test code is as follows

```python
#!/usr/bin/env python
# coding=utf-8
"""
@project : ocr_paddle
@author : huyi
@file : test.py
@ide : PyCharm
@time : 2021-11-15 14:56:20
"""
from paddleocr import PaddleOCR, draw_ocr

# The languages currently supported by PaddleOCR can be switched by changing
# the lang parameter, e.g. `ch`, `en`, `fr`, `german`, `korean`, `japan`
ocr = PaddleOCR(use_angle_cls=True, use_gpu=False, lang="ch")  # need to run only once to download and load model into memory
img_path = './data/2.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    # print(line[-1][0], line[-1][1])
    print(line)

# Display results
from PIL import Image

image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
```
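The test code indexes each entry of `result` as `[box, (text, score)]`. As far as I know, newer PaddleOCR releases wrap these lines in one extra list per input image, so the same indexing can fail after an upgrade. The helper below is a small sketch that accepts either shape and returns plain dicts; `flatten_ocr_result` and the sample data are hypothetical and not part of the library.

```python
def _is_line(item):
    """True if item looks like one recognized line: [box, (text, score)]."""
    return (
        isinstance(item, (list, tuple)) and len(item) == 2
        and isinstance(item[1], (list, tuple)) and len(item[1]) == 2
        and isinstance(item[1][0], str)
    )


def flatten_ocr_result(result):
    """Normalize a flat or per-image nested result into a list of dicts."""
    records = []
    for item in result or []:
        if item is None:  # assumed: an image with no detected text
            continue
        lines = [item] if _is_line(item) else item
        for box, (text, score) in lines:
            records.append({"box": box, "text": text, "score": float(score)})
    return records


# Made-up sample data in both shapes, for illustration only.
box = [[10.0, 10.0], [90.0, 10.0], [90.0, 30.0], [10.0, 30.0]]
flat = [[box, ("hello", 0.98)], [box, ("world", 0.91)]]
nested = [flat]

print([r["text"] for r in flatten_ocr_result(flat)])
print([r["text"] for r in flatten_ocr_result(nested)])
```

With the records in this form, the `boxes`, `txts` and `scores` lists passed to `draw_ocr` can be built the same way regardless of which shape the installed version returns.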

code notes

1. Because my computer does not have a graphics card, I set use_gpu=False.

2. The display-results part draws a box around each piece of recognized text and shows the recognition result next to it.

verification

The printed output contains, for each recognized line, its position in the image, the recognized text, and a confidence score. In the result image, each recognized line of text is boxed. The result is very good!

additional parameters

The official documentation also lists some parameters that adjust what is output. See the quickstart.md file. Additional parameters:

- Use detection alone: set `--rec` to `false`

- Use recognition alone: set `--det` to `false`
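For the command-line mode, these two switches can be assembled from Python as in the sketch below. Only `--det` and `--rec` come from the list above; the `paddleocr` entry point name and the `--image_dir` and `--lang` flags are my recollection of quickstart.md and should be checked against the installed version. The helper `build_paddleocr_command` is hypothetical.

```python
import shlex


def build_paddleocr_command(image_path, det=True, rec=True, lang="ch"):
    """Build the argument list for the paddleocr command-line tool."""
    # `--image_dir` and `--lang` are assumed flag names, see the note above.
    cmd = ["paddleocr", "--image_dir", image_path, "--lang", lang]
    if not det:
        cmd += ["--det", "false"]  # recognition only
    if not rec:
        cmd += ["--rec", "false"]  # detection only
    return cmd


# Detection only: text boxes are reported, but the text is not read.
print(shlex.join(build_paddleocr_command("./data/2.jpg", rec=False)))
# prints: paddleocr --image_dir ./data/2.jpg --lang ch --rec false
```

The returned list can be passed to `subprocess.run` as is; building a list instead of a single string avoids quoting problems with paths that contain spaces.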

The official project also provides structured, JSON-like output data.

The return value of PP-Structure is a list of dicts. An example:

```shell
[
  { 'type': 'Text',
    'bbox': [34, 432, 345, 462],
    'res': ([[36.0, 437.0, 341.0, 437.0, 341.0, 446.0, 36.0, 447.0], [41.0, 454.0, 125.0, 453.0, 125.0, 459.0, 41.0, 460.0]],
            [('Tigure-6. The performance of CNN and IPT models using difforen', 0.90060663), ('Tent ', 0.465441)])
  }
]
```
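To show how such a result might be consumed, here is a minimal sketch that collects the recognized strings from `Text` regions and drops low-confidence ones. It assumes exactly the `res` shape of the example above (a pair of box list and `(text, score)` list); other PP-Structure releases may return a different shape. The function name `extract_text` and the `min_score` threshold are hypothetical.

```python
def extract_text(regions, min_score=0.0):
    """Collect recognized strings from 'Text' regions of a PP-Structure result."""
    texts = []
    for region in regions:
        if region["type"] != "Text":
            continue  # other region types are skipped in this sketch
        _boxes, recognized = region["res"]
        texts.extend(text.strip() for text, score in recognized if score >= min_score)
    return texts


# The example result from above, copied as a Python literal.
sample = [{
    "type": "Text",
    "bbox": [34, 432, 345, 462],
    "res": (
        [[36.0, 437.0, 341.0, 437.0, 341.0, 446.0, 36.0, 447.0],
         [41.0, 454.0, 125.0, 453.0, 125.0, 459.0, 41.0, 460.0]],
        [("Tigure-6. The performance of CNN and IPT models using difforen", 0.90060663),
         ("Tent ", 0.465441)],
    ),
}]

print(extract_text(sample, min_score=0.5))
# prints ['Tigure-6. The performance of CNN and IPT models using difforen']
```

With a threshold of 0.5 the second fragment (score 0.465441) is dropped, which shows how the confidence score can be used to filter out weak recognitions.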

That is all for how to implement image optical character recognition in Python. Thank you for reading! I hope the content shared here gives you a working understanding and helps you.
