How to use an API for image text recognition in Node.js

Updated 2025-07-15. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article shows how to use an API in Node.js to recognize text in pictures. The content is straightforward and easy to follow, and I hope it answers your questions. Let's work through it together.

Structure of the project:

Let's first look at the purpose of each folder:

dao: database layer business logic
db: basic methods that encapsulate mysql, such as initializing and updating
doc: interface documents that ApiDoc generates automatically from interface annotations
node_modules: third-party packages
public: static resources
router: interface routing layer, which holds the business logic
util: commonly used helper methods, such as signature encryption

This article tests only the interfaces that can be used without applying for extra permission. Let's look at which kinds of text recognition interfaces are available:

Every call to the API must carry the access_token parameter, so the first step is to obtain an access_token. Let's look at what the document requires for access_token:

The document is clear, so let's go straight to the code for obtaining the access_token. First, configure client_id and client_secret in config.js:
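The original code screenshot is not available here, so the following is only a minimal sketch of what such a config.js might look like. The environment variable names and the fallback strings are placeholders I made up; use the values issued for your own application.

```javascript
// config.js (sketch): credentials used for the access_token request.
// OCR_CLIENT_ID / OCR_CLIENT_SECRET and the fallback strings are placeholders.
const config = {
  client_id: process.env.OCR_CLIENT_ID || 'your-client-id',
  client_secret: process.env.OCR_CLIENT_SECRET || 'your-client-secret',
};

module.exports = config;
```

Reading the values from environment variables keeps the real secret out of source control.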

Create a postHelper.js file under the util folder to encapsulate the HTTP requests. Obtaining the access_token requires a POST request, so we first encapsulate a POST method whose Content-Type is application/x-www-form-urlencoded:
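A helper of this kind could look roughly like the sketch below. The article itself uses the request library; to keep the example free of dependencies, I use the global fetch that ships with Node.js 18 and later instead. The function names buildFormBody and postForm are my own.

```javascript
// util/postHelper.js (sketch, using built-in fetch instead of the request library)

// Serialize a flat object into an application/x-www-form-urlencoded body.
function buildFormBody(params) {
  const form = new URLSearchParams();
  for (const [key, value] of Object.entries(params)) {
    form.append(key, String(value));
  }
  return form.toString();
}

// POST the form and resolve with the parsed JSON response.
// Relies on the global fetch available in Node.js 18 and later.
async function postForm(url, params) {
  const response = await fetch(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: buildFormBody(params),
  });
  if (!response.ok) {
    throw new Error(`POST ${url} failed with status ${response.status}`);
  }
  return response.json();
}

module.exports = { buildFormBody, postForm };
```

Keeping the body encoding in its own function makes it easy to check the encoded output without sending a request.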

Next, implement an interface that gets the access_token. The previous article explained how to design an interface in detail, so the process for this one is:

1. Extract all required parameters together with the sign parameter.
2. Encrypt the parameters according to the rules to generate a signature (sign).
3. Send a POST request to obtain the access_token.

Now for the code. The business logic such as encryption and signature checking is implemented in the routing layer:
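The signing rules come from the previous article and are not repeated here, so the sketch below assumes a rule of my own purely for illustration: sort the parameter names, join them as key=value pairs, append a shared secret, and take the MD5 hex digest. Replace it with the actual rule your project uses.

```javascript
const { createHash } = require('crypto');

// Hypothetical signing rule (an assumption, not the article's actual rule):
// sort keys, join as key=value pairs with "&", append the secret, MD5 the result.
function signParams(params, secret) {
  const canonical = Object.keys(params)
    .filter((key) => key !== 'sign') // the sign itself is never part of the input
    .sort()
    .map((key) => `${key}=${params[key]}`)
    .join('&');
  return createHash('md5').update(canonical + secret).digest('hex');
}

// Compare the sign sent by the caller with the one computed locally.
function verifySign(params, secret) {
  return params.sign === signParams(params, secret);
}
```

In the routing layer, a request whose verifySign check fails would be rejected before anything is sent to the recognition API.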

The POST request to the access_token API is then executed in the dao layer.
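A dao-layer function for this step might look like the sketch below. TOKEN_URL is a placeholder rather than a real endpoint, and the grant_type value assumes the standard OAuth 2.0 client credentials grant; take both from the provider's document. The postForm argument stands for any helper that sends a form-encoded POST and returns the parsed JSON.

```javascript
// dao/token.js (sketch). TOKEN_URL is a placeholder, not a real endpoint.
const TOKEN_URL = 'https://example.com/oauth/2.0/token';

// Build the form fields for the token request from the config object.
function buildTokenForm(config) {
  return {
    grant_type: 'client_credentials', // assumed OAuth 2.0 grant type
    client_id: config.client_id,
    client_secret: config.client_secret,
  };
}

// postForm is injected: any async function (url, params) => parsed JSON.
async function requestAccessToken(config, postForm) {
  const result = await postForm(TOKEN_URL, buildTokenForm(config));
  if (!result || !result.access_token) {
    throw new Error('Token response did not contain access_token');
  }
  return result;
}
```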

Since we use the request library to send the request, we need to install the dependency first. The command is:

npm install request --save-dev

Now let's run the interface and look at the result:

You can see that the access_token was obtained successfully. Because the access_token expires, you can either request a new one when the old one expires or request a fresh one before every API call. Next, let's look at the first interface: general text recognition.
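Before moving on, here is a small sketch of the first option: reuse the token until it is close to expiry. It assumes the token response carries an expires_in field in seconds, which the article does not confirm, and the function names are my own.

```javascript
// Decide whether a cached token can still be used.
// cached: { access_token, expiresAt }, where expiresAt is a millisecond timestamp.
function isTokenFresh(cached, nowMs, safetyMarginMs = 60 * 1000) {
  return Boolean(cached) && cached.expiresAt - safetyMarginMs > nowMs;
}

// Turn a token response into a cache entry.
// expires_in (seconds) is an assumed field name.
function toCacheEntry(tokenResponse, nowMs) {
  return {
    access_token: tokenResponse.access_token,
    expiresAt: nowMs + tokenResponse.expires_in * 1000,
  };
}

// Return a function that reuses the cached token until it is close to expiry.
// fetchToken is any async function that resolves with the token response.
function createTokenProvider(fetchToken, now = Date.now) {
  let cached = null;
  return async function getAccessToken() {
    if (!isTokenFresh(cached, now())) {
      cached = toCacheEntry(await fetchToken(), now());
    }
    return cached.access_token;
  };
}
```

The safety margin refreshes the token slightly early, so a request that starts just before expiry does not fail halfway.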

General text recognition interface

First, the API description from the document:

And the request parameters this API uses:

The interface is simple: upload a picture and it parses the text. The picture can be passed as BASE64 data or as a URL. I test with a URL here. The business logic code is:

There are two things to pay special attention to in this interface:

If the image parameter is present, the url parameter has no effect.
The url parameter does not support https, that is, a picture URL that uses the https protocol cannot be parsed.
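Both rules can be enforced before the request is sent. The following is a sketch under my own assumptions: the endpoint is passed in by the caller, the access_token travels as a query parameter, and the function name buildOcrRequest is hypothetical.

```javascript
// Build the URL and form fields for a text recognition call.
// endpoint is whatever URL the API document lists for the interface.
// Sending access_token as a query parameter is an assumption.
function buildOcrRequest(endpoint, accessToken, { image, url } = {}) {
  const form = {};
  if (image) {
    // image wins: when it is present, the url parameter has no effect.
    form.image = image;
  } else if (url) {
    if (url.toLowerCase().startsWith('https://')) {
      throw new Error('https image URLs are not supported by this interface');
    }
    form.url = url;
  } else {
    throw new Error('Either image (base64) or url is required');
  }
  return {
    url: `${endpoint}?access_token=${encodeURIComponent(accessToken)}`,
    form,
  };
}
```

Failing early on an https URL gives a clearer error than waiting for the API to reject the picture.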

Let's look at the result the API returns:

The description of the return parameters explains what each field in the response means:

You can see that the text in the image was parsed into two sentences. This API also accepts optional parameters. According to the request parameter description, you can choose the language type to recognize and whether to detect the orientation of the image. I do not test the optional parameters here, so try them yourself if you are interested. Let's look at the next interface: general text recognition (with location information).
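Before that, here is a sketch of how the recognized sentences might be pulled out of the response. The field names words_result and words are assumptions on my part, since the return parameter table is not reproduced here; adjust them to match the actual response.

```javascript
// Collect the recognized lines of text from a recognition response.
// words_result and words are assumed field names.
function extractLines(response) {
  const items = Array.isArray(response && response.words_result)
    ? response.words_result
    : [];
  return items.map((item) => item.words);
}
```

Returning an empty array for a missing or malformed result keeps the calling code free of null checks.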

General text recognition (with location information)

As the title suggests, the difference from the previous API is that this one also returns the location of the text in the image. First, the API description:

Let's implement this interface directly. Here is the code:

Let's look at the result:

You can see that each item in the parsed array includes the position of the text relative to the picture. Here is the description of the return parameters:

The API also accepts optional parameters such as recognize_granularity, which controls whether the positions of individual characters are located. I won't explain the optional parameters further; readers can explore them on their own. Next, let's look at the next interface: handwritten character recognition.
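As for the position information itself, a sketch of how it might be used is shown below. It assumes each item carries a location object with left, top, width and height in pixels. Those field names are my assumption, so check them against the return parameter description.

```javascript
// Convert a location rectangle into corner coordinates.
// left, top, width and height are assumed field names (pixels).
function toCorners(location) {
  const { left, top, width, height } = location;
  return {
    topLeft: { x: left, y: top },
    bottomRight: { x: left + width, y: top + height },
  };
}

// Return a copy of the items sorted top to bottom, then left to right.
function sortByPosition(items) {
  return [...items].sort(
    (a, b) => a.location.top - b.location.top || a.location.left - b.location.left
  );
}
```

Sorting by position is handy when the lines need to be put back into reading order.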

Handwritten character recognition

This API recognizes handwritten Chinese characters and numbers in pictures. First, the API description:

I use the following picture for handwriting recognition:

Let's go straight to the code:

The picture must first be base64-encoded and then submitted to the API. In this example I read a local image and encode it as base64. Let's test the interface:

The response contains the text parsed from the handwritten picture. Here is the description of the return parameters:
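Reading a local file and encoding it as base64 takes only a few lines with Node's built-in fs module. This is a minimal sketch; the function name is my own.

```javascript
const { readFileSync } = require('fs');

// Read a local image and return its contents as a base64 string,
// without any "data:image/...;base64," prefix.
function encodeImageToBase64(filePath) {
  return readFileSync(filePath).toString('base64');
}
```

The resulting string is what goes into the image parameter of the request form.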

ID card recognition

This interface supports structured recognition of all fields on the front and back of the second-generation ID card of mainland residents, including name, sex, ethnicity, date of birth, address, ID number, issuing authority, and validity period. It also supports risk and quality detection for ID card pictures uploaded by users: it can identify whether the picture is a photocopy or a temporary ID card, whether it has been re-photographed or edited, and whether it has quality problems such as being upside down, blurred, underexposed, or overexposed. First, the interface description:

Here is the interface code:

In addition to the base64-encoded image, this API requires a parameter that specifies whether the picture shows the front or the back of the ID card. Let's test the interface:

The return parameters are listed below to help readers understand the fields in the response:
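Building the form for this call could look like the sketch below. The parameter name id_card_side and its values 'front' and 'back' are assumptions, since the request parameter table is not reproduced here; check them against the interface description.

```javascript
// Allowed values for the side parameter (assumed values).
const ID_CARD_SIDES = ['front', 'back'];

// Build the form fields for the ID card interface.
// id_card_side is an assumed parameter name.
function buildIdCardForm(imageBase64, side) {
  if (!imageBase64) {
    throw new Error('image (base64) is required');
  }
  if (!ID_CARD_SIDES.includes(side)) {
    throw new Error(`side must be one of: ${ID_CARD_SIDES.join(', ')}`);
  }
  return { image: imageBase64, id_card_side: side };
}
```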

That is everything on how to use an API for image text recognition in Node.js. Thank you for reading, and I hope it helps.
