How to understand the Buffer module in Node.js

2025-04-10 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how to understand the Buffer module in Node.js: its structure, how its memory is allocated, how buffers are concatenated and decoded, and how Buffer affects performance.

Understand Buffer

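JavaScript is very friendly to string manipulation, but Node.js also has to process binary data from files and network streams, and that is what Buffer is for.

Buffer is an Array-like object that is mainly used to manipulate bytes.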

Buffer structure

Buffer is a typical module that combines JavaScript and C++: the performance-critical parts are implemented in C++ and the rest in JavaScript.

The memory occupied by Buffer is not allocated by V8; it is off-heap memory. Because V8 garbage collection has a performance cost, it makes sense to manage such frequently used objects with a more efficient, dedicated allocation and reclamation strategy.

Buffer is already loaded when the Node process starts and is placed on the global object (global), so you can use Buffer without calling require.
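A quick check of the point about the global object, as a minimal sketch assuming a CommonJS script run with Node.js. The name ImportedBuffer is only an illustrative alias.

```javascript
// The explicit import and the global refer to the same constructor.
const { Buffer: ImportedBuffer } = require('buffer');

const isGlobalFunction = typeof Buffer === 'function';
const sameAsImported = ImportedBuffer === Buffer;
const onGlobalObject = global.Buffer === Buffer;

console.log(isGlobalFunction, sameAsImported, onGlobalObject);
```

Requiring the module explicitly is still allowed; it simply is not needed for ordinary use.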

Buffer object

The elements of a Buffer object are displayed as two-digit hexadecimal numbers, that is, values from 0 to 255.

let buf01 = Buffer.alloc(8);
console.log(buf01); // <Buffer 00 00 00 00 00 00 00 00>

You can use fill to populate the buffer with a value (strings are encoded as utf-8 by default). If the value is longer than the buffer, the excess is not written.

If the buffer is longer than the value, the value is repeated until the buffer is full.

If you want to clear previously filled content, call fill() with no arguments.

buf01.fill('12345678910');
console.log(buf01); // <Buffer 31 32 33 34 35 36 37 38>
console.log(buf01.toString()); // 12345678
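The three fill behaviours described above (truncation, repetition, clearing) can be sketched together. The variable names are illustrative, and fill(0) is used here as an explicit way to zero the buffer.

```javascript
const buf = Buffer.alloc(8);

// Value longer than the buffer: only the first 8 bytes are written.
buf.fill('12345678910');
const truncated = buf.toString();

// Value shorter than the buffer: it is repeated until the buffer is full.
buf.fill('abc');
const repeated = buf.toString();

// Explicitly zero every byte again.
buf.fill(0);
const cleared = buf.every((byte) => byte === 0);

console.log(truncated, repeated, cleared);
```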

If the content is Chinese, then under utf-8 each Chinese character occupies three elements (bytes), while letters and half-width punctuation occupy one element each.

let buf02 = Buffer.alloc(18, '开始我的旅程', 'utf-8'); // illustrative stand-in string: 6 Chinese characters, 18 bytes
console.log(buf02.toString()); // 开始我的旅程

Buffer is heavily influenced by the Array type: you can read the length property to get its length, access elements by index, and find the position of a value with indexOf.

console.log(buf02); // <Buffer e5 bc 80 e5 a7 8b e6 88 91 e7 9a 84 e6 97 85 e7 a8 8b>
console.log(buf02.length); // 18 (bytes)
console.log(buf02[6]); // 230: 0xe6 is 230 in decimal
console.log(buf02.indexOf('我')); // 6: the byte offset where '我' starts
console.log(buf02.slice(6, 9).toString()); // 我: bytes e6 88 91 decode to '我'
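To see the difference between character count and byte count directly, here is a small sketch using Buffer.byteLength. The sample strings are chosen only for illustration.

```javascript
// Compare string length (UTF-16 code units) with UTF-8 byte length.
const samples = ['我', 'a', ',', '我a,'];
const byteLengths = samples.map((s) => Buffer.byteLength(s, 'utf8'));

samples.forEach((s, i) => {
  console.log(JSON.stringify(s), 'chars:', s.length, 'bytes:', byteLengths[i]);
});
```

This is why length on a Buffer reports bytes rather than characters.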

If the value assigned to an element is not an integer between 0 and 255, it is adjusted as follows. If it is less than 0, 256 is added repeatedly until the result is an integer between 0 and 255. If it is greater than 255, 256 is subtracted repeatedly until the result falls in that range. If it has a fractional part, the fractional part is discarded (not rounded).
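The wrap-around rule can be checked with a one-byte buffer. This is a minimal sketch; the helper name store is hypothetical.

```javascript
const cell = Buffer.alloc(1);

// Write a value into the single byte and read back what was actually stored.
function store(value) {
  cell[0] = value;
  return cell[0];
}

const results = {
  minusOne: store(-1),      // below 0: wraps upward by 256
  minusHundred: store(-100),
  twoFiftySix: store(256),  // above 255: wraps downward by 256
  threeHundred: store(300),
  fractional: store(3.7),   // fraction is dropped, not rounded
};

console.log(results);
```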

Buffer memory allocation

The memory of a Buffer object is not allocated in the V8 heap; the memory request is made at the C++ level of Node. When processing large amounts of byte data, it is not practical to ask the operating system for a small piece of memory each time one is needed. For this reason, Node requests memory at the C++ level and allocates it at the JavaScript level.

Node uses a slab allocation mechanism. Slab is a dynamic memory management mechanism that is widely used in *nix operating systems such as Linux.

A slab is a fixed-size memory region that has already been requested. A slab is in one of three states:

full: fully allocated

partial: partially allocated

empty: not allocated

Node uses 8KB as a boundary to distinguish whether Buffer is a large object or a small object

console.log(Buffer.poolSize); // 8192

This 8 KB value is the size of each slab. At the JavaScript level, memory is allocated in units of this size.

Allocating a small buffer object

If the requested Buffer size is less than 8 KB, Node allocates it as a small object. (In current Node.js releases the pool is used by Buffer.allocUnsafe(), Buffer.from(array), Buffer.from(string) and Buffer.concat(), and only for sizes below half of Buffer.poolSize; Buffer.alloc() never uses it. The steps below describe the mechanism in terms of the 8 KB slab.)

A new slab unit is constructed. The slab is currently in the empty state.

A small buffer object of 1024 bytes is constructed. It occupies 1024 bytes of the current slab, and the offset up to which the slab has been used is recorded.

Next, another buffer object of 3072 bytes is created. The construction process checks whether the remaining space in the current slab is sufficient; if so, it uses that space and updates the slab's allocation state. After the 3072 bytes are used, 4096 bytes remain in the current slab.

If a 6144-byte buffer is created at this point, the current slab does not have enough space, so a new slab is constructed (the remaining space in the original slab is wasted).
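The walkthrough above can be modelled with a toy bump allocator. This is only an illustration of the bookkeeping, not Node's actual implementation; ToySlabPool, SLAB_SIZE, and alloc are hypothetical names.

```javascript
// Toy model of the slab walkthrough; not Node's real allocator.
const SLAB_SIZE = 8 * 1024;

class ToySlabPool {
  constructor() {
    this.slabIndex = 0; // which slab is current
    this.used = 0;      // bytes already handed out from the current slab
  }

  // Returns where a small request of `size` bytes would be placed.
  alloc(size) {
    if (size > SLAB_SIZE - this.used) {
      // Not enough room: start a new slab; the leftover space is wasted.
      this.slabIndex += 1;
      this.used = 0;
    }
    const placement = { slab: this.slabIndex, offset: this.used };
    this.used += size;
    return placement;
  }
}

const pool = new ToySlabPool();
const first = pool.alloc(1024);
const second = pool.alloc(3072);
const third = pool.alloc(6144);

console.log(first, second, third);
```

The third request lands in a new slab because only 4096 bytes remain in the first one.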

For example:

Buffer.allocUnsafe(1);
Buffer.allocUnsafe(8192);

The first slab holds only a 1-byte buffer object; the second buffer object does not fit in the remaining space and is given separate memory.

Because one slab may be shared by several Buffer objects, the slab's memory is reclaimed only when all of those small buffer objects have gone out of scope and can be collected. Even if only a 1-byte buffer object was created, the full 8 KB is not released as long as that object is still alive.
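One way to observe the shared slab is to look at the ArrayBuffer behind each Buffer. This is a sketch that assumes a recent Node.js release with default settings, where small Buffer.allocUnsafe() requests are served from the pool and Buffer.alloc() is not.

```javascript
const small = Buffer.allocUnsafe(1);    // small request: served from the pool
const large = Buffer.allocUnsafe(8192); // too large for the pool: own memory
const zeroed = Buffer.alloc(1);         // Buffer.alloc does not use the pool

// A pooled buffer is a 1-byte view onto a backing store of poolSize bytes.
const smallBacking = small.buffer.byteLength;
const sharesMemory = small.buffer === large.buffer;
const zeroedBacking = zeroed.buffer.byteLength;

console.log(Buffer.poolSize, smallBacking, sharesMemory, zeroedBacking);
```

Keeping a reference to small therefore keeps the whole backing store alive, which is the retention effect described above.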

Summary:

The actual memory is provided at the C++ level of Node; the JavaScript level only uses it. For small, frequent Buffer operations, the slab mechanism requests memory in advance and allocates it afterwards, so that JavaScript does not trigger too many memory-request system calls to the operating system. For large buffers, the memory provided by the C++ layer is used directly, with no fine-grained allocation needed.

Buffer concatenation

In practice, buffers are usually transmitted in segments.

const fs = require('fs');
let rs = fs.createReadStream('./quiet night thinking.txt', { flags: 'r' });
let str = '';
rs.on('data', (chunk) => {
  str += chunk;
});
rs.on('end', () => {
  console.log(str);
});

The above is an example of reading from a stream; the chunk object received in the data event is a Buffer object.

However, when the input stream contains wide-byte encoded text (one character occupies multiple bytes), a problem appears. The statement str += chunk hides a toString() call; it is equivalent to str = str.toString() + chunk.toString().

The following limits the buffer length of each read of the readable stream to 11 bytes.

let rs = fs.createReadStream('./quiet night thinking.txt', { flags: 'r', highWaterMark: 11 });

The output contains garbled characters, because the buffer length is limited to 11 bytes and some three-byte characters are split across two chunks. A wide-byte string can be truncated with a buffer of any length, but the longer the buffer, the lower the probability.
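The truncation can be reproduced without a file by splitting a buffer at an 11-byte boundary by hand. This is a sketch with an illustrative sample string; Buffer.concat is shown only as one common way to join the raw chunks before decoding.

```javascript
const text = '窗前明月光';
const bytes = Buffer.from(text); // 5 characters, 15 bytes in UTF-8

// Split at byte 11, which falls in the middle of a 3-byte character.
const chunkA = bytes.subarray(0, 11);
const chunkB = bytes.subarray(11);

// Decoding each chunk separately damages the split character.
const naive = chunkA.toString() + chunkB.toString();

// Joining the bytes first and decoding once keeps the text intact.
const joined = Buffer.concat([chunkA, chunkB]).toString();

console.log(naive === text, joined === text);
```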

Encoding

However, if encoding is set to utf-8, this problem will not occur.

let rs = fs.createReadStream('./quiet night thinking.txt', { flags: 'r', highWaterMark: 11, encoding: 'utf-8' });

Reason: the data event fires the same number of times whether or not an encoding is set, but when setEncoding is called (or the encoding option is given), the readable stream creates a decoder object internally. In each data event, the buffer is decoded into a string by that decoder object before being passed to the caller.

The string_decoder module provides an API for decoding Buffer objects into strings in a manner that preserves encoded multibyte UTF-8 and UTF-16 characters.

const { StringDecoder } = require('string_decoder');
let s1 = Buffer.from([0xe7, 0xaa, 0x97, 0xe5, 0x89, 0x8d, 0xe6, 0x98, 0x8e, 0xe6, 0x9c]);
let s2 = Buffer.from([0x88, 0xe5, 0x85, 0x89, 0xef, 0xbc, 0x8c, 0x0d, 0x0a, 0xe7, 0x96]);
// Decoding each chunk on its own garbles the characters split across the boundary.
console.log(s1.toString());
console.log(s2.toString());
console.log('-');
// The decoder holds back incomplete byte sequences until the next write.
const decoder = new StringDecoder('utf8');
console.log(decoder.write(s1)); // 窗前明
console.log(decoder.write(s2)); // 月光, followed by a line break

Once it knows the encoding, StringDecoder knows that these wide-byte characters are stored as 3 bytes each under utf-8. The first decoder.write therefore outputs only the characters decoded from the first 9 bytes; the last two bytes are retained inside the StringDecoder and combined with the bytes of the next write.
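To show that the decoder makes the chunk boundary irrelevant, the following sketch decodes the same bytes with several chunk sizes. The helper decodeInChunks and the sample line are illustrative, not part of any API.

```javascript
const { StringDecoder } = require('string_decoder');

// Decode `buf` as UTF-8, feeding the decoder `chunkSize` bytes at a time.
function decodeInChunks(buf, chunkSize) {
  const decoder = new StringDecoder('utf8');
  let out = '';
  for (let i = 0; i < buf.length; i += chunkSize) {
    out += decoder.write(buf.subarray(i, i + chunkSize));
  }
  // Flush whatever the decoder is still holding.
  return out + decoder.end();
}

const line = '窗前明月光,疑是地上霜。';
const lineBytes = Buffer.from(line);

const sizes = [1, 2, 5, 11];
const allMatch = sizes.every((n) => decodeInChunks(lineBytes, n) === line);

console.log(allMatch);
```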

Buffer and performance

Buffer is widely used in file I/O and network I/O, and its performance matters most in network transmission. Applications usually work with strings, but once data is sent over the network it must be converted to a Buffer and transmitted as binary data. In web applications, string-to-Buffer conversion happens constantly, so making that conversion more efficient can greatly improve network throughput.

If content is sent to the client as a plain string, performance is worse than sending a Buffer object, because the string has to be converted on every response while the Buffer does not. By pre-converting static content to Buffer objects, we can reduce repeated CPU work and save server resources.

You can separate the dynamic and static content of a page and pre-convert the static part to Buffer objects to improve performance.
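A minimal sketch of that separation, assuming a page made of two fixed fragments around one dynamic value. The names STATIC_HEAD, STATIC_TAIL, and renderPage are hypothetical.

```javascript
// Static fragments are converted to Buffers once, at startup.
const STATIC_HEAD = Buffer.from('<html><body><h1>Hello</h1><p>');
const STATIC_TAIL = Buffer.from('</p></body></html>');

// Only the dynamic part is converted on each request.
function renderPage(dynamicText) {
  return Buffer.concat([STATIC_HEAD, Buffer.from(dynamicText), STATIC_TAIL]);
}

const page = renderPage('visitor #42');
console.log(page.length, page.toString());
```

In an http handler the resulting Buffer could be passed straight to res.end(page); whether this is a measurable gain depends on the workload.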

When reading files, the highWaterMark setting has a critical impact on performance. Ideally, each read returns exactly the user-specified highWaterMark number of bytes.

The highWaterMark size affects performance in two ways:

It influences how buffer memory is allocated and used.

If it is set too small, it may cause too many system calls.
