How to write memory-efficient applications with Node.js

This article introduces how to write memory-efficient applications with Node.js. It works through a practical case that many people run into, so you can see how to handle these situations yourself. I hope you read it carefully and get something out of it!
Problem: large file replication
If someone is asked to write a file copy program in NodeJS, they will quickly come up with something like this:
const fs = require('fs');

let fileName = process.argv[2];
let destPath = process.argv[3];

fs.readFile(fileName, (err, data) => {
    if (err) throw err;
    fs.writeFile(destPath || 'output', data, (err) => {
        if (err) throw err;
    });
    console.log('New file has been created');
});
This code simply reads the file at the given path and writes it to the target path, which is not a problem for small files.
Now suppose we have a large file (more than 4 GB) that needs to be backed up with this program. Take one of my 7.4 GB UHD 4K movies as an example: I used the code above to copy it from the current directory to another directory.
$ node basic_copy.js cartoonMovie.mkv ~/Documents/bigMovie.mkv
Then I got this error message under the Ubuntu (Linux) system:
/home/shobarani/Workspace/basic_copy.js:7
    if (err) throw err;
             ^
RangeError: File size is greater than possible Buffer: 0x7fffffff bytes
    at FSReqWrap.readFileAfterStat [as oncomplete] (fs.js:453:11)
As you can see, the error occurs while reading the file because NodeJS only allows a maximum of about 2 GB of data to be held in a single buffer. To avoid this, you should keep memory in mind whenever you do I/O-intensive operations (copying, processing, compression, and so on).
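If you want to check this limit on your own machine, the built-in buffer module exposes it as a constant. This is just a quick check, not part of the copy program, and the exact value depends on your Node version and platform:

const buffer = require('buffer');

// The largest Buffer the runtime will allocate. On the Node build used in
// this article it is 0x7fffffff bytes (about 2 GB); newer releases allow more.
console.log(buffer.constants.MAX_LENGTH);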
Streams and Buffers in NodeJS
To solve the problem above, we need a way to cut a large file into many chunks, and we need a data structure to hold those chunks. A buffer is the structure that stores binary data. We also need a way to read and write the chunks, and streams provide that ability.
Buffers
We can easily create a buffer using the Buffer object.
let buffer = new Buffer(10); // 10 is the size of the buffer
console.log(buffer); // prints <Buffer 00 00 00 00 00 00 00 00 00 00>
In newer versions of NodeJS (> 8) you can also write it like this:
let buffer = Buffer.alloc(10);
console.log(buffer); // prints <Buffer 00 00 00 00 00 00 00 00 00 00>
If we already have some data, such as an array or another data set, we can create a buffer from it.
let name = 'Node JS DEV';
let buffer = Buffer.from(name);
console.log(buffer); // prints <Buffer 4e 6f 64 65 20 4a 53 20 44 45 56>
Buffers have some important methods, such as buffer.toString() and buffer.toJSON(), that let you inspect the data they store.
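As a quick illustration, here is a minimal sketch that inspects the buffer created above with both methods:

let buffer = Buffer.from('Node JS DEV');

console.log(buffer.toString()); // prints: Node JS DEV
console.log(buffer.toJSON());
// prints: { type: 'Buffer',
//   data: [ 78, 111, 100, 101, 32, 74, 83, 32, 68, 69, 86 ] }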
To optimize our code, we are not going to create raw buffers directly. NodeJS and the V8 engine already do this for us, creating internal buffers (queues) when dealing with streams and network sockets.
Streams
Simply put, a stream is like a doorway on a NodeJS object. In computer networking, ingress is the input action and egress is the output action; we will keep using these terms below.
There are four types of streams:
Readable stream (for reading data)
Writable stream (for writing data)
Duplex stream (can be used for both reading and writing)
Transform stream (a kind of duplex stream that processes the data passing through it, for example compressing or validating it; see the sketch below)
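To make the last type concrete, here is a minimal sketch of a transform stream that upper-cases whatever passes through it; the class name UpperCase is just an illustrative choice, not something from the article's code.

const { Transform } = require('stream');

// A transform stream is a duplex stream: data written to it is processed
// and then pushed out on its readable side.
class UpperCase extends Transform {
    _transform(chunk, encoding, callback) {
        this.push(chunk.toString().toUpperCase());
        callback();
    }
}

// Upper-case everything typed on stdin and echo it to stdout.
process.stdin.pipe(new UpperCase()).pipe(process.stdout);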
The following sentence clearly explains why we should use streams.
A key goal of the stream API, particularly the stream.pipe() method, is to limit the buffering of data to acceptable levels so that sources and destinations of differing speeds will not overwhelm the available memory.
We need a way to accomplish the task without overwhelming the system, which is exactly the problem we raised at the beginning of the article.
In the diagram above there are two types of streams: readable and writable. The .pipe() method is the most basic way to connect a readable stream to a writable one. If the diagram is not clear to you yet, don't worry: after working through the examples below you can come back to it and everything will fall into place. Piping is a compelling mechanism, and here are two examples to illustrate it.
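For a minimal taste of piping before the real solutions, this one-liner connects a readable stream (standard input) to a writable stream (standard output), echoing everything you type back to the terminal:

// Every chunk read from stdin is handed straight to stdout; press Ctrl+C to exit.
process.stdin.pipe(process.stdout);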
Solution 1 (simply use streams to copy files)
Let's design a solution for the large-file copy problem described above. First we create two streams, then follow the next few steps:
Listen for chunks of data from the readable stream
Write the chunks to the writable stream
Track the progress of the copy
Let's name this code streams_copy_basic.js
/*
    A file copy with streams and events - Author: Naren Arya
*/

const stream = require('stream');
const fs = require('fs');

let fileName = process.argv[2];
let destPath = process.argv[3];

const readable = fs.createReadStream(fileName);
const writable = fs.createWriteStream(destPath || "output");

fs.stat(fileName, (err, stats) => {
    this.fileSize = stats.size;
    this.counter = 1;
    this.fileArray = fileName.split('.');

    try {
        this.duplicate = destPath + "/" + this.fileArray[0] + '_Copy.' + this.fileArray[1];
    } catch (e) {
        console.error('File name is invalid! Please pass the proper one');
    }

    process.stdout.write(`File: ${this.duplicate} is being created: `);

    readable.on('data', (chunk) => {
        let percentageCopied = ((chunk.length * this.counter) / this.fileSize) * 100;
        process.stdout.clearLine();  // clear current text
        process.stdout.cursorTo(0);
        process.stdout.write(`${Math.round(percentageCopied)}%`);
        writable.write(chunk);
        this.counter += 1;
    });

    readable.on('end', (e) => {
        process.stdout.clearLine();  // clear current text
        process.stdout.cursorTo(0);
        process.stdout.write("Successfully finished the operation");
        return;
    });

    readable.on('error', (e) => {
        console.log("Some error occurred: ", e);
    });

    writable.on('finish', () => {
        console.log("Successfully created the file copy!");
    });
});
In this program, we take the two file paths (source and target) supplied by the user and create two streams to transport chunks of data from the readable stream to the writable stream. We then define some variables to track the progress of the copy and write it to the output (the console, in this case). At the same time we subscribe to a few events:
data: emitted whenever a chunk of data has been read
end: emitted when the readable stream has finished reading all the data
error: emitted when an error occurs while reading
Running this program successfully copies our large file (7.4 GB in this case).
$ time node streams_copy_basic.js cartoonMovie.mkv ~/Documents/4kdemo.mkv
However, when we look at the program's memory usage in the task manager while it runs, there is still a problem.
4.6 GB? That much runtime memory for our program makes no sense, and it is very likely to starve other applications.
What happened?
If you look closely at the reading and writing rate in the picture above, you will find some clues.
Disk Read: 53.4 MiB/s
Disk Write: 14.8 MiB/s
This means the producer is producing data faster than the consumer can keep up. To hold the chunks that have been read but not yet written, the computer keeps the excess data in RAM. That is what causes the RAM spike.
The above code ran on my machine for 3 minutes and 16 seconds.
17.16s user 25.06s system 21% cpu 3:16.61 total
Solution 2 (file copying with streams and automatic backpressure)
To overcome this problem, we can modify the program so that the disk read and write speeds adjust automatically. This mechanism is called backpressure. We don't have to do much: we just pipe the readable stream into the writable stream, and NodeJS takes care of the backpressure for us.
Let's name this program streams_copy_efficient.js
/*
    A file copy with streams and piping - Author: Naren Arya
*/

const stream = require('stream');
const fs = require('fs');

let fileName = process.argv[2];
let destPath = process.argv[3];

const readable = fs.createReadStream(fileName);
const writable = fs.createWriteStream(destPath || "output");

fs.stat(fileName, (err, stats) => {
    this.fileSize = stats.size;
    this.counter = 1;
    this.fileArray = fileName.split('.');

    try {
        this.duplicate = destPath + "/" + this.fileArray[0] + '_Copy.' + this.fileArray[1];
    } catch (e) {
        console.error('File name is invalid! Please pass the proper one');
    }

    process.stdout.write(`File: ${this.duplicate} is being created: `);

    readable.on('data', (chunk) => {
        let percentageCopied = ((chunk.length * this.counter) / this.fileSize) * 100;
        process.stdout.clearLine();  // clear current text
        process.stdout.cursorTo(0);
        process.stdout.write(`${Math.round(percentageCopied)}%`);
        this.counter += 1;
    });

    readable.pipe(writable); // Auto pilot ON!

    // In case we have an interruption while copying
    writable.on('unpipe', (e) => {
        process.stdout.write("Copy has failed!");
    });
});
In this example, we replace the previous block write operation with a single line of code.
readable.pipe(writable); // Auto pilot ON!
This pipe is where all the magic happens. It regulates the read and write speeds so that data never piles up and clogs memory (RAM).
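For the curious, here is a rough sketch of the kind of bookkeeping pipe() takes off your hands. This is not the code the article uses, just an illustration of reacting to write() returning false and waiting for the 'drain' event:

const fs = require('fs');

const readable = fs.createReadStream(process.argv[2]);
const writable = fs.createWriteStream(process.argv[3]);

readable.on('data', (chunk) => {
    // write() returns false when the writable's internal buffer is full.
    const canContinue = writable.write(chunk);
    if (!canContinue) {
        readable.pause();                                 // stop producing
        writable.once('drain', () => readable.resume());  // resume once flushed
    }
});

readable.on('end', () => writable.end());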
Run it.
$ time node streams_copy_efficient.js cartoonMovie.mkv ~/Documents/4kdemo.mkv
We copied the same large file (7.4 GB), so let's take a look at memory utilization.
What a shock! The Node program now takes up only 61.9 MiB of memory. And if you look at the read and write rates:
Disk Read: 35.5 MiB/s
Disk Write: 35.5 MiB/s
At any given time the read and write rates stay consistent, thanks to backpressure. What is even more surprising is that the optimized code runs 13 seconds faster than the previous version.
12.13s user 28.50s system 22% cpu 3:03.35 total
Thanks to NodeJS streams and piping, the memory load drops by 98.68% and the execution time shrinks as well. That is why pipes are such a powerful mechanism.
The 61.9 MiB corresponds to the buffering done by the readable stream. We can also use the read method on the readable stream to request chunks of a custom size:
const readable = fs.createReadStream(fileName);
readable.read(no_of_bytes_size);
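Another common way to control chunk size, as an alternative to the read() call above and not something the article's code uses, is to set the highWaterMark option when creating the stream:

// Ask the readable stream to buffer at most 1 MiB at a time.
const readable = fs.createReadStream(fileName, { highWaterMark: 1024 * 1024 });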
Besides copying local files, the same technique can be used to optimize many other I/O-bound problems:
Processing data streams that flow from Kafka into a database
Processing data streams from the file system, compressing them on the fly and writing them to disk (see the sketch after this list)
And more.
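For example, here is a minimal sketch of the second item above: reading a file, compressing it on the fly with the built-in zlib module, and writing the result to disk. The file names are just placeholders.

const fs = require('fs');
const zlib = require('zlib');

fs.createReadStream('access.log')
    .pipe(zlib.createGzip())                      // transform stream: compresses each chunk
    .pipe(fs.createWriteStream('access.log.gz'))  // write the compressed data to disk
    .on('finish', () => console.log('Compression finished'));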
This is the end of "how to write memory-efficient applications with Node.js". Thank you for reading.