
How to understand Node.js

2025-07-01 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/01 Report--

Many newcomers do not have a clear picture of how Node.js works. To help with that, the notes below go through it in detail, in outline form. I hope you gain something from them.

I. A brief introduction to Node

II. Module mechanism

a. CommonJS specification

1. Module reference: external modules are brought in with the require() method

2. Module definition: the exports object exports the methods and variables of the current module, and it is the only way to export

3. Module identifier: the argument passed to require() must be a camelCase string, or a relative path beginning with . or ..

b. Node's module implementation

1. For core modules and file modules alike, require() serves a module that has already been loaded from the cache first. The cache has the highest priority.

2. Lookup priority: core module > file module given as a path > custom module (the lookup path for a custom module is generated in a way very similar to how JS walks the prototype chain or scope chain)

3. When the identifier has no extension, Node tries .js, .json and .node in that order. Each attempt calls the fs module synchronously (blocking) to check whether the file exists, which can cause performance problems. For .node and .json files, pass the name to require() with the extension included.

4. Compilation of a JS module: the source is wrapped as (function (exports, require, module, __filename, __dirname) { ... })
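The wrapping step in item 4 can be imitated in a few lines. This is only a sketch: compileModule, the module source string and the path /tmp/circle.js are made up for illustration, and real Node also performs path resolution, caching and error handling that are left out here.

```javascript
const vm = require('vm');
const path = require('path');

// Simplified stand-in for the compile step of a JS module.
function compileModule(source, filename) {
  // Wrap the source so exports, require, module, __filename and
  // __dirname become ordinary function parameters.
  const wrapped =
    '(function (exports, require, module, __filename, __dirname) {' +
    source +
    '\n})';
  const compiled = vm.runInThisContext(wrapped, { filename });
  const module = { exports: {} };
  // This sketch simply hands over the host file's own require.
  compiled.call(module.exports, module.exports, require, module,
    filename, path.dirname(filename));
  return module.exports;
}

// Hypothetical module source: exports and __filename look like globals
// to the module but are really arguments of the wrapper.
const circle = compileModule(
  'exports.area = function (r) { return Math.PI * r * r; };' +
  'exports.file = __filename;',
  '/tmp/circle.js'
);

console.log(circle.area(2));
console.log(circle.file); // /tmp/circle.js
```

Because every module body runs inside its own function, variables declared in one module do not leak into another.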

c. Core modules

1. JS core modules

Node uses the js2c.py tool that ships with V8 to convert all built-in JS code into C++ arrays, generating the node_natives.h header file.

They differ from file modules in how the source code is obtained (core modules are loaded from memory) and in where the execution result is cached.

2. C/C++ core modules

The C++ part implements the core on the inside, and the JS part wraps it on the outside. Node's buffer, crypto, evals, fs, os and other modules are partly written in C/C++.

d. C/C++ extension modules

1. A typical weakness of JS is bit manipulation, which is inefficient.

e. Module call stack

1. Built-in modules are the lowest-level modules. Unless you know exactly which built-in module you need, avoid calling one directly through process.binding().

2. Responsibilities of JS core modules: some act as the wrapper and bridge layer for built-in modules, others are purely functional modules

3. File modules are usually written by third parties and include ordinary JS modules and C/C++ extension modules.

f. Package and NPM

1. Package description file: package.json, which lets Node resolve the installation of dependent packages

g. Modules shared by the front end and back end

1. AMD and CMD specifications

III. Asynchronous I/O

a. Why use asynchronous I/O?

1. User experience

2. Resource allocation

The single-threaded synchronous programming model makes poor use of hardware resources because blocking I/O stalls the thread. The multithreaded programming model gives developers headaches because of deadlocks and state synchronization.

Node takes a path between the two: a single thread keeps it away from multithreaded deadlocks, state synchronization and similar problems, and asynchronous I/O keeps that single thread from blocking so that the CPU is better used.

b. The current state of asynchronous I/O implementations

1. Blocking / non-blocking: the operating system kernel handles I/O in only two ways, blocking and non-blocking

With blocking I/O, the application has to wait for the call to complete before the result is returned.

A characteristic of blocking I/O is that the call does not return until the kernel has finished every step of the operation.

Non-blocking I/O differs in that the call returns immediately, not with the data the business layer expects but only with the status of the current call. To get the complete data, the application has to call the I/O operation repeatedly to check whether it has finished.

This technique of calling repeatedly to check for completion is called polling: read (the most primitive, with the lowest performance), select (an improvement on read, but it can check at most 1024 file descriptors at once), poll (uses a linked list to avoid that limit, but performance is still very low when there are many file descriptors), epoll (the most efficient I/O event notification mechanism on Linux; it really uses event notification and callbacks instead of traversal queries), kqueue (similar to epoll, found on BSD-family systems such as FreeBSD)

2. Ideal non-blocking asynchronous I/O: AIO (supported only on Linux, supports only O_DIRECT reads in kernel I/O, and cannot use the system cache)

3. Asynchronous I/O in practice: simulated with a thread pool, as in glibc's AIO and libeio; IOCP on Windows

c. Asynchronous I/O in Node

1. Event loop: Node's own execution model. When the process starts, Node creates a loop similar to while (true). Each pass through the loop body is called a Tick. In each Tick, Node checks whether any events are waiting to be handled; if so, it takes out the event and its associated callback functions, and if there are callbacks, it executes them.

2. Observers: each event loop has one or more observers, and deciding whether there are events to handle means asking these observers. Browsers use a similar mechanism. In Node there are, for example, file I/O observers and network I/O observers.

3. The event loop is a typical producer / consumer model. Asynchronous I/O, network requests and so on are the producers of events. The events are passed to the corresponding observers, and the event loop takes events from the observers and processes them.

4. Request object: between the call issued from JS and the completion of the kernel's I/O operation, there is an intermediate product called the request object.

5. The event loop, observers, request objects and the I/O thread pool together make up the basic elements of Node's asynchronous I/O model.
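The Tick / observer / producer-consumer relationship can be pictured with a toy model. This is only an illustration: createObserver and runLoop are invented names, the loop stops when no events remain instead of running forever, and real Node delegates all of this to libuv.

```javascript
// Toy observer: producers push events in, the loop takes them out.
function createObserver() {
  const events = [];
  return {
    push(event) { events.push(event); },       // producer side
    take() { return events.shift(); },         // consumer side
    hasEvents() { return events.length > 0; },
  };
}

function runLoop(observers) {
  let ticks = 0;
  // Stand-in for while (true): stop once no observer has work left.
  while (observers.some((o) => o.hasEvents())) {
    ticks += 1; // one pass over the observers is one Tick
    for (const observer of observers) {
      const event = observer.take();
      // Only run a callback if an event exists and carries one.
      if (event && typeof event.callback === 'function') {
        event.callback(event.data);
      }
    }
  }
  return ticks;
}

const fileObserver = createObserver();
const networkObserver = createObserver();
const log = [];

fileObserver.push({ data: 'file read done', callback: (d) => log.push(d) });
networkObserver.push({ data: 'request arrived', callback: (d) => log.push(d) });
networkObserver.push({ data: 'socket closed', callback: (d) => log.push(d) });

const ticks = runLoop([fileObserver, networkObserver]);
console.log(ticks); // 2
console.log(log);   // [ 'file read done', 'request arrived', 'socket closed' ]
```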

d. Asynchronous APIs that are not I/O

1. Timers

setTimeout() and setInterval() behave the same as the browser APIs. Their implementation is similar to that of asynchronous I/O, except that the I/O thread pool is not involved.

The timer observer uses a red-black tree internally. Timers are not precise.

2. process.nextTick()

Relatively lightweight: each call only puts the callback function into a queue, to be taken out and executed in the next Tick

Timers have a time complexity of O(lg(n)), while nextTick() is O(1).

3. setImmediate()

Similar to nextTick(), but with lower priority, because the event loop checks its observers in a fixed order: nextTick() belongs to the idle observer and setImmediate() belongs to the check observer

idle observers -> I/O observers -> check observers
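The priority difference between nextTick() and setImmediate() is easy to observe. The sketch below deliberately leaves setTimeout() out, because the relative order of setTimeout(fn, 0) and setImmediate() is not guaranteed when both are scheduled from the main module.

```javascript
const order = [];

setImmediate(() => order.push('setImmediate')); // check phase of the loop
process.nextTick(() => order.push('nextTick')); // runs before the loop continues
order.push('sync');                             // ordinary synchronous code

// A second setImmediate callback runs after the one registered above,
// so by then all three entries have been recorded.
setImmediate(() => console.log(order.join(' -> ')));
// prints: sync -> nextTick -> setImmediate
```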

e. Event-driven, high-performance servers

1. Node handles requests in an event-driven way, so it does not need to create an extra thread for each request, which saves the overhead of creating and destroying threads. Because there are fewer threads, the operating system also pays less for context switching when it schedules tasks.

2. Nginx also takes an event-driven approach.

IV. Asynchronous programming

a. Functional programming

1. Higher-order function: a function that takes functions as arguments or returns a function. This leads to continuation-passing style, which moves the business focus of a function from its return value to the callback function.

2. Partial function: the practice of creating a function that calls another function with some of its arguments or variables already preset. Specifying some of the arguments produces a new, customized function.
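A short example of both ideas, written as a sketch: isType is a higher-order function because it returns a function, and each function it returns is a partial function with the type argument preset. The helper names are illustrative.

```javascript
// Higher-order function: takes a type name and returns a new checker.
function isType(type) {
  return function (value) {
    return Object.prototype.toString.call(value) === '[object ' + type + ']';
  };
}

// Partial functions: the "type" argument is already fixed.
const isString = isType('String');
const isFunction = isType('Function');

console.log(isString('node'));     // true
console.log(isFunction(isString)); // true
console.log(isString(42));         // false
```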

b. Advantages and difficulties of asynchronous programming

1. Advantages

The biggest feature Node brings is the event-driven, non-blocking I/O model, which is its soul.

Node was designed to solve the performance problem of blocking I/O in programming models and uses a single-threaded model, which makes Node an expert at I/O-intensive work.

It is recommended that a single stretch of CPU work not exceed 10 ms, or that a large computation be broken into many small ones scheduled through setImmediate().
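One way to follow that advice is to process the work in slices and hand control back to the event loop between slices. The sketch below is an assumption about how that could look: sumInChunks and the chunk size of 10000 are illustrative and not tuned to the 10 ms budget.

```javascript
// Sum 1..n in slices so that no single slice holds the thread for long.
function sumInChunks(n, chunkSize, callback) {
  let total = 0;
  let next = 1;

  function runSlice() {
    const end = Math.min(next + chunkSize - 1, n);
    for (; next <= end; next += 1) {
      total += next;
    }
    if (next <= n) {
      setImmediate(runSlice); // yield to the event loop between slices
    } else {
      callback(null, total);
    }
  }

  setImmediate(runSlice); // start asynchronously so the callback never fires inline
}

sumInChunks(1000000, 10000, (err, total) => {
  console.log(total); // 500000500000
});
```

Pending I/O callbacks get a chance to run between two slices, which is the point of the split.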

2. Difficulties

Exception handling: Node has a convention for handling exceptions. The exception is passed back as the first argument of the callback function, and exceptions thrown by the callback that the user passes in are not caught.

Deeply nested functions: in Node, a single transaction that involves several asynchronous calls is common everywhere, and nesting the calls one inside another does not take advantage of the parallelism that asynchronous I/O offers.

Blocking code: there is no thread-sleep facility such as sleep()

Multithreaded programming: because browsers lagged behind the standard, Web Workers never became popular. Node borrows the pattern: child_process is its basic API, and the cluster module is a higher-level application of it.

Asynchronous to synchronous: the occasional need for synchronous behavior can leave developers at a loss because there is no synchronous API.
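The exception-handling convention can be sketched as follows. parseJsonAsync is a hypothetical helper written for this example; the point is that only the risky operation sits inside try/catch, while the user's callback is invoked outside it.

```javascript
// Hypothetical async helper that follows the error-first convention.
function parseJsonAsync(text, callback) {
  setImmediate(() => {
    let result;
    try {
      result = JSON.parse(text); // only the risky operation is guarded
    } catch (err) {
      return callback(err);      // failure: the error is the first argument
    }
    // The callback is called outside the try block, so an exception thrown
    // inside it is not swallowed here and the callback never runs twice.
    callback(null, result);
  });
}

parseJsonAsync('{"port": 8080}', (err, config) => {
  if (err) return console.error('parse failed:', err.message);
  console.log(config.port); // 8080
});

parseJsonAsync('{broken', (err) => {
  console.log(err instanceof SyntaxError); // true
});
```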

c. Asynchronous solutions

1. Event publish / subscribe pattern

The event listener pattern is widely used in asynchronous programming. It turns callback functions into events and is also known as the publish / subscribe pattern.

The events module provided by Node itself is a simple implementation of the publish / subscribe pattern, and some modules in Node inherit from it.

The publish / subscribe pattern itself has nothing to do with synchronous or asynchronous invocation, but in Node emit() calls are mostly triggered asynchronously along with the event loop, so the pattern is widely used in asynchronous programming.

It is often used to decouple business logic, and it is also a hook mechanism: hooks export internal data or state to external callers

If more than 10 listeners are added to one event, a warning is printed; the EventEmitter object treats the error event specially: if no listener is registered for it, the error is thrown as an exception

Using once() to solve the cache avalanche problem
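The once() approach to a cache avalanche can be sketched like this: while one query is in flight, every additional caller only registers a one-time listener and waits for the same result. queryDatabase is a made-up stand-in for an expensive query, and the status flag and event name are illustrative.

```javascript
const { EventEmitter } = require('events');

const proxy = new EventEmitter();
proxy.setMaxListeners(0); // many callers may wait on one event, so lift the warning limit
let status = 'ready';
let queryCount = 0;

// Hypothetical stand-in for an expensive database query.
function queryDatabase(callback) {
  queryCount += 1;
  setTimeout(() => callback('rows'), 10);
}

function select(callback) {
  proxy.once('selected', callback); // each caller is notified exactly once
  if (status === 'ready') {
    status = 'pending';             // only the first caller starts a query
    queryDatabase((results) => {
      status = 'ready';
      proxy.emit('selected', results);
    });
  }
}

for (let i = 0; i < 5; i += 1) {
  select((results) => console.log('caller', i, 'got', results));
}
console.log('queries started:', queryCount); // queries started: 1
```

Because each listener is added with once(), it is removed after the first emit and does not pile up across later queries.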

2. Promise/Deferred pattern

Promises/A: an object only needs to have a then() method

By wrapping asynchronous calls, Promise separates the success case from the failure case and defers the processing logic.

The Promise pattern is somewhat more elegant than raw event listening and triggering, but its drawback is that different scenarios need different API wrappers, which is less flexible than using native events directly.

The secret of Promise lies in how it operates on its queue.

3. Flow control libraries

Tail trigger and next: besides events and Promise, there is another class of methods that must be called manually to continue with the subsequent calls. I call this kind of method a tail trigger. The common keyword is next, and its best-known use is in the Connect middleware.

The middleware mechanism lets filtering, verification, logging and similar functions run in the manner of aspect-oriented programming while handling network requests, without being tied to specific business logic and creating coupling.

Middleware does not require every intermediate method to be asynchronous, but even if every step is asynchronous the processing is still only serialized, and parallel asynchronous calls cannot be used to improve business processing efficiency.

async library: series() runs a set of tasks serially; parallel() runs asynchronous operations in parallel; waterfall() passes each result on as the input of the next task; auto() handles dependencies automatically

Step library: serial by default; this.parallel() runs tasks in parallel, and this.group() groups results.

Wind library
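The tail-trigger idea behind next can be sketched with a minimal middleware chain. This is not Connect's implementation, which also handles routing and error-handling middleware; createApp and the three sample steps are illustrative.

```javascript
// Minimal tail-trigger chain: each step decides whether to call next().
function createApp() {
  const stack = [];
  return {
    use(middleware) {
      stack.push(middleware);
      return this; // allow chained use() calls
    },
    handle(req, res) {
      let index = 0;
      function next() {
        const middleware = stack[index];
        index += 1;
        if (middleware) middleware(req, res, next);
      }
      next();
    },
  };
}

const app = createApp();
app
  .use((req, res, next) => { req.startedAt = Date.now(); next(); }) // logging-style step
  .use((req, res, next) => {                                        // verification step
    if (!req.user) { res.body = 'denied'; return; }                 // chain stops: next() is not called
    next();
  })
  .use((req, res) => { res.body = 'hello ' + req.user; });          // business logic

const res = {};
app.handle({ user: 'alice' }, res);
console.log(res.body); // hello alice
```

The business handler stays unaware of the logging and verification steps, which is the decoupling described above.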

d. Asynchronous concurrency control

1. There is a significant difference between asynchronous I/O and synchronous I/O. Synchronous I/O calls run one after another in a loop; because each call blocks the next, they do not consume many file descriptors, but performance is low. With asynchronous I/O, concurrency is easy to achieve, and precisely because it is so easy it still needs to be controlled. Even though we want to squeeze the performance out of the underlying system, some overload protection is still needed to keep it from being pushed too far.

2. The bagpipe solution

Control concurrency through a queue

If the number of active asynchronous calls (calls that have been issued but whose callbacks have not yet run) is below the limit, a call is taken from the queue and executed.

If the active calls have reached the limit, the call is held in the queue

When each asynchronous call finishes, a new asynchronous call is taken from the queue and executed

3. The async solution: the parallelLimit() method
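The queue-based control described for bagpipe can be sketched in plain JavaScript. This illustrates the idea only and is not bagpipe's actual API: createLimiter, push and peak are names invented for the example, and the tasks are timers standing in for real I/O.

```javascript
// Limit how many asynchronous calls are active at once.
function createLimiter(limit) {
  const queue = [];
  let active = 0;
  let peak = 0;

  function runNext() {
    // At the limit, or nothing waiting: leave queued calls where they are.
    if (active >= limit || queue.length === 0) return;
    const { task, callback } = queue.shift();
    active += 1;
    peak = Math.max(peak, active);
    task((err, result) => {
      active -= 1;
      callback(err, result);
      runNext(); // a finished call frees a slot for a queued one
    });
  }

  return {
    push(task, callback) {
      queue.push({ task, callback });
      runNext();
    },
    peak() { return peak; },
  };
}

const limiter = createLimiter(2);
const finished = [];

for (let i = 0; i < 5; i += 1) {
  // Hypothetical async task standing in for a file or network call.
  const task = (done) => setTimeout(() => done(null, i), 10);
  limiter.push(task, (err, value) => {
    finished.push(value);
    if (finished.length === 5) {
      console.log('peak concurrency:', limiter.peak()); // peak concurrency: 2
    }
  });
}
```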

V. Memory control

a. V8's garbage collection mechanism and memory limits

1. V8's memory limit: by default about 1.4 GB on 64-bit systems and about 0.7 GB on 32-bit systems

2. In V8, all JS objects are allocated on the heap. Use process.memoryUsage() to inspect it: heapTotal and heapUsed are the heap memory requested and the amount currently in use, and rss is short for resident set size, the part of the process held in resident memory.

3. In V8, memory is mainly divided into the new generation and the old generation. Objects in the new generation are short-lived, and objects in the old generation are long-lived or resident in memory.

4. On top of the generational split, objects in the new generation are mainly collected by the Scavenge algorithm, whose concrete implementation mainly uses the Cheney algorithm. The old generation mainly uses a combination of Mark-Sweep and Mark-Compact.

5. To reduce the pause caused by full-heap garbage collection, V8 starts with the marking phase: work that used to be done in one go is changed to incremental marking, that is, split into many small "steps", with the JS application logic allowed to run for a short while after each "step". Garbage collection and application logic alternate until the marking phase is complete.
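A quick way to look at the numbers from item 2 is to print process.memoryUsage() before and after allocating some objects. The helper names and the allocation size are arbitrary, and the exact figures vary by machine and Node version, so none are shown here.

```javascript
function formatMB(bytes) {
  return (bytes / 1024 / 1024).toFixed(2) + ' MB';
}

function showMemory(label) {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  console.log(label, {
    rss: formatMB(rss),             // resident set size of the whole process
    heapTotal: formatMB(heapTotal), // heap memory requested by V8
    heapUsed: formatMB(heapUsed),   // heap memory currently in use
  });
}

showMemory('before');
// Allocate a million small JS objects on the V8 heap.
const held = new Array(1e6).fill(0).map((_, i) => ({ i }));
showMemory('after');
console.log(held.length); // 1000000
```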

b. Efficient use of memory

1. Scope: if a variable is global (declared without var, or defined on the global object), it is released only when the process exits, because that is when the global scope is released, so the object it references stays resident in memory (in the old generation). To release such an object, you can remove the reference with the delete operator. Deleting object properties with delete may interfere with V8's optimizations, so removing the reference by reassignment is better.

2. Closures: once a variable references the intermediate function, the intermediate function is not released, the original scope is not released either, and the memory used in that scope is not freed. Only when no references remain is it gradually released.

c. Memory metrics

1. View memory usage

2. The totalmem() and freemem() methods of the os module report the operating system's memory usage, returning the total system memory and the free memory respectively.

Memory that is not allocated through V8 is called off-heap memory. Using off-heap memory makes it possible to go beyond the heap memory limit.

3. Node's memory consists mainly of the part allocated through V8 and the part allocated by Node itself. What V8's garbage collection limits is mainly V8's heap memory.

d. Memory leak

1. In Node, caching is not cheap. Once an object is used as a cache, it will stay resident in the old generation. The more keys the cache holds, the more long-lived objects there are, and garbage collection does useless work scanning and compacting them.

2. Prefer external caches, such as Redis and Memcached

3. Queue problems, such as a backlog of database write operations:

The superficial fix is to switch to a technology that consumes the queue faster.

The deeper fix is to monitor the queue length. Once a backlog builds up, the monitoring system should raise an alarm and notify the people responsible.

e. Memory leak troubleshooting

1. Tools such as node-heapdump and node-memwatch

f. Large-memory applications

1. Node provides stream for handling large files. If string-level operations are not needed, V8 does not have to be involved: pure Buffer operations can be used, and they are not subject to V8's heap memory limit.

VI. Understanding Buffer

a. Buffer structure

1. Buffer is a typical module that combines JS with C++. The performance-related parts are implemented in C++ and the rest in JS.

2. Buffer behaves much like Array. You can read the length property to get its length and access elements by index. If the value assigned to an element is less than 0, 256 is added repeatedly until the result is an integer between 0 and 255. If the value is greater than 255, 256 is subtracted repeatedly. If the value has a fractional part, the fraction is discarded.

3. Node requests memory at the C++ level and allocates it in JS, using a slab allocation mechanism.
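The element-assignment rules from item 2 can be checked directly. Buffer.alloc() is used here to get a zero-filled Buffer; the three values are arbitrary examples.

```javascript
const buf = Buffer.alloc(3);

buf[0] = -100;   // below 0: 256 is added until the value fits
buf[1] = 300;    // above 255: 256 is subtracted until the value fits
buf[2] = 3.1415; // the fractional part is dropped

console.log(buf.length);             // 3
console.log(buf[0], buf[1], buf[2]); // 156 44 3
```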

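b. Buffer conversion

1. Strings:

new Buffer(str, [encoding]) (deprecated in current Node releases in favor of Buffer.from(str, [encoding]))

buf.toString([encoding], [start], [end])

Buffer.isEncoding(encoding) checks whether an encoding is supported for conversion; for encodings that are not supported, the iconv and iconv-lite libraries can be used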
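A small round trip through these calls, using Buffer.from() as the current replacement for the new Buffer() constructor. The sample text is arbitrary.

```javascript
const text = 'Node.js 深入浅出';

const buf = Buffer.from(text, 'utf8');   // string -> Buffer
console.log(buf.length);                 // 20: bytes, not characters
console.log(buf.toString('utf8'));       // Buffer -> string
console.log(buf.toString('utf8', 0, 7)); // bytes 0..6 only: Node.js
console.log(buf.toString('base64'));     // same bytes, different encoding

console.log(Buffer.isEncoding('utf8'));  // true
console.log(Buffer.isEncoding('gbk'));   // false: would need iconv or iconv-lite
```

The byte length is 20 because the eight ASCII characters take one byte each and the four Chinese characters take three bytes each in UTF-8.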

c. Buffer concatenation

1. A Buffer is not a string, but it can be implicitly converted to one, so watch out for encoding problems

2. setEncoding() can only handle the utf8, Base64 and UCS-2/UTF-16LE encodings.

3. Use an array to store all received Buffer fragments and record the total length of all fragments, then call the Buffer.concat() method to produce one merged Buffer object. Buffer.concat() wraps the process of copying the small Buffer objects into one large Buffer object.
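The difference between implicit conversion and Buffer.concat() shows up as soon as a multi-byte character is split across two fragments. In this sketch the fragments are cut by hand at 4 bytes to simulate what a stream might deliver; the sample text is arbitrary.

```javascript
const source = Buffer.from('床前明月光', 'utf8'); // 5 characters, 15 bytes

// Simulate a stream delivering the data in 4-byte fragments, which cuts
// through the middle of 3-byte UTF-8 characters.
const chunks = [];
for (let i = 0; i < source.length; i += 4) {
  chunks.push(source.subarray(i, i + 4));
}

// Risky: += converts each fragment to a string on its own.
let garbled = '';
for (const chunk of chunks) garbled += chunk;

// Safer: keep the fragments, track the total length, merge once.
let size = 0;
for (const chunk of chunks) size += chunk.length;
const merged = Buffer.concat(chunks, size).toString('utf8');

console.log(garbled.includes('\uFFFD')); // true: broken bytes became replacement characters
console.log(merged === '床前明月光');     // true
```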

d. Buffer and performance

1. Converting static content to Buffer objects in advance effectively reduces repeated CPU work and saves server resources. In a Web application built with Node, the dynamic content of a page can be separated from the static content, and the static part can be pre-converted to Buffer to improve performance. Because a file is binary data to begin with, when the content does not need to change, read it as a Buffer only and output it directly, without extra conversions and their overhead.

2. The effect of the highWaterMark size on performance

The highWaterMark setting affects how Buffer memory is allocated and used.

Setting highWaterMark too small may result in too many system calls

3. If the file is small (less than 8 KB), the slab may not be fully used. For large files, the highWaterMark size determines the number of system calls and data events. When reading the same large file, the larger the highWaterMark value, the faster the read.
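The relationship between highWaterMark and the number of data events can be observed with a temporary file. This is a sketch under assumptions: the 256 KB file, the two highWaterMark values and the countChunks helper are made up for the demonstration, and exact chunk counts may differ between platforms, so none are asserted in the comments.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Read a file and report how many data events it took.
function countChunks(file, highWaterMark, callback) {
  let chunks = 0;
  let bytes = 0;
  fs.createReadStream(file, { highWaterMark })
    .on('data', (chunk) => { chunks += 1; bytes += chunk.length; })
    .on('error', callback)
    .on('end', () => callback(null, { highWaterMark, chunks, bytes }));
}

// Temporary 256 KB file used only for this demonstration.
const demoFile = path.join(os.tmpdir(), 'hwm-demo-' + process.pid + '.bin');
fs.writeFileSync(demoFile, Buffer.alloc(256 * 1024));

countChunks(demoFile, 1024, (err, small) => {
  if (err) throw err;
  countChunks(demoFile, 64 * 1024, (err2, large) => {
    fs.unlinkSync(demoFile);
    if (err2) throw err2;
    // The small setting needs far more data events for the same bytes.
    console.log(small, large);
  });
});
```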

VII. Network programming

a. Build a TCP service

1. Server events (net.createServer ()): listening, connection, close, error

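2. Connection events (net.connect()): data, end, connect, drain, error, close, timeout

3. In Node, TCP enables the Nagle algorithm by default, so small packets may be buffered and merged before they are sent. Call socket.setNoDelay(true) to turn it off.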
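A minimal echo server and client tie these events together. This is a sketch: port 0 asks the operating system for any free port, the 'ping' payload and the echoed variable are illustrative, and a real service would keep the server running instead of closing it after one reply.

```javascript
const net = require('net');

let echoed = null;

const server = net.createServer((socket) => {
  socket.setNoDelay(true);                           // turn off Nagle for this connection
  socket.on('data', (chunk) => socket.write(chunk)); // echo the data back
  socket.on('error', (err) => console.error('socket error:', err.message));
});

server.on('error', (err) => console.error('server error:', err.message));

// Port 0 lets the OS choose a free port; a real service would pick one.
server.listen(0, '127.0.0.1', () => {
  const { port } = server.address();
  const client = net.connect(port, '127.0.0.1', () => client.write('ping'));
  client.on('data', (data) => {
    echoed = data.toString();
    console.log('echoed:', echoed); // echoed: ping
    client.end();                   // close the connection
    server.close();                 // stop listening so the process can exit
  });
});
```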

b. Build a UDP service

1. UDP events: message, listening, close, error

c. Build an HTTP service

1. HTTP server events: connection, request, close, checkContinue, connect, upgrade, clientError

2. HTTP client events: response, socket, connect, upgrade, continue

d. Build a WebSocket service

e. Network service and security

1. Node provides three modules for network security: crypto, tls and https.

VIII. Building Web applications

1. Cookie optimization: reduce the size of cookies; use a different domain name for static components; reduce DNS queries

2. Caching rules: add Expires or Cache-Control to the header; configure ETags; make Ajax cacheable

3. Clearing the cache: append a version number to the URL, such as http://xxx.com/?v=1.0.0

4. Attachments (Content-Disposition): inline means the content only needs to be viewed immediately, and attachment means the data can be saved as an attachment.

IX. Playing with processes

1. PHP's robustness comes from setting up a separate context for each request.

2. Master-Worker pattern, also called the master-slave pattern. The master process does no business processing itself; it schedules and manages the worker processes and tends to be stable. The worker processes do the actual business processing.

3. child_process module:

spawn(): starts a child process to execute a command

exec(): unlike spawn(), it takes a callback function that reports the outcome of the child process. The timeout option sets a time limit. Suitable for running existing commands.

execFile(): starts a child process to run an executable file

fork(): creates a Node child process; only the JS file module to run needs to be specified
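The first three methods can be compared side by side by letting the child be the current Node binary (process.execPath), so the sketch does not depend on any other program being installed. The inline -e scripts and the 5000 ms timeouts are illustrative values.

```javascript
const { spawn, exec, execFile } = require('child_process');

const node = process.execPath; // run the current Node binary as the child

// spawn(): stream-oriented; output arrives through events, no result callback.
const child = spawn(node, ['-e', 'console.log("from spawn")']);
child.stdout.on('data', (data) => process.stdout.write(data));

// exec(): runs a command line in a shell; the callback receives the outcome.
exec('"' + node + '" -e "console.log(1 + 1)"', { timeout: 5000 },
  (err, stdout) => {
    if (err) return console.error('exec failed:', err.message);
    console.log('exec:', stdout.trim());
  });

// execFile(): runs an executable directly, without a shell.
execFile(node, ['-e', 'console.log(2 + 2)'], { timeout: 5000 },
  (err, stdout) => {
    if (err) return console.error('execFile failed:', err.message);
    console.log('execFile:', stdout.trim());
  });

// fork() is left out because it needs a separate JS file to run, for
// example fork('./worker.js') with a hypothetical worker.js next to this script.
```

The three children run concurrently, so the order of the printed lines is not fixed.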

4. Web Workers let you create worker threads that run in the background, so that heavy blocking computation does not affect UI rendering on the main thread

5. IPC (Inter-Process Communication) lets different processes access each other's resources and coordinate their work. Node implements it with pipes.

6. A handle is a reference that identifies a resource; internally it contains a file descriptor that points to the object.

7. cluster events: fork, online, listening, disconnect, exit, setup

X. Testing

a. Unit testing

1. Principles for writing testable code: single responsibility, interface abstraction, layer separation

2. Unit testing mainly covers assertions, test frameworks, test cases, test coverage, mocks, continuous integration and so on. For Node, asynchronous code testing and private method testing are added to the list

3. Assertion: a first-order logic statement placed in a program (such as a check that a result is true or false) that expresses what the developer expects. When the program reaches the assertion, the assertion should be true. If it is not, the program aborts with an error message

4. The assert module in Node includes: ok(), equal(), notEqual(), deepEqual(), notDeepEqual(), strictEqual(), notStrictEqual(), throws(), doesNotThrow(), ifError()

5. Unit test styles:

TDD (test-driven development): focuses on whether every function is implemented correctly, and its wording leans toward the style of a functional specification.

BDD (behavior-driven development): focuses on whether the overall behavior meets expectations, and its wording is closer to natural language.

6. Related tools: mocha, blanket, jscover, muk, Makefile, travis-ci

b. Performance testing

1. Benchmark: benchmark

2. Stress testing: ab, siege, http_load

XI. Production

a. Project engineering

1. Directory structure: any layout is fine as long as it follows the single principle

2. Build tools: Makefile, Grunt

3. Coding specifications: JSLint, JSHint

4. Code review

b. Deployment process

1. In real project requirements, two things need to be verified: the correctness of the functionality, and checks related to the data.

c. Performance

1. Splitting principles: do one specific thing, let each tool do what it is good at, simplify the model, isolate risk

2. Separate dynamic and static content, enable caching, use a multi-process architecture, separate reads from writes

d. Logging

1. Access log, exception log, database records, log splitting

e. Monitoring and alarms

1. Monitoring: log monitoring, response time monitoring, process monitoring, disk monitoring, memory monitoring, CPU usage monitoring, CPU load monitoring, I/O load monitoring, network monitoring, application status monitoring, DNS monitoring

2. Alarm channels: email, SMS or telephone.

f. Stability

1. Multiple machines: load balancing, state sharing, data consistency and reverse proxies need to be considered

2. Multiple data centers

3. Disaster recovery backup

g. Heterogeneous coexistence

1. Heterogeneous coexistence with existing systems through protocols

Appendix B. Debugging Node

1. Debugger

Set breakpoints with the debugger; statement

Run node debug xxxx.js (current Node releases use node inspect xxxx.js instead)

Step instructions: cont or c, next or n, step or s, out or o, pause

2. Node Inspector
