Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Gaud JS dependency Analysis Engineering and what is its key principle

2025-01-27 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/02 Report--

What this article shares with you is about Gao de JS dependency analysis project and what the key principles are. The editor thinks it is very practical, so I share it with you to learn. I hope you can get something after reading this article. Without saying much, follow the editor to have a look.

I. background

After the Bundle of Gaode App, due to the complexity of the business, the number of Bundle is very large. This brings a new problem-the dependency relationship between Bundle is complex and needs to be regulated to keep the dependency between Bundle under the architecture design.

Moreover, in order to ensure that Bundle can operate independently, in the process of continuous iteration of the business, reverse dependencies are needed to quickly determine the scope of influence of the iteration. At the same time, for the aspect API (that is, the system API provided by the container, similar to the BOM API in the browser), you also need to determine the influence range and usage trend of each section API, which can be used as the basis for modifying or deactivating an API.

Take the component library as an example, because the components will be used by several business projects, our changes to the components will affect these business items. Before the plan is modified, the reverse dependency needs to be calculated based on the positive dependency (business dependency component)-where the component is dependent, in order to determine the scope of impact of the component modification.

A higher dimension than a file is a dependency between Bundle. We have business Bundle as well as public Bundle. Public Bundle is also divided into different levels of Bundle.

For a public Bundle, the business Bundle can rely on it, but the public Bundle cannot in turn rely on the business Bundle;. Similarly, the underlying Bundle forbids relying on the Bundle encapsulated by the upper layer. We need to use dependency analysis to ensure that these dependencies are designed in accordance with the above rules.

Second, realize the key steps

Implement JS dependency analysis. The whole implementation process is roughly shown in the following figure:

Here are some key steps to unfold.

Using AST to extract dependent paths

To do file-level dependency analysis, you need to extract the dependency path in each file. There are two ways to extract the dependency path:

The advantage of using regular expression is that it is easy to implement, but the disadvantage is that it is difficult to remove comments and the flexibility is limited.

First, lexical analysis and syntax analysis are carried out, and after getting AST (abstract syntax tree), each syntax tree node is traversed. The advantage of this scheme is that the analysis is accurate, but the disadvantage is that it is more troublesome to implement than pure syntax. If the corresponding language does not provide parser API (such as Less), it is difficult to implement.

Generally speaking, in order to ensure accuracy, the second option will be used if the second option can be used.

Taking JS (.js, .jsx, .ts, .tsx) files as examples, we can use the API ts.createSourceFile provided by TypeScript to carry out lexical and grammatical analysis of JS-like files, and get AST:

Const ast = ts.createSourceFile (abPath, content, ts.ScriptTarget.Latest, false, SCRIPT_ Kinde [path.extname (abPath)])

Once we have the AST, we can start traversing the AST to find all the dependent paths we need. When traversing, you can implement a traversal function walk by using the ts.forEachChild provided by the typeScript module to traverse all the children of a syntax tree node:

Function walk (node: ts.Node) {ts.forEachChild (node, walk) / / depth-first traversal / / according to different types of syntax tree nodes The purpose of different processing is to find paths in import, require and require.resolve / / the above three writing methods are divided into two categories-import declaration and function call expression / / in which function call expression is divided into direct call (require) and attribute call (require.resolve) switch (node.kind) {/ / import declaration handles case ts.SyntaxKind.ImportDeclaration: / / omit details. Break; / / function call expression processing case ts.SyntaxKind.CallExpression: / / omit details break;}}

In this way, we can accurately find all the directly referenced dependent files in the class JS file.

Of course, in the concrete implementation of case, in addition to the case where the user explicitly writes the dependency path, the user may also dynamically load the dependency in the form of variables, which requires context-based semantic analysis so that some constants can be replaced with strings.

But not all dynamic dependencies can be extracted, for example, if the dynamic dependency path is returned by Ajax, then there is no way. However, there is no need to over-consider these situations, the direct writing of string literals can satisfy most scenarios. After that, it is planned to limit this kind of writing through process control and compiler verification, and at the same time collect alarms at run time, requiring explicit references to ensure that references to section API can be statically analyzed.

Set up a document map to find the way

We have our own set of rules for how to write dependent paths:

Reference class JS files support not writing extensions

To reference this Bundle file, you can write only the file name directly.

Use relative path

The public Bundle file is referenced by @ ${bundleName} / ${fileName}. FileName also directly writes only the file name within that Bundle.

These methods are a little more complicated than the planning of CommonJS or ECMAScript Module, especially the rule of "write only file names directly". For us, we need to find the real path corresponding to this file before we can continue with dependency analysis.

To do this, first build a file map with a data structure of {[fileName]: 'relative/path/to/file'}. I used glob to get all the file tree nodes in the entire Bundle directory, filtered out all the file nodes, and generated the file map with the file name as key and the path relative to the Bundle root as value. In the case of "writing only the file name", the corresponding relative path can be found directly according to the file name with O (1) time complexity.

In addition, for the rule that "reference class JS files do not support writing extensions", you need to traverse every possible extension and find the corresponding path after supplementing the path, which will be more complex.

Dependency is the relationship of a graph, and it is necessary to build a node before building a relationship.

In the initial implementation of dependencies, because of the inertia of thinking as a front end, "one file depends on other files" is a tree relationship, and the data structure will naturally use a linked tree structure similar to children: Node [] in the file tree. In fact, this is what happens with dependencies:

If you use tree maintenance, then utils.js nodes will appear in the children of page.jsx and comp.jsx, respectively, resulting in redundant data, which is very common in real projects.

But if it's just a matter of volume, it may not be so serious, and it will cost a little bit of space at most. However, we will find that file dependencies also have this kind of circular dependency:

This kind of circular dependency often occurs when writing TypeScript when making type declarations. There is even a circular dependency between two files. This is a reasonable way to write.

However, for the direct use of the chained tree structure, if the algorithm for creating the chained tree is "when creating a node, create a child node first, and then complete its own creation after the child node creation returns", because we will find that if we write in this way, there will be unlimited dependencies:

Const fooTs = new Node ({name: 'foo.ts', children: [new Node ({name:' bar.ts', children: [new Node ({name: 'baz.ts', children: [new Node ({name:' foo.ts') / / and the top foo.ts is the same children: [...] / / infinite loop. })]})]}))

The root cause of this problem is that this relationship is the relationship of the graph, not the tree, so when creating this data structure, we cannot use the algorithm of "create a child node first when creating a node. Wait for the child node creation to return and then complete its own creation" algorithm, we must switch the train of thought back to the idea of the graph-first create the node, then create the relationship.

With this approach, you are equivalent to using the adjacency list structure of the graph. Let's take a look at the words "create nodes first, then create relationships":

/ / create each node first, and set children to the empty array const fooTs = new Node ({name: 'foo.ts', children: []}); const barTs = new Node ({name:' bar.ts', children: []}); const bazTs = new Node ({name: 'baz.ts', children: []}); / / then create the relationship fooTs.children.push (barTs); barTs.children.push (bazTs) BazTs.children.push (fooTs)

Using this method of writing, you can complete the creation of the diagram.

However, this data structure can only exist in memory and cannot be serialized because it is circular referenced. The inability to serialize means that you can't store or transfer, and you can only play like this in your own process, which obviously doesn't work.

So we also need to modify the data structure to replace the references in the adjacency list with child pointer tables, that is, to add an index for each node and use the index to correspond in children:

Const graph = {nodes: [{id: 0, name: 'foo.ts', children: [1]}, {id: 1, name:' bar.ts', children: [2]}, {id: 2, name: 'baz.ts', children: [0]}]}

Here some students will ask: why don't we just use the subscript of nodes, but add an id field that is the same as the subscript number? The reason is simple, because subscripts depend on the order of the array itself, and if you mess up that order-- such as using filter to filter out some nodes-- these subscripts will change. Adding an id field looks a bit redundant, but it reduces a lot of complexity and is more extensible for the following algorithm.

Using stack to solve the problem of circular reference (cyclic digraph)

When we need to use the dependency data generated above, if we need to traverse the DFS (depth traversal) or BFS (breadth traversal) algorithms, we will find that because the dependency is circular, these recursive traversal algorithms will be endless. There are three simple ways to solve this problem:

Add a field to the existing diagram to mark it

Each time you enter and traverse a new node, check whether it has been traversed before. But this will pollute the picture.

Create a new graph with the same dependency and mark it in the new diagram

Although this practice can be achieved, it is troublesome and a waste of space.

Use the stack to record the traversal path

We create an array as a stack and execute it with the following rules:

Each time a node is traversed, the index (push) of the new node is pressed into the stack

Each time it is returned from a node, the top index (pop) in the stack is removed

Each time you enter a new node, check whether the index value already exists in the stack (using includes), and fall back if it does.

This method is suitable for DFS algorithm.

III. Summary

Dependency is another expression of source code, and it is also a very useful tool to control the quality of giant projects. We can use dependencies to dig out countless imaginative spaces, such as finding useless files, precise test range changes between versions, and so on. If we combine the underlying dependencies such as Android, iOS and C++, we can calculate more analysis results.

At present, the dependency scanning project is carried out iteratively. We use the agile development model, from some simple, rough Bundle-level dependencies, gradually refined to file-level or even identifier-level, in the process of landing according to different accuracy to gradually meet different requirements for accuracy, so that the whole process can get varying degrees of benefit and feedback, driving us to continue to iterate and optimize.

This is what Gaud JS relies on analysis engineering and what the key principles are. The editor believes that there are some knowledge points that we may see or use in our daily work. I hope you can learn more from this article. For more details, please follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 225

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report