How to Understand a Distributed, Big Data, High-Concurrency Web Development Framework

Updated 2025-07-09. From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

Many newcomers are unsure how to approach a distributed, big data, high-concurrency web development framework. To help with that, this article walks through the topic in detail for anyone who needs it.

A High-Concurrency Web Development Framework for Distributed Big Data

I. Introduction

It is generally accepted that static HTML pages load fastest. Dynamic pages, however, query much of their interactive data from a database, and that data changes often. Except for some news and information sites, serving static HTML is not a realistic way to improve access speed, so we have to find a more suitable solution between the code and the database.

Reducing the number of database accesses, separating files from the database, storing big data in a distributed fashion, load balancing across a server cluster, using a page cache, and replacing the relational database with a NoSQL in-memory database are all key measures for improving the system's performance under high concurrency.

II. Decomposition

(1) Distributed server cluster

A) File server cluster

Downloads of pictures, videos, and other files are usually the main culprits behind network bandwidth consumption. These resources should be placed on separate file servers that have good bandwidth and expose HTTP access addresses, so that file downloads do not affect the CPU load of the web servers.

File servers are best backed by centralized disk-array storage, such as the cloud file storage offered by Aliyun, which is easy to use and lets you choose how much bandwidth and storage space you need.

If centralized storage is not available, you can build a file server cluster instead, as shown below.

[Figure: file server cluster]

Put simply, each file server runs a simple web API that serves as the file transfer and access interface. The server addresses can be assigned to the web program manually, or a simple load balancer can provide a unified interface for the web program to call.

Note that an upload through the web API must return the complete HTTP download address of the file on the specific server that stored it, and that address should be saved in the database.

Number of file servers: file servers are relatively independent and share no data, so the number needed depends mainly on bandwidth capacity and disk space. After adding a server, you only need to add its address to the web program's call list, so capacity can be expanded almost without limit.
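
The call list and the returned download address can be sketched as follows. This is a minimal illustration, not the article's implementation: the `FileServerPool` class, the `/files/` path, the round-robin choice, and the example.com addresses are all assumptions, and `store` only stands in for a real upload call to the file server's web API.

```python
from itertools import count


class FileServerPool:
    """Call list of file servers kept by the web program (hypothetical helper)."""

    def __init__(self, base_urls):
        self.base_urls = list(base_urls)
        self._counter = count()

    def add_server(self, base_url):
        # Scaling out only requires appending the new address to the call list.
        self.base_urls.append(base_url)

    def pick_server(self):
        # Simple round-robin; a real deployment might weigh bandwidth or free disk space.
        return self.base_urls[next(self._counter) % len(self.base_urls)]

    def store(self, filename):
        # Stand-in for the upload call to the file server's web API.
        # It returns the complete HTTP download address on one specific server,
        # which is the value the web program should save in the database.
        server = self.pick_server()
        return f"{server}/files/{filename}"


pool = FileServerPool(["http://files1.example.com", "http://files2.example.com"])
first_url = pool.store("a.jpg")
second_url = pool.store("b.mp4")

# Expanding capacity: register a third server, and new uploads can land on it.
pool.add_server("http://files3.example.com")
third_url = pool.store("c.png")
```

Because each stored address already names a specific server, adding servers later never invalidates the addresses saved earlier.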

B) Web server cluster

Whether it runs Windows or Linux, a single server has limited performance and resources and can support only a limited number of concurrent connections, so a multi-server cluster is needed to raise that number. The concurrent connection capacity of the cluster is easy to calculate:

Connection concurrency = server 1 concurrency + server 2 concurrency + ... + server n concurrency
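
As a quick worked example of this sum, with per-server limits that are invented purely for illustration:

```python
# Hypothetical per-server concurrent-connection limits, for illustration only.
server_concurrency = {"web1": 2000, "web2": 2000, "web3": 1500}

# Total capacity is the sum over all servers in the cluster.
total_concurrency = sum(server_concurrency.values())
print(total_concurrency)  # prints 5500
```

In practice the usable figure is somewhat lower, since the sum assumes the load is spread evenly enough for every server to reach its own limit.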

Of course, we cannot give each web server its own domain name; all of them must sit behind the same domain name and the same entry point. For example, Baidu has hundreds of web servers behind it, yet everyone uses the single entry www.baidu.com. That entry automatically assigns each visitor a web server, and the visitor does not need to care about that server's specific address. This is the role of the load balancer.

When hardware is plentiful, a multi-node cluster is the recommended deployment for MongoDB. It improves access performance and also helps protect data safety and integrity.

GemFire in-memory database

GemFire is a commercial NoSQL in-memory database that has been on the market for many years and has been tested by many large organizations. Its open source version, Geode, was released in April 2015, so it may see wider use in the near future.

Besides the open source NoSQL in-memory databases, major vendors in China and abroad, such as Oracle, IBM, and Alibaba, have been developing their own in-memory databases. NoSQL in-memory databases are increasingly taking the place of relational databases.

Readers can compare the strengths and weaknesses of each in-memory database. Whichever product we choose, it has to solve just two problems: (1) provide fast access through in-memory storage, and (2) store big data across a server cluster. For how to set up the environment, refer to the relevant documentation.

(2) Load balancer

Load balancers are divided into hardware and software load balancers. Their purpose is to provide a unified entry point to the server cluster, to monitor the load of each server dynamically, and to forward new user requests to servers with low load.

A hardware load balancer is a dedicated, independent server purchased specifically for load balancing. Aliyun, for example, offers them.

A software load balancer uses software with proxy functionality, such as Nginx, HAProxy, or LVS, as the forwarding server. For installation and deployment details, refer to the relevant documentation.
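
The idea of forwarding new requests to the least loaded server can be sketched as below. This is only an in-process illustration under simple assumptions: load is measured as the number of active requests, the `LeastLoadBalancer` class and the addresses are hypothetical, and it does not reproduce how Nginx, HAProxy, or LVS work internally.

```python
class LeastLoadBalancer:
    """Tracks active requests per backend and picks the least loaded one."""

    def __init__(self, servers):
        # Map each backend address to its current number of active requests.
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Send the new request to the backend with the lowest current load.
        # Ties go to the backend that was registered first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Called when a request finishes, so the backend's load goes down again.
        self.active[server] -= 1


balancer = LeastLoadBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
assigned = [balancer.acquire() for _ in range(4)]

# One request on 10.0.0.2 completes, so it becomes the least loaded backend.
balancer.release("10.0.0.2")
next_server = balancer.acquire()
```

Clients only ever see the balancer's single entry point; which backend handled a request stays an internal detail.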

(3) Caching technology

As described above, the web servers and the database servers run on different machines, so when the web program fetches data from the database, the data travels over the network via TCP/IP. When a query returns too much data, network bandwidth can easily become the bottleneck, and under high concurrency this seriously degrades the efficiency of the whole system.

We use caching to solve this problem. The right approach depends on the business involved; generally, we divide the data into two categories according to the specific business:

1. Data that does not change frequently, such as table structures, shared settings, and lists of provinces, cities, and counties. We cache this as permanent data: all records are read from the database at once and cached locally on the web server for long-term storage, and the database is checked regularly for updates. If there is an update, the local cache is refreshed.

2. Data that changes frequently, which has to be read from the database each time it is needed. This kind of data is often large as well, so it cannot be read in one pass and must be read page by page. Cached copies of this data should be kept only for a short time.

Many caching tools are available, such as MemoryCache, which ships with .NET, and Ehcache for Java; both are well-known caches.
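
The two categories can be sketched as two small caches. This is a minimal sketch under stated assumptions, not the API of MemoryCache or Ehcache: the class names are hypothetical, the "has the database been updated" check is modeled as a version number, and a plain dict stands in for the database.

```python
import time


class TTLCache:
    """Short-lived cache for frequently changing data, e.g. one page of results."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = {}  # key -> (expiry time, value)

    def get(self, key, loader):
        entry = self._items.get(key)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no database round trip
        value = loader()  # expired or missing: read from the database
        self._items[key] = (now + self.ttl, value)
        return value


class ReferenceDataCache:
    """Long-lived local copy of rarely changing data, reloaded only after an update."""

    def __init__(self, load_all, load_version):
        self.load_all = load_all
        self.load_version = load_version
        self.version = load_version()
        self.data = load_all()  # read all records once

    def refresh_if_changed(self):
        # Intended to run on a timer; reloads everything only if the version moved.
        current = self.load_version()
        if current != self.version:
            self.data = self.load_all()
            self.version = current
            return True
        return False


# Stand-in for the database: a dict plus a counter of how often it was queried.
db = {"version": 1, "provinces": ["Guangdong", "Zhejiang"], "queries": 0}


def load_page():
    db["queries"] += 1
    return ["order-1", "order-2"]


fake_now = [0.0]  # controllable clock so the example does not need to sleep
pages = TTLCache(ttl_seconds=5, clock=lambda: fake_now[0])
pages.get(("orders", 1), load_page)  # miss: queries the database
pages.get(("orders", 1), load_page)  # hit: served from the cache
fake_now[0] = 6.0
pages.get(("orders", 1), load_page)  # expired: queries the database again

reference = ReferenceDataCache(lambda: list(db["provinces"]), lambda: db["version"])
db["provinces"].append("Sichuan")
db["version"] = 2
changed = reference.refresh_if_changed()
```

The cache key includes the page number, so each page of a large result set expires independently.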

III. Synthesis

To sum up, the overall plan can be designed as follows:

[Figure: overall architecture]

IV. Development framework

ErpCore is a powerful rapid development framework that integrates database design, software modeling, automatic model generation, visual interface design, and customizable business flows, and automatically generates the systems users need. Business systems for all industries are built as extensions of this framework, which turns the tedious, repetitive cycle of "modeling, writing code, testing" into fully automatic generation and greatly reduces the time and cost of developing enterprise software.

1. Automatic modeling

The framework contains a virtual database system in which users can create tables, fields, and relationships between tables. An enterprise can build a database architecture suited to its specific business needs. Through automation, sales and other business staff can do work that would otherwise require a DBA, and business processes become customizable by the enterprise.

2. Custom object

Corresponding to the tables, fields, and table relationships created in the virtual database, users can define custom objects, object properties, and object associations. This makes it possible to extend the framework to business systems in any industry.
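
To make the idea of user-defined tables and fields concrete, here is a minimal sketch of metadata-driven modeling. It is not ErpCore's actual API: `VirtualTable`, `VirtualField`, and the use of SQLite are assumptions chosen for illustration, and relationships between tables are left out to keep it short.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class VirtualField:
    name: str
    sql_type: str = "TEXT"


@dataclass
class VirtualTable:
    name: str
    fields: list = field(default_factory=list)

    def to_ddl(self):
        # Translate the user-defined table into a CREATE TABLE statement.
        # A real system must validate names before building SQL from them.
        columns = ", ".join(f"{f.name} {f.sql_type}" for f in self.fields)
        return f"CREATE TABLE {self.name} (id INTEGER PRIMARY KEY, {columns})"


# A business user might define a "customer" object with two properties.
customer = VirtualTable(
    "customer",
    [VirtualField("name"), VirtualField("credit_limit", "REAL")],
)

conn = sqlite3.connect(":memory:")
conn.execute(customer.to_ddl())
conn.execute(
    "INSERT INTO customer (name, credit_limit) VALUES (?, ?)",
    ("Acme", 5000.0),
)
row = conn.execute("SELECT name, credit_limit FROM customer").fetchone()
```

The point of the design is that the schema lives in data rather than in code, so defining a new object does not require a developer to write a migration.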

3. Visual form design

By dragging and dropping, business staff can create the software's user interface and link the interfaces together to build the required business system without coding.

4. Fully automatic creation of subsystems

Objects and forms are created and then integrated into a subsystem in the background, so ordinary users can start working with the subsystem without any additional development.
