
ImSQL: massive data, trusted storage

2025-04-30 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

Data fraud and untrustworthy data pose severe challenges in application scenarios such as financial supervision and risk control, and they are becoming a major obstacle to large-scale data interconnection and sharing. The authenticity and credibility of data have long affected every area of society, and the impact will become more prominent in the era of artificial intelligence, which depends even more heavily on data.

Data can be falsified at any stage. Falsification during storage is often the easiest: with existing storage technology, the owner, manager, or trustee of the data can unilaterally tamper with or delete it.

Because a key cause of untrustworthy data is that a single party can tamper with or delete it without authorization, avoiding this problem has naturally attracted considerable industry attention. Blockchain and decentralized storage technologies have curbed data tampering to some extent and have received preliminary validation in the market.

Many enterprises have begun to store data on a blockchain, for example in goods traceability scenarios. The usual practice is to write important data directly into blocks. This simple, crude approach does prevent deletion and modification, and thus enables trusted sharing of some data, but it has several problems:

First, it cannot store massive data. Blocks are not suited to storing large data such as multimedia; otherwise the block size becomes hard to control and the scalability of the blockchain suffers. As a result, the business must filter the original data and store only a small, essential subset in blocks, which reduces the richness of the trusted data.

Second, data access is inefficient. Because records must first be packaged into blocks, blockchain storage is generally unsuitable for high-speed writes. And because reads work by traversing the chain, a blockchain cannot support fast indexing, let alone SQL.
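The difference between traversal-based reads and indexed reads can be illustrated with a minimal sketch. The toy `chain` structure, the record keys, and both function names below are assumptions made for illustration; they do not describe the storage format of any particular blockchain.

```python
from typing import Optional

# Hypothetical toy chain: each block carries a list of (key, value) records.
chain = [
    {"height": 0, "records": [("order-1", "created")]},
    {"height": 1, "records": [("order-2", "created"), ("order-1", "shipped")]},
    {"height": 2, "records": [("order-3", "created")]},
]

def find_by_scan(blocks: list, key: str) -> Optional[str]:
    """Walk blocks from newest to oldest until the key appears.

    The cost grows with the total number of records on the chain.
    """
    for block in reversed(blocks):
        for record_key, value in reversed(block["records"]):
            if record_key == key:
                return value
    return None

def build_index(blocks: list) -> dict:
    """Build a key -> latest value map once, as a database index would."""
    index = {}
    for block in blocks:
        for record_key, value in block["records"]:
            index[record_key] = value  # later blocks overwrite earlier values
    return index

index = build_index(chain)
latest_by_scan = find_by_scan(chain, "order-1")
latest_by_index = index.get("order-1")
```

Both lookups return the same value, but the scan touches every block in the worst case, while the index answers with a single dictionary lookup once it has been built.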

Third, data maintenance is inefficient. Because each block references its predecessor in sequence, a blockchain does not support deleting or modifying individual historical records (short of regenerating the full chain, which a blockchain should not encourage). Note that "preventing unilateral tampering" and "making deletion impossible altogether" are two very different things: the former is a technical means of ensuring mutual trust, while the latter may mean losing a necessary function.
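Why a single historical change forces the rest of the chain to be regenerated can be seen in a small hash-chain sketch. The block layout and the helper names (`block_hash`, `build_chain`, `first_invalid_block`) are illustrative assumptions, not the format of any real blockchain.

```python
import hashlib
import json
from typing import Optional

GENESIS_PREV = "0" * 64  # placeholder predecessor hash for the first block

def block_hash(index: int, data: str, prev_hash: str) -> str:
    """Hash a block's content together with its predecessor's hash."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def build_chain(items: list) -> list:
    """Link each item to the previous block through its hash."""
    chain, prev = [], GENESIS_PREV
    for i, data in enumerate(items):
        digest = block_hash(i, data, prev)
        chain.append({"index": i, "data": data, "prev": prev, "hash": digest})
        prev = digest
    return chain

def first_invalid_block(chain: list) -> Optional[int]:
    """Return the index of the first block that fails verification, or None."""
    prev = GENESIS_PREV
    for block in chain:
        recomputed = block_hash(block["index"], block["data"], block["prev"])
        if block["prev"] != prev or recomputed != block["hash"]:
            return block["index"]
        prev = block["hash"]
    return None

chain = build_chain(["a", "b", "c", "d"])

# Modify one historical record: the block's stored hash no longer matches.
tampered = [dict(block) for block in chain]
tampered[1]["data"] = "B"
broken_at = first_invalid_block(tampered)

# Re-hashing only that block just moves the break to its successor.
tampered[1]["hash"] = block_hash(1, "B", tampered[1]["prev"])
broken_after_rehash = first_invalid_block(tampered)
```

In this sketch, repairing the edited block only pushes the inconsistency one block further along, which is why every later block would have to be regenerated as well.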

Finally, there is a risk of data loss. This risk applies only to PoW blockchain systems that follow the longest-chain rule of Nakamoto consensus. In such a blockchain, when a fork occurs, the longest (or heaviest) branch is retained and the other branches are discarded, so data in a block is in principle always at risk of being overturned and discarded. Behaviors such as selfish mining exacerbate this risk, which is unacceptable in data storage applications.
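A minimal sketch of the longest-chain rule shows how records on a losing branch disappear. Branch length is used here as a stand-in for accumulated work, which is a simplifying assumption; the record names and both functions are hypothetical.

```python
def select_canonical(branches: list) -> list:
    """Keep the branch with the most blocks (a stand-in for the most work)."""
    return max(branches, key=len)

def discarded_records(branches: list, canonical: list) -> list:
    """List records that appear on some branch but not on the canonical one."""
    kept = {record for block in canonical for record in block}
    seen = {record for branch in branches for block in branch for record in block}
    return sorted(seen - kept)

# Two branches share a common prefix and then fork.
shared = [["tx-1"], ["tx-2"]]
branch_a = shared + [["tx-3a"]]
branch_b = shared + [["tx-3b"], ["tx-4b"]]

canonical = select_canonical([branch_a, branch_b])
lost = discarded_records([branch_a, branch_b], canonical)
```

Here `branch_b` wins because it is longer, and the record that existed only on `branch_a` ends up in `lost`. In a payment system such a record can usually be resubmitted, but for a storage application this is the loss risk the paragraph above describes.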

For these reasons, using a traditional blockchain directly for data storage cannot meet the needs of trusted data storage in many practical scenarios. This has prompted much discussion, such as which data should be stored on-chain and which off-chain. Such questions arise fundamentally from the limited storage efficiency and capacity of the blockchain; after all, in the database era we never discussed which data should be stored outside the database.

In recent years, several products have offered useful approaches to the inefficiency of blockchain data storage, such as the InterPlanetary File System (IPFS), R3 Corda, and Tencent TrustSQL. However, each of these products still has shortcomings as trusted data storage:

IPFS generates a hash digest of the data content and stores the data in a distributed manner across multiple nodes. No single holder has the complete data, which protects data privacy to some extent. However, IPFS content can only be added, not modified in place (any change to the content changes its hash value), and IPFS lacks access control and other data security measures, so on the whole it still struggles to meet the needs of enterprise services.
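The link between content and address can be sketched with a toy content-addressed store. This is a deliberate simplification: real IPFS uses CIDs, chunking, and a distributed network, none of which is modeled here, and the `ContentStore` class is a hypothetical name.

```python
import hashlib

class ContentStore:
    """Toy content-addressed store: the address is the SHA-256 of the content."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, content: bytes) -> str:
        """Store content and return its address."""
        address = hashlib.sha256(content).hexdigest()
        self._blobs[address] = content
        return address

    def get(self, address: str) -> bytes:
        """Fetch content and check that it still matches its address."""
        content = self._blobs[address]
        if hashlib.sha256(content).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return content

store = ContentStore()
addr_v1 = store.put(b"report, version 1")
addr_v2 = store.put(b"report, version 2")  # an "edit" yields a new address
```

An edited document is stored under a different address while the old address still resolves to the old content, so there is no in-place modification. The sketch also has no notion of who may read a blob, which mirrors the missing access control noted above.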

Corda is a storage product tailored to the privacy needs of financial transactions and focuses on the privacy of data storage. For this reason, Corda has no global ledger and requires witnesses, which makes it a private but not secure and reliable data storage scheme.

TrustSQL and similar products in China adopt a simple and intuitive design, currently the most common practice in China: first store the data in a database (or IPFS), then store the operation records, data hashes, and so on on the chain. Compared with TrustSQL, similar products such as Peersafe's ChainSQL further strengthen SQL support. Products of this kind satisfy the need for data to be auditable and open to transparent supervision, but they still cannot prevent deletion or modification of the data itself. In addition, preserving key data relies on full-copy storage at the participating nodes, so the storage cost is somewhat higher, and the design for data privacy remains insufficient.
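The "data in the database, hash on the chain" pattern can be sketched as follows. The `database` dictionary and `ledger` list are stand-ins chosen for illustration and are not the APIs of TrustSQL or ChainSQL; the function names are likewise hypothetical.

```python
import hashlib

database = {}  # off-chain store, controlled by a single party
ledger = []    # stand-in for an append-only on-chain log of (key, digest)

def write(key: str, value: bytes) -> None:
    """Store the value off-chain and anchor its hash on the ledger."""
    database[key] = value
    ledger.append((key, hashlib.sha256(value).hexdigest()))

def audit(key: str) -> str:
    """Compare the off-chain value with the most recent anchored hash."""
    anchored = [digest for anchored_key, digest in ledger if anchored_key == key]
    if not anchored:
        return "unknown"
    if key not in database:
        return "deleted"
    if hashlib.sha256(database[key]).hexdigest() != anchored[-1]:
        return "tampered"
    return "ok"

write("invoice-1", b"amount=100")
status_initial = audit("invoice-1")

database["invoice-1"] = b"amount=900"  # unilateral modification off-chain
status_after_edit = audit("invoice-1")

del database["invoice-1"]              # unilateral deletion off-chain
status_after_delete = audit("invoice-1")
```

The audit detects both the modification and the deletion, which is the "auditable" property. It cannot restore the original value, though, because the ledger holds only the hash. That is the limitation the paragraph above points out.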

To address the shortcomings of the above products, Wuyuan Technology took a different path through original technological innovation and launched ImSQL, a product with independent intellectual property rights that aims to provide trusted storage which genuinely ensures that data cannot be tampered with or deleted without authorization.

ImSQL (Immutable SQL Database) is a new trusted data storage solution based on blockchain and distributed storage technology. It solves core problems such as preventing unauthorized deletion, protecting data privacy, and reducing storage cost, and it provides a reliable technical path for trusted storage and data sharing in the big data era.

Compared with existing products, ImSQL has the following outstanding advantages:

1. Prevents unilateral tampering and deletion of data. Multi-party verification at both write and read time, together with tamper-proof and deletion-proof storage, safeguards the credibility of the data at every stage. Participants in an application can therefore trust one another and use each other's data with confidence, and the data can support accurate traceability and accountability.

2. Eliminates single points of failure. The parties that share the data also maintain it jointly, so the data is not stored with one party alone. This yields a trusted, distributed shared data pool, which both avoids the risk of a single point of failure and improves the efficiency of data sharing.

3. Fragmented storage meets data privacy needs. No single party can hold the complete data, which solves the privacy problems of centralized storage in traditional cloud computing and of full-copy storage in blockchains. Apart from the owner of the data, no storage custodian has access to the complete data.

4. Excellent data access performance. A single ImSQL node can achieve a write speed of 3000 TPS and a read speed of 10000 QPS. ImSQL also supports SQL and scales horizontally, so cluster expansion can raise these figures several times over.

5. Efficient access to big data such as multimedia. ImSQL supports efficient access, indexing, and expansion, is well suited to big data business scenarios, and can store video and other data reliably and efficiently, bringing trusted security to video surveillance and similar scenarios.

6. Lower storage cost. The fragmented design greatly reduces the storage pressure and cost borne by each storage participant, giving more participants the opportunity to join the trusted data sharing ecosystem.

7. Distributed architecture that is compatible with light nodes and encourages more nodes to participate. There are no super nodes; all nodes involved in storage have equal status, which better ensures the reliability and resilience of the system. In addition, a node that chooses to run in light-copy mode stores only part of the data, which greatly reduces its storage pressure: its obligations are reduced, but its rights are unaffected.
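The storage saving claimed for fragmented storage (points 3 and 6) can be estimated with simple arithmetic. The article does not state how many copies of each fragment ImSQL keeps or how fragments are distributed, so the even distribution, the copy count of 3, and the data volume below are all assumptions made purely for illustration.

```python
def per_node_storage_gb(total_gb: float, nodes: int, copies_per_fragment: int) -> float:
    """Average storage per node when each fragment is kept on a fixed number of nodes.

    Assumes fragments are spread evenly across all nodes (an assumption).
    """
    if not 1 <= copies_per_fragment <= nodes:
        raise ValueError("copies_per_fragment must be between 1 and the node count")
    return total_gb * copies_per_fragment / nodes

TOTAL_GB = 900.0  # hypothetical data volume
NODES = 10        # hypothetical number of storage participants

# Full-copy storage: every node keeps everything.
full_copy = per_node_storage_gb(TOTAL_GB, NODES, copies_per_fragment=NODES)

# Fragmented storage: each fragment lives on 3 of the 10 nodes (assumed).
fragmented = per_node_storage_gb(TOTAL_GB, NODES, copies_per_fragment=3)
```

Under these assumed numbers each node stores 270 GB instead of 900 GB, and the per-node share keeps shrinking as more participants join, which is the effect point 6 describes.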

ImSQL combines database attributes such as mass storage, fast indexing, and horizontal scaling with the blockchain property that stored data becomes fixed. In the many fields that care about trusted data storage and sharing, it is expected to bring new capabilities and convenience. Examples include data exchange and mutual trust among all parties in a supply chain, data interconnection between departments of a government or large enterprise, and storage of the massive data involved in trusted traceability.

Take government big data construction as an example. Achieving efficient data interconnection among many different government departments and entities has always been difficult. Current practice often requires establishing an independent big data department, building an independent data storage system, analyzing and restructuring the relevant data from the different entities, and then visualizing it. This often incurs large upfront costs, including not only explicit costs such as personnel, funding, and materials, but also hidden costs such as staffing quotas, the allocation of authority, responsibility, and benefits, time, and departmental silos. The existence of an independent big data department also implies that a trusted third party must endorse the data and even assume responsibility for it. If ImSQL serves as the underlying data exchange platform in this scenario, the task can be accomplished more efficiently, in the following ways:

No endorsement by a third-party entity is needed: the different entities write their data directly to ImSQL, where it is preserved and can no longer be tampered with or deleted by any single party. This ensures that the data is available, consistent, and credible whenever other entities access it.

No additional data storage system needs to be built and maintained: the data is stored and maintained by all participating entities, and is shared and interconnected by design, which reduces implementation and maintenance costs without reducing efficiency of use. At the same time, ImSQL's fragmented storage technology enables data sharing while protecting privacy: the data stored by each entity can be incomplete fragments, and only entities with access rights hold the key needed to find, combine, and interpret the fragments.
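The idea that each entity stores only fragments, while an authorized party holds what is needed to find and recombine them, can be sketched as follows. This is not ImSQL's actual scheme, which the article does not describe. The round-robin placement, the random fragment IDs, the `manifest` structure, and the department names are all hypothetical, and encryption of the fragments (the "interpret" step) is omitted.

```python
import secrets

def scatter(data: bytes, nodes: dict, fragment_size: int) -> list:
    """Split data into fragments, spread them over nodes, and return a manifest.

    The manifest (ordered node/fragment-ID pairs) plays the role of the key
    held only by parties with access rights.
    """
    node_names = sorted(nodes)
    pieces = [data[i:i + fragment_size] for i in range(0, len(data), fragment_size)]
    manifest = []
    for position, piece in enumerate(pieces):
        node = node_names[position % len(node_names)]  # round-robin placement
        fragment_id = secrets.token_hex(8)             # opaque ID, reveals no order
        nodes[node][fragment_id] = piece
        manifest.append((node, fragment_id))
    return manifest

def gather(manifest: list, nodes: dict) -> bytes:
    """Reassemble the original data by following the manifest in order."""
    return b"".join(nodes[node][fragment_id] for node, fragment_id in manifest)

nodes = {"dept-a": {}, "dept-b": {}, "dept-c": {}}
record = b"citizen-id=42;permit=approved;fee=120"
manifest = scatter(record, nodes, fragment_size=8)
restored = gather(manifest, nodes)
```

Each department ends up holding only some of the fragments under opaque IDs, and only a holder of `manifest` can locate and order them. A real design would also need to encrypt the fragments and replicate them for availability.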

To sum up, as a credible, tamper-proof data storage technology, ImSQL inherits the blockchain's strength in data preservation while overcoming its weakness in efficiency, giving users a data access experience as efficient as a database's. ImSQL is a new category produced by combining blockchain and database technology and, in the author's view, the only choice for realizing trusted data storage.

Author information: Dr. Jiao Zhenzhen, founder of Wuyuan Technology, Associate Professor and Master's Supervisor (Chinese Academy of Sciences).

