What Is the Relationship Between SequoiaDB (Giant Sequoia Database) and MongoDB

2025-04-20 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/01 Report--

Many newcomers are unclear about how SequoiaDB (Giant Sequoia Database) relates to MongoDB. The Q&A below explains the question in detail for anyone who needs it.

As commercial open source software, SequoiaDB already has a large community of users. Since it was open-sourced, users have raised many questions with us, ranging from the principles and architecture of distributed databases to installing and using SequoiaDB (SDB). So we invited technical experts to talk these topics through with you. If you have questions, feel free to bring them along!

Yesterday, two SequoiaDB technical experts answered questions about distributed databases in the official SequoiaDB WeChat group. Here are the practical takeaways.

1. Official figures say SequoiaDB outperforms MongoDB in many respects. Can the former replace the latter? If not, in which aspects of performance does SequoiaDB fall short of MongoDB, and why? Is it a design factor? What is the relationship between MongoDB and SDB?

SDB can completely replace MongoDB, and it supports many features that MongoDB does not, such as transactions and join queries.

At present, SDB is ahead of MongoDB both in performance and in its distribution mechanism.

If we have to name a weakness, it lies mainly in promotion and community development. MongoDB's earliest starting point was the developer.

The E-R (entity-relationship) structure iterates too slowly for development, whereas JSON documents map one-to-one to objects and place no restrictions on schema, which greatly helps fast development iterations such as POCs. Many Internet companies ship a new version in half a day, which places high demands on development.
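The one-to-one mapping between application objects and JSON documents can be illustrated with a minimal sketch. It uses only the Python standard library; the `User` and `UserV2` classes and the in-memory `collection` list are hypothetical stand-ins and do not use any SequoiaDB or MongoDB driver.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class User:
    # First iteration of a hypothetical application object.
    name: str
    age: int
    tags: list = field(default_factory=list)


@dataclass
class UserV2(User):
    # A later iteration adds a field; no schema migration is needed,
    # because each document carries its own structure.
    email: str = ""


def to_document(obj) -> str:
    """Serialize a dataclass instance to a JSON document."""
    return json.dumps(asdict(obj), sort_keys=True)


# Stand-in for a schema-free document collection: old and new
# document shapes sit side by side.
collection = [
    to_document(User(name="alice", age=30)),
    to_document(UserV2(name="bob", age=25, email="bob@example.com")),
]
```

Documents written by the first version stay readable after the second version ships, which is the property that makes half-day iterations practical.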

MongoDB has done a good job on ease of use. We are working hard on it too, and we hope you will support us!

MongoDB has nothing to do with SDB; the two projects started from very different points.

2. Developing a database is technically demanding and needs to be grounded in published theory, such as the balance between A and C in the CAP theorem. We would like to hear your analysis of the principles.

CAP stands for Consistency, Availability, and Partition tolerance. The theory was put forward mainly for distributed storage systems. P is a necessity in a distributed storage system; when the network or other factors fail, A and C cannot both be satisfied, which gave rise to the "choose two out of three" idea and the choice between AP and CP.

"Choose two out of three" misleads many newcomers into thinking a system is either CP or AP. In fact that framing is too extreme; it is not a simple black-and-white choice.

First of all, since partitions rarely occur, there is no reason to sacrifice C or A while the system has no partition. Secondly, the trade-off between C and A can be made repeatedly, at very fine granularity, within the same system, and each decision may differ depending on the specific operation, or even on the specific data or users involved. So it is a sliding scale between 0 and 100%.

For example, a distributed setup with one master and two slaves guarantees AP, but the speed of network synchronization determines how well it achieves CP, so this synchronization capability varies from 0 to 100% with the network environment.

When conditions are good, both are basically satisfied. When the network is cut, a choice has to be made, but once synchronization completes and a slave node is promoted to master, the balance is restored.

Of course, if CP must mean strong consistency, then strong consistency cannot be achieved when a machine crashes, and CP is not satisfied.

ACID is the most important property of traditional relational databases: atomicity, consistency, isolation, and durability. It emphasizes consistency and so belongs to CP.

How to balance these two states has to be decided from business logic, user requirements, and business requirements.
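The idea that the C/A trade-off can be made per operation can be sketched with a toy model of one master and two slaves. This is an illustrative simulation only, not SequoiaDB's replication protocol; the `Replica` and `ReplicaGroup` classes and the `required_acks` parameter are assumptions introduced for the example.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.reachable = True  # False simulates a network partition


class ReplicaGroup:
    """Toy model: one master and two slaves."""

    def __init__(self):
        self.master = Replica("master")
        self.slaves = [Replica("slave1"), Replica("slave2")]

    def write(self, key, value, required_acks):
        """Apply a write only if enough nodes can acknowledge it.

        required_acks counts the master itself: 1 means "master only"
        (favours availability), 3 means "every node" (favours consistency).
        """
        nodes = [self.master] + self.slaves
        reachable = [n for n in nodes if n.reachable]
        if len(reachable) < required_acks:
            return False  # reject the write rather than let replicas diverge
        for node in reachable:
            node.data[key] = value
        return True


group = ReplicaGroup()
group.slaves[1].reachable = False  # a partition cuts off one slave

# Relaxed write: accepted, but slave2 is now stale.
ok_relaxed = group.write("balance", 100, required_acks=1)

# Strict write: rejected while the partition lasts; no node changes.
ok_strict = group.write("balance", 200, required_acks=3)
```

The same cluster behaves as AP for the first write and as CP for the second, which is why the choice is better seen as a per-operation setting than as a fixed label on the system.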

3. SequoiaDB is now split into a community edition and an enterprise edition. Is the community edition cut down, and which scenarios suit each of them?

The open source community edition and the enterprise edition share the same database kernel source code. Compared with the enterprise edition, the community edition is limited only in the visual operations and SparkSQL parts.

The other difference is professional services. The enterprise edition includes some of our dedicated debugging tools, which make it easier to provide support, but this has little impact on community users.

4. How is SequoiaDB combined with Spark? If you don't use an ordinary SQL query, what method is used to query?

Spark is suitable for complex queries with low concurrency and large amounts of data.

SequoiaDB has developed a connector specifically for Spark, which supplies raw data to Spark. Spark SQL syntax largely follows Hive SQL syntax, so you can generally consult the Hive SQL syntax when writing queries.

5. What is the difference between Spark SQL and ordinary SQL? Is it a different grammar or a different language?

Spark SQL is similar to standard SQL, so writing it is not difficult. However, it is not intended for OLTP (online transaction processing), so some syntax is unsupported; it is a subset of SQL.

6. What is the bottleneck of SequoiaDB?

At present, the biggest bottleneck is the isolation mechanism. SequoiaDB mainly pursues high availability and high performance; that is, between the ACID and BASE models, SDB's advantages are more apparent under BASE.

ACID and BASE drove the development of relational databases and NoSQL respectively. NewSQL now tries to find a better balance, supporting ACID as far as possible on top of high availability.

For today's businesses, high-availability scenarios are far more common than OLTP scenarios.

7. What is the difference between NoSQL and NewSQL?

This starts with NoSQL, which was driven by the huge numbers of users in the Internet era. High availability became very important, and so the BASE model was introduced.

BASE stands for Basically Available, Soft state, Eventually consistent. Soft state is an intermediate state, such as the state of a distributed slave replica whose synchronization is delayed. From the BASE perspective, high availability comes first. BASE belongs to AP in CAP, and its main purpose is to keep business iteration convenient and the service highly available. NewSQL builds on BASE and satisfies ACID as far as possible.
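Soft state and eventual consistency can be made concrete with a small sketch of asynchronous log replay. This is a toy model under stated assumptions, not how SequoiaDB replicates data; the `Master` and `Slave` classes and the `max_entries` parameter are hypothetical names chosen for the example.

```python
class Master:
    def __init__(self):
        self.data = {}
        self.log = []  # ordered list of (key, value) writes

    def write(self, key, value):
        # The write is acknowledged immediately (basically available).
        self.data[key] = value
        self.log.append((key, value))


class Slave:
    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries already replayed

    def sync(self, master, max_entries=None):
        """Replay pending log entries from the master (all by default)."""
        pending = master.log[self.applied:]
        if max_entries is not None:
            pending = pending[:max_entries]
        for key, value in pending:
            self.data[key] = value
        self.applied += len(pending)


master, slave = Master(), Slave()
master.write("stock", 10)
master.write("stock", 9)

stale_read = slave.data.get("stock")    # None: the slave has not synced yet
slave.sync(master, max_entries=1)
lagging_read = slave.data.get("stock")  # 10: soft state, one write behind
slave.sync(master)
final_read = slave.data.get("stock")    # 9: converged with the master
```

Reads from the slave are always served, but they may return an older value until replay catches up, which is the availability-first behaviour BASE describes.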

8. Does SequoiaDB support distributed file storage?

Yes. SequoiaDB offers the SequoiaCM product, which is dedicated to unstructured storage. It uses the same distribution mechanism but a different storage structure, namely block storage.

At present, SDB supports two storage structures: a row-oriented BSON structure and a block storage structure. SDB's block storage feature is called LOB (large object), and it supports storing unstructured files of any size. You can therefore use SDB's LOB feature to store files: a file is split into blocks of 256 KB (the default) and distributed across the whole database cluster, with data redundancy depending on the data partition groups.
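The block splitting described above can be sketched in a few lines. Only the 256 KB default block size comes from the text; the `split_into_blocks` and `place_blocks` helpers, the group names, and the round-robin placement are assumptions made for illustration, and the real placement policy may differ.

```python
DEFAULT_BLOCK_SIZE = 256 * 1024  # 256 KB, the default block size mentioned above


def split_into_blocks(data: bytes, block_size: int = DEFAULT_BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks; the last one may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks, groups):
    """Assign block indexes to partition groups round-robin.

    Round-robin is an illustrative policy only, not SequoiaDB's
    documented placement algorithm.
    """
    placement = {group: [] for group in groups}
    for index in range(len(blocks)):
        placement[groups[index % len(groups)]].append(index)
    return placement


payload = bytes(1024 * 1024 + 1)  # a 1 MiB file plus one extra byte
blocks = split_into_blocks(payload)
placement = place_blocks(blocks, ["group1", "group2", "group3"])
```

A file slightly over 1 MiB yields four full blocks and one single-byte block, so even small overruns cost one extra block.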

9. Does SDB currently have an official Docker image?

There is currently no official Docker image.

The main reason is performance: Docker requires considerable care with memory management and I/O usage. From the perspective of small projects and ease of use, however, we will consider introducing Docker.

We will provide an app image on QingCloud soon, and we will also consider Docker deployment for other clouds.

10. If I store video files, how will performance change?

If users continuously write large unstructured files to SDB, the performance change will depend mainly on your servers.
