Can a distributed cache be used as a NoSQL database?

2025-04-01 Update From: SLTechnology News&Howtos shulou


InfoQ: Can you compare distributed caching solutions with NoSQL databases?

Greg Luck: Distributed caches usually keep data in memory in order to reduce latency. NoSQL databases are DBMSs without the R (that is, database management systems without relations) and generally lack support for transactions and other advanced features. Joins across table relations are the most troublesome part of SQL for systems that do not support relations, and that is where the name NoSQL comes from.

One type of NoSQL database is the key-value store. Typical examples include Dynamo, Oracle NoSQL Database, and Redis. A cache is also a key-value store, so the two are related. Many cache implementations can be configured to be persistent, but most of the time they are not, because the purpose of a cache is performance rather than persistence. A NoSQL database, by contrast, exists for persistence.

A persistent cache can also be used as a key-value NoSQL database. NoSQL is also associated with Big Data, which usually means more data than fits on a single RDBMS node, ranging from a few TB to a few PB.
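The idea that a cache is a key-value store which can optionally be made persistent can be sketched as follows. This is a minimal illustration, not how Ehcache or any particular product works: the class name `PersistentCache`, the JSON file format, and the `flush` method are all assumptions made for the example, and keys must be strings because of the JSON encoding.

```python
import json
import os
import tempfile


class PersistentCache:
    """In-memory key-value cache that can optionally be saved to disk (hypothetical)."""

    def __init__(self, path=None):
        self.path = path  # None means a purely in-memory cache
        self.data = {}
        if path and os.path.exists(path):
            # Reload whatever an earlier process saved.
            with open(path, "r", encoding="utf-8") as f:
                self.data = json.load(f)

    def get(self, key, default=None):
        return self.data.get(key, default)

    def put(self, key, value):
        self.data[key] = value

    def flush(self):
        """Write the whole map to disk; does nothing when persistence is off."""
        if self.path is None:
            return
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
        os.replace(tmp, self.path)  # swap the finished file into place


def demo():
    path = os.path.join(tempfile.mkdtemp(), "cache.json")
    first = PersistentCache(path)
    first.put("user:1", {"name": "Ada"})
    first.flush()
    # A second instance stands in for a restarted process.
    second = PersistentCache(path)
    return second.get("user:1")
```

With `path=None` the object behaves as a plain cache; with a path it survives restarts, which is the property that lets it stand in for a simple key-value database.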

Distributed caches are usually used to reduce latency for transactional data, which is not large to begin with but is gradually growing toward Big Data scale. Because a cache keeps its data in memory, storage costs more and the size of the data has to be limited. If you rely on heap storage, each server node may hold as little as 2 GB. With a distributed cache such as Ehcache, which also provides off-heap storage, each server can hold hundreds of GB, so the cluster can serve as a TB-scale cache.
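The difference between the two storage options is easiest to see as node counts. The short calculation below uses the 2 GB heap figure from the interview; the 200 GB off-heap figure and the 1 TB target are assumptions chosen only to make "hundreds of GB" and "TB-level" concrete.

```python
import math


def nodes_needed(total_gb, per_node_gb):
    """Smallest number of nodes whose combined capacity covers total_gb."""
    return math.ceil(total_gb / per_node_gb)


TARGET_GB = 1024            # assumed target: 1 TB of cached data
HEAP_PER_NODE_GB = 2        # heap-only figure quoted in the interview
OFF_HEAP_PER_NODE_GB = 200  # assumption standing in for "hundreds of GB"

heap_nodes = nodes_needed(TARGET_GB, HEAP_PER_NODE_GB)          # 512 nodes
off_heap_nodes = nodes_needed(TARGET_GB, OFF_HEAP_PER_NODE_GB)  # 6 nodes
```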

A persistent, distributed cache can serve some NoSQL use cases. A NoSQL database can also serve some caching use cases, but with slightly higher latency.

InfoQ: From an architectural point of view, are there any similarities between distributed caching and NoSQL databases?

Greg: Both aim to deliver higher TPS and better scalability than an RDBMS. To that end, both simplify their feature set, leaving out troublesome things such as joins, stored procedures, and ACID transactions.

Although the Java caching world has JSR 107, which provides a standard caching API for Spring and Java EE programmers, both tend to use proprietary interfaces rather than standardized ones.

Both partition the data in a way that is transparent to the client, and both scale out. Non-Java products also do a good job of scaling up. With Terracotta BigMemory, we are unusual in being able to scale up on the Java platform as well. Finally, both can be deployed on commodity hardware and operating systems, which makes them well suited to running in the cloud.
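Partitioning that is transparent to the client can be sketched as below. This is a minimal illustration under stated assumptions: `PartitionedCache` is a hypothetical name, plain dictionaries stand in for remote nodes, and simple hash-modulo placement is used for brevity. Modulo placement remaps most keys when the node list changes, so real systems often use other schemes such as consistent hashing.

```python
import hashlib


class PartitionedCache:
    """Client-side view of a cache whose keys are spread over several nodes."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # Each dict stands in for the storage of one remote node.
        self.nodes = {name: {} for name in self.node_names}

    def _node_for(self, key):
        # A stable hash so every client maps a key to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.node_names)
        return self.node_names[index]

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key, default=None):
        return self.nodes[self._node_for(key)].get(key, default)


cache = PartitionedCache(["node-a", "node-b", "node-c"])
for i in range(100):
    cache.put(f"order:{i}", i)
```

The caller only ever uses `put` and `get`; which node holds a given key is decided inside the client library.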

InfoQ: How do the two technologies differ architecturally?

Greg: NoSQL and RDBMS usually store data on disk. A disk is a mechanical device with high latency: seek time is the time the head takes to move to the correct track, and read and write time depends on the disk's RPM. NoSQL tries to optimize disk usage, for example by using append-only logs that write at the head's current position and by flushing to disk only occasionally. Caches, in contrast, mainly keep data in memory.
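The append-only approach mentioned above can be sketched as follows. This is an illustration only, not the storage engine of any specific NoSQL product: the `AppendOnlyLog` class, the JSON-lines record format, and the `sync_every` parameter are assumptions made for the example.

```python
import json
import os
import tempfile


class AppendOnlyLog:
    """Writes every update to the end of a file and syncs only occasionally."""

    def __init__(self, path, sync_every=100):
        self.file = open(path, "a", encoding="utf-8")
        self.sync_every = sync_every
        self.pending = 0  # records written since the last sync

    def append(self, key, value):
        # Sequential write: no seeking to an existing record.
        self.file.write(json.dumps({"k": key, "v": value}) + "\n")
        self.pending += 1
        if self.pending >= self.sync_every:
            self.sync()

    def sync(self):
        self.file.flush()
        os.fsync(self.file.fileno())  # ask the OS to push the data to disk
        self.pending = 0

    def close(self):
        self.sync()
        self.file.close()


def replay(path):
    """Rebuild the current state by reading the log in order; later records win."""
    state = {}
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            state[record["k"]] = record["v"]
    return state


log_path = os.path.join(tempfile.mkdtemp(), "data.log")
log = AppendOnlyLog(log_path, sync_every=2)
log.append("a", 1)
log.append("b", 2)
log.append("a", 3)  # an update is a new record, not an in-place overwrite
log.close()
```

Batching the `fsync` calls trades some durability for throughput: records written since the last sync can be lost if the machine crashes.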

NoSQL and RDBMS clients are thin (think of Thrift or JDBC) and fetch all data over the network, while caches like Ehcache combine in-process and remote storage, so common requests can be served locally. In a distributed cache, hot data is held in the in-process store of each application server, so adding application servers does not increase the load on the network or the back end.
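The combination of in-process and remote storage can be sketched as a two-tier lookup. This is a simplified illustration, not Ehcache's actual design: `TieredCache` is a hypothetical name, a shared dictionary stands in for the remote tier, and the sketch does not handle invalidation of local copies when remote data changes.

```python
class TieredCache:
    """Checks an in-process map first and goes to the remote tier only on a miss."""

    def __init__(self, remote):
        self.local = {}        # in-process store, private to one application server
        self.remote = remote   # dict standing in for a cache server across the network
        self.remote_reads = 0  # counts simulated network round trips

    def get(self, key):
        if key in self.local:
            return self.local[key]  # served locally, no network hop
        self.remote_reads += 1
        value = self.remote.get(key)
        if value is not None:
            self.local[key] = value  # keep hot data close to the application
        return value


shared_remote = {"product:42": "widget"}
server_one = TieredCache(shared_remote)
server_two = TieredCache(shared_remote)
for _ in range(1000):
    server_one.get("product:42")
    server_two.get("product:42")
```

Each simulated application server touches the remote tier once for the hot key and serves the remaining reads from its own memory.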

An RDBMS aims to be the general-purpose SOR (System of Record). NoSQL wants to be the SOR for specific data types, such as key-value pairs, documents, sparse tables (wide tables), or graphs. A cache is performance-oriented and is typically used alongside an RDBMS or NoSQL database, which remains the SOR. Often what is stored in the cache is the result of a web service call or a computed business object, which may have taken hundreds of SOR calls to produce.

A cache like Ehcache runs partly inside the application's operating system process and partly in its own process on a separate machine across the network. Not all distributed caches work this way: memcached is an example where all data is stored across the network.

InfoQ: What kinds of applications are best suited to this approach?

Greg: Let me continue from the previous question. Adding a distributed cache to an existing application usually takes only a small amount of work, while adopting NoSQL takes a lot of work and major architectural changes.

So the first type of application suited to distributed caching is an existing system, especially one that needs to:

Scale out because of a surge in usage or load

Reduce latency to meet an SLA

Minimize the use of expensive infrastructure such as mainframes

Reduce the cost of web service calls

Handle extreme load peaks (such as promotions like Black Friday)


InfoQ: Are there any limitations to this approach?

Greg: Because caches live in memory, their size is constrained by how much memory is available to them (more on this below).
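One common way to live within a memory limit is to cap the number of entries and evict the least recently used one when the cap is exceeded. The sketch below shows that policy in general terms; `BoundedCache` and `max_entries` are names chosen for the example, and the interview does not say which eviction policy any particular product uses.

```python
from collections import OrderedDict


class BoundedCache:
    """Holds at most max_entries items and evicts the least recently used one."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self.entries = OrderedDict()  # oldest use first, most recent use last

    def get(self, key, default=None):
        if key not in self.entries:
            return default
        self.entries.move_to_end(key)  # mark as recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # drop the least recently used entry


bounded = BoundedCache(max_entries=2)
bounded.put("a", 1)
bounded.put("b", 2)
bounded.get("a")     # "a" is now more recently used than "b"
bounded.put("c", 3)  # exceeds the cap, so "b" is evicted
```

An evicted entry is not lost as long as the SOR still holds the data; the next read simply pays the cost of fetching it again.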

A cache, even one that provides persistence, is not necessarily the best choice for an SOR. Caches deliberately avoid sophisticated backup-to-disk and restore-from-disk features, although simple versions exist. RDBMSs have accumulated a wealth of backup, restore, migration, reporting, and ETL features over the past 30 years. NoSQL sits somewhere in between.

A cache provides a programmatic API for changing and accessing data. NoSQL and RDBMS provide tools for running scripting languages against them, such as SQL, UnSQL, and Thrift.

But the key point to remember is that a cache does not want to be your SOR. It coexists easily with your RDBMS, and for that reason it does not need all the sophisticated features an RDBMS has.

InfoQ: How do you see distributed caching solutions, NoSQL databases, and traditional RDBMSs working together in the future?

Greg: Distributed caches can be much faster than an RDBMS, and depending on the deployment topology and the data access pattern, faster than NoSQL as well. Those who need lower latency can use a cache as a complement to NoSQL, just as they do with an RDBMS today.

One difference is that an RDBMS is often hard to extend to multiple nodes: doing so either affects the programming contract or runs into CAP tradeoffs. With NoSQL, even a single-node installation can be treated as if it were a multi-node one. If you scale up, you don't have these problems. With an RDBMS, a cache is added to avoid the hassle of scaling out. A cache can usually solve the system's capacity problem without much effort. So with an RDBMS, add a cache when you need to scale out.

For NoSQL, the ability to scale out is built in, so use caching when you need low latency.
