2025-01-15 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article analyzes relational databases and NoSQL with examples. Many readers are not very familiar with the topic, so it is shared here for reference.
NoSQL concept
With the rapid growth of Web 2.0, non-relational and distributed data stores have developed quickly, and they do not guarantee the ACID properties of relational databases. The NoSQL concept was put forward in 2009. The most common reading of NoSQL is "non-relational", and "Not Only SQL" is also widely accepted. (The name "NoSQL" was first used in 1998 for a lightweight relational database.)
NoSQL is most often used for key-value storage, but there are also document stores, column stores, graph databases, XML databases and so on. These databases were in use in various systems before the NoSQL concept was proposed, but they were rarely used in web applications. Examples include cdb, qdbm and bdb.
The bottleneck of traditional relational databases
Traditional relational databases perform well, are highly stable, have stood the test of time, and are easy to use and powerful. They have also accumulated a large number of successful cases. On the Internet, MySQL has become the undisputed leader, and it is no exaggeration to say that MySQL has made an outstanding contribution to the development of the Internet.
In the 1990s, a website's traffic was generally small and could easily be handled by a single database. At that time most sites were static pages, and dynamic, interactive websites were few.
Over the last 10 years, websites began to develop rapidly. Popular forums, blogs, SNS and Weibo gradually led the trend on the web. In the early days forum traffic was actually not large. If you got online early, you may remember forum programs that stored their data in text files, which gives an idea of typical forum traffic at the time.
Memcached+MySQL
Later, as traffic increased, most websites built on MySQL began to hit database performance problems. Web programs no longer focused only on functionality but also pursued performance. Programmers began to use caching extensively to ease the pressure on the database, and to optimize the database's structure and indexes. At first it was popular to relieve the database with file caches, but as traffic kept growing, file caches could not be shared across multiple web machines, and a large number of small cache files also caused heavy IO pressure. At this point Memcached naturally became a very fashionable technology.
As an independent distributed cache server, Memcached provides a shared high-performance cache service for multiple web servers. Multiple Memcached servers were scaled out by distributing keys with a hash algorithm, and consistent hashing then appeared to solve the problem of mass cache invalidation caused by adding or removing cache servers. At that time, saying in an interview that you had Memcached experience would definitely earn you extra points.
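The consistent hashing idea mentioned above can be sketched roughly as follows. This is a minimal illustration, not how any particular Memcached client implements it: the `HashRing` class, the server names, the MD5-based hash and the choice of 100 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash; Python's built-in hash() is randomized per process.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self._points = []   # sorted hash positions on the ring
        self._owner = {}    # hash position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            self._points.remove(point)
            del self._owner[point]

    def node_for(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = [f"user:{i}" for i in range(10_000)]
ring = HashRing(["cache-a", "cache-b", "cache-c"])   # hypothetical servers
before = {k: ring.node_for(k) for k in keys}

ring.add("cache-d")
moved = sum(1 for k in keys if ring.node_for(k) != before[k])
print(f"{moved / len(keys):.0%} of keys moved after adding a fourth server")
```

Only keys that now belong to the new server change owner. With a plain `hash(key) % server_count` scheme, most keys would map to a different server after the same change, which is the mass invalidation the text describes.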
MySQL master/slave read/write separation
Memcached can only relieve read pressure on the database. As write pressure grew, concentrating reads and writes on a single database became more than it could bear. Most websites began to use master-slave replication to separate reads from writes, improving read and write performance and the scalability of the read replicas. MySQL's master-slave mode became standard for websites at this point.
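A minimal sketch of the routing decision behind read/write separation is shown below. The `ReadWriteRouter` class and the host names are hypothetical, and real setups usually rely on a driver or proxy for this; the sketch only illustrates that writes go to the master while reads rotate across replicas.

```python
import itertools


class ReadWriteRouter:
    """Route writes to the master and spread reads across replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master if no replica is configured.
        self._replicas = itertools.cycle(replicas or [master])

    def route(self, sql):
        # Classify the statement by its first keyword.
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in {"SELECT", "SHOW"}:
            return next(self._replicas)
        return self.master


# Hypothetical host names, for illustration only.
router = ReadWriteRouter("master:3306", ["replica1:3306", "replica2:3306"])
targets = [
    router.route(q)
    for q in (
        "SELECT * FROM posts WHERE id = 1",
        "UPDATE posts SET views = views + 1 WHERE id = 1",
        "SELECT * FROM posts WHERE id = 2",
    )
]
print(targets)
```

Because replication to the slaves is usually asynchronous, a read that must see a row written a moment ago may still need to be sent to the master.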
Splitting tables and databases
As Web 2.0 continued to grow rapidly, and with Memcached caching, master-slave replication and read-write separation already in place, write pressure on the MySQL master began to become a bottleneck while data volume kept soaring. Because MyISAM uses table locks, serious lock contention appears under high concurrency, so many high-concurrency MySQL applications switched from MyISAM to the InnoDB engine (row locks). At the same time, splitting tables and splitting databases became popular as a way to relieve write pressure and cope with data growth. It became a hot topic both in interviews and in industry discussions. It was also at this time that MySQL introduced table partitioning, which was not yet stable but gave hope to companies with average technical strength. MySQL also launched MySQL Cluster, but it had few successful cases on the Internet and its performance could not meet Internet requirements, although it did provide strong guarantees in terms of high reliability.
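To make the idea of splitting tables and databases concrete, here is a small sketch of modulo-based routing. The database and table naming scheme, the counts, and the `locate` helper are assumptions for illustration; real deployments choose their own rules.

```python
DB_COUNT = 4        # hypothetical number of databases
TABLES_PER_DB = 8   # hypothetical number of tables in each database


def locate(user_id: int):
    """Map a user id to a (database, table) pair by modulo."""
    slot = user_id % (DB_COUNT * TABLES_PER_DB)
    db_index, table_index = divmod(slot, TABLES_PER_DB)
    return f"app_db_{db_index}", f"comments_{table_index:02d}"


db_name, table_name = locate(42)
print(db_name, table_name)
```

Changing `DB_COUNT` or `TABLES_PER_DB` later remaps most ids to a different location, which is one reason sharded setups are hard to expand.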
Scalability bottleneck of MySQL
On the Internet, most MySQL workloads are IO-intensive. In fact, if your MySQL is CPU-intensive, it is likely that your MySQL design has performance problems and needs to be optimized. Developing MySQL applications for large data volumes and high concurrency is becoming more and more complex and technically challenging. It takes experience to choose good rules for splitting tables and databases. Although technically strong companies such as Taobao have developed transparent middleware layers to hide this complexity from developers, the complexity of the overall architecture cannot be avoided. Split databases and tables also face the problem of expansion at some stage, and requirements may change in ways that call for a new splitting scheme.
MySQL databases often store large text fields, which makes tables very large. Recovery is then very slow, and it is hard to restore the database quickly. For example, 10 million text values of 4 KB each come to roughly 40 GB. If this data can be moved out of MySQL, MySQL becomes very small.
Relational databases are very powerful, but they cannot handle every application scenario well. MySQL scales poorly (complex techniques are needed to scale it), IO pressure is high with large data volumes, and changing the table structure is difficult. These are exactly the problems developers using MySQL face today.
Advantages of NoSQL
Easy to expand
There are many kinds of NoSQL databases, but a common feature is that they drop the relational features of relational databases. With no relationships between records, the data is easy to partition and extend, which also brings scalability at the architectural level.
Large amount of data, high performance
NoSQL databases have very high read and write performance, especially with large amounts of data. This comes from their non-relational nature and simple data structure. MySQL setups generally use the Query Cache, which is invalidated every time the table is updated. It is a coarse-grained cache, so in Web 2.0 applications with frequent interaction its hit rate is poor. A NoSQL cache is record-level and fine-grained, so NoSQL performs much better at this level.
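The difference between the two cache granularities can be sketched as below. Both classes are toy models written for this example, not the actual MySQL Query Cache or any NoSQL product's cache.

```python
class TableLevelCache:
    """Coarse-grained: any write to a table drops every cached query on it."""

    def __init__(self):
        self._by_table = {}   # table -> {query: result}

    def get(self, table, query):
        return self._by_table.get(table, {}).get(query)

    def put(self, table, query, result):
        self._by_table.setdefault(table, {})[query] = result

    def on_write(self, table):
        self._by_table.pop(table, None)


class RecordLevelCache:
    """Fine-grained: a write only drops the entry for the changed record."""

    def __init__(self):
        self._by_key = {}     # (table, primary key) -> row

    def get(self, table, pk):
        return self._by_key.get((table, pk))

    def put(self, table, pk, row):
        self._by_key[(table, pk)] = row

    def on_write(self, table, pk):
        self._by_key.pop((table, pk), None)


def query_for(pk):
    return f"SELECT * FROM users WHERE id = {pk}"


coarse, fine = TableLevelCache(), RecordLevelCache()
for pk in (1, 2, 3):
    coarse.put("users", query_for(pk), {"id": pk})
    fine.put("users", pk, {"id": pk})

# User 1 is updated: compare how much of each cache survives.
coarse.on_write("users")
fine.on_write("users", 1)

coarse_hits = sum(coarse.get("users", query_for(pk)) is not None for pk in (1, 2, 3))
fine_hits = sum(fine.get("users", pk) is not None for pk in (1, 2, 3))
print(coarse_hits, fine_hits)
```

After a single row update the table-level cache has lost every entry for the table, while the record-level cache still serves the two untouched records.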
Flexible data model
NoSQL does not require fields to be defined in advance for the data to be stored, and custom data formats can be stored at any time. In a relational database, adding and removing fields is troublesome, and on a table with a very large amount of data, adding a field is a nightmare. This is especially true in the Web 2.0 era of large data volumes.
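A small sketch of the contrast, under stated assumptions: SQLite (from the Python standard library) stands in for a relational database, and a plain dict of JSON strings stands in for a schemaless store. In SQLite adding a column is cheap; the pain described above applies to very large tables in a production database.

```python
import json
import sqlite3

# Relational side: the schema must know about a field before it can be stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

try:
    conn.execute("INSERT INTO users (id, name, nickname) VALUES (1, 'ann', 'a')")
    schema_error = None
except sqlite3.OperationalError as exc:
    schema_error = str(exc)   # the nickname column does not exist yet

conn.execute("ALTER TABLE users ADD COLUMN nickname TEXT")
conn.execute("INSERT INTO users (id, name, nickname) VALUES (1, 'ann', 'a')")
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

# Schemaless side: each record carries its own fields.
store = {}
store["user:1"] = json.dumps({"name": "ann"})
store["user:2"] = json.dumps({"name": "bob", "nickname": "b", "tags": ["admin"]})
user2 = json.loads(store["user:2"])
print(schema_error, row_count, user2)
```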
High availability
NoSQL can implement a highly available architecture fairly easily and with little impact on performance. For example, Cassandra and HBase can achieve high availability through replication.
Summary
The emergence of NoSQL databases makes up for the shortcomings of relational databases (such as MySQL) in some respects, and can greatly reduce development and maintenance costs in some cases.
MySQL and NoSQL each have their own characteristics and application scenarios, and combining the two closely brings new ideas to Web 2.0 database development. Let relational databases focus on relationships and NoSQL focus on storage.
Classification of NoSQL
NoSQL is just a concept, and NoSQL databases are divided into many categories according to the storage model and characteristics of the data.
Type | Representative products | Characteristics
Column storage | HBase, Cassandra, Hypertable | As the name implies, data is stored by column. The main strengths are convenient storage of structured and semi-structured data and easy data compression. Queries against one or a few columns have a large IO advantage.
Document storage | MongoDB, CouchDB | Documents are generally stored in a JSON-like format, and the stored content is document-based. This makes it possible to index some fields and implement some of the functions of a relational database.
Key-value storage | Tokyo Cabinet / Tyrant, Berkeley DB, MemcacheDB, Redis | A value can be looked up quickly by its key. Generally the store accepts whatever it is given, regardless of the value's format. (Redis includes other features.)
Graph storage | Neo4J, FlockDB | Best suited to storing graph relationships. Solving the same problem with a traditional relational database gives low performance and is inconvenient to design and use.
Object storage | db4o, Versant | The database is manipulated through a syntax similar to an object-oriented language, and data is accessed as objects.
XML database | Berkeley DB XML, BaseX | Stores XML data efficiently and supports XML query languages such as XQuery and XPath.
The classification above is not absolute; it is only a rough division by storage model. There is no hard boundary between the types, and they overlap. For example, the Table storage type of Tokyo Cabinet / Tyrant can be understood as document storage, and Berkeley DB XML is built on top of Berkeley DB.
NoSQL or relational database
Although a rather radical article, "Relational Database is Dead", appeared in 2009, we all know that relational databases are still alive and well, and that you still cannot do without them. But the article does illustrate a fact: relational databases do hit bottlenecks when handling Web 2.0 data.
So should we use NoSQL or a relational database? I don't think an absolute answer is needed. We should decide according to our application scenario.
If relational databases work well in your application scenarios, and you are good at using and maintaining them, then I don't think you need to migrate to NoSQL unless you simply like to tinker. If you work in finance, telecommunications or another key area where data is king, and you currently rely on Oracle for high reliability, don't rush to try NoSQL unless you hit a particularly serious bottleneck.
However, most relational databases behind Web 2.0 websites do hit bottlenecks. Developers spend a lot of effort on disk IO and database scalability, with techniques such as database sharding, master-slave replication and heterogeneous replication. These tasks demand more and more technical skill and are increasingly challenging. If you are in this situation, I think you should try NoSQL.
Choose the right NoSQL
There are many types of NoSQL, and many products of each type, so which one should we choose for our storage? This is not an easy question to answer. Many factors affect the choice, there may be several valid options, and the requirements may change again as the business scenario changes. We often need to consider the following:
The characteristics of the data structure. This includes whether the data is structured or semi-structured, whether fields may change, and whether there are large text fields.
Write characteristics. This includes the ratio of inserts to updates, whether updates touch only a small field, and whether atomic updates are required.
Query characteristics. This includes the query conditions and the range of query hotspots. For example, queries for user information may be random, while queries for news follow time: the newer the item, the more frequently it is read.
Combination of NoSQL and relational database
In fact, NoSQL databases only compensate for relational databases in certain respects (performance and scalability). In terms of functionality alone, almost everything NoSQL does can be done in a relational database, so the reason for choosing NoSQL is not functional.
Therefore we generally use NoSQL and relational databases together, each for its strengths. When we need relational features we use the relational database, and when we need NoSQL features we use the NoSQL database.
Take a simple example: storing user comments. A comment typically has fields such as the primary key id, the commented object aid, the comment text content, and the user uid. We can be sure that the comment text will never be queried with where content='' in the database, and it is also a large text field. So we can store the primary key id, the commented object aid and the user uid in the database, and store the comment text in NoSQL. The database no longer spends disk space on the content, which saves a lot of IO, and the content becomes easier to cache.
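The split described above might look roughly like this as runnable code. It is only a sketch: SQLite stands in for MySQL, a dict stands in for the NoSQL store, and the helper names (`add_comment`, `list_comments`) and key format are invented for the example.

```python
import sqlite3

# Relational side: ids and foreign keys only (SQLite stands in for MySQL).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, aid INTEGER, uid INTEGER)")
db.execute("CREATE INDEX idx_comments_aid ON comments (aid)")

# NoSQL side: a dict stands in for a key-value store holding the large text.
content_store = {}


def add_comment(comment_id, aid, uid, content):
    db.execute(
        "INSERT INTO comments (id, aid, uid) VALUES (?, ?, ?)",
        (comment_id, aid, uid),
    )
    content_store[f"comment:{comment_id}"] = content


def list_comments(aid, offset=0, limit=20):
    # Step 1: fetch the id list from the relational database.
    rows = db.execute(
        "SELECT id, uid FROM comments WHERE aid = ? ORDER BY id LIMIT ? OFFSET ?",
        (aid, limit, offset),
    ).fetchall()
    # Step 2: fetch the comment bodies from the key-value store by id.
    return [
        {"id": cid, "uid": uid, "content": content_store.get(f"comment:{cid}")}
        for cid, uid in rows
    ]


add_comment(1, aid=100, uid=7, content="First!")
add_comment(2, aid=100, uid=8, content="Nice article.")
add_comment(3, aid=200, uid=7, content="Unrelated.")
page = list_comments(100)
print(page)
```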
// Query the list of comment primary key ids from MySQL
commentIds = DB.query("SELECT id FROM comments WHERE aid = 'comment object id' LIMIT 0, 20");
// Fetch the comment entities from NoSQL by the list of primary key ids
CommentsList = NoSQL.get(commentIds);
NoSQL replaces MySQL
In some applications, such as key-value mappings for configuration, user name and password storage, and session storage, NoSQL can completely replace MySQL. It not only performs better but is also more convenient to develop with.
NoSQL acts as a cache server
In a MySQL+Memcached architecture we have to design the cache carefully everywhere, including expiration times, cache freshness, cache memory sizing, cache hit rate and so on.
NoSQL databases generally perform very well, and in most scenarios you no longer need to build a Memcached layer in front of NoSQL at the code level. NoSQL databases have already done a lot of caching optimization themselves.
The amount of data an in-memory cache server such as Memcached can hold is limited by memory size. If you use NoSQL instead of Memcached to cache the database, you are no longer limited by memory. There may be a small amount of disk IO, which can make it a little slower than Memcached, but it can still be used to cache database queries.
Avoid risk
NoSQL is relatively new, and the NoSQL database we choose may not be a very mature product, so we may run into unknown risks. How can we get the benefits of NoSQL while also limiting the risk, and so have the best of both?
The current practice at many companies is to keep a backup of the data: when storing data in NoSQL, a copy is also stored in MySQL. The NoSQL database itself also needs to be backed up (cold and hot backups). Alternatively, you can use two kinds of NoSQL databases and switch between them when there is a problem (to avoid the tragedy of digg using Cassandra).
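The backup practice described above can be sketched as a thin wrapper that writes to both stores. The `DualWriteStore` class is hypothetical, and dicts stand in for the NoSQL store and the MySQL backup. The two writes are not atomic, so a real system would also need a way to reconcile the stores after a partial failure.

```python
class DualWriteStore:
    """Write to a primary store and keep a backup copy in a second store."""

    def __init__(self, primary, backup):
        self.primary = primary   # fast store, e.g. a NoSQL database
        self.backup = backup     # safety copy, e.g. a MySQL table

    def put(self, key, value):
        self.backup[key] = value     # write the safety copy first
        self.primary[key] = value    # then the store that serves reads

    def get(self, key):
        try:
            return self.primary[key]
        except KeyError:
            # Primary lost the record: fall back to the backup and repair it.
            value = self.backup[key]
            self.primary[key] = value
            return value


nosql, mysql = {}, {}            # dicts stand in for the two databases
store = DualWriteStore(nosql, mysql)
store.put("comment:1", "First!")

nosql.clear()                    # simulate data loss in the NoSQL store
recovered = store.get("comment:1")
print(recovered, "comment:1" in nosql)
```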
Summary
This article is only a simple analysis of how to choose between MySQL and NoSQL and how to combine them. When choosing NoSQL you may also need to consider the CAP theorem, eventual consistency and the BASE approach. Because the same questions also arise in a MySQL architecture, they are not covered here.
http://www.infoq.com/cn/news/2013/11/introducing-nosql
Column-family storage
A column-oriented DBMS is a database management system that stores tables as columns of data rather than rows. Physically, a table is a collection of columns, and each column is essentially a table with only one field. These databases are commonly used in analytical systems, business intelligence, and analytical data storage.
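A toy sketch of the column layout follows. It shows why reading one field touches less data and why same-typed values in a column compress well. Run-length encoding is used here only as a simple example of compression; it is an assumption of this sketch, not a claim about what any particular column store uses.

```python
from itertools import groupby

rows = [
    {"id": 1, "country": "CN", "amount": 10},
    {"id": 2, "country": "CN", "amount": 25},
    {"id": 3, "country": "CN", "amount": 5},
    {"id": 4, "country": "US", "amount": 40},
]

# Column layout: one list per field instead of one record per row.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse runs of equal neighbours into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# An aggregate over one field reads a single column and nothing else.
total_amount = sum(columns["amount"])
encoded_country = run_length_encode(columns["country"])
print(total_amount, encoded_country)
```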
Advantages:
Data can be compressed, because the values in a column of a table are usually of the same type.
High-speed query performance can be achieved on inexpensive, modest hardware. Thanks to compression, the data on disk takes up 5 to 10 times less space than in a relational database.
Disadvantages:
There are usually no transactions.
There are many limitations for developers who are familiar with traditional RDBMS.
Typical representatives:
HBase
Cassandra
Accumulo
Amazon SimpleDB
Key-value storage
With this kind of database you store key-value pairs in persistent storage and later read values back by key. What are the benefits of a solution that at first seems of such limited use? The system is very efficient at saving and reading values by key because it carries none of the overhead of SQL processors, indexing systems, and analysis systems. This approach offers very high performance, a low-cost implementation, and scalability.
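As a minimal illustration of persistent key-value access, the sketch below uses Python's standard `shelve` module as a stand-in for a key-value database. The key names and the temporary file path are assumptions for the example.

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "kv_store")

# Write a few key-value pairs and close the store to flush them to disk.
with shelve.open(path) as store:
    store["session:abc"] = {"uid": 7, "ttl": 3600}
    store["config:theme"] = "dark"

# Reopen the store: values come back by key, with no SQL layer involved.
with shelve.open(path) as store:
    session = store["session:abc"]
    theme = store["config:theme"]
    missing = store.get("session:zzz")

print(session, theme, missing)
```

The only operations are put and get by key. There is no query planner, schema or join, which is where both the speed and the limitations discussed in this section come from.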
Advantages:
An RDBMS can be too slow, and SQL cursors carry heavy overhead.
Using an RDBMS to store a small amount of data is relatively expensive.
There is no need for SQL queries, indexes, triggers, stored procedures, temporary tables, forms, views, and so on.
Because of their lightweight design, key-value databases can achieve scalability and high performance easily.
Disadvantages:
The constraints of a relational database protect data integrity at the lowest level. Key-value stores have no such constraints, so data integrity is controlled by the application, and errors in application code can compromise it.
In an RDBMS with a well-designed model, the logical structure of the database fully reflects the structure of the stored data and is distinct from the structure of the application (the data is independent of the application). This is very hard to achieve with key-value storage.
Typical representatives:
Amazon DynamoDB
Riak
Redis
LevelDB
Scalaris
MemcacheDB
Kyoto Cabinet
Document storage
A document store is a program used to store, search and manage document-oriented information (semi-structured data), and its central concept is the document. Implementations of document-oriented databases differ, but in general they all encapsulate and encode the data (documents) in some standardized format, such as XML, YAML, JSON, BSON, PDF and so on.
Advantages:
A sufficiently flexible query language.
Easy to scale horizontally.
Disadvantages:
In many cases, atomicity is not guaranteed.
Typical representatives:
MongoDB
Couchbase
CouchDB
RethinkDB
Graph database
A graph database is a database that uses a graph structure to represent and store data through nodes, edges and attributes. By definition, a graph database is a storage system that provides index-free adjacency. This means each element contains a direct pointer to its adjacent elements, so there is no need for an index lookup.
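Index-free adjacency can be sketched with nodes that hold direct references to their neighbours. The `Node` class and the `within_hops` helper are illustrative assumptions, not the storage format of any real graph database.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []   # direct references to adjacent Node objects

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)


def within_hops(start, max_hops):
    """Breadth-first walk that only follows neighbour pointers."""
    seen = {start: 0}         # node -> distance from start
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue          # do not expand beyond the hop limit
        for nxt in node.neighbors:
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return {n.name for n in seen if n is not start}


alice, bob, carol, dave = (Node(n) for n in ("alice", "bob", "carol", "dave"))
alice.connect(bob)
bob.connect(carol)
carol.connect(dave)

friends_of_friends = within_hops(alice, 2)
print(friends_of_friends)
```

Each step of the traversal follows a pointer from one node to the next. The relational equivalent would join a relationship table against itself once per hop.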
Advantages:
Searching for related data is faster.
They scale naturally to larger datasets because they do not need expensive join operations.
Disadvantages:
An RDBMS can be used in more general scenarios, whereas a graph database is only suitable for graph-like data.
Typical representatives:
Neo4j
FlockDB
InfoGrid
OrientDB
Multi-model database
These databases combine the features of several kinds of database.
Two different groups of products can be considered multi-model:
A multi-model database that supports multiple data models and use cases. For example, ArangoDB claims the benefits of key-value storage while also supporting the document-oriented and graph models.
A general-purpose database that supports multiple models. For example, Oracle's MySQL 5.6 supports SQL access, and key-value access is also possible through the Memcached API.
Typical representatives:
ArangoDB
Aerospike
Datomic
http://coolshell.cn/articles/7270.html
Before talking about data modeling techniques, we need a more or less systematic look at how the NoSQL data models have evolved, so that we can understand how they relate to each other. The following picture shows the evolution of the NoSQL family: the Key-Value era, the BigTable era, the Document era, the full-text search era, and the Graph database era. (Chen Hao's note: notice what SQL says in the picture. If NoSQL keeps going like this, it will turn into SQL.)
NoSQL Data Models
First of all, note that SQL and the relational data model have been around for a long time, and their user-oriented nature means:
End users are generally more interested in aggregated views of data than in individual data items, and this is mainly done through SQL.
People cannot be expected to control concurrency, integrity, consistency, or data type validation by hand. This is why SQL does so much in terms of transactions, two-dimensional table structures (schemas), and joins.
On the other hand, software applications often do not need in-database aggregation and can, in many cases, control data integrity and validity themselves. Dropping these guarantees from the database helps performance and distributed storage a great deal. This is what drove the evolution of the data models:
Key-Value storage is very simple and powerful, and many of the techniques below are built on it. However, Key-Value has one fatal problem: it cannot efficiently find a range of keys. (Chen Hao's note: anyone who has studied the hash table data structure knows that a hash table is an unordered container. Unlike arrays, linked lists, queues and other ordered containers, it does not let us control the order in which data is stored.) The Ordered Key-Value model was designed to address this limitation and makes working with sets of data much easier.
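The difference an ordered key space makes can be sketched as follows. The `OrderedKV` class is a toy written for this example: it keeps keys in a sorted list so that a range scan uses two binary searches instead of visiting every key.

```python
import bisect


class OrderedKV:
    """Keys kept in sorted order so range scans do not touch every key."""

    def __init__(self):
        self._keys = []      # sorted list of keys
        self._values = {}    # key -> value

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def scan(self, start, end):
        """Return (key, value) pairs with start <= key < end."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


kv = OrderedKV()
for day, hits in [
    ("2013-11-03", 7),
    ("2013-11-01", 3),
    ("2013-11-02", 5),
    ("2013-12-01", 9),
]:
    kv.put(day, hits)

november = kv.scan("2013-11-01", "2013-12-01")
print(november)
```

With a plain hash table the same query would have to examine every key, because nothing about a key's position says which keys sit next to it.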
The Ordered Key-Value model is also very powerful, but it too provides no data model for the value. In general, the structure of the value has to be parsed and accessed by the application. This is inconvenient, which led to BigTable-style databases. This data model is essentially a map of maps of maps, nested layer by layer: key-value pairs nested inside values (a value contains another key-value map). In this kind of database, values are organized through column families, columns, and timestamps that control versions. (Chen Hao's note: versioning data with timestamps is mainly meant to solve concurrent writes, that is, so-called optimistic locking.)
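The nested-map model described above can be sketched like this. The `WideRowTable` class, the row and column names, and the integer timestamps are assumptions made for the example; real BigTable-style systems add storage and distribution concerns that are not shown.

```python
from collections import defaultdict


class WideRowTable:
    """row key -> column family -> column -> {timestamp: value}."""

    def __init__(self):
        self._rows = defaultdict(
            lambda: defaultdict(lambda: defaultdict(dict))
        )

    def put(self, row, family, column, value, ts):
        self._rows[row][family][column][ts] = value

    def get(self, row, family, column, ts=None):
        """Latest version, or the latest version at or before ts."""
        versions = self._rows[row][family][column]
        candidates = [t for t in versions if ts is None or t <= ts]
        return versions[max(candidates)] if candidates else None


table = WideRowTable()
table.put("user:1", "profile", "email", "ann@old.example", ts=1)
table.put("user:1", "profile", "email", "ann@new.example", ts=2)

latest = table.get("user:1", "profile", "email")
as_of_ts1 = table.get("user:1", "profile", "email", ts=1)
absent = table.get("user:1", "profile", "phone")
print(latest, as_of_ts1, absent)
```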
Document databases improve on the BigTable model in two meaningful ways. The first is to allow values with arbitrary schemas, not just maps of maps. The second is indexing. Full-text search engines can be seen as a variant of document databases: they provide flexible, variable schemas and automatic indexing. The main difference between them is that document databases index by field name, while full-text search engines index by field value.
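The two indexing styles can be contrasted with a small sketch. The sample documents and index names are invented for illustration, and the tokenizer simply splits on whitespace.

```python
from collections import defaultdict

docs = {
    1: {"title": "NoSQL data modeling", "author": "ann"},
    2: {"title": "Relational data modeling", "author": "bob"},
    3: {"title": "Graph basics", "author": "ann"},
}

# Document-database style: index a named field by its exact value.
by_author = defaultdict(set)
for doc_id, doc in docs.items():
    by_author[doc["author"]].add(doc_id)

# Full-text style: index every word found in any field value.
by_term = defaultdict(set)
for doc_id, doc in docs.items():
    for value in doc.values():
        for term in str(value).lower().split():
            by_term[term].add(doc_id)

ann_docs = by_author["ann"]
modeling_docs = by_term["data"] & by_term["modeling"]
print(ann_docs, modeling_docs)
```

The field index answers "which documents have author = ann", while the term index answers "which documents mention these words anywhere".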
Graph databases can be thought of as a branch of this evolution that grew out of the Ordered Key-Value model. A graph database lets you build data models with a graph structure. It is related to document databases because many implementations allow a value to be either a map or a document.