2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
A Brief Introduction to Distributed Systems and NoSQL
=
Data storage
1. Data storage
★ Data models:
Hierarchical model
Network model
Relational model
Object-relational model
☉ Relational model:
Relational data with a strict schema
ACID properties (atomicity, consistency, isolation, durability)
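As a minimal sketch of the atomicity part of ACID, the example below uses Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper and the account names are assumptions made up for illustration. The transfer credits the receiver first, so that a failed debit leaves something to roll back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            # This debit violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # credit is undone when the debit fails
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, the credit to the receiver has been rolled back as well, so the balances reflect only the first transfer.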
Distributed system
1. Introduction
★ Distributed system:
A distributed system consists of multiple computers and communication software components connected through a computer network (a local area network or a wide area network).
A distributed system is a software system built on top of a network. Because of the characteristics of this software, a distributed system has a high degree of cohesion and transparency.
Therefore, the difference between a network and a distributed system lies more in the high-level software (especially the operating system) than in the hardware.
Distributed systems can run on different platforms such as PCs, workstations, local area networks and wide area networks.
2. Advantages of distributed computing
★ Advantages
☉ Reliability (fault tolerance):
An important advantage of a distributed computing system is reliability. The crash of one server does not affect the remaining servers.
☉ Scalability:
In a distributed computing system, more machines can be added as needed.
☉ Resource sharing:
Sharing data is essential for applications such as banking and booking systems.
☉ Flexibility:
Because the system is very flexible, it is easy to install, implement and debug new services.
☉ Speed:
A distributed computing system combines the computing power of multiple computers, which makes it faster than a single-machine system.
☉ Open system:
Because it is an open system, its services can be accessed both locally and remotely.
☉ Higher performance:
Compared with centralized computer network clusters, it can provide higher performance (and a better price-performance ratio).
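The reliability point above can be sketched in a few lines. This is a toy, in-process illustration and not a real networking client: the `Server` class, `call_with_failover` and the node names are hypothetical, and a crash is simulated with a flag.

```python
class Server:
    """Hypothetical in-process stand-in for a networked server."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"

def call_with_failover(servers, request):
    """Try each server in turn and return the first successful response."""
    for server in servers:
        try:
            return server.handle(request)
        except ConnectionError:
            continue  # this node has crashed; try the next one
    raise RuntimeError("all servers are unavailable")

# node-1 has crashed, but the request is still served by node-2.
cluster = [Server("node-1", alive=False), Server("node-2"), Server("node-3")]
response = call_with_failover(cluster, "GET /balance")
```

The request only fails when every node is down, which is the sense in which one server crashing does not affect the rest.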
3. Disadvantages of distributed computing
★ Disadvantages
☉ Troubleshooting:
Troubleshooting and diagnosing problems is harder, because a fault may involve several machines and the network between them.
☉ Software:
Limited software support is the main disadvantage of distributed computing systems.
☉ Network:
Network infrastructure problems, including transmission problems, high load and message loss.
☉ Security:
The openness of the system exposes distributed computing systems to problems such as data security and the risks of sharing.
NoSQL introduction
1. Introduction
★ Baidu Encyclopedia:
NoSQL generally refers to non-relational databases. With the rise of Web 2.0 websites, traditional relational databases struggled to cope, especially with very large, highly concurrent, purely dynamic SNS-type sites, and exposed many problems that were hard to overcome, while non-relational databases developed rapidly thanks to their own characteristics. NoSQL databases emerged to address the challenges of large-scale data sets and multiple data types, especially in big data applications.
★ What is NoSQL:
NoSQL stands for "Not Only SQL".
NoSQL refers to non-relational databases. It is a general term for database management systems that differ from traditional relational databases.
NoSQL is used to store data at very large scale. Google and Facebook, for example, collect terabytes of user data every day. These data stores do not require a fixed schema and can be scaled out without extra operations.
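To make "no fixed schema" and "scaled out" concrete, here is a minimal sketch of hash-based sharding over in-memory dictionaries. The `ShardedStore` class, the `shard_for` function and the key names are assumptions for illustration, not the API of any particular NoSQL product.

```python
import hashlib

def shard_for(key, num_shards):
    """Map a key to a shard index with a stable hash (unlike the built-in hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

class ShardedStore:
    """Each shard is a plain dict standing in for one machine."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)

store = ShardedStore(num_shards=4)
# Values need not share a schema: each record carries its own fields.
store.put("user:1", {"name": "Ann", "tags": ["admin"]})
store.put("user:2", {"name": "Bob", "last_login": "2025-01-18"})
```

Simple modulo placement like this forces most keys to move when the number of shards changes. Production systems commonly use consistent hashing or range partitioning to limit that movement.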
★ A brief history of NoSQL:
The term NoSQL first appeared in 1998 as the name of a lightweight, open-source relational database developed by Carlo Strozzi that did not provide a SQL interface.
In 2009, Johan Oskarsson of Last.fm initiated a discussion on distributed open-source databases [2], and Eric Evans from Rackspace reintroduced the term NoSQL. At that point, NoSQL mainly referred to non-relational, distributed database designs that do not provide ACID guarantees.
The "no:sql(east)" seminar held in Atlanta in 2009 was a milestone, with the slogan "select fun, profit from real_world where relational=false;". As a result, the most common interpretation of NoSQL is "non-relational", emphasizing the advantages of key-value stores and document databases rather than simply opposing RDBMS.
2. Why use NoSQL?
Today we can easily access and collect data through third-party platforms (such as Google and Facebook). Users' personal information, social networks, geographic locations, user-generated content and user activity logs have grown exponentially. SQL databases are no longer well suited to mining this user data, whereas NoSQL databases have developed to handle data at this scale well.
3. RDBMS vs NoSQL
★ RDBMS
- Highly organized, structured data
- Structured Query Language (SQL)
- Data and relationships are stored in separate tables
- Data manipulation language, data definition language
- Strict consistency
- Basic transactions
★ NoSQL
- Stands for "Not Only SQL"
- No declarative query language
- No predefined schema
- Key-value stores, column stores, document stores, graph databases
- Eventual consistency rather than ACID properties
- Unstructured and unpredictable data
- CAP theorem
- High performance, high availability and scalability
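The schema contrast in the two lists above can be sketched with the standard library alone. The `users` table and the `documents` dictionary are assumptions for illustration. The dictionary is only a stand-in for a document store, not a real one.

```python
import sqlite3

# RDBMS side: the table schema is fixed up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ann')")
try:
    # 'nickname' is not part of the schema, so the statement is rejected.
    conn.execute("INSERT INTO users (id, name, nickname) VALUES (2, 'Bob', 'bobby')")
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# NoSQL side (sketch): records are free-form, so a new field needs no migration.
documents = {}
documents[1] = {"name": "Ann"}
documents[2] = {"name": "Bob", "nickname": "bobby"}

# The flip side: readers must cope with fields that some records lack.
nicknames = [doc.get("nickname") for doc in documents.values()]
```

The relational side would need an `ALTER TABLE` before accepting the new field. The schemaless side accepts it immediately but pushes the handling of missing fields onto application code.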
4. CAP theorem
In computer science, the CAP theorem, also known as Brewer's theorem, states that a distributed computing system cannot satisfy all three of the following at the same time:
☉ Consistency
All nodes see the same data at the same time
☉ Availability
Every request receives a response, whether it succeeds or fails
☉ Partition tolerance
The system continues to operate even when messages between nodes are lost or some nodes fail
★ The core of the CAP theorem:
A distributed system cannot satisfy consistency, availability and partition tolerance all at once; at most two can be satisfied at the same time.
Therefore, according to the CAP principle, NoSQL databases fall into three categories: those that satisfy CA, those that satisfy CP and those that satisfy AP.
CA - Single-site clusters that satisfy consistency and availability; usually not very scalable.
CP - Systems that satisfy consistency and partition tolerance; performance is usually lower.
AP - Systems that satisfy availability and partition tolerance; they usually relax consistency requirements.
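The trade-off between CP and AP can be illustrated with a toy two-replica store. This is a simplified sketch: the `TwoNodeStore` class, its `mode` labels and the `partitioned` flag are hypothetical, and real systems use quorum and consensus protocols rather than a boolean.

```python
class Node:
    def __init__(self):
        self.value = None

class TwoNodeStore:
    """Two replicas; `mode` is 'CP' or 'AP' (labels used only in this sketch)."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Node(), Node()
        self.partitioned = False  # True when the replicas cannot talk to each other

    def write(self, node, value):
        other = self.b if node is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing the write: availability is given up.
                raise RuntimeError("unavailable during partition")
            node.value = value  # AP: accept locally; the replicas now diverge
            return
        node.value = value
        other.value = value  # no partition: replicate synchronously

cp = TwoNodeStore("CP")
cp.write(cp.a, 1)
cp.partitioned = True
try:
    cp.write(cp.a, 2)
    cp_rejected = False
except RuntimeError:
    cp_rejected = True

ap = TwoNodeStore("AP")
ap.write(ap.a, 1)
ap.partitioned = True
ap.write(ap.a, 2)  # succeeds, but ap.b still holds the old value
```

During the partition the CP store keeps both replicas equal and rejects the request. The AP store answers the request and lets the replicas disagree until they can be reconciled.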
5. Advantages and disadvantages of NoSQL
★ Advantages
- High scalability
- Distributed computing
- Low cost
- Architectural flexibility, semi-structured data
- No complicated relationships
★ Disadvantages
- No standardization
- Limited query capabilities (so far)
- Eventual consistency is an unintuitive programming model
6. BASE
★ BASE
BASE: Basically Available, Soft state, Eventually consistent. Defined by Eric Brewer.
☉ The core of the CAP theorem:
A distributed system cannot satisfy consistency, availability and partition tolerance all at once; at most two can be satisfied at the same time.
☉ BASE is the general principle by which NoSQL databases relax availability and consistency requirements:
Basically Available - basically available
Soft state - soft state / flexible transactions. "Soft state" can be understood as "connectionless", while "hard state" is "connection-oriented"
Eventual Consistency - eventual consistency; eventual consistency is also the end goal of ACID.
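Eventual consistency can be sketched with two replicas that accept writes independently and reconcile later. The example below uses a last-write-wins rule with caller-supplied timestamps. This is one common reconciliation strategy chosen for illustration, not the only one. `LwwReplica` and `sync` are hypothetical names.

```python
class LwwReplica:
    """Replica that keeps, per key, the value with the highest timestamp."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

def sync(r1, r2):
    """Anti-entropy pass: exchange entries so both replicas keep the newest version."""
    for src, dst in ((r1, r2), (r2, r1)):
        for key, (ts, value) in list(src.data.items()):
            dst.write(key, value, ts)

a, b = LwwReplica(), LwwReplica()
a.write("cart", ["book"], timestamp=1)
b.write("cart", ["book", "pen"], timestamp=2)

before = (a.read("cart"), b.read("cart"))  # soft state: the replicas disagree
sync(a, b)
after = (a.read("cart"), b.read("cart"))   # converged on the newest write
```

Between the writes and the `sync` call, a reader may see either value depending on which replica it asks. That window is what makes this model less intuitive than strict consistency.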
7. NoSQL database classification
★ Column storage
Representative systems: HBase, Cassandra, Hypertable
Characteristics: As the name implies, data is stored by column. Its main strengths are that it conveniently stores structured and semi-structured data and compresses well. It has a large I/O advantage for queries that touch only one or a few columns.
★ Document storage
Representative systems: MongoDB, CouchDB
Characteristics: Documents are generally stored in a JSON-like format, and the stored content is document-based. This makes it possible to index certain fields and to implement some of the functions of a relational database.
★ Key-value storage
Representative systems: Tokyo Cabinet / Tyrant, Berkeley DB, MemcacheDB, Redis
Characteristics: A value can be looked up quickly by its key. Generally, values are stored as-is, regardless of their format. (Redis includes additional features.)
★ Graph storage
Representative systems: Neo4J, FlockDB
Characteristics: Best suited to storing graph relationships. Solving the same problem with a traditional relational database gives poor performance and is inconvenient to design and use.
★ Object storage
Representative systems: db4o, Versant
Characteristics: The database is manipulated through a syntax similar to that of an object-oriented language, and data is accessed as objects.
★ XML database
Representative systems: Berkeley DB XML, BaseX
Characteristics: Stores XML data efficiently and supports XML's native query languages, such as XQuery and XPath.
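As a rough illustration of how the first four categories differ, the sketch below lays out the same small data set in each style using plain Python structures. The field names and sample users are made up, and none of this reflects the on-disk format or API of the products listed above.

```python
import json

# Key-value: an opaque value looked up by key; the store does not inspect it.
kv_store = {"user:1": json.dumps({"name": "Ann", "city": "Paris"})}

# Document: JSON-like records whose fields can be queried (and indexed).
doc_store = {
    "users": [
        {"_id": 1, "name": "Ann", "city": "Paris"},
        {"_id": 2, "name": "Bob", "city": "Rome"},
    ]
}

# Column: all values of one column are stored together, so reading a single
# column does not touch the others.
column_store = {"name": ["Ann", "Bob"], "city": ["Paris", "Rome"]}

# Graph: nodes plus explicit, typed edges between them.
graph_store = {"nodes": {1: "Ann", 2: "Bob"}, "edges": [(1, "FOLLOWS", 2)]}

name_from_kv = json.loads(kv_store["user:1"])["name"]
users_in_rome = [d["name"] for d in doc_store["users"] if d["city"] == "Rome"]
all_cities = column_store["city"]
followed_by_ann = [
    graph_store["nodes"][dst]
    for src, rel, dst in graph_store["edges"]
    if src == 1 and rel == "FOLLOWS"
]
```

Each layout makes one access pattern cheap: lookup by key, filtering on fields, scanning a column, or following relationships.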