2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
Foreword:
Hive can map structured data files to database tables and provides SQL-like query capabilities. Before learning Hive, let's look at the differences between structured, semi-structured, and unstructured data.

1. Structured data is data that can be represented and stored in a relational database, in two-dimensional (tabular) form. Its general characteristics are: data is organized in rows, each row holds the information of one entity, and every row has the same attributes. For example:

id  name     age  gender
1   lyh      12   male
2   liangyh  13   female
3   liang    18   male

Structured data is therefore stored and arranged in a regular way, which helps with operations such as queries and modifications. Its extensibility, however, is poor (for example, what if I want to add a field?).

2. Semi-structured data is a form of structured data that does not conform to the data model of relational databases or other data tables, but contains markup that separates semantic elements and layers records and fields. It is therefore also called a self-describing structure.
In semi-structured data, entities of the same kind can have different attributes, and even when they are grouped together, the order of those attributes does not matter.
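As a minimal sketch of this point, the Python snippet below loads two JSON records of the same kind of entity. The field names (`name`, `age`, `gender`) and the values are assumptions chosen for illustration. The second record omits one attribute and lists the others in a different order, yet both parse without any schema change.

```python
import json

# Two hypothetical records of the same kind of entity.
# The field names are assumptions made for this illustration.
record_a = json.loads('{"name": "A", "age": 13, "gender": "female"}')
record_b = json.loads('{"gender": "male", "name": "B"}')  # no "age", different order

# Each record carries its own field names, so they can be discovered at read time.
fields_a = sorted(record_a)
fields_b = sorted(record_b)

# Attribute order carries no meaning: the same pairs in another order compare equal.
reordered_a = json.loads('{"gender": "female", "age": 13, "name": "A"}')
same_entity = record_a == reordered_a

# A missing attribute is simply absent; .get() returns None for it.
age_b = record_b.get("age")

print(fields_a, fields_b, same_entity, age_b)
```

In a relational table the missing `age` would need a nullable column; here the record just leaves it out.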
Common semi-structured formats are XML and JSON. Consider two XML files. The first may describe an entity with three values: A, 13, and female. The second may carry only two: B and male.

As this example shows, the order of attributes is not important, and different semi-structured records need not have the same number of attributes. Some people say that semi-structured data is data stored in a tree or graph structure. How should we understand that? In the example above, the outermost tag is the root node of the tree, and the tags nested inside it are child nodes. This format can freely express a lot of useful information, including self-describing information (metadata), so semi-structured data is very extensible.

3. Unstructured data, as the name implies, is data without a fixed structure. Documents, pictures, video, audio, and so on are all unstructured data. This kind of data is generally stored directly as a whole, usually in a binary format.
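To make the tree view of the semi-structured XML example concrete, here is a small sketch using Python's standard `xml.etree.ElementTree` module. The tag names (`person`, `name`, `age`, `gender`) are assumptions; only the values A, 13, female, B, and male come from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names; only the values come from the article's example.
first_xml = "<person><name>A</name><age>13</age><gender>female</gender></person>"
second_xml = "<person><name>B</name><gender>male</gender></person>"


def to_record(xml_text):
    """Parse one XML document and return (root tag, {child tag: child text})."""
    root = ET.fromstring(xml_text)  # the outermost element is the root node
    return root.tag, {child.tag: child.text for child in root}  # direct children


root_a, record_a = to_record(first_xml)
root_b, record_b = to_record(second_xml)

print(root_a, record_a)
print(root_b, record_b)
```

Both documents share the same root tag, but the second has one child node fewer, which is the kind of variation a fixed relational schema would not accept without a change to the table.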
Reference: https://blog.csdn.net/liangyihuai/article/details/54864952