2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains what data mining is: what the term means, what kinds of data it applies to, and the steps of a typical data mining process.
Data mining is the process of using algorithms to uncover hidden information in large amounts of data. It is usually associated with computer science and draws on many methods, including statistics, online analytical processing, information retrieval, machine learning, expert systems (which rely on past rules of thumb), and pattern recognition.
Data mining is an active research topic in artificial intelligence and databases. It refers to the non-trivial process of revealing implicit, previously unknown, and potentially valuable information from large amounts of data in databases.
Data mining is also a decision support process. Building mainly on artificial intelligence, machine learning, pattern recognition, statistics, databases, and visualization, it analyzes enterprise data in a highly automated way, reasons inductively, and extracts potential patterns, helping decision makers adjust market strategy, reduce risk, and make sound decisions.
The process of knowledge discovery consists of three stages: ① data preparation; ② data mining; ③ result expression and interpretation. Data mining can interact with users or with knowledge bases.
Data Mining Objects
The object of data mining can be any type of data source. It can be a relational database, which contains structured data, or a data warehouse, text, multimedia data, spatial data, temporal data, or Web data, which contain semi-structured or even heterogeneous data. [4]
Knowledge can be discovered by numerical, non-numerical, or inductive methods. The discovered knowledge can then be used for information management, query optimization, decision support, and data maintenance. [4]
Data Mining Steps
Before starting a data mining project, decide which steps to take, what to do at each step, and what goals each step must achieve. A good plan is what keeps the project orderly and makes success more likely. Many software vendors and data mining consultants offer process models that guide users through data mining step by step, for example SPSS's 5A and SAS's SEMMA.
The main steps of a data mining process model are defining the problem, building the data mining database, analyzing the data, preparing the data, building models, evaluating models, and implementation. The details of each step are as follows:
(1) Define the problem. The first and most important requirement before starting knowledge discovery is to understand the data and the business problem. You need a clear definition of the goal. For example, if you want to increase the utilization of an email service, the goal might be to "increase the user usage rate" or to "increase the value of each use". The models built to solve these two problems are almost completely different, so you must decide which one you mean.
(2) Build the data mining database. Building a data mining database includes the following steps: data collection, data description, data selection, data quality assessment and data cleaning, merging and integration, building metadata, loading the data mining database, and maintaining the data mining database.
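The quality assessment and cleaning steps above can be sketched in a few lines. This is a minimal illustration only, assuming records are held as a list of dictionaries; the field names (`user_id`, `age`, `city`) and the helper functions `assess_quality` and `clean` are hypothetical, and real projects usually need richer rules than "drop anything incomplete".

```python
from collections import Counter

def assess_quality(records, fields):
    """Count missing (None or empty-string) values per field."""
    missing = Counter()
    for rec in records:
        for f in fields:
            if rec.get(f) in (None, ""):
                missing[f] += 1
    return {f: missing[f] for f in fields}

def clean(records, fields):
    """Drop records with any missing field, then drop exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        if any(rec.get(f) in (None, "") for f in fields):
            continue
        key = tuple(rec[f] for f in fields)
        if key not in seen:
            seen.add(key)
            cleaned.append(rec)
    return cleaned

# Hypothetical sample data for illustration.
raw = [
    {"user_id": 1, "age": 34, "city": "Beijing"},
    {"user_id": 2, "age": None, "city": "Shanghai"},  # missing age
    {"user_id": 1, "age": 34, "city": "Beijing"},     # duplicate of the first
    {"user_id": 3, "age": 27, "city": ""},            # missing city
]
FIELDS = ["user_id", "age", "city"]

quality_report = assess_quality(raw, FIELDS)
cleaned_records = clean(raw, FIELDS)
```

Running the assessment before cleaning shows how much data a given cleaning rule would discard, which helps decide whether dropping records is acceptable or whether missing values should be filled in instead.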
(3) Analyze the data. The goal of the analysis is to find the data fields that have the greatest impact on the predicted output and to decide whether derived fields need to be defined. If the data set contains hundreds or thousands of fields, browsing and analyzing it is time-consuming and tiring, so choose a software tool with a good interface and powerful features to help with this work.
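One simple way to look for the fields with the greatest impact is to rank numeric fields by the absolute value of their Pearson correlation with the target. The sketch below is an assumption about how this could be done, not a method prescribed by the process model; the field names and the helper `rank_fields` are hypothetical, and correlation only captures linear relationships.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either side is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_fields(columns, target):
    """Sort field names by |correlation with target|, strongest first."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical columns for illustration.
columns = {
    "logins_per_week": [1, 2, 3, 4, 5],
    "account_age_days": [10, 400, 35, 220, 90],
    "constant_flag": [1, 1, 1, 1, 1],
}
target = [2, 4, 6, 8, 10]

ranking = rank_fields(columns, target)
```

A field that never varies, like `constant_flag` here, scores zero and is a natural candidate to drop before modeling.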
(4) Prepare the data. This is the last data preparation step before building the model. It can be divided into four parts: selecting variables, selecting records, creating new variables, and transforming variables.
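The four parts of data preparation can be shown on a tiny example. This is a sketch under assumed data: the record fields, the age filter, the derived `emails_per_day` variable, and the `is_paid` encoding are all hypothetical choices made for illustration.

```python
# Hypothetical raw records for illustration.
records = [
    {"user_id": 1, "age": 34, "emails_sent": 120, "days_active": 30, "plan": "free"},
    {"user_id": 2, "age": 17, "emails_sent": 15, "days_active": 5, "plan": "paid"},
    {"user_id": 3, "age": 52, "emails_sent": 300, "days_active": 60, "plan": "paid"},
]

# Variables kept for modeling; user_id is an identifier, not a predictor.
SELECTED = ["age", "emails_sent", "days_active", "plan"]

def prepare(records):
    prepared = []
    for rec in records:
        # Select records: keep adult users only (an illustrative filter).
        if rec["age"] < 18:
            continue
        # Select variables: copy only the chosen fields.
        row = {name: rec[name] for name in SELECTED}
        # Create a new variable: average emails sent per active day.
        row["emails_per_day"] = row["emails_sent"] / row["days_active"]
        # Transform a variable: encode the categorical plan as 0/1.
        row["is_paid"] = 1 if row.pop("plan") == "paid" else 0
        prepared.append(row)
    return prepared

prepared = prepare(records)
```

Keeping the four parts in one function makes it easy to apply exactly the same preparation later to any new data the model is used on.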
(5) Build models. Modeling is an iterative process. Different models need to be examined carefully to determine which one is most useful for the business problem at hand. Training and testing a data mining model requires splitting the data into at least two parts: build the model with one part, then test and validate it with the rest. Sometimes a third data set, called the validation set, is also used. Because the test set may have influenced the characteristics of the model, a separate data set is needed to verify the model's accuracy.
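The split described above can be sketched as follows. This is one possible approach, assuming a simple random split; the function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are illustrative assumptions. The set names follow the usage in the text, where the validation set is the final independent check.

```python
import random

def split_dataset(records, train_frac=0.6, test_frac=0.2, seed=42):
    """Shuffle records and split them into training, test, and validation sets.

    Naming follows the text: the test set is used while building the model,
    and the validation set is held back for a final accuracy check.
    Note: some texts swap the names "test" and "validation".
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains
    return train, test, validation

# Hypothetical data: 100 record ids.
train, test, validation = split_dataset(range(100))
```

Shuffling before splitting matters when the source data is ordered, for example by date or by customer, because an unshuffled split would give the model training and test data with different characteristics.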
(6) Evaluate the model. Once the model has been built, evaluate the results and explain the value of the model. The accuracy measured on the test set is meaningful only for the data that was used to build the model. In practical applications, you also need to understand the types of errors the model makes and the costs associated with them. Experience has shown that a valid model is not necessarily a correct model, mainly because of the assumptions implicit in modeling, so it is important to test the model directly in the real world. Apply it on a small scale first, collect test data, and roll it out more widely only once the results are satisfactory.
(7) Implementation. Once the model has been built and validated, there are two main ways to use it. The first is to give analysts a reference for their own work; the second is to apply the model to different data sets.
That is an overview of what data mining is. Some of these points may already be familiar from day-to-day work, and the steps above give them a common structure.
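To make the point about error types and their costs concrete, the sketch below counts the two kinds of error of a binary classifier and weights them differently. The labels, predictions, and cost values are hypothetical; the assumption is simply that a false negative costs five times as much as a false positive.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            counts["tp" if a == 1 else "fp"] += 1
        else:
            counts["fn" if a == 1 else "tn"] += 1
    return counts

def total_cost(counts, cost_fp, cost_fn):
    """Total cost of errors when the two error types are priced differently."""
    return counts["fp"] * cost_fp + counts["fn"] * cost_fn

# Hypothetical labels and model predictions.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 0, 0]

counts = confusion_counts(actual, predicted)
accuracy = (counts["tp"] + counts["tn"]) / len(actual)
cost = total_cost(counts, cost_fp=1.0, cost_fn=5.0)
```

Two models with the same accuracy can have very different total costs, which is why accuracy alone is not enough to choose between them.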
© 2024 shulou.com SLNews company. All rights reserved.