
Learning big data: the core concepts of big data analysis

Updated 2025-01-27. From: SLTechnology News&Howtos (shulou), Internet Technology


Shulou(Shulou.com)06/03 Report--

I. Purpose and classification of data analysis

Data analysis processes information obtained from observations, measurements, or experiments on a phenomenon of interest. Its purpose is to extract as much information as possible from the data related to the subject. The main objectives include:

Infer from or interpret the data and determine how to use it

Check whether the data is valid

Make reasonable suggestions for decision-making

Diagnose or infer the cause of an error

Predict what will happen in the future

Because statistical data are so diverse, data analysis methods vary widely. Data can be classified by several criteria: as qualitative or quantitative, depending on how they were obtained through observation and measurement, and as univariate or multivariate, depending on the number of parameters. Some work has also summarized domain-specific algorithms. Manimom et al. classified data mining algorithms as descriptive, predictive, and verifying. Bhatt et al. divided multimedia analysis methods into feature extraction, transformation, representation, and statistical data mining. However, there is no classification of big data processing methods. Blackett et al. divided data analysis into three levels according to its depth: descriptive analysis, predictive analysis, and prescriptive analysis.

Descriptive analysis

Descriptive analysis describes what has happened based on historical data. For example, regression techniques can detect simple trends in data sets, visualization techniques can represent data more meaningfully, and data modeling can collect, store, and trim data more efficiently. Descriptive analysis is often used in business intelligence and visibility systems.
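The trend detection mentioned above can be sketched with an ordinary least-squares line fitted over time. This is only a minimal illustration, not a method taken from the article: the function name `linear_trend` and the sales figures are hypothetical.

```python
from statistics import mean

def linear_trend(values):
    """Fit y = slope * t + intercept by least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t = range(n)
    t_mean = mean(t)
    y_mean = mean(values)
    # Covariance of (t, y) and variance of t, both without the 1/n factor.
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, values))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept

# Hypothetical monthly sales figures, made up for illustration.
monthly_sales = [100, 110, 120, 130, 140]
slope, intercept = linear_trend(monthly_sales)
print(f"trend: {slope:+.1f} units per month, starting near {intercept:.1f}")
```

A positive slope indicates an upward trend in the historical data; the sketch only describes the past and makes no claim about the future.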

Predictive analysis

Predictive analysis estimates future probabilities and trends. For example, predictive models use statistical techniques such as linear and logistic regression to discover trends in the data and predict future outcomes, and data mining techniques extract patterns from the data to support predictions.
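As a rough sketch of a predictive model, the following fits a one-variable logistic regression by plain gradient descent. The data (weekly site visits and whether a purchase followed), the learning rate, and the step count are all assumptions chosen for illustration; a real project would more likely use an established statistics or machine learning library.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_proba(x, w, b):
    """Estimated probability of the positive outcome for input x."""
    return sigmoid(w * x + b)

# Hypothetical training data: visits per week and whether a purchase followed.
weekly_visits = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
purchased = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(weekly_visits, purchased)
print(f"p(purchase | 1.0 visits) = {predict_proba(1.0, w, b):.3f}")
print(f"p(purchase | 3.5 visits) = {predict_proba(3.5, w, b):.3f}")
```

The fitted model assigns a low purchase probability to infrequent visitors and a high one to frequent visitors, which is the kind of forward-looking estimate predictive analysis aims for.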

Prescriptive analysis

Prescriptive analysis supports decision making and improves analysis efficiency. For example, simulation is used to analyze complex systems, understand their behavior, and find problems, while optimization techniques find the optimal solution under given constraints.
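Optimization under constraints can be illustrated with a very small production-planning problem solved by exhaustive search. The products, profits, and capacity limits below are hypothetical, and brute force is used only because the search space is tiny; realistic problems would call for a linear or integer programming solver.

```python
from itertools import product

def best_plan(max_x, max_y, profit, feasible):
    """Return (profit, x, y) for the best feasible integer plan, or None."""
    best = None
    for x, y in product(range(max_x + 1), range(max_y + 1)):
        if not feasible(x, y):
            continue
        value = profit(x, y)
        if best is None or value > best[0]:
            best = (value, x, y)
    return best

# Hypothetical problem: x and y are batches of two products.
# Profit is 3 per batch of x and 5 per batch of y.
def profit(x, y):
    return 3 * x + 5 * y

# Hypothetical capacity limits of three production lines.
def feasible(x, y):
    return x <= 4 and 2 * y <= 12 and 3 * x + 2 * y <= 18

best = best_plan(4, 6, profit, feasible)
print(best)
```

The result is a recommended action (how many batches of each product to make) rather than a description or a forecast, which is what distinguishes this level from the other two.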

II. Application evolution

Data-driven applications have emerged over the past few decades, such as business intelligence, which appeared in the business field in the 1990s, and web search engines based on data mining, which appeared in the early 21st century. The following sections introduce the evolution of highly influential big data analysis applications in typical big data fields across different periods.

(1) Evolution of business applications

Early business data were structured data, collected by enterprises and stored in relational database management systems. The data analysis techniques used in these systems were usually intuitive and simple. Gartner summarizes the common methods of business intelligence applications, including reporting, dashboards, ad hoc queries, search-based business intelligence, online transaction processing, interactive visualization, scorecards, predictive models, and data mining. At the beginning of the 21st century, the Internet and the web enabled enterprises to put their business online and reach customers directly. Large amounts of product and customer information, such as clickstream logs and user behavior, can be collected through the web. With various text and web mining techniques, enterprises can carry out product placement optimization, customer transaction analysis, product recommendation, and market structure analysis. It is reported that in 2011 the number of mobile phones and tablets surpassed that of laptops and PCs for the first time. Mobile phones and the Internet of Things have since given rise to innovative applications that are location-aware, person-centered, and context-aware.

(2) Evolution of network applications

In the early days, the network provided e-mail and website services, so text analysis, data mining, and web analysis techniques were used to mine e-mail content and build search engines. Network data now account for the vast majority of global data and include text, images, videos, photos, interactive content, and other data types. Analysis techniques for semi-structured and unstructured data were developed in response. For example, image analysis can extract meaningful information from photos, and multimedia analysis can automate video surveillance systems in commercial or military settings. After 2004, online social media such as forums, blogs, social networking sites, and multimedia sharing sites enabled users to generate, upload, and share rich user-generated content. By mining the social media content posted by different groups of people, we can identify daily hot events and social and political views, and so provide timely feedback and opinions.

(3) Evolution of scientific applications
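Finding hot events in social media posts can be approximated, very crudely, by counting how many posts mention each term. The sketch below assumes short English posts, a tiny hand-picked stopword list, and a hypothetical function name `hot_terms`; production systems would use proper tokenization and topic detection methods.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, chosen only for this example.
STOPWORDS = {"the", "for", "after", "near", "a", "of", "in", "new"}

def hot_terms(posts, top_n=3):
    """Rank terms by the number of posts that mention them."""
    counts = Counter()
    for post in posts:
        # Use a set so a term counts at most once per post.
        words = set(re.findall(r"[a-z']+", post.lower()))
        counts.update(words - STOPWORDS)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

# Hypothetical posts, made up for illustration.
posts = [
    "Heavy flood closes the city bridge",
    "Flood warning extended for the city",
    "Bridge repairs delayed after flood",
    "New cafe opens near the bridge",
]
top = hot_terms(posts)
print(top)
```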

High-throughput sensors and instruments in many areas of scientific research, such as astronomy, oceanography, genetics, and environmental research, produce large amounts of data. The US National Science Foundation (NSF) has announced the BIGDATA program to promote data sharing and analysis. Some scientific disciplines had already developed platforms for analyzing massive data and achieved useful results. For example, in biology, iPlant uses information infrastructure, physical computing resources, and interoperable analysis software to provide data services to researchers, educators, and students dedicated to enriching plant science knowledge. iPlant data sets are diverse and include authoritative and reference data, experimental data, simulation and modeling data, observation data, and other processed data.

Based on the above, research on data analysis can be divided into six directions: structured data analysis, text analysis, web data analysis, multimedia data analysis, social network data analysis, and mobile data analysis. Structured data analysis refers to traditional data analysis. Web, multimedia, social network, and mobile data may overlap in form with other data types (such as text), but they bring new analysis requirements and characteristics in their specific application areas.

III. Commonly used analytical methods

Although the goals and application areas are different, some common analysis methods are useful for almost all data processing. Three types of common data analysis methods are discussed below.

Data visualization

Data visualization is related to information graphics and information visualization. Its goal is to display information clearly and effectively in graphical form. Generally speaking, charts and maps help people understand information quickly, but once the amount of data reaches big data scale, traditional tools such as spreadsheets can no longer handle it. Big data visualization has become an active research field because it can assist algorithm design and software development. Friedman and Frits discussed data visualization from the perspectives of information representation and computer science respectively. Tabusvis is a lightweight visualization system that provides flexible and customizable visualization for multidimensional data.
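One common way to keep a chart readable as data grows is to aggregate values into bins before drawing them. The sketch below shows that idea with a text histogram; the bin width, the sample values, and the function names `histogram` and `render` are assumptions for illustration, and a real system would hand the binned counts to a plotting library.

```python
def histogram(values, bin_width):
    """Count integer values per bin; keys are the lower bound of each bin."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

def render(counts, bin_width, bar_char="#"):
    """Draw one text bar per bin, one character per observation."""
    lines = []
    for start, n in counts.items():
        label = f"{start:>3}-{start + bin_width - 1:<3}"
        lines.append(f"{label} {bar_char * n}")
    return "\n".join(lines)

# Hypothetical response times in milliseconds, made up for illustration.
response_times_ms = [12, 18, 25, 27, 29, 31, 44, 45, 47, 48]
counts = histogram(response_times_ms, bin_width=10)
chart = render(counts, bin_width=10)
print(chart)
```

Only the per-bin counts are drawn, so the size of the chart depends on the number of bins rather than on the number of raw observations.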

Statistical analysis

Statistical analysis is based on statistical theory, a branch of applied mathematics in which randomness and uncertainty are modeled by probability theory. Statistical analysis techniques can be divided into descriptive statistics and inferential statistics. Descriptive statistics summarize or describe data sets, while inferential statistics draw conclusions about the process that generated the data. More advanced multivariate statistical analysis includes regression, factor analysis, clustering, and discriminant analysis.

Data mining
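The difference between the two branches can be shown on a small sample: the mean and standard deviation describe the sample itself, while a confidence interval makes an inference about the underlying process. The measurements below are hypothetical, and the interval uses a normal approximation as a simplifying assumption; with so few observations a t-distribution would be the more careful choice.

```python
import math
from statistics import NormalDist, mean, stdev

def describe(sample):
    """Descriptive statistics: summarize the observed sample."""
    return mean(sample), stdev(sample)

def mean_confidence_interval(sample, confidence=0.95):
    """Inferential statistics: interval estimate for the process mean.

    Assumes a normal approximation, which is rough for small samples.
    """
    m, s = describe(sample)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * s / math.sqrt(len(sample))
    return m - margin, m + margin

# Hypothetical measurements of a part's length in millimetres.
lengths_mm = [9.8, 10.2, 10.0, 10.1, 9.9, 10.3, 9.7, 10.0]

sample_mean, sample_sd = describe(lengths_mm)
low, high = mean_confidence_interval(lengths_mm)
print(f"mean={sample_mean:.2f} sd={sample_sd:.2f} 95% CI=({low:.3f}, {high:.3f})")
```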

Data mining is the computational process of discovering patterns in large data sets. Many data mining algorithms have been applied in artificial intelligence, machine learning, pattern recognition, statistics, and databases. Other advanced techniques, such as neural networks and genetic algorithms, are also used for data mining in different applications. The boundaries between many of these methods, such as data mining, machine learning, pattern recognition, and even visual and media information processing, are gradually blurring.
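Pattern discovery can be sketched with a simplified form of frequent itemset mining: counting which pairs of items appear together in at least a minimum number of transactions. The baskets and the support threshold are hypothetical, and the sketch counts every pair directly rather than implementing a full algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return item pairs that co-occur in at least min_support transactions."""
    counts = Counter()
    for basket in transactions:
        # Sort and deduplicate so each pair has one canonical form per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical shopping baskets, made up for illustration.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk", "eggs"],
]
pairs = frequent_pairs(baskets, min_support=2)
print(pairs)
```

Pairs that clear the threshold are candidate patterns, for example as input to product recommendation; counting all pairs grows quadratically with basket size, which is why dedicated algorithms prune the search on large data.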
