
How to verify the distribution of data using a Q-Q plot in big data

2025-01-15 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how to use a Q-Q plot to verify the distribution of data in big data work. The walkthrough is detailed, and I hope interested readers find it helpful.

A Q-Q (quantile-quantile) plot is a graphical method for checking which distribution a random variable follows (for example normal, exponential, or lognormal). It compares the quantiles of the variable with those of a reference distribution, so it can be used to examine the properties of any distribution.

For example, to verify whether a given sample is normally distributed, we compare the unknown distribution with a known normal distribution. By inspecting the resulting Q-Q plot, we can judge whether the given distribution is normal.

Steps to draw a Q-Q plot:

1. Take the unknown random variable.

2. Find each of its percentiles.

3. Generate a sample from a known distribution and apply steps 1-2 to it as well.

4. Draw the Q-Q plot.
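The steps above can be sketched as a small helper. This is a minimal sketch rather than the article's own code: the function name `qq_points`, the `percents` parameter, and the seed are assumptions chosen for illustration.

```python
import numpy as np

def qq_points(sample, reference, percents=None):
    """Return the paired percentile values of two 1-D samples (hypothetical helper)."""
    if percents is None:
        percents = np.arange(1, 101)  # 1%, 2%, ..., 100%
    sample_q = np.percentile(sample, percents)        # steps 1-2 for the unknown sample
    reference_q = np.percentile(reference, percents)  # step 3 for the known sample
    return sample_q, reference_q

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
unknown = rng.normal(loc=50, scale=25, size=1000)
known = rng.normal(loc=0, scale=1, size=1000)

# Step 4 would scatter-plot xq against yq
xq, yq = qq_points(unknown, known)
```

The two returned arrays have one entry per percentile, so they can be passed directly to a scatter plot.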

Suppose we are given a random sample and need to verify whether it follows a normal (Gaussian) distribution. For ease of understanding, we call the unknown distribution X and the known normal distribution Y.

Generate the unknown distribution X:

    X = np.random.normal(loc=50, scale=25, size=1000)

This draws 1000 values from a normal distribution with mean = 50 and standard deviation = 25.

Find the 1st to 100th percentiles:

    X_100 = []
    for i in range(1, 101):
        X_100.append(np.percentile(X, i))

This calculates each percentile (1%, 2%, 3%, ..., 99%, 100%) of the X sample and stores it in X_100.

Generate the known distribution Y and its percentile values:

    Y = np.random.normal(loc=0, scale=1, size=1000)

This generates a normal distribution with mean 0 and standard deviation 1, which is compared with the unknown distribution X to verify whether X is normal.

    Y_100 = []
    for i in range(1, 101):
        Y_100.append(np.percentile(Y, i))

This calculates each percentile (1%, 2%, 3%, ..., 99%, 100%) of the Y sample and stores it in Y_100.
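The percentile loops can also be written without an explicit loop, because `np.percentile` accepts an array of percentages. The sketch below checks that both forms give the same values. The seed and the names `percents` and `X_100_vec` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, only for reproducibility
X = rng.normal(loc=50, scale=25, size=1000)

# Loop form, one call per percentile
X_100 = []
for i in range(1, 101):
    X_100.append(np.percentile(X, i))

# Vectorized form, one call for all percentiles
percents = np.arange(1, 101)
X_100_vec = np.percentile(X, percents)

# True when both forms agree
same = np.allclose(X_100, X_100_vec)
```

Note that the 100th percentile is simply the largest value in the sample.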

Drawing:

Draw a scatter plot of the percentile values obtained above, with X_100 on one axis and Y_100 on the other.

Here X is the unknown distribution, and it is compared against the normal distribution Y.

In a Q-Q plot, if the points fall on a straight line, the two random variables follow the same type of distribution (they may still differ in location and scale, as X and Y do here); otherwise their distributions differ.
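Judging straightness by eye can be backed up with a number. As a rough sketch, one option is to fit a line through the percentile pairs and look at their correlation. This check is an addition for illustration, not part of the original walkthrough. The seed and the choice to use percentiles 1 to 99 (leaving out the noisy sample maximum) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed
X = rng.normal(loc=50, scale=25, size=1000)
Y = rng.normal(loc=0, scale=1, size=1000)

percents = np.arange(1, 100)  # 1%..99%, skipping the extreme 100th percentile
X_q = np.percentile(X, percents)
Y_q = np.percentile(Y, percents)

# Least-squares line X_q ~ slope * Y_q + intercept
slope, intercept = np.polyfit(Y_q, X_q, deg=1)

# Correlation of the percentile pairs; values near 1 mean a nearly straight line
r = np.corrcoef(Y_q, X_q)[0, 1]

print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.4f}")
```

Because Y is standard normal, the slope and intercept of the fitted line should land near the standard deviation and mean of X.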

From the Q-Q plot of X against Y, we can see that X is normally distributed.

What if the two are not the same?

If X is not normally distributed and follows some other distribution, then a Q-Q plot of X against a normal distribution will show points that do not lie on a straight line.

Here, X follows a lognormal distribution, so the points in the Q-Q plot do not form a straight line.
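The lognormal case can be reproduced numerically. The sketch below compares a lognormal sample and a normal sample against the same normal reference, using the correlation of their percentile pairs as a rough straightness measure. The helper `qq_corr`, the lognormal parameters, and the seed are assumptions for illustration.

```python
import numpy as np

def qq_corr(a, b):
    """Correlation between the 1st..99th percentiles of two samples (hypothetical helper)."""
    percents = np.arange(1, 100)
    return np.corrcoef(np.percentile(a, percents), np.percentile(b, percents))[0, 1]

rng = np.random.default_rng(7)  # arbitrary seed
Y = rng.normal(loc=0, scale=1, size=1000)                 # known normal reference
X_normal = rng.normal(loc=50, scale=25, size=1000)        # normal "unknown"
X_lognormal = rng.lognormal(mean=0, sigma=1, size=1000)   # lognormal "unknown"

r_normal = qq_corr(X_normal, Y)
r_lognormal = qq_corr(X_lognormal, Y)

print(f"normal vs normal:    r={r_normal:.4f}")
print(f"lognormal vs normal: r={r_lognormal:.4f}")
```

The lognormal sample should give a noticeably lower correlation, which corresponds to the curved pattern in its Q-Q plot.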

Let's take another look:

These are Q-Q plots of X and Y distributions under four different conditions.

Top left: Q-Q plot of lognormal and normal distributions

Top right: Q-Q plot of normal and exponential distributions

Bottom left: Q-Q plot of exponential and exponential distributions

Bottom right: Q-Q plot of logistic and logistic distributions
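The four conditions can be generated with NumPy as a rough sketch. The distribution parameters, the sample size, the seed, and the helper `qq_pairs` are assumptions, since the original figure's settings are not given. The correlation of each percentile pair is printed as a numeric stand-in for the four panels.

```python
import numpy as np

def qq_pairs(a, b):
    """Return the 1st..99th percentiles of two samples (hypothetical helper)."""
    percents = np.arange(1, 100)
    return np.percentile(a, percents), np.percentile(b, percents)

rng = np.random.default_rng(11)  # arbitrary seed
n = 1000

# Each entry holds two independent samples (X, Y) for one panel
cases = {
    "lognormal vs normal": (rng.lognormal(size=n), rng.normal(size=n)),
    "normal vs exponential": (rng.normal(size=n), rng.exponential(size=n)),
    "exponential vs exponential": (rng.exponential(size=n), rng.exponential(size=n)),
    "logistic vs logistic": (rng.logistic(size=n), rng.logistic(size=n)),
}

results = {}
for name, (x, y) in cases.items():
    xq, yq = qq_pairs(x, y)
    results[name] = np.corrcoef(xq, yq)[0, 1]
    print(f"{name}: r={results[name]:.4f}")
```

The two same-family pairs should give correlations close to 1, while the mixed pairs should fall visibly lower. Each `(xq, yq)` pair can be passed to a scatter plot to draw the corresponding panel.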

Python implementation:

    import numpy as np
    import matplotlib.pyplot as plt

    # Unknown sample X and its 1st to 100th percentiles
    X = np.random.normal(loc=50, scale=25, size=1000)
    X_100 = []
    for i in range(1, 101):
        X_100.append(np.percentile(X, i))

    # Known standard normal sample Y and its 1st to 100th percentiles
    Y = np.random.normal(loc=0, scale=1, size=1000)
    Y_100 = []
    for i in range(1, 101):
        Y_100.append(np.percentile(Y, i))

    # Q-Q plot: percentiles of X against percentiles of Y
    plt.scatter(X_100, Y_100)
    plt.grid()
    plt.ylabel("Y-normal distribution")
    plt.xlabel("X-normal distribution")
    plt.show()

A Q-Q plot can be used to compare any two distributions, and an unknown distribution can be verified by comparing it with a known one. A major limitation of this approach is that it requires a large number of data points, because conclusions drawn from a small sample are unreliable. By inspecting the Q-Q plot, we can judge whether the two distributions are the same.
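The sample-size limitation can be illustrated with a small simulation. This is a sketch under assumed settings: the sample sizes 20 and 1000, the 200 repetitions, the seed, and the helper `qq_corr` are all chosen for illustration. Both samples are always normal, so any departure from a straight line is due to sampling noise alone.

```python
import numpy as np

def qq_corr(a, b):
    """Correlation between the 1st..99th percentiles of two samples (hypothetical helper)."""
    percents = np.arange(1, 100)
    return np.corrcoef(np.percentile(a, percents), np.percentile(b, percents))[0, 1]

rng = np.random.default_rng(3)  # arbitrary seed

# For each sample size, repeat the normal-vs-normal comparison many times
spread = {}
for n in (20, 1000):
    rs = [qq_corr(rng.normal(size=n), rng.normal(size=n)) for _ in range(200)]
    spread[n] = (min(rs), float(np.std(rs)))  # worst case and variability of r
    print(f"n={n}: min r={spread[n][0]:.4f}, std of r={spread[n][1]:.4f}")
```

With only 20 points the straightness measure should vary much more from run to run, so a single small-sample Q-Q plot can look bent even when both distributions match.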

That concludes this look at how to use a Q-Q plot to verify the distribution of data in big data work. I hope the content above is of some help and lets you learn something new. If you think the article is good, you can share it for more people to see.

© 2024 shulou.com SLNews company. All rights reserved.