How to Implement Random Sampling and Probability Distributions in Python

2025-04-03 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02

This article introduces how to perform random sampling and draw from probability distributions in Python. It is intended as a practical reference.

Python (including its NumPy package) provides many probability-related routines, from basic random sampling to generators for many classical probability distributions. This series introduces several probability functions commonly used in machine learning. First, let's look at the most basic one: random sampling.

1. random.choice

If we only need to draw a single item from a sequence (every item is drawn with equal probability), random.choice is enough:

    import random

    res1 = random.choice([0, 1, 2, 3, 4])
    print(res1)  # e.g. 3

2. random.choices (with replacement)

Often we need to draw more than one item, and we want each item in the sequence to have a different probability of being drawn. In that case we can use the random.choices function, which samples a sequence with replacement (that is, the same item can be drawn more than once). The prototype of the function is as follows:

    random.choices(population, weights=None, *, cum_weights=None, k=1)

population: the sequence to be sampled

weights: the weight assigned to each item (also known as the relative weight), which determines the probability of each item being drawn, such as [10, 0, 30, 60, 0]

cum_weights: the cumulative weights; the relative weights [10, 0, 30, 60, 0] are equivalent to the cumulative weights [10, 10, 40, 100, 100]
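The relationship between the two weight parameters can be checked directly. The sketch below is only an illustration: it builds the cumulative weights with itertools.accumulate and then compares two identically seeded generators, one given weights and the other given cum_weights. The seed 42 and the variable names are arbitrary choices, not part of the original article.

```python
import random
from itertools import accumulate

population = [0, 1, 2, 3, 4]
weights = [10, 0, 30, 60, 0]

# Cumulative weights are the running sums of the relative weights.
cum_weights = list(accumulate(weights))
print(cum_weights)  # [10, 10, 40, 100, 100]

# Two generators with the same seed: one receives weights, the other cum_weights.
by_weights = random.Random(42).choices(population, weights=weights, k=10)
by_cum_weights = random.Random(42).choices(population, cum_weights=cum_weights, k=10)

# Both calls describe the same distribution, so the draws should match.
print(by_weights == by_cum_weights)

# Items with weight 0 (here 1 and 4) are never drawn.
print(sorted(set(by_weights)))
```

Passing cum_weights saves random.choices from recomputing the running sums on every call, which can matter when the same weights are reused many times.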

We draw three items from [0, 1, 2, 3, 4] according to the relative weights as follows:

    res2 = random.choices([0, 1, 2, 3, 4], weights=[10, 0, 30, 60, 0], k=3)
    # Note: population is normally passed positionally (it can also be passed
    # by keyword), while cum_weights and k are keyword-only parameters.
    # For keyword and positional parameters, see my blog
    # "Python technique 2: advanced usage of function parameters"
    # https://www.cnblogs.com/orion-orion/p/15647408.html
    print(res2)  # e.g. [3, 3, 2]

We draw three items from [0, 1, 2, 3, 4] according to the cumulative weights as follows:

    res3 = random.choices([0, 1, 2, 3, 4], cum_weights=[10, 10, 40, 100, 100], k=3)
    print(res3)  # e.g. [0, 3, 3]

Note that the relative weights (weights) and the cumulative weights (cum_weights) cannot be passed at the same time; otherwise a TypeError is raised with the message 'Cannot specify both weights and cumulative weights'.
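A small sketch of that failure mode, assuming nothing beyond the standard library. The variable error_message is introduced here only so the result can be inspected afterwards.

```python
import random

error_message = None
try:
    # Passing both weight parameters at once is rejected.
    random.choices(
        [0, 1, 2, 3, 4],
        weights=[10, 0, 30, 60, 0],
        cum_weights=[10, 10, 40, 100, 100],
        k=3,
    )
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```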

3. random.sample (without replacement)

random.sample samples without replacement. If each item may be drawn at most once, use random.sample. Note that this function does not let you define sample weights. The function prototype is as follows:

    random.sample(population, k, *, counts=None)

population: the sequence to be sampled

k: the number of items to draw

counts: the number of times each element of population is repeated, for cases where population contains repeated elements. sample(['red', 'blue'], counts=[4, 2], k=5) is equivalent to sample(['red', 'red', 'red', 'red', 'blue', 'blue'], k=5)
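The effect of counts can be seen by tallying a sample. This is a minimal sketch that assumes Python 3.9 or later (where the counts parameter is available); the seed 0 and the names picked and tally are illustrative only.

```python
import random
from collections import Counter

rng = random.Random(0)

# Four 'red' and two 'blue' are available; five are drawn without replacement.
picked = rng.sample(['red', 'blue'], counts=[4, 2], k=5)
tally = Counter(picked)

print(picked)
print(tally)
```

Because only six elements exist in total and five are drawn, the tally can never exceed four 'red' or two 'blue', whatever the seed.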

We draw three items from the sequence [0, 1, 2, 3, 4] without replacement as follows:

    res3 = random.sample([0, 1, 2, 3, 4], k=3)
    print(res3)  # e.g. [3, 2, 1]

We draw three items without replacement from the collection with repeated elements [0, 1, 1, 2, 2, 3, 3, 4] as follows:

    res4 = random.sample([0, 1, 2, 3, 4], counts=[1, 2, 2, 2, 1], k=3)
    print(res4)  # e.g. [3, 2, 2]

If the length of counts does not match the length of the population sequence, a ValueError is raised with the message "The number of counts does not match the population".
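The sketch below triggers that error and, as an additional case not covered above, the ValueError that random.sample raises when k is larger than the population. It assumes Python 3.9 or later for counts; the variable names are illustrative.

```python
import random

mismatch_error = None
try:
    # Five population items but only three counts.
    random.sample([0, 1, 2, 3, 4], counts=[1, 2, 2], k=3)
except ValueError as exc:
    mismatch_error = str(exc)
print(mismatch_error)

oversize_error = None
try:
    # Without replacement, at most len(population) items can be drawn.
    random.sample([0, 1, 2], k=5)
except ValueError as exc:
    oversize_error = str(exc)
print(oversize_error)
```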

4. rng.choices and rng.sample

Another way to implement sampling is one I learned from the code [2] of the paper [1]: first create a random number generator, then call its choices or sample method. These methods are used in the same way as the random.choices/random.sample functions.

    rng_seed = 1234
    rng = random.Random(rng_seed)

    res5 = rng.choices(
        population=[0, 1, 2, 3, 4],
        weights=[0.1, 0, 0.3, 0.6, 0],
        k=3,
    )
    print(res5)  # [3, 3, 0]

    res6 = rng.sample(
        population=[0, 1, 2, 3, 4],
        k=3,
    )
    print(res6)  # [4, 0, 2]
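One reason to use a dedicated generator is reproducibility. The following sketch, with a hypothetical helper named draw, shows that two generators created from the same seed produce the same samples even if the module-level generator is used in between.

```python
import random

def draw(seed):
    """Draw with and without replacement from a freshly seeded generator."""
    rng = random.Random(seed)
    with_replacement = rng.choices([0, 1, 2, 3, 4], weights=[0.1, 0, 0.3, 0.6, 0], k=3)
    without_replacement = rng.sample([0, 1, 2, 3, 4], k=3)
    return with_replacement, without_replacement

first = draw(1234)

# Using the module-level generator does not disturb a private random.Random.
random.seed(999)
random.random()

second = draw(1234)
print(first == second)
```

A private generator keeps the sampling independent of any other code that calls the module-level random functions.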

In the implementation code [2] of the paper [1], these two methods are used to randomly select the clients (task nodes):

    def sample_clients(self):
        """
        sample a list of clients without repetition
        """
        # seed is not defined in this excerpt; it is assumed to come from
        # the surrounding code.
        rng_seed = (seed if (seed is not None and seed >= 0) else int(time.time()))
        self.rng = random.Random(rng_seed)

        if self.sample_with_replacement:
            self.sampled_clients = \
                self.rng.choices(
                    population=self.clients,
                    weights=self.clients_weights,
                    k=self.n_clients_per_round,
                )
        else:
            self.sampled_clients = self.rng.sample(self.clients, k=self.n_clients_per_round)

5. numpy.random.choice

Sampling from a sequence according to a weight distribution can also be done with numpy.random.choice. The prototype of the function is as follows:

    numpy.random.choice(a, size=None, replace=True, p=None)

a: 1-D array-like or int. If it is a 1-D array-like, samples are drawn from its elements. If it is an int, samples are drawn from np.arange(a).

size: int or tuple of ints, optional. The output shape. If the given shape is (m, n, k), then m * n * k samples are drawn. The default is None, which returns a single value.

replace: boolean, optional. Whether sampling is with replacement. If replace=True, a value can be drawn more than once; otherwise sampling is without replacement and each value can be drawn at most once.

p: 1-D array-like, optional. The probability of each item in a being drawn. If it is not given, the items in a are assumed to follow a uniform distribution (every item has the same probability).
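The parameters above can be explored with a short sketch. It assumes NumPy is installed; the seed 0 and the names grid, letters and single are illustrative only.

```python
import numpy as np

np.random.seed(0)  # fixed seed so repeated runs give the same draws

# a given as an int: samples come from np.arange(5); size as a tuple sets the shape.
grid = np.random.choice(5, size=(2, 3))
print(grid.shape)  # (2, 3)

# a given as a 1-D array-like, drawn without replacement: no value repeats.
letters = np.random.choice(['a', 'b', 'c', 'd'], size=3, replace=False)
print(letters)

# size=None (the default) returns a single value rather than an array.
single = np.random.choice(5)
print(single)
```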

Drawing three items from [0, 1, 2, 3, 4], with and without replacement, looks like this:

    import numpy as np

    res1 = np.random.choice(5, 3, replace=True)
    print(res1)  # e.g. [1 1 4]

    res2 = np.random.choice(5, 3, replace=False)
    print(res2)  # e.g. [2 1 4]

Again we draw three items from [0, 1, 2, 3, 4], with and without replacement, but now with a different probability for each item:

    res3 = np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
    print(res3)  # e.g. [2 3 3]

    res4 = np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
    print(res4)  # e.g. [3 2 0]
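Unlike the weights of random.choices, the p argument of numpy.random.choice must be a set of probabilities that sum to 1. The sketch below, which assumes NumPy is installed and uses an arbitrary seed and sample count, normalizes the relative weights used earlier and checks that the empirical frequencies approach p.

```python
import numpy as np

# Relative weights from the random.choices example, normalized to probabilities.
weights = np.array([10, 0, 30, 60, 0])
p = weights / weights.sum()
print(p)

np.random.seed(0)
n_draws = 100_000
samples = np.random.choice(5, size=n_draws, p=p)

# Fraction of draws that landed on each of the values 0..4.
frequencies = np.bincount(samples, minlength=5) / n_draws
print(frequencies)
```

With this many draws the frequencies should sit close to p, and the values whose probability is 0 should not appear at all.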
