In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-21 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/01 Report--
This article mainly introduces "how to use Python reservoir algorithm to achieve random sampling". In daily operation, I believe many people have doubts about how to use Python reservoir algorithm to achieve random sampling. The editor consulted all kinds of data and sorted out simple and easy-to-use operation methods. I hope it will be helpful for everyone to answer the doubt of "how to use Python reservoir algorithm to achieve random sampling". Next, please follow the editor to study!
Now there is a set of numbers. I don't know the total number of this group of numbers. Please describe an algorithm that can randomly extract k numbers from this set of data, so that the probability of each number being taken out is equal.
If there are n numbers in this group, then the probability taken by each number is KBO, but the difficulty of this problem is that you don't know the total number of this number, that is, you don't know n, so how to calculate the probability of each number?
Reservoir algorithm
The swimming pool (reservoir) is no stranger to everyone. The water in some swimming pools is alive, with both inlet and outlet pipes, so will all the water in the swimming pool be replaced after the current of the same volume as the swimming pool? Of course not. Some of the water may stay in the pool for a long time, and some may flow away as soon as it gets in. Following this phenomenon, the reservoir sampling algorithm was born. The key of the reservoir algorithm is to ensure that the water flowing into the reservoir and the water already in the pool remain in the reservoir with the same probability. And the reservoir algorithm can solve this kind of sampling problem without knowing the total amount in advance and in the case of time complexity O (N).
Core principle
This part involves the formula, and the picture is pasted directly to ensure the effect.
Python implementation
Next, try to implement the reservoir algorithm with Python, because the reservoir algorithm is sampled without knowing the total amount in advance, so define a method to receive a single element, and put this method in the class to hold the sampled data.
Import randomclass ReservoirSample (object): def _ init__ (self, size): self._size = size self._counter = 0 self._sample = [] def feed (self, item): self._counter + = 1 # element I (I
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.