In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/02 Report--
This article mainly introduces "the principle and implementation method of BitMap". In daily operation, I believe that many people have doubts about the principle and implementation of BitMap. The editor consulted all kinds of materials and sorted out simple and easy-to-use operation methods. I hope it will be helpful to answer the doubts about "the principle and implementation method of BitMap". Next, please follow the editor to study!
In java:
Byte-> 8 bits-> 1 byte char-> 16 bit-> 2 byte short-> 16 bits-> 2 byte int-> 32 bits-> 4 byte float-> 32 bits-> 4 byte long-> 64 bits-> 8 byte BitMap implementation principle
In java, an int type accounts for 32 bits, we use an int array to represent new int [32], a total of memory 32*32bit, now if we use each bit of int bytecode to represent a number, then 32 numbers only need an int type of memory size is enough, so in the case of a large amount of data will save a lot of memory.
Specific ideas:
1 int occupies 4 bytes, that is, 4'8'32 bits, so we only need to apply for an int array with the length of int tmp [1+N/32] to store the data, where N represents the total number to be searched, and each element in tmp occupies 32 bits can correspond to the decimal number 0q31, so we can get the BitMap table:
Tmp [0]: can denote 0,31 tmp [1]: can denote 320063 tmp [2] can represent 640095.
So let's take a look at how decimal numbers are converted into corresponding bit bits: suppose the 4 billion int data is: 6 int 3, 8, 32, 35, and then the specific BitMap is expressed as:
How to determine which subscript of the int number in the tmp array, this can actually be directly divided by 32 to take the integer part, for example: integer 8 divided by 32 rounded equal to 0, then 8 is on the tmp [0]. In addition, how do we know which bit of 8 is in the 32 bits in tmp [0]? in this case, 32 is ok directly on mod, and if the integer 8 is equal to 8 on the 8th mod in tmp [0], then the integer 8 is in the eighth bit bit in tmp [0] (from the right).
/ * * @ author sowhat * @ create 2020-08-20 19:30 * / public class BitMap {private long length; private static int [] bitsMap Private static final int [] BIT_VALUE = {0x00000001, 0x00000002, 0x00000004, 0x00000008, 0x00000010, 0x00000020, 0x00000040, 0x00000080, 0x00000100, 0x00000200, 0x00000400, 0x00000800, 0x00001000, 0x00002000, 0x00004000, 0x00008000, 0x00010000, 0x00020000, 0x00040000, 0x00080000, 0x00100000, 0x00200000, 0x00400000, 0x00800000, 0x01000000, 0x02000000, 0x04000000, 0x08000000, 0x10000000, 0x20000000, 0x40000000, 0x80000000} Public BitMap (long length) {this.length = length / * the required array size is calculated according to the length * when length%32=0 is equal to = length/32 * when length%32 > 0 the size is equal to = length/32+l * / bitsMap = new int [(int) (length > > 5) + (length & 31) > 0? 1: 0)] } / * * @ param n to be set to n * / public void setN (long n) {if (n)
< 0 || n >Length) {throw new IllegalArgumentException ("length value" > BitMap application
1: look at a small scene > find integers that are repeated > = 2 times among 300 million integers. The memory limit is not enough to hold 300 million integers.
For this scenario, I can use 2-BitMap to solve this problem, that is, assign 2bit to each integer and use different combinations of 0 and 1 to identify the special meaning. For example, 00 means that the integer has not appeared, 01 means it has appeared once, and 11 means it has appeared multiple times. The memory space required is 2 times that of the normal BitMap, which is 300 million * 2/8/1024/1024=71.5MB.
The specific process is as follows:
Scan 300 million integers, group BitMap, first check the corresponding position in the BitMap, if 00 becomes 01, 01 becomes 11, and 11 remains the same, when 300 million integers are scanned, that is to say, the whole BitMap has been assembled. Finally, look at the integer output that BitMap will correspond to bit 11.
2: it is known that a file contains some phone numbers, each with 8 digits, and count the number of different numbers.
8 bits at most 99 999 999, about 99 million bit, size = 99 999 999 bit, size = 99 999 999, 1024, 1024, 12m. (it can be understood as a number from 0-99 999 999, each number corresponds to one Bit bit, so only 99 million Bit==1.2MBytes is needed, so all 8-digit phones are represented with a small 12 megabytes of memory.)
Another way to analyze BitMap
I. introduction of questions
BitMap is a bitmap, which, to be exact, is translated into bit-based mapping. For example, there is a disordered bounded int array {1rec 2rec 5pm 7}, which is initially estimated to occupy 4cm 4x16 bytes of memory, which is not surprising, but if there are 1 billion such numbers, 1 billion * 4 bytes / (1024 "1024" 1024) = 3.72g (1GB=1024MB, 1MB=1024KB, 1KB=1024B, 1B=8b). If such a large data is searched and sorted, it is estimated that the memory will also crash. Some people say that these data can not be loaded at once, and that is to save the disk, which is bound to consume IO. What we advocate is high performance, which is not considered directly.
Second, problem analysis
If you use the idea of BitMap to solve it, it would be much better. The solution is as follows:
A byte accounts for 8 bit, if each bit has or does not have, that is, binary 0 or 1, if the position of the bit is used to represent the array value, then 0 means that the value has not been present, and 1 indicates that the array value has been present. Can't you describe the data, too? The details are as follows:
Isn't it amazing? now if the space needed for 1 billion of the data is 3.72G/32, a data that takes up 32bit now only takes up 1bit, saving a lot of space, not to mention sorting, everything seems so smooth. There is no correlation between such data, if you read it, you can read it in a multi-threaded way. Time complexity is also O (Max/n), where Max is the size of the byte [] array and n is the thread size.
III. Application and Code
If BitMap is just this feature, but not its elegance, then continue to appreciate its charm. The following computing idea is actually aimed at the logical operation of bit, and application scenarios similar to this kind of logical operation can be used in permission calculation.
Before we look at the code, let's figure out a question, how can a number quickly locate its index number, that is, find out what the index of byte [index] is and which one is position. Give an example, such as add (14). 14 has exceeded the mapping range of byte [0], in the range of byte [1] and so on. So how to quickly locate its index. If you find its index number, how to locate it. Index (N) represents the index number of N, and Position (N) represents the location number of N.
Index (N) = Namp 8 = N > > 3; Position (N) = N% 8 = N & 0x07
(1) add (int num)
What are you going to do with add data in bitmap? don't worry, it's simple and amazing.
As analyzed above, the purpose of add is to change the location from 0 to 1. The other positions remain the same.
Public void add (int num) {/ / num/8 gets the index int arrayIndex = num > > 3 of byte []; / / num%8 gets the position int position = num & 0x07 at byte [index]; / / after moving 1 to the left, that position is naturally 1, and then does it with the previous data |, in this way, that position is replaced with 1. Bits [arrayIndex] | = 1 > 3; / / num%8 gets the position int position = num & 0x07; / / after moving 1 to the left, that position is naturally 1, then reverse it, and do & with the current value to clear the current position. Bits [arrayIndex] & = ~ (1 > 3; / / num%8 gets the position int position = num & 0x07; / / after moving 1 to the left, that position is naturally 1, and then do & with the previous data to determine whether it is return (bits [arrayIndex] & (1 > 3) + 1]. } public void add (int num) {/ / num/8 gets the index int arrayIndex = num > > 3 of byte []; / / num%8 gets the position int position = num & 0x07 at byte [index]; / / after moving 1 to the left, that position is naturally 1, and then do it with the previous data |, in this way, that position is replaced with 1. Bits [arrayIndex] | = 1 > 3; / / num%8 gets the position int position = num & 0x07; / / after moving 1 to the left, that position is naturally 1, and then do & with the previous data to determine whether return (bits [arrayIndex] & (1 > 3) / / num%8 gets the position int position = num & 0x07; / / after moving 1 to the left, that position is naturally 1, then reverse it and do & with the current value to clear the current position. Bits [arrayIndex] & = ~ (1)
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.