How to calculate Pi with hadoop

2025-04-10 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains how to use Hadoop to estimate the value of Pi. It covers both the principle and the command in detail, and should be a useful reference for interested readers.

I. The method and principle of calculating Pi

A search on Baidu turns up many ways to calculate Pi. The comment in the Hadoop examples code, however, says that a quasi-Monte Carlo method is used to estimate the value of Pi.

Wikipedia's description of quasi-Monte Carlo is rather theoretical and full of difficult formulas.

Fortunately, a Google search turned up an article on the Stanford University website: "Can you get the value of Pi by throwing darts?" The article is short, illustrated and easy to understand.

Here is a picture of the key part of that article:

A brief explanation of the picture above:

1. Figure 2 is the upper-right quarter of Figure 1.

2. Throw a large number of darts at Figure 2, with each dart landing at a different point.

3. After enough throws, Figure 2 will be "riddled with holes".

4. At that point, the "number of throws that landed inside the circle" divided by the "total number of throws", multiplied by 4, approximates the value of Pi. (For the full derivation, see the original article.)
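The four steps above can be sketched in a few lines of Python. This is only an illustration of the dart-throwing idea using ordinary pseudo-random numbers, not the Hadoop implementation; the function name `estimate_pi` and the fixed seed are assumptions made for the example.

```python
import random


def estimate_pi(num_throws, seed=0):
    """Estimate Pi by throwing darts at the unit square [0, 1) x [0, 1)."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    inside = 0
    for _ in range(num_throws):
        x, y = rng.random(), rng.random()
        # A dart is "in the circle" if it lies within distance 1 of the origin,
        # i.e. inside the quarter circle shown in Figure 2.
        if x * x + y * y <= 1.0:
            inside += 1
    # Quarter-circle area / square area = Pi / 4, so multiply the hit ratio by 4.
    return 4.0 * inside / num_throws


if __name__ == "__main__":
    print(estimate_pi(1_000_000))
```

With more throws the printed value should drift closer to 3.14159, although the exact digits depend on the seed.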

A key point in this algorithm is how to throw at Figure 2 "randomly", that is, how to give every point in Figure 2 an equal probability of being hit.

The Hadoop examples code uses a Halton sequence to guarantee this. For more information about the Halton sequence, please refer to Wikipedia.

To summarize the role of the Halton sequence: within a 1 × 1 square, it produces non-repeating, evenly distributed points. The x and y coordinates of each point lie between 0 and 1. The sequence is deterministic rather than truly random, which is why the method is called quasi-Monte Carlo.

This is what guarantees that the throws cover Figure 2 evenly.
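As a rough illustration of how such points can be generated, the sketch below builds a two-dimensional Halton sequence from the radical inverse in two coprime bases. The choice of bases 2 and 3 and the names `halton` and `halton_points` are assumptions made for this example; the article does not say which bases the Hadoop code uses.

```python
def halton(index, base):
    """Return element number `index` (index >= 1) of the Halton sequence in `base`.

    The digits of `index` written in `base` are mirrored around the radix
    point, which yields a value in the interval (0, 1).
    """
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_points(count, start=1):
    """Return `count` points in the unit square, using bases 2 and 3 (assumed)."""
    return [(halton(i, 2), halton(i, 3)) for i in range(start, start + count)]


if __name__ == "__main__":
    for point in halton_points(5):
        print(point)
```

The first x values come out as 1/2, 1/4, 3/4, 1/8, ..., so every new point falls into a gap left by the earlier ones instead of clustering the way random points can.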

Some people summarize this as the Monte Carlo method: inscribe a circle in a unit square (1 × 1), so that the circle has diameter 1. Then square area : inscribed circle area = number of darts in the square : number of darts in the circle. Counting the darts therefore gives the area of the circle, and from that area, Pi.
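The ratio can be written out explicitly. The notation is introduced here for illustration only: N is the total number of darts, N_in the number that land inside the circle, and r the radius of a circle inscribed in a square of side 2r (`\text` requires the amsmath package).

```latex
\[
\frac{N_{\text{in}}}{N}
\approx \frac{A_{\text{circle}}}{A_{\text{square}}}
= \frac{\pi r^{2}}{(2r)^{2}}
= \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx 4\,\frac{N_{\text{in}}}{N}
\]
```

The quarter-circle picture gives the same ratio, since a quarter circle of radius r inside a square of side r covers (πr²/4) / r² = π/4 of the square.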

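Note that the accuracy improves as the number of darts increases, though not in direct proportion: with purely random throws, the error shrinks roughly as 1/sqrt(N) for N throws.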
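To see how the error behaves as the number of darts grows, the following sketch compares pseudo-random throws with Halton points at several sample sizes. The helper names, the bases 2 and 3, the seed and the sample sizes are all assumptions chosen for illustration; the numbers it prints are not taken from the article.

```python
import math
import random


def halton(index, base):
    """Radical inverse of `index` (index >= 1) in `base`, a value in (0, 1)."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def error_random(n, seed=0):
    """Absolute error of the estimate from n pseudo-random darts."""
    rng = random.Random(seed)
    inside = sum(1 for _ in range(n) if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return abs(4.0 * inside / n - math.pi)


def error_halton(n):
    """Absolute error of the estimate from the first n Halton points (bases 2, 3)."""
    inside = sum(1 for i in range(1, n + 1)
                 if halton(i, 2) ** 2 + halton(i, 3) ** 2 <= 1.0)
    return abs(4.0 * inside / n - math.pi)


if __name__ == "__main__":
    print("darts  random_error  halton_error")
    for n in (100, 1_000, 10_000, 100_000):
        print(n, error_random(n), error_halton(n))
```

Both columns should generally shrink as the number of darts grows, and the Halton column tends to shrink faster, which is the usual motivation for quasi-Monte Carlo sampling.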

II. Running the Hadoop command to estimate Pi

hadoop jar $HADOOP_HOME/hadoop-*-examples.jar pi 100 100000000

The meaning of the two numeric parameters:

The first number, 100, is the number of map tasks to run.

The second number is the number of throws each map task makes.

The product of the two parameters is the total number of throws.
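The way the two parameters combine can be imitated in plain Python, without Hadoop. In this sketch each simulated map task counts hits for its own contiguous slice of a Halton sequence, and a final reduce step adds the counts. The slicing scheme, the bases 2 and 3, and the names `run_map` and `estimate_pi` are assumptions for illustration, not a description of the actual Hadoop code.

```python
def halton(index, base):
    """Radical inverse of `index` (index >= 1) in `base`, a value in (0, 1)."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def run_map(map_id, samples_per_map):
    """One simulated map task: return (inside, outside) for its slice of points."""
    offset = map_id * samples_per_map  # assumed split: one contiguous slice per map
    inside = 0
    for i in range(offset + 1, offset + samples_per_map + 1):
        x, y = halton(i, 2), halton(i, 3)
        if x * x + y * y <= 1.0:
            inside += 1
    return inside, samples_per_map - inside


def estimate_pi(num_maps, samples_per_map):
    """Simulated reduce step: sum the per-map counts and estimate Pi."""
    results = [run_map(map_id, samples_per_map) for map_id in range(num_maps)]
    total_inside = sum(inside for inside, _ in results)
    total_outside = sum(outside for _, outside in results)
    total = total_inside + total_outside  # equals num_maps * samples_per_map
    return 4.0 * total_inside / total, total


if __name__ == "__main__":
    estimate, total = estimate_pi(num_maps=10, samples_per_map=1000)
    print("total throws:", total)
    print("estimate:", estimate)
```

Because every map works on a different slice, no point is counted twice, and the total number of throws is exactly the product of the two parameters.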

The result of my run:
