Shulou (Shulou.com), updated 2025-01-16
Author | Bai Song
Objective: for research purposes, we need to know how many vertices take part in each iteration (superstep) so that the system can be optimized further. For example, the last line of the SSSP compute() method calls voteToHalt(), which puts the current vertex into the InActive state, so after every superstep all vertices are InActive. After the global synchronization barrier, each vertex that received a message is reactivated (becomes Active), and its compute() method is called in the next superstep. This article shows how to count the number of vertices that take part in each superstep. The compute() method of SSSP is reproduced below:
    @Override
    public void compute(Iterable<DoubleWritable> messages) {
        if (getSuperstep() == 0) {
            setValue(new DoubleWritable(Double.MAX_VALUE));
        }
        double minDist = isSource() ? 0d : Double.MAX_VALUE;
        for (DoubleWritable message : messages) {
            minDist = Math.min(minDist, message.get());
        }
        if (minDist < getValue().get()) {
            setValue(new DoubleWritable(minDist));
            for (Edge<LongWritable, FloatWritable> edge : getEdges()) {
                double distance = minDist + edge.getValue().get();
                sendMessage(edge.getTargetVertexId(), new DoubleWritable(distance));
            }
        }
        // set the vertex to the InActive state
        voteToHalt();
    }
Note: in Giraph, the algorithm terminates when there are no active vertices and no messages are being passed between workers.
In hama-0.6.0, the termination condition only checks whether there are any active vertices. That is not the full Pregel model; it is only a partial implementation.
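The vertex life cycle and the Giraph-style termination condition described above can be illustrated with a toy, single-process simulation. This is only a sketch under stated assumptions: the class SsspSuperstepDemo, its method computedPerSuperstep, and the edge-array layout are hypothetical names invented for illustration and are not Giraph code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy single-process model of Pregel-style SSSP; hypothetical, not Giraph code. */
final class SsspSuperstepDemo {
    private SsspSuperstepDemo() { }

    /**
     * Runs SSSP and returns how many vertices ran their compute step in each superstep.
     * Each edge is {sourceVertex, targetVertex, weight}; vertex ids are 0..vertexCount-1.
     */
    static List<Integer> computedPerSuperstep(int vertexCount, double[][] edges, int source) {
        double[] value = new double[vertexCount];
        Arrays.fill(value, Double.MAX_VALUE);          // same effect as the superstep-0 initialisation
        boolean[] halted = new boolean[vertexCount];   // all false: every vertex starts active
        Map<Integer, List<Double>> inbox = new HashMap<>();
        List<Integer> counts = new ArrayList<>();

        while (true) {
            Map<Integer, List<Double>> outbox = new HashMap<>();
            int computed = 0;
            long messagesSent = 0;
            for (int v = 0; v < vertexCount; v++) {
                List<Double> messages = inbox.getOrDefault(v, new ArrayList<>());
                if (halted[v] && !messages.isEmpty()) {
                    halted[v] = false;                 // an incoming message reactivates the vertex
                }
                if (halted[v]) {
                    continue;                          // inactive vertices are skipped
                }
                computed++;                            // this vertex takes part in the superstep
                double minDist = (v == source) ? 0d : Double.MAX_VALUE;
                for (double m : messages) {
                    minDist = Math.min(minDist, m);
                }
                if (minDist < value[v]) {
                    value[v] = minDist;
                    for (double[] e : edges) {
                        if ((int) e[0] == v) {
                            outbox.computeIfAbsent((int) e[1], k -> new ArrayList<>())
                                  .add(minDist + e[2]);
                            messagesSent++;
                        }
                    }
                }
                halted[v] = true;                      // corresponds to voteToHalt()
            }
            counts.add(computed);

            boolean anyActive = false;
            for (boolean h : halted) {
                anyActive |= !h;
            }
            // Giraph-style condition: no active vertices AND no messages sent
            if (!anyActive && messagesSent == 0) {
                break;
            }
            inbox = outbox;                            // messages are delivered in the next superstep
        }
        return counts;
    }
}
```

For a hypothetical 4-vertex graph with edges 0->1 (1.0), 0->2 (4.0), 1->2 (1.0), 2->3 (1.0) and source 0, computedPerSuperstep returns [4, 2, 2, 1]. A check that looked only at active vertices would stop after superstep 0 on this graph, because every vertex has already voted to halt while two messages are still in flight.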
The modification process is as follows:
org.apache.giraph.partition.PartitionStats class
Add a variable and methods that count, for each Partition, the number of vertices that take part in the computation in each superstep. The added variable and methods are as follows:
    /** computed vertices in this partition */
    private long computedVertexCount = 0;

    /**
     * Increment the computed vertex count by one.
     */
    public void incrComputedVertexCount() {
        ++computedVertexCount;
    }

    /**
     * @return the computedVertexCount
     */
    public long getComputedVertexCount() {
        return computedVertexCount;
    }
Modify the readFields() and write() methods by appending one statement at the end of each. When a Partition finishes its computation, it sends its own computedVertexCount to the Master, which reads the values and aggregates them.
    @Override
    public void readFields(DataInput input) throws IOException {
        partitionId = input.readInt();
        vertexCount = input.readLong();
        finishedVertexCount = input.readLong();
        edgeCount = input.readLong();
        messagesSentCount = input.readLong();
        // added statement
        computedVertexCount = input.readLong();
    }

    @Override
    public void write(DataOutput output) throws IOException {
        output.writeInt(partitionId);
        output.writeLong(vertexCount);
        output.writeLong(finishedVertexCount);
        output.writeLong(edgeCount);
        output.writeLong(messagesSentCount);
        // added statement
        output.writeLong(computedVertexCount);
    }
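The new field has to be written and read at the same position in both methods, because the serialized form carries no field names, only values in order. The following is a minimal round-trip sketch using only java.io; StatsRecord and roundTrip are hypothetical stand-ins chosen for illustration, not the Giraph PartitionStats class, and only three of the fields are modelled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for PartitionStats with a reduced set of fields. */
final class StatsRecord {
    int partitionId;
    long vertexCount;
    long computedVertexCount;

    /** Writes the fields in a fixed order; the new field is appended last. */
    void write(DataOutput output) throws IOException {
        output.writeInt(partitionId);
        output.writeLong(vertexCount);
        output.writeLong(computedVertexCount);
    }

    /** Reads the fields in exactly the order used by write(). */
    void readFields(DataInput input) throws IOException {
        partitionId = input.readInt();
        vertexCount = input.readLong();
        computedVertexCount = input.readLong();
    }

    /** Serializes a record to bytes and reads it back into a fresh record. */
    static StatsRecord roundTrip(StatsRecord original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                original.write(out);
            }
            StatsRecord copy = new StatsRecord();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If write() emitted computedVertexCount but readFields() did not read it (or read it at a different position), the receiving side would either leave bytes unread or assign values to the wrong fields, which is why the statement is appended to both methods.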
org.apache.giraph.graph.GlobalStats class
Add a variable and method that hold the total number of vertices that take part in the computation in each superstep, summed over all Partitions on all Workers.
    /**
     * computed vertices across all partitions
     * Add by BaiSong
     */
    private long computedVertexCount = 0;

    /**
     * @return the computedVertexCount
     */
    public long getComputedVertexCount() {
        return computedVertexCount;
    }
Modify the addPartitionStats(PartitionStats partitionStats) method so that it also accumulates computedVertexCount.
    /**
     * Add the stats of a partition to the global stats.
     *
     * @param partitionStats Partition stats to be added.
     */
    public void addPartitionStats(PartitionStats partitionStats) {
        this.vertexCount += partitionStats.getVertexCount();
        this.finishedVertexCount += partitionStats.getFinishedVertexCount();
        this.edgeCount += partitionStats.getEdgeCount();
        // Add by BaiSong, added statement
        this.computedVertexCount += partitionStats.getComputedVertexCount();
    }
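The aggregation step can be sketched on its own as follows. PartitionCount and SuperstepTotal are hypothetical stand-ins for PartitionStats and GlobalStats, and the sketch assumes that a fresh accumulator is used for each superstep so that counts do not carry over from one superstep to the next.

```java
import java.util.List;

/** Hypothetical per-partition record; stands in for PartitionStats. */
final class PartitionCount {
    final int partitionId;
    final long computedVertexCount;

    PartitionCount(int partitionId, long computedVertexCount) {
        this.partitionId = partitionId;
        this.computedVertexCount = computedVertexCount;
    }
}

/** Hypothetical per-superstep accumulator; stands in for GlobalStats. */
final class SuperstepTotal {
    private long computedVertexCount = 0;

    /** Adds one partition's count to the running total. */
    void addPartitionCount(PartitionCount partition) {
        computedVertexCount += partition.computedVertexCount;
    }

    long getComputedVertexCount() {
        return computedVertexCount;
    }

    /** Folds the partition records reported by every worker into one total. */
    static long totalOf(List<List<PartitionCount>> recordsPerWorker) {
        SuperstepTotal total = new SuperstepTotal();   // fresh accumulator for this superstep
        for (List<PartitionCount> workerRecords : recordsPerWorker) {
            for (PartitionCount partition : workerRecords) {
                total.addPartitionCount(partition);
            }
        }
        return total.getComputedVertexCount();
    }
}
```

For example, if one worker reports partitions with 2 and 3 computed vertices and a second worker reports one partition with 4, totalOf returns 9.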
Of course, to make debugging easier, you can also modify this class's toString() method (optional), as follows:
    public String toString() {
        return "(vtx=" + vertexCount
            + ", computedVertexCount=" + computedVertexCount
            + ", finVtx=" + finishedVertexCount
            + ", edges=" + edgeCount
            + ", msgCount=" + messageCount
            + ", haltComputation=" + haltComputation + ")";
    }

org.apache.giraph.graph.ComputeCallable class
Add the counting call. In the computePartition() method, add the statement marked below.
    if (!vertex.isHalted()) {
        context.progress();
        TimerContext computeOneTimerContext = computeOneTimer.time();
        try {
            vertex.compute(messages);
            // added statement: once the vertex's compute() method has been called,
            // increment this Partition's computedVertexCount by 1
            partitionStats.incrComputedVertexCount();
        } finally {
            computeOneTimerContext.stop();
        }
        ...
    }

Add the Counters statistics. This works the same way as in my earlier post, "Giraph source code analysis (7): adding message statistics", so I will not go into detail here. The added class is org.apache.giraph.counters.GiraphComputedVertex; its source code is attached below:

    package org.apache.giraph.counters;

    import java.util.Iterator;
    import java.util.Map;

    import org.apache.hadoop.mapreduce.Mapper.Context;

    import com.google.common.collect.Maps;

    /**
     * Hadoop Counters in group "GiraphComputedVertex" for counting the
     * vertices computed in every superstep.
     */
    public class GiraphComputedVertex extends HadoopCountersBase {
        /** Counter group name for the computed-vertex counters */
        public static final String GROUP_NAME = "GiraphComputedVertex";
        /** Singleton instance for everyone to use */
        private static GiraphComputedVertex INSTANCE;
        /** Superstep number mapped to its computed-vertex counter */
        private final Map<Long, GiraphHadoopCounter> superstepVertexCount;

        private GiraphComputedVertex(Context context) {
            super(context, GROUP_NAME);
            superstepVertexCount = Maps.newHashMap();
        }

        /**
         * Instantiate with Hadoop Context.
         *
         * @param context Hadoop Context to use.
         */
        public static void init(Context context) {
            INSTANCE = new GiraphComputedVertex(context);
        }

        /**
         * Get singleton instance.
         *
         * @return singleton GiraphComputedVertex instance.
         */
        public static GiraphComputedVertex getInstance() {
            return INSTANCE;
        }

        /**
         * Get the computed-vertex counter for a superstep.
         *
         * @param superstep superstep number
         * @return counter for that superstep, created on first use
         */
        public GiraphHadoopCounter getSuperstepVertexCount(long superstep) {
            GiraphHadoopCounter counter = superstepVertexCount.get(superstep);
            if (counter == null) {
                String counterPrefix = "Superstep: " + superstep + " ";
                counter = getCounter(counterPrefix);
                superstepVertexCount.put(superstep, counter);
            }
            return counter;
        }

        @Override
        public Iterator<GiraphHadoopCounter> iterator() {
            return superstepVertexCount.values().iterator();
        }
    }

Experimental results
After the program runs, the total number of vertices that took part in each iteration is printed to the terminal. The test runs SSSP (the SimpleShortestPathsVertex class) on an input graph with 9 vertices and 12 edges. The output is as follows:
In the above test there are 6 iterations (supersteps 0 to 5). The red box in the output shows the number of vertices that took part in the computation in each superstep, starting with 9 in superstep 0 and ending with 0 in superstep 5.
Explanation: in superstep 0 every vertex is active, so all 9 vertices take part in the computation. In superstep 5, 0 vertices take part in the computation, so no messages are sent and every vertex is inactive; the algorithm therefore terminates.