2025-03-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article shows how to read a local log file in Spark. The program below parses an nginx access log, counts the requests per URI, sorts the counts in descending order, and saves the result to a local file.
1. The code is as follows:

import java.io.{BufferedWriter, File, FileWriter}

import com.alvinalexander.accesslogparser.{AccessLogParser, AccessLogRecord}
import org.apache.spark.{SparkConf, SparkContext}

import scala.collection.immutable.ListMap

/**
 * Spark reads a local log file, extracts the most frequently accessed addresses,
 * sorts them, and saves them to a local file.
 * Created by eric on 16-6-29.
 */
object LogAnalysisSparkFile {

  // Returns the HTTP status code of a parsed record, or "0" if the line could not be parsed.
  def getStatusCode(line: Option[AccessLogRecord]) = {
    line match {
      case Some(l) => l.httpStatusCode
      case None => "0"
    }
  }

  def main(args: Array[String]): Unit = {
    // To run locally, set -Dspark.master=local in VM options
    // and pass "local" in Program arguments.
    val sparkConf = new SparkConf().setMaster("local[1]").setAppName("StreamingTest")
    val sc = new SparkContext(sparkConf)
    val p = new AccessLogParser
    val log = sc.textFile("/var/log/nginx/www.eric.aysaas.com-access.log")
    println(log.count()) // 68591

    // Count the requests that returned 404.
    val log1 = log.filter(line => getStatusCode(p.parseRecord(line)) == "404").count()
    println(log1)

    // Fallback record used when a line cannot be parsed
    // (AccessLogRecord takes nine string fields; the request is the fifth).
    val nullObject = AccessLogRecord("", "", "", "", "GET /foo HTTP/1.1", "", "", "", "")

    // Request lines of all 404 responses.
    val recs = log.filter(p.parseRecord(_).getOrElse(nullObject).httpStatusCode == "404")
      .map(p.parseRecord(_).getOrElse(nullObject).request)

    val wordCounts = log.flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey((a, b) => a + b)

    // Hits per URI: the URI is the second field of a request such as "GET /foo HTTP/1.1".
    val uriCounts = log.map(p.parseRecord(_).getOrElse(nullObject).request)
      .map(_.split(" ")(1))
      .map(uri => (uri, 1))
      .reduceByKey((a, b) => a + b)

    val uriToCount = uriCounts.collect // (/foo, 3), (/bar, 10), (/baz, 1), unordered
    val uriHitCount = ListMap(uriToCount.toSeq.sortWith(_._2 > _._2): _*) // (/bar, 10), (/foo, 3), (/baz, 1), descending

    uriCounts.take(10).foreach(println)
    println("* *")
    val logSave = uriHitCount.take(10).foreach(println)

    // this is a decent way to print some sample data
    uriCounts.takeSample(false, 100, 100)

    // The output is written to a local file by hand, because a ListMap cannot be used with saveAsTextFile.
    // logSave.saveAsTextFile("UriHitCount")
    val file = new File("UriHitCount.out")
    val bw = new BufferedWriter(new FileWriter(file))
    for { record
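The listing breaks off at the final write loop. As a rough sketch only, and not the author's original code, the sort-and-save step can be tried without Spark on an ordinary Scala collection. The object name UriHitCountSketch, the helpers uriOf, hitCounts and save, and the sample request strings are all assumptions made for illustration; only the ListMap sort and the BufferedWriter target come from the listing.

```scala
import java.io.{BufferedWriter, File, FileWriter}
import scala.collection.immutable.ListMap

// Hypothetical stand-alone sketch: plain collections instead of Spark RDDs.
object UriHitCountSketch {

  // Extracts the URI from a request line such as "GET /foo HTTP/1.1".
  // Returns None when the line has fewer than two space-separated fields.
  def uriOf(request: String): Option[String] = {
    val parts = request.split(" ")
    if (parts.length >= 2) Some(parts(1)) else None
  }

  // Counts hits per URI and returns them ordered by count, highest first.
  def hitCounts(requests: Seq[String]): ListMap[String, Int] = {
    val counts = requests.flatMap(uriOf)
      .groupBy(identity)
      .map { case (uri, hits) => (uri, hits.size) }
    ListMap(counts.toSeq.sortWith(_._2 > _._2): _*)
  }

  // Writes one "uri count" pair per line; the writer is closed even if a write fails.
  def save(counts: ListMap[String, Int], file: File): Unit = {
    val bw = new BufferedWriter(new FileWriter(file))
    try {
      for ((uri, count) <- counts) bw.write(s"$uri $count\n")
    } finally {
      bw.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample data standing in for the parsed log records.
    val requests = Seq(
      "GET /foo HTTP/1.1", "GET /bar HTTP/1.1", "GET /bar HTTP/1.1",
      "GET /baz HTTP/1.1", "GET /bar HTTP/1.1", "GET /foo HTTP/1.1"
    )
    val top = hitCounts(requests).take(10)
    top.foreach(println)
    save(top, new File("UriHitCount.out"))
  }
}
```

Closing the writer in a finally block matters here: a BufferedWriter that is never flushed or closed can leave UriHitCount.out empty or incomplete.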