2025-03-28 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to generate a Java data script, a task that often raises questions in day-to-day work. The editor consulted a range of materials and distilled them into a simple, easy-to-use approach that should help clear up those questions. Let's work through it.
import java.io.{File, FileOutputStream, OutputStreamWriter, PrintWriter}
import java.text.SimpleDateFormat
import java.util.{Date, Random}

object ProducePvAndUvData {
  // exclusive upper bound for each octet of a simulated IP
  val IP = 223
  // address
  val ADDRESS = Array("Beijing", "Tianjin", "Shanghai", "Chongqing", "Hebei", "Liaoning", "Shanxi",
    "Jilin", "Jiangsu", "Zhejiang", "Heilongjiang", "Anhui", "Fujian", "Jiangxi", "Shandong", "Henan",
    "Hubei", "Hunan", "Guangdong", "Hainan", "Sichuan", "Guizhou", "Yunnan", "Shaanxi", "Gansu",
    "Qinghai", "Taiwan", "Inner Mongolia", "Guangxi", "Xizang", "Ningxia", "Xinjiang", "Hong Kong", "Macao")
  // date
  val DATE = new SimpleDateFormat("yyyy-MM-dd").format(new Date())
  // timestamp
  val TIMESTAMP = 0L
  // userid
  val USERID = 0L
  // website
  val WEBSITE = Array("www.baidu.com", "www.taobao.com", "www.dangdang.com", "www.jd.com",
    "www.suning.com", "www.mi.com", "www.gome.com.cn")
  // behavior
  val ACTION = Array("Regist", "Comment", "View", "Login", "Buy", "Click", "Logout")

  def main(args: Array[String]): Unit = {
    val pathFileName = "G://idea//scala//spark02/data"
    // create the file
    val createFile = CreateFile(pathFileName)
    // the objects needed to write data to the file
    val file = new File(pathFileName)
    val fos = new FileOutputStream(file, true)
    val osw = new OutputStreamWriter(fos, "UTF-8")
    val pw = new PrintWriter(osw)
    if (createFile) {
      var i = 0
      // generate 50,000+ records
      while (i < 50000) {
        // simulate an IP
        val random = new Random()
        val ip = Seq.fill(4)(random.nextInt(IP)).mkString(".")
        // simulate an address
        val address = ADDRESS(random.nextInt(ADDRESS.length))
        // simulate a date
        val date = DATE
        // simulate a userid
        val userid = Math.abs(random.nextLong)
        /**
         * The inner while loop simulates one user acting on different
         * websites at different points in time.
         */
        var j = 0
        var timestamp = 0L
        var webSite = "unknown website"
        var action = "unknown action"
        // number of events for this user: 1, 3, or 5
        val flag = random.nextInt(5) | 1
        while (j < flag) {
          // Thread.sleep(5)
          // simulate a timestamp
          timestamp = new Date().getTime()
          // simulate a website
          webSite = WEBSITE(random.nextInt(WEBSITE.length))
          // simulate a behavior
          action = ACTION(random.nextInt(ACTION.length))
          j += 1
          /**
           * Assemble one tab-separated record.
           */
          val content = ip + "\t" + address + "\t" + date + "\t" + timestamp + "\t" + userid + "\t" + webSite + "\t" + action
          System.out.println(content)
          // write the record to the file
          pw.write(content + "\n")
        }
        i += 1
      }
      // mind the closing order: whatever was opened first is closed last
      pw.close()
      osw.close()
      fos.close()
    }
  }

  /**
   * Create the file, deleting any existing one first.
   */
  def CreateFile(pathFileName: String): Boolean = {
    val file = new File(pathFileName)
    if (file.exists) file.delete()
    val createNewFile = file.createNewFile()
    System.out.println("create file " + pathFileName + " success!")
    createNewFile
  }
}

Compute each website's PV and UV, and the number of visits to each website from each region, sorted from largest to smallest:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.collection.AbstractSeq
import scala.collection.mutable
import scala.collection.mutable.ListBuffer

def main(args: Array[String]): Unit = {
  val conf = new SparkConf()
  conf.setMaster("local")
  conf.setAppName("SparkPvAndUv")
  val sc = new SparkContext(conf)
  val rdd: RDD[String] = sc.textFile("G:/idea/scala/spark02/data")

  println("*************PV******************")
  rdd.map(line => (line.split("\t")(5), 1))
    .reduceByKey(_ + _)
    .sortBy(_._2, false) // the second argument is `ascending`; false sorts in descending order
    .foreach(println)

  println("*************UV******************")
  rdd.map(line => line.split("\t")(5) + "_" + line.split("\t")(0)) // website_ip
    .distinct() // deduplicate
    .map(line => (line.split("_")(0), 1))
    .reduceByKey(_ + _)
    .sortBy(_._2, false)
    .foreach(println)

  // visits to each website from each region, sorted from largest to smallest
  val site_local: RDD[(String, String)] = rdd.map(line => (line.split("\t")(5), line.split("\t")(1)))
  val site_localIterable: RDD[(String, Iterable[String])] = site_local.groupByKey()
  val result: RDD[(String, AbstractSeq[(String, Int)])] = site_localIterable.map(one => {
    val localMap = mutable.Map[String, Int]() // mutable map
    val site = one._1
    val localIterator = one._2.iterator
    while (localIterator.hasNext) {
      // region
      val local = localIterator.next()
      if (localMap.contains(local)) {
        // the region is already in the map: increment its count
        val value = localMap.get(local).get
        localMap.put(local, value + 1)
      } else {
        // the region is not in the map yet: start its count at 1
        localMap.put(local, 1)
      }
    }
    // sortBy is ascending by default; negating the key with "-" gives descending order
    val tuples: List[(String, Int)] = localMap.toList.sortBy(-_._2)
    if (tuples.length > 3) {
      val list = new ListBuffer[(String, Int)]()
      // the source listing breaks off here, partway through a for loop over i
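
To make the two statistics concrete, here is a minimal sketch in plain Scala collections, with no Spark involved. It assumes the tab-separated field order written by the generator (ip, address, date, timestamp, userid, website, action) and takes UV to mean distinct IPs per website, as the "website_ip" comment in the listing suggests. The object name PvUvSketch, its helper methods, and the sample records are hypothetical and exist only for illustration.

```scala
object PvUvSketch {
  // Field positions in one tab-separated record (assumed generator order):
  // ip, address, date, timestamp, userid, website, action
  private val IpIdx = 0
  private val SiteIdx = 5

  /** PV: total number of records per website, largest count first. */
  def pv(lines: Seq[String]): Seq[(String, Int)] =
    lines
      .map(_.split("\t")(SiteIdx))
      .groupBy(identity)
      .map { case (site, hits) => (site, hits.size) }
      .toSeq
      .sortBy { case (site, n) => (-n, site) } // descending count, ties by name

  /** UV: number of distinct IPs per website, largest count first. */
  def uv(lines: Seq[String]): Seq[(String, Int)] =
    lines
      .map { line =>
        val fields = line.split("\t")
        (fields(SiteIdx), fields(IpIdx))
      }
      .distinct // one (website, ip) pair per visitor
      .groupBy(_._1)
      .map { case (site, pairs) => (site, pairs.size) }
      .toSeq
      .sortBy { case (site, n) => (-n, site) }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample records, not taken from the generated file.
    val sample = Seq(
      "1.1.1.1\tBeijing\t2025-03-28\t1\t100\twww.jd.com\tView",
      "1.1.1.1\tBeijing\t2025-03-28\t2\t100\twww.jd.com\tBuy",
      "2.2.2.2\tShanghai\t2025-03-28\t3\t200\twww.jd.com\tView",
      "3.3.3.3\tHebei\t2025-03-28\t4\t300\twww.mi.com\tClick"
    )
    println(pv(sample))
    println(uv(sample))
  }
}
```

The same IP visiting a website twice adds two to that website's PV but only one to its UV, which is why the UV pipeline deduplicates the (website, ip) pairs before counting.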
© 2024 shulou.com SLNews company. All rights reserved.