Recently, a Hive SQL job has been taking more than two hours to run, so I set out to optimize it.
First, I checked the counters of the jobs generated by the Hive SQL. The total CPU time spent was as high as 100.4319973 hours. Looking at the CPU time spent by each map task, the first one alone took 2.0540889 hours.
Based on this, the following optimizations are recommended:
1. Lower mapreduce.input.fileinputformat.split.maxsize (currently 256000000) to increase the number of maps. This move has an immediate effect: I set it to 32000000, which produced 500+ maps, and the job time dropped from 2 hours to 47 minutes (a sample set command follows).
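For illustration, this is how the change looks in a Hive session; the 32000000 value is simply the one that worked for this job, not a universal setting:

set mapreduce.input.fileinputformat.split.maxsize=32000000;
-- A smaller maximum split size yields more input splits, hence more map tasks.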
2. Optimize the UDFs getPageID, getSiteId, and getPageValue (these methods do heavy text matching with regular expressions).
2.1 For regular expression optimization, refer to the following articles (a short sketch of the main technique follows the links):
http://www.fasterj.com/articles/regex1.shtml
http://www.fasterj.com/articles/regex2.shtml
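As a minimal sketch of the central advice in those two articles: compile the regular expression once in a static field rather than on every call, since Pattern.compile dominates the cost of short matches. The class name and pattern below are hypothetical stand-ins, not the real getPageID implementation:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class PageIdExtractor {

    // Slow variant: recompiles the regex for every row the UDF processes.
    public static String getPageIDSlow(String url) {
        Matcher m = Pattern.compile("pageid=(\\d+)").matcher(url);
        return m.find() ? m.group(1) : null;
    }

    // Fast variant: the Pattern is compiled once and reused; only the
    // cheap Matcher object is created per call.
    private static final Pattern PAGE_ID = Pattern.compile("pageid=(\\d+)");

    public static String getPageID(String url) {
        if (url == null) { return null; }
        Matcher m = PAGE_ID.matcher(url);
        return m.find() ? m.group(1) : null;
    }
}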
2.2 For UDF optimization, see the following advice:
1. Also, you should use class-level private members to save on object instantiation and garbage collection.
2. You also get benefits by matching the args with what you would normally expect from upstream. Hive converts Text to String when needed, but if the data normally coming into the method is Text, you could try matching the argument type and see if it is any faster. Example, before optimization:

import org.apache.hadoop.hive.ql.exec.UDF;
import java.net.URLDecoder;

public final class urldecode extends UDF {

    public String evaluate(final String s) {
        if (s == null) {
            return null;
        }
        return getString(s);
    }

    public static String getString(String s) {
        String a;
        try {
            a = URLDecoder.decode(s);
        } catch (Exception e) {
            a = "";
        }
        return a;
    }

    public static void main(String[] args) {
        String t = "%E5%A4%AA%E5%8E%9F-%E4%B8%89%E4%BA%9A";
        System.out.println(getString(t));
    }
}
After optimization:
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;
import java.net.URLDecoder;

public final class urldecode extends UDF {

    // Class-level member reused across calls, avoiding a new object per row.
    private final Text t = new Text();

    public Text evaluate(Text s) {
        if (s == null) {
            return null;
        }
        try {
            t.set(URLDecoder.decode(s.toString(), "UTF-8"));
            return t;
        } catch (Exception e) {
            return null;
        }
    }

    // public static void main(String[] args) {
    //     String t = "%E5%A4%AA%E5%8E%9F-%E4%B8%89%E4%BA%9A";
    //     System.out.println(getString(t));
    // }
}

3. Inherit from GenericUDF instead of UDF (a sketch follows).
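Point 3 is only named above, so here is a minimal hedged sketch of a GenericUDF version of the same URL-decoding logic. The class name GenericUrlDecode is hypothetical; initialize, evaluate, and getDisplayString are the methods GenericUDF requires, and a production version would also validate the argument's type in initialize:

import java.net.URLDecoder;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.Text;

public final class GenericUrlDecode extends GenericUDF {

    // Reused output object, same trick as in the optimized UDF above.
    private final Text result = new Text();

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
        if (arguments.length != 1) {
            throw new UDFArgumentLengthException("urldecode takes exactly one argument");
        }
        // Declare that evaluate() returns a writable string (Text).
        return PrimitiveObjectInspectorFactory.writableStringObjectInspector;
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        Object arg = arguments[0].get();
        if (arg == null) {
            return null;
        }
        try {
            result.set(URLDecoder.decode(arg.toString(), "UTF-8"));
            return result;
        } catch (Exception e) {
            return null;
        }
    }

    @Override
    public String getDisplayString(String[] children) {
        return "urldecode(" + children[0] + ")";
    }
}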
3. On Hive 0.14 or later, you can enable hive.cache.expr.evaluation to turn on the UDF cache (a sample setting follows).
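For illustration, enabling it in the session looks like this; whether it helps depends on how often the same deterministic UDF expression is re-evaluated:

set hive.cache.expr.evaluation=true;
-- Lets Hive cache evaluation results of deterministic UDF expressions,
-- the "UDF cache" referred to in point 3 above.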