2025-04-09 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Officially, Tez UI requires Tez 0.6 + Hadoop 2.6.0.
In practice, Tez UI also works with Tez 0.5.3 + Hadoop 2.4.0 or later, as long as Hadoop has the Timeline Server.
However, the Timeline Server in Hadoop 2.4.0 and 2.5.0 does not support cross-domain requests. In that case, the Tez View in Ambari 2.2 can be used to build the UI instead, which is quick and convenient.
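tez-site.xml
<property>
  <name>tez.lib.uris</name>
  <value>hdfs:///apps/tez-0.5.3/tez-0.5.3.tar.gz</value>
</property>
<property>
  <name>tez.task.generate.counters.per.io</name>
  <value>true</value>
</property>
<property>
  <description>Log history using the Timeline Server</description>
  <name>tez.history.logging.service.class</name>
  <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
<property>
  <description>Publish configuration information to Timeline server.</description>
  <name>tez.runtime.convert.user-payload.to.history-text</name>
  <value>true</value>
</property>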
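As a quick sanity check, the history-logging settings from this tez-site.xml section can be verified programmatically. The following is only a minimal sketch: the helper names (load_properties, missing_settings) and the SAMPLE text are hypothetical, and it assumes the file uses the usual Hadoop configuration/property/name/value layout.

```python
import xml.etree.ElementTree as ET

# Settings from the tez-site.xml section above that Tez UI relies on.
REQUIRED = {
    "tez.history.logging.service.class":
        "org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService",
    "tez.runtime.convert.user-payload.to.history-text": "true",
}


def load_properties(xml_text):
    """Parse Hadoop-style <property> elements into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_settings(props, required=REQUIRED):
    """Return {name: expected} for each required setting that is absent or different."""
    return {k: v for k, v in required.items() if props.get(k) != v}


# Hypothetical sample for illustration; in practice, read the real tez-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>tez.history.logging.service.class</name>
    <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
  </property>
</configuration>"""
```

Calling missing_settings(load_properties(SAMPLE)) reports the one setting the sample leaves out, which makes it easy to spot a half-applied configuration before restarting anything.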
yarn-site.xml
Add the following:
<property>
  <description>Indicate to clients whether Timeline service is enabled or not.
  If enabled, the TimelineClient library used by end-users will post entities
  and events to the Timeline server.</description>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>
<property>
  <description>The hostname of the Timeline service web application.</description>
  <name>yarn.timeline-service.hostname</name>
  <value>192.168.117.117</value>
</property>
<property>
  <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
  <value>true</value>
</property>
<property>
  <description>Enables cross-origin support (CORS) for web services where
  cross-origin web response headers are needed. For example, javascript making
  a web services request to the timeline server.</description>
  <name>yarn.timeline-service.http-cross-origin.enabled</name>
  <value>true</value>
</property>
For more information, see
http://search-hadoop.com/m/tQlTsMD%26subj=Tez+nbsp+taskcount+log+visualization
<property>
  <description>Address for the Timeline server to start the RPC server.</description>
  <name>yarn.timeline-service.address</name>
  <value>${yarn.timeline-service.hostname}:10201</value>
</property>
<property>
  <description>The http address of the Timeline service web application.</description>
  <name>yarn.timeline-service.webapp.address</name>
  <value>${yarn.timeline-service.hostname}:8188</value>
</property>
<property>
  <description>The https address of the Timeline service web application.</description>
  <name>yarn.timeline-service.webapp.https.address</name>
  <value>${yarn.timeline-service.hostname}:2191</value>
</property>
To start the Timeline Server, run: yarn timelineserver start
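Once the Timeline Server is up, one way to check it from the Tez UI's point of view is to call its REST endpoint with an Origin header and look at the response. This is a rough sketch under several assumptions: the host and port are the ones configured above, /ws/v1/timeline is taken to be the REST root, TEZ_DAG_ID is taken to be the entity type Tez publishes for DAGs, and the origin URL is purely hypothetical. Whether an Access-Control-Allow-Origin header comes back depends on the Hadoop version and the cross-origin setting.

```python
import json
import urllib.request


def timeline_url(host, port=8188, entity_type=None):
    """Build a Timeline Server REST URL (REST root assumed to be /ws/v1/timeline)."""
    base = f"http://{host}:{port}/ws/v1/timeline"
    return f"{base}/{entity_type}" if entity_type else base


def check_timeline(host, port=8188, origin="http://tez-ui.example.com", timeout=5):
    """Query the REST root and return (parsed JSON body, CORS allow-origin header).

    The second value is None when the server does not answer with CORS headers,
    which is the situation described for Hadoop 2.4.0 and 2.5.0.
    """
    req = urllib.request.Request(timeline_url(host, port), headers={"Origin": origin})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
        allow_origin = resp.headers.get("Access-Control-Allow-Origin")
    return body, allow_origin
```

For example, check_timeline("192.168.117.117") would contact the web address configured in yarn-site.xml, and timeline_url("192.168.117.117", entity_type="TEZ_DAG_ID") gives the URL to inspect for DAG entries after a Tez job has run.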
Tez config
https://issues.apache.org/jira/browse/TEZ-2294
Tez optimization
Listing some details at a very high level:
- Set "tez.task.generate.counters.per.io=true" to get more details on the task counters. This prints the counters per edge, which can be a lot more useful for debugging.
- In case you want to avoid container launches etc. when you analyze for the first time, try hive.prewarm.enabled=true & hive.prewarm.numcontainers=
- Container reuse is enabled by default in Tez (tez.am.container.idle.release-timeout-min.millis and tez.am.container.idle.release-timeout-max.millis control how long a container is held by the AM before it is released).
- Set tez.runtime.io.sort.mb appropriately to avoid spills (you can check the task counters in the logs to find the spills and adjust it accordingly).
- Set tez.runtime.sort.threads=2 to enable PipelinedSorter, which is a lot more performant than DefaultSorter (this is the default in the master branch; on earlier releases you have to turn it on yourself with this setting).
- Set tez.runtime.compress=true and set tez.runtime.compress.codec (SnappyCodec is preferred, but it is up to you to choose).
- Set tez.runtime.shuffle.keep-alive.enabled=true in case you have a shuffle-heavy workload. This reduces the number of connections in shuffle.
- Adjust the memory allocated to different inputs/outputs based on tez.task.scale.memory.ratios (but this is more of an expert-level setting, which you might want to touch only after nailing down any memory pressure).
- Adjusting shuffle buffers is also possible, but I would advise it only when you have nailed down an issue related to the shuffle/merge codepath.
- Set "tez.runtime.optimize.local.fetch=true" to bypass http fetches (when data is locally present).
Feel free to refer to https://github.com/t3rmin4t0r/tez-autobuild/blob/master/tez-site.xml for commonly used settings for benchmarks.
Rajesh
What are the problems with having tez.runtime.shuffle.keep-alive.enabled and tez.runtime.optimize.local.fetch always set to true by default?
@r7raul1984, would you mind filing a documentation jira for your question? The list that Rajesh provided might be good to formalize into a doc and/or wiki.
Also, please take a look at https://issues.apache.org/jira/browse/TEZ-2294 to see the full list of parameters. If you see something off or not clear enough, please add your comments to the jira.
@Rohini
We recently changed the default value of tez.runtime.optimize.local.fetch to true in master. The feature was probably kept as false when it was introduced because it had not been fully battle tested.
The keep-alive setting, I am assuming, depends on how many open connections a cluster's setup can sustain and needs to be tuned in combination with "tez.runtime.shuffle.keep-alive.max.connections". Good point on whether we should make this true by default. Will wait for @Rajesh/@Gopal/@Sid to chime in, and they can open a new jira if this is generally beneficial in most setups.
> What are the problems with having
> tez.runtime.shuffle.keep-alive.enabled and
> tez.runtime.optimize.local.fetch set to true always by default?
Nothing has failed due to these so far-we