2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
1. Installation package preparation
Download the latest stable version from the official website. This article uses apache-storm-0.9.5.tar.gz.
2. Role assignment
Hostname    IP            Role
hadoop001   192.168.0.1   Nimbus
hadoop002   192.168.0.2   Supervisor
hadoop003   192.168.0.2   Supervisor
3. Installation steps
3.1 The first step in installing a Storm cluster is to set up a ZooKeeper cluster. Since ZooKeeper is relatively easy to set up, it is not covered here. The ZooKeeper ensemble used here is hadoop001:2181,hadoop002:2181,hadoop003:2181.
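Before moving on, it can help to confirm that every ZooKeeper node answers. The sketch below is only an illustration: zk_split and zk_ruok are hypothetical helper names, and it assumes that nc is installed and that the ZooKeeper servers accept the "ruok" four-letter command (newer ZooKeeper releases may require it to be whitelisted).

```shell
# Hypothetical helper: print one host:port per line from a comma-separated list.
zk_split() {
    printf '%s\n' "$1" | tr ',' '\n'
}

# Hypothetical helper: send "ruok" to each ZooKeeper node and report the reply.
# Assumes nc is available; a healthy server answers "imok".
zk_ruok() {
    local hp host port reply rc=0
    for hp in $(zk_split "$1"); do
        host="${hp%%:*}"
        port="${hp##*:}"
        reply="$(printf 'ruok' | nc -w 2 "$host" "$port" 2>/dev/null)"
        if [ "$reply" = "imok" ]; then
            echo "$hp ok"
        else
            echo "$hp NOT responding"
            rc=1
        fi
    done
    return "$rc"
}
```

On any cluster host it would be called as: zk_ruok hadoop001:2181,hadoop002:2181,hadoop003:2181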
3.2 Extract the installation package to the target directory, here /opt.
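A minimal sketch of step 3.2 follows. install_storm is a hypothetical helper name. The symlink is an assumption added here: the tarball unpacks to a versioned directory, while the environment variables in this article point at /opt/storm, so the link bridges the two.

```shell
# Hypothetical helper: unpack a Storm tarball under a parent directory
# and expose it as <dest>/storm.
install_storm() {
    local tarball="$1"        # e.g. apache-storm-0.9.5.tar.gz
    local dest="${2:-/opt}"   # parent directory for the installation
    local top

    # Name of the top-level directory inside the archive,
    # e.g. apache-storm-0.9.5 (assumes the archive has a single root directory).
    top="$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)" || return 1
    [ -n "$top" ] || return 1

    mkdir -p "$dest" &&
    tar -xzf "$tarball" -C "$dest" &&
    # Assumption: a symlink keeps STORM_HOME=/opt/storm valid across versions.
    ln -sfn "$dest/$top" "$dest/storm"
}
```

For the layout in this article the call would be: install_storm apache-storm-0.9.5.tar.gz /opt (run as a user that can write to /opt).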
3.3 Add environment variables to /home/your-user-name/.bashrc
export STORM_HOME=/opt/storm
export PATH=$STORM_HOME/bin:$PATH
export CLASSPATH=$STORM_HOME/lib:$CLASSPATH
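If the same variables have to be added on all three hosts, a small idempotent helper avoids duplicate entries. This is a sketch only: add_storm_env is a hypothetical name, and the default paths are the ones used in this article.

```shell
# Hypothetical helper: append the Storm environment variables to a shell rc
# file, unless STORM_HOME is already exported there.
add_storm_env() {
    local rc="${1:-$HOME/.bashrc}"
    local storm_home="${2:-/opt/storm}"

    # Skip if a previous run already added the block.
    grep -q '^export STORM_HOME=' "$rc" 2>/dev/null && return 0

    # \$ keeps the variable references literal so they expand at login time.
    cat >> "$rc" <<EOF
export STORM_HOME=$storm_home
export PATH=\$STORM_HOME/bin:\$PATH
export CLASSPATH=\$STORM_HOME/lib:\$CLASSPATH
EOF
}
```

After running add_storm_env, reload the file with: source ~/.bashrc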
3.4 Modify the Storm configuration file (conf/storm.yaml)
Configuration item: Description
storm.zookeeper.servers: List of ZooKeeper servers.
storm.zookeeper.port: ZooKeeper connection port.
storm.local.dir: Local file system directory used by Storm (it must exist, and the Storm process must be able to read and write it).
storm.cluster.mode: Storm cluster operation mode (distributed or local).
storm.local.mode.zmq: Whether ZeroMQ is used as the messaging system in local mode. If set to false, the Java messaging system is used. The default is false.
storm.zookeeper.root: Root location of Storm in ZooKeeper.
storm.zookeeper.session.timeout: Timeout for client connections to ZooKeeper.
storm.id: ID of the running topology, made up of the storm name and a unique random number.
nimbus.host: Nimbus server address.
nimbus.thrift.port: Thrift listening port of Nimbus.
nimbus.childopts: JVM options passed to the Nimbus process when deploying through the storm-deploy project.
nimbus.task.timeout.secs: Task heartbeat timeout. Once it is exceeded, Nimbus considers the task dead and reassigns it elsewhere.
nimbus.monitor.freq.secs: How often Nimbus checks heartbeats and reassigns tasks. Note that if a machine goes down, Nimbus wakes up and acts immediately.
nimbus.supervisor.timeout.secs: Supervisor heartbeat timeout. Once it is exceeded, Nimbus considers the supervisor dead and stops assigning new tasks to it.
nimbus.task.launch.secs: A special timeout used while a task is starting. It temporarily replaces nimbus.task.timeout.secs until the first heartbeat after startup.
nimbus.reassign: Whether Nimbus reassigns work when it detects that a task has failed. The default is true, and changing it is not recommended.
nimbus.file.copy.expiration.secs: Timeout for upload/download connections. When a connection has been idle longer than this setting, Nimbus considers it dead and closes it.
ui.port: Service port of the Storm UI.
drpc.servers: List of DRPC servers, so that DRPCSpout knows which servers to communicate with.
drpc.port: Service port of Storm DRPC.
supervisor.slots.ports: List of ports on the supervisor that can run workers. Each worker occupies one port, and each port runs only one worker. This setting controls the number of workers (slots) that can run on each machine.
supervisor.childopts: JVM options for the supervisor daemon, used by the storm-deploy project.
supervisor.worker.timeout.secs: Worker heartbeat timeout on the supervisor. Once it is exceeded, the supervisor tries to restart the worker process.
supervisor.worker.start.timeout.secs: Worker heartbeat timeout during the initial launch of a worker, after which the supervisor tries to restart it. It overrides supervisor.worker.timeout.secs during launch, because starting and configuring the JVM adds overhead that can delay the first heartbeat.
supervisor.enable: Whether the supervisor should run the workers assigned to it. The default is true. The setting exists for Storm's unit tests and should not be modified.
supervisor.heartbeat.frequency.secs: How often the supervisor sends a heartbeat.
supervisor.monitor.frequency.secs: How often the supervisor checks worker heartbeats.
worker.childopts: JVM options the supervisor uses when starting a worker. Every "%ID%" string is replaced with the identifier of the corresponding worker.
worker.heartbeat.frequency.secs: Interval at which a worker sends heartbeats.
task.heartbeat.frequency.secs: Interval at which a task reports its status heartbeat.
task.refresh.poll.secs: How often a task synchronizes its connections with other tasks. If a task is reassigned, other tasks need to refresh their connections in order to send messages to it. Tasks are normally notified when a reassignment happens, so this setting only guards against missed notifications.
topology.debug: If set to true, Storm logs every message that is emitted.
topology.optimize: Whether the master optimizes topologies by running multiple tasks in a single thread where appropriate.
topology.workers: Number of worker processes to start across the cluster to execute the topology. Each process runs some number of tasks as threads. Use this parameter together with the parallelism hints of the topology's components to tune performance.
topology.ackers: Number of acker tasks started in the topology. Ackers keep track of the tuples emitted by spouts and detect when a tuple has been fully processed. When an acker detects that a tuple is complete, it sends an acknowledgement to the spout. The number of ackers should be based on the throughput of the topology, but usually does not need to be large. When set to 0, message reliability is disabled and Storm acknowledges tuples immediately after the spout emits them.
topology.message.timeout.secs: Maximum time allowed for a message emitted by a spout to be fully processed. If a message is not acked within this window, Storm notifies the spout that the message failed. Some spout implementations then replay the failed message.
topology.kryo.register: List of serializations to register with Kryo, the underlying serialization framework of Storm. Each entry can be a class name or an implementation of com.esotericsoftware.kryo.Serializer.
topology.skip.missing.kryo.registrations: Whether Storm should skip Kryo registrations that it does not recognize. If set to false, tasks may fail to load or throw errors at run time.
topology.max.task.parallelism: Maximum parallelism allowed for any component in a topology. This setting is mainly used in testing to limit the number of threads in local mode.
topology.max.spout.pending: Maximum number of tuples that can be pending on a spout task. This setting applies to a single task, not to all spouts or to the topology as a whole.
topology.state.synchronization.timeout.secs: Maximum time for a component to synchronize with its state source (reserved option, not in use).
topology.stats.sample.rate: Percentage of tuples sampled to generate task statistics.
topology.fall.back.on.java.serialization: Whether the topology falls back to Java serialization.
zmq.threads: Number of threads used for ZeroMQ communication in each worker process.
zmq.linger.millis: How long a connection keeps trying to resend messages to the target host after it has been closed. This is a rarely used advanced option that can generally be ignored.
java.library.path: The java.library.path setting used when the JVMs (Nimbus, Supervisor and workers) start. It tells the JVM which paths to search for native libraries.
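As an illustration of how a few of these items fit together for the three-node layout in this article, the sketch below writes a minimal storm.yaml. write_storm_yaml is a hypothetical helper name. The storm.local.dir path is an assumption (any existing, writable directory works), and the slot ports and UI port shown are Storm's usual defaults rather than values given in the article.

```shell
# Hypothetical helper: write a minimal storm.yaml for the cluster in this article.
write_storm_yaml() {
    local out="$1"   # e.g. /opt/storm/conf/storm.yaml
    cat > "$out" <<'EOF'
# ZooKeeper ensemble from step 3.1
storm.zookeeper.servers:
    - "hadoop001"
    - "hadoop002"
    - "hadoop003"
storm.zookeeper.port: 2181

# Nimbus runs on hadoop001 (see the role table)
nimbus.host: "hadoop001"

# Assumption: this directory must already exist and be writable by Storm
storm.local.dir: "/opt/storm/data"

# One worker slot per port; four slots per supervisor
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703

ui.port: 8080
EOF
}
```

The same file would then be copied unchanged to every node, since the roles are chosen at startup and not in the configuration.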
4. Start
The role assignment is not reflected in the Storm configuration, so the role has to be chosen at startup:
Start Nimbus: "bin/storm nimbus > /dev/null 2>&1 &"
Start Supervisor: "bin/storm supervisor > /dev/null 2>&1 &"
Start UI: "bin/storm ui > /dev/null 2>&1 &"
Note: the UI must run on the same host as Nimbus; otherwise it will not work properly.
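The start commands can be wrapped so that each host launches only the daemons for its role. This is a sketch under stated assumptions: both function names are hypothetical, the host-to-role mapping is copied from the role table in this article, and nohup is an addition so the daemons survive the end of the login session.

```shell
# Hypothetical helper: map a hostname to the Storm daemons it should run,
# following the role table in this article.
storm_daemons_for_host() {
    case "$1" in
        hadoop001) echo "nimbus ui" ;;             # UI on the same host as Nimbus
        hadoop002|hadoop003) echo "supervisor" ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: start the daemons for this host in the background.
start_storm_daemons() {
    local host="${1:-$(hostname -s)}"
    local storm_home="${STORM_HOME:-/opt/storm}"
    local daemons daemon

    daemons="$(storm_daemons_for_host "$host")" || {
        echo "no Storm role defined for host: $host" >&2
        return 1
    }
    for daemon in $daemons; do
        # Same redirection as the manual commands above.
        nohup "$storm_home/bin/storm" "$daemon" > /dev/null 2>&1 &
    done
}
```

Running start_storm_daemons with no arguments on each node picks the role from the short hostname. The JDK's jps tool can then be used to check that the expected Java processes are running.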