This article looks at how to create a distributed system on top of Apache Mesos. It is quite practical, so it is shared here as a reference; follow along for a closer look.
Building a distributed system is hard. It needs to be scalable, fault tolerant, highly available, consistent, and efficient. To achieve these goals, a distributed system requires many components working together in complicated ways. For example, when Apache Hadoop processes terabyte-scale datasets in parallel across a large cluster, it relies on a highly fault-tolerant file system (HDFS) to achieve high throughput.
In the past, each new distributed system, such as Hadoop or Cassandra, had to build its own underlying machinery, including message handling, storage, networking, fault tolerance, and scalability. Fortunately, systems like Apache Mesos simplify the task of building and managing distributed systems by providing operating-system-like services for the key building blocks of a distributed system. Mesos abstracts away CPU, storage, and other compute resources, so developers can treat the entire data center cluster as one big machine when writing distributed applications.
Applications built on Mesos are called frameworks, and they solve a wide range of problems: Apache Spark, a popular cluster data-analysis tool, and Chronos, a fault-tolerant distributed scheduler similar to cron, are two examples of frameworks built on Mesos. Frameworks can be written in a variety of languages, including C++, Python, Java, Haskell, and Scala.
Bitcoin mining is a good example of a distributed-system use case. Verifying the reliability of a transaction comes down to the challenge of producing an acceptable hash, and that is enormously expensive: a single laptop could take more than 150 years to mine one block. As a result, there are many "mining pools" that let miners combine their computing resources to speed up mining. Derek, an intern at Mesosphere, wrote a bitcoin mining framework that uses cluster resources to do the same thing. In what follows, we will use his code as the example.
A Mesos framework consists of a scheduler and an executor. The scheduler communicates with the Mesos master and decides which tasks to run, while executors run on the slaves and carry out the actual work. Most frameworks implement their own scheduler and use one of the standard executors provided by Mesos, although a framework can also supply its own custom executor. In this example we will write a custom scheduler and use the standard command executor to run Docker images containing our bitcoin services.
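To give a feel for how a scheduler gets wired up, here is a rough sketch of the main function that hands a scheduler to Mesos, written against the mesos-go bindings of that era. The package paths, DriverConfig fields, flag names, framework name, and the newMinerScheduler constructor (sketched after the scheduler struct below) are assumptions, not code from Derek's framework:

import (
    "flag"

    "github.com/gogo/protobuf/proto"
    log "github.com/golang/glog" // assumed; any logger with Infoln/Errorf/Fatalln fits the article's calls
    mesos "github.com/mesos/mesos-go/mesosproto"
    sched "github.com/mesos/mesos-go/scheduler"
)

func main() {
    master := flag.String("master", "127.0.0.1:5050", "address of the Mesos master")
    bitcoindAddr := flag.String("bitcoind-address", "127.0.0.1", "address of the bitcoind node")
    rpcUser := flag.String("rpc-user", "", "bitcoind RPC user")
    rpcPass := flag.String("rpc-pass", "", "bitcoind RPC password")
    flag.Parse()

    // describe this framework to the Mesos master
    frameworkInfo := &mesos.FrameworkInfo{
        User: proto.String(""), // Mesos fills in the current user
        Name: proto.String("bitcoin-miner-framework"),
    }

    // hand the scheduler to a driver and start the event loop
    driver, err := sched.NewMesosSchedulerDriver(sched.DriverConfig{
        Scheduler: newMinerScheduler(*bitcoindAddr, *rpcUser, *rpcPass),
        Framework: frameworkInfo,
        Master:    *master,
    })
    if err != nil {
        log.Fatalln("Unable to create scheduler driver:", err)
    }
    if _, err := driver.Run(); err != nil {
        log.Fatalln("Framework stopped with error:", err)
    }
}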
Our scheduler has two types of tasks to run: a single miner-server task and multiple miner-worker tasks. The server communicates with a bitcoin mining pool and assigns blocks to each "worker"; the workers then do the hard part, that is, mine the bitcoins.
A task is actually wrapped in an executor, so running a task means telling the Mesos master to start an executor on one of its slaves. Because we use the standard command executor here, the task can be specified as a binary executable, a bash script, or some other command. Since Mesos supports Docker, in this example we will use executable Docker images. Docker is a technology that lets you package an application together with the dependencies it needs to run.
To use Docker images in Mesos, their names need to be registered in the Docker registry:
const (
    MinerServerDockerImage = "derekchiang/p2pool"
    MinerDaemonDockerImage = "derekchiang/cpuminer"
)
Then we define constants specifying the resources required by each type of task:
const (
    MemPerDaemonTask = 128 // mining shouldn't be memory-intensive
    MemPerServerTask = 256
    CPUPerServerTask = 1 // a miner server does not use much CPU
)
Now we define the actual scheduler. It keeps track of the state it needs in order to operate correctly:
type MinerScheduler struct {
    // bitcoind RPC credentials
    bitcoindAddr string
    rpcUser      string
    rpcPass      string

    // mutable state
    minerServerRunning  bool
    minerServerHostname string
    minerServerPort     int // the port that miner daemons connect to

    // unique task ids
    tasksLaunched        int
    currentDaemonTaskIDs []*mesos.TaskID
}
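The article does not show how this scheduler is constructed. A minimal sketch of a hypothetical constructor, assuming it only fills in the RPC credentials and initialises the task bookkeeping, might look like this:

// newMinerScheduler is a hypothetical helper; the real code may differ.
func newMinerScheduler(bitcoindAddr, rpcUser, rpcPass string) *MinerScheduler {
    return &MinerScheduler{
        bitcoindAddr:         bitcoindAddr,
        rpcUser:              rpcUser,
        rpcPass:              rpcPass,
        currentDaemonTaskIDs: make([]*mesos.TaskID, 0),
    }
}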
The scheduler must implement the following interface:
type Scheduler interface {
    Registered(SchedulerDriver, *mesos.FrameworkID, *mesos.MasterInfo)
    Reregistered(SchedulerDriver, *mesos.MasterInfo)
    Disconnected(SchedulerDriver)
    ResourceOffers(SchedulerDriver, []*mesos.Offer)
    OfferRescinded(SchedulerDriver, *mesos.OfferID)
    StatusUpdate(SchedulerDriver, *mesos.TaskStatus)
    FrameworkMessage(SchedulerDriver, *mesos.ExecutorID, *mesos.SlaveID, string)
    SlaveLost(SchedulerDriver, *mesos.SlaveID)
    ExecutorLost(SchedulerDriver, *mesos.ExecutorID, *mesos.SlaveID, int)
    Error(SchedulerDriver, string)
}
Now let's look at a few of these callback functions:
func (s *MinerScheduler) Registered(_ sched.SchedulerDriver, frameworkId *mesos.FrameworkID, masterInfo *mesos.MasterInfo) {
    log.Infoln("Framework registered with Master", masterInfo)
}

func (s *MinerScheduler) Reregistered(_ sched.SchedulerDriver, masterInfo *mesos.MasterInfo) {
    log.Infoln("Framework Re-Registered with Master", masterInfo)
}

func (s *MinerScheduler) Disconnected(sched.SchedulerDriver) {
    log.Infoln("Framework disconnected with Master")
}
Registered is called after the scheduler successfully registers with the Mesos master.
Reregistered is called when the scheduler disconnects from the Mesos master and then registers again, for example when the master restarts.
Disconnected is called when the scheduler loses its connection to the Mesos master. This happens when the master goes down.
So far the callbacks only print log messages, because for a simple framework like this one, most of the callbacks can simply be left empty. The next callback, however, is the heart of every framework and must be written with care.
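Before turning to that core callback, the remaining methods of the interface can indeed be satisfied with empty or near-empty bodies. A sketch (the parameter names and the comments are ours):

func (s *MinerScheduler) OfferRescinded(_ sched.SchedulerDriver, offerID *mesos.OfferID) {
    // nothing to do: we never hold on to offers
}

func (s *MinerScheduler) FrameworkMessage(_ sched.SchedulerDriver, executorID *mesos.ExecutorID, slaveID *mesos.SlaveID, message string) {
    // the miners never send framework messages
}

func (s *MinerScheduler) SlaveLost(_ sched.SchedulerDriver, slaveID *mesos.SlaveID) {}

func (s *MinerScheduler) ExecutorLost(_ sched.SchedulerDriver, executorID *mesos.ExecutorID, slaveID *mesos.SlaveID, status int) {}

func (s *MinerScheduler) Error(_ sched.SchedulerDriver, err string) {
    log.Errorln("Unrecoverable scheduler error:", err)
}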
ResourceOffers is called when the scheduler receives an offer from the master. Each offer contains a list of resources on the cluster that the framework may use. Resources currently include CPU, memory, ports, and disk. A framework can use some of the offered resources, all of them, or none at all.
For each offer, we first tally the resources it provides and then decide whether to launch a server task or a worker task. You could launch as many tasks as each offer can hold, but since mining bitcoin is CPU-bound, we instead launch a single miner task per offer and let it use all of the available CPU:
for _, offer := range offers {
    // ... Gather resources being offered and do setup
    if !s.minerServerRunning && mems >= MemPerServerTask &&
        cpus >= CPUPerServerTask && ports >= 2 {
        // ... Launch a server task since no server is running and we
        // have resources to launch it.
    } else if s.minerServerRunning && mems >= MemPerDaemonTask {
        // ... Launch a miner since a server is running and we have mem
        // to launch one.
    }
}
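The resource-gathering step elided at the top of the loop might look roughly like the sketch below. The helper name and its structure are ours; it assumes the generated protobuf getters (GetName, GetScalar, GetRanges) exposed by the mesos-go bindings:

// getOfferResources is a hypothetical helper that sums up what an offer contains.
func getOfferResources(offer *mesos.Offer) (cpus float64, mems float64, ports int) {
    for _, res := range offer.GetResources() {
        switch res.GetName() {
        case "cpus":
            cpus += res.GetScalar().GetValue()
        case "mem":
            mems += res.GetScalar().GetValue()
        case "ports":
            // ports are offered as ranges, e.g. [31000-32000]
            for _, r := range res.GetRanges().GetRange() {
                ports += int(r.GetEnd()-r.GetBegin()) + 1
            }
        }
    }
    return cpus, mems, ports
}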
For each task, we create a corresponding TaskInfo message containing the information needed to run it:
s.tasksLaunched++
taskID = &mesos.TaskID{
    Value: proto.String("miner-server-" + strconv.Itoa(s.tasksLaunched)),
}
Task IDs are chosen by the framework, and each one must be unique within that framework.
containerType := mesos.ContainerInfo_DOCKER
task = &mesos.TaskInfo{
    Name:    proto.String("task-" + taskID.GetValue()),
    TaskId:  taskID,
    SlaveId: offer.SlaveId,
    Container: &mesos.ContainerInfo{
        Type: &containerType,
        Docker: &mesos.ContainerInfo_DockerInfo{
            Image: proto.String(MinerServerDockerImage),
        },
    },
    Command: &mesos.CommandInfo{
        Shell: proto.Bool(false),
        Arguments: []string{
            // these arguments will be passed to run_p2pool.py
            "--bitcoind-address", s.bitcoindAddr,
            "--p2pool-port", strconv.Itoa(int(p2poolPort)),
            "-w", strconv.Itoa(int(workerPort)),
            s.rpcUser, s.rpcPass,
        },
    },
    Resources: []*mesos.Resource{
        util.NewScalarResource("cpus", CPUPerServerTask),
        util.NewScalarResource("mem", MemPerServerTask),
    },
}
The TaskInfo message specifies several important pieces of metadata about the task that let the Mesos node run the Docker container: the name, the task ID, the container information, and the arguments to pass to the container. The resources the task requires are also specified here.
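The TaskInfo for a miner worker (daemon) task is built the same way. The sketch below is ours rather than the article's code; in particular, the cpuminer arguments and the reuse of the cpus value from the offer are assumptions. It also shows the bookkeeping that remembers each daemon's task ID so it can be killed later:

s.tasksLaunched++
daemonTaskID := &mesos.TaskID{
    Value: proto.String("miner-daemon-" + strconv.Itoa(s.tasksLaunched)),
}
// remember the daemon so StatusUpdate can kill it if the server dies
s.currentDaemonTaskIDs = append(s.currentDaemonTaskIDs, daemonTaskID)

containerType := mesos.ContainerInfo_DOCKER
daemonTask := &mesos.TaskInfo{
    Name:    proto.String("task-" + daemonTaskID.GetValue()),
    TaskId:  daemonTaskID,
    SlaveId: offer.SlaveId,
    Container: &mesos.ContainerInfo{
        Type: &containerType,
        Docker: &mesos.ContainerInfo_DockerInfo{
            Image: proto.String(MinerDaemonDockerImage),
        },
    },
    Command: &mesos.CommandInfo{
        Shell: proto.Bool(false),
        // hypothetical cpuminer flags: point the miner at the running server
        Arguments: []string{
            "-o", "stratum+tcp://" + s.minerServerHostname + ":" + strconv.Itoa(s.minerServerPort),
        },
    },
    Resources: []*mesos.Resource{
        util.NewScalarResource("cpus", cpus), // use all CPU available in this offer
        util.NewScalarResource("mem", MemPerDaemonTask),
    },
}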
Now that the TaskInfo message has been built, the task can be launched like this:
driver.LaunchTasks([]*mesos.OfferID{offer.Id}, tasks, &mesos.Filters{RefuseSeconds: proto.Float64(1)})
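Offers that the framework decides not to use should be declined so the master can offer those resources to other frameworks. A short sketch, assuming the driver exposes a DeclineOffer call mirroring the C++ scheduler API:

// decline an unused offer; RefuseSeconds asks the master not to re-offer
// these resources to this framework for a while
driver.DeclineOffer(offer.Id, &mesos.Filters{RefuseSeconds: proto.Float64(5)})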
The last thing the framework needs to handle is what happens when the miner server shuts down. This is where the StatusUpdate callback comes in.
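Its signature follows the Scheduler interface shown earlier; a minimal sketch of the wrapper around the failure handling below (the log line is our own):

func (s *MinerScheduler) StatusUpdate(driver sched.SchedulerDriver, status *mesos.TaskStatus) {
    log.Infoln("Status update: task", status.GetTaskId().GetValue(),
        "is in state", status.GetState().String())
    // ... the miner-server failure handling shown below goes here ...
}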
During a task's lifetime there are different types of status updates for its different phases. For this framework, we want to guarantee that if the miner server fails for any reason, all miner workers are killed as well, so they do not waste resources. Here is the relevant code:
if strings.Contains(status.GetTaskId().GetValue(), "server") &&
    (status.GetState() == mesos.TaskState_TASK_LOST ||
        status.GetState() == mesos.TaskState_TASK_KILLED ||
        status.GetState() == mesos.TaskState_TASK_FINISHED ||
        status.GetState() == mesos.TaskState_TASK_ERROR ||
        status.GetState() == mesos.TaskState_TASK_FAILED) {

    s.minerServerRunning = false

    // kill all miner daemon tasks
    for _, taskID := range s.currentDaemonTaskIDs {
        _, err := driver.KillTask(taskID)
        if err != nil {
            log.Errorf("Failed to kill task %s", taskID)
        }
    }
    s.currentDaemonTaskIDs = make([]*mesos.TaskID, 0)
}

Thank you for reading! This concludes the article on how to create a distributed system. I hope the content above has been helpful and has taught you something new.