
How to build a startup's back-end technology stack from scratch

Updated 2025-07-19. From: SLTechnology News&Howtos (shulou)

Shulou(Shulou.com)06/03 Report--

This article explains in detail how to build a startup's back-end technology stack from scratch. It is shared here for reference, and should give you a working understanding of the topic.

Little Hub's reading guide:

This is a long article. Many people get excited at the idea of starting a company, but do you know what it is like to be the architect at a startup? Let's first look at what it takes to build an enterprise technology stack. Have you thought about it?

Preface

When it comes to the back-end technology stack, is this the picture that comes to mind?

A little dizzying? Yet that picture is only a collection of some of the languages we might use, and only part of the language level at that. For the back-end technology stack as a whole, language is just the starting point, and there is much more beyond it. What we discuss today is the back end in the broad sense: everything on the server side belongs to it, including frameworks, languages, databases, services, operating systems, and so on.

As I understand it, the whole back-end technology stack has four levels:

Language: which development languages are used, such as C++, Java, Go, PHP, Python, or Ruby.

Components: which components are used, such as MQ components and database components.

Process: which processes and specifications are followed, such as the development process, project process, release process, monitoring and alerting process, and coding standards.

System: processes have to be enforced by systems. For example, a release system enforces the release process, and a code management system enforces code management rules.

Combining these four levels, the structure of the whole back-end technology stack is shown in Figure 2:

Figure 2: Back-end technology stack structure

All of this has to be built from scratch. A startup does not have the mature infrastructure of a large company, so we have to pick components and systems from the open source world and from cloud providers, assemble them, and adapt them to our goals. We select systems and components one by one, and together they form our back-end technology stack.

Selection of system components

1. Project management / bug management / issue management

Project management software is where the requirements, issues, and processes of the whole business come together, and communication and coordination between departments mostly depend on it. SaaS project management services are an option, but they often do not meet the requirements. In that case we can choose an open source project that is customizable and has a rich set of plug-ins, which generally covers a startup's needs. Common projects are:

Redmine: written in Ruby, with many plug-ins and support for custom fields. It integrates project management, bug tracking, a wiki, and other functions, but many of its plug-ins have not been updated for years.

Phabricator: written in PHP and formerly an internal Facebook tool. The engineer who built it left and founded a company around the software. It integrates code hosting, code review, task management, document management, issue tracking, and more, and is highly recommended for more agile teams.

Jira: written in Java, with user stories, task breakdown, burndown charts, and so on. It can be used for project management and also for cross-department communication, and it is more powerful.

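Wukong CRM: this is customer management rather than project management. It is mentioned here because To B startups often organize their work around the customer, and project management and issue follow-up can be done in Wukong CRM. Its open source version implements the core functions of a CRM and also includes a task management feature for following up on issues, although you will still need separate project management software alongside it. Incidentally, the system's code is hard to maintain, and it is only suitable while the customer base is small (under 10,000).

2. DNS

DNS is a very general service, and a startup basically just needs to pick a suitable cloud vendor. In China there are two main choices:

Alibaba Wanwang: Alibaba acquired Wanwang in 2014 and integrated its domain name services, eventually forming today's Alibaba Wanwang, which includes DNS services.

Tencent DNSPod: Tencent acquired 100% of DNSPod in 2012 for 40 million yuan. It mainly provides domain name resolution and some protection features.

If your business is in China, these are the two main options, and either will do. Companies such as Toutiao also use DNSPod. You only need to build your own DNS for special reasons, for example if you are a CDN vendor or have special regional restrictions. If cost matters most, Alibaba's cheapest basic edition is enough; if you want a higher resolution success rate, use DNSPod's more expensive plans.

Outside China, choose Amazon. Alibaba's DNS service has nodes only in Japan and the United States and has only recently started deploying in Southeast Asia, and DNSPod likewise covers only the United States and Japan. Chinese companies expanding overseas mostly choose Amazon as their cloud provider.

For an online product, a paid DNS plan is strongly recommended. Alibaba's paid edition, at a few dozen yuan, basically meets the need. If you also need scheduling by province or region, you pay more, but it is only a few hundred yuan a year and saves both money and effort. Outside China, prefer Amazon. If you need to serve users both inside and outside China and have your own app, consider implementing some disaster recovery or intelligent scheduling logic yourself, because no off-the-shelf DNS service handles both scenarios well. Alternatively, use multiple domain names and put different domains on different DNS providers.

3. LB (load balancing)

LB is a general service. Cloud vendors' LB services generally provide the following:

Support for layer-4 protocols (TCP and UDP)

Support for layer-7 protocols (HTTP and HTTPS)

A centralized certificate management system supporting HTTPS

Health checks

If all your online machines run on cloud services from the same provider, you can use that provider's LB service directly, such as Alibaba Cloud SLB, Tencent Cloud CLB, or Amazon ELB. Self-built data centers mostly use LVS + Nginx.

4. CDN

CDN is now a fiercely competitive red-ocean market. Vendors earn only thin margins and sell close to cost. In China, Wangsu is the leader with more than 40% of the domestic market, followed by Tencent and Alibaba. A large part of Wangsu's rise came from the growth of live streaming.

Abroad, Amazon and Akamai together hold roughly 50%. Akamai, once the leader of the international market, held more than half of the global share, but lost nearly 20% after Amazon entered the CDN business. Many small and medium-sized companies switched to Amazon, and Akamai could do little about it.

Chinese CDN vendors operating overseas mostly serve Chinese companies expanding abroad. Of the three larger vendors, only Wangsu has somewhat more overseas nodes, and not by much. Alibaba and Tencent are still at an early stage, with nodes in only a few countries.

For a startup, Tencent Cloud or Alibaba Cloud CDN is sufficient. Their supporting systems are relatively complete and easy to integrate with, whereas Wangsu is weaker in system support and also more expensive. Once traffic grows, you should not rely on a single CDN; use several. Different CDNs cover different nodes across the country, and cloud vendors internally assign different customers to different clusters, so you do not get coverage from every node (even though some vendors claim network-wide coverage). Beyond node coverage, using multiple CDNs also provides a degree of disaster recovery.

5. RPC framework

Wikipedia defines RPC as follows: a remote procedure call (RPC) is a computer communication protocol that allows a program running on one computer to call a subroutine on another computer without the programmer having to code the interaction explicitly.

Put simply, a complete RPC call works like this: the server implements a function, and the client uses the interface provided by the RPC framework to invoke that implementation and obtain its return value.

RPC frameworks in the industry fall roughly into two schools: one focuses on cross-language calls, the other on service governance.

Cross-language RPC frameworks include Thrift, gRPC, Hessian, and Hprose. They focus on cross-language invocation, support language-independent calls across most languages, and suit multi-language environments well. However, they have no service discovery mechanism, so in practice a proxy layer is needed for request forwarding and load-balancing policy control.

Among them, gRPC is a high-performance, general-purpose open source RPC framework developed by Google. It was designed mainly for mobile application development on top of the HTTP/2 standard, uses ProtoBuf (Protocol Buffers) for serialization, and supports many languages. It is not distributed in itself, so building a full framework around it takes further development.

Hprose (High Performance Remote Object Service Engine) is a new, lightweight, cross-language, cross-platform, object-oriented, high-performance remote dynamic communication middleware released under the MIT license.

Service-governance RPC frameworks are feature-rich. They provide high-performance remote calls, service discovery, and service governance, suit the decoupling and governance of large services, and integrate transparently into projects in a specific language (Java). Their drawback is tight coupling to the language, which makes cross-language support difficult. Common governance-oriented RPC frameworks in China are:

Dubbo: a high-performance Java service framework open-sourced by Alibaba. It lets applications export and consume services through high-performance RPC and integrates seamlessly with Spring. Inside Taobao, Dubbo competed with a similar framework, HSF, which led to the Dubbo team being disbanded. It has recently come back to life, with full-time engineers assigned to it.

DubboX: an RPC framework that Dangdang built by extending Dubbo. It supports REST-style remote calls and Kryo/FST serialization, and adds some new features.

Motan: a Java framework open-sourced by Sina Weibo. It arrived relatively late: it started in 2013 and was open-sourced in May 2016. Motan is widely used on the Weibo platform, handling nearly 100 billion calls a day for hundreds of services.

rpcx: a distributed RPC service framework similar to Alibaba's Dubbo and Weibo's Motan, built on Golang's net/rpc. However, rpcx is maintained by essentially one person and lacks a mature community, so be cautious before adopting it. I considered it when selecting an RPC framework for Golang, but in the end gave it up and chose gRPC. If you want to develop your own RPC framework, it is worth studying.

6. Name discovery / service discovery

Name discovery and service discovery come in two modes: client-side discovery and server-side discovery.

The service discovery commonly built into frameworks is client-side discovery.

In server-side discovery, the client sends requests to a service through a load balancer; the load balancer queries the service registry and routes each request to an available service instance. Most load balancers in use today work this way, and the pattern is common in microservices.

All name discovery and service discovery depends on a highly available service registry. Three registries are commonly used in the industry:

etcd: a highly available, distributed, consistent key-value store used for shared configuration and service discovery. Two well-known projects use it: Kubernetes and Cloud Foundry.

Consul: a tool for discovering and configuring services. It provides an API for clients to register and discover services, and it can run health checks to determine service availability.

Apache ZooKeeper: a widely used, high-performance coordination service for distributed applications. It began as a Hadoop sub-project and is now a top-level project.

Beyond these, you can implement service discovery yourself, or use Redis, but then you have to provide high availability yourself.

7. Relational databases

Relational databases fall into two kinds. One is the traditional relational database, such as Oracle, MySQL, MariaDB, DB2, and PostgreSQL. The other is NewSQL, a new kind of relational database that must satisfy at least the following five points:

Full SQL support, including complex queries such as JOIN, GROUP BY, and subqueries.

The ACID transactions that are standard in traditional databases, with strong isolation levels.

Elastic scaling, with scale-out and scale-in completely transparent to the business layer.

True high availability: multi-site active-active deployment, and failure recovery that needs no human intervention, so that the system recovers from disasters automatically with strongly consistent data.

Some capacity for big data analysis.

The most widely used traditional relational database is MySQL. It is mature and stable and covers the basic needs. Up to a certain data volume, a single-machine traditional database is basically enough, and many open source systems are built on MySQL and work out of the box. With master-slave replication and a cache in front, it can handle applications with millions of page views. Note, however, that CentOS 7 has dropped MySQL in favor of MariaDB. MariaDB is a fork of MySQL, maintained mainly by the open source community under the GPL license. One reason for the fork was the risk that Oracle might close the MySQL source after acquiring it, which the community avoided by forking.

After Google published "F1: A Distributed SQL Database That Scales" and "Spanner: Google's Globally-Distributed Database", NewSQL became popular in the industry. That gave us CockroachDB, and TiDB from PingCAP. Quite a few companies in China already use TiDB. I previously used TiDB for big data analysis at a startup; the main reason was that MySQL required sharding across databases and tables, which made the application logic complex and did not scale well enough.

8. NoSQL

NoSQL, as the name suggests, means Not-Only SQL, although some read it as No-SQL. I prefer Not-Only SQL: it is not meant to replace relational databases but to complement them.

There are four common types of NoSQL:

Key-value: suited to content caching and to large data sets with mixed workloads, high concurrency, and high scalability requirements. Its advantages are simplicity and fast queries; its drawback is the lack of structure in the data. Common examples are Redis, Memcache, BerkeleyDB, and Voldemort.

Column-oriented: stores data in column families, keeping data from the same column together, and is common in distributed file systems. HBase and Cassandra are representative. Cassandra is mostly used for write-heavy, read-light workloads. In China, 360 is a heavy user, with a cluster of about 1,500 machines. Abroad, many companies use it at scale, including eBay, Instagram, Apple, and Walmart.

Document: well suited to holding large amounts of unrelated, complex information whose structure varies widely. Its performance sits between key-value stores and relational databases, and it was inspired by Lotus Notes. Common examples are MongoDB and CouchDB.

Graph: graph databases are good at anything involving relationships, such as social networks and recommendation systems. They focus on building relationship graphs, need to compute over the whole graph to produce results, and are not easy to deploy as distributed clusters. Common examples are Neo4J and InfoGrid.

Besides these four types there are some specialized databases, such as object databases and XML databases, each optimized for a particular kind of storage.

In practice, when to use a relational database, when to use NoSQL, and which type of database to use are very important considerations in architecture selection, and can even shape the whole architecture.

9. Message middleware

Message middleware is an indispensable component of a back-end system. We generally use it in the following scenarios:

Asynchronous processing: this is one of the main reasons for using message middleware. The most common asynchronous scenarios at work are sending a confirmation email after a user registers, returning stale data when a cache entry expires and then refreshing the cache asynchronously, and writing logs asynchronously. Asynchronous processing shortens the response time of the main flow and lets secondary or non-critical work be handled centrally through the message middleware.

System decoupling: in an e-commerce system, for example, once a user has paid for an order, the payment result has to be passed to the ERP system, invoicing system, WMS, recommendation system, search system, risk control system, and others. This processing does not need to be real-time or strongly consistent, only eventually consistent, so the systems can be decoupled through message middleware. Such decoupling also helps accommodate system requirements that are not yet clear.

Peak shaving: when a system receives heavy traffic, the monitoring charts show one traffic peak after another. By putting the burst of requests into a queue and letting consumer programs work through them gradually, message middleware flattens the peaks and fills the valleys. The most typical scenario is a flash sale. In an e-commerce flash-sale system the order service is often the bottleneck, because placing an order requires database operations on inventory and the like with strong consistency. Using message middleware to queue orders and control flow lets the order service work through the queue gradually, which protects the order service.

Message middleware is a very general component. Some teams choose open source, some build their own, and some even use MySQL or Redis directly as a queue. The key is whether it meets your needs. If you use an open source project, the following table can serve as a reference:

Figure 3

The dimensions in the figure above are: name, maturity, community / company, documentation, license, development language, supported protocols, client languages, performance, persistence, transactions, clustering, load balancing, management UI, deployment method, and evaluation.

10. Code management

Code is one of the lifelines of an Internet startup, so code management matters. Two considerations are common:

Security and permission management: keep the code on the intranet, and apply strict access control and physical machine isolation to the core code on which the company depends.

Code management tools: Git is the obvious choice for code management.

GitLab is by far the most popular open source Git hosting server. It has an enterprise edition, but the community edition meets most of our needs, and combined with Gerrit for code review it is close to perfect. GitLab also has code comparison, but it is less intuitive than Gerrit's. Gerrit offers a better code review interface and mainline management experience than GitLab, and fits better in a culture with high code-quality requirements.

11. Continuous integration

Continuous integration (CI) is a software development practice in which team members integrate their work frequently, possibly several times a day. Each integration is verified by an automated build (including compilation, release, and automated testing) so that integration errors are found as early as possible. CI gives the development process its basic groundwork, such as branch management and comparison, compilation, checks, and release artifacts, and provides unified support for test work such as building and generating coverage versions.

Among free CI tools in the industry, we have the following choices:

Jenkins: written in Java, with a powerful plug-in mechanism, open source under the MIT license (free and highly customizable; it can run distributed builds and load tests across multiple machines). Jenkins can do almost anything, and suits teams from small to large. At scale, however, it still takes people to learn and maintain it.

TeamCity: friendlier to use than Jenkins and also a highly customizable platform. However, once the number of users grows, TeamCity charges.

Strider: an open source continuous integration and deployment platform implemented in Node.js, using MongoDB for storage, under the BSD license. Conceptually it is similar to Travis and Jenkins.

GitLab CI: since GitLab 8.0, GitLab CI has been integrated into GitLab. You only need to add a .gitlab-ci.yml file to the project and register a Runner to get continuous integration. GitLab also works very well with Docker. For the differences between the free and paid editions, see https://about.gitlab.com/products/feature-comparison/.

Travis: tightly coupled to GitHub. Using a SaaS for closed-source code raises security concerns, and it is not customizable. It is free for open source projects and paid otherwise.

Go (GoCD): the CI server from ThoughtWorks, the latest incarnation of Cruise Control. Apart from the commercial support ThoughtWorks offers, Go is free. It runs on Windows, Mac, and various Linux distributions.

12. Log system

A log system generally covers writing logs, collection, relaying, aggregation, storage, analysis, presentation, search, and distribution. Some special features, such as request tagging, full-link tracing, or monitoring, may also depend on the log system.

Building a log system is not only about tools; it also involves conventions and components. Ideally, basic logging such as full-link tracing is added once at the framework and component level.

For a conventional log system, ELK meets most needs. ELK includes the following components:

ElasticSearch: an open source distributed search engine. Its features include distribution, zero configuration, automatic discovery, automatic index sharding, index replicas, a RESTful interface, multiple data sources, and automatic search load balancing.

Logstash: a fully open source tool that collects and analyzes your logs and stores them for later use.

Kibana: an open source, free tool that gives Logstash and ElasticSearch a friendly Web interface for log analysis, helping you aggregate, analyze, and search important log data.

Filebeat: has completely replaced Logstash-Forwarder as the new generation of log shipper. Because it is lightweight and secure, more and more people are adopting it.

Because the free ELK stack has no security mechanism at all, Nginx is used here as a reverse proxy so that users do not access the Kibana server directly. Configuring simple user authentication in Nginx improves security to some extent. In addition, Nginx itself can do load balancing, which improves access performance.

The ELK architecture is shown in Figure 4:

Figure 4: ELK flow chart

If you need real-time computation, you can use Flume + Kafka + Storm + MySQL. The general architecture is shown in Figure 5:

Figure 5: Real-time analysis system architecture

In this architecture:

Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transporting massive volumes of logs. It supports custom data senders for collecting data from the log system, and it can also do simple processing and write to various customizable data receivers.

Kafka is an open source stream processing platform developed by the Apache Software Foundation, written in Scala and Java. In essence it is "a massively scalable publish/subscribe message queue architected as a distributed transaction log", and it is widely used for its horizontal scalability and high throughput.

Kafka aims for high throughput and high load, while Flume aims for diversity of data, so the two complement each other very well.

13. Monitoring system

Here the monitoring system covers only what relates to the back end, which is mainly two areas. One is operating-system-level monitoring, such as machine load, IO, network traffic, CPU, memory, and other OS metrics. The other is monitoring of service quality and business quality, such as service availability, success rate, failure rate, capacity, and QPS.

Common monitoring systems started with OS-level monitoring (which is relatively mature) and then expanded to other kinds, such as Zabbix and Xiaomi's Open-Falcon. Others supported both from the start, such as Prometheus. If your requirements for business monitoring are on the high side, consider Prometheus first when selecting tools for a startup. There is an interesting distribution here, shown in Figure 6.

Figure 6: Distribution of monitoring systems

Zabbix is used more in Asia, while Prometheus is more common in the Americas, Europe, and Australia. In other words, English-speaking regions (developed countries?) use Prometheus more.

Prometheus is an open source monitoring and alerting system and time series database (TSDB) developed by SoundCloud. It is written in Go and modeled on Google's BorgMon monitoring system. Whereas other monitoring systems have data pushed to them, Prometheus pulls data. Its architecture is shown in Figure 7: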

Figure 7: Prometheus architecture diagram

As shown in the figure above, the main components of Prometheus are as follows:

Prometheus Server: mainly responsible for collecting and storing data, and provides the PromQL query language. Server learns its scrape targets from configuration files, text files, ZooKeeper, Consul, DNS SRV lookup, and so on. It then scrapes metrics from these targets at regular intervals, so each target has to expose an HTTP endpoint for Server to scrape.

Client SDK: the official client libraries cover Go, Java, Scala, Python, and Ruby, and there are many third-party libraries for Node.js, PHP, Erlang, and others.

Push Gateway: an intermediate gateway that lets short-lived jobs actively push their metrics.

Exporter: the general name for a class of Prometheus data collection components. An Exporter collects data from a target and converts it into a format Prometheus supports. Unlike traditional collection components, it does not send data to a central server but waits for the central server to scrape it. Prometheus offers many kinds of Exporter for collecting the running status of different services, currently including databases, hardware, message middleware, storage systems, HTTP servers, and JMX.

Alertmanager: a separate service that works with Prometheus query expressions and provides very flexible alerting.

Prometheus HTTP API: a query interface for customizing the output you need.

Grafana: an open source analysis and monitoring platform that supports data sources such as Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, and CloudWatch. Its UI is attractive and highly customizable.

For a startup, Prometheus + Grafana combined with a unified service framework (such as gRPC) can meet the monitoring needs of most small and medium-sized teams.
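To make the pull model concrete, here is a minimal sketch of a scrape target written with only the Python standard library. It assumes a hypothetical counter named app_requests_total and a hypothetical port 8000; the Counter class is an illustrative helper, not the official client SDK, which a real service would normally use instead of rendering the text format by hand.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Counter:
    """Thread-safe counter (hypothetical helper, not the official client SDK)."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._value = 0
        self._lock = threading.Lock()

    def inc(self, amount=1):
        with self._lock:
            self._value += amount

    def render(self):
        # Prometheus text exposition format: HELP and TYPE comments, then "name value".
        with self._lock:
            value = self._value
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {value}\n"
        )


# Metric name is an assumption chosen for illustration.
REQUESTS = Counter("app_requests_total", "Total requests handled.")


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The server only answers scrapes; it never pushes data anywhere.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = REQUESTS.render().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on /metrics."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() in the service process and adding the host and port as a target in the Prometheus Server configuration would be enough for Server to start scraping this counter.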

14. Configuration system

As programs grow more complex, they carry more and more configuration: feature switches, degradation switches, grayscale switches, parameters, server addresses, database settings, and so on. At the same time, expectations for configuration management keep rising: changes should take effect in real time, and there should be grayscale release, management by environment, user, and cluster, sound permissions, and an audit mechanism. In this environment, traditional approaches such as configuration files and databases increasingly fail to meet developers' needs. There are two solutions in the industry:

Based on zk and etcd: provide a management UI and an API, store version history and contingency plans in a database, run changes through a review process, and finally publish them to a store with push capability such as zk or etcd (service registration already uses zk or etcd, so the same choice can serve both). Clients talk to zk or etcd directly. Grayscale release is implemented differently from company to company. One approach is to publish, together with the configuration, the list of IPs that should receive the grayscale version; when a client sees the configuration node change, it checks whether it is on the list. For stateless languages such as PHP, and for other languages without zk/etcd client support, an Agent has to run on the client machine to listen for changes and write them to configuration files or shared memory, as QConf does.

Based on ops automation pushing configuration files: the review process and configuration data management are similar to the first solution. Configuration files are generated at publish time and pushed to each client by ops automation tools such as Puppet or Ansible. The application rereads the external configuration file periodically, and for a grayscale release the IP list is specified when the configuration is pushed.

Startups do not need this much complexity early on. Just use zk, build a UI to manage its contents, record everyone's operations in a log, and connect programs to zk directly, or use an optimized zk-based solution such as QConf.
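The grayscale-by-IP-list idea described above can be sketched as follows. This is only an illustration under stated assumptions: the JSON layout with the keys stable, gray, and gray_ips is hypothetical, and on_config_changed stands in for whatever callback your zk/etcd client or local Agent invokes when the watched node changes.

```python
import json


def effective_config(published, local_ip):
    """Return the configuration this client should apply.

    `published` is assumed (hypothetically) to look like:
    {"stable": {...}, "gray": {...}, "gray_ips": ["10.0.0.5", ...]}
    """
    config = dict(published["stable"])
    # Only machines on the grayscale list overlay the new values.
    if local_ip in set(published.get("gray_ips", [])):
        config.update(published.get("gray", {}))
    return config


def on_config_changed(raw_bytes, local_ip):
    """Callback a watch on the configuration node would invoke with its new value."""
    return effective_config(json.loads(raw_bytes), local_ip)
```

For example, with a stable value {"timeout_ms": 500} and a grayscale value {"timeout_ms": 200} published for 10.0.0.5 only, that machine picks up 200 while every other machine keeps 500. Promoting the change is then a matter of moving the grayscale values into the stable section.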

15. Release system / deployment system

From the perspective of software production, the typical flow from code to final service is shown in figure 8:

Figure 8, flow chart

As the figure above shows, the path from the developer to the end user of the service is long. It can be divided into three phases:

From code to artifact repository: this phase covers continuously building the developers' code and centrally managing the artifacts the builds produce. It prepares the input for the deployment system.

From artifact to runnable service: this phase deploys the artifact to a designated environment, which is the most basic job of the deployment system.

From the development environment to the final production environment: this phase moves a change through the different environments, which is the core capability the deployment system ultimately provides.

The release system brings together artifact management, the release process, permission control, version changes in the online environment, grayscale release, and online rollback, and is an important channel for developers' work. No open source project fully meets these needs. For Web-only projects, Walle and Piplin are usable; where their features fall short, you can integrate Jenkins + GitLab + Walle (allow a couple of days to polish it). This combination basically covers artifact management, the release process, permission control, online version changes, grayscale release (which you have to implement yourself), and online rollback.
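As an illustration of online version changes and rollback, here is a minimal sketch of one common technique: keep every release in its own directory and switch a "current" symlink between them. This is an assumption for illustration, not a description of how Walle or Piplin work internally; the directory layout and function names are hypothetical, version names are assumed to sort in release order, and the code assumes a POSIX system.

```python
import os


def activate(releases_dir, current_link, version):
    """Point current_link at releases_dir/version with an atomic symlink swap."""
    target = os.path.join(releases_dir, version)
    if not os.path.isdir(target):
        raise FileNotFoundError(f"unknown release: {version}")
    tmp_link = current_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    # rename() replaces the old link in one step on POSIX, so readers never see a gap.
    os.replace(tmp_link, current_link)


def active_version(current_link):
    """Return the name of the release directory the link currently points to."""
    return os.path.basename(os.path.realpath(current_link))


def rollback(releases_dir, current_link):
    """Switch back to the release that sorts immediately before the active one."""
    versions = sorted(os.listdir(releases_dir))
    index = versions.index(active_version(current_link))
    if index == 0:
        raise RuntimeError("no earlier release to roll back to")
    activate(releases_dir, current_link, versions[index - 1])
    return versions[index - 1]
```

Because old release directories are kept, a rollback is just another link switch and does not require rebuilding anything. The current link should live outside the releases directory so that it is not mistaken for a release.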

16. Jump server (bastion host)

A jump server has to provide role management and authorization, access control for information resources, operation recording and auditing, and control over system changes and maintenance. It should also generate statistical reports that support management standards and continuously improve the compliance of IT internal controls, so that operations staff activity can be controlled and audited, and the cause and the person responsible can be located quickly when an incident results from a mistaken or unauthorized operation. Its functional modules generally include account management, authentication management, authorization management, and audit management.

Among open source projects, JumpServer covers the common jump server requirements, such as authorization, user management, and recording basic server information, and it can also run scripts in batches. Features such as session video playback, command search, and real-time monitoring help operations staff trace the operation history and find operation records easily, and make it easier for managers to control how others operate the servers.
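The authorization and audit modules can be pictured with a very small sketch. The JumpHost class below is entirely hypothetical and is not JumpServer's API; it only shows the two ideas from the text, that every command is checked against a grant and that every attempt, allowed or not, is recorded for later tracing.

```python
import time


class JumpHost:
    """Toy model of a jump server's authorization and audit modules (hypothetical)."""

    def __init__(self, clock=time.time):
        self._grants = {}      # user -> set of hosts the user may operate
        self.audit_log = []    # one entry per attempted command
        self._clock = clock

    def grant(self, user, host):
        self._grants.setdefault(user, set()).add(host)

    def authorize(self, user, host, command):
        """Check the grant and record the attempt whether or not it is allowed."""
        allowed = host in self._grants.get(user, set())
        self.audit_log.append({
            "ts": self._clock(),
            "user": user,
            "host": host,
            "command": command,
            "allowed": allowed,
        })
        return allowed

    def history(self, user):
        """Return every recorded attempt by one user, for incident tracing."""
        return [entry for entry in self.audit_log if entry["user"] == user]
```

Recording denied attempts as well as allowed ones is a deliberate choice here, since unauthorized operations are exactly what an audit needs to surface.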

17. Machine management

When selecting a machine management tool, consider the following three aspects:

Simplicity: does every machine need an Agent (client) deployed on it?

Language (Puppet/Chef vs Ansible/SaltStack): with open source technology, you cannot become proficient without reading the official documentation, and you cannot master it without understanding the source code. Puppet and Chef are written in Ruby; Ansible and SaltStack are written in Python.

Speed (Ansible vs SaltStack): Ansible transfers data over SSH, while SaltStack uses the ZeroMQ message queue. For a few dozen to 200 machines, both handle the concurrency acceptably. If you operate on thousands of machines at a time, Salt is the better choice.

As shown in figure 9:

Figure 9, comparison of machine management software

Generally speaking, a startup can solve most of its problems by choosing Ansible. It is simple, needs no extra client installed, and can be run from the command line without any configuration file. More complex tasks are described in YAML in a configuration file called a Playbook, and Playbooks can use templates to extend what they do.
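To show what "agentless over SSH" means in practice, here is a minimal sketch that fans one command out to many hosts through the system ssh client. It is not how Ansible is implemented; the function names and the default of 20 workers are assumptions, and it presumes key-based SSH login is already set up for every host.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_runner(host, command, timeout=30):
    """Run a command on one host via the system ssh client; return (exit_code, output)."""
    proc = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],  # BatchMode: never prompt for a password
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout + proc.stderr


def run_on_hosts(hosts, command, runner=ssh_runner, max_workers=20):
    """Run the same command on every host, at most max_workers at a time."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {host: pool.submit(runner, host, command) for host in hosts}
        for host, future in futures.items():
            try:
                results[host] = future.result()
            except Exception as exc:
                # One unreachable machine should not abort the whole batch.
                results[host] = (-1, str(exc))
    return results
```

Nothing has to be installed on the managed machines beyond an SSH server, which is the simplicity argument made above. The cost is one SSH connection per host, which is why a message-queue design tends to scale better when operating on thousands of machines at once.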

Choices for startups

1. Choose the right language

Choose what the team knows and can control. A startup has few people and much to do, with little slack for the R&D team to learn a new language. A language the team can pick up quickly, deliver with quickly, and debug quickly is a good choice.

Choose something more modern. Modern here means the language itself takes care of things that used to need special handling, such as memory management and threading.

Choose a language with many open source libraries or an active community. This reduces your own investment during development, gives you stable and reliable libraries to build on, and lets you find answers to problems quickly online.

Choose a language that is easy to hire for. This lowers the startup's recruiting cost and helps it find the right people quickly.

Choose something interesting. This is related to the previous point: an interesting language also helps with retaining people.

2. Choose suitable components and cloud providers

Choose a reliable cloud provider

Choose which of the cloud provider's components to use

Choose mature open source components rather than the newest ones

Choose products that are used in production at leading Internet companies, are open source, and have a good reputation in the community

Consider how active the open source community is

Choosing a reliable cloud provider is really a false proposition, because no provider is truly reliable, and the availability problems they promise to prevent will mostly happen to you anyway. You still need to do some work yourself, such as keeping a backup across multiple providers. With CDN, for example, never use just one provider; use at least two. One reason is disaster recovery, so that you keep the ability to switch behind the scenes. The other is coverage, because different providers have different resources at their CDN nodes.
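The "at least two CDNs" advice implies some switching logic on your side. A minimal sketch, assuming hypothetical provider host names and a caller-supplied health check (in practice this might be a periodic probe or a flag stored in the configuration system):

```python
def pick_cdn(providers, is_healthy):
    """Return the first healthy provider; `providers` is ordered by preference."""
    for base_url in providers:
        if is_healthy(base_url):
            return base_url
    raise RuntimeError("no healthy CDN provider")


def asset_url(path, providers, is_healthy):
    """Build the URL for a static asset on whichever CDN is currently usable."""
    return pick_cdn(providers, is_healthy).rstrip("/") + "/" + path.lstrip("/")


# Hypothetical host names, for illustration only.
PROVIDERS = ["https://cdn-a.example.com", "https://cdn-b.example.com"]
```

If the check marks cdn-a as down, asset_url("/js/app.js", PROVIDERS, check) returns the cdn-b URL without any change on the caller's side. The same shape extends to choosing a provider per region when node coverage differs.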

Once you have chosen a cloud provider, you will find many ready-made products on offer, such as storage and queues, and you will hesitate: use them, or build your own on cloud virtual machines? My suggestion is to use the cloud provider's products in the early stage and build your own later, which saves a lot of operations work. You should, however, learn the characteristics and pitfalls of the provider's components. For example, their internal network will drop from time to time, and their upgrades can cause brief interruptions, so the business side needs to handle fault tolerance and avoidance well.
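Fault tolerance on the business side often starts with retrying calls that fail because of a brief network interruption. A minimal sketch of retry with exponential backoff and jitter follows; the function name and default values are assumptions, and it should only wrap operations that are safe to repeat.

```python
import random
import time


def call_with_retry(func, attempts=4, base_delay=0.2, max_delay=5.0,
                    sleep=time.sleep, retry_on=(ConnectionError, TimeoutError)):
    """Call func(), retrying transient failures with exponential backoff and jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller see the real error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter keeps many clients from retrying at the same instant.
            sleep(delay * random.uniform(0.5, 1.0))
```

Wrapping a queue or storage call, for example call_with_retry(lambda: client.get(key)) with a hypothetical client, lets a short network drop pass unnoticed, while a longer outage still surfaces as an exception after the last attempt.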

For open source components, choose mature ones wherever possible. Mature components have stood the test of time and rarely have major problems. They come with a complete set of supporting tools, and answers to problems are easy to find online, because other people have already run into most of the pitfalls you will meet.

3. Establish processes and standards

Establish development standards and rules for managing code and branches, with only a few people having access to critical code

Define the release process and enforce it through the release system

Establish operations and maintenance standards

Define database operation standards and restrict database operation permissions

Define the alert handling process so that every alert is handled by someone

Set up a reporting mechanism, such as morning meetings and weekly reports

4. Build or select suitable supporting systems

All processes and standards need to be enforced by systems, or they will remain castles in the air. How should you choose these systems? Refer to the open source options in the previous chapter, weigh them against the languages and components you have selected, and pick the most appropriate one.

Take project management: look at what kind of company you are and how you develop, whether waterfall or agile, and whether work is organized by project, by task, or by customer.

Take the log system: start by simply writing logs, then adopt ELK and standardize a few logging components. After that you basically do not need to think about the log system for a long time; at most you split it or scale it out. Once the organization is large, build your own.

Take code management and project management systems: put them on the intranet. In an Internet company these are lifeline assets, and lifeline assets are safer kept where others cannot reach them, or can reach them only with difficulty.

5. Questions to consider during selection

Choosing a technology stack is a bit like making a commitment: it cannot be changed for some time, so it deserves careful thought.

Looking back over this article, one word keeps appearing: appropriate. The right choice is not the best or the newest but the most appropriate, and appropriate means appropriate for now. Is this choice the most appropriate at this moment? Take the Go ecosystem, for example: the technology is relatively new, so does the industry offer enough components? Does your organization have enough people who know it? How much does it cost to learn? Will what you write meet the business's performance requirements? Can you deliver on time?

Look ahead as well. Will you need to change course in one to three years? Will the technology stack need a fundamental change? If the organization grows quickly, will the current stack need a major overhaul at 200 or 500 people?

Cost has to be considered when starting a business. Cost here is not only the money spent and the salaries paid; often the more important cost is time. Many startups are racing for a time window, and once it closes, the opportunity is gone.

Cloud-based back-end technology architecture for startups

Taking the above considerations together, and after selecting systems and components on top of cloud services, the back-end technical architecture of a startup is shown in Figure 10:

That concludes this walkthrough of building a startup's back-end technology stack from scratch. I hope it is of some help to you.
