[TOC]
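Background on where etcd fits: in production, many applications may run only one instance at a time. A job that updates a database is a typical example: running several instances concurrently not only degrades performance but can also leave the data inconsistent. Single-instance deployment, however, weakens fault tolerance, for example when the process exits abnormally. Process supervisors such as supervisor and systemd can restart a crashed process, but if the host itself goes down, none of them help. With etcd's built-in leader election, a service can be made highly available with little effort.
Etcd (pronounced "et-cee-dee") is an open-source, distributed, highly available, strongly consistent key-value store (written in Go, and therefore cross-platform). It was developed by the CoreOS team and is now governed by the Cloud Native Computing Foundation. It provides configuration sharing, service discovery, and many other features. It is the backbone of many distributed systems, offering a reliable way to store data across a cluster of servers.
Etcd uses the Raft algorithm to maintain communication and data consistency among the nodes of a cluster. The nodes are peers: if the leader fails, a new leader is elected quickly so the system keeps running. Etcd is widely used in projects such as Kubernetes, Rook, CoreDNS, M3, and OpenStack.
Official site and documentation: https://etcd.io/docs/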
Addendum: April 23, 2020
(1) Overview of available etcd versions
* v3.4.x
* v3.3.x
* v3.2.x
* v3.1.x
* v2

etcd glossary:
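* Raft: the algorithm etcd uses to guarantee strong consistency in a distributed system.
* Node: an instance of the Raft state machine.
* Member: an etcd instance. It manages one Node and can serve client requests.
* Cluster: an etcd cluster made up of several Members that work together.
* Peer: what a Member calls another Member of the same etcd cluster.
* Client: a client that sends HTTP requests to the etcd cluster.
* WAL: write-ahead log, the log format etcd uses for persistent storage.
* Snapshot: a snapshot of etcd's data state, taken so that WAL files do not pile up.
* Proxy: an etcd mode that provides a reverse proxy for an etcd cluster.
* Leader: the node, chosen by election in Raft, that handles all data commits.
* Follower: a node that lost the election and acts as a subordinate node in Raft, providing the algorithm's strong-consistency guarantee.
* Candidate: when a Follower receives no heartbeat from the Leader for a certain time, it becomes a Candidate and starts an election.
* Term: the period from a node becoming Leader until the next election.
* Index: the number of a data entry. Raft locates data by Term and Index.

The etcd architecture is shown in the figure below. It consists of four main parts: HTTP server, Store, Raft, and WAL.
Figure (WeiyiGeek): etcd architecture
Components: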
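* HTTP server: serves the API requests sent by users.
* Store: handles the transactions behind the features etcd supports, including data indexing, node state changes, monitoring and feedback, and event handling and execution.
* Raft: uses the Raft algorithm to keep data strongly consistent across nodes.
* WAL: the storage layer. Data is persisted through the write-ahead log; a Snapshot stores a snapshot of the data state, and an Entry holds the actual log content.

An etcd cluster is a distributed system. Every etcd node maintains a state machine and stores the complete data set. At any moment there is at most one valid leader, and the leader handles all read and write operations from clients.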
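The relationship between the WAL, Entries, and Snapshots described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions (an in-memory list stands in for the log file, and the class name `TinyStore` and its methods are invented for illustration); it is not how etcd itself is implemented.

```python
import json


class TinyStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self):
        self.wal = []            # append-only log of JSON-encoded entries
        self.data = {}           # current in-memory state
        self.last_index = 0      # index of the last logged entry
        self.snapshot = (0, {})  # (index covered, state at that index)

    def put(self, key, value):
        self.last_index += 1
        entry = {"index": self.last_index, "key": key, "value": value}
        self.wal.append(json.dumps(entry))  # 1. write the entry to the log
        self.data[key] = value              # 2. only then apply it
        return self.last_index

    def take_snapshot(self):
        # Save the full state, then drop the log entries it already covers.
        self.snapshot = (self.last_index, dict(self.data))
        self.wal.clear()

    def recover(self):
        # Rebuild state: start from the snapshot, replay the remaining log.
        index, state = self.snapshot
        state = dict(state)
        for line in self.wal:
            entry = json.loads(line)
            state[entry["key"]] = entry["value"]
            index = entry["index"]
        return index, state


store = TinyStore()
store.put("name", "WeiyiGeek")
store.take_snapshot()
store.put("name", "etcd")
store.put("lang", "Go")
print(store.recover())
```

After the snapshot, only the two later entries remain in the log, yet recovery still reproduces the full state. That is the reason snapshots keep the log from growing without bound.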
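To understand how etcd works, three concepts are needed (they are what the figure below is meant to convey):
* Leaders: the leader handles all client requests that need cluster-wide agreement. It accepts new changes, replicates them to the follower nodes, and commits a change once the followers have acknowledged it. Note: a cluster can have only one leader at any given time.
* Elections: each node maintains a randomized election timer. The timer is how long the node waits before calling a new election and nominating itself as a candidate.
* Term: if the leader dies or stops responding, the other nodes start a new term after their timeout expires in order to hold a new election.

Additional notes: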
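* If a node hears nothing from the leader before its timeout fires, it starts a new term, marks itself as a candidate, and asks the other nodes for their votes to begin a new election (each node votes for the first candidate that requests its vote).
* If a candidate receives votes from a majority of the nodes in the cluster, it becomes the new leader.
* If several candidates receive the same number of votes, the current election term ends without a leader, and a new term begins with fresh random election timers.

An analogy: in a military exercise, an early-warning aircraft is usually surrounded by fighters and interceptors. They all follow the early-warning aircraft's direction and carry out the mission in an orderly way. In this group the early-warning aircraft plays the role of the master in leader election: a cluster has exactly one master, which hands out tasks while the other nodes cooperate, and when the master fails a new node must be chosen as master right away.
As described above, in a Raft-based system the cluster uses elections to choose a leader for a given term.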
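The voting rule described above (each node votes for the first candidate that asks, and a candidate needs votes from a majority of the cluster) can be sketched as a small tally. This is a toy model rather than etcd's Raft implementation; the function name `tally_election` and the dictionary input format are assumptions made for illustration.

```python
from collections import Counter


def tally_election(first_request, cluster_size):
    """Return the winning candidate, or None if nobody reached a majority.

    first_request maps each voting node to the first candidate that asked
    for its vote (a candidate votes for itself).
    """
    majority = cluster_size // 2 + 1
    votes = Counter(first_request.values())
    for candidate, count in votes.items():
        if count >= majority:
            return candidate
    return None  # split vote: the term ends without a leader


# Five nodes: candidate "a" collects three of the five votes.
clear_win = {"a": "a", "b": "b", "c": "a", "d": "a", "e": "b"}
print(tally_election(clear_win, cluster_size=5))

# Four nodes: two candidates collect two votes each, so nobody wins.
split = {"a": "a", "b": "b", "c": "a", "d": "b"}
print(tally_election(split, cluster_size=4))
```

In the split case a real cluster would start a new term with new random timers, which makes another tie unlikely.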
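Every change must go through the leader node. Etcd does not accept and commit a change immediately; it uses the Raft algorithm to make sure a majority of nodes agree to it. The leader sends the proposed new value to every node in the cluster, and each node replies with a message confirming that it received the value. If a majority of nodes confirm receipt, the leader commits the new value and tells every node to commit it to its log. In other words, every change needs a quorum of the cluster before it can be committed.

With this basic understanding in place, we can look at the state transition rules of the replicated state machine. Each etcd node is in one of three states: Follower, Candidate, or Leader.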
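The commit rule from the paragraph on changes above, where the leader commits only after a majority has acknowledged the proposed value, can be sketched as follows. The `Follower` class and `propose` function are hypothetical names for this illustration; real Raft also tracks terms and log indexes, which this sketch leaves out.

```python
class Follower:
    def __init__(self, reachable=True):
        self.reachable = reachable
        self.pending = None   # value received but not yet committed
        self.log = []         # committed values

    def receive(self, value):
        """Store the proposed value and acknowledge it, if reachable."""
        if not self.reachable:
            return False
        self.pending = value
        return True

    def commit(self):
        if self.pending is not None:
            self.log.append(self.pending)
            self.pending = None


def propose(leader_log, followers, value):
    """Commit value only if a majority of the cluster acknowledged it."""
    cluster_size = len(followers) + 1            # followers plus the leader
    acks = 1 + sum(1 for f in followers if f.receive(value))
    if acks >= cluster_size // 2 + 1:
        leader_log.append(value)
        for f in followers:
            f.commit()
        return True
    for f in followers:                          # no quorum: discard
        f.pending = None
    return False


leader_log = []
followers = [Follower(), Follower(reachable=False)]
print(propose(leader_log, followers, "name=WeiyiGeek"))  # 2 of 3 acknowledged

isolated = [Follower(reachable=False), Follower(reachable=False)]
print(propose([], isolated, "name=other"))               # 1 of 3 acknowledged
```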
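Process:
(1) When the cluster initializes, every node is a Follower. If a Follower receives no heartbeat from the leader within a certain time, it changes its role to Candidate and starts a leader election.
(2) If the Candidate receives approval from more than half of the nodes, counting itself, the election succeeds. If it receives fewer than half of the votes, or the election times out, the election fails. Note that if no leader is chosen in a round, another round of election follows.
(3) Once a Candidate becomes Leader, it uses heartbeats to synchronize data with the other nodes, and the other Candidates that took part in the election return to the Follower role.

(1) Hardware recommendations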
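The official etcd performance benchmarks run etcd on GCE instances with 8 vCPUs, 16 GB RAM, and a 50 GB SSD (for production, size resources according to the official hardware guidance). For testing, any relatively modern machine with low-latency storage and a few gigabytes of memory should be enough. Note that applications with large v2 data stores need more memory than those with large v3 data stores, because v2 data is kept in anonymous memory rather than memory-mapped from a file.

Notes:
* Etcd writes data to disk, so SSDs are strongly recommended.
* Always use an odd number of cluster members, because updating the cluster state requires a quorum.
* For performance reasons, a cluster usually has no more than 7 nodes.

Single-instance installation options:
* Prebuilt binaries: https://github.com/etcd-io/etcd/releases/
* Build the latest version: https://github.com/etcd-io/etcd.git
* Build it as a third-party Go package with go get

Here we install etcd by building it. The steps are as follows.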
(2) Installation steps
Step 1. Install Go by unpacking the binary tarball from https://golang.org/dl/ (access may require a proxy). The Go version must be 1.13 or later.
    wget https://studygolang.com/dl/golang/go1.14.2.linux-amd64.tar.gz -O /opt/go1.14.2.linux-amd64.tar.gz
    tar -zxf /opt/go1.14.2.linux-amd64.tar.gz -C /usr/local/

    # Append the Go settings to /etc/profile.
    # The delimiter is quoted so the variables are written literally
    # instead of being expanded (to empty strings) while the file is written.
    cat >> /etc/profile <<'END'
    # Go environment
    export GOROOT=/usr/local/go
    # Install location for third-party packages and binaries
    export GOBIN=$GOROOT/bin
    export GOPATH=$GOROOT/path
    export PATH=$PATH:$GOBIN:$GOPATH/bin
    END

    source /etc/profile
    mkdir -vp /usr/local/go/path
    ln -s /usr/local/go/bin/* /usr/local/bin/
    go version
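Step 2. To build etcd from the master branch with the official build script, clone the repository and then run the build script (if you install this way, Step 3 is not needed).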
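    # Optional: route git through a proxy (this address is specific to the author's environment)
    git config --global http.proxy 'socks5://10.20.172.135:2083'
    # Clone the repository listed under the installation options, then build
    git clone https://github.com/etcd-io/etcd.git
    cd etcd
    ./build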
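Step 3. Alternatively, use go get to build a vendored etcd from the master branch (it fetches the code, compiles it, and installs it in one step). After running the commands below, the downloaded files appear under /usr/local/go/path, the Go third-party package directory.

    $ echo $GOPATH
    # /usr/local/go/path
    go get -v go.etcd.io/etcd
    go get -v go.etcd.io/etcd/etcdctl

Step 4. Test the installation: start etcd and set a key to check that the etcd binary was built correctly.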
    # With GOBIN set as in Step 1, go get installs the binaries into /usr/local/go/bin
    /usr/local/go/bin/etcd
    {"level":"warn","ts":"2020-04-23T15:12:17.368+0800","caller":"etcdmain/etcd.go:89","msg":"'data-dir' was empty; using default","data-dir":"default.etcd"}
    {"level":"info","ts":"2020-04-23T15:12:17.368+0800","caller":"embed/etcd.go:113","msg":"configuring peer listeners","listen-peer-urls":["http://localhost:2380"]}
    {"level":"info","ts":"2020-04-23T15:12:17.990+0800","caller":"membership/cluster.go:524","msg":"set initial cluster version","cluster-id":"cdf818194e3a8c32","local-member-id":"8e9e05c52164694d","cluster-version":"3.5"}
    {"level":"info","ts":"2020-04-23T15:12:17.990+0800","caller":"etcdserver/server.go:1850","msg":"published local member to cluster through raft","local-member-id":"8e9e05c52164694d","local-member-attributes":"{Name:default ClientURLs:[http://localhost:2379]}","request-path":"/0/members/8e9e05c52164694d/attributes","cluster-id":"cdf818194e3a8c32","publish-timeout":"7s"}
    {"level":"info","ts":"2020-04-23T15:12:17.991+0800","caller":"embed/serve.go:139","msg":"serving client traffic insecurely; this is strongly discouraged!","address":"127.0.0.1:2379"}
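The startup output above is one JSON object per line, so it is easy to filter by machine. A minimal sketch, assuming the log has been captured as text (the sample lines are copied from the output above; `parse_log` and `messages_at` are hypothetical helper names):

```python
import json

SAMPLE = """\
{"level":"warn","ts":"2020-04-23T15:12:17.368+0800","caller":"etcdmain/etcd.go:89","msg":"'data-dir' was empty; using default","data-dir":"default.etcd"}
{"level":"info","ts":"2020-04-23T15:12:17.368+0800","caller":"embed/etcd.go:113","msg":"configuring peer listeners","listen-peer-urls":["http://localhost:2380"]}
{"level":"info","ts":"2020-04-23T15:12:17.991+0800","caller":"embed/serve.go:139","msg":"serving client traffic insecurely; this is strongly discouraged!","address":"127.0.0.1:2379"}
"""


def parse_log(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def messages_at(entries, level):
    """Return the msg field of every entry logged at the given level."""
    return [e["msg"] for e in entries if e.get("level") == level]


entries = parse_log(SAMPLE)
warnings = messages_at(entries, "warn")
print(warnings)
print(entries[-1]["address"])  # the client address etcd is serving on
```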
Step 5. Put a key-value pair to test.
    # If OK is printed, etcd is working
    [root@initiator bin]# /usr/local/go/bin/etcdctl put name WeiyiGeek
    OK
    [root@initiator bin]# /usr/local/go/bin/etcdctl get name
    name
    WeiyiGeek
    [root@node3 ~]# etcdctl --endpoints=$ENDPOINTS --write-out="json" get name
    {"header":{"cluster_id":2819294416482393232,"member_id":17704130064291257467,"revision":7300,"raft_term":301},"kvs":[{"key":"bmFtZQ==","create_revision":7300,"mod_revision":7300,"version":1,"value":"V2VpeWlHZWVr"}],"count":1}
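In the JSON output above, the key and value are base64-encoded (`bmFtZQ==` and `V2VpeWlHZWVr`). A small sketch of decoding them, assuming the output has been captured as a string; `decode_kvs` is a hypothetical helper name:

```python
import base64
import json

# JSON output of `etcdctl --write-out="json" get name`, as shown above
RAW = (
    '{"header":{"cluster_id":2819294416482393232,'
    '"member_id":17704130064291257467,"revision":7300,"raft_term":301},'
    '"kvs":[{"key":"bmFtZQ==","create_revision":7300,"mod_revision":7300,'
    '"version":1,"value":"V2VpeWlHZWVr"}],"count":1}'
)


def decode_kvs(raw):
    """Return {key: value} with both sides base64-decoded to text."""
    doc = json.loads(raw)
    result = {}
    for kv in doc.get("kvs", []):
        key = base64.b64decode(kv["key"]).decode("utf-8")
        value = base64.b64decode(kv.get("value", "")).decode("utf-8")
        result[key] = value
    return result


print(decode_kvs(RAW))  # {'name': 'WeiyiGeek'}
```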
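Step 6. The single-instance etcd installation is now complete.

    # Note: on startup etcd opens two ports (by default reachable only from the local host)
    tcp 0 0 127.0.0.1:2379 0.0.0.0:* LISTEN 108333/./etcd    # used by clients
    tcp 0 0 127.0.0.1:2380 0.0.0.0:* LISTEN 108333/./etcd    # used by etcd peers (cluster traffic)

References: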
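* https://etcd.io/docs/v3.4.0/dl-build/
* Docker | macOS (Darwin): https://github.com/etcd-io/etcd/releases/

Besides using etcdctl to create, read, update, and delete data, we can also use curl with GET/PUT requests to operate on the data in etcd. Note that the v2 and v3 APIs differ in some details.

Supplementary note [2020-04-26 10:20:57]: atomic CAS (Compare And Swap). Its basic use is to build a distributed lock service, that is, leader election: the value of a key is modified only when the condition supplied by the client matches the current state in etcd. The conditions currently available for comparison are: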
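* prevExist: checks whether the key exists. If prevExist is true the request is an update; if prevExist is false the request is a create.
* prevValue: checks the key's previous value.
* prevIndex: checks the key's previous modifiedIndex.

[General]

    # show the version
    curl -X GET http://192.168.10.243:2379/version
    # health status
    curl -L http://192.168.10.243:2379/health
    # metrics
    curl -sL http://localhost:22379/metrics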
[version 2]
    # list all keys, pretty-printed as JSON
    curl -LsS http://192.168.10.243:2379/v2/keys | python -m json.tool
    # put: create the key "keyname" with the value "WeiyiGeek"
    curl -X PUT -L http://192.168.10.243:2379/v2/keys/keyname -d value="WeiyiGeek"
    # get: read the key
    curl -X GET -L http://192.168.10.243:2379/v2/keys/keyname
    # delete: remove the key
    curl -X DELETE -L http://127.0.0.1:2379/v2/keys/keyname

    # create a key with a TTL
    curl -X PUT http://127.0.0.1:2379/v2/keys/message -d value="Hello world" -d ttl=30
    curl http://127.0.0.1:2379/v2/keys/message
    {"action":"get","node":{"key":"/message","value":"Hello world","expiration":"2019-09-29T08:08:10.674930705Z","ttl":2,"modifiedIndex":20,"createdIndex":20}}
    # remove the TTL from a key
    curl -X PUT http://127.0.0.1:2379/v2/keys/message -d value="Hello world" -d ttl= -d prevExist=true
    {"action":"update","node":{"key":"/message","value":"Hello world","modifiedIndex":23,"createdIndex":22},"prevNode":{"key":"/message","value":"Hello world","expiration":"2019-09-29T08:10:23.220573683Z","ttl":16,"modifiedIndex":22,"createdIndex":22}}
    # reset the TTL of a key
    curl -X PUT http://127.0.0.1:2379/v2/keys/message -d ttl=30 -d refresh=true -d prevExist=true
    {"action":"update","node":{"key":"/message","value":"Hello world","expiration":"2019-09-29T08:15:29.569276199Z","ttl":30,"modifiedIndex":26,"createdIndex":25},"prevNode":{"key":"/message","value":"Hello world","expiration":"2019-09-29T08:15:01.34698273Z","ttl":2,"modifiedIndex":25,"createdIndex":25}}

    # create a directory with a TTL (PUT is required; with -d alone curl would send a POST)
    curl -X PUT http://127.0.0.1:2379/v2/keys/dir -d ttl=30 -d dir=true
    # update the directory's TTL before it expires
    curl -X PUT http://127.0.0.1:2379/v2/keys/dir -d ttl=60 -d dir=true -d prevExist=true
    # insert data into the directory
    curl -X PUT http://127.0.0.1:2379/v2/keys/dir/message -d value="Hello world"
    # read the data in the directory; once the directory expires, its data is deleted automatically
    curl http://127.0.0.1:2379/v2/keys/dir/message
    {"action":"get","node":{"key":"/dir/message","value":"Hello world","modifiedIndex":51,"createdIndex":51}}
    curl http://127.0.0.1:2379/v2/keys/dir/message
    {"errorCode":100,"message":"Key not found","cause":"/dir","index":52}

    # create in-order keys automatically
    curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job1
    {"action":"create","node":{"key":"/queue/00000000000000000042","value":"Job1","modifiedIndex":42,"createdIndex":42}}
    curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job2
    {"action":"create","node":{"key":"/queue/00000000000000000043","value":"Job2","modifiedIndex":43,"createdIndex":43}}
    curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job3
    {"action":"create","node":{"key":"/queue/00000000000000000044","value":"Job3","modifiedIndex":44,"createdIndex":44}}
    curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job4
    {"action":"create","node":{"key":"/queue/00000000000000000045","value":"Job4","modifiedIndex":45,"createdIndex":45}}
    curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job5
    {"action":"create","node":{"key":"/queue/00000000000000000046","value":"Job5","modifiedIndex":46,"createdIndex":46}}
    curl http://127.0.0.1:2379/v2/keys/queue -XPOST -d value=Job6
    {"action":"create","node":{"key":"/queue/00000000000000000047","value":"Job6","modifiedIndex":47,"createdIndex":47}}
    # list the in-order keys
    curl 'http://127.0.0.1:2379/v2/keys/queue?recursive=true&sorted=true'
    {"action":"get","node":{"key":"/queue","dir":true,"nodes":[{"key":"/queue/00000000000000000042","value":"Job1","modifiedIndex":42,"createdIndex":42},{"key":"/queue/00000000000000000043","value":"Job2","modifiedIndex":43,"createdIndex":43},{"key":"/queue/00000000000000000044","value":"Job3","modifiedIndex":44,"createdIndex":44},{"key":"/queue/00000000000000000045","value":"Job4","modifiedIndex":45,"createdIndex":45},{"key":"/queue/00000000000000000046","value":"Job5","modifiedIndex":46,"createdIndex":46},{"key":"/queue/00000000000000000047","value":"Job6","modifiedIndex":47,"createdIndex":47}],"modifiedIndex":42,"createdIndex":42}}

    # atomic operations (URLs are quoted so the shell does not treat "?" as a glob character)
    # put to an existing key with prevExist=false: rejected because the key already exists
    curl -XPUT 'http://127.0.0.1:2379/v2/keys/foo?prevExist=false' -d value=two
    {"errorCode":105,"message":"Key already exists","cause":"/foo","index":56}
    # switch the condition to prevValue, which checks the key's value:
    # the value is replaced if the condition matches, otherwise a mismatch is reported
    curl -XPUT 'http://127.0.0.1:2379/v2/keys/foo?prevValue=three' -d value=two
    {"errorCode":101,"message":"Compare failed","cause":"[three != one]","index":56}    # value does not match
    curl 'http://127.0.0.1:2379/v2/keys/foo?prevValue=one' -XPUT -d value=two           # value matches, replaced
    {"action":"compareAndSwap","node":{"key":"/foo","value":"two","modifiedIndex":57,"createdIndex":56},"prevNode":{"key":"/foo","value":"one","modifiedIndex":56,"createdIndex":56}}

    # watch a key and wait for changes
    curl 'http://127.0.0.1:2379/v2/keys/message?wait=true'
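The behaviour of the three conditions (prevExist, prevValue, prevIndex) in the atomic-operation requests above can be modelled with a small in-memory store. This is a sketch of the semantics only: `ToyStore` and `CompareFailed` are names invented for this illustration, the error texts merely imitate the responses shown above, and nothing here talks to a real etcd server.

```python
class CompareFailed(Exception):
    """Raised when a precondition does not match the stored state."""


class ToyStore:
    def __init__(self):
        self.index = 0    # global modification counter
        self.nodes = {}   # key -> (value, modified_index)

    def put(self, key, value, prev_exist=None, prev_value=None, prev_index=None):
        current = self.nodes.get(key)
        if prev_exist is False and current is not None:
            raise CompareFailed("Key already exists")
        if prev_exist is True and current is None:
            raise CompareFailed("Key not found")
        if prev_value is not None:
            if current is None:
                raise CompareFailed("Key not found")
            if current[0] != prev_value:
                raise CompareFailed(f"[{prev_value} != {current[0]}]")
        if prev_index is not None:
            if current is None:
                raise CompareFailed("Key not found")
            if current[1] != prev_index:
                raise CompareFailed(f"[{prev_index} != {current[1]}]")
        # All supplied conditions hold: apply the write.
        self.index += 1
        self.nodes[key] = (value, self.index)
        return self.index


store = ToyStore()
store.put("/foo", "one", prev_exist=False)       # create succeeds

for condition in ({"prev_exist": False}, {"prev_value": "three"}):
    try:
        store.put("/foo", "two", **condition)
    except CompareFailed as err:
        print("rejected:", err)

store.put("/foo", "two", prev_value="one")       # condition matches
print(store.nodes["/foo"][0])
```

A create with `prev_exist=False` is the building block for a lock: of several clients racing to create the same key, only one succeeds, and that client becomes the holder.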
[version 3]
    # Note: in 3.x, keys and values must be base64-encoded, and requests use POST rather than PUT
    # https://www.base64encode.org/
    # WeiyiGeek is 'V2VpeWlHZWVr' in Base64
    # etcddemo is 'ZXRjZGRlbW8='
    # btoa("Weiyi")   # "V2VpeWk="
    # btoa("123456")  # "MTIzNDU2"

    # [Put and get keys]: /v3/kv/range and /v3/kv/put
    [root@node3 ~]# curl -L http://localhost:2379/v3/kv/put -X POST -d '{"key": "V2VpeWlHZWVr", "value": "ZXRjZGRlbW8="}'
    [root@node3 ~]# etcdctl get WeiyiGeek
    WeiyiGeek
    etcddemo
    [root@node3 ~]# curl -L http://localhost:2379/v3/kv/range -X POST -d '{"key": "V2VpeWlHZWVr"}'
    {"header":{"cluster_id":"2819294416482393232","member_id":"17704130064291257467","revision":"7303","raft_term":"301"},"kvs":[{"key":"V2VpeWlHZWVr","create_revision":"7303","mod_revision":"7303","version":"1","value":"ZXRjZGRlbW8="}],"count":"1"}
    # get all keys prefixed with "foo"
    # curl -L http://localhost:2379/v3/kv/range -X POST -d '{"key": "Zm9v", "range_end": "Zm9w"}'

    # [Watch keys]: /v3/watch
    curl -N http://localhost:2379/v3/watch -X POST -d '{"create_request": {"key":"Zm9v"} }'
    # {"result":{"header":{"cluster_id":"2819294416482393232","member_id":"17704130064291257467","revision":"7303","raft_term":"301"},"created":true}}
    # {"result":{"header":{"cluster_id":"2819294416482393232","member_id":"17704130064291257467","revision":"7304","raft_term":"314"},"events":[{"kv":{"key":"Zm9v","create_revision":"7301","mod_revision":"7304","version":"2","value":"YmFy"}}]}}

    # [Transactions]: /v3/kv/txn
    # compare on the create revision
    curl -L http://localhost:2379/v3/kv/txn -X POST -d '{"compare":[{"target":"CREATE","key":"Zm9v","createRevision":"2"}],"success":[{"requestPut":{"key":"Zm9v","value":"YmFy"}}]}'
    # {"header":{"cluster_id":"12585971608760269493","member_id":"13847567121247652255","revision":"3","raft_term":"2"},"succeeded":true,"responses":[{"response_put":{"header":{"revision":"3"}}}]}
    # compare on the version
    curl -L http://localhost:2379/v3/kv/txn -X POST -d '{"compare":[{"version":"4","result":"EQUAL","target":"VERSION","key":"Zm9v"}],"success":[{"requestRange":{"key":"Zm9v"}}]}'
    # {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"6","raft_term":"3"},"succeeded":true,"responses":[{"response_range":{"header":{"revision":"6"},"kvs":[{"key":"Zm9v","create_revision":"2","mod_revision":"6","version":"4","value":"YmF6"}],"count":"1"}}]}

    # [Authentication]: /v3/auth
    # create root user
    curl -L http://127.0.0.1:2379/v3/auth/user/add -X POST -d '{"name": "root", "password": "pass"}'
    # create root role
    curl -L http://localhost:2379/v3/auth/role/add -X POST -d '{"name": "root"}'
    # grant root role
    curl -L http://localhost:2379/v3/auth/user/grant -X POST -d '{"user": "root", "role": "root"}'
    # enable auth
    curl -L http://localhost:2379/v3/auth/enable -X POST -d '{}'
    # authenticate against etcd through /v3/auth/authenticate to obtain a token
    # get the auth token for the root user
    curl -L http://localhost:2379/v3/auth/authenticate -X POST -d '{"name": "root", "password": "pass"}'
    # {"header":{"cluster_id":"14841639068965178418","member_id":"10276657743932975437","revision":"1","raft_term":"2"},"token":"sssvIpwfnLAcWAQH.9"}
    # put the returned token into the Authorization request header; authenticated requests can then proceed
    # (the token below comes from a different session than the response above; use the token you received)
    curl -L http://localhost:2379/v3/kv/range -H 'Authorization: ExmKVoSbXOhIonIj.7329' -X POST -d '{"key": "V2VpeWlHZWVr"}'
    {"header":{"cluster_id":"2819294416482393232","member_id":"17704130064291257467","revision":"7307","raft_term":"314"},"kvs":[{"key":"V2VpeWlHZWVr","create_revision":"7303","mod_revision":"7303","version":"1","value":"ZXRjZGRlbW8="}],"count":"1"}
    # disable auth
    curl -L http://localhost:2379/v3/auth/disable -X POST -d '{}' -H 'Authorization: ExmKVoSbXOhIonIj.7329'
    {"header":{"cluster_id":"2819294416482393232","member_id":"17704130064291257467","revision":"7307","raft_term":"314"}}
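The request bodies in the v3 examples above need base64-encoded keys and values, and the prefix query uses `range_end` set to the prefix with its last byte incremented (`foo` becomes `fop`, which encodes to `Zm9w`). A minimal sketch of building those bodies, assuming a non-empty text prefix; the helper names `b64`, `prefix_range_end`, `put_body`, and `prefix_range_body` are invented for this illustration:

```python
import base64
import json


def b64(text):
    """Base64-encode a text key or value the way the JSON gateway expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def prefix_range_end(prefix):
    """Return the first key after all keys that start with prefix.

    Assumes a non-empty prefix. Incrementing the last byte is safe here
    because a UTF-8 encoded byte is never 0xFF.
    """
    raw = bytearray(prefix.encode("utf-8"))
    raw[-1] += 1
    return bytes(raw)


def put_body(key, value):
    """JSON body for POST /v3/kv/put."""
    return json.dumps({"key": b64(key), "value": b64(value)})


def prefix_range_body(prefix):
    """JSON body for POST /v3/kv/range that selects every key with prefix."""
    end = base64.b64encode(prefix_range_end(prefix)).decode("ascii")
    return json.dumps({"key": b64(prefix), "range_end": end})


print(put_body("WeiyiGeek", "etcddemo"))
print(prefix_range_body("foo"))
```

The printed strings can be passed to `curl -d` as in the commands above.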
Reference:
https://etcd.io/docs/v3.4.0/dev-guide/api_grpc_gateway/