What is Apache ZooKeeper?
Imagine we have two or more application servers, running the same or different applications, that need to share some config data. Each instance can modify the data, and when it does, the others need to be aware of the update.
So how can we achieve this behavior? Let's think through two approaches.
1. We could use a common database. Applications read from it and write changes to it, and they query it at a fixed interval to pick up and sync any changes.
2. We could use sockets. The applications would have their own socket clients, and a separate socket server would host the config. Clients would read data, push changes, and get notified over the socket when data changes.
problems with 1st approach
won't get updates instantly
put extra load on the database by frequently querying it
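To put rough numbers on these two drawbacks, here is a small Python simulation. It is only a sketch: there is no real database involved, and the function name `simulate_polling` and its parameters are made up for illustration. It counts how many queries one client issues and how long a change sits unnoticed.

```python
import math


def simulate_polling(change_time, poll_interval, duration):
    """Simulate one client polling a shared store every `poll_interval` seconds.

    The config changes once, at `change_time` (seconds from start).
    Returns (queries issued within `duration`, seconds the change went unnoticed).
    """
    # Polls happen at t = poll_interval, 2 * poll_interval, ... up to duration.
    total_queries = int(duration // poll_interval)
    # The first poll at or after the change is the one that finally sees it.
    first_poll_after_change = math.ceil(change_time / poll_interval) * poll_interval
    delay = first_poll_after_change - change_time
    return total_queries, delay


queries, delay = simulate_polling(change_time=61, poll_interval=30, duration=3600)
print(f"{queries} queries in an hour, change noticed {delay}s late")
# prints: 120 queries in an hour, change noticed 29s late
```

Only one of those 120 queries returns anything new. Shortening the interval reduces the delay but multiplies the load on the database, and every extra application instance adds its own polling on top.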
problems with 2nd approach
you need to set up a socket server from scratch
implement logic on the application level for data reading and changes
availability issues, single point of failure
Now, introducing Apache ZooKeeper, which addresses the problems with both of these approaches.
what is Zookeeper?
it is a distributed config manager, a Distributed Coordination Service for Distributed Applications
when do we use it?
when we need to share config or coordination data between applications
update config data
watch for changes in data
concepts
you can think of its data model as a tree-like structure, similar to a file system directory
so just like a tree, there are nodes and child nodes
a node can have both data and other child nodes
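every node in the tree is called a znode, whether or not it holds data
zookeeper data is kept in memory for low latency (transaction logs and snapshots are also written to disk)
zookeeper is replicated across a set of servers called an ensemble
clients maintain a TCP connection with a server to send requests, receive responses, and get updates
zookeeper marks each update with a number (the zxid) that reflects the order of all transactions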
every node in ZooKeeper is identified by a path (/ being the root path/node)
zookeeper has the concept of ephemeral nodes: these znodes exist as long as the session that created them is active. When the session ends, the znode is automatically deleted
zookeeper has watches: a client can watch a znode and be notified of any changes to it
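The concepts above can be tied together with a toy in-memory model in Python. This is a sketch for intuition only: `ToyZooKeeper` and all of its methods are invented here and are not the real ZooKeeper API. There is no replication or networking, a "session" is just a string, and watches stay registered after firing.

```python
class ToyZooKeeper:
    """In-memory toy model of the znode tree. NOT the real ZooKeeper API."""

    def __init__(self):
        self.zxid = 0  # counter stamped on every update, in order
        self.nodes = {"/": {"data": b"", "owner": None, "zxid": 0}}
        self.watches = {}  # path -> list of callbacks

    def _stamp(self, path, event):
        # Every update gets the next number, then watchers of the path are notified.
        self.zxid += 1
        if path in self.nodes:
            self.nodes[path]["zxid"] = self.zxid
        for callback in self.watches.get(path, []):
            callback(event, path)

    def create(self, path, data=b"", session=None):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        # A node created with a session is ephemeral: it lives only as long
        # as that session does.
        self.nodes[path] = {"data": data, "owner": session, "zxid": 0}
        self._stamp(path, "created")

    def set(self, path, data):
        self.nodes[path]["data"] = data
        self._stamp(path, "changed")

    def watch(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def close_session(self, session):
        # Ending a session removes every ephemeral node it owned.
        owned = [p for p, n in self.nodes.items()
                 if n["owner"] is not None and n["owner"] == session]
        for path in owned:
            del self.nodes[path]
            self._stamp(path, "deleted")


zk = ToyZooKeeper()
events = []
zk.create("/app")
zk.create("/app/config", b"count=0")
zk.watch("/app/config", lambda event, path: events.append((event, path)))
zk.set("/app/config", b"count=1")
zk.create("/app/worker-1", session="session-A")  # ephemeral node
zk.close_session("session-A")

print(events)                        # [('changed', '/app/config')]
print("/app/worker-1" in zk.nodes)   # False
print(zk.zxid)                       # 5
```

Five updates happened (three creates, one set, one delete), so the counter ends at 5, and the ephemeral node disappears as soon as its session is closed.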
some simple APIs provided by zookeeper
create : creates a node at a location in the tree
delete : deletes a node
exists : tests if a node exists at a location
get data : reads the data from a node
set data : writes data to a node
get children : retrieves a list of children of a node
sync : waits for data to be propagated
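From application code, these operations might look roughly like the following Python sketch. It assumes the third-party kazoo client library (`pip install kazoo`) and a server reachable at `127.0.0.1:2181`. The helper names `connect` and `api_walkthrough` and the `/api-demo` path are made up for this example.

```python
def connect(hosts="127.0.0.1:2181"):
    """Open a session. Assumes kazoo is installed and a server runs at `hosts`."""
    from kazoo.client import KazooClient  # imported lazily; third-party package

    zk = KazooClient(hosts=hosts)
    zk.start()
    return zk


def api_walkthrough(zk, path="/api-demo"):
    """Run each basic operation once against a connected kazoo-style client."""
    if zk.exists(path) is None:            # exists: returns None when absent
        zk.create(path, b"count=0")        # create: node data is bytes
    zk.sync(path)                          # sync: wait for data to be propagated
    data, stat = zk.get(path)              # get data: returns (bytes, stat)
    zk.set(path, b"count=1")               # set data
    zk.create(path + "/child", b"")        # create a child node
    children = zk.get_children(path)       # get children: list of child names
    zk.delete(path, recursive=True)        # delete the node and its children
    return data, children


# Usage against a running server (not executed here):
#   zk = connect()
#   print(api_walkthrough(zk))
#   zk.stop()
```

The walkthrough takes the client as a parameter, so the connection details stay in one place and the function can be tried against any object that offers the same methods.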
Hands-on
we will be using docker for this
let's create a zookeeper container
docker run --name my-zk zookeeper:latest
now let's enter the container from a new terminal
docker exec -it my-zk /bin/bash
we are going to use zkCli.sh to interact with the zookeeper server as a client. To start the client (the version in the path depends on the image, so adjust it if yours differs):
/apache-zookeeper-3.8.0-bin/bin/zkCli.sh
create a node on zkCli terminal
create /test
set data on node
set /test "count=0"
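or we could create the node and set its data with a single command (use this instead of the two commands above, since create fails if /test already exists)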
create /test "count=0"
now to retrieve the data
get /test
/* count=0 */
watch for data change
now let's open up a new terminal, enter the container, and start zkCli.sh again as before
docker exec -it my-zk /bin/bash
watch for changes in /test node
addWatch /test
now, on the previous terminal, set new data for the /test node, and we will see the second terminal react to the change
set /test count=1
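An application would normally react to such a change in code rather than in a terminal. Here is a rough Python sketch of that, assuming the third-party kazoo client library, whose `DataWatch` calls a function with the current data and again on each change. The helpers `parse_config` and `watch_config` are made up for this example, and the `key=value` format is simply the one used in the commands above.

```python
def parse_config(raw):
    """Turn b'count=1' into {'count': '1'}. Missing or empty data gives {}."""
    if not raw:
        return {}
    lines = raw.decode("utf-8").splitlines()
    return dict(line.split("=", 1) for line in lines if "=" in line)


def watch_config(zk, path, on_update):
    """Call on_update(config_dict) with the current data and on every change.

    `zk` is assumed to be a started kazoo KazooClient (or anything that
    offers a compatible DataWatch method).
    """
    def _handler(data, stat):
        # data is None when the node does not exist (for example, after a delete).
        on_update(parse_config(data))

    zk.DataWatch(path, _handler)


# Usage against a running server (not executed here):
#   from kazoo.client import KazooClient
#   zk = KazooClient(hosts="127.0.0.1:2181")
#   zk.start()
#   watch_config(zk, "/test", print)
```

With this in place, running `set /test count=1` in zkCli should cause the callback to receive `{'count': '1'}`.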