
ZooKeeper Primer

by Animesh Kumar · Jun. 14, 10

Distributed collaborative applications involve a set of processes or agents interacting with one another to accomplish a common goal. They run in wide-area environments with little or no knowledge of the infrastructure and almost no control over the available resources. They also need to sequence and order events and to ensure atomicity of actions. Above all, the application needs to protect itself from nightmarish bugs such as race conditions, deadlocks, and partial failures.

ZooKeeper helps you build a distributed application by acting as a coordination service.

It is reliable and highly available. It exposes a simple set of primitives upon which distributed applications can build higher-level services for

  • synchronization,
  • configuration maintenance,
  • groups,
  • naming,
  • leader elections and other niche needs.

What lies beneath?

ZooKeeper maintains a shared hierarchical namespace modeled after standard file systems. The namespace consists of data registers called znodes, which are similar to files and directories.

Note: znodes are kept primarily in memory, with a logged backup on disk for reliability. This means that all the data held in znodes must fit into memory, so each znode’s data must be small: at most 1 MB. In return, ZooKeeper offers high throughput and low latency.

Znodes are identified by unique absolute paths, which are “/”-delimited Unicode strings. To help achieve uniqueness, ZooKeeper provides sequential znodes: it appends a monotonically increasing sequence number, maintained per parent znode and zero-padded to ten digits, to the requested path. For example, the path “/zoo-1/tiger/white-” can be assigned the sequence number 5 and become “/zoo-1/tiger/white-0000000005”.
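To make the naming rule concrete, here is a minimal Python sketch of how a sequence suffix could be appended. It is a toy model, not ZooKeeper code: the class `SequenceNamer` and its method `create_sequential` are hypothetical names made up for this illustration. It assumes one counter per parent znode and a ten-digit, zero-padded suffix.

```python
from collections import defaultdict


class SequenceNamer:
    """Toy model of sequential znode naming (hypothetical, not the ZooKeeper API)."""

    def __init__(self):
        # One counter per parent path; every counter starts at 0.
        self._next = defaultdict(int)

    def create_sequential(self, prefix):
        # The parent is everything before the last "/" of the requested path.
        parent = prefix.rsplit("/", 1)[0] or "/"
        seq = self._next[parent]
        self._next[parent] += 1
        # Append the counter as a ten-digit, zero-padded suffix.
        return f"{prefix}{seq:010d}"


namer = SequenceNamer()
white_0 = namer.create_sequential("/zoo-1/tiger/white-")
white_1 = namer.create_sequential("/zoo-1/tiger/white-")
bengal_2 = namer.create_sequential("/zoo-1/tiger/bengal-")  # same parent, same counter
cub_0 = namer.create_sequential("/zoo-1/lion/cub-")         # different parent, fresh counter
print(white_0, white_1, bengal_2, cub_0)
```

Because the counter belongs to the parent in this model, two children with different name prefixes still receive distinct, increasing numbers, which is what lets clients order them.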

  1. A client can create a znode, store up to 1 MB of data in it, and attach as many child znodes as it wants.
  2. Data access to and from a znode is always atomic: a read or write either completes in its entirety or fails.
  3. There are no renames and no append semantics.
  4. Each znode has an access control list (ACL) that restricts who can do what.
  5. Znodes maintain version numbers for data changes and ACL changes, along with timestamps, to allow cache validation and coordinated updates.
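The version numbers in the last point are what make coordinated updates possible: a writer states which version it last saw, and the write is rejected if someone else got there first. The sketch below is a toy in-memory model of that idea, not the ZooKeeper client API. `ToyZnode`, `BadVersionError`, and `MAX_BYTES` are hypothetical names, and the 1 MB cap simply reuses the figure quoted above. Treating `-1` as "any version" mirrors the convention of ZooKeeper's conditional writes.

```python
class BadVersionError(Exception):
    """Raised when a writer's expected version is stale (hypothetical name)."""


class ToyZnode:
    """Toy znode: whole-value reads and writes guarded by a data version."""

    MAX_BYTES = 1024 * 1024  # assumption: the 1 MB limit quoted in the text

    def __init__(self, data=b""):
        self._check_size(data)
        self.data = data
        self.version = 0

    @classmethod
    def _check_size(cls, data):
        if len(data) > cls.MAX_BYTES:
            raise ValueError("znode data exceeds the 1 MB limit")

    def set_data(self, data, expected_version):
        # -1 means "do not check"; otherwise the versions must match exactly.
        if expected_version != -1 and expected_version != self.version:
            raise BadVersionError(
                f"expected version {expected_version}, found {self.version}"
            )
        self._check_size(data)
        # The value is replaced as a whole; there is no append.
        self.data = data
        self.version += 1
        return self.version


node = ToyZnode(b"tigers=2")
seen = node.version               # two clients both read version 0
node.set_data(b"tigers=3", seen)  # the first writer succeeds
try:
    node.set_data(b"tigers=9", seen)  # the second writer is now stale
    stale_write_rejected = False
except BadVersionError:
    stale_write_rejected = True
print(node.data, node.version, stale_write_rejected)
```

The stale writer has to re-read the znode and retry, which is the usual optimistic-concurrency loop.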

Znodes can be one of two types: ephemeral and persistent. Once set, the type can’t be changed.

  1. Ephemeral znodes are deleted by ZooKeeper when the creating client’s session ends, while persistent znodes remain until they are explicitly deleted.
  2. Ephemeral znodes can’t have children.
  3. Both types of znodes are visible to all clients permitted by the ACL policy.
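The difference between the two types is easiest to see when a session goes away. The following Python sketch is a toy model under stated assumptions, not ZooKeeper itself: `ToyTree`, its methods, and the session ids are hypothetical, and ACLs are left out entirely.

```python
class ToyTree:
    """Toy namespace tracking which session, if any, owns each znode."""

    def __init__(self):
        # path -> owning session id for ephemeral znodes, None for persistent ones
        self._owner = {}

    def create(self, path, session=None, ephemeral=False):
        parent = path.rsplit("/", 1)[0]
        if parent:  # an empty parent means the znode sits directly under the root
            if parent not in self._owner:
                raise KeyError(f"parent {parent} does not exist")
            if self._owner[parent] is not None:
                raise ValueError("ephemeral znodes cannot have children")
        if ephemeral and session is None:
            raise ValueError("an ephemeral znode needs an owning session")
        self._owner[path] = session if ephemeral else None

    def close_session(self, session):
        # Everything the session created as ephemeral disappears with it.
        for path in [p for p, s in self._owner.items() if s == session]:
            del self._owner[path]

    def paths(self):
        return sorted(self._owner)


tree = ToyTree()
tree.create("/workers")                                    # persistent
tree.create("/workers/w1", session="s1", ephemeral=True)   # owned by session s1
tree.create("/workers/w2", session="s2", ephemeral=True)   # owned by session s2
before = tree.paths()
tree.close_session("s1")
after = tree.paths()
print(before, after)
```

This is why ephemeral znodes are a natural fit for group membership: a member that crashes or disconnects drops out of the list without any cleanup code.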

Up and running

There is already plenty of literature on installing ZooKeeper on Linux machines, so I am going to focus on how to install ZooKeeper on Windows machines.

  1. Download and install Cygwin. http://www.cygwin.com/
  2. Download a stable release of ZooKeeper. http://hadoop.apache.org/zookeeper/releases.html
  3. Unzip ZooKeeper to some directory, say, d:/ilabs/zookeeper-3.3.1
  4. Add a new environment variable ZOOKEEPER_INSTALL and point it to d:/ilabs/zookeeper-3.3.1
  5. Edit the PATH variable and append $ZOOKEEPER_INSTALL/bin to it.
  6. Now start Cygwin.

Now, start the ZooKeeper server.

$ zkServer.sh start

Ouch! It threw an error. ZooKeeper exited abnormally because it could not find the configuration file, zoo.cfg, which it expects in the $ZOOKEEPER_INSTALL/conf directory. This is a standard Java properties file.

Go ahead and create a zoo.cfg file in the conf directory. Open it up, and add the properties below:

# the number of milliseconds of each tick
tickTime=2000

# the directory where the snapshot is stored
dataDir=d:/ilabs/zoo-data/

# the port at which the clients will connect
clientPort=2181
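The property names are case-sensitive (tickTime, dataDir, clientPort), so a quick sanity check before starting the server can save a restart. Below is a small Python sketch of such a check. It is my own helper, not part of ZooKeeper: `parse_properties`, `missing_keys`, and `EXPECTED_KEYS` are hypothetical names, and the parser handles only simple `key=value` lines and `#` comments, not the full Java properties syntax.

```python
EXPECTED_KEYS = ("tickTime", "dataDir", "clientPort")  # the keys set in the file above


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_keys(text):
    """Return the expected keys that are absent, compared case-sensitively."""
    props = parse_properties(text)
    return [key for key in EXPECTED_KEYS if key not in props]


ZOO_CFG = """\
# the number of milliseconds of each tick
tickTime=2000

# the directory where the snapshot is stored
dataDir=d:/ilabs/zoo-data/

# the port at which the clients will connect
clientPort=2181
"""

props = parse_properties(ZOO_CFG)
print(missing_keys(ZOO_CFG))          # every expected key is present
print(missing_keys(ZOO_CFG.lower()))  # lowercased keys no longer match
```

To check a real file, read it with `open(path).read()` and pass the text to `missing_keys`.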

Go back to Cygwin and issue the same command again. This time ZooKeeper should load properly.

Now, connect to ZooKeeper. Open a new Cygwin window and issue the following command.

$ zkCli.sh

This connects to your ZooKeeper server running at localhost:2181 by default and opens the zk console.

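Let’s create a znode, say /zoo-1

[zk: localhost:2181(CONNECTED) 1] create /zoo-1 "hello world!"

With no flags, create makes a persistent znode (the -s flag would make it sequential and -e would make it ephemeral). hello world! is the data you assign to the znode (/zoo-1). No ACL argument is given, so the CLI applies its default, open ACL.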

To see all znodes,

[zk: localhost:2181(CONNECTED) 2] ls /
[zoo-1, zookeeper]

This means there are two nodes at the root level, /zoo-1 and /zookeeper. ZooKeeper uses the /zookeeper subtree to store management information, such as information on quotas.

For more commands, type help. To explore the command-line tools further, refer to http://hadoop.apache.org/zookeeper/docs/current/zookeeperstarted.html

