Apache Kafka Security: Features and Uses of Kafka
Want to learn more about using Apache Kafka Security? Check out this post to learn more about Kafka features and use cases.
Apache Kafka Security
The Kafka community added a number of security features in release 0.9.0.0. They can be used separately or together to enhance security in a Kafka cluster.
The currently supported security measures are:
1. Authentication of connections to Kafka brokers from clients, other brokers, and tools, using either SSL or SASL. Kafka supports the following SASL mechanisms:
- SASL/GSSAPI (Kerberos) – starting at version 0.9.0.0
- SASL/PLAIN – starting at version 0.10.0.0
- SASL/SCRAM-SHA-256 and SASL/SCRAM-SHA-512 – starting at version 0.10.2.0
2. It also offers authentication of connections from brokers to ZooKeeper.
3. Encryption of data transferred between brokers and Kafka clients, or between brokers and tools, using SSL.
4. Authorization of read/write operations by clients.
5. Pluggable authorization, with support for integration with external authorization services.
Note: Security is optional; non-secured clusters are supported as well.
Need for Kafka Security
Apache Kafka often plays the role of an internal middle layer, which enables our back-end systems to share real-time data feeds with each other through Kafka topics. With a standard Kafka setup, any user or application can write messages to any topic, as well as read data from any topic. However, Kafka security becomes necessary when our company moves towards a shared tenancy model in which multiple teams and applications use the same Kafka cluster, or when the Kafka cluster starts onboarding critical and confidential information.
Problems Kafka Security Is Solving
There are three components of Kafka Security:
Encryption of Data In-Flight Using SSL/TLS
It keeps data encrypted between our producers and Kafka, as well as between our consumers and Kafka. This is the same pattern that is commonly used when browsing the web.
Authentication Using SSL or SASL
SSL and SASL allow our producers and consumers to prove their identity to the Kafka cluster. It is a very secure way to let our clients assert an identity, and it helps tremendously with authorization.
Authorization Using ACLs
To determine whether a particular client is authorized to write to or read from a topic, the Kafka brokers check clients against access control lists (ACLs).
SSL encryption solves the problem of the man-in-the-middle (MITM) attack. While being routed to the Kafka cluster, our packets travel across networks and hop from machine to machine. If our data is PLAINTEXT, any of these routers could read its content.
With encryption enabled and SSL certificates carefully set up, our data is encrypted and securely transmitted over the network. Only the first and the final machine possess the ability to decrypt the packet being sent with SSL.
However, this encryption comes at a cost: the CPU of both the Kafka clients and the Kafka brokers is now used to encrypt and decrypt packets. In practice, the performance cost of SSL is usually negligible.
Note: The encryption is only in-flight; the data still sits unencrypted on our broker's disk.
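As a rough illustration of what enabling in-flight encryption looks like on the client side, here is a minimal Python sketch that renders a client properties file. The keys security.protocol, ssl.truststore.location, and ssl.truststore.password are standard Kafka client settings; the helper names, the truststore path, and the password are hypothetical placeholders, not values from this article.

```python
def render_properties(settings):
    """Render a dict as Java-style .properties text, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


def ssl_client_settings(truststore_path, truststore_password):
    """Client settings that turn on SSL encryption toward the brokers.

    The truststore holds the CA certificate used to verify the brokers.
    Both arguments are placeholders supplied by the caller.
    """
    return {
        "security.protocol": "SSL",
        "ssl.truststore.location": truststore_path,
        "ssl.truststore.password": truststore_password,
    }


if __name__ == "__main__":
    # Hypothetical path and password, for illustration only.
    settings = ssl_client_settings("/path/to/client.truststore.jks", "changeit")
    print(render_properties(settings), end="")
```

The rendered text could be saved as a client properties file and passed to a producer or consumer. The brokers need a matching SSL listener, which is not shown here.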
Kafka Authentication (SSL and SASL)
Authentication of Kafka clients to our brokers is possible in two ways: SSL and SASL.
SSL Authentication in Kafka
It leverages a capability of SSL that is also called two-way authentication. A certificate signed by a certificate authority is issued to each of our clients, which allows our Kafka brokers to verify the identity of the clients.
This is the most common setup, especially when we are using a managed Kafka cluster from a provider such as Heroku, Confluent Cloud, or CloudKarafka.
SASL Authentication in Kafka
SASL stands for Simple Authentication and Security Layer. The basic concept is that the authentication mechanism and the Kafka protocol are separate from each other. It is very popular with Big Data systems, and Hadoop setups in particular.
Kafka supports the following shapes and forms of SASL:
SASL PLAIN
This is a classic username/password combination. The usernames and passwords must be stored on the Kafka brokers in advance, and each change requires a rolling restart, so it is the less recommended kind of security. Make sure to enable SSL encryption when using SASL/PLAIN, so that credentials aren't sent as PLAINTEXT on the network.
SASL SCRAM
This is a username/password combination alongside a challenge, which makes it more secure. Password hashes are stored in ZooKeeper, which lets us scale security without rebooting brokers. Make sure to enable SSL encryption when using SASL/SCRAM, so that credentials aren't sent as PLAINTEXT on the network.
SASL GSSAPI (Kerberos)
This is also a very secure way of providing authentication. It is based on the Kerberos ticket mechanism, whose most common implementation is Microsoft Active Directory. Since it allows companies to manage security from within their Kerberos server, SASL/GSSAPI is a great choice for big enterprises. SSL encryption of communications is optional with SASL/GSSAPI. Setting up Kafka with Kerberos is the most difficult option, but worth it in the end.
- (WIP) SASL Extension (KIP-86 in progress)
This will make it easier to configure new or custom SASL mechanisms that are not implemented in Kafka.
- (WIP) SASL OAUTHBEARER (KIP-255 in progress)
This will allow us to leverage an OAuth2 token for authentication.
To keep things simple, we can use SASL/SCRAM or SASL/GSSAPI (Kerberos) for the authentication layer.
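To make the SASL/SCRAM option more concrete, the following Python sketch assembles the client-side settings for it. The keys security.protocol, sasl.mechanism, and sasl.jaas.config, and the ScramLoginModule class name, are standard Kafka client settings; the function name and the example credentials are hypothetical, and the sketch assumes credentials that contain no double quotes.

```python
SCRAM_MECHANISMS = ("SCRAM-SHA-256", "SCRAM-SHA-512")
SCRAM_LOGIN_MODULE = "org.apache.kafka.common.security.scram.ScramLoginModule"


def scram_client_settings(username, password, mechanism="SCRAM-SHA-256"):
    """Build client settings for SASL/SCRAM over an SSL-encrypted connection."""
    if mechanism not in SCRAM_MECHANISMS:
        raise ValueError(f"unsupported SCRAM mechanism: {mechanism}")
    # Assumption: no escaping is attempted, so quotes are rejected outright.
    if '"' in username or '"' in password:
        raise ValueError("this sketch does not escape double quotes in credentials")
    jaas = (
        f'{SCRAM_LOGIN_MODULE} required '
        f'username="{username}" password="{password}";'
    )
    return {
        # SASL_SSL keeps the credential exchange off the wire in plaintext.
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": mechanism,
        "sasl.jaas.config": jaas,
    }


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    for key, value in scram_client_settings("alice", "alice-secret").items():
        print(f"{key}={value}")
```

Because the protocol is SASL_SSL, a real client would also need truststore settings for the SSL layer, which are omitted here for brevity.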
Kafka Authorization (ACL)
Once our Kafka clients are authenticated, Kafka needs to be able to decide what they can and cannot do. This is where authorization comes in, which is controlled by Access Control Lists (ACLs).
ACLs are very helpful because they can help us prevent disasters. For example, suppose we have a topic that should be writable only from a subset of clients or hosts. We want to prevent our average user from writing anything to this topic, which prevents data corruption and deserialization errors. ACLs are also great if we have some sensitive data and we need to prove to regulators that only certain applications or users can access that data.
We can use the kafka-acls command to add ACLs. It also has some facilities and shortcuts to add producers or consumers.
kafka-acls --topic test --producer --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:alice
The result being:
Adding ACLs for resource `Topic:test`:
User:alice has Allow permission for operations: Describe from hosts: *
User:alice has Allow permission for operations: Write from hosts: *
Adding ACLs for resource `Cluster:kafka-cluster`:
User:alice has Allow permission for operations: Create from hosts: *
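To show how a broker might use these entries, here is a toy Python model of an ACL check. It is a simplified illustration, not Kafka's actual authorizer: it handles only Allow entries, ignores Deny rules, super users, and wildcard resources, and denies anything that no entry matches. The Acl class and is_allowed function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    principal: str   # e.g. "User:alice"
    operation: str   # e.g. "Write"
    resource: str    # e.g. "Topic:test"
    host: str = "*"  # "*" matches any host


def is_allowed(acls, principal, operation, resource, host):
    """Return True if at least one Allow entry matches; otherwise deny."""
    return any(
        acl.principal == principal
        and acl.operation == operation
        and acl.resource == resource
        and acl.host in ("*", host)
        for acl in acls
    )


# The three Allow entries reported by the kafka-acls output above.
ALICE_PRODUCER_ACLS = [
    Acl("User:alice", "Describe", "Topic:test"),
    Acl("User:alice", "Write", "Topic:test"),
    Acl("User:alice", "Create", "Cluster:kafka-cluster"),
]

# alice may write to the topic, but was never granted Read on it.
print(is_allowed(ALICE_PRODUCER_ACLS, "User:alice", "Write", "Topic:test", "10.0.0.5"))  # True
print(is_allowed(ALICE_PRODUCER_ACLS, "User:alice", "Read", "Topic:test", "10.0.0.5"))   # False
```

The second check fails because the --producer shortcut grants Describe and Write on the topic, but not Read.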
Note: With the default SimpleAclAuthorizer, ACLs are stored only in ZooKeeper. Therefore, ensure that only Kafka brokers may write to ZooKeeper (zookeeper.set.acl=true). Otherwise, any user could come in and edit ACLs, defeating the point of security.
There are two necessary steps in order to enable ZooKeeper authentication on brokers:
- Create a JAAS login file and set the appropriate system property to point to it.
- Set the configuration property zookeeper.set.acl in each broker to true.
The metadata stored in ZooKeeper for the Kafka cluster is world-readable, but only brokers can modify it, because inappropriate manipulation of that data can cause cluster disruption. We also recommend limiting access to ZooKeeper via network segmentation.
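The two broker-side steps listed above can be sketched as a small Python helper. The article does not name the system property; this sketch assumes the standard JAAS property java.security.auth.login.config, and the function name and file path are hypothetical.

```python
def zookeeper_auth_broker_settings(jaas_file_path):
    """Return the JVM option and broker property for ZooKeeper authentication.

    Assumption: the JAAS login file is located through the standard
    java.security.auth.login.config system property.
    """
    jvm_option = f"-Djava.security.auth.login.config={jaas_file_path}"
    broker_property = "zookeeper.set.acl=true"
    return jvm_option, broker_property


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    jvm_option, broker_property = zookeeper_auth_broker_settings(
        "/etc/kafka/kafka_server_jaas.conf"
    )
    print(jvm_option)       # passed to the broker JVM at startup
    print(broker_property)  # added to each broker's configuration
```

The JVM option goes on the broker's command line, while the property belongs in each broker's configuration file.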
If we are running a version of Kafka that does not support security, or one with security simply disabled, and we want to make the cluster secure, we need to execute the following steps to enable ZooKeeper authentication with minimal disruption to our operations:
- Perform a rolling restart setting the JAAS login file, which enables brokers to authenticate. At the end of the rolling restart, brokers are able to manipulate znodes with strict ACLs, but they will not create znodes with those ACLs.
- Perform a second rolling restart of brokers, this time setting the configuration parameter zookeeper.set.acl to true, which enables the use of secure ACLs when creating znodes.
- Execute the ZkSecurityMigrator tool by running the script ./bin/zookeeper-security-migration.sh with zookeeper.acl set to secure. This tool traverses the corresponding sub-trees, changing the ACLs of the znodes.
With the following steps, we can turn off authentication in a secure cluster:
- Perform a rolling restart of brokers setting the JAAS login file, which enables brokers to authenticate, but setting zookeeper.set.acl to false. At the end of the rolling restart, brokers stop creating znodes with secure ACLs, but they are still able to authenticate and manipulate all znodes.
- Execute the ZkSecurityMigrator tool by running the script ./bin/zookeeper-security-migration.sh with zookeeper.acl set to unsecure. This tool traverses the corresponding sub-trees, changing the ACLs of the znodes.
- Perform a second rolling restart of brokers, this time omitting the system property that sets the JAAS login file.
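Both procedures call the same migration script and differ only in the value of zookeeper.acl. A minimal Python sketch of a helper that builds the command line for either direction could look like this. The function name is hypothetical, and the default connect string simply reuses the localhost address from this article.

```python
def migration_command(acl_mode, zk_connect="localhost:2181"):
    """Build the argument list for the ZooKeeper security migration script.

    acl_mode is "secure" when enabling authentication and "unsecure"
    when turning it off again.
    """
    if acl_mode not in ("secure", "unsecure"):
        raise ValueError(f"acl_mode must be 'secure' or 'unsecure', got {acl_mode!r}")
    return [
        "./bin/zookeeper-security-migration.sh",
        f"--zookeeper.acl={acl_mode}",
        f"--zookeeper.connect={zk_connect}",
    ]


if __name__ == "__main__":
    print(" ".join(migration_command("secure")))
    print(" ".join(migration_command("unsecure")))
```

Returning a list rather than a single string means the result can be handed directly to subprocess.run without shell quoting concerns.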
Example of how to run the migration tool:
./bin/zookeeper-security-migration.sh --zookeeper.acl=secure --zookeeper.connect=localhost:2181
Run this to see the full list of parameters:
Migrating the ZooKeeper Ensemble
It is also necessary to enable authentication on the ZooKeeper ensemble. To do so, we need to perform a rolling restart of the server and set a few properties.
In this Kafka security tutorial, we have seen an introduction to Kafka Security. We discussed the need for Kafka Security and the problems it solves, then covered SSL encryption and Kafka authentication with SSL and SASL. For authorization, we looked at Kafka topic authorization with ACLs. Finally, we looked at ZooKeeper authentication and its major steps. If you have any questions, feel free to ask in the comment section below!
Published at DZone with permission of Rinu Gour. See the original article here.