
Kafka Authorization as a Graph

Learn how to set up a Kafka cluster on Kubernetes using Strimzi and kafka-acl-viewer, an open source tool that makes it easy to visualize access control lists.


In this article, we describe an open source tool that makes it possible to visualize access control lists in Kafka to help you get an overview of how access in a Kafka cluster is configured.

Using access control lists (ACLs) to limit access in a Kafka cluster is a great way to secure your data, but it can quickly become difficult to keep track of who can access what. We will set up a Kafka cluster on Kubernetes using Strimzi and deploy kafka-acl-viewer to visualize the ACLs as a graph. Strimzi leverages Kubernetes Custom Resources and the Operator pattern so that we can work with Kafka in the same declarative manner that we are used to with Kubernetes.

Strimzi Installation

To get started, you will need a Kubernetes cluster to run Strimzi and kafka-acl-viewer. During development I usually use Minikube, but any Kubernetes cluster will do just fine. If you do go with Minikube, note that Strimzi needs more RAM than the default 2GB. To avoid issues, configure Minikube with at least 4GB of RAM when you start your cluster.

Shell

minikube start --memory=4g


When your cluster is up and running, go ahead and create a namespace to run things in.

Shell

kubectl create namespace kafka


The rest of the post and its examples assume you are using a namespace called kafka.

Now you're ready to apply the Strimzi installation from GitHub. Note the piping through sed, which rewrites the namespace fields in the release file so that everything is created in the namespace you created above.

Shell

curl -L https://github.com/strimzi/strimzi-kafka-operator/releases/download/0.17.0/strimzi-cluster-operator-0.17.0.yaml \
  | sed 's/namespace: .*/namespace: kafka/' \
  | kubectl apply -f - -n kafka


This will create the Strimzi deployments in charge of managing your Kafka installation, Cluster Roles, Cluster Role Bindings, and the CRDs (Custom Resource Definitions) you use to configure and manage your cluster.
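To make the sed step in the install command concrete, here is a small sketch that runs the same substitution over a made-up manifest fragment. The fragment and the namespace value myproject are assumptions for illustration only; the real release file is much longer.

```shell
# A made-up fragment standing in for part of the Strimzi release YAML.
sample='kind: ServiceAccount
metadata:
  name: strimzi-cluster-operator
  namespace: myproject'

# Same substitution as in the install command: every "namespace: <anything>"
# becomes "namespace: kafka", whatever the original value was.
rewritten=$(printf '%s\n' "$sample" | sed 's/namespace: .*/namespace: kafka/')

# Show the rewritten fragment.
printf '%s\n' "$rewritten"
```

The pattern starts matching at the word namespace, so the YAML indentation in front of it is left untouched.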

The next step is to create our Kafka cluster. This is done through one of these CRDs, the Kafka custom resource. To enable the use of ACLs we need some kind of authentication so that the clients have an identity to tie the access to. Luckily, Strimzi makes it really easy to set up TLS authentication. Strimzi will automatically issue certificates for your clients and store them as Kubernetes Secrets for easy access in your application deployments. More on that shortly. First, let us bring up the Kafka cluster.

The following Kafka resource will create a persistent Kafka cluster with one node and TLS authentication enabled.

Shell

cat <<EOF | kubectl apply -f -
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster
  namespace: kafka
spec:
  kafka:
    version: 2.4.0
    replicas: 1
    listeners:
      tls:
        authentication:
          type: tls
    authorization:
      type: simple
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      log.message.format.version: "2.4"
    storage:
      type: jbod
      volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
EOF


When you apply the Kafka resource, the Strimzi Cluster Operator will start up a Kafka cluster according to the configuration. This will take a few minutes, depending on how fast your cluster is. The following command can be used to wait for everything to be ready.

Shell

kubectl wait kafka/my-cluster --for=condition=Ready --timeout=5m -n kafka


Your Kafka cluster should now be up and running!
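If the wait command times out instead, it can help to look at what the operator reports on the resource itself. The helper below is a minimal sketch, not part of Strimzi; the function name kafka_conditions is made up here, and it assumes the Kafka resource exposes its conditions under status.conditions, which is what kubectl wait inspects.

```shell
# Hypothetical helper: print the status condition types reported on a
# Kafka custom resource.
# Arguments: resource name (default my-cluster), namespace (default kafka).
kafka_conditions() {
  kubectl get kafka "${1:-my-cluster}" -n "${2:-kafka}" \
    -o 'jsonpath={.status.conditions[*].type}'
}
```

Calling kafka_conditions with no arguments queries my-cluster in the kafka namespace. Running kubectl get pods -n kafka alongside it shows which pods are still starting.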

Topics and ACLs Using Strimzi

On to the fun part: creating topics and ACLs! Normally, managing Kafka topics and ACLs is quite a headache involving long-winded Kafka CLI commands. Luckily, Strimzi makes this a much more user-friendly experience by leveraging Kubernetes CRDs, just like the Kafka resource we used in the previous section to create the cluster.

Here is an example of a KafkaTopic resource describing the configuration for a topic in the cluster.

Shell

cat <<EOF | kubectl apply -f -
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaTopic
metadata:
  name: sales
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 10
  replicas: 1
EOF


Applying this in your Kubernetes cluster will cause the Strimzi Topic Operator to pick it up and create the topic in your Kafka cluster. The topic will be given the same name as the Kubernetes resource. The strimzi.io/cluster label specifies which Kafka cluster the topic should be created in, which matters if you have more than one Kafka cluster in the namespace. The spec section defines topic-specific configuration such as the number of partitions and how the topic should be replicated. Since we only run a single node in the Kafka cluster, we are limited to one replica.
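Since a KafkaTopic is just a handful of fields, it is easy to template when you want several of them. The following is a minimal sketch under the assumptions of this post (namespace kafka, cluster my-cluster); the function name render_kafka_topic is made up for illustration.

```shell
# Hypothetical helper: print a KafkaTopic manifest to stdout.
# Arguments: topic name, partitions (default 1), replicas (default 1).
render_kafka_topic() {
  cat <<EOF
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaTopic
metadata:
  name: $1
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: ${2:-1}
  replicas: ${3:-1}
EOF
}

# Preview the manifest for a ten-partition topic called shipments.
render_kafka_topic shipments 10
```

Piping the output into kubectl apply -f - would create the topic, for example render_kafka_topic shipments 10 | kubectl apply -f -.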

Kafka users and their access are described in a similar manner with a KafkaUser resource. Here is an example of such a resource.

Shell

cat <<EOF | kubectl apply -f -
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaUser
metadata:
  name: shipment-api
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: tls
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: shipments
          patternType: literal
        operation: Write
        host: "*"
EOF


This resource will be picked up by the Strimzi User Operator, which will issue a client certificate signed by the certificate authority trusted by the Kafka cluster nodes. The certificate and its private key will be stored in a Kubernetes Secret with the same name as the user. The Secret also contains the public certificate of the certificate authority that issues the certificates for the cluster nodes. This allows a client to connect and authenticate with the Kafka cluster using mutual TLS. The client is then identified by its certificate, and the cluster can authorise it to access resources in the cluster according to the ACLs.
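If you want to use these credentials outside of a pod, for example with a local client, you can pull them out of the Secret. The helper below is a sketch rather than anything Strimzi ships; the name secret_field is made up, and it assumes the Secret keys user.crt and user.key that the deployment later in this post mounts.

```shell
# Hypothetical helper: print the decoded value of one key in a Secret
# in the kafka namespace.
# Arguments: secret name, key name (for example user.crt).
secret_field() {
  secret="$1"
  key="$2"
  # Dots in the key must be escaped for kubectl's jsonpath syntax,
  # so user.crt becomes user\.crt.
  escaped=$(printf '%s' "$key" | sed 's/\./\\./g')
  # Secret values are stored base64-encoded, so decode before printing.
  kubectl get secret "$secret" -n kafka -o "jsonpath={.data.$escaped}" \
    | base64 --decode
}
```

For the user above, secret_field shipment-api user.crt > user.crt and secret_field shipment-api user.key > user.key would write the certificate and key to local files. Treat the key file with the same care as any other private key.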

The ACLs listed in the  acls  section are applied on the Kafka cluster by the User Operator as well. This specific example gives the user access to perform API calls grouped under the Write operation on the shipments topic. More information about the specifics of the Kafka access model can be found here: Authorization using ACLs — Confluent Platform.
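The Write example covers a producer. For comparison, here is a sketch of what a consuming user might look like, assuming the usual Kafka rule that a consumer needs the Read operation on both the topic and its consumer group. The function name render_reader_user and the user and group names in the usage line are invented for illustration.

```shell
# Hypothetical helper: print a KafkaUser manifest for a consumer-style client.
# Arguments: user name, topic name, consumer group name.
render_reader_user() {
  cat <<EOF
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaUser
metadata:
  name: $1
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: tls
  authorization:
    type: simple
    acls:
      # Fetch records from the topic
      - resource:
          type: topic
          name: $2
          patternType: literal
        operation: Read
        host: "*"
      # Join the consumer group and commit offsets
      - resource:
          type: group
          name: $3
          patternType: literal
        operation: Read
        host: "*"
EOF
}

# Preview a manifest; all three names here are made up.
render_reader_user shipment-tracker shipments shipment-tracker-group
```

As with the other resources, the output can be piped into kubectl apply -f - once it looks right.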

These Kubernetes resources already give a nice and searchable definition of your cluster access model but it is somewhat difficult to get an overview. Let's fix that!

Deploying kafka-acl-viewer

kafka-acl-viewer is a small open-source application written in Go. It connects to Kafka using Shopify/sarama and fetches information about ACLs and topics directly from the cluster. This information is rendered as a graph using visjs/vis-network.

Before we can deploy the application we need to set up a KafkaUser with the appropriate access to the Kafka cluster.

Shell

cat <<EOF | kubectl apply -f -
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaUser
metadata:
  name: kafka-acl-viewer
  namespace: kafka
  labels:
    strimzi.io/cluster: my-cluster
spec:
  authentication:
    type: tls
  authorization:
    type: simple
    acls:
      # Read ACLs
      - resource:
          type: cluster
        operation: Describe
        host: "*"
      # Read all topics
      - resource:
          type: topic
          name: "*"
          patternType: literal
        operation: Describe
        host: "*"
EOF


The kafka-acl-viewer application will contact the cluster directly to list active ACLs and Topics in the cluster. The Describe operation on the cluster resource is needed to list the ACLs. The second section allows the application to do Describe on all topics in the cluster to read metadata and offsets for topics among other things. It does not give the application access to read the data on the topic, that would require the Read operation. You can refer to Authorization using ACLs — Confluent Platform in order to see exactly which API calls the different operations allow.

Once the ACL is applied, kafka-acl-viewer can be deployed. This is a fairly standard Kubernetes deployment.

Shell

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-acl-viewer
  namespace: kafka
  labels:
    app: kafka-acl-viewer
spec:
  selector:
    matchLabels:
      app: kafka-acl-viewer
  template:
    metadata:
      labels:
        app: kafka-acl-viewer
    spec:
      containers:
      - name: kafka-acl-viewer
        image: bjorngylling/kafka-acl-viewer:v0.5-alpha
        ports:
        - containerPort: 8080
        volumeMounts:
          - name: kafka-acl-viewer-certs
            mountPath: "/kafka-client-certs"
            readOnly: true
          - name: kafka-cluster-cert
            mountPath: "/kafka-ca-certs"
            readOnly: true
        env:
          - name: KAFKA_URL
            value: "my-cluster-kafka-bootstrap:9093"
          - name: CA_FILE
            value: "/kafka-ca-certs/ca.crt"
          - name: CERT_FILE
            value: "/kafka-client-certs/user.crt"
          - name: KEY_FILE
            value: "/kafka-client-certs/user.key"
          - name: FETCH_INTERVAL
            value: "10s"
      volumes:
        - name: kafka-acl-viewer-certs
          secret:
            secretName: kafka-acl-viewer
            items:
              - key: user.crt
                path: user.crt
              - key: user.key
                path: user.key
        - name: kafka-cluster-cert
          secret:
            secretName: my-cluster-cluster-ca-cert
EOF


As you can see, the certificates and the private key are regular Kubernetes Secrets, which we mount as files in our pod where the application can access them.
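A pod that mounts a Secret cannot start until that Secret exists, so if the deployment seems stuck it is worth checking that both Secrets are there. This is a small sketch; the function name secrets_exist is made up.

```shell
# Hypothetical helper: succeed only if every named Secret exists
# in the kafka namespace; report the first missing one on stderr.
secrets_exist() {
  for name in "$@"; do
    kubectl get secret "$name" -n kafka >/dev/null 2>&1 || {
      echo "missing secret: $name" >&2
      return 1
    }
  done
}
```

For this deployment the call would be secrets_exist kafka-acl-viewer my-cluster-cluster-ca-cert. If the first one is missing, the KafkaUser resource has probably not been processed by the User Operator yet.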

When the application is up and running, the easiest way to access it is to use kubectl port-forward -n kafka deploy/kafka-acl-viewer 8080. If you are going to use it on a more permanent basis, you probably want to set up some kind of ingress. If you now open localhost:8080, you should see a view of the current access in the cluster. The blue boxes are Kafka resources, such as topics and the cluster itself, and the green boxes are users. The arrows between them represent different types of operations. Select one of the resources or users to see which operations are connected to it.
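For the more permanent setup, an ingress needs a Service to route to, and the deployment above does not create one. The following sketch renders such a Service, assuming the app: kafka-acl-viewer label and container port 8080 from the deployment; the function name render_viewer_service is invented here, and the ingress itself is left out because it depends on your cluster.

```shell
# Hypothetical helper: print a Service manifest that fronts the
# kafka-acl-viewer pods.
# Argument: port the Service listens on (default 8080).
render_viewer_service() {
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: kafka-acl-viewer
  namespace: kafka
spec:
  selector:
    app: kafka-acl-viewer
  ports:
    - port: ${1:-8080}
      targetPort: 8080
EOF
}

# Preview the manifest with the default port.
render_viewer_service
```

Pipe the output into kubectl apply -f - to create the Service, then point your ingress of choice at it.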

Try creating some more KafkaUsers and KafkaTopics and watch the graph expand as you refresh the page. Or you can import the example setup I use for testing, which is available in the kafka-acl-viewer repo.

The Future of kafka-acl-viewer

Right now the main problem with the tool is that when you have a big cluster with a lot of ACLs in it, the graph becomes hard to take in at a glance, which defeats the point. I have begun trying out ways to filter the graph, but I'm not quite happy with the results yet.

Feel free to try kafka-acl-viewer in your Kafka cluster. Feedback and pull requests are always welcome!

Topics:
big data, kafka, kafka-acl-viewer, kubernetes, streaming, strimzi, tutorial, visualization

Published at DZone with permission of Bjorn Gylling . See the original article here.
