Running a HTTP Traffic Replayer on AWS Elastic Beanstalk

Learn about Gor, a new utility for replicating production HTTP traffic to a staging or testing environment, and the challenges of running it on AWS Elastic Beanstalk.

Gor is a great utility for replicating (a subset of) production traffic to a staging/test environment. Running it on AWS Elastic Beanstalk (EB) poses some challenges, mainly that Gor doesn’t support running as a daemon and that there is no documentation or example for doing so. Here is a solution:

# File: .ebextensions/10gor.config
# Configure Gor to copy a sample of production HTTP traffic to staging
files:
  # Utility for daemonizing binaries such as gor; see http://libslack.org/daemon
  /opt/daemon.rpm:
    source: "https://s3-eu-west-1.amazonaws.com/elasticbeanstalk-eu-west-1-<our>/our_fileserver/daemon-0.6.4-1.x86_64.rpm"
    authentication: S3Access # See AWS::CloudFormation::Authentication below
    owner: root
    group: root

  # daemon config so that we don't need to repeat these command line options and
  # can just use the service's name ("gor")
  # We need to intercept port 8080, not 80 (iptables redirects 80 to 8080)
  /etc/daemon.conf:
    # Troubleshooting tips:
    # 1) Send stderr/stdout of both daemon and the service to a file - see the
    #    commented-out lines that use output=/var/log/gor.log
    # 2) Use the last commented-out line (output=gor.info) to log stats to
    #    syslog (/var/log/messages) every 5s
    # 3) Add "foreground" to the options and start the service manually on the server
    content: |
      gor  respawn,command=/opt/gor --input-raw :8080 --output-http 'https://our-staging.example.com|1'
      #gor  output=/var/log/gor.log,respawn,command=/opt/gor --stats --output-http-stats --input-raw :8080 --output-http 'https://our-staging.example.com|1'
      #gor  errlog=/var/log/daemonapp.log,dbglog=/var/log/daemonapp.log,output=/var/log/gor.log,verbose,debug,respawn,command=/opt/gor --verbose --stats --output-http-stats --input-raw :8080 --output-http 'https://our-staging.example.com|1'
      #gor  respawn,output=gor.info,command=/opt/gor --stats --output-http-stats --input-raw :8080 --output-http 'https://our-staging.example.com|1'
  # HTTP traffic replicator; see https://github.com/buger/gor
  /opt/gor:
    source: "https://s3-eu-west-1.amazonaws.com/elasticbeanstalk-eu-west-1-<our>/our_fileserver/gor"
    authentication: S3Access
    mode: "000755"
    owner: root
    group: root

  # System V service that supports chkconfig
  /etc/init.d/gor:
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/bin/bash
      ## The chkconfig <levels> <startup order> <stop order> + descr. needed to
      ## support ensureRunning
      # chkconfig: 345 92 08
      # description: Gor copies traffic to staging
      ### BEGIN INIT INFO
      # Provides: gor
      # Short-Description: Start Gor to copy traffic to staging
      ### END INIT INFO
      # See how we were called.
      case "$1" in
        start)
              /usr/local/bin/daemon --name gor
              ;;
        stop)
              /usr/local/bin/daemon --name gor --stop
              ;;
        status)
              if ! /usr/local/bin/daemon --name gor --running; then
                  echo "gor is stopped"; exit 3
              fi
              ;;
        restart)
              /usr/local/bin/daemon --name gor --stop
              /usr/local/bin/daemon --name gor
              ;;
        *)
              echo "Usage: $0 {start|stop|status|restart}"
              exit 2
      esac
      exit 0
commands:
  "Install daemon":
    command: rpm -i /opt/daemon.rpm
    test: /bin/sh -c "! rpm -q daemon" # It seems 'test: ! rpm -q daemon' ignores the !

services:
  sysvinit:
    gor:
      enabled: true
      ensureRunning: true

Resources:
  # Allow "files:" to access our S3 bucket
  # See https://forums.aws.amazon.com/thread.jspa?messageID=557993
  # BEWARE: We have to explicitly allow access to any subdirectory by
  # editing the S3 bucket's policy and adding the subdir to the allowed Resources
  AWSEBAutoScalingGroup:
    Metadata:
      AWS::CloudFormation::Authentication:
        "S3Access": # reference this in the "authentication" property
          type: S3
          roleName: aws-elasticbeanstalk-ec2-role
          buckets: elasticbeanstalk-eu-west-1-<our id>
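
Since ensureRunning depends on the init script being manageable by chkconfig, it can help to check the script's header before deploying. The following is a minimal sketch, not part of the original setup: the function name has_chkconfig_header is hypothetical, and the check only looks for the two comment lines ("# chkconfig:" and "# description:") that the script above provides.

```shell
#!/bin/sh
# has_chkconfig_header FILE (hypothetical helper, not part of Gor or EB):
# succeed if FILE contains a line such as
#   "# chkconfig: 345 92 08"  (run levels, start priority, stop priority)
# and a "# description:" line.
has_chkconfig_header() {
    grep -Eq '^#[[:space:]]*chkconfig:[[:space:]]*[0-9-]+[[:space:]]+[0-9]+[[:space:]]+[0-9]+' "$1" &&
        grep -Eq '^#[[:space:]]*description:' "$1"
}

# Example: check a throwaway file containing the header used in this article
tmp=$(mktemp)
printf '%s\n' '# chkconfig: 345 92 08' \
              '# description: Gor copies traffic to staging' > "$tmp"
if has_chkconfig_header "$tmp"; then
    echo "init script header looks usable by chkconfig"
fi
rm -f "$tmp"
```

On a running instance, the same function could be pointed at /etc/init.d/gor.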


Highlights

  1. We want to run Gor as a service (instead of just a background + nohup command) because that is the only way to ensure it keeps running as EB adds and removes nodes.
  2. Use the daemon utility to run Gor as a daemon (which Gor does not support out of the box). Daemon is small and works well. With the configuration above, it discards Gor’s output and automatically restarts Gor if it dies.
  3. Create an init.d script for Gor. To support the ensureRunning option of ebextensions, the script has to support chkconfig.
  4. The test for whether daemon is installed cannot be just ! rpm -q daemon but needs to be /bin/sh -c "! rpm -q daemon"; the test property seems to require a single command to execute.
  5. The files are downloaded from a private S3 bucket (the EC2 role in use must have access to it, and the bucket policy must allow access to the files in question).
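
The negated test in point 4 can be tried out locally. EB runs a command only when its test exits with status 0, so the wrapper has to turn "package not found" into success. The sketch below is an illustration under assumptions: needs_install is a hypothetical helper name, and true/false stand in for rpm -q daemon so that it runs on machines without rpm.

```shell
#!/bin/sh
# needs_install CMD [ARGS...] (hypothetical helper): mimic the wrapped test
#   /bin/sh -c "! rpm -q daemon"
# It exits 0 (meaning: run the install command) only when the query
# command fails, i.e. when the package is NOT installed.
needs_install() {
    /bin/sh -c '! "$@"' sh "$@" >/dev/null 2>&1
}

# Stand-ins for "rpm -q daemon": "false" = not installed, "true" = installed
if needs_install false; then
    echo "package missing: install"
fi
if ! needs_install true; then
    echo "package present: skip"
fi
```

On an instance, calling needs_install rpm -q daemon would exercise the real query.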

Side note

I originally wanted to run Gor on a single node only, using a container_command with leader_only to enable it on just that node. However, that does not work because the command runs only when the app is deployed, not when autoscaling adds new nodes (for example, after killing some old ones, typically starting with the leader). The new nodes are more or less cloned from the existing ones, so they have the package, service, etc., but the command does not run there. And there is no “leader” concept outside of the EB deployment process. So the only option is to run Gor on all the nodes.

Published at DZone with permission of Jakub Holý, DZone MVB. See the original article here.
