Ghost Traffic on AWS

Even though Halloween is over, it's not too late for a spooky story involving graves, unknown traffic, and tips on health checks, ELBs, and DNS caching.

We were reconfiguring some of our network endpoints at work, and the day after Halloween I saw this traffic appear in our local access logs:

- - [01/Nov/2017:15:09:43 -0400] "GET /grave/Charles-Karlson/16427428 HTTP/1.1" 404 967

Line after line. Every few seconds someone — or something — is asking about another grave.

This environment was running under Amazon's Elastic Beanstalk service, which assigns public DNS addresses based on the environment name, and I figured we probably weren't the only people to come up with the name tracker-test-two. Which is true, and leads to one of the takeaways that I'll list below. But the truth, as revealed from the ELB access logs, was stranger:

2017-11-01T19:09:43.168404Z awseb-e-f-AWSEBLoa-1FG2EMDGMS77G 0.000045 0.003369 0.00002 404 404 0 967 "GET https://hu.billiongraves.com:443/grave/Charles-Karlson/16427428 HTTP/1.1" "Mozilla/5.0 (compatible; SemrushBot/1.2~bl; +http://www.semrush.com/bot.html)" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2

That's a real site: If you go to it, you'll see an image of a gravestone and information about the person buried under that stone. Useful, I'm sure, to people who are tracking their ancestry. But why was the traffic going to our web server?

The answer is DNS caching: To reduce load on the domain name system, intermediate nameservers are allowed to cache the results of a DNS lookup. That generally works pretty well, and billiongraves.com has a 60-second time-to-live, which should have stopped traffic to that IP address fairly quickly (if you haven't already guessed, they also run on AWS servers, and recently reconfigured their environments). However, some programs and platforms do their own caching and don't pay attention to the time-to-live; Java is particularly bad at this. So SemrushBot is using its cached IP address for that hostname and will continue to make those requests until it is restarted.
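As a rough illustration of what "honoring" a 60-second lifetime could look like in a client that keeps its own cache, here is a minimal Python sketch. The class name `TtlResolver` and its parameters are hypothetical. Because `socket.gethostbyname` does not expose the record's real TTL, the sketch assumes a fixed upper bound (`max_age`) instead of reading it from the DNS response.

```python
import socket
import time


class TtlResolver:
    """Cache hostname lookups, but never for longer than max_age seconds.

    Assumption: max_age stands in for the DNS record's TTL, which the
    standard-library lookup used here does not report.
    """

    def __init__(self, max_age=60.0, lookup=socket.gethostbyname,
                 clock=time.monotonic):
        self.max_age = max_age
        self._lookup = lookup      # injectable so it can be faked in tests
        self._clock = clock        # injectable for the same reason
        self._cache = {}           # hostname -> (ip_address, time_resolved)

    def resolve(self, hostname):
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached is not None and now - cached[1] < self.max_age:
            return cached[0]       # entry is still fresh, reuse it
        # Entry is missing or stale: ask the resolver again.
        ip_address = self._lookup(hostname)
        self._cache[hostname] = (ip_address, now)
        return ip_address
```

A long-running crawler built this way would pick up a new IP address within about a minute of a change, rather than sending requests to the old address until it is restarted.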

There was another interesting artifact in our logs, although not quite as spooky:

2017-11-02T11:49:26.343413Z awseb-e-h-AWSEBLoa-1SXOP8PJPZS8R 0.000116 0.013641 0.000074 200 200 0 7520 "GET HTTP/1.1" "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2

This traffic was appearing every few seconds in the logs for our old server group. This isn't a DNS caching issue: It's hitting an explicit IP address. And it doesn't look like a script kiddie trying to find vulnerabilities in our servers: It's always the same endpoint, always from the same source IP (which is probably somewhere in the Midwest), and happens every 60 seconds. As far as I can tell, it's a homegrown health-check program.

Which means that some company is going to panic when I shut down that environment later this morning.
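Both kinds of stray traffic showed up in the ELB access logs, where the quoted request field includes the URL the client asked for. The following Python sketch shows one way such entries might be screened for hostnames that aren't yours. The function names and the `our_hosts` set are hypothetical, and the regular expression only assumes that the first quoted field has the form "METHOD URL PROTOCOL", as in the billiongraves entry above.

```python
import re
from urllib.parse import urlsplit

# First quoted field of an ELB access-log entry: "METHOD URL PROTOCOL".
REQUEST_FIELD = re.compile(r'"(\S+) (\S+) (HTTP/\S+)"')


def requested_host(log_line):
    """Return the hostname the client asked for, or None if unparseable."""
    match = REQUEST_FIELD.search(log_line)
    if match is None:
        return None
    return urlsplit(match.group(2)).hostname


def is_ghost_traffic(log_line, our_hosts):
    """True if the request names a host that is not in our_hosts."""
    host = requested_host(log_line)
    return host is not None and host not in our_hosts
```

Entries whose request field cannot be parsed are reported as not ghost traffic, so this is a filter for the obvious cases rather than a complete audit.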

To wrap up, here are some takeaways that you might find useful:

  1. Don't write your own health checks. If you're using an ELB, you can configure it to run a health check and raise an alarm when the health check fails.
  2. If you feel you must write your own health check, use a hostname rather than an explicit IP. I know you think you'll have that IP forever, but you probably won't.
  3. Pick an appropriate time-to-live for your DNS entries. The Internet Police might yell at me for suggesting 60 seconds, but it makes your life a lot easier if you need to transition environments. Interestingly, Google (which hosts this blog) uses 78 seconds.
  4. If you're using Elastic Beanstalk, use URLs that contain random text (the environment ID is a good choice). Remember that it's a shared namespace.
  5. If you're writing an application that repeatedly connects to remote hosts, understand how your platform deals with DNS caching, and try to honor the server's intentions.
  6. If you run billiongraves.com — or any site with similar content — don't change your IPs on Halloween.
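If you do end up writing your own health check despite the first takeaway, the second one can be enforced in code. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, "healthy" is taken to mean an HTTP 200 response, and any URL whose host is a literal IPv4 or IPv6 address is rejected outright.

```python
import ipaddress
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def targets_ip_literal(url):
    """Return True if the URL's host is a hard-coded IPv4 or IPv6 address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False               # not an IP literal, so it is a hostname
    return True


def check_health(url, timeout=5.0):
    """Return True if a GET to url answers with HTTP 200."""
    if targets_ip_literal(url):
        raise ValueError("health check should use a hostname, not an IP: " + url)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False               # DNS failure, refused connection, timeout, or HTTP error
```

Because the hostname is resolved again on every call, the check follows the service when its address changes instead of probing whoever inherits the old IP.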

cloud, aws, traffic analysis, dns caching, elastic beanstalk

Published at DZone with permission of
