From CloudWatch to Cost Watch: Cutting Observability Costs With Vector

Organizations must balance observability with cost, making scalable, cost-effective log pipelines essential for sustainable enterprise operations.

By Gaurav Mittal · Sep. 01, 25

Introduction

In modern cloud environments, storing logs in isolated systems is no longer adequate. As distributed software systems become more common, with components running across multiple services and regions, it is essential to continuously collect system and application logs and forward them to a centralized location for in-depth analysis. These logs play an important role in debugging, performance monitoring, and ensuring the overall health and reliability of the infrastructure.

In the AWS cloud environment, many such components of the distributed software system are still hosted on Amazon EC2 instances and use an agent-based approach to transmit system and application logs to a centralized service, where this data is ingested and stored for further use by observability platforms. While observability improves operational insight and system reliability, it also increases the cost of data ingestion and long-term storage. Therefore, organizations must maintain a careful balance between observability depth and the financial sustainability of the platform. Selecting a resilient, scalable, and cost-effective ingestion and storage solution has become an important element of any observability strategy, especially when the platform is being used at enterprise scale.

CloudWatch Agent: Convenient but Costly

The Amazon CloudWatch Agent is a powerful tool for monitoring and log collection. It can be installed on Amazon EC2 instances running Linux or Windows and is configured with a JSON file to collect system and application logs and send them to the Amazon CloudWatch Logs service. While the agent is a native and convenient way to collect logs and system-level metrics from EC2 instances, it has limitations, most notably that it can send logs only to Amazon CloudWatch Logs. This can become a significant concern for organizations operating at scale, where CloudWatch Logs ingestion and storage costs grow rapidly with log volume and retention time.

CloudWatch agent sending logs to CloudWatch Logs

In contrast, Amazon S3 offers a more cost-effective and flexible solution for storing log data. With support for multiple storage classes such as S3 Standard-Infrequent Access, S3 Glacier, and S3 Glacier Deep Archive, Amazon S3 allows enterprises to automatically transition older logs to lower-cost archival tiers through lifecycle rules, which significantly reduces long-term storage expenses. Unlike CloudWatch Logs, Amazon S3 does not charge a per-GB ingestion fee; uploads incur only small per-request (PUT) charges, making it a more budget-friendly option for high-volume log data. Once stored in S3, the logs can be analyzed using Amazon Athena, which runs SQL queries directly against data in S3 without data movement or complex ETL pipelines. Although Amazon S3 is well-suited for log storage, the CloudWatch Agent cannot write to it, and AWS currently does not provide a direct, native method for sending logs from EC2 instances to S3.
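
As a rough illustration of the storage tiering described above, the following Python sketch builds a lifecycle configuration in the dictionary shape that boto3's `put_bucket_lifecycle_configuration` accepts. The rule ID, the `ec2-logs/` prefix, and the 30/90/365-day thresholds are assumptions chosen for illustration, not values from AWS guidance. The sketch only constructs the dictionary and does not call AWS.

```python
import json


def build_log_lifecycle_config(prefix, ia_days=30, glacier_days=90, deep_archive_days=365):
    """Build a lifecycle configuration that moves logs to cheaper tiers as they age.

    All day thresholds are illustrative assumptions; tune them to your retention policy.
    """
    if not (ia_days < glacier_days < deep_archive_days):
        raise ValueError("transition days must be strictly increasing")
    return {
        "Rules": [
            {
                "ID": "archive-ec2-logs",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Print the configuration so it can be reviewed before it is applied.
    print(json.dumps(build_log_lifecycle_config("ec2-logs/"), indent=2))
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration` for the target bucket.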

Sample CloudWatch agent configuration for an Amazon Linux EC2 instance, sending Syslog and Messages to CloudWatch Logs:

JSON
 
{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/syslog",
            "log_group_name": "/ec2/logs/syslog",
            "log_stream_name": "{instance_id}",
            "timestamp_format": "%b %d %H:%M:%S",
            "multi_line_start_pattern": "{timestamp_format}",
            "timezone": "UTC"
          },
          {
            "file_path": "/var/log/messages",
            "log_group_name": "/ec2/logs/messages",
            "log_stream_name": "{instance_id}",
            "timestamp_format": "%b %d %H:%M:%S",
            "multi_line_start_pattern": "{timestamp_format}",
            "timezone": "UTC"
          }
        ]
      }
    },
    "log_stream_name": "default",
    "force_flush_interval": 15
  }
}


In summary, while the CloudWatch Agent is well-integrated into the AWS ecosystem, its log destination limitations and higher cost structure make it less suitable for log archiving and advanced analytics. For organizations seeking a more scalable and economical observability solution, leveraging Amazon S3 with a third-party log forwarding agent and Amazon Athena presents a more flexible and cost-efficient alternative. These limitations prompt organizations to explore modern alternatives that offer more control and lower costs.

Vector Agent: Modern Observability, Minimal Costs

The Vector agent is a high-performance, open-source observability pipeline designed to collect, transform, and route telemetry data such as logs, metrics, and events. Built in Rust, Vector is engineered for efficiency, delivering low-latency performance and high throughput while maintaining a small resource footprint. It can collect data from a variety of sources, including local log files, journald, syslog, Docker containers, and Kubernetes pods, and forward it to numerous destinations like Amazon S3, Amazon CloudWatch, Elasticsearch, Kafka, and Datadog. Vector is a strong alternative to similar tools such as Fluent Bit v3 and the OpenTelemetry Collector, particularly because of its native support for batching, buffering, and direct S3 integration.

Vector agent sending logs to an S3 bucket


Vector supports advanced data processing features such as log parsing, filtering, enrichment, and transformation using its powerful and customizable transformation language, Vector Remap Language (VRL). It also includes features like batching, retry logic, rate limiting, and backpressure handling to ensure resilience and efficient resource usage in production environments. Configuration files, which can be written in YAML, JSON, or TOML, are bundled with the agent during deployment.
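
To make the batching idea concrete, here is a minimal Python sketch of size-based batching with gzip compression. It is a simplified illustration under stated assumptions, not Vector's actual implementation: the function name `batch_lines` and the rule "flush before the batch would exceed `max_bytes`" are hypothetical choices made for this example.

```python
import gzip


def batch_lines(lines, max_bytes):
    """Group log lines into gzip-compressed batches of at most max_bytes (uncompressed).

    A single line larger than max_bytes still goes out as its own batch.
    """
    batch, size = [], 0
    for line in lines:
        encoded = (line + "\n").encode("utf-8")
        # Flush the current batch if adding this line would exceed the limit.
        if batch and size + len(encoded) > max_bytes:
            yield gzip.compress(b"".join(batch))
            batch, size = [], 0
        batch.append(encoded)
        size += len(encoded)
    # Flush whatever is left once the input is exhausted.
    if batch:
        yield gzip.compress(b"".join(batch))


if __name__ == "__main__":
    sample = ["log line %d" % i for i in range(1000)]
    objects = list(batch_lines(sample, max_bytes=4096))
    print(len(objects), "compressed objects would be uploaded")
```

Each yielded value corresponds to one object upload, which is why a larger batch size means fewer PUT requests for the same volume of logs.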

Sample Vector configuration for an Amazon Linux EC2 instance, sending Syslog and Messages to an S3 bucket:

JSON
 
{
  "data_dir": "/var/lib/vector",
  "sources": {
    "sys_logs": {
      "type": "file",
      "include": ["/var/log/syslog", "/var/log/messages"],
      "ignore_older_secs": 86400,
      "read_from": "beginning"
    }
  },
  "transforms": {
    "parse_logs": {
      "type": "remap",
      "inputs": ["sys_logs"],
      "source": ". = parse_syslog!(.message)"
    }
  },
  "sinks": {
    "s3_sink": {
      "type": "aws_s3",
      "inputs": ["parse_logs"],
      "bucket": "vector-agent-test-logs",
      "key_prefix": "ec2-logs/",
      "compression": "gzip",
      "encoding": {
        "codec": "json"
      },
      "region": "us-east-1",
      "batch": {
        "max_bytes": 10485760
      }
    }
  }
}
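
The remap transform in the sample replaces each raw event with the structured result of `parse_syslog`. The Python sketch below approximates that step for a classic BSD-style syslog line so the effect is easier to see. It is an illustration only: the regular expression is an assumption that covers the common `timestamp host app[pid]: message` layout, and the real VRL function handles more formats and fields (such as facility and severity) than this sketch does.

```python
import re

# Assumed layout: "<Mon> <day> <HH:MM:SS> <hostname> <appname>[<procid>]: <message>"
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<hostname>\S+)\s"
    r"(?P<appname>[^\s\[:]+)(?:\[(?P<procid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)


def parse_syslog_line(line):
    """Return a dict of syslog fields, or None if the line does not match the layout."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["procid"] is None:
        # The process ID is optional in this layout; drop it when absent.
        del fields["procid"]
    else:
        fields["procid"] = int(fields["procid"])
    return fields


if __name__ == "__main__":
    raw = "Sep  1 12:34:56 ip-10-0-0-12 sshd[2211]: Accepted publickey for ec2-user"
    print(parse_syslog_line(raw))
```

Structured fields like these are what make the stored logs convenient to filter and query later, instead of searching raw text lines.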


Compared to the CloudWatch Agent, Vector delivers higher performance. It supports a broader range of data sources and platforms, making it well-suited as a standard solution for mid-sized and large organizations with complex log collection, transformation, and storage requirements. Its powerful Vector Remap Language (VRL) enables the implementation of sophisticated transformations, allowing organizations to standardize and process logs to meet a wide variety of operational and compliance use cases.

Log Smarter, Spend Less: Comparing Total Cost of Ownership

Since Vector supports sending logs directly to Amazon S3, it can significantly reduce the cost of log ingestion and storage without compromising the accessibility or usability of the log data. The table below compares logs sent to Amazon CloudWatch Logs via the CloudWatch Agent and those sent to Amazon S3 using the Vector agent.

| Feature / Cost Component | CloudWatch Agent | Vector Agent |
|---|---|---|
| Agent Compute Platform | Amazon EC2 Instance | Amazon EC2 Instance |
| Agent OS Platform | Amazon Linux | Amazon Linux |
| Log Source | /var/log/messages, Syslog | /var/log/messages, Syslog |
| Log Destination | Amazon CloudWatch Logs | Amazon S3 Bucket |
| AWS Region | US East (N. Virginia) | US East (N. Virginia) |
| Data Ingested per Month | 500 GB | 500 GB |
| Ingestion Cost (Standard) | $250.00 | $0.00 |
| Storage Cost (Standard) | $15.00 | $11.50 |
| S3 PUT Request Cost (10 MB blocks) | N/A | $0.26 |
| Query/Retrieval Cost (Estimate) | $10 (CloudWatch Insights queries) | $3 (Athena: $5 per TB scanned, assume 0.6 TB scanned) |
| Total Monthly Cost | $275.00 | $14.76 |


Bar chart showing different cost comparisons between CloudWatch Agent and Vector Agent


The comparison between the Amazon CloudWatch Agent and the Vector Agent shows a significant cost advantage and operational flexibility offered by Vector. CloudWatch incurs high ingestion and moderate storage and query costs, while Vector eliminates ingestion costs by sending logs directly to Amazon S3. With only minor costs for storage, PUT requests, and Athena-based querying, the Vector Agent offers a monthly log management cost of under $15 for 500 GB of data, compared to $275 with CloudWatch, reducing the total cost by an impressive ~95%. Additionally, Vector supports advanced features such as log transformation, enrichment, and batching through its Vector Remap Language (VRL), making it highly suitable for complex, scalable environments.
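
The totals in the comparison can be reproduced with a few lines of arithmetic. The Python sketch below is a minimal model, assuming per-unit prices inferred from the table's own figures (for example, $250.00 for 500 GB implies $0.50 per GB ingested). The constant names are hypothetical, and the prices should be checked against current AWS pricing for your region before relying on them.

```python
# Assumed unit prices in USD, inferred from the figures in the comparison table.
CW_INGEST_PER_GB = 0.50        # CloudWatch Logs ingestion
CW_STORAGE_PER_GB = 0.03       # CloudWatch Logs storage
CW_QUERY_ESTIMATE = 10.00      # flat estimate for CloudWatch Insights queries
S3_STORAGE_PER_GB = 0.023      # S3 Standard storage
S3_PUT_PER_1000 = 0.005        # S3 PUT requests, per 1,000 requests
ATHENA_PER_TB_SCANNED = 5.00   # Athena query pricing
ATHENA_TB_SCANNED = 0.6        # assumed monthly scan volume
PUT_OBJECT_MB = 10             # object size per PUT, matching the 10 MB batch size


def cloudwatch_monthly_cost(gb):
    """Ingestion + storage + query estimate for logs sent to CloudWatch Logs."""
    return round(gb * CW_INGEST_PER_GB + gb * CW_STORAGE_PER_GB + CW_QUERY_ESTIMATE, 2)


def vector_s3_monthly_cost(gb):
    """Storage + PUT requests + Athena scans for logs sent to S3."""
    put_requests = gb * 1024 / PUT_OBJECT_MB
    put_cost = put_requests / 1000 * S3_PUT_PER_1000
    athena_cost = ATHENA_TB_SCANNED * ATHENA_PER_TB_SCANNED
    return round(gb * S3_STORAGE_PER_GB + put_cost + athena_cost, 2)


if __name__ == "__main__":
    cw = cloudwatch_monthly_cost(500)
    s3 = vector_s3_monthly_cost(500)
    print(f"CloudWatch: ${cw:.2f}  Vector + S3: ${s3:.2f}  savings: {1 - s3 / cw:.1%}")
```

Changing the monthly volume or the assumed Athena scan size shows how sensitive the savings are to each input; ingestion dominates the CloudWatch side at every volume because it scales linearly with the data sent.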

Conclusion

Selecting the right tool for log ingestion and storage is crucial. While the Amazon CloudWatch Agent provides seamless AWS integration, its cost structure and limited destination options can become burdensome at scale. Vector, with its high performance, advanced processing features, and direct S3 integration, presents a compelling alternative. For enterprises aiming to optimize both cost and capability, Vector represents a modern and scalable approach to observability in the AWS ecosystem, with broad support for other platforms as well.

Disclaimer

The information presented in this document is intended for informational purposes only and reflects the author’s understanding as of the time of writing. It does not constitute professional advice, nor does it represent the views of Amazon Web Services (AWS), Vector project maintainers, or any other third-party vendors mentioned. Cost estimates are based on publicly available pricing for the year 2025 and may vary depending on actual usage, configurations, AWS region, and future pricing changes. Organizations are encouraged to perform their own assessments and consult with qualified professionals before making architectural or financial decisions based on this content.

