Serverless Network Log Shipping, Enrichment, and Transformation

Learn the difference between plain flow logs and enhanced flow logs to see how they can help give you better data in your serverless infrastructure.

What Are VCN Flow Logs?

Virtual Cloud Network (VCN) flow logs are a feature of Oracle Cloud Infrastructure that lets customers capture information about the IP traffic flowing through a subnet, as observed from a virtual NIC (vNIC) in that subnet. Flow logs are useful for a number of tasks, such as:

  • Diagnosing overly permissive or overly restrictive firewall rules in Security Lists or Network Security Groups that allow or restrict the flow of packets
  • Monitoring network traffic reaching and originating from your instances (VMs, DBs, bare metal servers)
  • Measuring traffic volumes processed by a given vNIC or subnet
  • Performing forensic analysis when the logs are exported to a store that acts as a non-repudiable source of truth, like a SIEM

What Is This Article About?

This article outlines the architecture and deployment methodology of a serverless, cloud-native, scalable, low-cost, zero-maintenance method of processing and enriching VCN flow logs and publishing them to a SIEM like Splunk. The same method can also be used for:

  • Publishing data to other common enterprise SIEMs like QRadar or ArcSight

The architecture uses standard formats: flat JSON published over a standard HTTP Event Collector (HEC), which is default functionality in most SIEMs, log aggregation services, and event collector services.

Oracle Cloud Infrastructure announced the Cloud Native Limited Availability program for customers and developers building on Oracle Cloud Infrastructure in late 2019. The most recent addition to Oracle's cloud-native suite was VCN Flow Logs. For further information on how to sign up for the Limited Availability (LA) program, see the announcement linked below.

Whitelist your tenancy for Logging

Announcing Limited Availability of Oracle Cloud Infrastructure Logging Service

We're pleased to announce the Limited Availability (LA) release of the Oracle Cloud Infrastructure Logging service. The…

blogs.oracle.com

Statistics

  • Number of data centers / regions in tenancy: 4
  • Number of compartments in each region: 96
  • Number of VCNs in tenancy: 119
  • Number of subnets per VCN: 2
  • Average number of unique flow log events processed per day: 160 million+
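
To put the daily volume in perspective, the figures above can be turned into rough rates. This is only back-of-the-envelope arithmetic: it assumes traffic is spread evenly over the day and across subnets, which real workloads rarely do, so treat the results as averages and not as peak sizing.

```python
# Figures taken from the statistics above.
EVENTS_PER_DAY = 160_000_000
VCNS = 119
SUBNETS_PER_VCN = 2

SECONDS_PER_DAY = 24 * 60 * 60

# Average sustained rate across the whole tenancy.
events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY

# Average per subnet, ASSUMING an even spread (an illustrative simplification).
events_per_subnet_per_day = EVENTS_PER_DAY / (VCNS * SUBNETS_PER_VCN)

print(f"{events_per_second:,.0f} events/s on average")
print(f"{events_per_subnet_per_day:,.0f} events/subnet/day if spread evenly")
```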

In addition to transporting network flow logs at scale, this architecture enriches the flow logs with context as they are populated.

For the Impatient

Link to the entire source code and deployment tutorial.

vamsiramakrishnan/splunk-export-logs

A repository that hosts Functions used to export OCI VCN Flow Logs to Splunk - vamsiramakrishnan/splunk-export-logs

github.com

The Utility of VCN Flow Logs

What Questions Do Plain Flow Logs Help Answer?

  • Which Source? 
  • Which Destination? 
  • Which Protocol? 
  • Did it pass through the firewall?

Raw VCN Flow Log

    HEADERS
    --------
    <version>
    <srcaddr>
    <dstaddr>
    <srcport>
    <dstport>
    <protocol>
    <packets>
    <bytes>
    <start_time>
    <end_time>
    <action>
    <status>
    PAYLOAD
    --------
    2 172.16.2.145 172.16.2.179 82 64 13 112 441 1557424462 1557424486 REJECT OK
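
As a minimal sketch of how such a record could be turned into named fields, the snippet below splits the sample payload on whitespace and pairs each value with the header names listed above. The function name `parse_flow_log_line` is hypothetical and is not taken from the linked repository; the sketch also assumes every record carries exactly the twelve fields shown.

```python
from typing import Dict

# Field order taken from the HEADERS listing above.
FLOW_LOG_FIELDS = (
    "version", "srcaddr", "dstaddr", "srcport", "dstport", "protocol",
    "packets", "bytes", "start_time", "end_time", "action", "status",
)


def parse_flow_log_line(line: str) -> Dict[str, str]:
    """Split one space-separated flow log record into named fields.

    Hypothetical helper for illustration; values are kept as strings.
    """
    values = line.split()
    if len(values) != len(FLOW_LOG_FIELDS):
        raise ValueError(
            f"expected {len(FLOW_LOG_FIELDS)} fields, got {len(values)}"
        )
    return dict(zip(FLOW_LOG_FIELDS, values))


record = parse_flow_log_line(
    "2 172.16.2.145 172.16.2.179 82 64 13 112 441 1557424462 1557424486 REJECT OK"
)
print(record["srcaddr"], "->", record["dstaddr"], record["action"])
```

Keeping the values as strings avoids guessing at types; a real pipeline might convert ports, counters, and timestamps to integers before indexing.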
 
          


What Questions Do Enhanced Flow Logs Help Answer?

  • Which virtual NIC (vNIC) did this packet originate from?
  • Which subnet is the vNIC a part of?
  • Which VCN is the subnet a part of?
  • Which Security List or Network Security Group rejected or allowed this packet flow?
  • Which compartment do all these resources belong to?

Enriched JSON Flow Log

The additional metadata improves readability, adds context, and makes debugging easier.

    {
        "version": "",
        "srcaddr": "-",
        "dstaddr": "-",
        "srcport": "-",
        "dstport": "-",
        "protocol": "-",
        "packets": "-",
        "bytes": "-",
        "start_time": "",
        "end_time": "",
        "status": "",
        "compartmentId": "",
        "compartmentName": "",
        "availabilityDomain": "",
        "vcnId": "",
        "vcnName": "",
        "subnetId": "",
        "subnetName": "",
        "vnicId": "",
        "vnicName": "",
        "securityListIds": [""],
        "securityListNames": [""],
        "nsgIds": [],
        "nsgNames": []
    }
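
The enrichment step can be pictured as merging a parsed flow record with network context looked up for its vNIC. The sketch below is illustrative only: `NETWORK_CONTEXT`, the `enrich` function, and all OCIDs and resource names are made up, and a static dictionary stands in for the OCI API queries that the actual function performs.

```python
import json
from typing import Any, Dict

# Hypothetical lookup table keyed by vNIC OCID. In the real pipeline this
# context comes from querying the OCI API; the values here are invented.
NETWORK_CONTEXT: Dict[str, Dict[str, Any]] = {
    "ocid1.vnic.oc1..example": {
        "compartmentName": "network-prod",
        "vcnName": "vcn-main",
        "subnetName": "private-subnet-1",
        "vnicName": "app-server-vnic",
        "securityListNames": ["default-security-list"],
        "nsgNames": [],
    }
}


def enrich(record: Dict[str, Any], vnic_id: str,
           context: Dict[str, Dict[str, Any]] = NETWORK_CONTEXT) -> Dict[str, Any]:
    """Return a copy of the flow record with network metadata merged in."""
    enriched = dict(record)           # leave the caller's record untouched
    enriched["vnicId"] = vnic_id
    enriched.update(context.get(vnic_id, {}))  # unknown vNICs pass through
    return enriched


flow = {"srcaddr": "172.16.2.145", "dstaddr": "172.16.2.179", "action": "REJECT"}
print(json.dumps(enrich(flow, "ocid1.vnic.oc1..example"), indent=4))
```

Passing unknown vNICs through without metadata, as this sketch does, keeps events flowing to the SIEM even when a lookup fails; whether that is the right trade-off depends on the deployment.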


Deployment Architecture

Components Used

Logging

The Oracle Cloud Infrastructure Logging service is a cloud-native, fully managed service that can collect, index, search, and aggregate logs from multiple log sources (OCI services). Here, the Logging service is used to extract VCN flow log information from each subnet in the tenancy.

Object Storage

Oracle Cloud Infrastructure Object Storage is a low-cost, highly scalable, zero-management target for all VCN flow log files. Automated object lifecycle policies set on the objects simplify garbage collection of the log files pushed by the Logging service once the serverless data pipeline has processed them.

Events Service

The Events service is the glue of the event-driven architecture: it generates events when log files are created in Object Storage, based on a predefined set of conditions. The Events service can be leveraged to drive highly automated, zero-ops workflows. For this project, it triggers targeted events by applying attribute filters to the specific event of flow log object creation.
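
To illustrate the kind of filtering involved, the sketch below checks whether an incoming event describes a new object in the flow log bucket and, if so, returns where that object lives. The payload shape (`eventType`, `data.resourceName`, `data.additionalDetails.bucketName` and `namespace`) is an assumption based on the general layout of Object Storage events and should be verified against the Events service documentation; the bucket name, namespace, object name, and the function `flow_log_object` are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple

# Assumed event type for object creation in Object Storage.
CREATE_OBJECT_EVENT = "com.oraclecloud.objectstorage.createobject"


def flow_log_object(event: Dict[str, Any],
                    bucket: str) -> Optional[Tuple[str, str, str]]:
    """Return (namespace, bucket, object name) for matching events, else None."""
    if event.get("eventType") != CREATE_OBJECT_EVENT:
        return None
    data = event.get("data", {})
    details = data.get("additionalDetails", {})
    if details.get("bucketName") != bucket:
        return None
    return details.get("namespace"), bucket, data.get("resourceName")


# Invented sample payload, trimmed to the fields the sketch reads.
sample_event = {
    "eventType": "com.oraclecloud.objectstorage.createobject",
    "data": {
        "resourceName": "flowlog-0001.log.gz",
        "additionalDetails": {
            "bucketName": "vcn-flow-logs",
            "namespace": "example_namespace",
        },
    },
}

print(flow_log_object(sample_event, "vcn-flow-logs"))
```

In the architecture described here the filtering is done by the Events rule itself, so the function only receives matching events; a check like this inside the function would be a defensive second line.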

Functions

Oracle Functions is built on the popular open-source Fn Project, which is backed by Oracle. Oracle Cloud Infrastructure integrates seamlessly with the Fn Project and simplifies the path from code to deployment in popular languages such as Python and Java. In this architecture, the enrich-flow-log function does the following:

  1. Reads the flow log object that was created
  2. Extracts the object metadata and uses it to populate the network metadata by querying the OCI API
  3. Parses the log file into a JSON document
  4. Uses the Splunk HEC interface to publish it to Splunk
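
The final publishing step can be sketched with the standard library alone. Splunk's HTTP Event Collector accepts a POST whose `Authorization` header is `Splunk <token>` and whose body wraps each record in an `event` key; several events may be sent in one request as concatenated JSON objects. The function name `build_hec_request`, the host, and the token below are placeholders, and the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Any, Dict, Iterable


def build_hec_request(events: Iterable[Dict[str, Any]],
                      hec_url: str,
                      hec_token: str) -> urllib.request.Request:
    """Build (but do not send) a batched Splunk HEC request."""
    # HEC batches are JSON objects placed back to back, not a JSON array.
    body = "".join(
        json.dumps({"event": event, "sourcetype": "_json"}) for event in events
    )
    return urllib.request.Request(
        hec_url,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Splunk {hec_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_hec_request(
    [{"srcaddr": "172.16.2.145", "action": "REJECT"}],
    "https://splunk.example.com:8088/services/collector/event",  # placeholder
    "00000000-0000-0000-0000-000000000000",                      # placeholder
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would deliver it. Batching many records per request, as sketched, reduces the number of outbound calls a single function invocation has to make.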

Key Characteristics

Ease of Use

  • Zero patching
  • Zero maintenance
  • No need to size
  • Deploy-and-forget model

Scalability

  • Functions are massively parallel
  • The entire architecture is event-driven

Security

  • The function is deployed on a private subnet
  • Instance principals and dynamic groups for least privilege
  • No ingress traffic is allowed
  • Object Storage access through the Service Gateway only
  • Egress to Splunk through the NAT Gateway
  • Egress allowed only on specific Splunk ports and URLs

Cost

  • Truly pay-per-use
  • Events cost is based on invocations only
  • Object Storage policy retains objects for only 1 day
  • Fn cost is based on invocations only

Multi-Region Deployment Architecture

Below is a representation of how this setup can be extended to multiple regions populating data to the same Splunk instance. 

Further Reading

 https://medium.com/@vamsiramakrishnan/delivering-a-low-cost-scalable-audit-events-pipeline-to-a-siem-splunk-qradar-from-oracle-482b86f7c2c3 

I hope this blog showed you how easy it is to develop cloud-native functionality using the serverless, event-driven, fully managed, pay-per-use components that Oracle Cloud Infrastructure offers. You can follow me on Medium at https://medium.com/@vamsiramakrishnan

Published at DZone with permission of Vamsi Ramakrishnan. See the original article here.
