DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • How We Built a Smarter University Chatbot Using LLaMA2, AWS SageMaker, and RAG
  • Build Multimodal RAG Apps With Amazon Bedrock and OpenSearch
  • Enterprise RAG in Amazon Bedrock: Introduction to KnowledgeBases
  • Build a GitHub Slack Bot With AWS Bedrock and MCP, Part 1

Trending

  • When One MVP Is Really Four Systems: A Better Way to Plan Multi-Role Apps
  • OpenAPI From Code With Spring and Java: A Recipe for Your CI
  • From Indicators to Insights: Automating IOC Enrichment Using Python and Threat Feeds
  • LLM-Powered Deep Parsing for Industrial Inventory Search
  1. DZone
  2. Software Design and Architecture
  3. Cloud Architecture
  4. Automating AWS Glue Infra and Code Reviews With RAG and Amazon Bedrock

Automating AWS Glue Infra and Code Reviews With RAG and Amazon Bedrock

Automate AWS Glue reviews with infra-first RAG governance, enforcing enterprise standards, reducing manual work, and shifting checks left.

By 
pooja chhabra user avatar
pooja chhabra
·
Jan. 29, 26 · Tutorial
Likes (0)
Comment
Save
Tweet
Share
1.8K Views

Join the DZone community and get the full member experience.

Join For Free

In many enterprises, the transition from a "working" pipeline to a "production-ready" pipeline is gated by a manual checklist. In most enterprises, a “simple” Glue review involves answering questions like:

  • Is the Glue job deployed?
  • Was it provisioned via CloudFormation?
  • Does the expected crawler exist?
  • Is the code production-grade?
  • Does it follow internal best practices?

Traditionally, a senior engineer would spend 4–6 hours per use case and manually:

  • Cross-references CloudFormation
  • Opens job scripts
  • Reviews against a checklist stored somewhere
  • Writes feedback by hand

In a fast-scaling organization, this creates a bottleneck. To solve this, I developed a utility that implements shift-left governance — moving the review process from the final "pre-production" stage directly into the development cycle.

So instead of waiting for a Senior Engineer to find a missing Crawler or a hardcoded S3 path near production deployment, the developer runs an automated script during development.

  • Proactive vs. reactive: Issues are fixed while in the development phase.
  • Cost reduction: It is significantly cheaper that 4-6hrs a senior engineer would spend.
  • Empowerment: Developers receive instant, expert-level feedback without waiting on a human reviewer.

The Solution Architecture: RAG-Powered Governance

The core of this implementation is a retrieval-augmented generation (RAG) pattern. While LLMs are smart, they don't natively know your specific "Enterprise Internal Standards."

The RAG Approach: Grounding the Review

Instead of hardcoding rules into a prompt, we store the Enterprise Best Practices Checklist as a document in Amazon S3.

Design Principle: Deterministic Infrastructure Validation Before Probabilistic Inference

The key insight was simple: A Glue job review is meaningless without infrastructure context.

So instead of building “another code reviewer,” let's build a system that:

  1. Starts with real deployed infrastructure
  2. Verifies it against enterprise standards
  3. Reviews code only if infra checks pass
  4. Uses GenAI with strict grounding

The High-Level Architecture

High-level architecture

At a glance, the system works like this:

  1. The reviewer runs a local CLI command with a use case name.
  2. The system discovers all Glue jobs for that use case.
  3. It validates:
    • Glue job existence
    • Deployment status
    • CloudFormation provenance
    • Expected crawlers
  4. It retrieves enterprise review standards from Amazon S3, along with the glue job script to be reviewed against this checklist.
  5. It runs a GenAI-powered code review.
  6. It generates Markdown reports per job.

All fully automated.

Step 1: Discover Glue Jobs Automatically

Instead of hardcoding job names, I used the AWS Glue Data Catalog to discover jobs associated with a use case.

This allows the review process to:

  • Stay decoupled from naming conventions
  • Work across environments
  • Scale as pipelines grow

Step 2: Infrastructure Validation (The Missing Piece)

Before touching the code, the system validates the infrastructure.

For each Glue job, it checks:

  • Does the job exist?
  • Is it deployed?
  • Was it created via CloudFormation?
  • Does required crawler exist?

Example output:

Plain Text
 
Glue Job Exists

- Resource: Glue Job (<gluejob name>)  

                    Status: Deployed

                    Source: IaC (CloudFormation: <stack name>)

                    Compliance: Project standards met

                    If something is missing:

                    Crawler Doesn't Exist

                    <crawler name> Expected

                    - Compliance: Project standards not met


Step 3: The RAG Approach (Grounding the review)

Once infra checks pass, the system sends the script and the RAG-retrieved checklist to a Claude Sonnet 3.5 LLM hosted on Amazon Bedrock.

Here’s how:

  • The enterprise checklist is stored in Amazon S3
  • Relevant checklist is retrieved at runtime
  • Glue job script is retrieved from job metadata at runtime.
  • They are injected directly into the LLM prompt
  • The LLM evaluates code only against retrieved standards

The model does not invent rules.
It reasons strictly within enterprise-defined constraints.

This allows the governance rules to evolve (e.g., adding a new requirement) just by updating an S3 file, without changing a single line of the review code.

Why Not Free-Form GenAI?

Unconstrained GenAI:

  • Hallucinates standards
  • Produces inconsistent feedback
  • Is impossible to audit

That’s unacceptable in an enterprise.

What the Code Review Output Looks Like

Each Glue job gets a Markdown report like this:

Markdown
 
## Code Structure & Readability

Is the code modular and function-based?

- Status: NEEDS IMPROVEMENT

- Explanation: Logic is implemented in a single main function. But there is repetitive code for listing objects in S3 buckets.

- Recommendation: Create a separate function for listing S3 objects to reduce repetition.


Every checklist item includes:

  • Pass/fail status
  • Clear explanation
  • Actionable recommendation

Exactly how a senior engineer would review — but consistently.

Performance and Productivity Gains

By shifting governance left and using RAG, we can turn a slow, expert-driven, manual process into a fast, deterministic, self-service workflow:

metric manual process rag-automated process improvement
Review Time ~4 hrs ~3-4 Minutes 98% Faster
Rule Updates           
Manual Re-training Update S3 File Instant
Reviewer dependency Senior engineer Self-service Reallocation of senior expertise to high-impact architecture
Consistency Human-dependent 100% Policy-aligned No subjective bias
Review quality Variable Standardised Institutionalisation of "Best Practices.
Cost High-Cost Engineering Hours Nominal API/Compute Cost. (<$0.10 per PR via Amazon Bedrock) >99% Reduction in Direct Review Expense
Audit trail Manually 1 to 1 shared review comments Markdown artifacts High-fidelity visibility for stakeholders


Net result: ~95% reduction in review time.

Conclusion: Governance as a Service

This implementation proves that "Enterprise Standards" don't have to be a manual burden. By utilizing a RAG approach with Amazon Bedrock, a living governance engine can be created that executes in minutes and is active from the development phase itself. This allows senior engineers to focus on architecture and innovation, while the "bottleneck" of checklist-checking is eliminated.

The code for this implementation is available in my GitHub repository. 

Clone URL: https://github.com/chhabrapooja/infra-aware-code-review-automation.git 

I will make the repository public following the publication of this article.

Why This Works (and Scales)

Some reasons are:

  1. Infra comes first. Most review tools ignore infrastructure. This one starts there.
  2. Grounded GenAI implementation. RAG ensures correctness, consistency, and auditability.
  3. Standards are decoupled from code. Updating the checklist requires no redeployment.
  4. Decoupling infrastructure by lifecycle and/or use case. Instead of maintaining a single stack, we implemented granular stack segmentation based on the specific pipeline stage to locate specific resources instantly.

Future Roadmap

CI/CD Integration: Trigger reviews automatically on GitHub Pull Requests.

AWS RAG

Opinions expressed by DZone contributors are their own.

Related

  • How We Built a Smarter University Chatbot Using LLaMA2, AWS SageMaker, and RAG
  • Build Multimodal RAG Apps With Amazon Bedrock and OpenSearch
  • Enterprise RAG in Amazon Bedrock: Introduction to KnowledgeBases
  • Build a GitHub Slack Bot With AWS Bedrock and MCP, Part 1

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook