Kubernetes in the Enterprise
Over a decade in, Kubernetes is the central force in modern application delivery. However, as its adoption has matured, so have its challenges: sprawling toolchains, complex cluster architectures, escalating costs, and the balancing act between developer agility and operational control. Beyond running Kubernetes at scale, organizations must also tackle the cultural and strategic shifts needed to make it work for their teams.

As the industry pushes toward more intelligent and integrated operations, platform engineering and internal developer platforms are helping teams address issues like Kubernetes tool sprawl, while AI continues cementing its usefulness for optimizing cluster management, observability, and release pipelines.

DZone’s 2025 Kubernetes in the Enterprise Trend Report examines the realities of building and running Kubernetes in production today. Our research and expert-written articles explore how teams are streamlining workflows, modernizing legacy systems, and using Kubernetes as the foundation for the next wave of intelligent, scalable applications. Whether you’re on your first prod cluster or refining a globally distributed platform, this report delivers the data, perspectives, and practical takeaways you need to meet Kubernetes’ demands head-on.
The deal closed on Friday. By Monday, the acquired company’s CI/CD pipeline had triggered an outage during onboarding. No one knew where the deployment scripts were, and half the infrastructure was still named after an ex-employee’s dog. This is not edge case material — it is standard operating procedure in the world of permanent capital SaaS. As private equity and holdco operators lean into long-term ownership models, they are gobbling up B2B software companies at a record pace. Evergreen players like Constellation Software and Battery Ventures are scaling by acquisition, not by codebase. But while the spreadsheets align in the boardroom, the infrastructure underneath these deals tells a different story. There is no API for operational chaos — yet somehow, we are expected to merge pipelines, schemas, and APIs like they came off the same assembly line. “You can close a deal in 90 days. Fixing their CI config takes 90 weeks.” CI/CD: Where ‘Continuous’ Meant Continuously Breaking There is an assumption in many acquisitions that engineering systems will simply align themselves post-deal. But in practice, the CI/CD systems are where the ghosts of previous org structures come out to play. Legacy pipelines are often held together by tribal knowledge, undocumented environment variables, and cron jobs that everyone’s too afraid to touch. Even understanding where one deployment starts and another ends can be an archaeological exercise. The first stop in post-acquisition entropy is always the CI/CD pipeline. One company deploys via Jenkins. The next uses GitHub Actions with undocumented secrets. A third runs a Bash script from 2016 on a cron job. None of them agree on naming conventions. All of them break differently. YAML # Excerpt from legacy GitLab CI stages: - build - test - deploy deploy_prod: stage: deploy script: - echo "$PROD_SECRET" # hardcoded secret - scp ./build [email protected]:/var/www/html only: - main Trying to standardize across a new portfolio often ends in Terraform battles: Plain Text # Merge conflict between two Terraform states resource "aws_s3_bucket" "app_logs" { bucket = "company-a-logs" acl = "private" } # Conflicts with: resource "aws_s3_bucket" "app_logs" { bucket = "company-b-logs" versioning { enabled = true } } It is not about tools. It is about assumptions. One team assumes infra-as-code is sacred. The other treats it like a suggestion. Merging them means rewriting not just files, but mental models. Beyond the tooling clash, the culture clash is harder to resolve. Some teams bake deployments into sprint planning, while others treat release windows like an optional adventure. Shared ownership rarely survives past the first failed release. The result? A Frankenstein pipeline with half-written rollback scripts, misaligned approval gates, and a backlog full of To Do's no one owns. Schema Spaghetti: Data Contracts from a Parallel Universe Data is supposed to be the common language of modern SaaS. But when you acquire multiple companies with different interpretations of the same fields, it becomes a game of semantic roulette. Some systems were built with normalized databases and event sourcing; others store user preferences in XML blobs in a PostgreSQL column labeled 'misc_data'. The problem is not just a lack of standardization — it is the outright contradiction in business logic baked into these schemas. Even if you get the deployment ducks in a row, your data is already betraying you. Every acquired SaaS has a different take on "user," "order," or "status." 
You do not notice it until the dashboards stop matching or a unified billing report starts quoting fictional revenue. JSON // App A { "order_status": "complete", "customer_id": "987654", "tax": null } // App B { "status": "fulfilled", "clientId": 987654, "tax": "included" } And then the ETL pipeline dies quietly: Markdown ERROR: Unexpected field 'clientId' at line 742 ERROR: Null value in 'tax' violates NOT NULL constraint You think this is an easy fix until you discover there is no data contract. Just an API built six years ago that someone “wrote docs for” in a deprecated Notion workspace. Even worse, changes get pushed without schema versioning. Your data team gets paged at 2 AM because a critical dashboard stopped rendering, all because someone renamed a key in a hotfix. Tracking down the issue feels like digital archaeology. Most of it lives between inconsistent data models and tribal knowledge. Nobody owns the schema — everyone inherits the problem. “Our analytics platform became a liar with confidence.” API Versioning Hell: When Integration Becomes Interpretation If the data is contradictory and the deployments are broken, the APIs are downright poetic in their dysfunction. Inherited APIs often reflect a time when REST was optional, authentication was an afterthought, and versioning was a dream deferred. The effort to merge these into a unified platform is less about engineering and more about forensic linguistics. You are not integrating systems — you are translating dialects of dysfunction. Welcome to the final boss: legacy APIs. Not because they are old. Because they lie. One system uses REST v1.2 but pretends to be v2. Another returns HTTP 200 with error payloads. Swagger files exist, but they are aspirational. Plain Text # Two APIs returning conflicting payloads for the same endpoint curl https://api.company-a.com/user/123 { "id": 123, "email": "[email protected]" } curl https://api.company-b.com/user/123 { "user_id": "123", "contact": { "email": "[email protected]" } } Then there is the documentation: YAML # Excerpt from Swagger file /user/{id}: get: summary: Get user responses: 200: description: Success content: application/json: schema: $ref: '#/components/schemas/User' # But 'User' is undefined in components Integration becomes improvisation. You set up adapter services just to reconcile version mismatches. Internal SLAs are dropped to "best effort." Debugging these APIs is often trial by fire. You spin up Postman collections that start resembling fan fiction. Devs end up memorizing undocumented behavior patterns instead of fixing them. “Every integration becomes a negotiation. And someone always loses.” Tech Debt Is a Line Item. Chaos Is a System. By the time engineering leadership starts asking why everything is slow, the chaos has calcified. You are no longer debugging APIs — you are negotiating across philosophies. This is the unsexy reality of scaling SaaS via acquisition. You do not build new platforms. You inherit entropy. You inherit decisions made under entirely different constraints. Your job is not to erase them, but to surface, absorb, and redirect them. Three takeaways: Unify observability before code: If you cannot see across systems, you cannot scale them. Treat architectural unification like product work: Roadmap it. Staff it. Do not just assign it to “DevOps.” Audit data before you trust it: Contracts, not just pipelines, must align. “In rollups, you are not scaling products — you are scaling mismatches. 
The trick is surviving long enough to standardize anything.”
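None of this gets fixed overnight, but a thin normalization layer in front of the ETL can at least stop the dashboards from lying. The sketch below is illustrative only: the field names mirror the App A and App B payloads shown earlier, while the canonical Order shape, the helper names, and the status mapping are assumptions rather than a prescription.

TypeScript
// Hypothetical canonical shape the rollup agrees on going forward.
interface CanonicalOrder {
  customerId: string;
  status: "complete" | "unknown";
  taxIncluded: boolean | null; // null = the source system did not say
}

// Raw payloads as they actually arrive from the acquired apps.
type AppAOrder = { order_status: string; customer_id: string; tax: string | null };
type AppBOrder = { status: string; clientId: number; tax: string };

// One adapter per source system keeps each acquisition's quirks contained.
function fromAppA(o: AppAOrder): CanonicalOrder {
  return {
    customerId: o.customer_id,
    status: o.order_status === "complete" ? "complete" : "unknown",
    taxIncluded: o.tax === null ? null : o.tax === "included",
  };
}

function fromAppB(o: AppBOrder): CanonicalOrder {
  return {
    customerId: String(o.clientId), // numbers become strings exactly once, here
    status: o.status === "fulfilled" ? "complete" : "unknown",
    taxIncluded: o.tax === "included",
  };
}

Each source system gets its own adapter, so when a third acquisition arrives, the blast radius of its schema quirks stays inside one function instead of spreading through every report.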
The True Challenge of Modern Web App QA

The complexity of today's web applications has fundamentally changed the nature of software quality assurance. We've moved far beyond the simple, static pages of the past. Today's applications are vast, intricate ecosystems defined by:

- Microservices and distributed architectures running in the cloud
- Components deployed at the edge
- Real-time data feeds and APIs from third parties
- User interfaces built with multiple, deeply nested layers

Each of these interconnected components introduces unique quirks, hidden dependencies, and potential failure points. Despite this exponential increase in complexity, testing often remains an under-resourced afterthought — a simple box to tick just before a major release. The reality is stark: overlooked quality issues are not just minor bugs. They actively delay deployment timelines, introduce critical security vulnerabilities, and ultimately inflict long-term damage to your brand reputation and user trust. In the fast-paced world of digital services, you don't get a chance to "reset the scene." Once the users encounter the problem, the damage is already done. So, how do we shift the testing mindset from an afterthought to a core function? The solution lies in recognizing and preemptively addressing the true costs of inadequate testing.

Unmasking the Real Costs of Missed Issues

Production failures rarely announce themselves with fanfare. Often, they are subtle issues that hide in plain sight until peak usage or a crucial user journey. These issues translate directly into lost revenue, wasted time, and user abandonment.

Hidden Cost Scenario 1: Cross-Browser State Drift
Your primary testing environment — likely Chrome — shows a flawless user dashboard. However, a major corporate client uses Safari or an older Edge version. Suddenly, charts are misaligned, buttons vanish, and critical layout breaks. The cost to fix this issue is high: an immediate halt on new development, urgent back-porting of fixes, and a stressful emergency patch deployment.

Hidden Cost Scenario 2: Load-Induced Session Failure
Your staging environment performs perfectly during standard load testing. But a successful flash sale or viral event drives traffic to ten times the expected volume. Under this sudden stress, session tokens begin to expire prematurely, logging out users mid-checkout. The consequence is immediate: a surge of support tickets, real-time social media backlash, and a direct impact on conversion rates.

Hidden Cost Scenario 3: Untested Third-Party API Dependence
Your application relies on a critical third-party service, such as a payment gateway or data feed. On release day, a sudden spike in their API latency causes your entire checkout process to time out. Without a realistic simulation and a robust fallback handler, your application fails at its most critical moment, resulting in a quantifiable loss of revenue.

These are not merely technical bugs; they are business failures caused by a testing strategy that lacked sufficient depth and realism.

Critical Gaps You Cannot Ignore in Web Application Testing

The most dangerous testing gaps are rarely due to a lack of effort, but rather an inability to keep pace with system complexity. Addressing them requires modern tools and a focused strategy.
Challenge: Faulty State Management
Why it's missed: Session storage, cookies, and cache behavior vary drastically; teams test in their local environment and assume uniformity.
Recommended tools: Playwright for multi-browser automation; device emulation and real-world coverage services.

Challenge: Shadow DOM & Inaccessible UI
Why it's missed: Custom components conceal elements from standard selectors, causing automation scripts to fail silently or inconsistently.
Recommended tools: Playwright + Applitools for visual checks; advanced custom locators for Shadow Roots.

Challenge: Flawed Third-Party API Simulation
Why it's missed: Teams rely on basic API mocks but fail to simulate real-world contract changes, failures, or latency spikes.
Recommended tools: Pact for contract testing; network throttling features in browser automation tools like Playwright.

Challenge: Unrealistic Load Test Journeys
Why it's missed: Stress tests hit generic endpoints, completely bypassing real user flows like complex login sequences or multi-step forms.
Recommended tools: Gatling or k6 for scripting full, realistic user scenarios that mirror production usage.

Challenge: Microservice Data Drift
Why it's missed: Configuration in staging environments doesn't match production; data contracts degrade over time without continuous validation.
Recommended tools: Pact for contract validation; automated environment synchronization scripts.

Challenge: False Positives in CI Pipelines
Why it's missed: Tests pass locally but fail intermittently in the Continuous Integration (CI) environment due to flaky selectors or timing issues.
Recommended tools: Testim for AI-based locator healing; Applitools for reliable visual baselines and retry logic.

Why Cross-Functional Collaboration Is Essential

The outdated model of "throwing code over the wall" to the QA team is incompatible with modern, high-velocity development. The most effective quality strategies rely on seamless collaboration:

- Shared ownership: Developers must write unit and integration tests, while QA engineers focus on exploratory testing and edge cases. Product Managers define real-world acceptance criteria that prioritize user value.
- TestOps integration: This is the practice of embedding test execution, monitoring, and reporting directly into CI/CD pipelines to provide continuous, immediate feedback.
- Test-Driven Development (TDD): Building features with testing in mind from day one drastically reduces the cost of defect remediation later in the cycle.

A Practical Quality Blueprint for Engineering and Product Leaders

Adopting a quality-first culture requires a structured roadmap:

1. Align testing with business goals: Identify high-value user journeys and critical revenue-generating workflows. These must receive the highest priority in both manual and automated coverage.
2. Integrate non-functional testing early: Performance, security, and accessibility cannot be treated as post-development phases. They must be incorporated into initial sprint planning and acceptance criteria.
3. Audit coverage continuously: Move beyond simple test counts. Use dashboards to evaluate whether tests are meaningful, stable, and cover the high-risk areas identified in step one.
4. Modernize the toolchain: Tool stagnation creates blind spots. Regularly evaluate and upgrade your stack with modern web application testing services that address today's complex architectures.
5. Automate strategically: Over-automation on unstable, low-value flows can be as costly as under-testing. Focus automation efforts on stable, high-value user journeys.

When to Engage External Quality Experts

Internal teams often lack the time or specialized expertise to handle all quality demands.
Consider bringing in external web application testing services when you see the following signs:

- Releases are consistently delayed due to test instability or flakiness.
- Your existing test environments are limited, outdated, or difficult to maintain.
- Security audits or third-party integrations consistently lack full, dedicated coverage.
- Your engineering team spends more time debugging the test automation pipeline than writing new feature code.

Conclusion

Every successful web application that collapses under load or fails in a niche browser demonstrates the critical cost of missing testing details. These quiet issues — from fragile APIs to complex state management — can drain time, budgets, and user trust. The path forward is not simply "more testing," but smarter, targeted quality engineering supported by a modern toolkit and a collaborative culture. If a flawless user experience and unblemished brand reputation are your goals, prioritize quality. Test thoroughly, deploy with confidence.
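To make the Playwright recommendation from the gap table concrete, here is a minimal, hedged sketch that runs one critical-path check against three browser engines. The URL, selectors, and environment variable are placeholders; a real suite would live in a proper runner such as @playwright/test rather than a bare script.

TypeScript
import { chromium, firefox, webkit } from "playwright";

// Run the same critical-path check against three engines to catch
// the "works in Chrome, breaks in Safari" class of state drift.
async function checkDashboard(engine: typeof chromium, name: string): Promise<void> {
  const browser = await engine.launch();
  const page = await browser.newPage();
  try {
    await page.goto("https://app.example.com/login"); // placeholder URL
    await page.fill("#email", "qa@example.com");      // placeholder selectors
    await page.fill("#password", process.env.QA_PASSWORD ?? "");
    await page.click("button[type=submit]");
    // The dashboard chart must render in every engine, not just Chromium.
    await page.waitForSelector("[data-testid=revenue-chart]", { timeout: 10_000 });
    console.log(`${name}: dashboard rendered`);
  } finally {
    await browser.close();
  }
}

const engines: Array<[string, typeof chromium]> = [
  ["chromium", chromium],
  ["firefox", firefox],
  ["webkit", webkit],
];

(async () => {
  for (const [name, engine] of engines) {
    await checkDashboard(engine, name);
  }
})();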
Do you struggle to organize your React project folder structures? The right folder organization does more than just keep things tidy. The development efficiency and team collaboration of your project depend on it. A solid folder structure makes the project look clean and speeds up scalability, collaboration, and debugging. The way developers organize files and components is vital to the long-term success of any React project.

React applications are easier to scale, debug, and onboard new team members into when they have a clean folder structure and clear system design. The best folder structure practices help developers find files and components quickly, which reduces the time spent searching through the code. A consistent folder structure prevents chaos and confusion as your application grows. Team members work together more smoothly, with fewer merge conflicts and improved productivity.

This section explores the best practices for different React project folder structures. We start with simple setups for small projects and move to advanced, feature-based organizations for large-scale applications later. You will discover practical guidance to create a maintainable and scalable React project structure, whether you are launching something new or reorganizing an existing project.

Simple React Folder Structure for Small Projects

Simple organization beats complexity in early-stage projects. A flat folder structure offers clear visibility without extra layers, particularly when fewer than 15 components are present. The simple structure is as follows:

- src/ – Main source directory
- components/ – Houses all UI components
- hooks/ – Contains custom hooks
- assets/ – Stores images, icons, and static files
- App.js – Main application component
- index.js – Entry point

This streamlined approach eliminates decision fatigue during development. Developers can focus on building features rather than debating file locations. New team members can quickly find files without digging through nested directories. The hooks/ folder serves as a central place for reusable logic that multiple components require. Expert React developers suggest setting up this pattern early, as it works well regardless of the project size.

Test organization can follow two paths: keep tests with components (Button.js and Button.test.js together) or create a separate tests/ directory. Each approach offers benefits: co-location keeps tests next to the code they cover, whereas separate test folders maintain clear boundaries.

This structure shows its limitations as projects expand. The components/ folder becomes difficult to manage once you cross 15–20 components. Even so, this straightforward setup allows easy reorganization as the architecture of the application evolves.

Intermediate Structure With Page-Based Organization

Simple projects can use a technical-type organization, but larger applications require a better approach. Page-based organization works really well here; you group files by feature or route instead of type. Colocation is the key principle that keeps related files together, regardless of their technical category. We organized the React project folder structure around pages and routes.

Plain Text
src/
├── pages/
│   ├── Login/
│   │   ├── index.js       # Main page component
│   │   ├── LoginForm.js   # Page-specific component
│   │   └── useLogin.js    # Page-specific hook
│   └── Dashboard/
│       └── ...
├── components/   # Shared components
├── hooks/        # Shared hooks
├── context/      # React Context files
├── services/     # API calls, utilities
└── assets/       # Images, fonts, etc.
The basic rule is Colocate first, extract later. Your page folder should contain every component, hook, or utility at the beginning. When you find something that multiple pages use, move it to a shared folder. Mid-sized applications benefit from this structure in several ways, including:

- The code's responsibilities become clearer when the related pieces stay together.
- Developers can quickly locate page-specific code.
- Debugging becomes easier with all relevant files in one place.
- The structure scales well without complex abstractions.

This approach naturally guides the evolution of an application from simple to advanced structures.

Advanced Feature-Based React Project Structure

Feature-based architecture is at the forefront of React project organization, particularly in enterprise-scale applications with complex domains. The code is grouped by business features or domains and creates self-contained modules that package all related functionalities. The basic structure includes the following components:

Plain Text
src/
├── features/    # Self-contained business domains
│   ├── authentication/
│   ├── users/
│   ├── posts/
├── core/        # Global, app-wide concerns
│   ├── api/
│   ├── config/
├── layouts/     # Page structure components
└── shared/      # Common utilities and components

Each feature folder functions as a mini-application with its own components, hooks, services, and state management. Clear boundaries between different application parts make the codebase easier to understand and maintain as it expands. Barrel files (index.js) can enhance this structure by serving as public APIs for features. These files export only externally needed items and hide the implementation details.

JavaScript
// features/users/index.js
export { UserList } from './components/UserList';
export { useUsers } from './hooks/useUsers';
// Internal components/utilities remain private

Your structure becomes more robust with absolute imports configured in jsconfig.json or tsconfig.json.

JSON
{
  "compilerOptions": {
    "baseUrl": "src"
  }
}

Clean imports, such as import { Button } from 'shared/ui', become possible without relative path complexity. The Facade pattern helps manage third-party libraries using wrapper services that abstract external dependencies. Library updates or replacements are simplified with centralized integration points. This feature-based approach creates a scalable foundation. Teams collaborate better through clear ownership boundaries, and new developers can onboard easily.

Conclusion

The structure of your React project depends on the size, complexity, and future growth plans of your project. This overview shows how folder organization affects development efficiency, team collaboration, and code maintainability. Small projects perform well with simple structures. The components, hooks, and assets live in clearly defined top-level folders. This approach works well until you reach about 15–20 components. Page-based organization becomes a natural progression after that point, grouping related files by feature or route rather than technical type. Feature-based architecture is the gold standard for enterprise applications. It creates self-contained modules that encapsulate the business domain. This structure is especially beneficial when multiple teams work on different application areas simultaneously. Your project structure should progress in parallel with your application. You do not need to stick to one approach forever.
Many successful React applications begin with minimal organization and gradually shift toward more sophisticated patterns as they scale up. Start with the simplest structure that meets your current needs, and then refactor as patterns emerge. Techniques such as barrel files, absolute imports, and API facades will also strengthen whatever structure you choose. This makes the codebase more maintainable over time. The best React project structure is not the most complex or trendy. It helps your team build features quickly while reducing cognitive load. This principle will guide you in building applications that remain maintainable from setup to scale.
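As a follow-up to the API facade idea mentioned above, here is a minimal TypeScript sketch. The shared/api location, the BASE_URL value, and the use of the built-in fetch API are assumptions for illustration; the point is simply that feature code depends on a small wrapper rather than directly on any particular HTTP library.

TypeScript
// shared/api/httpClient.ts: a hypothetical facade over whatever HTTP library is in use.
// Feature folders import this wrapper, so swapping fetch/axios/ky later touches one file.

export interface HttpClient {
  get<T>(path: string): Promise<T>;
  post<T>(path: string, body: unknown): Promise<T>;
}

const BASE_URL = "/api"; // assumption: same-origin API prefix

async function request<T>(path: string, init?: RequestInit): Promise<T> {
  const response = await fetch(`${BASE_URL}${path}`, {
    headers: { "Content-Type": "application/json" },
    ...init,
  });
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${path}`);
  }
  return (await response.json()) as T;
}

export const httpClient: HttpClient = {
  get: <T>(path: string) => request<T>(path),
  post: <T>(path: string, body: unknown) =>
    request<T>(path, { method: "POST", body: JSON.stringify(body) }),
};

A feature could then call httpClient.get<User[]>('/users') and never notice if the underlying transport is replaced later.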
Kubernetes has steadily evolved into an industry standard for container orchestration, powering platforms from small developer clusters to hyperscale AI and data infrastructures. Every new release introduces features that not only make workloads easier to manage but also improve performance, cost efficiency, and resilience. With the v1.34 release, one of the standout enhancements is the introduction of traffic distribution preferences for Kubernetes Services. Specifically:

- PreferSameNode: Route traffic to endpoints on the same node as the client pod if possible.
- PreferSameZone: Prefer endpoints in the same topology zone before going cross-zone.

These policies add smarter, locality-aware routing to Service traffic distribution. Instead of treating all pods equally, Kubernetes can now prefer pods that are closer to the client, whether on the same node or in the same availability zone (AZ). This change is simple, but it has meaningful implications for performance-sensitive and cost-sensitive workloads, particularly in large multi-node and multi-zone clusters.

Traffic Distribution Significance

Traditionally, a Kubernetes Service balances traffic evenly across all endpoints/pods that match its selector. This even traffic distribution is simple, predictable, and works well for most use cases. However, it does not take into consideration topology, the physical or logical placement of pods across nodes and zones.

Round-Robin Challenges

- Increased latency: If a client pod on Node A routes to a Service endpoint on Node B (or, worst case, to a different zone), the extra network hop adds milliseconds of delay.
- Cross-zone costs: In cloud environments, cross-AZ traffic is often billed by cloud providers; even a few MBs of cross-zone traffic across thousands of pods can rack up significant costs.
- Cache inefficiency: Some ML inference services cache models in memory per pod. If requests bounce across pods randomly, cache hit rates drop, hurting both performance and resource efficiency.

What’s New in Kubernetes v1.34

With the new trafficDistribution field, Kubernetes Services now support an optional setting under spec:

YAML
spec:
  trafficDistribution: PreferSameNode | PreferSameZone

- Default behavior (if unset): Traffic is still distributed evenly across all endpoints.
- PreferSameNode: The kube-proxy (or service proxy) will attempt to send traffic to pods running on the same node as the client pod. If no such endpoints are available, it falls back to zone-level or cluster-wide balancing.
- PreferSameZone: The proxy will prioritize endpoints within the same topology zone as the client pod. If none are available, it falls back to cluster-wide distribution.

Traffic Distribution High-Level Diagram

These preferences are optional, and if no preference is specified, then by default, the traffic will be evenly distributed across all endpoints in the cluster.

Benefits

- Lower latency: Requests take fewer network hops when served locally on the same node or within the same zone. This is especially critical for microservices with tight latency SLAs or ML workloads where inference times are measured in milliseconds.
- Reduced costs: Cloud providers typically charge for cross-zone traffic.
Routing to local pods first avoids these charges unless necessary.Improved cache utilization: Workloads such as ML inference pods often keep models, embeddings, or feature stores warm in memory, with the same node routing, which increases cache hit rates.Built-in fault tolerance: Both policies are preferences, not hard requirements. If no local endpoints exist due to a node being drained or a zone outage, then Kubernetes seamlessly falls back to cluster-wide distribution. Use Cases ML inference services cache warm models in the pod.Distributed systems where data nodes align with zones.Larger orgs deploying across multiple AZs can achieve smart failover as traffic stays local under normal conditions, but failover seamlessly if the zone experiences an outage. Demo Walkthrough We will try to cover traffic distribution scenarios — default, PreferSameZone, PreferSameNode, and fallback — in the demo below. Demo: Set Up Cluster, Deploy Pods, Services, and Client Step 1: Start a multi-node cluster on minikube and label the nodes with zones: Shell minikube start -p mnode --nodes=3 --kubernetes-version=v1.34.0 kubectl config use-context mnode kubectl label node mnode-m02 topology.kubernetes.io/zone=zone-a --overwrite kubectl label node mnode-m03 topology.kubernetes.io/zone=zone-b --overwrite Step 2: Deploy the echo app with two replicas and the echo service. Shell # echo-pod.yaml apiVersion: apps/v1 kind: Deployment metadata: name: echo spec: replicas: 2 selector: matchLabels: app: echo template: metadata: labels: app: echo spec: containers: - name: echo image: hashicorp/http-echo args: - "-text=Hello from $(POD_NAME)" env: - name: POD_NAME valueFrom: fieldRef: fieldPath: metadata.name ports: - containerPort: 5678 # echo-service.yaml apiVersion: v1 kind: Service metadata: name: echo-svc spec: selector: app: echo ports: - port: 80 targetPort: 5678 Shell kubectl apply -f echo-pod.yaml kubectl apply -f echo-service.yaml # verify pods are running on separate nodes and zones kubectl get pods -l app=echo -o=custom-columns=NAME:.metadata.name,NODE:.spec.nodeName --no-headers \ | while read pod node; do zone=$(kubectl get node "$node" -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}') printf "%-35s %-15s %s\n" "$pod" "$node" "$zone" done As you can see in the screenshot below, two echo pods spin up on separate nodes (mnode-m02, mnode-m03) and availability zones (zone-a, zone-b). Step 3: Deploy a client pod in zone A. YAML # client.yaml apiVersion: v1 kind: Pod metadata: name: client spec: nodeSelector: topology.kubernetes.io/zone: zone-a restartPolicy: Never containers: - name: client image: alpine:3.19 command: ["sh", "-c", "sleep infinity"] stdin: true tty: true Shell kubectl apply -f client.yaml kubectl get pod client -o=custom-columns=NAME:.metadata.name,NODE:.spec.nodeName --no-headers \ | while read pod node; do zone=$(kubectl get node "$node" -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}') printf "%-35s %-15s %s\n" "$pod" "$node" "$zone" done Client pod is scheduled on node mnode-m02 in zone-a. Step 4: Set up a helper script in the client pod. Shell kubectl exec -it client -- sh apk add --no-cache curl jq cat > /hit.sh <<'EOS' #!/bin/sh COUNT="${1:-20}" SVC="${2:-echo}" PORT="${3:-80}" i=1 while [ "$i" -le "$COUNT" ]; do curl -s "http://${SVC}:${PORT}/" \ | jq -r '.env.POD_NAME + "@" + .env.NODE_NAME' i=$((i+1)) done | sort | uniq -c EOS chmod +x /hit.sh exit Demo: Default Behavior Form client shell run script: hit.sh to generate traffic from client pod to echo service. 
Shell
/hit.sh 20

Behavior: In the screenshot below, you can see traffic routed to both pods (10 requests each) in round-robin style.

Demo: PreferSameNode

Patch the echo service definition spec to add trafficDistribution: PreferSameNode.

Shell
kubectl patch svc echo --type merge -p '{"spec":{"trafficDistribution":"PreferSameNode"}}'

From the client shell, run the hit.sh script to generate traffic from the client pod to the echo service.

Shell
/hit.sh 40

Behavior: Traffic should get routed to pod echo-687cbdc966-mgwn5@mnode-m02, residing on the same node (mnode-m02) as the client pod.

Demo: PreferSameZone

Update the echo service definition spec to set trafficDistribution: PreferSameZone.

Shell
kubectl patch svc echo --type merge -p '{"spec":{"trafficDistribution":"PreferSameZone"}}'

From the client shell, run the hit.sh script to generate traffic from the client pod to the echo service.

Shell
/hit.sh 40

Behavior: Traffic should get routed to the pod residing in the same zone (zone-a) as the client pod.

Demo: Fallback

Force all echo pods to zone-b, then test again:

Shell
kubectl scale deploy echo --replicas=1
kubectl patch deploy echo --type merge -p \
  '{"spec":{"template":{"spec":{"nodeSelector":{"topology.kubernetes.io/zone":"zone-b"}}}}}'
kubectl rollout status deploy/echo

From the client shell, run the hit.sh script to generate traffic from the client pod to the echo service.

Result Summary

- Default: Traffic distributed across all endpoints in round-robin fashion.
- PreferSameNode: Prefers pods on the same node; falls back if none are available.
- PreferSameZone: Prefers pods in the same zone; falls back if none are available.

Conclusion

Kubernetes release v1.34 adds two small but impactful capabilities: PreferSameNode and PreferSameZone. These preferences help developers and Kubernetes operators make traffic routing smarter, ensuring traffic prioritizes local endpoints while maintaining resiliency through a fallback mechanism.

References

PreferClose traffic distribution is deprecated
As the number of data centers and their size grow worldwide, requiring increased efficiency, scalability, and agility from IT infrastructure, the convergence of virtual machines (VMs) and cloud-native technologies is crucial for success. A recent conversation between Dave Neary of Ampere Computing and Alexandra Settle, Product Manager for SUSE Virtualization, highlights a significant step forward: the general availability of SUSE Virtualization for Arm64 architecture, and Harvester’s pivotal role within SUSE’s ecosystem. This white paper summarizes their discussion, highlighting how SUSE is empowering organizations to modernize infrastructure with energy-efficient, high-performance solutions. SUSE Virtualization and Harvester: A Hyperconverged Foundation SUSE Virtualization, powered by its open-source upstream project Harvester, emerges as a robust hyperconverged infrastructure (HCI) solution designed to simplify IT operations. It unifies the management of virtual machines, containers, distributed storage, and comprehensive observability under a single, intuitive platform. Built on a Linux base, Harvester leverages Kubernetes (specifically RKE2) as its orchestration engine. This foundation integrates key cloud-native technologies, such as KubeVirt for virtual machine management, Longhorn for persistent block storage, and Prometheus/Grafana for monitoring and analytics. The core value proposition of SUSE Virtualization is its ability to bridge the gap between legacy VM workloads and modern containerized applications. It enables organizations to run both side-by-side, facilitating a smoother, incremental transition towards cloud-native architectures without the need to replace existing investments. This approach addresses the prevalent challenge of modernization, where virtual machines, despite the rise of containers, continue to play a critical role. SUSE further supports this journey through partnerships with migration specialists, assisting customers in transitioning VMs to Harvester and ultimately to container-based architectures as appropriate. The Strategic Advantage of Arm64 With Ampere A critical enabler of this modernization strategy is the integration of Arm64 architecture support. Ampere Computing is at the forefront of designing energy-efficient, many-core processors specifically tailored for cloud workloads. The advantages of Arm64 processors, particularly Ampere’s offerings, are compelling: they deliver single-threaded cores for predictable vCPU performance, enabling higher core density per rack. This translates into significantly lower power consumption and substantial cost benefits, positioning Arm as an attractive, greener alternative for cloud operators and enterprises alike. These attributes directly align with customer demands for enhanced performance, reduced operational costs, and environmentally sustainable infrastructure. A Phased Rollout and Community Engagement The path to general availability for Arm64 support has been a collaborative effort. Harvester first introduced Arm64 as a technical preview in its 1.3 release. Through extensive testing and QA with partners like Ampere, and invaluable community input, Arm64 achieved general availability and full support with the SUSE Virtualization 1.5 release (community release 1.5.0 in late May '25), with the enterprise “Prime” release following shortly thereafter. 
This commitment extends across the SUSE ecosystem, with components like Longhorn already supporting Arm64, and future plans targeting Rancher Manager for multi-cluster management on Arm64 in FY25. SUSE actively encourages users to test Harvester on Arm64, provide feedback, and engage with the active community channels on the Rancher Users Slack. This collaborative approach recognizes that real-world testing across diverse environments is crucial for platform refinement and robustness. Conclusion: A Path to Efficient Cloud Modernization The partnership between SUSE and Ampere, underscored by the general availability of SUSE Virtualization on Arm64, represents a significant leap forward in cloud modernization. It offers organizations a powerful, energy-efficient, and cost-effective path to integrate their existing VM infrastructure with the agility of cloud-native containers. This strategic alignment empowers businesses to build resilient, scalable, and sustainable IT environments for the future. Key points: SUSE Virtualization (Harvester) is an HCI solution for VMs, containers, storage, and observability, built on Kubernetes.It bridges legacy VMs with modern cloud-native applications, enabling incremental modernization.Arm64 support is now generally available, offering significant benefits like lower power consumption, higher core density, and reduced costs.Ampere Computing provides energy-efficient, high-performance Arm64 processors tailored for cloud workloads.The platform leverages KubeVirt, Longhorn, Prometheus/Grafana, all integrated with Kubernetes (RKE2). To gain deeper insights and see the full discussion, we invite you to watch the complete video: Check out the full Ampere article collection here.
In the previous article, I discussed the most often overlooked aspect of Domain-Driven Design: the strategic side. When it comes to software development, teams tend to rush toward code, believing that implementation will clarify the domain. History shows the opposite — building without understanding the underlying reason or direction often leads to systems that are technically correct but conceptually wrong. As the Greek root of strategy (strategos, “the art of the general”) suggests, the plan must precede the movement.

Now that we’ve explored the “why” and “what,” it’s time to turn to the “how.” Tactical DDD represents this next step — the process of transforming a well-understood domain into expressive, maintainable code. While strategic design defines boundaries and fosters a shared understanding, tactical design brings those ideas to life within each bounded context.

Tactical DDD focuses on implementing the domain model. It provides a rich vocabulary of design patterns — entities, value objects, aggregates, repositories, and domain services — each serving a precise purpose in expressing business logic. These patterns were not invented from scratch by Eric Evans; instead, they emerged from decades of object-oriented design thinking, later refined to fit complex business domains. The term “entity,” for instance, descends from the Latin “entitas” — “being” — emphasizing identity and continuity through change, while “value objects” recall the algebraic notion of equality by content rather than identity.

What makes tactical DDD essential is its ability to create models that are not only accurate but also resilient to change in an era crowded with tools and architecture patterns, such as distributed systems, microservices, and cloud-native architectures. Without a clear direction, these options can mislead us and generate unnecessary complexity. This layer bridges the conceptual clarity of the strategic model with the practical demands of implementation. Tactical design ensures that business rules are captured in code rather than scattered across services, controllers, or database scripts. It’s about writing software that behaves like the business, not merely one that stores its data.

As the strategic part defines direction, the tactical part defines execution. It consists of seven essential patterns that turn concepts into code.

- Entities – Objects with identity that persist and evolve.
- Value Objects – Immutable objects defined only by their attributes.
- Aggregates – Groups of related entities ensuring consistent boundaries.
- Repositories – Interfaces that abstract persistence of aggregates.
- Factories – Responsible for creating complex domain objects.
- Domain Services – Hold domain logic that doesn’t fit an entity or value object.
- Domain Events – Capture and communicate significant occurrences in the domain.

Each plays a specific role in expressing business logic faithfully within a bounded context. Together, they bring the domain model to life, ensuring that design decisions remain aligned with the business, even as they are implemented deep within the code.

Entities

Entities represent domain objects with a unique identity, or ID, that persists over time, even as their attributes might change. They model continuity — something that remains the same even when its data evolves. They capture real-world concepts like Order, Customer, or Invoice, where identity defines existence. In e-commerce, an Order remains the same object whether it’s created, updated, or completed.
Java public class Order { private final UUID orderId; private List<OrderItem> items = new ArrayList<>(); private OrderStatus status = OrderStatus.NEW; public void addItem(OrderItem item) { items.add(item); } } Value Objects Value Objects describe elements of the domain that are defined entirely by their values, not by their identity. They are immutable, replaceable, and ensure equality through content. In practice, value objects like Money, Address, or DateRange make models safer and more precise. For example, a Money object adds two amounts of the same currency, ensuring correctness and immutability. Java public record Money(BigDecimal amount, String currency) { public Money add(Money other) { if (!currency.equals(other.currency())){ throw new IllegalArgumentException("Currencies must match"); } return new Money(amount.add(other.amount()), currency); } } Aggregates Aggregates organize related entities and value objects under a single consistency boundary, ensuring that business rules remain valid. The aggregate root acts as the guardian of its internal state. A typical example is an Order controlling its OrderItems. All modifications are routed through the root, preserving invariants such as total price and item limits. Java public class Order { private final UUID orderId; private final List<OrderItem> items = new ArrayList<>(); public void addItem(Product product, int quantity) { items.add(new OrderItem(product, quantity)); } public BigDecimal total() { return items.stream() .map(OrderItem::subtotal).reduce(BigDecimal.ZERO, BigDecimal::add); } } Repositories Repositories abstract the way aggregates are stored and retrieved, allowing the domain to stay independent of database concerns. They act as in-memory collections that handle persistence transparently. A repository enables the domain to operate at a higher level, focusing on business logic rather than SQL or API calls. For example, an OrderRepository manages how Order objects are saved or found, without exposing infrastructure details. Java public interface OrderRepository { Optional<Order> findById(UUID id); void save(Order order); void delete(Order order); } Factories Factories are responsible for creating complex domain objects while ensuring that all invariants are satisfied. They centralize creation logic, keeping entities free from construction complexity. When creating an Order, for example, a factory ensures the object starts in a valid state and respects business rules — avoiding scattered creation logic throughout the code. Java public class OrderFactory { public Order create(Customer customer, List<Product> products) { Order order = new Order(UUID.randomUUID(), customer); products.forEach(p -> order.addItem(p, 1)); return order; } } Domain Services Domain Services hold domain logic that doesn’t naturally belong to an entity or value object. They express behaviors that involve multiple aggregates or cross-cutting business rules. For instance, a PaymentService could coordinate payment processing for an order. It operates at the domain level, preserving the model’s purity while integrating with external systems when necessary. Java public class PaymentService { private final PaymentGateway gateway; public PaymentService(PaymentGateway gateway) { this.gateway = gateway; } public PaymentReceipt processPayment(Order order, Money amount) { return gateway.charge(order.getOrderId(), amount); } } Domain Events Domain Events capture meaningful occurrences within the business domain. 
They represent something that happened — not an external trigger, but a fact that the domain itself wants to share. This makes the model more expressive, reactive, and aligned with real business language. For example, when an Order is placed, it can publish an OrderPlacedEvent. Other parts of the system, such as billing, shipping, or notification services, can then react independently, promoting decoupling and scalability. Java public record OrderPlacedEvent(UUID orderId, Instant occurredAt) { public static OrderPlacedEvent from(Order order) { return new OrderPlacedEvent(order.getOrderId(), Instant.now()); } } Application Services — Orchestrating Use Cases Although Application Services are not part of the original seven tactical DDD patterns, they deserve mention for their role in modern architectures. They act as use-case orchestrators, coordinating domain operations without containing business logic themselves. Application services sit above the domain layer, ensuring that controllers, APIs, or message handlers remain thin and focused on their primary purpose: communication. For example, when placing an order, an application service coordinates the creation of the Order, its persistence, and the payment process. The domain remains responsible for what happens, while the application service decides when and in which sequence those actions occur. Java public class OrderApplicationService { private final OrderRepository repository; private final PaymentService paymentService; private final OrderFactory factory; public OrderApplicationService(OrderRepository repository, PaymentService paymentService, OrderFactory factory) { this.repository = repository; this.paymentService = paymentService; this.factory = factory; } @Transactional public void placeOrder(Customer customer, List<Product> products) { Order order = factory.create(customer, products); repository.save(order); paymentService.processPayment(order, order.total()); } } In practice, application services serve as the entry points for use cases, managing transactions, invoking domain logic, and triggering external integrations as needed. They maintain the model’s purity while enabling the system to execute coherent business flows from end to end. Conclusion Tactical Domain-Driven Design brings strategy to life. While the strategic side defines boundaries and shared understanding, the tactical patterns — entities, value objects, aggregates, repositories, factories, domain services, and domain events — translate that vision into expressive, maintainable code. Even the application service, although not part of the original seven, plays a vital role in orchestrating use cases and maintaining the model's purity.
Infrastructure security has long been about protecting networks, hosts, and cloud platforms. Application security focuses on securing APIs, data flows, and business logic to protect critical assets. Both approaches are critical, but they can’t provide complete protection on their own. When isolated from each other, there is a higher risk that attackers can exploit the security gaps in either layer, which is why workload identities should be employed to serve as a bridge that unifies both layers. The Two Security Worlds: Infrastructure vs. Application Infrastructure security is about protecting the foundation, including measures such as network segmentation, least-privilege identity and access management (IAM) roles, container isolation, and applying zero-trust principles at the infrastructure level. Application security works higher up in the stack, closer to data and user interactions. It deals with authentication, authorization, token management, session handling, and API protection. If pursued separately, weak security gaps may exist. Application teams may assume the infrastructure is already secure, while infrastructure teams might not always see the fine-grained context the application relies on. What Is a Workload Identity? A workload identity, or a non-human identity, is an identity assigned to a piece of software, such as an API client, microservice, or scheduled job, rather than a human person. Each workload is associated with a unique set of attributes that distinguish it from others and enables it to exchange identity information with other workloads. A workload identity consists of three components: The workload identifierThe workload credentialOptionally, a private key linked to the workload credential The workload identifier contains a unique ID that differentiates it from others. The workload identifier, along with other attributes, is attested by the trusted third party in a workload credential. The workload credential, together with the workload's private key, then enables the verification of the identifier and the additional attributes. What is a Workload Credential? To establish trust, workloads receive credentials from credential issuers and can use them to authenticate when interacting with systems and services. Workload credentials are documents that a recipient can cryptographically verify and authenticate. When used in this way, the recipient receives a high level of assurance of the caller's identity. Workload credentials are typically short-lived and automatically managed by the underlying platform, reducing exposure and strengthening security. A workload stores its credentials in memory or on disk and presents them as proof of ownership when communicating with other workloads. Common credential issuers include Kubernetes, cloud providers such as AWS, Azure, and Google Cloud, as well as open-source solutions like SPIRE, which can issue interoperable JSON Web Token (JWT) and X.509 credentials for workload-to-workload authentication. For example, a microservice named api-service could be issued a JWT that asserts the service’s identity. 
Such a credential could look similar to this: Plain Text eyJhbGciOiJQUzI1NiJ9.eyJpc3MiOiJ3b3JrbG9hZC1pZGVudGl0eS1pc3N1ZXIiLCJzdWIiOiJhcGktc2VydmljZSIsImF1ZCI6Imh0dHBzOi8vbG9naW4uZXhhbXBsZS5jb20iLCJleHAiOjE3NDYwNTA0MDB9.l-ZMRZa6_FCXaHucfjqdH4jge3xwIiGCD5mdWbDyz_NYzBRbB-ut-ht7wwDmWn-NDY8Ojv3yA_VPXiGxvZg-4XyRRvvuVBlzQVoauv3CcrnxuCpTqU2sc_GTSM6l6aeMhUMMpQI1zSpL6smOCjRaFYxtPLmrQ3WMfyUgAu23V_Fm4DDYwReiuxLLYBeRkxyU_wV3ff1CTEz0KF2DIW7taRxmqXRRoFY84g2vcMHlclRtsxxz9f1ABeqJb8miIwrnQcq5rVFEcA769NL60LNgTsb8KUWNmZC4hTJdHRoyVLIeD2_GGiwp4YSTi6YEJhTUegORzzc8ckinJsleAZQXXg Which decodes to the following values: JSON { "alg":"PS256" } { "iss":"workload-identity-issuer", "sub":"api-service", "aud":"https://login.example.com", "exp":1746050400 } The service can then use the JWT to identify itself in workflows that support this credential format. For example, when running an OAuth flow to get tokens from an authorization server. Workload Identities Use Cases Service meshes use workload credentials to encrypt internal messages and outsource the complexity of key management from workloads.Administrators can use service meshes to configure which workloads can call each other, using workload identities.Developers can use workload credentials explicitly in their code to send or receive a workload credential.Workload credentials enable organizations to replace weak credentials like API keys or passwords. They are also short-lived and automatically renewed by the credential issuer. In some cases, workload credentials are automatically handled by service mesh components like sidecars. In other cases, a workload may need to explicitly send a workload credential. Below you can see an example of two workloads using workload credentials in a client assertion flow: The platform issues a JWT credential. This is a service account token with a configurable lifetime. The platform will automatically renew it before it expires.When workload A needs to prove its identity to workload B, it presents the JWT credential as proof of identity. The platform also provides a mechanism for recipients like workload B to verify the JWT. This ensures the credential hasn’t been tampered with and that it represents the expected identity. Workload B can use a library to fetch the public key from a JSON Web Key Set (JWKS) and validate the credential with minimal code. After verification, Workload B applies a policy to determine what actions Workload A is authorized to perform. Securing Workloads The first step is to look at the big picture. In most environments, there are many different workloads, with the most critical ones exposing business data through APIs. Behind these APIs lies an ecosystem of supporting components, including API gateways, observability tools, and continuous delivery systems. Each of these connections must be secured, and traditional approaches such as static API keys introduce vulnerabilities that attackers can exploit. Non-human identities exist to make these system-to-system connections stronger and easier to trust. They use verifiable credentials instead of static secrets, giving each workload a reliable way to prove who it is. These identities aren’t limited to internal systems either. A mobile app, for instance, can use attestation to confirm its integrity, and two partner APIs can authenticate each other using signed credentials. 
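To make the verification step described above concrete, here is a minimal sketch of how workload B might validate workload A's JWT credential. The choice of the jose library, the JWKS URL, and the expected issuer and audience values (taken from the decoded token above) are assumptions for illustration; any JWT library with JWKS support would look similar.

TypeScript
import { createRemoteJWKSet, jwtVerify } from "jose";

// Assumption: the credential issuer publishes its public keys at this JWKS endpoint.
const jwks = createRemoteJWKSet(new URL("https://issuer.example.com/.well-known/jwks.json"));

// Workload B verifies the presented credential before applying its own authorization policy.
export async function verifyWorkloadCredential(token: string): Promise<string> {
  const { payload } = await jwtVerify(token, jwks, {
    issuer: "workload-identity-issuer",    // must match the iss claim shown earlier
    audience: "https://login.example.com", // must match the aud claim shown earlier
  });
  if (!payload.sub) {
    throw new Error("Credential is missing a workload identifier (sub)");
  }
  // The caller's identity, e.g. "api-service"; policy checks happen after this point.
  return payload.sub;
}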
Bridging the Gap: Authorization Using OAuth Workload identities provide a strong foundation for security, but they do not address several key issues, such as user authentication and consent, or best practices for API clients, including those that run in browsers. A strong security solution needs to apply security at multiple endpoints, not just in backend components. In particular, workload identities are not sufficient to secure APIs correctly. APIs must accurately evaluate user identities and apply fine-grained authorization decisions. Without this additional layer, APIs risk granting over-privileged access to business data. To avoid vulnerabilities, APIs must employ a security design that extends beyond workload-level trust and also incorporates business authorization. OAuth 2.0 provides a widely adopted framework for authorization and business-level protection. Mature APIs rely on token-based security, typically using access tokens that carry claims about the requesting user or system. These claims represent pieces of contextual information, such as user roles, permissions, or attributes that the API can interpret to make authorization decisions. By validating these claims, the API determines whether a request should be granted and what data can be returned. In this way, tokens provide the critical context needed to ensure that requests are properly verified and access is correctly controlled. OAuth Clients and Workload Credentials Workload security and API authorization can reinforce each other. On one level, APIs rely on access tokens. But by introducing workload credentials and identities, the security of those tokens can be significantly strengthened. Traditionally, a client connects to the authorization server and authenticates (for example, by presenting a password or secret) to obtain access tokens. With workload credentials, this process is hardened. Instead of static secrets, the workload uses a dynamic certificate or JWT to authenticate itself directly with the authorization server. This approach allows the authorization server to issue sender-constrained tokens, which are access tokens that are cryptographically bound to a specific workload. As a result, even if a token is intercepted, it cannot be reused by an unauthorized party. The Workload Identity in Multi-System Environments (WIMSE) working group is currently designing draft specifications that combine workload identities with token-based access control. When workload credential security is combined with business-level authorization, the outcome is a more resilient, zero-trust solution that enforces both workload and user-level protection. When you combine workload identities with OAuth, you get a complete API security solution. In particular, you are no longer restricted to allow/deny rules per component. Instead, you can implement partial access according to your particular business rules. Combine Tools and Techniques Start with OAuth 2.0 to protect API data and usersUse workload identities to harden backend connectionsUse the strongest security available in client environments Stay up-to-date with best practices Conclusion Combining workload identities with OAuth provides a stronger security foundation, covering clients, APIs, gateways, and backend systems. Together, they create the basis for a practical zero-trust approach. The key is to treat authorization as a first-class concern so that every request is evaluated with the right context. 
Just as important is leveraging all available security tools together, rather than in isolation, to build a defense that is both consistent and adaptable.
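As a concrete illustration of the client-assertion pattern described in the OAuth Clients and Workload Credentials section above, here is a minimal sketch of a workload exchanging its platform-issued JWT for an access token at an authorization server's token endpoint. The endpoint URL, client ID, and scope are hypothetical, and the exact parameters depend on the authorization server in use.

Python
# Minimal sketch: presenting a workload JWT as a client assertion (RFC 7523 style)
# to obtain an access token. The token endpoint, client_id, and scope are hypothetical.
import requests

TOKEN_ENDPOINT = "https://login.example.com/oauth/token"  # assumed endpoint

def get_access_token(workload_jwt: str) -> str:
    response = requests.post(
        TOKEN_ENDPOINT,
        data={
            "grant_type": "client_credentials",
            "client_id": "api-service",  # assumed to match the workload identity
            "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
            "client_assertion": workload_jwt,  # platform-issued, short-lived credential
            "scope": "orders:read",  # hypothetical scope
        },
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["access_token"]

Because the assertion is short-lived and automatically renewed by the platform, no static client secret needs to ship with the workload, and the authorization server can bind the tokens it issues to that specific workload.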
Write-heavy database workloads bring a distinctly different set of challenges than read-heavy ones. For example:

Scaling writes can be costly, especially if you pay per operation, and writes are 5X more costly than reads
Locking can add delays and reduce throughput
I/O bottlenecks can lead to write amplification and complicate crash recovery
Database backpressure can throttle the incoming load

While cost matters quite a lot, it’s not a topic we want to cover here. Rather, let’s focus on the performance-related complexities that teams commonly face and discuss your options for tackling them.

What Do We Mean by “a Real-Time Write-Heavy Workload”?

First, let’s clarify what we mean by a “real-time write-heavy” workload. We’re talking about workloads that:

Ingest a large amount of data (e.g., over 50K OPS)
Involve more writes than reads
Are bound by strict latency SLAs (e.g., single-digit millisecond P99 latency)

In the wild, they occur across everything from online gaming to real-time stock exchanges. A few specific examples:

Internet of Things (IoT) workloads tend to involve small but frequent append-only writes of time series data. Here, the ingestion rate is primarily determined by the number of endpoints collecting data. Think of smart home sensors or industrial monitoring equipment constantly sending data streams to be processed and stored.
Logging and monitoring systems also deal with frequent data ingestion, but they don't have a fixed ingestion rate. They may not necessarily be append-only, and they can be prone to hotspots, such as when one endpoint misbehaves.
Online gaming platforms need to process real-time user interactions, including game state changes, player actions, and messaging. The workload tends to be spiky, with sudden surges in activity. They’re extremely latency sensitive since even small delays can impact the gaming experience.
E-commerce and retail workloads are typically update-heavy and often involve batch processing. These systems must maintain accurate inventory levels, process customer reviews, track order status, and manage shopping cart operations, which usually require reading existing data before making updates.
Ad tech and real-time bidding systems require split-second decisions. These systems handle complex bid processing, including impression tracking and auction results, while simultaneously monitoring user interactions such as clicks and conversions. They must also detect fraud in real time and manage sophisticated audience segmentation for targeted advertising.
Real-time stock exchange systems must support high-frequency trading operations, constant stock price updates, and complex order-matching processes — all while maintaining absolute data consistency and minimal latency.

Next, let’s look at key architectural and configuration considerations that impact write performance.

Storage Engine Architecture

The choice of storage engine architecture fundamentally impacts write performance in databases. Two primary approaches exist: LSM trees and B-trees. Databases known to handle writes efficiently, such as ScyllaDB, Apache Cassandra, HBase, and Google BigTable, use Log-Structured Merge Trees (LSM). This architecture is ideal for handling large volumes of writes. Since writes are immediately appended to memory, this allows for very fast initial storage. Once the “memtable” in memory fills up, the recent writes are flushed to disk in sorted order. That reduces the need for random I/O.
For example, here’s what the ScyllaDB write path looks like:

With B-tree structures, each write operation requires locating and modifying a node in the tree — and that involves both sequential and random I/O. As the dataset grows, the tree can require additional nodes and rebalancing, leading to more disk I/O, which can impact performance. B-trees are generally better suited for workloads involving joins and ad-hoc queries.

Payload Size

Payload size also impacts performance. With small payloads, throughput is good, but CPU processing is the primary bottleneck. As the payload size increases, you get lower overall throughput, and disk utilization also increases. Ultimately, a small write usually fits in all the buffers, and everything can be processed quite quickly. That’s why it’s easy to get high throughput. For larger payloads, you need to allocate larger buffers or multiple buffers. The larger the payloads, the more resources (network and disk) are required to service those payloads.

Compression

Disk utilization is something to watch closely with a write-heavy workload. Although storage is continuously becoming cheaper, it’s still not free. Compression can help keep things in check — so choose your compression strategy wisely. Faster compression speeds are important for write-heavy workloads, but also consider your available CPU and memory resources. Be sure to look at the compression chunk size parameter. Compression basically splits your data into smaller blocks (or chunks) and then compresses each block separately. When tuning this setting, realize that larger chunks are better for reads while smaller ones are better for writes, and take your payload size into consideration.

Compaction

For LSM-based databases, the compaction strategy you select also influences write performance. Compaction involves merging multiple SSTables into fewer, more organized files to optimize read performance, reclaim disk space, reduce data fragmentation, and maintain overall system efficiency. When selecting compaction strategies, you could aim for low read amplification, which makes reads as efficient as possible. Or, you could aim for low write amplification by preventing compaction from being too aggressive. Or, you could prioritize low space amplification and have compaction purge data as efficiently as possible. For example, ScyllaDB offers several compaction strategies (and Cassandra offers similar ones):

Size-tiered compaction strategy (STCS): Triggered when the system has enough (four by default) similarly sized SSTables.
Leveled compaction strategy (LCS): The system uses small, fixed-size (by default 160 MB) SSTables distributed across different levels.
Incremental compaction strategy (ICS): Shares the same read and write amplification factors as STCS, but it fixes STCS's 2x temporary space amplification issue by breaking huge SSTables into SSTable runs, which are composed of a sorted set of smaller (1 GB by default), non-overlapping SSTables.
Time-window compaction strategy (TWCS): Designed for time series data.

For write-heavy workloads, we warn users to avoid leveled compaction at all costs. That strategy is designed for read-heavy use cases. Using it can result in a regrettable 40x write amplification.

Batching

In databases like ScyllaDB and Cassandra, batching can actually be a bit of a trap – especially for write-heavy workloads. If you're used to relational databases, batching might seem like a good option for handling a high volume of writes.
But it can actually slow things down if it’s not done carefully. Mainly, that’s because large or unstructured batches end up creating a lot of coordination and network overhead between nodes — which is really not what you want in a distributed database like ScyllaDB. Here’s how to think about batching when you’re dealing with heavy writes (a minimal code sketch follows at the end of this article):

Batch by the partition key: Group your writes by the partition key so the batch goes to a coordinator node that also owns the data. That way, the coordinator doesn’t have to reach out to other nodes for extra data. Instead, it just handles its own, which cuts down on unnecessary network traffic.
Keep batches small and targeted: Breaking up large batches into smaller ones by partition keeps things efficient. It avoids overloading the network and lets each node work on only the data it owns. You still get the benefits of batching, but without the overhead that can bog things down.
Stick to unlogged batches: Assuming you follow the earlier points, it’s best to use unlogged batches. Logged batches add extra consistency checks, which can really slow down the write.

So, if you’re in a write-heavy situation, structure your batches carefully to avoid the delays that big, cross-node batches can introduce.

Wrapping Up

We offered quite a few warnings, but don’t worry. It was easy to compile a list of lessons learned because so many teams are extremely successful working with real-time write-heavy workloads. Now you know many of their secrets, without having to experience their mistakes. :-) If you want to learn more, here are some firsthand perspectives from teams who tackled quite interesting write-heavy challenges:

Zillow: Consuming records from multiple data producers, which resulted in out-of-order writes that could result in incorrect updates
Tractian: Preparing for 10X growth in high-frequency data writes from IoT devices
Fanatics: Heavy write operations like handling orders, shopping carts, and product updates for this online sports retailer

Also, take a look at the following video, where we go even deeper into these write-heavy challenges and walk you through what these workloads look like on ScyllaDB.
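To make the batching guidance above concrete, here is the minimal sketch referenced earlier, using the Python driver (cassandra-driver, which also works with ScyllaDB). The keyspace, table, and column names are hypothetical.

Python
# Minimal sketch: a single-partition unlogged batch with the Python driver
# (cassandra-driver, compatible with ScyllaDB). Keyspace/table names are hypothetical.
from cassandra.cluster import Cluster
from cassandra.query import BatchStatement, BatchType

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("iot")  # hypothetical keyspace

insert = session.prepare(
    "INSERT INTO sensor_readings (sensor_id, ts, value) VALUES (?, ?, ?)"
)

def write_readings(sensor_id, readings):
    # Every row shares the same partition key (sensor_id), so the coordinator
    # that owns the partition handles the whole batch without cross-node hops.
    batch = BatchStatement(batch_type=BatchType.UNLOGGED)
    for ts, value in readings:
        batch.add(insert, (sensor_id, ts, value))
    session.execute(batch)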
A 60-hour training job had become the new normal. GPUs were saturated, data pipelines looked healthy, and infra monitoring didn’t flag any issues. But something was off. The model wasn't large, nor was the data complex enough to justify that duration. What we eventually discovered wasn't in the Python code or the model definition. It was buried deep in the compiler stack.

Identifying the Invisible Bottleneck

Figure: Model pipeline showing the expected flow toward fused kernel optimization and the alternate fallback path leading to GPU under-utilization. The figure highlights critical decision points in the compiler stack affecting performance.

We were deploying a quantized neural network using TensorFlow with TensorRT acceleration, routed through a custom TVM stack for inference optimization. On paper, this stack was airtight: optimized kernels, precompiled operators, and GPU targeting enabled. But profiling told another story. Despite all optimizations being turned on, GPU utilization plateaued. Memory spikes were inconsistent. Logs showed fallback ops triggering sporadically. We drilled into the Relay IR emitted by TVM and discovered a subtle but costly regression: certain activation patterns (like leaky_relu fused after layer_norm) weren’t being lowered correctly for quantized inference. Instead of a fused kernel, we were getting segmented ops that killed parallelism and introduced memory stalls.

Plain Text
# Expected fused pattern (not observed due to mispass)
%0 = nn.batch_norm(%input, %gamma, %beta, epsilon=0.001)
%1 = nn.leaky_relu(%0)
# Compiled form should have been a single fused op

# Actual IR observed during fallback
%0 = nn.batch_norm(%input, %gamma, %beta, epsilon=0.001)
%1 = nn.op.identity(%0)
%2 = nn.leaky_relu(%1)
# Segmentation introduced identity pass, breaking the fuse pattern

The root cause? A compiler pass treating a narrow edge case as a generic transform.

The Dirty Work of Debugging Acceleration

In debugging TVM's Relay IR stack, one overlooked ally was the relay.analysis module. We scripted out a pattern matcher to scan through blocks and detect unintended op separation, especially when quantization annotations were injected. The IR was instrumented with logs to trace op-to-op transitions.

Python
from tvm.relay.dataflow_pattern import (
    wildcard, is_op, is_tuple_get_item, rewrite, DFPatternCallback
)

class FuseChecker(DFPatternCallback):
    def __init__(self):
        super().__init__()
        # nn.batch_norm takes data, gamma, beta, moving_mean, and moving_var,
        # and returns a tuple whose first element feeds nn.leaky_relu.
        bn = is_op("nn.batch_norm")(
            wildcard(), wildcard(), wildcard(), wildcard(), wildcard()
        )
        self.pattern = is_op("nn.leaky_relu")(is_tuple_get_item(bn, 0))

    def callback(self, pre, post, node_map):
        print("Unfused pattern detected in block:", pre)
        return post  # report only; leave the expression unchanged

rewrite(FuseChecker(), mod["main"])

This gave us visibility into the transformation path and showed that, despite high-level optimizations, certain common patterns weren’t being caught. Worse, the IR graph diverged depending on how the quantization pre-pass handled calibration annotations. Fixing this wasn’t elegant. First, we had to isolate the affected patterns with debug passes in Relay. We then created a custom lowering path that preserved fused execution for our target GPU architecture. Meanwhile, TensorRT logs revealed that calibration ranges had silently defaulted to asymmetric scaling for certain ops, leading to poor quantization fidelity; something no benchmarking script had caught.
Plain Text
[TensorRT] WARNING: Detected uncalibrated layer: layer_norm_23
[TensorRT] INFO: Tactic #12 for conv1_3x3 failed due to insufficient workspace
[TensorRT] INFO: Fallback triggered for leaky_relu_5
[TensorRT] WARNING: Calibration range fallback defaulted to (min=-6.0, max=6.0)

We re-quantized using percentile calibration and disabled selective TensorRT fusions that were behaving unpredictably. The changes weren’t just in code; they were in judgment calls around what to optimize and when. Performance engineering is part systems diagnosis, part gut instinct.

Infrastructure Rewrites We Didn’t Want To Do

To maintain reproducibility, we added hashing logic to model IR checkpoints. This allowed us to fingerprint model graphs before and after every compiler optimization pass. Any IR delta triggered a pipeline rerun and a deployment alert. We also introduced an internal version control mechanism for compiled artifacts, stored in S3 buckets with hash-tagged lineage references. This way, deployment failures could be traced back to a specific commit in the compiler configuration and not just the source model. None of these fixes was isolated. Once our quantization flow changed, our SageMaker Edge deployment containers broke due to package mismatches and model signature incompatibility. We had to revalidate across device classes, update Docker images, and reprofile edge deployment times.

Python
# TVM Lowering Configuration
import tvm
from tvm import relay

target = tvm.target.Target("cuda")
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

What made this harder was the legacy cost-tracking infrastructure built during my time at Amazon Go. Any model versioning tweak disrupted our resource billing granularity. So we also had to rewire metering hooks, regenerate EC2 cost estimates, and rewrite tagging policies. Tooling complexity cascades. A single op tweak at the compiler level turned into a week-long, infra-wide dependency resolution effort.

Performance Tradeoff

Yes, we got a 5x speedup. But it came with tradeoffs. Some quantized models lost accuracy in edge deployment. Others were too fragile to generalize across hardware classes. We had to A/B test between 8-bit and mixed-precision models. In one case, we rolled back to a non-quantized deployment just to preserve prediction confidence. Quantization also impacted explainability in downstream model audits. We noticed inconsistent behavior in post-deployment trace logs, particularly in user-facing applications where timing-sensitive predictions created drift across device tiers. Optimizing the calibration configuration for precision often meant sacrificing consistency — a trade-off that's hard to communicate outside the infra team. The hardest part? Convincing teams that ‘faster’ didn’t always mean ‘better.’

Closing Reflection

Most performance wins don’t come from new tools. They come from understanding how existing ones fail. TVM, TensorRT, SageMaker... they all offer acceleration, but none of them account for context. We learned to build visibility into our compilers, not just our models. We now inspect every IR block before deployment. We trace every fallback path. We don’t benchmark just for speed; we benchmark for behavior. We also built internal dashboards to track compiler-side regressions over time. Having that historical visibility has helped us preemptively catch fallback patterns that would have otherwise crept into production. That 16-hour drop wasn’t just a speed win. It was a visibility win.
And in ML infrastructure, that’s the metric that really matters.
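The IR-checkpoint fingerprinting described in the Infrastructure Rewrites section can be approximated in a few lines. Here is a minimal sketch, assuming mod is the Relay module being deployed and that the stand-in passes shown here represent the real optimization pipeline; the rerun/alert hook is only indicated in a comment.

Python
# Minimal sketch: fingerprinting Relay IR before and after optimization so that
# unexpected deltas can be flagged. `mod` and the pass list are assumptions.
import hashlib
import tvm
from tvm import relay

def fingerprint(module: tvm.IRModule) -> str:
    # Hash the textual form of the module; any structural change alters the digest.
    return hashlib.sha256(module.astext().encode("utf-8")).hexdigest()

before = fingerprint(mod)

# Stand-in optimization sequence for illustration.
seq = tvm.transform.Sequential([
    relay.transform.SimplifyInference(),
    relay.transform.FoldConstant(),
])
with tvm.transform.PassContext(opt_level=3):
    optimized = seq(mod)

after = fingerprint(optimized)
if before != after:
    # In a pipeline like the one described above, this delta would be recorded
    # against the artifact's lineage and could trigger a rerun or alert.
    print(f"IR changed: {before[:12]} -> {after[:12]}")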
This iteration builds on existing experience scaling big data workloads with Apache Spark and adds further refinements. It preserves the eight most important strategies while moving high-value but less critical ones — such as preferring narrow transformations, applying code-level best practices, leveraging Databricks Runtime features, and optimizing cluster configuration — to a Miscellaneous section, keeping the focus on the most impactful areas, such as shuffles and memory, while still addressing the rest thoroughly. Diagrams provide phase-by-phase insights, the example code can be executed end to end in Databricks or vanilla Spark sessions, and applying these strategies typically yields substantial performance benefits, often in the range of 5–20x in real-world pipelines.

Optimization Strategies

1. Partitioning and Parallelism

Strategy: Use repartition() to enhance parallelism before shuffle-intensive operations like joins, and coalesce() to minimize partitions pre-write to prevent small-file issues that hammer storage metadata.

Python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("PartitionExample").getOrCreate()

# Sample DataFrame creation
data = [(i, f"val_{i}") for i in range(1000000)]
df = spark.createDataFrame(data, ["id", "value"])

# Repartition for parallelism before a join or aggregation
df_repartitioned = df.repartition(200, "id")  # Shuffle to 200 even partitions

# Perform a sample operation (e.g., groupBy)
aggregated = df_repartitioned.groupBy("id").count()

# Coalesce before writing to reduce output files
aggregated_coalesced = aggregated.coalesce(10)
aggregated_coalesced.write.mode("overwrite").parquet("/tmp/output")

print(f"Partitions after repartition: {df_repartitioned.rdd.getNumPartitions()}")
print(f"Partitions after coalesce: {aggregated_coalesced.rdd.getNumPartitions()}")

Explanation: Partitioning is foundational for task parallelism and load balancing in Spark's distributed model. repartition(n) ensures even data spread via a full shuffle, ideal pre-joins to avoid executor overload. coalesce(m) (where m < current partitions) merges locally for efficient writes, cutting I/O costs in Databricks' Delta or S3. Risks: Over-repartitioning increases shuffle overhead; monitor via Spark UI's "Input Size" metrics. Benefits: Scalable for TB-scale data; universal across Spark envs.

Diagram:

2. Caching and Persistence

Strategy: Cache or persist reusable DataFrames to skip recomputation in iterative or multi-use scenarios.

Python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col
from pyspark.storagelevel import StorageLevel

spark = SparkSession.builder.appName("CachingExample").getOrCreate()

# Create a sample DataFrame
df = spark.range(1000000).withColumn("squared", col("id") ** 2)

# Cache for memory-only (default)
df.cache()
print("First computation (populates the cache):", df.count())

# Reuse: Faster second time
print("Second computation (from cache):", df.count())

# Switch storage levels: unpersist first, then persist with a custom level
# (e.g., memory and disk)
df.unpersist()
df.persist(StorageLevel.MEMORY_AND_DISK)
print("Persisted count:", df.count())

# Clean up
df.unpersist()

Explanation: Recomputation kills performance in loops or DAG branches. cache() uses MEMORY_ONLY; persist() allows levels like MEMORY_AND_DISK for spill resilience. In Databricks, this leverages fast NVMe; watch memory usage to avoid evictions. Benefits: Up to 10x speedup in ML training. Risks: Memory exhaustion – use the Spark UI to track it.

Diagram:
3. Predicate Pushdown

Strategy: Filter early to leverage storage-level pruning, especially with Parquet/Delta.

Python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("PushdownExample").getOrCreate()

# Read from Parquet (supports pushdown)
df = spark.read.parquet("/tmp/large_dataset.parquet")  # Assume pre-written large file

# Early filter: Pushed down to storage
filtered_df = df.filter(col("value") > 100).filter(col("category") == "A")

# Further ops: Less data shuffled
result = filtered_df.groupBy("category").sum("value")
result.show()

# Compare explain plans
df.explain()  # Without filter
filtered_df.explain()  # With pushdown visible

Explanation: Pushdown skips irrelevant data at the source, slashing reads. Delta Lake enhances this with statistics; the technique is universal but format-dependent (Parquet, yes; JSON, no). Benefits: Network savings. Risks: Over-filtering hides data issues.

Diagram:

4. Skew Handling

Strategy: Salt keys or custom-partition to even out distributions.

Python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, concat, lit, rand, floor

spark = SparkSession.builder.appName("SkewExample").getOrCreate()

# Skewed DataFrame: many duplicates on low keys
skewed_df = spark.createDataFrame([(i % 10, i) for i in range(1000000)], ["key", "value"])

# Salt keys: Append a random suffix (0-9)
salted_df = skewed_df.withColumn(
    "salted_key",
    concat(col("key").cast("string"), lit("_"), floor(rand() * 10).cast("string"))
)

# Group on the salted key, then aggregate
temp_agg = salted_df.groupBy("salted_key").sum("value")

# Remove the salt for the final result (keys here are single digits)
final_agg = temp_agg.withColumn("original_key", col("salted_key").substr(1, 1)) \
    .groupBy("original_key").sum("sum(value)")
final_agg.show()

Explanation: Skew starves executors; salting disperses hot keys temporarily. Custom partitioners (via RDDs) offer precision. Check task times in the Spark UI. Benefits: Balanced execution. Risks: Extra compute for salting.

Diagram:

5. Optimize Write Operations

Strategy: Bucket/partition wisely, coalesce files, and use Delta's OPTIMIZE/Z-Order.

Python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("WriteOptExample").getOrCreate()

# Sample DataFrame
df = spark.range(1000000).withColumn("category", (col("id") % 10).cast("string"))

# Partition by column for query efficiency
df.write.mode("overwrite").partitionBy("category").parquet("/tmp/partitioned")

# For Delta: Write, then optimize
df.write.format("delta").mode("overwrite").save("/tmp/delta_table")
spark.sql("OPTIMIZE delta.`/tmp/delta_table` ZORDER BY (id)")

# Coalesce before write
df.coalesce(5).write.mode("overwrite").parquet("/tmp/coalesced")

Explanation: Writes can create file explosions; coalescing consolidates them. Delta's Z-Order clusters data for scans. Benefits: Faster reads; Databricks-specific but portable via Hive.

Diagram:

6. Leverage Adaptive Query Execution (AQE)

Strategy: Enable AQE for runtime tweaks like auto-skew handling.
Python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("AQEExample").getOrCreate()

# Enable AQE
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")

# Sample join that benefits from AQE (auto-broadcast if small)
large_df = spark.range(1000000)
small_df = spark.range(100)
result = large_df.join(small_df, large_df.id == small_df.id)
result.explain()  # Shows adaptive plans
result.show()

Explanation: AQE adjusts plans at runtime based on statistics (e.g., it reduces partition counts). Benefits: Hands-off optimization; universal in Spark 3+.

Diagram:

7. Job and Stage Optimization

Strategy: Tune via Spark UI insights, adjusting memory/parallelism.

Python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("TuneExample") \
    .config("spark.executor.memory", "4g") \
    .config("spark.sql.shuffle.partitions", "100") \
    .getOrCreate()

# Sample job
df = spark.range(10000000).groupBy("id").count()
df.write.mode("overwrite").parquet("/tmp/tuned")

# After the run, check the UI for GC/stage metrics; adjust configs iteratively

Explanation: The UI flags GC pressure (more than ~10% of task time is a bad sign); tune spark.sql.shuffle.partitions to match your core count. Benefits: Resource efficiency; universal.

Diagram:

8. Optimize Joins With Broadcast Hash Join (BHJ)

Strategy: Broadcast small sides to eliminate shuffles.

Python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("BHJExample").getOrCreate()

# Large and small DataFrames
large_df = spark.range(1000000).toDF("key")
small_df = spark.range(100).toDF("key")

# Broadcast the small side for BHJ
result = large_df.join(broadcast(small_df), "key")
result.explain()  # Shows BroadcastHashJoin
result.show()

Explanation: BHJ copies the small DataFrame to all nodes; tune spark.sql.autoBroadcastJoinThreshold. Benefits: Shuffle-free joins. Risks: Memory needed for the broadcast.

Diagram:

Miscellaneous Strategies

These additional techniques complement the core set, offering targeted enhancements for specific scenarios. While not always foundational, they can provide significant boosts in code efficiency, platform-specific acceleration, and infrastructure tuning.

Prefer Narrow Transformations

Strategy: Favor narrow transformations like filter() and select() over wide ones like groupBy() or join().

Python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("NarrowExample").getOrCreate()

# Sample large DataFrame
df = spark.range(1000000).withColumn("value", col("id") * 2)

# Narrow: Filter and select first (no shuffle)
narrow_df = df.filter(col("value") > 500000).select("id")

# Then wide: GroupBy (shuffle only on reduced data)
result = narrow_df.groupBy("id").count()
result.show()

Explanation: Narrow ops process data per-partition, avoiding shuffles; chain them early to prune data. Benefits: Lower overhead. Risks: Over-chaining increases code complexity.

Diagram:

Code-Level Best Practices

Strategy: Use select() to specify columns explicitly, avoiding *.
Python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("CodeBestExample").getOrCreate()

# Sample wide table
df = spark.createDataFrame(
    [(1, "A", 100, "extra1"), (2, "B", 200, "extra2")],
    ["id", "category", "value", "unused"]
)

# Bad: Select all (*)
all_df = df.select("*")  # Loads unnecessary columns

# Good: Select specific columns
slim_df = df.select("id", "category", "value")

# Process: Less memory used
result = slim_df.filter(col("value") > 150)
result.show()

Explanation: * loads extra columns, increasing memory use; select() trims them. Benefits: Leaner pipelines. Risks: Missing columns in evolving schemas.

Diagram:

Utilize Databricks Runtime Features

Strategy: Harness Delta Cache and Photon for I/O and compute acceleration.

Python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("RuntimeFeaturesExample").getOrCreate()

# Assume Databricks Runtime with Photon enabled
spark.conf.set("spark.databricks.delta.cache.enabled", "true")  # Delta Cache

# Read Delta (caches automatically)
df = spark.read.format("delta").load("/tmp/delta_table")

# Query: Benefits from cache/Photon vectorization
result = df.filter(col("value") > 100).groupBy("category").sum("value")
result.show()

Explanation: Delta Cache preloads data locally; Photon vectorizes execution. Benefits: Latency drops; Databricks-only, emulate with manual caching elsewhere.

Diagram:

Optimize Cluster Configuration for Big Data

Strategy: Select appropriate instance types and enable autoscaling (on Databricks, AWS EMR, etc.).

Python
# This is configured via the Databricks UI/CLI, not code, but as an example job config:
# In a Databricks notebook or job setup:
# Cluster: Autoscaling enabled, min 2-max 10 workers
# Instance: i3.xlarge (storage-optimized) or r5.2xlarge (memory-optimized)
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ClusterOptExample").getOrCreate()

# Run a heavy job: Autoscaling handles the load
df = spark.range(100000000).groupBy("id").count()  # Scales up automatically
df.show()

Explanation: Match instances to the workload (e.g., memory-optimized for joins); autoscaling adapts to load. Benefits: Cost savings; Databricks-specific, but the same approach applies to AWS EMR and similar platforms with managed scaling and instance configuration JSON during cluster bootstrap.

Diagram:

Applicability to Databricks and Other Spark Environments

Universal: These methods apply to EMR, Synapse, and other Spark platforms: partitioning, caching, predicate pushdown, skew handling techniques, narrow transformations, coding practices, AQE, job optimization, and BHJ.
Databricks-specific: Write operations with Delta, Runtime features, and cluster configuration are native to Databricks (but can be approximated with alternatives like Iceberg or some manual tuning).

Conclusion

In this article, I tried to demonstrate eight core strategies for addressing shuffle, memory, and I/O bottlenecks and improving efficiency. The Miscellaneous section describes subtler refinements, platform-specific improvements, and infrastructure tuning. These strategies flex across varied workloads, from ad hoc queries to production ETL pipelines. Collectively, these 12 strategies (core and miscellaneous) promote a way of thinking holistically about optimization. Start by profiling in the Spark UI, implement incremental improvements using the snippets provided here, and benchmark exhaustively to demonstrate the improvements (using metrics for each).
By applying these techniques in Databricks, you will not only reduce costs and latency but also build scalable, resilient big data engineering solutions. As Spark development continues to evolve through 2025 and beyond, revisit this reference and explore newer tools, such as MLflow for experiment tracking, to turn bottlenecks into breakthroughs.
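To support the conclusion's advice to benchmark each change, here is a minimal sketch of a timing harness, assuming that wall-clock timing of a full action is an acceptable first-pass proxy; the Spark UI remains the authoritative source for shuffle, GC, and stage-level metrics, and the workload and tuned setting shown are purely illustrative.

Python
# Minimal sketch: timing a Spark action before and after a tuning change.
# The workload and the tuned setting are illustrative, not prescriptive.
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("BenchmarkSketch").getOrCreate()

def time_action(label, action):
    # Run the action eagerly and report elapsed wall-clock time
    start = time.perf_counter()
    action()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")
    return elapsed

df = spark.range(50000000).withColumnRenamed("id", "key")

# Baseline with default shuffle partitions
baseline = time_action(
    "baseline",
    lambda: df.groupBy("key").count().write.mode("overwrite").parquet("/tmp/bench_baseline"),
)

# Example tweak: fewer shuffle partitions for this aggregation
spark.conf.set("spark.sql.shuffle.partitions", "100")
tuned = time_action(
    "tuned",
    lambda: df.groupBy("key").count().write.mode("overwrite").parquet("/tmp/bench_tuned"),
)

print(f"Speedup: {baseline / tuned:.2f}x")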