GraphQL vs REST API: Which Is Better for Your Project in 2025?


By Yilia Lin
Key Takeaways

• REST APIs excel in simplicity, caching, and microservices architecture, with widespread adoption and a mature tooling ecosystem.
• GraphQL provides precise data fetching, reduces over-fetching, and offers superior flexibility for complex data relationships.
• Performance varies by use case: REST wins for simple CRUD operations and caching scenarios, while GraphQL shines in mobile apps and complex queries.
• API gateway integration is crucial for managing both approaches effectively, providing unified security, monitoring, and transformation capabilities.
• No universal winner: the choice depends on project requirements, team expertise, and specific technical constraints rather than inherent superiority.

Understanding REST APIs and GraphQL: The Foundation of Modern API Architecture

When evaluating modern API architectures, developers frequently encounter the question: "What is a RESTful API, and how does it compare to GraphQL?" According to recent industry data, over 61% of organizations are now using GraphQL, while REST continues to dominate enterprise environments. Understanding both approaches is essential for making informed architectural decisions.

What Is a RESTful API?

A RESTful API (Representational State Transfer) is an architectural style that leverages HTTP protocols to create scalable web services. REST and RESTful services follow six key principles: statelessness, client-server architecture, cacheability, layered system, uniform interface, and code on demand (optional). Unlike the older SOAP vs. REST debate, which centered on protocol complexity, RESTful APIs embrace simplicity and web-native patterns.

The fundamental concept behind RESTful API architecture involves treating every piece of data as a resource, accessible through standard HTTP methods (GET, POST, PUT, DELETE). This approach has made RESTful implementations the backbone of countless web applications, from simple CRUD operations to complex enterprise systems.

What Is GraphQL?

GraphQL represents a paradigm shift from traditional REST approaches. Developed by Facebook in 2012 and open-sourced in 2015, GraphQL is a query language and runtime for APIs that enables clients to request exactly the data they need. Unlike REST's resource-based approach, GraphQL operates through a single endpoint that can handle complex data fetching scenarios.

The core innovation of GraphQL lies in its declarative data fetching model. When you need a GraphQL query to get the number of customers along with their recent orders and contact information, a single request can retrieve all related data. This contrasts sharply with REST, where multiple API calls would be necessary.

GraphQL mutation capabilities further extend its functionality, allowing clients to modify data using the same expressive query language. This unified approach to both reading and writing data represents a significant departure from REST's verb-based HTTP methods.

Historical Context

The evolution from SOAP to REST, and now to GraphQL, reflects changing application needs. REST APIs have revolutionized how computer systems communicate over the internet, providing a secure, scalable interface that follows specific architectural rules. However, as applications became more sophisticated and mobile-first, the limitations of REST's fixed data structures became apparent. GraphQL emerged as a response to these challenges, particularly the over-fetching and under-fetching problems inherent in REST architectures.
While REST remains excellent for many use cases, GraphQL's client-driven approach addresses specific pain points in modern application development.

Key Differences: When to Choose GraphQL vs. REST API

The choice between GraphQL and REST involves understanding fundamental differences in how each approach handles data fetching, performance optimization, and development workflows.

Data Fetching Approaches

REST uses multiple endpoints for each resource, requiring separate HTTP calls for different data types. A typical REST implementation might require:

HTTP
GET /api/users/123
GET /api/users/123/orders

This multi-request pattern often leads to over-fetching (receiving unnecessary data) or under-fetching (requiring additional requests). In contrast, GraphQL allows clients to specify exactly what data they need in a single request:

GraphQL
query {
  user(id: 123) {
    name
    email
    orders {
      id
      total
      items {
        name
        price
      }
    }
  }
}

Performance Considerations

Performance characteristics vary significantly between approaches. RESTful APIs excel in scenarios where caching is crucial, as HTTP caching mechanisms are well-established and widely supported. The stateless nature of REST makes it highly scalable for simple operations.

GraphQL shines in bandwidth-constrained environments, particularly mobile applications. By fetching only required data, GraphQL can reduce payload sizes by 30-50% compared to equivalent REST implementations. However, this efficiency comes with increased server-side complexity, as resolvers must efficiently handle arbitrary query combinations.

Development Experience

REST's simplicity makes it accessible to developers at all skill levels. The HTTP-based approach aligns naturally with web development patterns, and debugging tools are mature and widely available. RESTful API documentation follows established conventions, making integration straightforward.

GraphQL offers powerful introspection capabilities and schema-first development, but requires a steeper learning curve. The strongly-typed schema provides excellent developer experience through auto-completion and compile-time validation, but teams must invest time in understanding GraphQL-specific concepts like resolvers, fragments, and query optimization.

Scalability Factors

REST is well-suited for microservices architectures, where each service exposes functionality through well-defined APIs. The stateless nature of RESTful services makes horizontal scaling straightforward, and load balancing strategies are well-established.

GraphQL presents unique scalability challenges in distributed systems. Query complexity can vary dramatically, making resource planning difficult. Advanced GraphQL implementations require sophisticated caching strategies and query analysis to prevent performance degradation.

Technical Implementation: REST vs. GraphQL in Practice

Understanding the practical implementation details of both approaches helps developers make informed decisions about which technology best fits their specific requirements.

REST API Implementation Patterns

RESTful API implementation follows well-established patterns centered around HTTP methods and resource-based URLs. A typical REST API for user management might include:

HTTP
GET    /api/users       # List all users
POST   /api/users       # Create new user
GET    /api/users/123   # Get specific user
PUT    /api/users/123   # Update user
DELETE /api/users/123   # Delete user

This approach leverages HTTP's built-in semantics, making RESTful APIs intuitive for developers familiar with web protocols.
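To make the pattern above concrete, here is a minimal sketch of those user-management endpoints using Flask; the in-memory USERS store and the handler names are illustrative assumptions, not part of the original article.

Python
# Minimal sketch of the user-management endpoints above, using Flask.
# The in-memory USERS dict and handler names are illustrative assumptions.
from flask import Flask, jsonify, request, abort

app = Flask(__name__)
USERS = {}   # user_id -> user dict
NEXT_ID = 1

@app.route("/api/users", methods=["GET"])
def list_users():
    return jsonify(list(USERS.values()))

@app.route("/api/users", methods=["POST"])
def create_user():
    global NEXT_ID
    user = {"id": NEXT_ID, **request.get_json()}
    USERS[NEXT_ID] = user
    NEXT_ID += 1
    return jsonify(user), 201

@app.route("/api/users/<int:user_id>", methods=["GET"])
def get_user(user_id):
    if user_id not in USERS:
        abort(404)
    return jsonify(USERS[user_id])

@app.route("/api/users/<int:user_id>", methods=["PUT"])
def update_user(user_id):
    if user_id not in USERS:
        abort(404)
    USERS[user_id].update(request.get_json())
    return jsonify(USERS[user_id])

@app.route("/api/users/<int:user_id>", methods=["DELETE"])
def delete_user(user_id):
    USERS.pop(user_id, None)
    return "", 204

Each route maps one HTTP verb to one resource operation, which is exactly why the approach feels familiar to anyone who has worked with web protocols.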
Status codes provide clear communication about operation results, and stateless communication ensures scalability.

Versioning in REST typically involves URL-based strategies (/v1/users, /v2/users) or header-based approaches. While this can lead to API proliferation, it provides clear backward compatibility guarantees.

GraphQL Implementation Essentials

GraphQL implementation begins with schema definition, establishing the contract between client and server:

GraphQL
type User {
  id: ID!
  name: String!
  email: String!
  orders: [Order!]!
}

type Order {
  id: ID!
  total: Float!
  createdAt: String!
}

type Query {
  users: [User!]!
  user(id: ID!): User
}

type Mutation {
  createUser(name: String!, email: String!): User!
}

GraphQL mutation operations provide a structured approach to data modification, maintaining the same expressive power as queries. Resolvers handle the actual data fetching logic, allowing for flexible backend integration.

Security Considerations

Both approaches require careful security implementation, but with different focus areas. RESTful APIs benefit from standard HTTP security practices: authentication headers, CORS policies, and input validation at the endpoint level.

GraphQL introduces unique security challenges, particularly around query complexity and depth limiting. Malicious clients could potentially craft expensive queries that strain server resources. Implementing query complexity analysis, depth limiting, and timeout mechanisms becomes crucial for GraphQL security.

Error Handling and Monitoring

REST relies on HTTP status codes for error communication, providing a standardized approach that integrates well with existing monitoring tools. Error responses follow predictable patterns, making debugging straightforward.

GraphQL uses a different error model, where the HTTP status is typically 200 even for errors, with the actual error information embedded in the response payload. This approach requires specialized monitoring tools and error handling strategies but provides more detailed error context.

API Gateway Management: Optimizing GraphQL and REST APIs

Modern API management requires sophisticated gateway solutions that can handle both REST and GraphQL effectively. API gateways serve as the critical infrastructure layer that enables organizations to manage, secure, and optimize their API ecosystems regardless of the underlying architecture.

Managing RESTful APIs With API Gateway

RESTful APIs integrate naturally with traditional API gateway patterns. Standard gateway features like route configuration, load balancing, and protocol translation work seamlessly with REST's resource-based approach. Caching strategies are particularly effective with RESTful services, as the predictable URL patterns and HTTP semantics enable sophisticated caching policies.

API gateways excel at transforming REST requests and responses, enabling legacy system integration and API evolution without breaking existing clients. Rate limiting and throttling policies can be applied at the resource level, providing granular control over API consumption.

GraphQL API Gateway Integration

GraphQL presents unique challenges and opportunities for API gateway integration. Modern gateways like API7 provide GraphQL-specific features, including schema stitching, query complexity analysis, and GraphQL-to-REST transformation capabilities. Query complexity analysis becomes crucial for protecting backend services from expensive operations.
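As a rough illustration of the depth-limiting idea, the sketch below walks a parsed query using the graphql-core package and rejects anything nested beyond a threshold; the threshold value is an assumption, and fragments are ignored for brevity.

Python
# Hedged sketch of GraphQL query depth limiting, assuming the graphql-core package.
# Fragments are ignored for brevity; a real gateway would also score field counts.
from graphql import parse, FieldNode, OperationDefinitionNode

def field_depth(selection_set):
    # Depth of the deepest field nesting; a flat query has depth 1.
    if selection_set is None:
        return 0
    depths = [
        field_depth(selection.selection_set)
        for selection in selection_set.selections
        if isinstance(selection, FieldNode)
    ]
    return 1 + max(depths, default=0)

def reject_if_too_deep(query: str, max_depth: int = 5) -> None:
    for definition in parse(query).definitions:
        if isinstance(definition, OperationDefinitionNode):
            if field_depth(definition.selection_set) > max_depth:
                raise ValueError(f"query depth exceeds the limit of {max_depth}")

# The nested user/orders/items query shown earlier has a field depth of 4,
# so it passes with max_depth=5; deeply recursive queries would be rejected.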
API gateways can implement sophisticated policies that evaluate query depth, field count, and estimated execution time before forwarding requests to GraphQL servers.

Schema federation support allows organizations to compose multiple GraphQL services into a unified API surface, with the gateway handling query planning and execution across distributed services.

Unified API Management Approach

Leading API gateway solutions support multi-protocol environments, enabling organizations to manage both RESTful APIs and GraphQL services through a single management plane. This unified approach provides consistent authentication, authorization, monitoring, and analytics across all API types.

Developer portal integration becomes particularly valuable in mixed environments, as it can generate documentation and provide testing interfaces for both REST endpoints and GraphQL schemas. This consistency improves developer experience and reduces onboarding complexity.

Performance Optimization Techniques

API gateways enable sophisticated performance optimization for both API types. Intelligent caching can be applied to GraphQL queries based on query fingerprinting and field-level cache policies. For RESTful APIs, traditional HTTP caching mechanisms provide excellent performance benefits.

Request and response transformation capabilities allow gateways to optimize data formats, compress payloads, and aggregate multiple backend calls into a single client response. Global load balancing and failover mechanisms ensure high availability for both GraphQL and REST services.
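As a hedged illustration of query fingerprinting (my own sketch, not any particular gateway's API), a cache key can be derived from a normalized query string plus its variables:

Python
# Illustrative sketch of query fingerprinting for GraphQL response caching.
# The normalization is deliberately simple (whitespace collapsing only).
import hashlib
import json

def graphql_cache_key(query, variables=None):
    normalized_query = " ".join(query.split())  # collapse whitespace
    normalized_vars = json.dumps(variables or {}, sort_keys=True)
    digest = hashlib.sha256(
        (normalized_query + "|" + normalized_vars).encode("utf-8")
    ).hexdigest()
    return f"gql:{digest}"

Identical queries with identical variables map to the same key, so a gateway can serve the cached response without forwarding the request to the GraphQL server.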
Making the Right Choice: Decision Framework and Future Trends

Selecting between GraphQL and REST requires a structured evaluation of technical requirements, team capabilities, and long-term strategic goals. Rather than viewing this as a binary choice, successful organizations often adopt hybrid approaches that leverage the strengths of both paradigms.

Decision Criteria Matrix

Project requirements should drive the technology choice. RESTful APIs excel in scenarios requiring:

• Simple CRUD operations with well-defined resources
• Heavy caching requirements
• Integration with existing HTTP-based infrastructure
• Team familiarity with web standards
• Microservices architectures with clear service boundaries

GraphQL provides advantages when projects involve:

• Complex data relationships and nested queries
• Mobile applications with bandwidth constraints
• Rapidly evolving client requirements
• Multiple client types with different data needs
• Real-time features requiring subscription support

Use Case Scenarios

Enterprise applications often benefit from REST's maturity and simplicity. E-commerce platforms, content management systems, and traditional web applications typically align well with RESTful service patterns. The predictable structure and extensive tooling ecosystem make REST an excellent choice for teams building standard business applications.

GraphQL shines in scenarios requiring flexible data access patterns. Social media platforms, analytics dashboards, and mobile applications often see significant benefits from GraphQL's precise data fetching capabilities. When you need to execute a GraphQL query to get the number of customers along with their transaction history and preferences, the single-request efficiency becomes invaluable.

Future Outlook and Trends

The API landscape continues evolving, with both REST and GraphQL finding distinct niches. REST maintains strong adoption in enterprise environments, while GraphQL usage grows in frontend-driven applications and mobile development.

Emerging trends include hybrid approaches where REST APIs serve as data sources for GraphQL gateways, providing the best of both worlds. API gateway evolution increasingly focuses on protocol translation and unified management capabilities.

Industry adoption data shows continued growth for both approaches, suggesting that the future involves coexistence rather than replacement. Organizations are increasingly adopting API-first strategies that can accommodate multiple paradigms based on specific use case requirements.

Conclusion and Recommendations

The GraphQL vs REST debate oversimplifies what should be a nuanced technical decision. Both approaches offer distinct advantages, and the optimal choice depends on specific project requirements, team expertise, and organizational constraints.

RESTful APIs remain the gold standard for simple, cacheable, and well-understood interaction patterns. Their alignment with HTTP semantics, mature tooling ecosystem, and widespread developer familiarity make them an excellent default choice for many applications.

GraphQL provides compelling advantages for applications requiring flexible data access, precise resource utilization, and rapid iteration. The investment in learning GraphQL concepts pays dividends in scenarios where its strengths align with project needs.

The most successful API strategies often involve thoughtful integration of both approaches, leveraged through sophisticated API gateway solutions that can manage, secure, and optimize diverse API ecosystems. As API management continues evolving, the ability to support multiple paradigms becomes increasingly valuable for maintaining architectural flexibility and meeting diverse client requirements.

Rather than asking "which is better," developers should ask "which approach best serves my specific requirements?" The answer will vary based on context, but understanding the strengths and limitations of both GraphQL and REST enables informed decisions that drive successful API implementations.
From Platform Cowboys to Governance Marshals: Taming the AI Wild West


By Hugo Guerrero
The rapid ascent of artificial intelligence has ushered in an unprecedented era, often likened to a modern-day gold rush. This "AI gold rush," while brimming with potential, also bears a striking resemblance to the chaotic and lawless frontier of the American Wild West. We are witnessing an explosion of AI initiatives — from unmonitored chatbots running rampant to independent teams deploying large language models (LLMs) without oversight — all contributing to skyrocketing budgets and an increasingly unpredictable technological landscape. This unbridled enthusiasm, though undeniably promising for innovation, concurrently harbors significant and often underestimated dangers.

The current trajectory of AI development has indeed forged a new kind of "lawless land." Pervasive "shadow deployments" of AI systems, unsecured AI endpoints, and unchecked API calls are running wild, creating a critical lack of visibility into who is developing what, and how. Much like the historical gold rush, this is a full-throttle race to exploit a new resource, with alarmingly little consideration given to inherent risks, essential security protocols, or spiraling costs. The industry is already rife with cautionary tales: the rogue AI agent that inadvertently leaked highly sensitive corporate data, or the autonomous agent that, in a mere five minutes, initiated a thousand unauthorized API calls. These "oops moments" are not isolated incidents; they are becoming distressingly common occurrences in this new, unregulated frontier.

This is precisely where the critical role of the platform engineer emerges. In this burgeoning chaos, the platform engineer is uniquely positioned to bring much-needed order, stepping into the role of the new "sheriff." More accurately, given the complexities of AI, they are evolving into the governance marshal. This transformation isn't a mere rebranding; it reflects a profound evolution of the role itself. Historically, during the nascent stages of DevOps, platform engineers operated more as "cowboys" — driven by speed, experimentation, and a minimal set of rules. With the maturation of Kubernetes and the advent of widespread cloud adoption, they transitioned into "settlers," diligently building stable, reliable platforms that empowered developers. Now, in the dynamic age of AI, the platform engineer must embrace the mantle of the marshal — a decisive leader singularly focused on instilling governance, ensuring safety, and establishing comprehensive observability across this volatile new frontier.

The Evolution of the Platform Engineer: From Builder to Guardian

This shift in identity signifies far more than just a new job title; it represents a fundamental redefinition of core responsibilities. The essence of the platform engineer's role is no longer solely about deploying and managing infrastructure. It has expanded to encompass the crucial mandate of ensuring that this infrastructure remains safe, stable, and inherently trusted. This new form of leadership transcends traditional hierarchical structures; it is fundamentally about influence — the ability to define and enforce the critical standards upon which all other development will be built. While it may occasionally necessitate saying "no" to risky endeavors, more often, it involves saying "yes" with a clearly defined and robust set of guardrails, enabling innovation within secure parameters.
As a governance marshal, the platform engineer is tasked with three paramount responsibilities:

• Gatekeeper of infrastructure: The platform engineer stands as the primary guardian at the very entry point of modern AI infrastructure. Their duty is to meticulously vet and ensure that everything entering the system is unequivocally safe, secure, and compliant with established policies and regulations. This involves rigorous checks and controls to prevent unauthorized or malicious elements from compromising the ecosystem.
• Governance builder: Beyond merely enforcing rules, the platform engineer is responsible for actively designing and integrating governance mechanisms directly into the fabric of the platform itself. This means embedding policies, compliance frameworks, and security protocols as foundational components, rather than afterthoughts. By building governance into the core, they create a self-regulating environment that naturally steers development towards best practices.
• Enabler of innovation: Crucially, the ultimate objective of the platform engineer is not to impede progress or stifle creativity. Instead, their mission is to empower teams to build and experiment fearlessly, without the constant dread of catastrophic failures. This role transforms into that of a strategic enabler, turning seemingly impossible technical feats into repeatable, manageable processes through the provision of standardized templates, robust self-service tools, and clearly defined operational pathways. They construct the scaffolding that allows innovation to flourish securely.

Consider the platform engineer not as an obstructionist, but rather as a highly skilled and visionary highway engineer. They are meticulously designing the safe on-ramps, erecting unambiguous signage, and setting appropriate speed limits that enable complex AI workflows to operate at peak efficiency and speed, all while meticulously preventing collisions and catastrophic system failures.

The Governance Arsenal: The AI Marshall Stack

Platform engineers do not enter this challenging new domain unprepared. They possess a sophisticated toolkit — their "governance arsenal" — collectively known as the AI Marshall Stack. This arsenal is composed of several critical components:

• AI gateway: Functioning as a "fortified outpost," the AI Gateway establishes a single, secure point of entry for all applications connecting to various LLMs and external AI vendors. This strategic choke point is where fundamental controls are implemented, including intelligent rate limiting to prevent overload, robust authentication to verify user identities, and critical PII (Personally Identifiable Information) redaction to protect sensitive data before it reaches the AI models.
• Access control: This element represents "the law" within the AI ecosystem. By leveraging granular role-based access control (RBAC), the platform engineer can precisely define and enforce who has permission to utilize specific AI tools, services, and data. This ensures that only authorized individuals and applications can interact with sensitive AI resources, minimizing unauthorized access and potential misuse.
• Rate limiting: This is the essential "crowd control" mechanism. It acts as a preventative measure against financial stampedes and operational overloads, effectively preventing scenarios like a misconfigured or rogue AI agent making thousands of costly API calls within a matter of minutes, thereby safeguarding budgets and system stability (a small sketch follows at the end of this section).
• Observability: These components serve as the "eyes on the street," providing critical real-time insights into the AI landscape. A significant proportion of AI-related problems stem not from technical failures but from a profound lack of visibility. With comprehensive observability, the platform engineer gains precise knowledge of who is doing what, when, and how, enabling them to swiftly identify and address misbehaving agents or unexpected API spikes before they escalate into significant damage or costly incidents.
• Cost controls: These are the "bankers" of the AI Marshall Stack. They are designed to prevent financial overruns by setting explicit limits on AI resource consumption and preventing the shock of unexpectedly large cloud bills. By implementing proactive cost monitoring and control mechanisms, they ensure that AI initiatives remain within budgetary constraints, fostering responsible resource allocation.

By meticulously constructing and deploying these interconnected systems, platform engineers are not merely averting chaos; they are actively fostering an environment where teams can build and innovate with unwavering confidence. The greater the trust users have in the underlying AI infrastructure and its governance, the more rapidly and boldly innovation can proceed. Governance, in essence, is the mechanism through which trust is scaled across an organization. Just as robust rules and well-defined structures allowed rudimentary frontier towns to evolve into flourishing, complex cities, comprehensive AI governance is the indispensable framework that will enable AI to transition from a series of disparate, one-off experiments into a cohesive, strategically integrated product strategy.
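As a hedged sketch of the rate-limiting and cost-control ideas above (not tied to any specific gateway product), a per-team token bucket in front of outbound LLM calls might look like this; the limits and the call_llm() helper are illustrative assumptions.

Python
# Minimal per-team token bucket guarding outbound LLM calls.
# Thresholds and the call_llm() function are illustrative assumptions.
import time

class TokenBucket:
    def __init__(self, rate_per_minute: int, burst: int):
        self.capacity = burst
        self.tokens = float(burst)
        self.refill_per_sec = rate_per_minute / 60.0
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets = {"team-a": TokenBucket(rate_per_minute=60, burst=10)}

def guarded_llm_call(team: str, prompt: str):
    bucket = buckets.get(team)
    if bucket is None or not bucket.allow():
        raise RuntimeError(f"Rate limit exceeded for {team}; request blocked at the gateway")
    return call_llm(prompt)  # placeholder for the actual model call

The same bucket counters can feed cost dashboards, so budget overruns and runaway agents are caught by the same mechanism.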
Why the Platform Engineer Is the Right Person for the Job: The AI Marshal's Unique Advantage

Platform engineers are uniquely and exceptionally well-suited to assume this critical role of the governance marshal. They possess the nuanced context of development cycles, the inherent influence within engineering organizations, and the technical toolkit necessary to implement and enforce AI governance effectively. They have lived through and shaped the eras of the "cowboy" and the "settler"; now, it is unequivocally their time to become the "marshal."

The AI landscape, while transformative, is not inherently lawless. However, it desperately requires systematic enforcement and a foundational structure. It needs a leader to build the stable scaffolding that allows developers to move with agility and speed without the constant threat of crashing and burning. This vital undertaking is not about imposing control for the sake of control; rather, it is fundamentally about safeguarding everyone from the inevitable "oops moments" that can derail projects, compromise data, and exhaust budgets. It is about actively constructing a superior, inherently safer, and demonstrably smarter AI future for every stakeholder.

Therefore, the call to action for platform engineers is clear and urgent: do not passively await others to define the rules of this new frontier. Seize the initiative. Embrace the role of the hero. Build a thriving, resilient AI town where innovation can flourish unencumbered, and where everyone can contribute and grow without the paralyzing fear of stepping on a hidden landmine.

Final Thoughts

AI doesn't need to be feared. It just needs to be governed. And governance doesn't mean slowing down—it means creating the structures that let innovation thrive. Platform engineers are in the perfect position to lead this shift. We've been cowboys. We've been settlers. Now it's time to become marshals.

So, to all the platform engineers out there: pick up your badge, gather your toolkit, and help tame the AI frontier. The future of safe, scalable, and trusted AI depends on it. Because the Wild West was never meant to last forever. Towns become cities. And with the right governance in place, AI can move from chaos to confidence — and unlock its full potential.

Want to dive deeper into the AI Marshal Stack and see how platform engineers can tame the AI Wild West in practice? Watch my full PlatformCon 2025 session to discover how to move from cowboy experiments to marshal-led governance — and build the trusted AI foundations your organization needs.

Trend Report

Kubernetes in the Enterprise

Over a decade in, Kubernetes is the central force in modern application delivery. However, as its adoption has matured, so have its challenges: sprawling toolchains, complex cluster architectures, escalating costs, and the balancing act between developer agility and operational control. Beyond running Kubernetes at scale, organizations must also tackle the cultural and strategic shifts needed to make it work for their teams.

As the industry pushes toward more intelligent and integrated operations, platform engineering and internal developer platforms are helping teams address issues like Kubernetes tool sprawl, while AI continues cementing its usefulness for optimizing cluster management, observability, and release pipelines.

DZone's 2025 Kubernetes in the Enterprise Trend Report examines the realities of building and running Kubernetes in production today. Our research and expert-written articles explore how teams are streamlining workflows, modernizing legacy systems, and using Kubernetes as the foundation for the next wave of intelligent, scalable applications. Whether you're on your first prod cluster or refining a globally distributed platform, this report delivers the data, perspectives, and practical takeaways you need to meet Kubernetes' demands head-on.


Refcard #387

Getting Started With CI/CD Pipeline Security

By Sudip Sengupta

Refcard #216

Java Caching Essentials

By Granville Barnett

More Articles

What Is End-to-End Testing?

Being a part of a software team, you have probably heard about end-to-end or E2E testing. The testing team ideally prefers to have a round of end-to-end testing to ensure the functional working of the application. Every software application should undergo end-to-end testing to ensure it functions as specified. This testing approach builds confidence in the system and helps development teams determine whether the software is ready for production deployment. In this tutorial, I will guide you through what end-to-end testing is, why it's important, and how effectively you can implement it in your software project.

What Is End-to-End Testing?

End-to-end testing refers to testing the software from the end user's perspective. It verifies that all software modules function correctly under real-world conditions. The core purpose of end-to-end testing is to replicate the real-world user experience by testing the application's workflow from beginning to end.

Let's take an example of the Parabank demo banking application, where different modules like registration, login, accounts, transactions, payments, and reports were built in isolation. Considering end-to-end testing, we should perform a comprehensive test of the end-user journey beginning from registration, then verifying the login functionality, the accounts module by creating a new bank account, performing transactions such as transferring money to different accounts, and checking the status report of the transaction. These tests mimic real user interactions, allowing us to identify issues within the application as it is used from start to finish.

What Is the Goal of End-to-End Testing?

The primary goal of end-to-end testing is to ensure that all software modules function correctly in real-world scenarios. Another key objective is to identify and resolve hidden issues before the software is released to production. For example, consider an end-to-end test of a loan application that allows users to manually fill in their details and check their eligibility for a loan. By performing end-to-end testing, we can ensure that the user will be able to complete the journey without any issues. End-to-end testing not only checks the functionalities and features, but also gives us feedback on the overall user experience.

When to Perform End-to-End Testing?

End-to-end testing is usually conducted after completing the functional and system testing. It is better to perform it before major releases to confirm that the application works from the end user's perspective without any errors. It may help us uncover hidden issues as we combine all the modules and test the overall application from beginning to end, just as an end user would. It is recommended to integrate end-to-end tests into CI/CD pipelines to validate workflows and receive faster feedback on builds.

Test Strategy

Ideally, end-to-end testing should be performed at the end of the software development life cycle. The majority of the tests should be unit tests, followed by integration and service-level tests, with end-to-end tests performed last. As per Google's Testing blog, Google suggests a 70/20/10 split: that is, 70% unit tests, 20% integration tests, and 10% end-to-end tests. The specific combination may vary for each team, but it should generally maintain the shape of a pyramid.
In short, unit tests form the base, integration tests come next, and end-to-end tests sit at the top of this structure, forming the shape of the pyramid.

Different Stages of End-to-End Testing

The following are the three phases of performing end-to-end testing:

• Planning
• Testing
• Test closure

Let's learn about these phases in detail, one by one.

Planning

In the planning phase, the following points should be considered:

• Understand the business and functional requirements
• Create a test plan based on the requirement analysis
• Create test cases for end-to-end testing scenarios

A tester should gain knowledge of the application and understand the different test journeys within it. These test journeys should be designed from the end user's point of view, covering the entire process from beginning to end. All the happy paths should be noted down, and accordingly, test cases should be designed. For example, from an e-commerce application point of view, a simple test journey would be as shown in the figure below:

Similarly, other test journeys can also be prepared where the user adds the product to the cart and logs out of the application, then logs in again and continues from where he left off, and so on.

We should also consider the following points in the planning phase to get an upper hand on the testing:

• Set up a production-like environment to simulate real-world scenarios
• Set up the test data, test strategy, and test cases for testing real-world scenarios
• Define entry and exit criteria to have a clear objective for end-to-end testing
• Get the test cases, test data, and entry and exit criteria reviewed by the business analyst or product owner

Testing

The testing phase can be divided into two stages, namely, the prerequisites and test execution.

Prerequisites

In this stage, it should be ensured that:

• All the feature development is complete
• All the submodules and components of the application are integrated and working as a system
• System testing is complete for all the related sub-systems in the application
• The staging environment, designed to replicate the production setup, is fully operational. This environment enables us to simulate real-world scenarios and effectively reproduce production-like conditions, allowing seamless testing of the end-to-end scenarios.

After completing the prerequisites, we can proceed to the test execution stage.

Test Execution

In this stage, the testing team should:

• Execute the test cases
• Report bugs in case of test failures
• Retest the bugs once they are fixed
• Rerun all the end-to-end tests to ensure everything works as expected

The end-to-end tests can be executed manually or using automation in the CI/CD pipelines. Executing end-to-end tests through an automated pipeline is the recommended approach, as it saves time and effort for the testing team while ensuring high-quality results in the shortest possible time.

Test Closure

In this stage, the following actions should be performed:

• Analysis of the test results
• Test report preparation
• Evaluation of the exit criteria
• Test closure

The test closure stage in end-to-end testing involves finalizing test activities and documenting results. It ensures that all the test deliverables are complete. It also includes assessing test coverage and documenting key takeaways, for example, noting down known issues. Finally, a test closure report is prepared for the stakeholders. This report could prove to be of great help in Go/No-Go meetings.

End-to-End API Testing Example

Let's take an example of the RESTful e-commerce APIs; there are a total of six main APIs in the RESTful e-commerce application, as follows:

• Create Token (POST /auth)
• Add Order (POST /addOrder)
• Get Order (GET /getOrder)
• Update Order (PUT /updateOrder)
• Partial Update Order (PATCH /partialUpdateOrder)
• Delete Order (DELETE /deleteOrder)

Before performing end-to-end testing on these APIs, we should first analyze their requirements, usage patterns, and technical specifications. These details will be useful in writing the end-to-end test cases as well as designing the automation test strategy. As per Swagger, the following functional points related to the APIs can be noted:

• The POST Add Order API is used to create new orders in the system, while the GET Order API retrieves an order using the provided order ID.
• The Create Token API generates a token that will be used in the Update and Delete APIs as a security aspect, so only registered users can update or delete their orders.
• The update and partial update APIs will be used for updating the orders.
• The delete API will be used for deleting the order.

Considering these details, the following testing strategy can be used for end-to-end testing:

1. Generate a new token by hitting the POST /auth API and saving it for further use.
2. Create new orders using the POST /addOrder API.
3. Retrieve the newly created order by passing the Order ID in the GET /getOrder API.
4. Using the earlier generated token, update an existing order using the PUT /updateOrder API.
5. Verify the partial update order functionality by updating an existing order using the PATCH /partialUpdateOrder API.
6. Delete the existing order by using the DELETE /deleteOrder API.
7. To verify that the order has been successfully deleted, hit the GET /getOrder API. Here, the status code 404 should be retrieved in the response, considering that the order has been deleted from the system.
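The strategy above can be automated with a handful of requests calls. The sketch below is a hedged illustration: the base URL, credentials, payload fields, and expected status codes are assumptions, since the exact schemas are not spelled out here.

Python
# Hedged sketch of the strategy above using the requests library.
# The base URL, credentials, payload fields, and response shapes are assumptions.
import requests

BASE_URL = "http://localhost:3004"  # assumed local instance of the e-commerce service

def test_order_lifecycle():
    # Step 1: generate a token (credentials are placeholders)
    auth = requests.post(f"{BASE_URL}/auth",
                         json={"username": "admin", "password": "secretPass123"})
    headers = {"Authorization": auth.json()["token"]}

    # Step 2: create a new order (field names are assumed)
    order = {"user_id": "1", "product_id": "101", "product_amount": 99.99, "qty": 1}
    order_id = requests.post(f"{BASE_URL}/addOrder", json=order).json()["id"]

    # Step 3: retrieve the newly created order
    assert requests.get(f"{BASE_URL}/getOrder", params={"id": order_id}).status_code == 200

    # Steps 4 and 5: full and partial updates using the token
    assert requests.put(f"{BASE_URL}/updateOrder/{order_id}",
                        json=order, headers=headers).status_code == 200
    assert requests.patch(f"{BASE_URL}/partialUpdateOrder/{order_id}",
                          json={"qty": 2}, headers=headers).status_code == 200

    # Steps 6 and 7: delete the order and confirm it is gone
    requests.delete(f"{BASE_URL}/deleteOrder/{order_id}", headers=headers)
    assert requests.get(f"{BASE_URL}/getOrder", params={"id": order_id}).status_code == 404

Run with pytest, this single test exercises the whole order lifecycle and fails fast if any step in the journey breaks.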
It can be seen that we used all the major APIs to perform end-to-end testing as a real-world scenario. Similarly, end-to-end testing can be carried out for a web or mobile application. It's important to evaluate the application from an end user's perspective, create relevant test scenarios, and have them reviewed by the team's business analyst or product owner.

Summary

End-to-end testing is a comprehensive testing approach that validates the entire workflow of an application, from start to finish, to ensure all integrated components function as expected. It simulates real user scenarios to identify issues across different modules within the system and their dependencies. This helps ensure the application provides a smooth and reliable user experience and uncovers issues early, before the end users face them. Happy testing!

By Faisal Khatri
Automating Excel Workflows in Box Using Python, Box SDK, and OpenPyXL

In many organizations, MS Excel remains the go-to tool for storing and sharing structured data, whether it's tracking project progress, managing audit logs, or maintaining employee or resource details. Yet, a surprisingly common challenge persists: data is still being copied and updated manually. Teams across different functions, especially management and DevOps, often find themselves entering or syncing data from one source into Excel spreadsheets manually and repeatedly. This not only consumes time but also introduces room for errors and inconsistencies. For example:

• A manager who regularly fetches data from a project board to Excel to track progress.
• A DevOps engineer who tracks resource utilization across environments.
• An auditor who needs to sync logs from internal tools into an Excel sheet stored in Box for compliance review.

These tasks are ripe for automation. Box is a popular cloud storage provider used by several organizations to store files related to projects and employees. With the help of modern tools like the Box Python SDK, openpyxl, and pandas, it's possible to read and write Excel files directly in your code, with no downloading or manual editing required. In this blog post, we'll explore how to:

• Connect to Box using Python
• Read Excel files stored in Box
• Update and append new data to sheets
• Upload the updated file back to Box

This workflow is especially useful for DevOps, SREs, and team leads who need to keep operational data synced across systems, or for managers and analysts looking to automate routine Excel updates.

Setting Up Your Box App

Step 1: Create a Box Developer App

• Go to the Box Developer Console.
• Click Create New App.
• Choose "Custom App" as the app type.
• Give your app a name and description.
• Select "OAuth 2.0 with JWT (Server Authentication)".

This will generate a new application that can interact with your Box account using server-side authentication.

Step 2: Configure App Permissions

Once your app is created:

• Go to your app's Configuration tab.
• Under Application Scope, enable: Read and write all files and folders stored in Box.

Important: Once configured, you'll need to submit the app for authorization by your Box Admin (if your account is part of a Box Enterprise). Until it's approved, API calls will be limited or denied.

Step 3: Generate and Download Configuration File

Back in the Configuration tab:

• Scroll down to the App Credentials section.
• Click Generate a Public/Private Keypair.
• A config JSON file will be downloaded. It contains the Client ID and Secret, the JWT Public Key ID, the Private Key (for signing the JWT), the passphrase for the private key, and the Enterprise ID.

Step 4: Share the Box Folder/File With the Box App Automation User

• Go to your app's General settings.
• Under the Service account ID, copy the automation user ID.
• Share the Box folder/file with this automation user ID. This will give read and write privileges to the Box app.

Keep the config file safe and private — it grants full API access.

Installing Required Libraries

Before diving into code, let's install the required Python libraries. These will enable interaction with Box and manipulation of Excel files.

Shell
pip install boxsdk openpyxl pandas

Here's a quick overview of what each library does:

• boxsdk: The official Python SDK for Box's APIs. It allows you to authenticate, access files, upload/download content, and manage folders using the Box Developer API.
• openpyxl: A powerful library for reading and writing Excel .xlsx files in Python. It lets you work with individual cells, formulas, styles, and sheets.
• pandas: A data analysis library in Python. Useful when you want to process or filter Excel data in a tabular format using DataFrames.

Authenticating With Box in Python

To interact with Box via the SDK, you'll first need to authenticate your application.

Python
from boxsdk import JWTAuth, Client

def init_box_client():
    auth = JWTAuth.from_settings_file("<path/to/config.json>")
    client = Client(auth)
    return client

Once authenticated, the client can access any file, folder, or metadata your app has permissions for (yes, you need to add the app you created as an editor to the file you want to automate).

Reading Excel Files From Box

Once authenticated, you can access Excel files in Box either by their file ID or by searching for their name.

1. Accessing a File by ID

The ID is the last part of the URL of a Box file.

Python
box_file = client.file(file_id)

2. Downloading to Memory Using BytesIO

You don't need to write the file to disk. You can load it directly into memory:

Python
import io
from openpyxl import load_workbook

file_stream = io.BytesIO()
box_file.download_to(file_stream)
file_stream.seek(0)  # Rewind to the beginning
workbook = load_workbook(file_stream)

3. Accessing a Specific Sheet

Python
sheet = workbook["Sheet1"]

Or dynamically get the first sheet:

Python
sheet = workbook.active

4. Reading Into a pandas DataFrame (Optional)

If you'd prefer to use pandas for data analysis:

Python
import pandas as pd

file_stream.seek(0)  # Ensure the pointer is at the start
df = pd.read_excel(file_stream, sheet_name="Sheet1")

Overall, the read-Excel function would look like this:

Python
def load_excelsheet_from_box(client, id, sheet_name):
    box_file_ref = client.file(id)
    box_file_metadata = box_file_ref.get()
    file_stream = io.BytesIO()
    box_file_ref.download_to(file_stream)
    file_stream.seek(0)
    workbook = load_workbook(file_stream)
    sheet = workbook[sheet_name]
    return [box_file_ref, box_file_metadata, workbook, sheet]

Working With Excel Data

With your Excel sheet loaded, you can now read and manipulate rows using openpyxl or pandas.

1. Iterating Over Rows With openpyxl

Python
for row in sheet.iter_rows(min_row=2, values_only=True):
    print(row)

This will print data from the second row onward (assuming row 1 is headers).

2. Accessing Headers and Filtering

You can extract headers using:

Python
headers = [cell.value for cell in sheet[1]]

You might use this to filter or map column positions.

3. Using pandas for Heavy Processing

If you want to filter, pivot, or merge data, load it into a DataFrame:

Python
df = pd.read_excel(file_stream)
filtered_df = df[df["Status"] == "Active"]

This approach is powerful when working with large Excel sheets or when doing analysis.

Saving Changes Back to Box

After modifying the Excel workbook, the next step is to save the updated file and upload it back to Box.

1. Saving to Memory

We use BytesIO to avoid writing to disk:

Python
updated_stream = io.BytesIO()
workbook.save(updated_stream)
updated_stream.seek(0)

2. Uploading to Box

Box allows you to replace the contents of a file using its ID:

Python
client.file(file_id).update_contents_with_stream(updated_stream)

This creates a new version of the file in Box, preserving version history and allowing collaboration without risk of losing older data.

End-to-End Example: Sync Missing Names

Here's a complete example that demonstrates reading from Box, checking for missing names, appending new entries, and uploading the updated file back to Box.
Use Case

Let's say you want to track details of all employees who are onboarding to your platform.

Python
from boxsdk import Client, JWTAuth
from openpyxl import load_workbook
import io

# Step 1: Authenticate
auth = JWTAuth.from_settings_file("box_config.json")
client = Client(auth)

# Step 2: Download the Excel file from Box
file_id = "1234567890"
box_file = client.file(file_id)
file_stream = io.BytesIO()
box_file.download_to(file_stream)
file_stream.seek(0)

# Step 3: Load and read the Excel sheet
workbook = load_workbook(file_stream)
sheet = workbook["Sheet1"]
existing_names = {
    (row[0] or "").strip().lower()
    for row in sheet.iter_rows(min_row=2, values_only=True)
    if row[0]
}

# Sample new data. This should come from your platform's API.
new_rows = [
    {"name": "Suresh", "id": "A001", "owner": "Team A"},
    {"name": "Ramesh", "id": "B002", "owner": "Team B"},
]

# Step 4: Append new entries
for entry in new_rows:
    name = (entry.get("name") or "").strip().lower()
    if name and name not in existing_names:
        sheet.append([entry["name"], entry.get("id", ""), entry.get("owner", "")])

# Step 5: Save and re-upload
updated_stream = io.BytesIO()
workbook.save(updated_stream)
updated_stream.seek(0)
box_file.update_contents_with_stream(updated_stream)

Conclusion

Automating Excel tasks with Box can save hours of manual effort each week. Whether you're a management team syncing reports or a DevOps team tracking infrastructure resources, this workflow helps:

• Reduce human error
• Improve efficiency
• Ensure data consistency

By combining the Box SDK, openpyxl, and pandas, you unlock a powerful set of tools to manipulate Excel files in the cloud, without even opening a UI. From daily reports to audit tracking, once set up, these automations are a game-changer for your productivity.

You can refer to the Box documentation for detailed information about available Box APIs:

• https://developer.box.com/reference/
• https://github.com/box/box-python-sdk

By Sweetty P Devassy
*You* Can Shape Trend Reports: Join DZone's Database Systems Research

Hey, DZone Community! We have an exciting year of research ahead for our beloved Trend Reports. And once again, we are asking for your insights and expertise (anonymously if you wish) — readers just like you drive the content we cover in our Trend Reports. Check out the details for our research survey below.

Database Systems Research

With databases powering nearly every modern application nowadays, how are developers and organizations utilizing, managing, and evolving these systems — across usage, architecture, operations, security, and emerging trends like AI and real-time analytics? Take our short research survey (~10 minutes) to contribute to our upcoming Trend Report. Oh, and did we mention that anyone who takes the survey could be one of the lucky four to win an e-gift card of their choosing? We're diving into key topics such as:

• The databases and query languages developers rely on
• Experiences and challenges with cloud migration
• Practices and tools for data security and observability
• Data processing architectures and the role of real-time analytics
• Emerging approaches like vector and AI-assisted databases

Join the Database Systems Research

Over the coming month, we will compile and analyze data from hundreds of respondents; results and observations will be featured in the "Key Research Findings" of our upcoming Trend Report. Your responses help inform the narrative of our Trend Reports, so we truly cannot do this without you. Stay tuned for each report's launch and see how your insights align with the larger DZone Community. We thank you in advance for your help!

—The DZone Content and Community team

By DZone Editorial
The Dangerous Middle: Agile Roles That AI Will Erode First

TL;DR: The Dangerous Middle and the Future of Scrum Masters and Agile Coaches

AI is changing product and tech work. Peter Yang, a renowned product leader, argues that AI will split product roles into two groups: generalists who can prototype end-to-end with AI, and specialists in the top 5% of their fields. Everyone else in the dangerous middle risks being squeezed. How does this apply to agile practitioners: Scrum Masters, Product Owners, Agile Coaches, and transformation leads? It does, with important nuances. Find Peter Yang's article here: 5 Practical Steps to Future-Proof Your Career in the AI Era.

The Reality of Process Facilitation

Practitioners are exposed if their core value is running Sprint Planning, facilitating Retrospectives, or maintaining Jira backlogs. Tools now automate or support much of this work. We've seen this before. As teams learned to self-organize, demand for "process managers" fell. A similar shift is underway. Organizations may decide they don't need practitioners whose skills stop at event facilitation. When other leaders can pull templates for Retrospectives, Sprint Planning, and backlog refinement, they ask, "What are we paying a Scrum Master for?" If you can't answer in business terms, not process terms, you sit in the vulnerable middle that Yang describes.

In the Scrum Anti-Patterns Guide, I documented patterns like "Scrum Master as Secretary" and "Product Owner as Backlog Administrator." These map to the vulnerable middle roles in Yang's framework. If your primary output is meeting notes and Jira tickets, you're at risk.

Where Agile Practitioners Hold an Edge

The most valuable agile work isn't framework rollout. It's building organizational learning. This approach calls for integrative mastery: combine data analysis, organizational psychology, systems thinking, and political skill to diagnose why empiricism isn't happening even when teams "do Agile."

Example: Performance drops, Retrospectives show conflict, and leadership pushes for speed. The issue isn't a better Retro or a lecture on "velocity." Pressure fuels conflict, conflict lowers productivity, which in turn triggers more pressure. Solving it means presenting uncomfortable data to leaders, facilitating hard conversations, and keeping credibility while naming executive behavior as the constraint. Software can chart velocity, if need be, and analyze sentiment. It cannot run that conversation. Your edge is integration, not execution.

The T-Shaped Agile Practitioner, Reframed

Expanding your T-shaped skills as an agile practitioner is helpful, but not as "PMs learning to code." Instead, use technology to enhance diagnosis while you keep interpretation.

Use tools to analyze at scale (see the sketch below):

• Processing Retrospective transcripts across teams to spot recurring impediments
• Examining organizational metrics for systemic signals
• Surveying stakeholders to track sentiment trends

Reserve judgment for yourself: Tools might report that five teams cited "unclear requirements." You determine whether that points to a Product Owner gap, stakeholder misalignment, or avoidance of technical-debt discussions. That call rests on pattern recognition earned over the years.

An updated T-shape for agile practitioners, therefore, comprises:

• Depth: organizational change leadership, transformation strategy, executive coaching
• Enhanced breadth: data analysis, sentiment detection, pattern recognition across large datasets, and administrative automation
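As one hedged example of the "analyze at scale" idea above (my illustration, not a prescribed tool), recurring retrospective themes can be grouped with a few lines of scikit-learn; the sample comments are made up.

Python
# Illustrative sketch: cluster retrospective comments into recurring themes.
# Assumes scikit-learn is installed; the comments list is made-up sample data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

comments = [
    "Requirements were unclear again this sprint",
    "We waited two days for the staging environment",
    "Acceptance criteria changed mid-sprint",
    "Environment provisioning blocked testing",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(comments)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for theme, comment in sorted(zip(labels, comments)):
    print(theme, comment)

# The practitioner still interprets the clusters: is "unclear requirements" a Product
# Owner gap, stakeholder misalignment, or avoidance of technical-debt discussions?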
The Agency Imperative

High agency matters here, too, but it shows up differently in Agile than in Product:

• Diagnose problems leadership hasn't named, e.g., "low productivity" rooted in organizational design
• Design interventions without waiting for permission, e.g., convene management workshops to tackle systemic blockers
• Challenge comfortable narratives, e.g., a failed Scrum rollout caused by missing empowerment, not team "commitment."

Organizations tolerate facilitators when conditions are good. They value change leaders when transformation becomes urgent, and acceleration compresses that timeline.

The Dangerous Middle

The dangerous middle shows up as:

• Scrum Masters who know the Guide but lack change-leadership skill; I call them "Scrum Master as Secretary" in the anti-patterns book
• Product Owners who maintain Product Backlogs but don't drive product decisions: the "Product Owner as Backlog Administrator" pattern
• Coaches who teach frameworks but can't navigate executive politics: "Agile Coach as Framework Preacher"
• Transformation leads who implement scaling by the book but can't link agility to outcomes

These practitioners are competent in mechanics. That is what technology erodes first. If a playbook can capture it, software can replicate it. Reading unspoken dynamics, sensing performative commitment, and spotting structural incentives remain defensible.

Another anti-pattern from the book: "Agile Coach as Framework Preacher." These coaches teach SAFe or LeSS mechanics, but can't explain why the organization isn't getting more agile despite execution by the book. Framework knowledge without diagnostic capability sits in the middle.

Implications for Leaders

Use Yang's lens to assess your teams:

• Audit roles against the polarization: Who is a facilitator versus a change leader? The former face increasing risk; the latter gain value
• Invest in technical literacy: Give practitioners time and a clear expectation to experiment with new tools. Those who learn before displacement are the ones to keep
• Redefine "senior": Senior means demonstrated organizational impact—transformations led, culture shifted, outcomes improved. Framework knowledge is table stakes; strategic influence differentiates
• Stop hiring for framework implementation: Hire for change leadership, organizational psychology, and influence without authority. Tools can handle mechanics; they fail on politics.

A Practical Path Forward

Adapt Yang's five actions to escape the dangerous middle:

• Choose your positioning: become a top-tier specialist in organizational transformation or a broad generalist who blends facilitation, coaching, analysis, and research. Avoid the competent-only facilitator role
• Use technology to expand diagnosis: automate Retrospective synthesis, metric analysis, and documentation. Spend the saved time on trust-building, political navigation, and tailored interventions
• Build tool-collaboration skills: Give precise instructions, provide examples, and review outputs critically
• Demonstrate agency in public: Write about your work, share frameworks, experiment, and document what you learn
• Eliminate low-value tasks: Automate notes, status reports, decks, and documentation. Focus on emotional dynamics, psychological safety, and power structures.

The Investment

This shift to AI is not a blip. It changes how knowledge work gets done. Agile practitioners don't need to become engineers. Use new tools to amplify strengths and focus on what software can't do.
That means:
- Experiment now
- Share learnings with the community
- Move your value from process to change leadership
- Build public credibility through visible results.
Conclusion
Yang’s core point generalizes: Agency is the agile practitioner’s moat. People who identify problems, design solutions, and execute without waiting will thrive. Agile has taught these principles for years. You can apply them to your own path, not just your teams, and save yourself from being left in the dangerous middle. New tools won’t replace agile practitioners who build learning organizations, navigate complex change, and question comfortable assumptions. However, they will erode the market for moderating “agile ceremonies by the book.”

By Stefan Wolpers
Is My Application's Authentication and Authorization Secure and Scalable?

Nowadays, most applications require authentication and authorization because threat levels keep rising, and they must be not only secure but also scalable as traffic volumes grow. The question is rarely whether an application has authentication and authorization in place; it is whether that mechanism provides real security, scalability, and the surrounding features users expect. Authentication and authorization are a domain in their own right, yet most developers and architects start with a homegrown mechanism. That approach is usually less secure due to a lack of domain expertise, and it consumes a lot of time on non-core work, so the product roadmap suffers and value delivery slows down. This blog discusses the common mistakes made in this area and how to avoid them, or recover if you are already stuck.
Evolution of the Application
When a new business idea turns into a product or service, the focus is on adding value fast and reaching the market. Non-core requirements start in a basic form, such as user authentication with a simple username and password. Over time, features accumulate around authentication and authorization based on market feedback, and once the product matures, adding more becomes both complex and time-consuming: multi-factor authentication, password policies, single sign-on, external IdP support, and compliance requirements. On top of that, related products within the same business unit or organization, built by teams working in silos, often handle authentication and authorization differently. When a customer uses several of these products together, it leaves a bad impression, since the same thing works differently across applications from the same organization. Finally, as monolithic applications are transformed into microservices, stateless authentication and authorization become essential for scalability, and they also let developers and architects run a dedicated service for identity and access management.
What Are the Challenges?
Key challenges when authentication and authorization are reinvented are:
- Complexity: As mentioned before, authentication and authorization form a domain of their own, and over time a homegrown solution becomes hard to maintain. It never gets dedicated attention from stakeholders because the primary focus is on business value in the product, so decisions are driven by time pressure rather than long-term value, which leads to rigid workflows and little flexibility in the system.
- Difficult to implement: Since it isn't the team's primary domain, it takes a long time to implement features requested by customers or the market.
- Lack of features: ISVs care about security and expect common capabilities such as single sign-on and OAuth/OIDC support to be in place, which rarely happens with a homegrown solution.
- Compliance: Meeting compliance requirements such as GDPR and other security-related standards is time-consuming.
- Scalability issues: Authentication and authorization are baked into the application/service, whereas microservices guidelines suggest each domain should be a separate service for better scalability.
- Roadmap slowdown: Business value addition in the product roadmap takes a hit, as most of the time is spent on non-core activities.
What Should Be the Approach?
If you are building a new application or service, externalize the authentication and authorization mechanism rather than embedding it in the application. Also, don't reinvent it; use an open source identity and access management solution such as Keycloak, or a paid offering. There is an initial investment, but it is limited to integration work, and it pays off because you get many features and a great deal of flexibility out of the box, along with much better security.
Choosing the Right Solution
One of the key decisions is whether to go for an open source or a paid offering. It depends on many factors, such as use case requirements and budget approval, but consider the following:
- A paid offering usually provides better support and a better chance of influencing the vendor's roadmap.
- Open source requires no license investment, but influencing the roadmap is hard unless you actively contribute.
Integration Guidance
When you integrate an external authentication and authorization system for JWT token verification, there are three high-level approaches:
- Centralized architecture: Authentication and authorization are verified at the gateway level, and the request is not authenticated again at the individual microservice level.
- Distributed architecture: Authentication and authorization happen at the microservice level, typically in a sidecar container, i.e., outside the application but inside the same pod.
- Hybrid architecture: Certain parts are verified at the gateway level and others at the microservice level.
Tips
- Don't keep authentication and authorization-related work inside the application. You may use Kong at the gateway level for a centralized architecture and OAuth2 Proxy as a sidecar for a distributed architecture.
- Don't mix access to a resource with access to an instance of a resource. Access to the resource is part of authorization; access to an instance of the resource should be handled by the application, as the sketch below illustrates.
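To make that last tip concrete, here is a minimal, hypothetical Java sketch of the split. The Invoice record, the InvoiceRepository interface, and the "invoice:read" role name are illustrative assumptions, not part of any specific product; in practice the coarse-grained, resource-level check would come from the external IAM solution or gateway rather than hand-written code.
Java
// Hypothetical sketch: resource-level authorization vs. instance-level access checks.
import java.util.Optional;
import java.util.Set;

record Invoice(String id, String ownerAccountId) {}

interface InvoiceRepository {
    Optional<Invoice> findById(String id);
}

class InvoiceService {

    private final InvoiceRepository repository;

    InvoiceService(InvoiceRepository repository) {
        this.repository = repository;
    }

    // Resource-level check: may this caller read invoices at all?
    // Normally enforced by the external IAM/gateway, e.g., via a role or scope in the JWT.
    private boolean canReadInvoices(Set<String> roles) {
        return roles.contains("invoice:read");
    }

    // Instance-level check: may this caller read *this* invoice?
    // This is business logic and belongs inside the application.
    Invoice getInvoice(String invoiceId, String callerAccountId, Set<String> callerRoles) {
        if (!canReadInvoices(callerRoles)) {
            throw new SecurityException("Missing invoice:read permission");
        }
        Invoice invoice = repository.findById(invoiceId)
                .orElseThrow(() -> new IllegalArgumentException("Unknown invoice: " + invoiceId));
        if (!invoice.ownerAccountId().equals(callerAccountId)) {
            throw new SecurityException("Invoice belongs to another account");
        }
        return invoice;
    }
}
Keeping the two checks separate means the coarse-grained rule can later move to Keycloak or the gateway without touching the ownership logic.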

By Navin Kaushik
JavaScript Internals: Understanding Runtime Optimization and How to Write Performant Code

From a straightforward browser scripting language, JavaScript has morphed into an ultra-flexible and versatile technology that powers everything from dynamic front-end UIs and back-end services (Node.js) to automation scripts and IoT devices (with libraries like Johnny-Five). Yet that flexibility introduces a lot of complexity in writing efficient, performant code. Fortunately, the JavaScript engines that execute your code employ a number of optimization strategies at runtime to improve performance. For these optimizations to be most effective, though, you as the developer must understand how the engines actually work and adopt coding practices that play to their internal processes. This article will, for the most part, stick to the basics of how the engines work and what kinds of practical, everyday coding strategies you can use to get more oomph out of your engine.
The Journey of JavaScript Code
Your JavaScript code executes after it passes through three primary stages in the JavaScript runtime:
1. AST generation: The code is parsed and translated into an Abstract Syntax Tree (AST). This structure stands between the source code and the machine language; the engine processes the AST, so the format is crucial.
2. Bytecode compilation: The AST is then compiled into bytecode, a lower-level intermediate representation that is closer to machine code but still platform-independent. This bytecode is run by the JavaScript engine.
3. JIT optimization: As the bytecode runs, the engine continually optimizes the code using Just-In-Time (JIT) compilation. The JIT compiler collects data about the runtime (such as types and usage patterns) and creates efficient machine code tailored to the environment where the code is running.
These phases are critical to understand if you're going to write fast JavaScript. It's not enough to write code that works; you need to structure it in a way that lets the engine optimize it effectively. Then it will run faster and perform better.
The Importance of JIT Compilation
JavaScript is dynamically typed, which means types are determined at runtime rather than at compile time. This allows for flexibility in the language, but it poses certain challenges for JavaScript engines. When the engine compiles JavaScript to machine code, it has very little information about the kinds of variables or the types of function arguments used in the code. With so little information, the engine cannot generate highly optimized machine code at the outset. This is where the JIT compiler comes in. JIT compilers watch the code execute, gather information at runtime about which variables and types are actually used, and then use that information to optimize the code as it runs. By optimizing based on actual usage, the JIT compiler can produce highly efficient machine code for the hot paths in your code, i.e., the frequently executed parts. But not every part of the code is eligible for this kind of optimization.
Writing Optimization-Friendly Code
One of the most important things you can do to help the JIT compiler is to maintain consistency in your code, especially where types are concerned. When you create functions that take arguments of different types or shapes, it becomes hard for the JIT compiler to predict what you meant and thus what optimizations it can apply.
Type Consistency: A Key to Performance To showcase the significance of type consistency, let's evaluate a basic instance. Imagine you possess the following function: JavaScript function get_x(obj) { return obj.x; } At first glance, this function seems to work in a very straightforward manner — it just returns the x property of an object. However, JavaScript engines have to evaluate the location of the x property during execution. This is because x could be either a direct property of the object we're passing in or something that our engine has to check for along the prototype chain. The engine also has to check to see if the property exists at all, which augments any overhead costs tied to the function's execution. Now consider that we're calling this function with objects of varying shape: JavaScript get_x({x: 3, y: 4}); get_x({x: 5, z: 6}); Here, the get_x function gets objects with different property sets (x, y, z) that make it tough for the engine to optimize. Each time the function is called, the engine must check to determine where the x property exists and whether it’s valid. However, if you standardize the structure of the objects being passed, even if some properties are undefined, the engine can optimize the function: JavaScript get_x({x: 3, y: 4, z: undefined}); get_x({x: 5, y: undefined, z: 6}); Now, the function is consistently receiving objects that maintain the same shape. This consistency allows the engine to optimize property access more effectively. When the engine accesses properties, it can now predict with a greater degree of certainty that certain properties will be present. For example, it knows that an "x" property will always be present — even if the engine also knows that the "x" property is undefined a significant amount of the time. Measuring Optimization in Action Modern JavaScript engines offer a way to monitor the optimization of your code. You can, for instance, use Node.js to observe the optimization workings of the engine on your code. JavaScript function get_x(obj) { return obj.x + 4; } // Run a loop to trigger optimization for (let i = 1; i <= 1000000; i++) { get_x({x: 3, y: 4}); } Executing this code with Node.js using the --trace-deopt and --trace-opt flags enables you to glimpse how the engine optimizes the function. JavaScript node --trace-deopt --trace-opt index.js In the output, you might see something like this: JavaScript [marking 0x063d930d45d9 <JSFunction get_x (sfi = 0x27245c029f41)> for optimization to TURBOFAN, ConcurrencyMode::kConcurrent, reason: small function] This message shows that the get_x function has been set aside for optimization and will be compiled with the very fast TURBOFAN compiler. Understanding Deoptimization Understanding deoptimization — when the JIT compiler gives up an optimized code path — is just as important as understanding optimization. Deoptimization occurs when the engine's assumptions about the types or structures of data no longer hold true. For instance, if you change the function to manage another object shape: JavaScript function get_x(obj) { return obj.x + 4; } // Initial consistent calls for (let i = 1; i <= 1000000; i++) { get_x({x: 3, y: 4}); } // Breaking optimization with different object shape get_x({z: 4}); The output may contain a message about deoptimization: JavaScript [bailout (kind: deopt-eager, reason: wrong map): begin. deoptimizing 0x1239489435c1 <JSFunction get_x...>] This means that the engine had to give up its optimized code path because the new object shape wasn't what it was expecting. 
The function get_x is no longer working on objects that have a reliable structure, and so the engine has reverted to a slower version of the function.
Maintaining Optimization With Consistent Object Shapes
To prevent deoptimization and help the engine maintain its optimizations, it's crucial to keep object shapes consistent. For instance:
JavaScript
function get_x(obj) {
  return obj.x + 4;
}

// All calls use the same object shape
for (let i = 1; i <= 1000000; i++) {
  get_x({x: 3, y: 4, z: undefined});
}

get_x({x: undefined, y: undefined, z: 4});
Here, get_x always receives objects with the same shape. This lets the engine keep its optimizations, since there is no need to deoptimize when an object's shape stays consistent.
Best Practices for Optimizable Code
To ensure that your JavaScript code is maximally efficient and optimizable by the JIT compiler, follow these best practices:
- Consistent object shapes: Make sure a function receives the same set of properties on its objects, even if some of the properties are undefined.
- Type stability: Keep the types of function parameters and return values consistent. Don't switch between primitive types (e.g., string, number) in the same function.
- Direct property access: Direct property access (like obj.x) is easier to optimize than dynamic property access (like obj["x"]), so avoid dynamic lookups where you can.
- Focus on hot code paths: Concentrate your optimization work on the parts of your code that run the most. Profiling tools can help you direct that effort to the areas that will benefit the most.
- Minimize type variability: Refrain from dynamically altering the type or shape of objects and arrays. The engine can make better assumptions and optimizations when structures remain static.
Conclusion
To write performant JavaScript, you need more than clean syntax; you need a solid understanding of how the JavaScript engine optimizes and executes your code. That means maintaining type consistency, using consistent object shapes, and knowing how the JIT compiler works. At the same time, remember that trying to optimize every conceivable point will make your code worse in the few places that actually matter and waste your time in the process. The goal is not to write "optimized code" everywhere, but to write code that is simple to understand and reason about, and to make the few hot paths that need to be fast run in a straight line from start to finish.

By Akim Mamedov
Django Architecture vs FastAPI: A Learning Path

When choosing a backend for your next Python project, the comparison between Django and FastAPI often comes up. Many developers who have spent years in Django’s “batteries-included” world eventually experiment with FastAPI for its modern, async-first approach. A Reddit thread titled “Django Architecture versus FastAPI” captures this exact debate: a long-term Django user moving to FastAPI for asynchronous advantages. I am a contributor to FastOpp, an open-source FastAPI starter package for AI web applications. It uses pre-built admin components to give FastAPI functionality comparable to Django for AI-first applications. This article organizes lessons from building FastOpp, together with community insights, into a learning path. It contrasts both frameworks’ philosophies, pinpoints when each is the right fit, and outlines practical steps for experimenting or migrating.
Django’s Architecture and Strengths
Django is a mature, full-stack web framework based on the Model-Template-View (MTV) pattern. It provides ORM, templating, forms, authentication, and an admin interface out of the box. The emphasis is on productivity: strong defaults and conventions help developers build complex apps quickly.
Strengths
- Large ecosystem of reusable apps and plugins.
- Integrated admin UI, auth, and migrations.
- Strong community and extensive documentation.
Trade-Offs
- Primarily synchronous. While Django introduced some async support, core components remain synchronous.
- More suited to monolithic applications and CRUD-heavy projects.
- Additional effort needed when building API-first or high-concurrency services.
FastAPI’s Architecture and Strengths
FastAPI, released in 2018, is a modern Python framework optimized for APIs. It is async-first, type-hint aware, and automatically generates OpenAPI/Swagger documentation. Its design favors explicitness: developers choose their own ORM, auth, and middleware.
Strengths
- High performance with async/await and minimal overhead.
- Auto-validation and type safety via Pydantic.
- Dependency injection and auto-generated API documentation.
Trade-Offs
- Minimal “batteries” included. Auth, ORM, and background tasks require external libraries.
- More architectural decisions fall to the developer.
- Ecosystem is smaller compared to Django.
Code Contrast: ChatMessage API in FastOpp-Style FastAPI vs. Django REST Framework
To make this comparison tangible, let’s adapt an example inspired by FastOpp, an opinionated FastAPI starter stack for AI web applications. We’ll implement a simple ChatMessage API.
FastAPI (FastOpp-style) Python # models.py from sqlalchemy import Column, Integer, String, Text, DateTime, func from sqlalchemy.orm import declarative_base Base = declarative_base() class ChatMessage(Base): __tablename__ = "chat_messages" id = Column(Integer, primary_key=True, autoincrement=True) role = Column(String(20), nullable=False) # "user" | "assistant" | "system" content = Column(Text, nullable=False) session_id = Column(String(64), index=True) created_at = Column(DateTime(timezone=True), server_default=func.now()) Python # schemas.py from pydantic import BaseModel, Field from typing import Optional class ChatMessageIn(BaseModel): role: str = Field(pattern="^(user|assistant|system)$") content: str session_id: Optional[str] = None class ChatMessageOut(ChatMessageIn): id: int Python # routes/chat.py from fastapi import APIRouter, Depends from sqlalchemy.orm import Session from typing import List from db import SessionLocal from models import ChatMessage from schemas import ChatMessageIn, ChatMessageOut router = APIRouter(prefix="/api/chat", tags=["chat"]) def get_db(): db = SessionLocal() try: yield db finally: db.close() @router.get("/messages", response_model=List[ChatMessageOut]) async def list_messages(session_id: str, db: Session = Depends(get_db)): return db.query(ChatMessage).filter(ChatMessage.session_id == session_id).order_by(ChatMessage.id.asc()).all() @router.post("/messages", response_model=ChatMessageOut, status_code=201) async def create_message(payload: ChatMessageIn, db: Session = Depends(get_db)): msg = ChatMessage(**payload.model_dump()) db.add(msg) db.commit() db.refresh(msg) return msg Observation: This is modular and explicit. Each layer — models, schemas, routes — is wired manually, but you gain async support, type safety, and auto-generated docs. Django REST Framework (DRF) Python # models.py from django.db import models class ChatMessage(models.Model): ROLE_CHOICES = [("user", "user"), ("assistant", "assistant"), ("system", "system")] role = models.CharField(max_length=20, choices=ROLE_CHOICES) content = models.TextField() session_id = models.CharField(max_length=64, db_index=True) created_at = models.DateTimeField(auto_now_add=True) Python # serializers.py from rest_framework import serializers from .models import ChatMessage class ChatMessageSerializer(serializers.ModelSerializer): class Meta: model = ChatMessage fields = ["id", "role", "content", "session_id", "created_at"] Python # views.py from rest_framework import viewsets from rest_framework.response import Response from rest_framework.decorators import action from .models import ChatMessage from .serializers import ChatMessageSerializer class ChatMessageViewSet(viewsets.ModelViewSet): queryset = ChatMessage.objects.all().order_by("id") serializer_class = ChatMessageSerializer @action(detail=False, methods=["get"]) def by_session(self, request): session_id = request.query_params.get("session_id") qs = self.queryset.filter(session_id=session_id) if session_id else self.queryset.none() serializer = self.get_serializer(qs, many=True) return Response(serializer.data) Python # urls.py from rest_framework.routers import DefaultRouter from .views import ChatMessageViewSet router = DefaultRouter() router.register(r"chat/messages", ChatMessageViewSet, basename="chatmessage") urlpatterns = router.urls Observation: DRF delivers a lot of functionality with fewer moving parts: serialization, routing, and even a browsable API are built in. But concurrency and async remain limited compared to FastAPI. 
Insights From the Reddit Discussion
The Reddit conversation highlights real-world migration motives:
- The original poster had been using Django for 10 years but left due to asynchronous limitations.
- FastAPI is attractive for performance-critical, modern workloads and for its explicit, “less magic” approach.
- Commenters note that Django remains powerful for traditional applications, especially where built-in features like admin and templating matter.
- Hybrid approaches are common: use Django for monolithic needs and FastAPI for performance-critical endpoints. This reflects what many teams practice in production today.
A Learning Path: From Django to FastAPI
Here is a phased roadmap for developers who know Django but want to explore or integrate FastAPI.
- Phase 0, Strengthen Django knowledge. Goal: understand Django internals. Activities: study request lifecycle, middleware, ORM, async support, channels.
- Phase 1, Build an API with Django REST Framework (DRF). Goal: learn Django’s approach to APIs. Activities: CRUD endpoints, serializers, viewsets, permissions.
- Phase 2, Prototype in FastAPI. Goal: get comfortable with idioms. Activities: write async endpoints, Pydantic models, background tasks, explore auto-docs.
- Phase 3, Compare directly. Goal: contrast patterns. Activities: rebuild selected DRF endpoints in FastAPI; compare validation, performance, error handling.
- Phase 4, Hybrid experiment. Goal: combine frameworks. Activities: run FastAPI services alongside Django, e.g., for high-throughput endpoints.
- Phase 5, Benchmark. Goal: test performance under load. Activities: concurrency benchmarks, database pooling, caching, async vs sync results.
- Phase 6, Selective migration. Goal: move critical parts. Activities: incrementally replace Django endpoints with FastAPI while monitoring regression risk.
When to Choose Which
Django (With DRF)
- CRUD-heavy apps with relational data models.
- Need for admin UI, auth, templating, or rapid prototyping.
- Teams preferring conventions over configuration.
FastAPI
- API-first or microservice architectures.
- High-concurrency or I/O-bound workloads.
- Preference for type safety and minimal middleware.
Hybrid
- Keep Django for established modules.
- Spin up FastAPI for latency-sensitive services like ML inference or real-time APIs.
- Migrate gradually as needs evolve.
Common Pitfalls in Hybrid or Migration
- Logic duplication: Extract business logic into shared libraries to avoid drift.
- Data consistency: If both frameworks share a database, carefully manage transactions and migrations.
- Authentication split: Standardize on JWT, OAuth, or another central auth service.
- Operational overhead: Two runtimes mean double monitoring and deployment complexity.
- Premature optimization: Validate Django bottlenecks before migrating; extra complexity must be justified.
Conclusion
The Reddit thread “Django Architecture versus FastAPI” captures a broader reality: Django is still excellent for monolithic, feature-rich applications, while FastAPI excels at modern, async-driven APIs. Many teams combine both, letting each framework play to its strengths. Your path doesn’t need to be binary. Start by reinforcing Django fundamentals, then prototype in FastAPI. Compare patterns, test performance, and, if justified, adopt a hybrid approach. With careful planning, you gain flexibility without locking into a single paradigm.

By Jesse Casman
MultiCloudJ: Building Cloud-Agnostic Applications in Java

According to a 2024 Gartner report, more than 92% of large enterprises now operate in multi-cloud environments. This reflects strategic priorities such as geographic scalability, high availability, regulatory compliance, and cost optimization. But with these benefits comes significant complexity. Each provider (AWS, GCP, Alibaba Cloud, and others) exposes its own APIs, semantics, and SDKs. As a result, development teams must reconcile divergent models for storage, databases, identity, and more. The outcome is often fragmented codebases filled with conditional logic, code forking, duplicated workflows, and costly rewrites when onboarding new providers. For large organizations, this slows delivery, increases operational risk, and erodes the developer experience. Ideally, enterprises would rely on a shared abstraction layer, one that standardizes semantics across providers while encapsulating provider-specific details. Previous efforts, such as Apache jclouds, attempted to solve this for a subset of services, such as blobstore, compute, and load balancers, but struggled with a REST-based architecture that lagged behind evolving cloud features, ultimately landing in the Apache Attic. This created a gap in the Java ecosystem: the need for an actively maintained, extensible SDK with deep, provider-backed integration. MultiCloudJ, recently open-sourced by Salesforce, fills that gap. Built directly on official provider SDKs rather than raw REST APIs, it ensures compatibility with the latest features while offering cloud-neutral abstractions. With portable APIs, driver layers, and provider implementations, MultiCloudJ gives Java developers a modern, unified SDK for building cloud-agnostic applications.
The Multi-Cloud Challenge
Enterprises are rapidly adopting multi-cloud strategies to strengthen resilience, meet compliance needs, and avoid over-reliance on a single provider. The business motivations are clear:
- Resilience and availability: Enable failover, disaster recovery, and regional deployments for low latency.
- Compliance: Satisfy data sovereignty and regulatory requirements with regional providers.
- Cost optimization: Run workloads on the most cost-effective cloud infrastructure.
- Leverage: Reduce vendor lock-in and strengthen negotiation power with providers.
- Time-to-market: Accelerate rollouts by avoiding costly rewrites tied to a single provider’s SDK.
Yet these benefits for the business often create challenges for developers. Each provider exposes unique SDKs and semantics, forcing teams to duplicate logic, maintain sprawling if/else and switch branches or complete code forks, and manage inconsistent workflows. Onboarding a new provider typically requires expensive refactoring, while steep learning curves slow down delivery. In practice, the strategic promise of multi-cloud often translates into day-to-day friction for engineers, who spend more time reconciling SDK differences than building business value.
The Need for a Unified Approach
This growing gap between enterprise strategy and developer experience calls for a new abstraction: a way to standardize cloud access without sacrificing provider capabilities. By offering consistent interfaces across storage, databases, and messaging services, such an approach reduces duplication, simplifies onboarding, and allows teams to focus on business logic instead of cloud-specific rewrites. That need is what led to the creation of MultiCloudJ, an open-source Java SDK designed to make multi-cloud development practical and standardized.
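To make the "sprawling if/else" problem concrete, here is a small, hypothetical Java sketch of what a blob upload tends to look like without a shared abstraction. The provider flag, class name, and constructor wiring are illustrative assumptions; the two SDK calls (the AWS SDK v2 S3Client.putObject and the Google Cloud Storage create method) are real APIs, but client construction, credentials, and error handling are omitted.
Java
// Hypothetical "before MultiCloudJ" sketch: the same logical operation written
// twice against the official provider SDKs, selected by a runtime flag.
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;

import java.nio.charset.StandardCharsets;

public class ProviderConditionalUpload {

    private final S3Client s3;      // AWS SDK v2 client
    private final Storage gcs;      // Google Cloud Storage client
    private final String provider;  // e.g., "aws" or "gcp", decided by deployment config

    public ProviderConditionalUpload(S3Client s3, Storage gcs, String provider) {
        this.s3 = s3;
        this.gcs = gcs;
        this.provider = provider;
    }

    // Every team ends up maintaining branches like this for each cloud service it touches.
    public void upload(String bucket, String key, String content) {
        switch (provider) {
            case "aws" -> s3.putObject(
                    PutObjectRequest.builder().bucket(bucket).key(key).build(),
                    RequestBody.fromString(content));
            case "gcp" -> gcs.create(
                    BlobInfo.newBuilder(BlobId.of(bucket, key)).build(),
                    content.getBytes(StandardCharsets.UTF_8));
            default -> throw new IllegalArgumentException("Unsupported provider: " + provider);
        }
    }
}
Every additional service (documents, queues, STS) and every additional provider multiplies branches like these, which is exactly the duplication the portable BucketClient shown later in the Architecture Deep Dive is designed to remove.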
What Does MultiCloudJ Offer? MultiCloudJ is a modular Java SDK that exposes cloud-neutral interfaces for some of the most commonly used cloud services: Blob/Object storage: Backed by AWS S3, Google Cloud GCS, Alibaba OSSDocument stores: AWS DynamoDB, GCP Firestore, Alibaba TablestoreSecurity Token Service (STS): for credential delegationPub/Sub (message queues): coming soon.. By encapsulating native provider SDKs behind a uniform API, MultiCloudJ centralizes provider-specific complexity. Applications interact only with portable abstractions, while the SDK manages request translation to a specific provider, semantic differences, and provides a consistent response. Architecture Deep Dive MultiCloudJ follows a three-layer architecture: 1. Portable Layer (Public API) Developer-facing layer with portable methods (such upload(), download(), delete()) for blobstore. Java BucketClient client = BucketClient.builder("aws") .withBucket("exampleBucket") .withRegion("us-east-1"); UploadRequest req = new UploadRequest.Builder() .withKey("foo/bar.jpg") .build(); UploadResponse res = client.upload(req, "sample-content"); Switching to GCP or Alibaba requires changing only the cloud-specific values, such as the provider name from “aws” to “gcp” or Alibaba; bucket name, region name, and business logic remain unchanged. 2. Driver Layer (Abstraction and Coordination) Defines core operations and shared validation. Java public abstract class AbstractBlobStore { protected abstract UploadResponse doUpload(UploadRequest request); protected abstract void doDelete(String key); } This shields the public API from leaking provider details. 3. Provider Layer (Substrate Implementations) Implements cloud-specific logic using native SDKs. Java public class AWSBlobStore extends AbstractBlobStore { @Override protected UploadResponse doUpload(UploadRequest request) { PutObjectRequest putReq = PutObjectRequest.builder() .bucket(bucket) .key(request.getKey()) .build(); s3Client.putObject(putReq, RequestBody.fromString(request.getContent())); return new UploadResponse(...); } } Design Benefits Portability: Swap providers without rewriting application code.Unified, cloud-agnostic interfaces: Write once and interact with multiple clouds through consistent APIs.Extensibility: Onboard new providers or services without modifying existing code.Flexibility: Override defaults or inject custom implementations as needed.Uniform semantics: Normalize differences across providers to simplify development. Example – Object deletion: AWS S3 returns 200 OK when deleting a non-existent object, while Google Cloud Storage returns 404 Not Found. MultiCloudJ abstracts this by providing a consistent, predictable response.Example – Pagination: DynamoDB uses LastEvaluatedKey for continuation, while Firestore relies on the last document of a page. MultiCloudJ standardizes pagination into a uniform API.Reliability: Robust error handling and retry mechanisms ensure consistent, dependable operation. Note: MultiCloudJ strives for uniform semantics but does not replicate every provider-specific capability; there are always some rare exceptions. For example, global transactions are available in DynamoDB, Firestore, and CosmosDB, but not in Alibaba Tablestore. In such cases, the SDK throws explicit exceptions to signal unsupported features. 
Real-World Use Cases
- Global SaaS platforms: deploy to AWS in North America and Alibaba Cloud in Asia
- Hybrid cloud deployments: route workloads by business unit or geography
- Disaster recovery: maintain a fallback provider for failover
- Cross-cloud replication: sync documents or blobs across providers
- Migration: shift workloads between providers with minimal application changes
Getting Started for Developers
Maven Dependency
XML
<dependency>
  <groupId>com.salesforce.multicloudj</groupId>
  <artifactId>docstore-client</artifactId>
  <version>0.2.2</version>
</dependency>
Switch docstore with blobstore, sts, or provider modules.
Runtime Configuration
Include provider-specific modules with Maven profiles:
XML
<dependency>
  <groupId>com.salesforce.multicloudj</groupId>
  <artifactId>docstore-gcp</artifactId>
  <version>0.2.2</version>
</dependency>
Example: Writing to a Document Store
Java
CollectionOptions opts = new CollectionOptions.CollectionOptionsBuilder()
    .withTableName("books")
    .withPartitionKey("title")
    .withSortKey("author")
    .build();

DocStoreClient client = DocStoreClient.builder("aws")
    .withRegion("us-west-2")
    .withCollectionOptions(opts)
    .build();

Book book = new Book("YellowBook", "Zoe", "Seattle", 3.99f, Map.of("Ch1", 5, "Ch2", 10), null);
client.create(new Document(book));
Switching to Firestore or Tablestore requires only updating provider and resource configs.
Lessons Learned in Building MultiCloudJ
Conformance Testing Across Providers
To ensure consistent behavior, all MultiCloudJ tests are written against abstract driver classes. This allows the same test suite to run across different providers, with each provider supplying its own configuration (credentials, resource names, etc.). This approach guarantees uniform behavior and quickly highlights any provider-specific deviations.
Testing in CI Environments
Running full integration tests in CI pipelines posed challenges, primarily because securely managing live credentials for multiple providers is risky. To solve this, the team used WireMock as a proxy, recording and replaying provider responses. This enabled reliable CI testing without requiring actual provider credentials, while still validating real-world request/response flows.
Handling Cloud-Specific Features
MultiCloudJ generally avoids exposing features unique to a single provider, as doing so would compromise portability. However, in some cases, the team chose to implement provider-specific functionality at the SDK layer while keeping a consistent client-facing experience. For example, Google Cloud Storage lacks native multipart upload support, so MultiCloudJ implements it internally using composable objects, giving developers a seamless, uniform experience.
Defining Meaningful Semantics
Each provider has its own interpretation of common operations, making it critical to standardize semantics for developers. Some differences are simple, while others require more nuanced handling, such as Firestore's pagination approach or the missing native multipart upload in the Google Cloud SDK. The common semantics have to be designed very carefully to offer uniform behavior.
Leverage Provider SDKs Instead of Reinventing the Wheel
We considered reviving Apache jclouds, but building directly on raw REST APIs is impractical. Cloud providers’ SDKs already handle the heavy lifting: request signing, headers, authentication flows, TLS, error handling, and endpoint management.
Recreating and maintaining all of that ourselves would be fragile, time-consuming, and unsustainable.
Conclusion
MultiCloudJ tackles one of the toughest challenges in enterprise cloud engineering: achieving portability without compromising capability. By abstracting core services through provider-backed APIs, it delivers a consistent developer experience and reduces the operational complexity of multi-cloud environments. For enterprises navigating compliance, cost, and resilience requirements, MultiCloudJ offers a practical and sustainable path forward. The project is open source under Salesforce’s GitHub organization and welcomes community contributions and feedback: github.com/salesforce/multicloudj.

By Sandeep Pal
PostgreSQL Full-Text Search vs. Pattern Matching: A Performance Comparison

A previous article explains why PostgreSQL full-text search (FTS) is not a good candidate for implementing a general find functionality, where the user is expected to provide a pattern to be looked up and matched against the fields of one or multiple entities. Considering the previously explained technical challenges, it is clear that FTS is great for semantic and language-aware search, although it cannot cover raw searches in detail for various use cases. In the field of software development, it isn’t uncommon for us to need to accept trade-offs. Actually, almost every decision that we make when architecting and designing a product is a compromise. The balance is tilted depending on the purpose, requirements, and specifics of the developed product so that the solution and the value delivery are ensured, and the customers are helped to accomplish their needs. With this statement in mind, to achieve success, you need to compromise. This article aims to compare the two methods, FTS and pattern matching, in terms of execution performance, of course, under a set of clear preconditions (trade-offs) that are assumed. The purpose is obviously to increase the objectivity related to these methods’ applicability and help programmers choose easily when needed.
Preconditions
- Perform a pattern search on three different entities by looking up in multiple tables’ columns.
- The entities of interest are: telecom invoices, inventory items for which the invoices are issued, and orders of such inventory items.
- Entities that compose the result of the operation are displayed together.
- A ‘starts with’ matching against the aimed columns is acceptable.
- The fields of interest are: invoice number and comments, inventory number and comments, and order number and comments.
Performance Analysis
Assuming the PostgreSQL Database Server is up and running, one can connect to it and explore the three entities of interest.
SQL
select count(*) from inventories; -- 2_821_800 records
select count(*) from invoice;     -- 55_911 records
select count(*) from orders;      -- 30_564 records
One can observe that the total number of searched entities is around 3 million, which is a pretty good sample for an objective analysis. The purpose isn’t necessarily to provide the implementation in detail, but to observe the performance of several strategies and make a comparison. In order to be able to make the analysis without having to modify the existing tables, the following materialized view is created.
SQL CREATE MATERIALIZED VIEW IF NOT EXISTS mv_fts_entities AS SELECT contractid AS id, 'INVENTORY' AS type, extrainfo AS number, ( setweight(to_tsvector('simple', coalesce(extrainfo, '')), 'A') || setweight(to_tsvector('simple', coalesce(itemcomments, '')), 'B') ) AS search_vector FROM inventories UNION ALL SELECT i.invoiceid AS id, 'INVOICE' AS type, i.invoicenum AS number, ( setweight(to_tsvector('simple', i.invoicenum), 'A') || setweight(to_tsvector('simple', coalesce(i.comments, '')), 'B') || setweight(to_tsvector('simple', a.account), 'A') ) AS search_vector FROM invoice i LEFT JOIN account a on a.id = i.accountid UNION ALL SELECT orderid AS id, 'ORDER' AS type, ordernumber AS numnber, ( setweight(to_tsvector('simple', ordernumber), 'A') || setweight(to_tsvector('simple', coalesce(comments, '')), 'B') ) AS search_vector FROM orders; A few clarifications: The materialized view reunites all the records contained in the three tables of interest.id represents the unique identifier of each entity.type allows identifying the entity — INVENTORY, INVOICE, or ORDER.number is the field we’re mainly interested performing the pattern search on — inventory, invoice and order number respectively.search_vector contains the tsvector representation of the columns of interest and it represents PostgreSQL’s text search vector type that denotes the searchable content.setweight() – depending on the considered column, different weights are set when computing the lexemes, for instance, in case of orders, we consider the match to have a higher priority on order number than on comments.coalesce() handles the null values gracefully.For invoice records, a match is attempted on the invoice account as well, although the column designates an attribute of a different entity. The materialized view creation takes about three seconds. If interested in refreshing the content so that the data is up to date, one can issue the command below. SQL REFRESH MATERIALIZED VIEW mv_fts_entities; The materialized view refresh takes around 10 seconds. Additionally, the following indexes are created to improve performance in both cases. A GIN index on the search_vectorcolumn to improve full-text search SQL CREATE INDEX mv_fts_entities_search_idx ON mv_fts_entities USING GIN (search_vector); A B-Tree index on the number column to improve the ‘starts with’ pattern searching SQL CREATE INDEX mv_fts_entities_number_idx ON mv_fts_entities(number); Both operations take about five seconds. With the above items in place, let’s examine the performance in each of the following scenarios. 1. 
Full-Text Search SQL EXPLAIN ANALYZE SELECT id, type, number, search_vector, ts_rank(search_vector, query) as rank FROM mv_fts_entities, to_tsquery('simple', '101:*') query WHERE search_vector @@ query ORDER BY rank DESC LIMIT 10; This returns the following results and query plan: Plain Text +------+---------+---------------------+---------------------------------------------+---------+ |id |type |number |search_vector |rank | +------+---------+---------------------+---------------------------------------------+---------+ |162400|INVENTORY|KBBC24100 101ATI |'101ati':2A 'kbbc24100':1A |0.6079271| |13162 |INVOICE |M566274 |'10130bafy0':2A 'm566274':1A |0.6079271| |4880 |INVOICE |M554853 |'10130bafy0':2A 'm554853':1A |0.6079271| |55713 |INVOICE |M628493 |'10130bbt0':2A 'm628493':1A |0.6079271| |52525 |INVOICE |M623623 |'10130bfml0':2A 'm623623':1A |0.6079271| |35131 |INVOICE |4233816-IVG |'10111020':3A '4233816':1A 'ivg':2A |0.6079271| |34326 |INVOICE |4233312-IVG |'10111020':3A '4233312':1A 'ivg':2A |0.6079271| |34082 |INVOICE |4232587IVG |'10111020':2A '4232587ivg':1A |0.6079271| |46370 |INVOICE |101912352160142303323|'101912352160142303323':1A '9897901309489':2A|0.6079271| |132670|INVENTORY|KBBC75705 101ATI |'101ati':2A 'kbbc75705':1A |0.6079271| +------+---------+---------------------+---------------------------------------------+---------+ +-------------------------------------------------------------------------------------------------------------------------------------------------+ |QUERY PLAN | +-------------------------------------------------------------------------------------------------------------------------------------------------+ |Limit (cost=173.72..173.75 rows=10 width=29) (actual time=1.120..1.122 rows=10 loops=1) | | -> Sort (cost=173.72..174.08 rows=145 width=29) (actual time=1.119..1.120 rows=10 loops=1) | | Sort Key: (ts_rank(mv_fts_entities.search_vector, '''101'':*'::tsquery)) DESC | | Sort Method: top-N heapsort Memory: 25kB | | -> Bitmap Heap Scan on mv_fts_entities (cost=9.93..170.59 rows=145 width=29) (actual time=0.259..0.967 rows=914 loops=1) | | Recheck Cond: (search_vector @@ '''101'':*'::tsquery) | | Heap Blocks: exact=476 | | -> Bitmap Index Scan on mv_fts_entities_search_idx (cost=0.00..9.89 rows=145 width=0) (actual time=0.213..0.213 rows=914 loops=1)| | Index Cond: (search_vector @@ '''101'':*'::tsquery) | |Planning Time: 0.199 ms | |Execution Time: 1.141 ms | +-------------------------------------------------------------------------------------------------------------------------------------------------+ Key points: The Bitmap Index Scan is used for matchingUses Top-N heapsort for ordering (ranking) the resultsThe search time is around 1 ms 2. 
Case-Sensitive Wildcard Pattern Search SQL EXPLAIN ANALYZE SELECT id, type, number FROM mv_fts_entities WHERE number LIKE '101%' LIMIT 10; This returns the following results and query plan: Plain Text +------+---------+-----------+ |id |type |number | +------+---------+-----------+ |532 |ORDER |TEM1101 | |15642 |ORDER |CCON101 | |310983|INVENTORY|7037618101 | |36445 |INVOICE |83551101 | |1532 |ORDER |TEM2101 | |16642 |ORDER |CCON1101 | |546667|INVENTORY|P0000010101| |13180 |INVOICE |28071101 | |2529 |ORDER |TEM3101 | |17642 |ORDER |CCON2101 | +------+---------+-----------+ +---------------------------------------------------------------------------------------------------------------------+ |QUERY PLAN | +---------------------------------------------------------------------------------------------------------------------+ |Limit (cost=0.00..2268.43 rows=10 width=25) (actual time=0.395..2.265 rows=10 loops=1) | | -> Seq Scan on mv_fts_entities (cost=0.00..66011.44 rows=291 width=25) (actual time=0.393..2.262 rows=10 loops=1)| | Filter: ((number)::text ~~ '101%'::text) | | Rows Removed by Filter: 35657 | |Planning Time: 0.063 ms | |Execution Time: 2.277 ms | +---------------------------------------------------------------------------------------------------------------------+ Key points: The sequential scan is usedThe result set is obviously different from the one at point 1The search time is slightly longer, about 2.2 ms 3. Case-Insensitive Wildcard Pattern Search SQL EXPLAIN ANALYZE SELECT id, type, number FROM mv_fts_entities WHERE number ILIKE '101%' LIMIT 10; This returns the following results and query plan: Plain Text +------+---------+----------------------------+ |id |type |number | +------+---------+----------------------------+ |46370 |INVOICE |101912352160142303323 | |455785|INVENTORY|101t1zflivnminwdcolivomigf | |455782|INVENTORY|101t1zfclevohizhopclevoh62 | |455783|INVENTORY|101t1zfclmmohkfhoohlrdoh87 | |455784|INVENTORY|101t1zfgbtpm11111h00grblmimn| |455786|INVENTORY|101t1zfnilsmimndc0nltpmid0 | |455787|INVENTORY|101t1zfpthrmibiho6pthrmimn | |36819 |INVOICE |101912352160142393323 | |32931 |INVOICE |101912352160142392823 | |8002 |INVOICE |101912352121 | +------+---------+----------------------------+ +----------------------------------------------------------------------------------------------------------------------+ |QUERY PLAN | +----------------------------------------------------------------------------------------------------------------------+ |Limit (cost=0.00..2268.43 rows=10 width=25) (actual time=1.939..11.356 rows=10 loops=1) | | -> Seq Scan on mv_fts_entities (cost=0.00..66011.44 rows=291 width=25) (actual time=1.938..11.353 rows=10 loops=1)| | Filter: ((number)::text ~~* '101%'::text) | | Rows Removed by Filter: 35657 | |Planning Time: 0.098 ms | |Execution Time: 11.369 ms | +----------------------------------------------------------------------------------------------------------------------+ Key points: The sequential scan is usedThe result set is obviously different from the one at point 1, but also from the one at point 2The search time is longer than the ones before, about 11.5 ms 4. 
Case-Insensitive Regular Expression Search SQL EXPLAIN ANALYZE SELECT id, type, number FROM mv_fts_entities WHERE number ~* '^101' LIMIT 10; This returns the following results and query plan: Plain Text +------+---------+----------------------------+ |id |type |number | +------+---------+----------------------------+ |46370 |INVOICE |101912352160142303323 | |455785|INVENTORY|101t1zflivnminwdcolivomigf | |455782|INVENTORY|101t1zfclevohizhopclevoh62 | |455783|INVENTORY|101t1zfclmmohkfhoohlrdoh87 | |455784|INVENTORY|101t1zfgbtpm11111h00grblmimn| |455786|INVENTORY|101t1zfnilsmimndc0nltpmid0 | |455787|INVENTORY|101t1zfpthrmibiho6pthrmimn | |36819 |INVOICE |101912352160142393323 | |32931 |INVOICE |101912352160142392823 | |8002 |INVOICE |101912352121 | +------+---------+----------------------------+ +---------------------------------------------------------------------------------------------------------------------+ |QUERY PLAN | +---------------------------------------------------------------------------------------------------------------------+ |Limit (cost=0.00..2268.43 rows=10 width=25) (actual time=0.943..6.265 rows=10 loops=1) | | -> Seq Scan on mv_fts_entities (cost=0.00..66011.44 rows=291 width=25) (actual time=0.942..6.262 rows=10 loops=1)| | Filter: ((number)::text ~* '^101'::text) | | Rows Removed by Filter: 35657 | |Planning Time: 0.088 ms | |Execution Time: 6.277 ms | +---------------------------------------------------------------------------------------------------------------------+ Key points: The sequential scan is used.The result set is obviously different from the one at point 1, but identical to the previous one.The search time is longer than the ones using the full-text search, but shorter than the pattern searching, about 6.2 ms. The results obtained in terms of execution time can be summarized as follows: FTS << Case Sens. Pattern Search < Case Insens. Regex Search < Case Insens. Pattern Search The experiment in this article and the results obtained make FTS a candidate worth considering even for pattern searching when its known limitations in the scenarios of interest are acceptable. Moreover, its configuration flexibility in terms of tsvector computation and its speed of execution make it superior in comparison to other solutions, of course, under the presented circumstances. Resources Pattern Searching and PostgreSQL Full-Text Search: Understanding the MismatchThe picture was taken at an immersive exhibition in Bucharest, Romania.

By Horatiu Dan
Adobe Service Runtime: Keep Calm and Shift Down!

Microservices at Adobe
Adobe’s transformation from desktop applications to cloud offerings triggered an explosion of microservices. Be it Acrobat, Photoshop, or Adobe Experience Cloud, they are all powered by suites of microservices mainly written in Java. With so many microservices created, every developer had to go through the same painful processes, i.e., security, compliance, scalability, resiliency, etc., to create a production-grade microservice. That was the genesis of Adobe Service Runtime.
What Is ASR?
ASR, or Adobe Service Runtime, is an implementation of the Microservice Chassis Pattern. More than 80% of Adobe’s microservices use ASR foundational libraries. It offers the cross-cutting concerns that every production-grade microservice is expected to have. Highlights of the cross-cutting concerns included in ASR libraries:
- Foundational libraries for Java and Python: These libraries offer log masking, request stitching, breadcrumb trail, exception handling, async invocation, resiliency, etc.
- À la carte libs: ASR connector libs/SDKs to talk to internal Adobe services.
- Blessed base containers: security-blessed containers that accelerate container adoption for applications in any language.
- Bootstrapping code and infrastructure templates for fast-tracking getting started.
- Opinionated build system, i.e., how to build a Java application, run tests, launch debug setups, and package into containers.
- Secure defaults for configurables to ensure teams get started with baselines that have been tested to work.
Having cross-cutting concerns as a single chassis helps organizations produce production-grade microservices at scale, just like an automobile manufacturer’s production line.
Why ASR?
Large organizations often have heavy compliance, security, resiliency, and scalability requirements. ASR provides a collection of foundational libraries, components, tools, and best practices (12 factor). This enables rapid development of four-9s-capable, innovative, and secure services. It also enables a container-first deployment system.
Value Proposition
We did a study on internal teams to derive the value proposition of ASR.
- Velocity: Picking frameworks and libraries, getting them to work, setting up project structure and build system, and resolving dependency and build issues so you can start focusing on business logic. With ASR: less than 1 hour. Without ASR: 1-2 weeks.
- Velocity: Implementing capabilities like log masking, request stitching, etc. With ASR: all capabilities are available out of the box. Without ASR: 4-6 weeks.
- Security: Legal and security reviews of core code and libraries (not including business logic). With ASR: 2-3 days. Without ASR: 3-6 weeks.
- Community: With ASR: a strong community that empowers decentralized decision-making on feature priorities for service teams, and a common framework that makes it easy to share code and developers between projects. Without ASR: diverse frameworks make it hard to share code across projects.
Using ASR saved developers time and improved the security posture by not reinventing the wheel.
Benchmarks
RPS
We did some benchmarking to see if ASR has any overhead over vanilla applications. For example, we ran a ten-minute Gatling script to simulate 500 users.
- Non-ASR: 21678.506 requests/second (RPS), p95 response time 46 (baseline).
- ASR: 23969.383 requests/second, 7% overhead; p95 response time 48, 4% overhead.
ASR Filters
Some cross-cutting concerns are offered as Filters, which can add some overhead. Our baseline comparison is the mean requests/sec of 20474.225. The sections below show the performance change with individual filters disabled.
- ASR logging filter: The cost of disabling this is that the ASR service won't log incoming requests and outgoing responses. Performance: mean requests/sec 21260.143, a 3.8% improvement.
- ASR exception filter: The cost of disabling this is that stack traces can escape in exceptions, an ASSET violation. Performance: mean requests/sec 20988.308, a 2.5% improvement.
- ASR request ID filter: The cost of disabling this is that the ASR service won't have a unique request ID per request for tracking. Performance: mean requests/sec 21354.042, a 4.3% improvement.
- ASR request response filter: The cost of disabling this is that the ASR service won't automatically validate the Authorization header in the incoming request (if com.adobe.asr.enable-authorization-header-validation is set to true). Performance: mean requests/sec 20896.923, a 2% improvement.
The benchmarks reveal that using ASR adds minimal overhead compared to the functionality it offers.
Security
CVE scans often uncover millions of vulnerabilities across codebases in large organizations. If Adobe developers had to manually fix each one, they would spend most of their time on patching rather than building features. By providing secure defaults and hardened components, ASR serves as a foundational library that reduces vulnerability exposure and saves developers valuable time.
CVEs
The Log4j incident is a testament to the success of ASR. When the CVE was published, users of ASR only had to upgrade to a single new version of ASR, while non-ASR repos had to scramble to migrate their libs off of Log4j. This clearly demonstrated the significant multiplier impact ASR has created within the company.
Sensitive Data in Logs
Log masking is another popular feature that is often recreated across the orgs. ASR comes with a modular log masking library that masks sensitive information. Logs that contain credit card numbers, SSNs, or any Adobe-defined sensitive info are automatically masked by default. Developers can also extend it to customize masking for additional use cases. This ensures consistent protection of PII across all applications.
ASR Connectors and Resiliency
ASR has connectors, which can be used to consume APIs exposed by other services inside Adobe. ASR connectors are application-environment aware, i.e., a connector will automatically pick the right root URL of the service based on the app environment. For example, if the application is running in the stage environment, the identity connector will use the identity stage URL; when the application is running in the prod environment, the identity connector will use the prod URL. This is possible due to the AutoConfiguration that ASR provides for all the connectors. One of the challenges with microservices is that different services honor different SLAs. Your service might have a higher standard, and you must often tolerate other services. By using ASR connectors, microservices get fault-tolerant communication out of the box. ASR connectors leverage Resilience4j to achieve this. Every connector comes with resiliency features like bulkhead thread pools, circuit breakers, retries, exponential backoff, etc. By using ASR connectors, the posture of a microservice is greatly enhanced. There are guardrails in the thread pool that ensure there is no avalanche of threads. By using retries by default, the stress on Adobe's network is greatly reduced when the availability of the dependent service is degraded. This is a classic example of how pushing cross-cutting concerns down to a common layer unlocks a lot of value and reduces redundancies.
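The article does not show ASR's connector internals, but the pattern it describes, fault-tolerant calls decorated with Resilience4j, can be sketched with Resilience4j's public API directly. The connector name and the stubbed remote call below are illustrative assumptions, not ASR code.
Java
// Minimal sketch of a Resilience4j-decorated remote call, the general pattern
// the article attributes to ASR connectors. Not ASR's actual API.
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.retry.Retry;

import java.util.function.Supplier;

public class ResilientConnectorSketch {

    public static void main(String[] args) {
        // Default policies; a real connector would tune wait times, thresholds, and bulkheads.
        CircuitBreaker circuitBreaker = CircuitBreaker.ofDefaults("identityConnector");
        Retry retry = Retry.ofDefaults("identityConnector");

        // Stand-in for the actual HTTP call to the downstream service.
        Supplier<String> remoteCall = () -> "identity-response";

        // Retry wraps the circuit breaker, so every attempt is recorded by the breaker.
        Supplier<String> resilientCall =
                Retry.decorateSupplier(retry, CircuitBreaker.decorateSupplier(circuitBreaker, remoteCall));

        System.out.println(resilientCall.get());
    }
}
Centralizing this decoration inside a connector is what lets every consuming service inherit retries, backoff, and circuit breaking without writing any of it.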
ASR Adoption at Adobe

Almost every Java service at Adobe uses at least one of ASR's libraries. The full ASR suite is used by about 80% of them, roughly 7,000+ services, and adoption continues to grow. With the growing need to make products more agentic, we see a strong need for libraries that support such use cases. ASR can be a powerful multiplier in enabling harm and bias guardrails, which are highly relevant to both the company and the industry today.

Keep Calm and Shift Down!

Inspired by shift left, shift down is a paradigm in platform engineering: cross-cutting concerns are managed and provided by the platform out of the box, so platform users can focus on their own functionality without having to worry about meeting the baseline standards set by Adobe. ASR enables the shift-down philosophy at scale. Security teams and executives keep calm thanks to centralized best practices and the at-scale adoption of ASR, and developers are at ease because the overhead is handled at the foundational layer. Every company interested in developer productivity and operational excellence should adopt a shift-down strategy like ASR. Over the years, the ROI keeps compounding and helps teams move fast on paved roads that power their developer journeys.

By Anirudh Mathad
