Data Pipelines
Enter the modern data stack: a technology stack designed and equipped with cutting-edge tools and services to ingest, store, and process data. No longer are we using data only to drive business decisions; we are entering a new era where cloud-based systems and tools are at the heart of data processing and analytics. Data-centric tools and techniques — like warehouses and lakes, ETL/ELT, observability, and real-time analytics — are democratizing the data we collect. The proliferation of, and growing emphasis on, data democratization opens up new and more nuanced ways in which data platforms can be used and, by extension, empowers users to make data-driven decisions with confidence. In our 2023 Data Pipelines Trend Report, we further explore these shifts and improved capabilities, featuring findings from DZone-original research and expert articles written by practitioners from the DZone Community. Our contributors cover hand-picked topics like data-driven design and architecture, data observability, and data integration models and techniques.
GitHub Actions has a large ecosystem of high-quality third-party actions, plus built-in support for executing build steps inside Docker containers. This means it's easy to run end-to-end tests as part of a workflow, often only requiring a single step to run testing tools with all the required dependencies. In this post, I show you how to run browser tests with Cypress and API tests with Postman as part of a GitHub Actions workflow. Getting Started GitHub Actions is a hosted service, so all you need to get started is a GitHub account. All other dependencies, like Software Development Kits (SDKs) or testing tools, are provided by the Docker images or GitHub Actions published by testing platforms. Running Browser Tests With Cypress Cypress is a browser automation tool that lets you interact with web pages in much the same way an end user would, for example by clicking on buttons and links, filling in forms, and scrolling the page. You can also verify the content of a page to ensure the correct results are displayed. The Cypress documentation provides an example first test which has been saved to the junit-cypress-test GitHub repo. The test is shown below: describe('My First Test', () => { it('Does not do much!', () => { expect(true).to.equal(true) }) }) This test is configured to generate a JUnit report file in the cypress.json file: { "reporter": "junit", "reporterOptions": { "mochaFile": "cypress/results/results.xml", "toConsole": true } } The workflow file below executes this test with the Cypress GitHub Action, saves the generated video file as an artifact, and processes the test results. You can find an example of this workflow in the junit-cypress-test repository: name: Cypress on: push: workflow_dispatch: jobs: build: runs-on: ubuntu-latest steps: - name: Checkout uses: actions/checkout@v1 - name: Cypress run uses: cypress-io/github-action@v2 - name: Save video uses: actions/upload-artifact@v2 with: name: sample_spec.js.mp4 path: cypress/videos/sample_spec.js.mp4 - name: Report uses: dorny/test-reporter@v1 if: always() with: name: Cypress Tests path: cypress/results/results.xml reporter: java-junit fail-on-error: true The official Cypress GitHub action is called to execute tests with the default options: - name: Cypress run uses: cypress-io/github-action@v2 Cypress generates a video file capturing the browser as the tests are run. You save the video file as an artifact to be downloaded and viewed after the workflow completes: - name: Save video uses: actions/upload-artifact@v2 with: name: sample_spec.js.mp4 path: cypress/videos/sample_spec.js.mp4 The test results are processed by the dorny/test-reporter action. Note that test reporter has the ability to process Mocha JSON files, and Cypress uses Mocha for reporting, so an arguably more idiomatic solution would be to have Cypress generate Mocha JSON reports. Unfortunately, there is a bug in Cypress that prevents the JSON reporter from saving results as a file. Generating JUnit report files is a useful workaround until this issue is resolved: - name: Report uses: dorny/test-reporter@v1 if: always() with: name: Cypress Tests path: cypress/results/results.xml reporter: java-junit fail-on-error: true Here are the results of the test: The video file artifact is listed in the Summary page: Not all testing platforms provide a GitHub action, in which case you can execute steps against a standard Docker image. This is demonstrated in the next section. 
Running API Tests With Newman Unlike Cypress, Postman does not provide an official GitHub action. However, you can use the postman/newman Docker image directly inside a workflow. You can find an example of the workflow in the junit-newman-test repository: name: Newman on: push: workflow_dispatch: jobs: build: runs-on: ubuntu-latest steps: - name: Checkout uses: actions/checkout@v1 - name: Run Newman uses: docker://postman/newman:latest with: args: run GitHubTree.json --reporters cli,junit --reporter-junit-export results.xml - name: Report uses: dorny/test-reporter@v1 if: always() with: name: API Tests path: results.xml reporter: java-junit fail-on-error: true The uses property for a step can either be the name of a published action, or it can reference a Docker image directly. In this example, you run the postman/newman Docker image, with the with.args parameter defining the command-line arguments: - name: Run Newman uses: docker://postman/newman:latest with: args: run GitHubTree.json --reporters cli,junit --reporter-junit-export results.xml The resulting JUnit report file is then processed by the dorny/test-reporter action: - name: Report uses: dorny/test-reporter@v1 if: always() with: name: API Tests path: results.xml reporter: java-junit fail-on-error: true Here are the results of the test: Behind the scenes, GitHub Actions executes the supplied Docker image with a number of standard environment variables relating to the workflow and with volume mounts that allow the Docker container to persist changes (like the report file) on the main file system. The following is an example of the command to execute a step in a Docker image: /usr/bin/docker run --name postmannewmanlatest_fefcec --label f88420 --workdir /github/workspace --rm -e INPUT_ARGS -e HOME -e GITHUB_JOB -e GITHUB_REF -e GITHUB_SHA -e GITHUB_REPOSITORY -e GITHUB_REPOSITORY_OWNER -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RETENTION_DAYS -e GITHUB_RUN_ATTEMPT -e GITHUB_ACTOR -e GITHUB_WORKFLOW -e GITHUB_HEAD_REF -e GITHUB_BASE_REF -e GITHUB_EVENT_NAME -e GITHUB_SERVER_URL -e GITHUB_API_URL -e GITHUB_GRAPHQL_URL -e GITHUB_WORKSPACE -e GITHUB_ACTION -e GITHUB_EVENT_PATH -e GITHUB_ACTION_REPOSITORY -e GITHUB_ACTION_REF -e GITHUB_PATH -e GITHUB_ENV -e RUNNER_OS -e RUNNER_NAME -e RUNNER_TOOL_CACHE -e RUNNER_TEMP -e RUNNER_WORKSPACE -e ACTIONS_RUNTIME_URL -e ACTIONS_RUNTIME_TOKEN -e ACTIONS_CACHE_URL -e GITHUB_ACTIONS=true -e CI=true -v "/var/run/docker.sock":"/var/run/docker.sock" -v "/home/runner/work/_temp/_github_home":"/github/home" -v "/home/runner/work/_temp/_github_workflow":"/github/workflow" -v "/home/runner/work/_temp/_runner_file_commands":"/github/file_commands" -v "/home/runner/work/junit-newman-test/junit-newman-test":"/github/workspace" postman/newman:latest run GitHubTree.json --reporters cli,junit --reporter-junit-export results.xml This is a complex command, but there are a few arguments that we're interested in. The -e arguments define environment variables for the container. You can see that dozens of workflow environment variables are exposed. The --workdir /github/workspace argument overrides the working directory of the Docker container, while the -v "/home/runner/work/junit-newman-test/junit-newman-test":"/github/workspace" argument mounts the workflow workspace to the /github/workspace directory inside the container.
This has the effect of mounting the working directory inside the Docker container, which exposes the checked-out files and allows any newly created files to persist once the container is shut down. Because most major testing tools provide a supported Docker image, the process you used to run Newman can be used to run most other testing platforms. Conclusion GitHub Actions has enjoyed widespread adoption among developers, and many platforms provide supported actions for use in workflows. For those cases where there is no suitable action available, GitHub Actions provides an easy way to execute a standard Docker image as part of a workflow. In this post, you learned how to run the Cypress action to execute browser-based tests and how to run the Newman Docker image to execute API tests. Happy deployments!
Seasoned software engineers long for the good old days when web development was simple. You just needed a few files and a server to get up and running. No complicated infrastructure, no endless amount of frameworks and libraries, and no build tools. Just some ideas and some code hacked together to make an app come to life. Whether or not this romanticized past was actually as great as we think it was, developers today agree that software engineering has gotten complicated. There are too many choices with too much setup involved. In response to this sentiment, many products are providing off-the-shelf starter kits and zero-config toolchains to try to abstract away the complexity of software development. One such startup is Zipper, a company that offers an online IDE where you can create applets that run as serverless TypeScript functions in the cloud. With Zipper, you don’t have to spend time worrying about your toolchain — you can just start writing code and deploy your app within minutes. Today, we’ll be looking at a ping pong ranking app I built — once in 2018 with jQuery, MongoDB, Node.js, and Express; and once in 2023 with Zipper. We’ll examine the development process for each and see just how easy it is to build a powerful app using Zipper. Backstory First, a little context: I love to play ping pong. Every office in which I’ve worked has had a ping pong table, and for many years ping pong was an integral part of my afternoon routine. It’s a great game to relax, blow off some steam, strengthen friendships with coworkers, and reset your brain for a half hour. Those who played ping pong every day began to get a feel for who was good and who wasn’t. People would talk. A handful of people were known as the best in the office, and it was always a challenge to take them on. Being both highly competitive and a software engineer, I wanted to build an app to track who was the best ping pong player in the office. This wouldn’t be for bracket-style tournaments, but just for recording the games that were played every day by anybody. With that, we’d have a record of all the games played, and we’d be able to see who was truly the best. This was 2018, and I had a background in the MEAN/MERN stack (MongoDB, Express, Angular, React, and Node.js) and experience with jQuery before that. After dedicating a week’s worth of lunch breaks and nights to this project, I had a working ping-pong ranking app. I didn’t keep close track of my time spent working on the app, but I’d estimate it took about 10–20 hours to build. Here’s what that version of the app looked like. There was a login and signup page: Office Competition Ranking System — Home page The login page asked for your username and password to authenticate: Office Competition Ranking System — Login page Once authenticated, you could record your match by selecting your opponent and who won: Office Competition Ranking System — Record game results page You could view the leaderboard to see the current office rankings. I even included an Elo rating algorithm like they use in chess: Office Competition Ranking System — Leaderboard page Finally, you could click on any of the players to see their specific game history of wins and losses: Office Competition Ranking System — Player history page That was the app I created back in 2018 with jQuery, MongoDB, Node.js, and Express. And, I hosted it on an AWS EC2 server. Now let’s look at my experience recreating this app in 2023 using only Zipper. About Zipper Zipper is an online tool for creating applets. 
It uses TypeScript and Deno, so JavaScript and TypeScript users will feel right at home. You can use Zipper to build web services, web UIs, scheduled jobs, and even Slack or GitHub integrations. Zipper even includes auth. In short, what I find most appealing about Zipper is how quickly you can take an idea from conception to execution. It’s perfect for side projects or internal-facing apps to quickly improve a business process. Demo App Here’s the ping-pong ranking app I built with Zipper in just three hours. And that includes time reading through the docs and getting up to speed with an unfamiliar platform! First, the app requires authentication. In this case, I’m requiring users to sign in to their Zipper account: Ping pong ranking app — Authentication page Once authenticated, users can record a new ping-pong match: Ping pong ranking app — Record a new match page They can view the leaderboard: Ping pong ranking app — Leaderboard page And they can view the game history for any individual player: Ping pong ranking app — Player history page Not bad! The best part is that I didn’t have to create any of the UI components for this app. All the inputs and table outputs were handled automatically. And, the auth was created for me just by checking a box in the app settings! You can find the working app hosted publicly on Zipper. Ok, now let’s look at how I built this. Creating a New Zipper App First, I created my Zipper account by authenticating with GitHub. Then, on the main dashboard page, I clicked the Create Applet button to create my first applet. Create your first applet Next, I gave my applet a name, which became its URL. I also chose to make my code public and required users to sign in before they could run the applet. Applet configuration Then I chose to generate my app using AI, mostly because I was curious how it would turn out! This was the prompt I gave it: “I’d like to create a leaderboard ranking app for recording wins and losses in ping pong matches. Users should be able to log into the app. Then they should be able to record a match showing who the two players were and who won and who lost. Users should be able to see the leaderboard for all the players, sorted with the best players displayed at the top and the worst players displayed at the bottom. Users should also be able to view a single player to see all of their recorded matches and who they played and who won and who lost.” Applet initialization I might need to get better at prompt engineering because the output didn’t include all the features or pages I wanted. The AI-generated code included two files: a generic “hello world” main.ts file, and a view-player.ts file for viewing the match history of an individual player. main.ts file generated by AI view-player.ts file generated by AI So, the app wasn’t perfect from the get-go, but it was enough to get started. Writing the Ping Pong App Code I knew that Zipper would handle the authentication page for me, so that left three pages to write: A page to record a ping-pong match A page to view the leaderboard A page to view an individual player’s game history Record a New Ping Pong Match I started with the form to record a new ping-pong match. Below is the full main.ts file. We’ll break it down line by line right after this. 
TypeScript type Inputs = { playerOneID: string; playerTwoID: string; winnerID: string; }; export async function handler(inputs: Inputs) { const { playerOneID, playerTwoID, winnerID } = inputs; if (!playerOneID || !playerTwoID || !winnerID) { return "Error: Please fill out all input fields."; } if (playerOneID === playerTwoID) { return "Error: PlayerOne and PlayerTwo must have different IDs."; } if (winnerID !== playerOneID && winnerID !== playerTwoID) { return "Error: Winner ID must match either PlayerOne ID or PlayerTwo ID"; } const matchID = Date.now().toString(); const matchInfo = { matchID, winnerID, loserID: winnerID === playerOneID ? playerTwoID : playerOneID, }; try { await Zipper.storage.set(matchID, matchInfo); return `Thanks for recording your match. Player ${winnerID} is the winner!`; } catch (e) { return `Error: Information was not written to the database. Please try again later.`; } } export const config: Zipper.HandlerConfig = { description: { title: "Record New Ping Pong Match", subtitle: "Enter who played and who won", }, }; Each file in Zipper exports a handler function that accepts inputs as a parameter. Each of the inputs becomes a form in UI, with the input type being determined by the TypeScript type that you give it. After doing some input validation to ensure that the form was correctly filled out, I stored the match info in Zipper’s key-value storage. Each Zipper applet gets its own storage instance that any of the files in your applet can access. Because it’s a key-value storage, objects work nicely for values since they can be serialized and deserialized as JSON, all of which Zipper handles for you when reading from and writing to the database. At the bottom of the file, I’ve added a HandlerConfig to add some title and instruction text to the top of the page in the UI. With that, the first page is done. Ping pong ranking app — Record a new match page Leaderboard Next up is the leaderboard page. I’ve reproduced the leaderboard.ts file below in full: TypeScript type PlayerRecord = { playerID: string; losses: number; wins: number; }; type PlayerRecords = { [key: string]: PlayerRecord; }; type Match = { matchID: string; winnerID: string; loserID: string; }; type Matches = { [key: string]: Match; }; export async function handler() { const allMatches: Matches = await Zipper.storage.getAll(); const matchesArray: Match[] = Object.values(allMatches); const players: PlayerRecords = {}; matchesArray.forEach((match: Match) => { const { loserID, winnerID } = match; if (players[loserID]) { players[loserID].losses++; } else { players[loserID] = { playerID: loserID, losses: 0, wins: 0, }; } if (players[winnerID]) { players[winnerID].wins++; } else { players[winnerID] = { playerID: winnerID, losses: 0, wins: 0, }; } }); return Object.values(players); } export const config: Zipper.HandlerConfig = { run: true, description: { title: "Leaderboard", subtitle: "See player rankings for all recorded matches", }, }; This file contains a lot more TypeScript types than the first file did. I wanted to make sure my data structures were nice and explicit here. After that, you see our familiar handler function, but this time without any inputs. That’s because the leaderboard page doesn’t need any inputs; it just displays the leaderboard. We get all of our recorded matches from the database, and then we manipulate the data to get it into an array format of our liking. Then, simply by returning the array, Zipper creates the table UI for us, even including search functionality and column sorting. 
No UI work is needed! Finally, at the bottom of the file, you’ll see a description setup that’s similar to the one on our main page. You’ll also see the run: true property, which tells Zipper to run the handler function right away without waiting for the user to click the Run button in the UI. Ping pong ranking app — Leaderboard page Player History Alright, two down, one to go. Let’s look at the code for the view-player.ts file, which I ended up renaming to player-history.ts: TypeScript type Inputs = { playerID: string; }; type Match = { matchID: string; winnerID: string; loserID: string; }; type Matches = { [key: string]: Match; }; type FormattedMatch = { matchID: string; opponent: string; result: "Won" | "Lost"; }; export async function handler({ playerID }: Inputs) { const allMatches: Matches = await Zipper.storage.getAll(); const matchesArray: Match[] = Object.values(allMatches); const playerMatches = matchesArray.filter((match: Match) => { return playerID === match.winnerID || playerID === match.loserID; }); const formattedPlayerMatches = playerMatches.map((match: Match) => { const formattedMatch: FormattedMatch = { matchID: match.matchID, opponent: playerID === match.winnerID ? match.loserID : match.winnerID, result: playerID === match.winnerID ? "Won" : "Lost", }; return formattedMatch; }); return formattedPlayerMatches; } export const config: Zipper.HandlerConfig = { description: { title: "Player History", subtitle: "See match history for the selected player", }, }; The code for this page looks a lot like the code for the leaderboard page. We include some types for our data structures at the top. Next, we have our handler function which accepts an input for the player ID that we want to view. From there, we fetch all the recorded matches and filter them for only matches in which this player participated. After that, we manipulate the data to get it into an acceptable format to display, and we return that to the UI to get another nice auto-generated table. Ping pong ranking app — Player history page Conclusion That’s it! With just three handler functions, we’ve created a working app for tracking our ping-pong game history. This app does have some shortcomings that we could improve, but we’ll leave that as an exercise for the reader. For example, it would be nice to have a dropdown of users to choose from when recording a new match, rather than entering each player’s ID as text. Maybe we could store each player’s ID in the database and then display those in the UI as a dropdown input type. Or, maybe we’d like to turn this into a Slack integration to allow users to record their matches directly in Slack. That’s an option too! While my ping pong app isn’t perfect, I hope the takeaway here is how easy it is to get up and running with a product like Zipper. You don’t have to spend time agonizing over your app’s infrastructure when you have a simple idea that you just want to see working in production. Just get out there, start building, and deploy!
Developers craft software that both delights consumers and delivers innovative applications for enterprise users. This craft requires more than just churning out heaps of code; it embodies a process of observing, noticing, interviewing, brainstorming, reading, writing, and rewriting specifications; designing, prototyping, and coding to the specifications; reviewing, refactoring, and verifying the software; and a virtuous cycle of deploying, debugging, and improving. At every stage of this cycle, developers consume and generate two things: code and text. Code is text, after all. Developer productivity is limited by real-world realities: challenging timelines, unclear requirements, legacy codebases, and more. To overcome these obstacles and still meet their deadlines, developers have long relied on adding new tools to their toolbox: for example, code generation tools such as compilers, UI generators, ORM mappers, and API generators. Developers have embraced these tools without reservation, progressively evolving them to offer more intelligent functionality. Modern compilers do more than just translate; they rewrite and optimize the code automatically. SQL, developed fifty years ago as a declarative language with a set of composable English templates, continues to evolve and improve the data access experience and developer productivity. Developers have access to an endless array of tools to expand their toolbox. The Emergence of GenAI GenAI is a new, powerful tool for the developer toolbox. GenAI, short for Generative AI, is a subset of AI capable of taking prompts and then autonomously creating many forms of content — text, code, images, videos, music, and more — that imitate and often mirror the quality of human craftsmanship. Prompts are instructions in the form of expository writing. Better prompts produce better text and code. The seismic surge surrounding GenAI, supported by technologies such as ChatGPT and Copilot, positions 2023 to be heralded as the "Year of GenAI." GenAI's text generation capability is expected to revolutionize every aspect of developer experience and productivity. Impact on Developers Someone recently noted, "In 2023, natural language has emerged as the fastest programming language." While the previous generation of tools focused on incremental improvements to productivity for writing code and improving code quality, GenAI tools promise to revolutionize these and every other aspect of developer work. ChatGPT can summarize a long requirement specification, give you the delta between two versions of it, or help you come up with a checklist for a specific task. For coding, the impact is dramatic. Because these models have been trained on vast swaths of the internet, with billions of parameters and trillions of tokens, they have seen a lot of code. With a well-written prompt, you can have them write a large piece of code, design APIs, and refactor code. And in just one sentence, you can ask ChatGPT to rewrite everything in a brand-new language. All these possibilities were simply science fiction just a few years ago. GenAI makes mundane tasks disappear, hard tasks easier, and difficult tasks possible. Developers are relying more on ChatGPT to explain new concepts and clarify confusing ideas. Apparently, this trend has reduced traffic to Stack Overflow, a popular Q&A site for developers, by anywhere between 16% and 50%, depending on the measure! Developers choose the winning tool. But there's a catch. More than one, in fact.
The GenAI tools of the current generation, although promising, are unaware of your goals and objectives. These tools, developed through training on a vast array of samples, operate by predicting the succeeding token, one at a time, rooted firmly in the patterns they have previously encountered. Their answer is guided and constrained by the prompt. To harness their potential effectively, it becomes imperative to craft detailed, expository-style prompts. This nudges the technology to produce output that is closer to the intended goal, albeit with a style and creativity that is bounded by their training data. They excel at replicating styles they have been exposed to but fall short at inventing unprecedented ones. Multiple companies and groups are busy training LLMs for specific tasks to improve their content generation. I recommend heeding the advice of Satya Nadella, Microsoft's CEO, who suggests it is prudent to treat the content generated by GenAI as a draft, requiring thorough review to ensure its clarity and accuracy. The onus falls on the developer to delineate between routine tasks and those demanding creativity — a discernment that remains beyond GenAI's reach, at least for now. Despite these caveats, there is justifiable evidence that GenAI promises improved developer experience and productivity. OpenAI's ChatGPT raced to 100 million users in record time. Your favorite IDEs have plugins to exploit it. Microsoft has promised to use GenAI in all its products, including its revitalized search offering, bing.com. Google has answered with its own suite of services and products; Facebook and others have released multiple models to help developers progress. It's a great time to be a developer. The revolution has begun promptly. At Couchbase, we've introduced generative AI capabilities into our database as a service, Couchbase Capella, to significantly enhance developer productivity and accelerate time to market for modern applications. The new capability, called Capella iQ, enables developers to write SQL++ and application-level code more quickly by delivering recommended sample code.
This is an article from DZone's 2023 Database Systems Trend Report.For more: Read the Report Hearing the vague statement, "We have a problem with the database," is a nightmare for any database manager or administrator. Sometimes it's true, sometimes it's not, and what exactly is the issue? Is there really a database problem? Or is it a problem with networking, an application, a user, or another possible scenario? If it is a database, what is wrong with it? Figure 1: DBMS usage Databases are a crucial part of modern businesses, and there are a variety of vendors and types to consider. Databases can be hosted in a data center, in the cloud, or in both for hybrid deployments. The data stored in a database can be used in various ways, including websites, applications, analytical platforms, etc. As a database administrator or manager, you want to be aware of the health and trends of your databases. Database monitoring is as crucial as databases themselves. How good is your data if you can't guarantee its availability and accuracy? Database Monitoring Considerations Database engines and databases are systems hosted on a complex IT infrastructure that consists of a variety of components: servers, networking, storage, cables, etc. Database monitoring should be approached holistically with consideration of all infrastructure components and database monitoring itself. Figure 2: Database monitoring clover Let's talk more about database monitoring. As seen in Figure 2, I'd combine monitoring into four pillars: availability, performance, activity, and compliance. These are broad but interconnected pillars with overlap. You can add a fifth "clover leaf" for security monitoring, but I include that aspect of monitoring into activity and compliance, for the same reason capacity planning falls into availability monitoring. Let's look deeper into monitoring concepts. While availability monitoring seems like a good starting topic, I will deliberately start with performance since performance issues may render a database unavailable and because availability monitoring is "monitoring 101" for any system. Performance Monitoring Performance monitoring is the process of capturing, analyzing, and alerting to performance metrics of hardware, OS, network, and database layers. It can help avoid unplanned downtimes, improve user experience, and help administrators manage their environments efficiently. Native Database Monitoring Most, if not all, enterprise-grade database systems come with a set of tools that allow database professionals to examine internal and/or external database conditions and the operational status. These are system-specific, technical tools that require SME knowledge. In most cases, they are point-in-time performance data with limited or non-existent historical value. Some vendors provide additional tools to simplify performance data collection and analysis. With an expansion of cloud-based offerings (PaaS or IaaS), I've noticed some improvements in monitoring data collection and the available analytics and reporting options. However, native performance monitoring is still a set of tools for a database SME. Enterprise Monitoring Systems Enterprise monitoring systems (EMSs) offer a centralized approach to keeping IT systems under systematic review. Such systems allow monitoring of most IT infrastructure components, thus consolidating supervised systems with a set of dashboards. There are several vendors offering comprehensive database monitoring systems to cover some or all your monitoring needs. 
Such solutions can cover multiple database engines or be specific to a particular database engine or a monitoring aspect. For instance, if you only need to monitor SQL servers and are interested in the performance of your queries, then you need a monitoring system that identifies bottlenecks and contentions. Let's discuss environments with thousands of database instances (on-premises and in a cloud) scattered across multiple data centers across the globe. This involves monitoring complexity growth with a number of monitored devices, database type diversity, and geographical locations of your data centers and actual data that you monitor. It is imperative to have a global view of all database systems under one management and an ability to identify issues, preferably before they impact your users. EMSs are designed to help organizations align database monitoring with IT infrastructure monitoring, and most solutions include an out-of-the-box set of dashboards, reports, graphs, alerts, useful tips, and health history and trends analytics. They also have pre-set industry-outlined thresholds for performance counters/metrics that should be adjusted to your specific conditions. Manageability and Administrative Overhead Native database monitoring is usually handled by a database administrator (DBA) team. If it needs to be automated, expanded, or have any other modifications, then DBA/development teams would handle that. This can be efficiently managed by DBAs in a large enterprise environment on a rudimental level for internal DBA specific use cases. Bringing in a third-party system (like an EMS) requires management. Hypothetically, a vendor has installed and configured monitoring for your company. That partnership can continue, or internal personnel can take over EMS management (with appropriate training). There is no "wrong" approach — it solely depends on your company's operating model and is assessed accordingly. Data Access and Audit Compliance Monitoring Your databases must be secure! Unauthorized access to sensitive data could be as harmful as data loss. Data breaches, malicious activities (intentional or not) — no company would be happy with such publicity. That brings us to audit compliance and data access monitoring. There are many laws and regulations around data compliance. Some are common between industries, some are industry-specific, and some are country-specific. For instance, SOX compliance is required for all public companies in numerous countries, and US healthcare must follow HIPAA regulations. Database management teams must implement a set of policies, procedures, and processes to enforce laws and regulations applicable to their company. Audit reporting could be a tedious and cumbersome process, but it can and should be automated. While implementing audit compliance and data access monitoring, you can improve your database audit reporting, as well — it's virtually the same data set. What do we need to monitor to comply with various laws and regulations? These are normally mandatory: Access changes and access attempts Settings and/or objects modifications Data modifications/access Database backups Who should be monitored? 
Usually, access to make changes to a database or data is strictly controlled: Privileged accounts – usually DBAs; ideally, they shouldn't be able to access data, but that is not always possible in their job so activity must be monitored Service accounts – either database or application service accounts with rights to modify objects or data "Power" accounts – users with rights to modify database objects or data "Lower" accounts – accounts with read-only activity As with performance monitoring, most database engines provide a set of auditing tools and mechanisms. Another option is third-party compliance software, which uses database-native auditing, logs, and tracing to capture compliance-related data. It provides audit data storage capabilities and, most importantly, a set of compliance reports and dashboards to adhere to a variety of compliance policies. Compliance complexity directly depends on regulations that apply to your company and the diversity and size of your database ecosystem. While we monitor access and compliance, we want to ensure that our data is not being misused. An adequate measure should be in place for when unauthorized access or abnormal data usage is detected. Some audit compliance monitoring systems provide means to block abnormal activities. Data Corruption and Threats Database data corruption is a serious issue that could lead to a permanent loss of valuable data. Commonly, data corruption occurs due to hardware failures, but it could be due to database bugs or even bad coding. Modern database engines have built-in capabilities to detect and sometimes prevent data corruption. Data corruption will generate an appropriate error code that should be monitored and highlighted. Checking database integrity should be a part of the periodical maintenance process. Other threats include intentional or unintentional data modification and ransomware. While data corruption and malicious data modification can be detected by DBAs, ransomware threats fall outside of the monitoring scope for database professionals. It is imperative to have a bulletproof backup to recover from those threats. Key Database Performance Metrics Database performance metrics are extremely important data points that measure the health of database systems and help database professionals maintain efficient support. Some of the metrics are specific to a database type or vendor, and I will generalize them as "internal counters." Availability The first step in monitoring is to determine if a device or resource is available. There is a thin line between system and database availability. A database could be up and running, but clients may not be able to access it. With that said, we need to monitor the following metrics: Network status – Can you reach the database over the network? If yes, what is the latency? While network status may not commonly fall into the direct responsibility of a DBA, database components have configuration parameters that might be responsible for a loss of connectivity. Server up/down Storage availability Service up/down – another shared area between database and OS support teams Whether the database is online or offline CPU, Memory, Storage, and Database Internal Metrics The next important set of server components which could, in essence, escalate into an availability issue are CPU, memory, and storage. 
The following four performance areas are tightly interconnected and affect each other: Lack of available memory High CPU utilization Storage latency or throughput bottleneck Set of database internal counters which could provide more context to utilization issues For instance, lack of memory may force a database engine to read and write data more frequently, creating contention on the IO system. 100% CPU utilization could often cause an entire database server to stop responding. Numerous database internal counters can help database professionals analyze use trends and identify an appropriate action to mitigate potential impact. Observability Database observability is based on metrics, traces, and logs — what we supposedly collected based on the discussion above. There are a plethora of factors that may affect system and application availability and customer experience. Database performance metrics are just a single set of possible failure points. Supporting the infrastructure underneath a database engine is complex. To successfully monitor a database, we need to have a clear picture of the entire ecosystem and the state of its components while monitoring. Relevant performance data collected from various components can be a tremendous help in identifying and addressing issues before they occur. The entire database monitoring concept is data driven, and it is our responsibility to make it work for us. Monitoring data needs to tell us a story that every consumer can understand. With database observability, this story can be transparent and provide a clear view of your database estate. Balanced Monitoring As you could gather from this article, there are many points of failure in any database environment. While database monitoring is the responsibility of database professionals, it is a collaborative effort of multiple teams to ensure that your entire IT ecosystem is operational. So what's considered "too much" monitoring and when is it not enough? I will use DBAs' favorite phrase: it depends. Assess your environment – It would be helpful to have a configuration management database. If you don't, create a full inventory of your databases and corresponding applications: database sizes, number of users, maintenance schedules, utilization times — as many details as possible. Assess your critical systems – Outline your critical systems and relevant databases. Most likely those will fall into a category of maximum monitoring: availability, performance, activity, and compliance. Assess your budget – It's not uncommon to have a tight cash flow allocated to IT operations. You may or may not have funds to purchase a "we-monitor-everything" system, and certain monitoring aspects would have to be developed internally. Find a middle ground – Your approach to database monitoring is unique to your company's requirements. Collecting monitoring data that has no practical or actionable applications is not efficient. Defining actionable KPIs for your database monitoring is a key to finding a balance — monitor what your team can use to ensure systems availability, stability, and satisfied customers. Remember: Successful database monitoring is data-driven, proactive, continuous, actionable, and collaborative.
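To make the availability metrics discussed in this article concrete, here is a minimal JDBC-based sketch of a connectivity and latency probe, the kind of check an availability monitor would run on a schedule. It is illustrative only: the PostgreSQL JDBC URL, credentials, and the SELECT 1 probe query are assumptions (the matching JDBC driver would need to be on the classpath), and a real monitor would feed the result into alerting rather than print it.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

/** Minimal availability probe: can we connect, and how long does a trivial query round trip take? */
public class DbAvailabilityProbe {

    public static void main(String[] args) {
        // Hypothetical connection details -- replace with your own environment.
        String url = "jdbc:postgresql://db.example.com:5432/appdb";
        String user = "monitor";
        String password = System.getenv("DB_MONITOR_PASSWORD");

        long start = System.nanoTime();
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement()) {
            stmt.execute("SELECT 1"); // trivial query to measure a full round trip
            long latencyMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("Database reachable, round-trip latency: " + latencyMs + " ms");
        } catch (Exception e) {
            // In a real monitor, this branch would raise an availability alert.
            System.err.println("Database unreachable: " + e.getMessage());
        }
    }
}
```

A scheduled probe like this covers only the "is it online and reachable, and how slowly" slice of availability; the CPU, memory, storage, and internal counters above still need their own collectors.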
In this blog, you will take a closer look at Podman Desktop, a graphical tool for working with containers. Enjoy! Introduction Podman is a container engine, just as Docker is. Podman commands are executed by means of a CLI (command-line interface), but a GUI would come in handy. That is exactly the purpose of Podman Desktop! As stated on the Podman Desktop website: “Podman Desktop is an open source graphical tool enabling you to seamlessly work with containers and Kubernetes from your local environment.” In the next sections, you will execute most of the commands that were executed in the two previous posts. If you are new to Podman, it is strongly advised to read those two posts first before continuing: Is Podman a Drop-in Replacement for Docker? Podman Equivalent for Docker Compose Sources used in this blog can be found on GitHub. Prerequisites Prerequisites for this blog are: Basic Linux knowledge (Ubuntu 22.04 is used during this blog); Basic Podman knowledge (see the previous blog posts); Podman version 3.4.4 is used in this blog because that is the version available for Ubuntu, although the latest stable release is version 4.6.0 at the time of writing. Installation and Startup First of all, Podman Desktop needs to be installed, of course. Go to the downloads page. If you use the Download button, a flatpak file will be downloaded. Flatpak is a framework for distributing desktop applications across various Linux distributions. However, this requires you to install Flatpak. A tar.gz file is also available for download, so use this one. After downloading, extract the file to /opt: Shell $ sudo tar -xvf podman-desktop-1.2.1.tar.gz -C /opt/ In order to start Podman Desktop, you only need to double-click the podman-desktop file. The Get Started with Podman Desktop screen is shown. Click the Go to Podman Desktop button, which will open the Podman Desktop main screen. As you can see from the screenshot, Podman Desktop detects that Podman is running but also that Docker is running. This is already a nice feature because it means that you can use Podman Desktop for Podman as well as for Docker. At the bottom, a Docker Compatibility warning is shown, indicating that the Docker socket is not available and some Docker-specific tools will not function correctly. But this can be fixed, of course. In the left menu, you can find the following items from top to bottom: the dashboard, the containers, the pods, the images, and the volumes. Build an Image The container image you will try to build consists of a Spring Boot application. It is a basic application containing one REST endpoint, which returns a hello message. There is no need to build the application. You do need to download the jar-file and put it into a target directory at the root of the repository. The Dockerfile you will be using is located in the directory podman-desktop. Choose in the left menu the Images tab. Also note that in the screenshot, both Podman images and Docker images are shown. Click the Build an Image button and fill it in as follows: Containerfile path: select file podman-desktop/1-Dockerfile. Build context directory: This is automatically filled out for you with the podman-desktop directory. However, you need to change this to the root of the repository; otherwise, the jar-file is not part of the build context and cannot be found by Podman. Image Name: docker.io/mydeveloperplanet/mypodmanplanet:0.0.1-SNAPSHOT Container Engine: Podman Click the Build button.
This results in the following error: Shell Uploading the build context from <user directory>/mypodmanplanet...Can take a while... Error:(HTTP code 500) server error - potentially insufficient UIDs or GIDs available in user namespace (requested 262143:262143 for /var/tmp/libpod_builder2108531042/bError:Error: (HTTP code 500) server error - potentially insufficient UIDs or GIDs available in user namespace (requested 262143:262143 for /var/tmp/libpod_builder2108531042/build/.git): Check /etc/subuid and /etc/subgid: lchown /var/tmp/libpod_builder2108531042/build/.git: invalid argument This error sounds familiar because the error was also encountered in a previous blog. Let’s try to build the image via the command line: Shell $ podman build . --tag docker.io/mydeveloperplanet/mypodmanplanet:0.0.1-SNAPSHOT -f podman-desktop/1-Dockerfile The image is built without any problem. An issue has been raised for this problem. At the time of writing, building an image via Podman Desktop is not possible. Start a Container Let’s see whether you can start the container. Choose in the left menu the Containers tab and click the Create a Container button. A choice menu is shown. Choose an Existing image. The Images tab is shown. Click the Play button on the right for the mypodmanplanet image. A black screen is shown, and no container is started. Start the container via CLI: Shell $ podman run -p 8080:8080 --name mypodmanplanet -d docker.io/mydeveloperplanet/mypodmanplanet:0.0.1-SNAPSHOT The running container is now visible in Podman Desktop. Test the endpoint, and this functions properly. Shell $ curl http://localhost:8080/hello Hello Podman! Same conclusion as for building the image. At the time of writing, it is not possible to start a container via Podman Desktop. What is really interesting is the actions menu. You can view the container logs. The Inspect tab shows you the details of the container. The Kube tab shows you what the Kubernetes deployment yaml file will look like. The Terminal tab gives you access to a terminal inside the container. You can also stop, restart, and remove the container from Podman Desktop. Although starting the container did not work, Podman Desktop offers some interesting features that make it easier to work with containers. Volume Mount Remove the container from the previous section. You will create the container again, but this time with a volume mount to a specific application.properties file, which will ensure that the Spring Boot application runs on port 8082 inside the container. Execute the following command from the root of the repository: Shell $ podman run -p 8080:8082 --volume ./properties/application.properties:/opt/app/application.properties:ro --name mypodmanplanet -d docker.io/mydeveloperplanet/mypodmanplanet:0.0.1-SNAPSHOT The container is started successfully, but an error message is shown in Podman Desktop. This error will show up regularly from now on. Restarting Podman Desktop resolves the issue. An issue has been filed for this problem. Unfortunately, the issue cannot be reproduced consistently. The volume is not shown in the Volumes tab, but that’s because it is an anonymous volume. Let’s create a volume and see whether this shows up in the Volumes tab. Shell $ podman volume create myFirstVolume myFirstVolume The volume is not shown in Podman Desktop. It is available via the command line, however. Shell $ podman volume ls DRIVER VOLUME NAME local myFirstVolume Viewing volumes is not possible with Podman Desktop at the time of writing. Delete the volume. 
Shell $ podman volume rm myFirstVolume myFirstVolume Create Pod In this section, you will create a Pod containing two containers. The setup is based on the one used for a previous blog. Choose in the left menu the Pods tab and click the Play Kubernetes YAML button. Select the YAML file Dockerfiles/hello-pod-2-with-env.yaml. Click the Play button. The Pod has started. Check the Containers tab, and you will see the three containers which are part of the Pod. Verify whether the endpoints are accessible. Shell $ curl http://localhost:8080/hello Hello Podman! $ curl http://localhost:8081/hello Hello Podman! The Pod can be stopped and deleted via Podman Desktop. Sometimes, Podman Desktop stops responding after deleting the Pod. After a restart of Podman Desktop, the Pod can be deleted without experiencing this issue. Conclusion Podman Desktop is a nice tool with some fine features. However, quite some bugs were encountered when using Podman Desktop (I did not create an issue for all of them). This might be due to the older version of Podman, which is available for Ubuntu, but then I would have expected that an incompatibility warning would be raised when starting Podman Desktop. However, it is a nice tool, and I will keep on using it for the time being.
A poison pill is a message on a Kafka topic that consistently fails when consumed, regardless of the number of consumption attempts. Poison pill scenarios are frequently underestimated and can arise if not properly accounted for. Neglecting to address them can result in severe disruptions to the seamless operation of an event-driven system. A poison pill can arise for various reasons: The failure of deserialization of the consumed bytes from the Kafka topic on the consumer side. Incompatible serializers and deserializers between the message producer and consumer. Corrupted records. Data still being produced to the same Kafka topic even though the producer altered the key or value serializer. A different producer began publishing messages to the Kafka topic using a different key or value serializer. The consumer configured the wrong key or value deserializer, which is not at all compatible with the serializer on the message producer side. The consequences of poison pills if not handled properly: Consumer shutdown. When a consumer receives a poison pill message from the topic, it stops processing and terminates. If we simply surround the message consumption code with a try/catch block inside the consumer, log files get flooded with error messages and stack traces, eventually leading to excessive disk space consumption on the system or nodes in the cluster. The poison pill message will block the partition of the topic, stopping the processing of any additional messages. As a result, the processing of the message will be retried, most likely extremely quickly, placing a heavy demand on the system's resources. To prevent poison messages in Apache Kafka, we need to design our Kafka consumer application and Kafka topic-handling strategy in a way that can handle and mitigate the impact of problematic or malicious messages. Proper serialization: Use a well-defined and secure serialization format for your messages, such as Avro or JSON Schema. This can help prevent issues related to the deserialization of malformed or malicious messages by consumers. Message validation: Ensure that messages being produced to Kafka topics are validated to meet expected formats and constraints before they are published. This can be done by implementing strict validation rules or schemas for the messages. Messages that do not conform to these rules should be rejected at the producer level. Timeouts and deadlines: Set timeouts and processing deadlines for your consumers. If a message takes too long to process, consider it a potential issue and handle it accordingly. This can help prevent consumers from getting stuck on problematic messages. Consumer restart strategies: Consider implementing strategies for automatically restarting consumers that encounter errors or become unresponsive. Tools like Apache Kafka Streams and Kafka Consumer Groups provide mechanisms for handling consumer failures and rebalancing partitions. Versioned topics: When evolving your message schemas, create new versions of topics rather than modifying existing ones. This allows for backward compatibility and prevents consumers from breaking due to changes in message structure. When message loss is unacceptable, a code fix will be required to specifically handle the poison pill message. Besides, we can configure a dead letter queue (DLQ) and send the poison pill messages to it for retrying or analyzing the root cause.
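As an illustration of the "code fix plus dead letter queue" approach described above, here is a minimal sketch of a plain Java consumer that reads record values as raw bytes, attempts deserialization itself, and forwards anything that fails to a dead letter topic instead of crashing. It reuses the ourMessageConsumerGroup and myTestTopic names that appear elsewhere in this article; the myTestTopic.DLT topic name and the UTF-8 decode step are assumptions standing in for your real naming convention and deserializer.

```java
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class PoisonPillTolerantConsumer {

    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "ourMessageConsumerGroup");
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        // Read the value as raw bytes so a bad payload cannot kill the consumer loop.
        consumerProps.put("value.deserializer", ByteArrayDeserializer.class.getName());

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", ByteArraySerializer.class.getName());

        try (KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, byte[]> dlqProducer = new KafkaProducer<>(producerProps)) {

            consumer.subscribe(List.of("myTestTopic"));
            while (true) {
                for (ConsumerRecord<String, byte[]> record : consumer.poll(Duration.ofSeconds(1))) {
                    try {
                        // Stand-in for your real deserialization and validation logic.
                        String payload = new String(record.value(), StandardCharsets.UTF_8);
                        process(payload);
                    } catch (Exception e) {
                        // Poison pill: park it on a dead letter topic for later analysis or retry.
                        dlqProducer.send(new ProducerRecord<>("myTestTopic.DLT", record.key(), record.value()));
                    }
                }
            }
        }
    }

    private static void process(String payload) {
        // Application-specific handling would go here.
        System.out.println("Processed: " + payload);
    }
}
```

The essential point is that deserialization happens inside the application's own try/catch, so a single bad record is diverted rather than repeatedly crashing the consumer and blocking the partition.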
If message or event loss is acceptable to a certain extent, then by executing the built-in kafka-consumer-groups.sh script from the terminal, we can reset the consumer offsets either to the latest offset ("--to-latest") or to a specific time. By executing this, all the messages that have not yet been consumed, including the poison pill, will be skipped. But we need to make sure that the consumer group is not active; otherwise, the offsets of the consumer or consumer group won't be changed. Shell kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group ourMessageConsumerGroup --reset-offsets --to-latest --topic myTestTopic --execute Or to a specific time: Shell kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group ourMessageConsumerGroup --reset-offsets --to-datetime 2023-07-20T00:00:00.000 --topic myTestTopic --execute
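If you would rather perform the same reset from code, for example as part of an operational runbook, the Kafka AdminClient exposes equivalent operations. The following is a minimal sketch only: the broker address and group ID mirror the shell example above, it resets every partition the group has committed offsets for (not just one topic), error handling is omitted, and, just as with the CLI, the consumer group must be inactive for the change to take effect.

```java
import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.ListOffsetsResult.ListOffsetsResultInfo;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ResetGroupToLatest {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        String groupId = "ourMessageConsumerGroup";

        try (Admin admin = Admin.create(props)) {
            // Partitions the group currently has committed offsets for.
            Map<TopicPartition, OffsetAndMetadata> committed =
                    admin.listConsumerGroupOffsets(groupId).partitionsToOffsetAndMetadata().get();

            // Ask the brokers for the latest offset of each of those partitions.
            Map<TopicPartition, OffsetSpec> latestSpec = committed.keySet().stream()
                    .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest()));
            Map<TopicPartition, ListOffsetsResultInfo> latest = admin.listOffsets(latestSpec).all().get();

            // Move the group's committed offsets to the latest position, skipping unconsumed messages.
            Map<TopicPartition, OffsetAndMetadata> newOffsets = latest.entrySet().stream()
                    .collect(Collectors.toMap(Map.Entry::getKey,
                            e -> new OffsetAndMetadata(e.getValue().offset())));
            admin.alterConsumerGroupOffsets(groupId, newOffsets).all().get();

            System.out.println("Reset " + newOffsets.size() + " partition offsets for group " + groupId);
        }
    }
}
```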
Concurrent programming is the art of juggling multiple tasks in a software application effectively. In the realm of Java, this means threading — a concept that has been both a boon and a bane for developers. Java's threading model, while powerful, has often been considered too complex and error-prone for everyday use. Enter Project Loom, a paradigm-shifting initiative designed to transform the way Java handles concurrency. In this blog, we'll embark on a journey to demystify Project Loom, a groundbreaking project aimed at bringing lightweight threads, known as fibers, into the world of Java. These fibers are poised to revolutionize the way Java developers approach concurrent programming, making it more accessible, efficient, and enjoyable. But before we dive into the intricacies of Project Loom, let's first understand the broader context of concurrency in Java. Understanding Concurrency in Java Concurrency is the backbone of modern software development. It allows applications to perform multiple tasks simultaneously, making the most of available resources, particularly in multi-core processors. Java, from its inception, has been a go-to language for building robust and scalable applications that can efficiently handle concurrent tasks. In Java, concurrency is primarily achieved through threads. Threads are lightweight sub-processes within a Java application that can be executed independently. These threads enable developers to perform tasks concurrently, enhancing application responsiveness and performance. However, traditional thread management in Java has its challenges. Developers often grapple with complex and error-prone aspects of thread creation, synchronization, and resource management. Threads, while powerful, can also be resource-intensive, leading to scalability issues in applications with a high thread count. Java introduced various mechanisms and libraries to ease concurrent programming, such as the java.util.concurrent package, but the fundamental challenges remained. This is where Project Loom comes into play. What Is Project Loom? Project Loom is an ambitious endeavor within the OpenJDK community that aims to revolutionize Java concurrency by introducing lightweight threads, known as fibers. These fibers promise to simplify concurrent programming in Java and address many of the pain points associated with traditional threads. The primary goal of Project Loom is to make concurrency more accessible, efficient, and developer-friendly. It achieves this by reimagining how Java manages threads and by introducing fibers as a new concurrency primitive. Fibers are not tied to native threads, which means they are lighter in terms of resource consumption and easier to manage. One of the key driving forces behind Project Loom is reducing the complexity associated with threads. Traditional threads require careful management of thread pools, synchronization primitives like locks and semaphores, and error-prone practices like dealing with thread interruption. Fibers simplify this by providing a more lightweight and predictable model for concurrency. Moreover, Project Loom aims to make Java more efficient by reducing the overhead associated with creating and managing threads. In traditional thread-based concurrency, each thread comes with its own stack and requires significant memory resources. Fibers, on the other hand, share a common stack, reducing memory overhead and making it possible to have a significantly larger number of concurrent tasks. 
Project Loom is being developed with the idea of being backward-compatible with existing Java codebases. This means that developers can gradually adopt fibers in their applications without having to rewrite their entire codebase. It's designed to seamlessly integrate with existing Java libraries and frameworks, making the transition to this new concurrency model as smooth as possible. Fibers: The Building Blocks of Lightweight Threads Fibers are at the heart of Project Loom. They represent a new concurrency primitive in Java, and understanding them is crucial to harnessing the power of lightweight threads. Fibers, sometimes referred to as green threads or user-mode threads, are fundamentally different from traditional threads in several ways. First and foremost, fibers are not tied to native threads provided by the operating system. In traditional thread-based concurrency, each thread corresponds to a native thread, which can be resource-intensive to create and manage. Fibers, on the other hand, are managed by the Java Virtual Machine (JVM) itself and are much lighter in terms of resource consumption. One of the key advantages of fibers is their lightweight nature. Unlike traditional threads, each of which reserves a large, fixed-size native stack, a fiber's stack is stored on the heap and grows only as deep as it needs to be. This significantly reduces memory overhead, allowing you to have a large number of concurrent tasks without exhausting system resources. Fibers also simplify concurrency by eliminating some of the complexities associated with traditional threads. For instance, when working with threads, developers often need to deal with issues like thread interruption and synchronization using locks. These complexities can lead to subtle bugs and make code harder to reason about. Fibers provide a more straightforward model for concurrency, making it easier to write correct and efficient code. Early Project Loom prototypes exposed fibers through a dedicated java.lang.Fiber class; in later builds, the same capability surfaces as virtual threads created through the standard java.lang.Thread API. Either way, you can think of fibers as lightweight threads that are managed by the JVM, and they allow you to write highly concurrent code without the pitfalls of traditional thread management. Getting Started With Project Loom Before you can start harnessing the power of Project Loom and its lightweight threads, you need to set up your development environment. At the time of writing, Project Loom was still in development, so you might need to use preview or early-access versions of Java to experiment with fibers. Here are the steps to get started with Project Loom: Choose the right Java version: Project Loom features might not be available in the stable release of Java. You may need to download an early-access version of Java that includes Project Loom features. Check the official OpenJDK website for the latest releases and versions that support Project Loom. Install and configure your development environment: Download and install the chosen Java version on your development machine. Configure your IDE (Integrated Development Environment) to use this version for your Project Loom experiments. Enable Project Loom features: Depending on the Java version you choose, you may need to enable preview features (for example, by passing --enable-preview to the compiler and runtime) for the Loom APIs to be available. Refer to the official documentation for instructions on how to do this. Create a simple fiber: Start by creating a basic Java application that utilizes fibers. Create a simple task that can run concurrently using a fiber.
Depending on the build, that means either the prototype java.lang.Fiber class or, in more recent versions, a virtual thread created and managed through the standard Thread API. Compile and run your application: Compile your application and run it using the configured Project Loom-enabled Java version. Observe how fibers operate and how they differ from traditional threads. Experiment and learn: Explore more complex scenarios and tasks where fibers can shine. Experiment with asynchronous programming, I/O-bound operations, and other concurrency challenges using fibers. Benefits of Lightweight Threads in Java Project Loom's introduction of lightweight threads, or fibers, into the Java ecosystem brings forth a myriad of benefits for developers and the applications they build. Let's delve deeper into these advantages: Efficiency: Fibers are more efficient than traditional threads. They are lightweight, consuming significantly less memory, and can be created and destroyed with much less overhead. This efficiency allows you to have a higher number of concurrent tasks without worrying about resource exhaustion. Simplicity: Fibers simplify concurrent programming. With fibers, you can write code that is easier to understand and reason about. You'll find yourself writing less boilerplate code for thread management, synchronization, and error handling. Scalability: The reduced memory footprint of fibers translates to improved scalability. Applications that need to handle thousands or even millions of concurrent tasks can do so more efficiently with fibers. Responsiveness: Fibers enhance application responsiveness. Tasks that would traditionally block a thread can now yield control to the fiber scheduler, allowing other tasks to run in the meantime. This results in applications that feel more responsive and can better handle user interactions. Compatibility: Project Loom is designed to be backward-compatible with existing Java codebases. This means you can gradually adopt fibers in your applications without a full rewrite. You can incrementally update your code to leverage lightweight threads where they provide the most benefit. Resource utilization: Fibers can improve resource utilization in applications that perform asynchronous I/O operations, such as web servers or database clients. They allow you to efficiently manage a large number of concurrent connections without the overhead of traditional threads. Reduced complexity: Code that deals with concurrency often involves complex patterns and error-prone practices. Fibers simplify these complexities, making it easier to write correct and efficient concurrent code. It's important to note that while Project Loom promises significant advantages, it's not a one-size-fits-all solution. The choice between traditional threads and fibers should be based on the specific needs of your application. However, Project Loom provides a powerful tool that can simplify many aspects of concurrent programming in Java and deserves consideration in your development toolkit. Project Loom Best Practices Now that you have an understanding of Project Loom and the benefits it offers, let's dive into some best practices for working with fibers in your Java applications: Choose the right concurrency model: While fibers offer simplicity and efficiency, they may not be the best choice for every scenario. Evaluate your application's specific concurrency requirements to determine whether fibers or traditional threads are more suitable.
Limit blocking operations: Fibers are most effective in scenarios with a high degree of concurrency and tasks that may block, such as I/O operations. Use fibers for tasks that can yield control when waiting for external resources, allowing other fibers to run. Avoid thread synchronization: One of the advantages of fibers is reduced reliance on traditional synchronization primitives like locks. Whenever possible, use non-blocking or asynchronous techniques to coordinate between fibers, which can lead to more efficient and scalable code. Keep error handling in mind: Exception handling in fibers can be different from traditional threads. Be aware of how exceptions propagate in fiber-based code and ensure you have proper error-handling mechanisms in place. Use thread pools: Consider using thread pools with fibers for optimal resource utilization. Thread pools can efficiently manage the execution of fibers while controlling the number of active fibers to prevent resource exhaustion. Stay updated: Project Loom is an evolving project, and new features and improvements are regularly introduced. Stay updated with the latest releases and documentation to take advantage of the latest enhancements. Experiment and benchmark: Before fully adopting fibers in a production application, experiment with different scenarios and benchmark the performance to ensure that fibers are indeed improving your application's concurrency. Profile and debug: Familiarize yourself with tools and techniques for profiling and debugging fiber-based applications. Tools like profilers and debuggers can help you identify and resolve performance bottlenecks and issues. Project Loom and Existing Libraries/Frameworks One of the remarkable aspects of Project Loom is its compatibility with existing Java libraries and frameworks. As a developer, you don't have to discard your existing codebase to leverage the benefits of fibers. Here's how Project Loom can coexist with your favorite Java tools: Java standard library: Project Loom is designed to seamlessly integrate with the Java standard library. You can use fibers alongside existing Java classes and packages without modification. Concurrency libraries: Popular Java concurrency libraries, such as java.util.concurrent, can be used with fibers. You can employ thread pools and other concurrency utilities to manage and coordinate fiber execution. Frameworks and web servers: Java frameworks and web servers like Spring, Jakarta EE, and Apache Tomcat can benefit from Project Loom. Fibers can improve the efficiency of handling multiple client requests concurrently. Database access: If your application performs database access, fibers can be used to efficiently manage database connections. They allow you to handle a large number of concurrent database requests without excessive resource consumption. Third-party libraries: Most third-party libraries that are compatible with Java can be used in conjunction with Project Loom. Ensure that you're using Java versions compatible with Project Loom features. Asynchronous APIs: Many Java libraries and frameworks offer asynchronous APIs that align well with fibers. You can utilize these APIs to write non-blocking, efficient code. Project Loom's compatibility with existing Java ecosystem components is a significant advantage. It allows you to gradually adopt fibers where they provide the most value in your application while preserving your investment in existing code and libraries. 
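The pieces above — creating fibers, limiting blocking work, and coordinating them with java.util.concurrent utilities — fit together roughly as in the following sketch. It assumes a JDK where Loom's lightweight threads are available as virtual threads (preview in JDK 19/20, final in JDK 21), and the 50-permit limit and callRemoteService method are hypothetical placeholders for a real scarce resource and a real blocking call.
Java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class BoundedFiberWork {
    // Hypothetical limit: assume the downstream service tolerates 50 concurrent calls
    private static final Semaphore PERMITS = new Semaphore(50);

    public static void main(String[] args) {
        // One lightweight (virtual) thread per task, coordinated with java.util.concurrent tools
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                int requestId = i;
                executor.submit(() -> {
                    PERMITS.acquire();                // bound concurrent access to the scarce resource
                    try {
                        callRemoteService(requestId); // blocking I/O parks the lightweight thread
                    } finally {
                        PERMITS.release();
                    }
                    return requestId;
                });
            }
        } // the executor's close() waits for all submitted tasks to finish
    }

    // Placeholder for a blocking call such as an HTTP request or JDBC query
    private static void callRemoteService(int requestId) throws InterruptedException {
        Thread.sleep(Duration.ofMillis(200));
    }
}
Because each task gets its own cheap lightweight thread, it is the Semaphore, rather than the size of a thread pool, that actually bounds pressure on the downstream resource.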
Future of Project Loom As Project Loom continues to evolve and make strides in simplifying concurrency in Java, it's essential to consider its potential impact on the future of Java development. Here are some factors to ponder: Increased adoption: As developers become more familiar with fibers and their benefits, Project Loom could see widespread adoption. This could lead to the creation of a vast ecosystem of libraries and tools that leverage lightweight threads. Enhancements and improvements: Project Loom is still in development, and future releases may bring further enhancements and optimizations. Keep an eye on the project's progress and be ready to embrace new features and improvements. Easier concurrency education: With the simplification of concurrency, Java newcomers may find it easier to grasp the concepts of concurrent programming. This could lead to a more significant talent pool of Java developers with strong concurrency skills. Concurrency-driven architectures: Project Loom's efficiency and ease of use might encourage developers to design and implement more concurrency-driven architectures. This could result in applications that are highly responsive and scalable. Feedback and contributions: Get involved with the Project Loom community by providing feedback, reporting issues, and even contributing to the project's development. Your insights and contributions can shape the future of Project Loom. Conclusion In this journey through Project Loom, we've explored the evolution of concurrency in Java, the introduction of lightweight threads known as fibers, and the potential they hold for simplifying concurrent programming. Project Loom represents a significant step forward in making Java more efficient, developer-friendly, and scalable in the realm of concurrent programming. As you embark on your own exploration of Project Loom, remember that while it offers a promising future for Java concurrency, it's not a one-size-fits-all solution. Evaluate your application's specific needs and experiment with fibers to determine where they can make the most significant impact. The world of Java development is continually evolving, and Project Loom is just one example of how innovation and community collaboration can shape the future of the language. By embracing Project Loom, staying informed about its progress, and adopting best practices, you can position yourself to thrive in the ever-changing landscape of Java development.
This is an article from DZone's 2023 Database Systems Trend Report.For more: Read the Report In today's rapidly evolving digital landscape, businesses across the globe are embracing cloud computing to streamline operations, reduce costs, and drive innovation. At the heart of this digital transformation lies the critical role of cloud databases — the backbone of modern data management. With the ever-growing volume of data generated for business, education, and technology, the importance of scalability, security, and cloud services has become paramount in choosing the right cloud vendor. In this article, we will delve into the world of primary cloud vendors, taking an in-depth look at their offerings and analyzing the crucial factors that set them apart: scalability, security, and cloud services for cloud databases. Armed with this knowledge, businesses can make informed decisions as they navigate the vast skies of cloud computing and select the optimal vendor to support their unique data management requirements. Scaling in the Cloud One of the fundamental advantages of cloud databases is their ability to scale in response to increasing demands for storage and processing power. Scalability can be achieved in two primary ways: horizontally and vertically. Horizontal scaling, also known as scale-out, involves adding more servers to a system, distributing the load across multiple nodes. Vertical scaling, or scale-up, refers to increasing the capacity of existing servers by adding more resources such as CPU, memory, and storage. Benefits of Scalability By distributing workloads across multiple servers or increasing the resources available on a single server, cloud databases can optimize performance and prevent bottlenecks, ensuring smooth operation even during peak times. Scalability allows organizations to adapt to sudden spikes in demand or changing requirements without interrupting services. By expanding or contracting resources as needed, businesses can maintain uptime and avoid costly outages. By scaling resources on-demand, organizations can optimize infrastructure costs, paying only for what they use. This flexible approach allows for more efficient resource allocation and cost savings compared to traditional on-premises infrastructure. Examples of Cloud Databases With Scalability Several primary cloud vendors offer scalable cloud databases designed to meet the diverse needs of organizations. The most popular offerings span both licensed database platforms and open-source ones such as MySQL and PostgreSQL. In public clouds, there are three major players in the arena: Amazon Web Services, Microsoft Azure, and Google Cloud. The major cloud vendors offer managed cloud databases in various flavors of both licensed and open-source database platforms. These databases can be scaled easily in both storage and compute, with the scaling controlled through the vendor's service offerings. In the cloud, scaling most often means scaling up to more powerful instances, although some cloud databases are able to scale out as well. Figure 1: Scaling up behind the scenes in the cloud Each cloud vendor provides various high availability and scalability options with minimal manual intervention, allowing organizations to scale instances up or down and add replicas for read-heavy workloads or maintenance offloading. Securing Data in the Cloud As organizations increasingly embrace cloud databases to store and manage their sensitive data, ensuring robust security has become a top priority.
While cloud databases offer numerous advantages, they also come with potential risks, such as data breaches, unauthorized access, and insider threats. In this section, we will explore the security features that cloud databases provide and discuss how they help mitigate these risks. Common Security Risks Data breaches aren't a question of if, but of when. Unauthorized access to sensitive data can result in reputational damage, financial losses, and regulatory penalties. It shouldn't surprise anyone that cloud databases can be targeted by cybercriminals attempting to gain unauthorized access to data. This risk makes it essential to implement strict access controls at all levels — cloud, network, application, and database. As much as we don't like to think about it, disgruntled employees or other insiders can pose a significant threat to organizations' data security, as they may have legitimate access to the system and misuse it, whether maliciously or unintentionally. Security Features in Cloud Databases One of the largest benefits of a public cloud vendor is the range of first-party and partner security offerings that can strengthen the security of cloud databases. Cloud databases offer robust access control mechanisms, such as role-based access control (RBAC) and multi-factor authentication (MFA), to ensure that only authorized users can access data. These features help prevent unauthorized access and reduce the risk of insider threats. Figure 2: Database security in the public cloud Another widely implemented protection method is encryption and data-level protection. To protect data from unauthorized access, cloud databases provide various encryption methods. These different levels and layers of encryption help secure data throughout its lifecycle. Encryption comes in three main forms: Encryption at rest protects data stored on a disk by encrypting it using strong encryption algorithms. Encryption in transit safeguards data as it travels between the client and the server or between different components within the database service. Encryption in use encrypts data while it is being processed or used by the database, ensuring that data remains secure even when in memory. Compliance and Regulations Cloud database providers often adhere to strict compliance standards and regulations, such as the General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA), and the Payment Card Industry Data Security Standard (PCI-DSS). Compliance with these regulations helps ensure that organizations meet their legal and regulatory obligations, further enhancing data security. Integrating cloud databases with identity and access management (IAM) services, such as AWS Identity and Access Management, Azure Active Directory, and Google Cloud Identity, helps enforce strict security and access control policies. This integration ensures that only authorized users can access and interact with the cloud database, enhancing overall security. Cloud Services and Databases Cloud databases not only provide efficient storage and management of data but can also be seamlessly integrated with various other cloud services to enhance their capabilities. By leveraging these integrations, organizations can access powerful tools for insights, analytics, security, and quality. In this section, we will explore some popular cloud services that can be integrated with cloud databases and discuss their benefits.
Cloud Machine Learning Services Machine learning services in the cloud enable organizations to develop, train, and deploy machine learning models using their cloud databases as data sources. These services can help derive valuable insights and predictions from stored data, allowing businesses to make data-driven decisions and optimize processes. With today's heavy investment in artificial intelligence (AI), no one should be surprised that cloud services for AI sit at the top of the list. AI services in the cloud, such as natural language processing, computer vision, and speech recognition, can be integrated with cloud databases to unlock new capabilities. These integrations enable organizations to analyze unstructured data, automate decision-making, and improve user experiences. Cloud Databases and Integration Integrating cloud databases with data warehouse solutions, such as Amazon Redshift, Google BigQuery, Azure Synapse Analytics, and Snowflake, allows organizations to perform large-scale data analytics and reporting. This combination provides a unified platform for data storage, management, and analysis, enabling businesses to gain deeper insights from their data. Along with AI and machine learning, cloud databases can be integrated with business intelligence (BI) tools like Tableau, Power BI, and Looker to create visualizations and dashboards. By connecting BI tools to cloud databases, organizations can easily analyze and explore data, empowering them to make informed decisions based on real-time insights. Integrating cloud databases with data streaming services like Amazon Kinesis, Azure Stream Analytics, and Google Cloud Pub/Sub enables organizations to process and analyze data in real time, providing timely insights and improving decision-making. By integrating cloud databases with monitoring and alerting services, such as Amazon CloudWatch, Azure Monitor, and Google Cloud Monitoring, organizations can gain insights into the health and performance of their databases. These services allow businesses to set up alerts, monitor key performance indicators (KPIs), and troubleshoot issues in real time. Data Pipelines and ETL Services The final services in the integration category are data pipeline and ETL services, such as AWS Glue, Azure Data Factory, and Google Cloud Data Fusion, which can be integrated with relational cloud databases to automate data ingestion, transformation, and loading processes, ensuring seamless data flow between systems. Conclusion The scalability of cloud databases is an essential factor for organizations looking to manage their growing data needs effectively. Along with scalability, security is a critical aspect of cloud databases, and it is crucial for organizations to understand the features and protections offered by their chosen provider. By leveraging robust access control, encryption, and compliance measures, businesses can significantly reduce the risks associated with data breaches, unauthorized access, and insider threats, ensuring that their sensitive data remains secure and protected in the cloud. Finally, to offer the highest return on investment, integrating cloud databases with other services unlocks the powerful analytics and insights available in the public cloud. By leveraging these integrations, organizations can enhance the capabilities of their cloud databases and optimize their data management processes, driving innovation and growth in the digital age.
This is an article from DZone's 2023 Database Systems Trend Report.For more: Read the Report
This is an article from DZone's 2023 Database Systems Trend Report.For more: Read the Report Database design is a critical factor in microservices and cloud-native solutions because a microservices-based architecture results in distributed data. Instead of data management happening in a single process, multiple processes can manipulate the data. The rise of cloud computing has made data even more distributed. To deal with this complexity, several data management patterns have emerged for microservices and cloud-native solutions. In this article, we will look at the most important patterns that can help us manage data in a distributed environment. The Challenges of Database Design for Microservices and the Cloud Before we dig into the specific data management patterns, it is important to understand the key challenges with database design for microservices and the cloud: In a microservices architecture, data is distributed across different nodes. Some of these nodes can be in different data centers in completely different geographic regions of the world. In this situation, it is tough to guarantee consistency of data across all the nodes. At any given point in time, there can be differences in the state of data between various nodes. This is also known as the problem of eventual consistency. Since the data is distributed, there's no central authority that manages data like in single-node monolithic systems. It's important for the various participating systems to use a mechanism (e.g., consensus algorithms) for data management. The attack surface for malicious actors is larger in a microservices architecture since there are multiple moving parts. This means we need to establish a more robust security posture while building microservices. The main promise of microservices and the cloud is scalability. While it becomes easier to scale the application processes, it is not so easy to scale the database nodes horizontally. Without proper scalability, databases can turn into performance bottlenecks. Diving Into Data Management Patterns Considering the associated challenges, several patterns are available to manage data in microservices and cloud-native applications. The main job of these patterns is to help developers address the various challenges mentioned above. Let's look at each of these patterns one by one. Database per Service As the name suggests, this pattern proposes that each microservice manages its own data. This implies that no microservice can directly access or manipulate the data managed by another microservice. Any exchange or manipulation of data can be done only by using a set of well-defined APIs. The figure below shows an example of a database-per-service pattern. Figure 1: Database-per-service pattern At face value, this pattern seems quite simple. It can be implemented relatively easily when we are starting with a brand-new application. However, when we are migrating an existing monolithic application to a microservices architecture, the demarcation between services is not so clear. Most of the functionality is written in a way where different parts of the system access data from other parts informally. There are two main areas to focus on when using a database-per-service pattern: Defining bounded contexts for each service Managing business transactions spanning multiple microservices Shared Database The next important pattern is the shared database pattern.
Though this pattern supports microservices architecture, it adopts a much more lenient approach by using a shared database accessible to multiple microservices. For existing applications transitioning to a microservices architecture, this is a much safer pattern, as we can slowly evolve the application layer without changing the database design. However, this approach takes away some benefits of microservices: Developers across teams need to coordinate schema changes to tables. Runtime conflicts may arise when multiple services are trying to access the same database resources. CQRS and Event Sourcing In the command query responsibility segregation (CQRS) pattern, an application listens to domain events from other microservices and updates a separate database for supporting views and queries. We can then serve complex aggregation queries from this separate database while optimizing the performance and scaling it up as needed. Event sourcing takes it a bit further by storing the state of the entity or the aggregate as a sequence of events. Whenever we have an update or an insert on an object, a new event is created and stored in the event store. We can use CQRS and event sourcing together to solve a lot of challenges around event handling and maintaining separate query data. This way, you can scale the writes and reads separately based on their individual requirements. Figure 2: Event sourcing and CQRS in action together On the downside, this is an unfamiliar style of building applications for most developers, and there are more moving parts to manage. Saga Pattern The saga pattern is another solution for handling business transactions across multiple microservices. For example, placing an order on a food delivery app is a business transaction. In the saga pattern, we break this business transaction into a sequence of local transactions handled by different services. For every local transaction, the service that performs the transaction publishes an event. The event triggers a subsequent transaction in another service, and the chain continues until the entire business transaction is completed. If any particular transaction in the chain fails, the saga rolls back by executing a series of compensating transactions that undo the impact of all the previous transactions. There are two types of saga implementations: Orchestration-based saga Choreography-based saga Sharding Sharding helps in building cloud-native applications. It involves separating rows of one table into multiple different tables. This is also known as horizontal partitioning, but when the partitions reside on different nodes, they are known as shards. Sharding helps us improve the read and write scalability of the database. Also, it improves the performance of queries because a particular query must deal with fewer records as a result of sharding. Replication Replication is a very important data management pattern. It involves creating multiple copies of the database. Each copy is identical and runs on a different server or node. Changes made to one copy are propagated to the other copies. This is known as replication. There are several types of replication approaches, such as: Single-leader replication Multi-leader replication Leaderless replication Replication helps us achieve high availability and boosts reliability, and it lets us scale out read operations since read requests can be diverted to multiple servers. Figure 3 below shows sharding and replication working in combination. 
Figure 3: Using sharding and replication together Best Practices for Database Design in a Cloud-Native Environment While these patterns can go a long way in addressing data management issues in microservices and cloud-native architecture, we also need to follow some best practices to make life easier. Here are a few best practices: We must try to design a solution for resilience. This is because faults are inevitable in a microservices architecture, and the design should accommodate failures and recover from them without disrupting the business. We must implement proper migration strategies when transitioning to one of the patterns. Some of the common strategies that can be evaluated are schema first versus data first, blue-green deployments, or using the strangler pattern. Don't ignore backups and well-tested disaster recovery systems. These things are important even for single-node databases. However, in a distributed data management approach, disaster recovery becomes even more important. Constant monitoring and observability are equally important in microservices or cloud-native applications. For example, techniques like sharding can lead to unbalanced partitions and hotspots. Without proper monitoring solutions, any reactions to such situations may come too late and may put the business at risk. Conclusion We can conclude that good database design is absolutely vital in a microservices and cloud-native environment. Without proper design, an application will face multiple problems due to the inherent complexity of distributed data. Multiple data management patterns exist to help us deal with data in a more reliable and scalable manner. However, each pattern has its own challenges and set of advantages and disadvantages. No pattern fits all the possible scenarios, and we should select a particular pattern only after managing the various trade-offs. This is an article from DZone's 2023 Database Systems Trend Report.For more: Read the Report
This is an article from DZone's 2023 Database Systems Trend Report.For more: Read the Report Good database design is essential to ensure data accuracy, consistency, and integrity, and to keep databases efficient, reliable, and easy to use. The design must make it possible to store and retrieve data quickly and easily while handling large volumes of data in a stable way. An experienced database designer can create a robust, scalable, and secure database architecture that meets the needs of modern data systems. Architecture and Design A modern data architecture for microservices and cloud-native applications involves multiple layers, and each one has its own set of components and preferred technologies. Typically, the foundational layer is constructed as a storage layer, encompassing one or more databases, whether SQL, NoSQL, or NewSQL. This layer assumes responsibility for the storage, retrieval, and management of data, including tasks like indexing, querying, and transaction management. To enhance this architecture, it is advantageous to design a data access layer that resides above the storage layer but below the service layer. This data access layer leverages technologies like object-relational mapping or data access objects to simplify data retrieval and manipulation. Finally, at the topmost layer lies the presentation layer, where information is presented to the end user. The effective transmission of data through the layers of an application, culminating in its presentation as meaningful information to users, is of utmost importance in a modern data architecture. The goal here is to design a scalable database with the ability to handle a high volume of traffic and data while minimizing downtime and performance issues. By following best practices and addressing a few challenges, we can meet the needs of today's modern data architecture for different applications. Figure 1: Layered architecture Considerations By taking into account the following considerations when designing a database for enterprise-level usage, it is possible to create a robust and efficient system that meets the specific needs of the organization while ensuring data integrity, availability, security, and scalability. One important consideration is the data that will be stored in the database. This involves assessing the format, size, complexity, and relationships between data entities. Different types of data may require specific storage structures and data models. For instance, transactional data often fits well with a relational database model, while unstructured data like images or videos may require a NoSQL database model. The frequency of data retrieval or access plays a significant role in determining the design considerations. In read-heavy systems, implementing a cache for frequently accessed data can enhance query response times. In data warehouse scenarios, by contrast, data is retrieved less frequently, so the emphasis shifts. Techniques such as indexing, caching, and partitioning can be employed to optimize query performance. Ensuring the availability of the database is crucial for maintaining optimal application performance. Techniques such as replication, load balancing, and failover are commonly used to achieve high availability. Additionally, having a robust disaster recovery plan in place adds an extra layer of protection to the overall database system. As data volumes grow, it is essential that the database system can handle increased loads without compromising performance.
Employing techniques like partitioning, sharding, and clustering allows for effective scalability within a database system. These approaches enable the efficient distribution of data and workload across multiple servers or nodes. Data security is a critical consideration in modern database design, given the rising prevalence of fraud and data breaches. Implementing robust access controls, encryption mechanisms for sensitive personally identifiable information, and conducting regular audits are vital for enhancing the security of a database system. In transaction-heavy systems, maintaining consistency in transactional data is paramount. Many databases provide features such as appropriate locking mechanisms and transaction isolation levels to ensure data integrity and consistency. These features help to prevent issues like concurrent data modifications and inconsistencies. Challenges Determining the most suitable tool or technology for our database needs can be a challenge due to the rapid growth and evolving nature of the database landscape. With different types of databases emerging daily and even variations among vendors offering the same type, it is crucial to plan carefully based on your specific use cases and requirements. By thoroughly understanding our needs and researching the available options, we can identify the right tool with the appropriate features to meet our database needs effectively. Polyglot persistence, the use of multiple SQL or NoSQL databases side by side, is a consideration that arises from the demands of certain applications. Selecting the right databases for transactional systems, ensuring data consistency, handling financial data, and accommodating high data volumes pose challenges. Careful consideration is necessary to choose the appropriate databases that can fulfill the specific requirements of each aspect while maintaining overall system integrity. Integrating data from different upstream systems, each with its own structure and volume, presents a significant challenge. The goal is to achieve a single source of truth by harmonizing and integrating the data effectively. This process requires comprehensive planning to ensure compatibility and future-proofing the integration solution to accommodate potential changes and updates. Performance is an ongoing concern in both applications and database systems. Every addition to the database system can potentially impact performance. To address performance issues, it is essential to follow best practices when adding, managing, and purging data, as well as properly indexing, partitioning, and implementing encryption techniques. By employing these practices, you can mitigate performance bottlenecks and optimize the overall performance of your database system. Considering these factors will contribute to making informed decisions and designing an efficient and effective database system for your specific requirements. Advice for Building Your Architecture Goals for a better database design should include efficiency, scalability, security, and compliance. In the table below, each goal is accompanied by its corresponding industry expectation, highlighting the key aspects that should be considered when designing a database for optimal performance, scalability, security, and compliance.

GOALS FOR DATABASE DESIGN
Goal | Industry Expectation
Efficiency | Optimal performance and responsiveness of the database system, minimizing latency and maximizing throughput. Efficient handling of data operations, queries, and transactions.
Scalability | Ability to handle increasing data volumes, user loads, and concurrent transactions without sacrificing performance. Scalable architecture that allows for horizontal or vertical scaling to accommodate growth.
Security | Robust security measures to protect against unauthorized access, data breaches, and other security threats. Implementation of access controls, encryption, auditing mechanisms, and adherence to industry best practices and compliance regulations.
Compliance | Adherence to relevant industry regulations, standards, and legal requirements. Ensuring data privacy, confidentiality, and integrity. Implementing data governance practices and maintaining audit trails to demonstrate compliance.
Table 1

When building your database architecture, it's important to consider several key factors to ensure the design is effective and meets your specific needs. Start by clearly defining the system's purpose, data types, volume, access patterns, and performance expectations. Consider clear requirements that provide clarity on the data to be stored and the relationships between the data entities. This will help ensure that the database design aligns with quality standards and conforms to your requirements. Also consider normalization, which enables efficient storage use by minimizing redundant data, improves data integrity by enforcing consistency and reliability, and facilitates easier maintenance and updates. Selecting the right database model or opting for polyglot persistence support is crucial to ensure the database aligns with your specific needs. This decision should be based on the requirements of your application and the data it handles. Planning for future growth is essential to accommodate increasing demand. Consider scalability options that allow your database to handle growing data volumes and user loads without sacrificing performance. Alongside growth, prioritize data protection: implement industry-standard security recommendations, ensure appropriate access levels are in place, and put IT security measures in place to protect the database from unauthorized access, data theft, and other security threats. A good backup system is a hallmark of a well-designed database. Regular backups and data synchronization, both on-site and off-site, provide protection against data loss or corruption, safeguarding your valuable information. To validate the effectiveness of your database design, test the model using sample data from real-world scenarios. This testing process will help validate the performance, reliability, and functionality of the database system you are using, ensuring it meets your expectations. Good documentation practices play a vital role in improving feedback systems and validating thought processes and implementations during the design and review phases. Continuously improving documentation will aid in future maintenance, troubleshooting, and system enhancement efforts. Primary and secondary keys contribute to data integrity and consistency. Use indexes to optimize database performance by indexing frequently queried fields and limiting the number of fields returned in queries. Regularly backing up the database protects against data loss during corruption, system failure, or other unforeseen circumstances. Data archiving and purging practices help remove infrequently accessed data, reducing the size of the active dataset. Proper error handling and logging aid in debugging, troubleshooting, and system maintenance; a small example tying several of these recommendations together follows below.
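The sketch below is only illustrative, using plain JDBC: the connection URL, credentials, table, and columns are hypothetical placeholders, and it assumes a PostgreSQL driver on the classpath. The point is simply to show a parameterized query against an indexed key, selecting only the columns that are needed, with explicit error handling and logging rather than a swallowed failure.
Java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class OrderLookup {
    private static final Logger LOG = Logger.getLogger(OrderLookup.class.getName());

    // Hypothetical connection details; adjust to your environment
    private static final String URL = "jdbc:postgresql://localhost:5432/shop";

    public static void printOrderStatus(long orderId) {
        // Select only the columns the caller needs (not SELECT *), filtering on an indexed key
        String sql = "SELECT status, updated_at FROM orders WHERE order_id = ?";
        try (Connection conn = DriverManager.getConnection(URL, "app_user", "app_password");
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, orderId);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    System.out.println(orderId + " -> " + rs.getString("status")
                            + " @ " + rs.getTimestamp("updated_at"));
                }
            }
        } catch (SQLException e) {
            // Proper error handling and logging aid debugging and maintenance
            LOG.log(Level.SEVERE, "Order lookup failed for id " + orderId, e);
        }
    }
}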
Regular maintenance is crucial for growing database systems. Plan and schedule regular backups, perform performance tuning, and stay up to date with software upgrades to ensure optimal database performance and stability. Conclusion Designing a modern data architecture that can handle the growing demands of today's digital world is not an easy job. However, if you follow best practices and take advantage of the latest technologies and techniques, it is very much possible to build a scalable, flexible, and secure database. It just requires the right mindset and your commitment to learning and improving with a proper feedback loop. Additional reading: Semantic Modeling for Data: Avoiding Pitfalls and Breaking Dilemmas by Panos Alexopoulos Learn PostgreSQL: Build and manage high-performance database solutions using PostgreSQL 12 and 13 by Luca Ferrari and Enrico Pirozzi Designing Data-Intensive Applications by Martin Kleppmann This is an article from DZone's 2023 Database Systems Trend Report.For more: Read the Report