This blog post focuses on optimizing the size of JVM Docker images. It explores various techniques such as multi-stage builds, jlink, jdeps, and experimenting with base images. By implementing these optimizations, deployments can be faster, and resource usage can be optimized. The Problem Since Java 11, there is no pre-bundled JRE provided. As a result, basic Dockerfiles without any optimization can result in large image sizes. In the absence of a provided JRE, it becomes necessary to explore techniques and optimizations to reduce the size of JVM Docker images. Now, let's take a look at the simplest version of the Dockerfile for our application and see what's wrong with it. The project we will use in all the examples is Spring Petclinic. The simplest Dockerfile for our project looks like this: NOTE: Do not forget to build your JAR file. Dockerfile FROM eclipse-temurin:17 VOLUME /tmp COPY target/spring-petclinic-3.1.0-SNAPSHOT.jar app.jar After we have built the JAR file of our project, let's build our Dockerfile image and compare the sizes of our JAR file and the created Docker image. Dockerfile docker build -t spring-pet-clinic/jdk -f Dockerfile . docker image ls spring-pet-clinic/jdk # REPOSITORY TAG IMAGE ID CREATED SIZE # spring-pet-clinic/jdk latest 3dcd0ab89c3d 23 minutes ago 465MB If we look at the SIZE column, we can see that the size of our Docker image is 465MB! That's a lot, you might think, but maybe it's because our JAR is pretty big? In order to verify this, let's take a look at the size of our JAR file using the following command: Dockerfile ls -lh target/spring-petclinic-3.1.0-SNAPSHOT.jar | awk '{print $9, $5}' # target/spring-petclinic-3.1.0-SNAPSHOT.jar 55M According to the output of our command, you can see that the size of our JAR file is only 55MB. If we compare it to the size of a built Docker image, our JAR file is almost nine times smaller! Let's move on to analyze the reasons and how to make it smaller. What Are the Reasons for Big Docker Images, and How To Reduce Them? Before we move on to the optimization of our Docker image, we need to find out what exactly is causing it to be so relatively large. To do this, we will use a tool called Dive which is used for exploring a docker image, layer contents, and discovering ways to shrink the size of your Docker/OCI image. To install Dive, follow the guide in their README: Now, let’s find out why our Docker image has such a size by exploring layers by using this command: dive spring-pet-clinic/jdk (instead of spring-pet-clinic/jdk use your Docker image name). Its output may feel a little bit overwhelming, but don’t worry, we will explore its output together. For our purpose, we are mostly interested only in the top left part, which is the layers of our Docker image. We can navigate between layers by using the “arrow” buttons. Now, let’s find out which layers our Docker image consists of. Remember, these are the layers of Docker image built from our basic Dockerfile. The first layer is our operating system. By default, it is Ubuntu. In the next one, it installs tzdata, curl, wget, locales, and some more different utils, which takes 50MB! The third layer, as you can see from the screenshot above, is our entire Eclipse Temurin 17 JDK, and it takes 279MB, which is pretty big. And the last one is our built JAR, which takes 58MB. 
Now that we understand what our Docker image consists of, we can see that a big part of our Docker image includes the entire JDK and things such as timezones, locales, and different utilities, which is unnecessary. The first optimization for our Docker images is to use jlink tool included in Java 9 along with modularity. With jlink, we can create a custom Java runtime that includes only the necessary components, resulting in a smaller final image. Now, let's take a look at our new Dockerfile incorporating the jlink tool, which, in theory, should be smaller than the previous one. Dockerfile # Example of custom Java runtime using jlink in a multi-stage container build FROM eclipse-temurin:17 as jre-build # Create a custom Java runtime RUN $JAVA_HOME/bin/jlink \ --add-modules ALL-MODULE-PATH \ --strip-debug \ --no-man-pages \ --no-header-files \ --compress=2 \ --output /javaruntime # Define your base image FROM debian:buster-slim ENV JAVA_HOME=/opt/java/openjdk ENV PATH "${JAVA_HOME}/bin:${PATH}" COPY --from=jre-build /javaruntime $JAVA_HOME # Continue with your application deployment RUN mkdir /opt/app COPY target/spring-petclinic-3.1.0-SNAPSHOT.jar /opt/app/app.jar CMD ["java", "-jar", "/opt/app/app.jar"] To understand how our new Dockerfile works, let's walk through it: We use multi-stage Docker build in this Dockerfile and it consists of 2 stages. For the first stage, we use the same base image as in the previous Dockerfile. Also, we employ jlink tool to create a custom JRE, including all Java modules using —add-modules ALL-MODULE-PATH The second stage uses the debian:buster-slim base image and sets the environment variables for JAVA_HOME and PATH. It copies the custom JRE created in the first stage to the image. The Dockerfile then creates a directory for the application, copies the application JAR file into it, and specifies a command to run the Java application when the container starts. Let’s now build our container image and find out how much smaller it has become. Dockerfile docker build -t spring-pet-clinic/jlink -f Dockerfile_jlink . docker image ls spring-pet-clinic/jlink # REPOSITORY TAG IMAGE ID CREATED SIZE # spring-pet-clinic/jlink latest e7728584dea5 1 hours ago 217MB Our new container image is 217MB in size, which is two times smaller than our previous one. Stripping Container Image Size, Even More, Using Java Dependency Analysis Tool (Jdeps) What if I told you that the size of our container image can be made even smaller? When paired with jlink, you can also use the Java Dependency Analysis Tool (jdeps), which was first introduced in Java 8, to understand the static dependencies of your applications and libraries. In our previous example, for the jlink —add-modules parameter, we set ALL-MODULE-PATH which adds all existing Java modules in our custom JRE, and obviously, we don’t need to include every module. This way we can use jdeps to analyze the project's dependencies and remove any unused ones, further reducing the image size. 
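Before wiring jdeps into the build, it can help to verify what a given runtime actually ships. The small JDK-only sketch below (the class name is illustrative, not part of the Petclinic project) prints every module in the boot layer; running it inside the ALL-MODULE-PATH image and later inside the trimmed image makes the difference visible.
Java
import java.util.stream.Collectors;

// Prints all modules resolved into the runtime's boot layer. Compile and run it
// inside the container (e.g., "java ListModules.java" on Java 11+) to compare the
// full JDK, the ALL-MODULE-PATH runtime, and the jdeps-trimmed runtime.
public class ListModules {
    public static void main(String[] args) {
        var moduleNames = ModuleLayer.boot().modules().stream()
                .map(Module::getName)
                .sorted()
                .collect(Collectors.toList());
        moduleNames.forEach(System.out::println);
        System.out.println("Total modules: " + moduleNames.size());
    }
}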
Let’s take a look at how to use jdeps in our Dockerfile: Dockerfile # Example of custom Java runtime using jlink in a multi-stage container build FROM eclipse-temurin:17 as jre-build COPY target/spring-petclinic-3.1.0-SNAPSHOT.jar /app/app.jar WORKDIR /app # List jar modules RUN jar xf app.jar RUN jdeps \ --ignore-missing-deps \ --print-module-deps \ --multi-release 17 \ --recursive \ --class-path 'BOOT-INF/lib/*' \ app.jar > modules.txt # Create a custom Java runtime RUN $JAVA_HOME/bin/jlink \ --add-modules $(cat modules.txt) \ --strip-debug \ --no-man-pages \ --no-header-files \ --compress=2 \ --output /javaruntime # Define your base image FROM debian:buster-slim ENV JAVA_HOME=/opt/java/openjdk ENV PATH "${JAVA_HOME}/bin:${PATH}" COPY --from=jre-build /javaruntime $JAVA_HOME # Continue with your application deployment RUN mkdir /opt/server COPY --from=jre-build /app/app.jar /opt/server/ CMD ["java", "-jar", "/opt/server/app.jar"] Even without going into details, you can see that our Dockerfile has become much larger. Now let's analyze each piece and what it is responsible for: We still use multi-stage Docker build. Copy our built Java app and set WORKDIR to /app. Unpacks the JAR file, making its contents accessible for jdeps tool. The second RUN instruction runs jdeps tool on the extracted JAR file to analyze its dependencies and create a list of required Java modules. Here's what each option does: --ignore-missing-deps: Ignores any missing dependencies, allowing the analysis to continue. --print-module-deps: Specifies that the analysis should print the module dependencies. --multi-release 17: Indicates that the application JAR is compatible with multiple Java versions, in our case, Java 17. --recursive: Performs a recursive analysis to identify dependencies at all levels. --class-path 'BOOT-INF/lib/*': Defines the classpath for the analysis, instructing "jdeps" to look in the "BOOT-INF/lib" directory within the JAR file. app.jar > modules.txt: Redirects the output of the "jdeps" command to a file named "modules.txt," which will contain the list of Java modules required by the application. Then, we replace the ALL-MODULE-PATH value for —add-modules jlink parameter with $(cat modules.txt) to include only necessary modules # Define your base image section stays the same as in the previous Dockerfile. # Continue with your application deployment was modified to COPY out JAR file from the previous stage. The only thing left to do is to see how much the container image has shrunk using our latest Dockerfile: Dockerfile docker build -t spring-pet-clinic/jlink_jdeps -f Dockerfile_jdeps . docker image ls spring-pet-clinic/jlink_jdeps # REPOSITORY TAG IMAGE ID CREATED SIZE # spring-pet-clinic/jlink_jdeps latest d24240594f1e 3 hours ago 184MB So, by using only the modules we need to run our application, we reduced the size of our container image by 33MB, not a lot, but still nice. Conclusion Let's take another look, using Dive, at how our Docker images have shrunk after our optimizations. Instead of using the entire JDK, in this case, we built our custom JRE using jlink tool and using debian-slim base image. Which significantly reduced our image size. And, as you can see, we don’t have unnecessary stuff, such as timezones, locales, big OS, and entire JDK. We include only what we use and need. Dockerfile_jlink Here, we went even further and passed only used Java modules to our JRE, making the built JRE even smaller, thus reducing the size of the entire final image. 
Dockerfile_jdeps In conclusion, reducing the size of JVM Docker images can significantly optimize resource usage and speed up deployments. Employing techniques like multi-stage builds, jlink, jdeps, and experimenting with base images can make a substantial difference. While the size reduction might seem minimal in some cases, the cumulative effect can be significant, especially in environments where multiple containers are running. Thus, optimizing Docker images should be a key consideration in any application development and deployment process.
Hello everyone! In this article, I want to share my knowledge and opinion about the data types that are often used as an identifier. Today we will touch on two topics at once. These are measurements of search speed by key and data types for the key on the database side. I will use a PostgreSQL database and a demo Java service to compare query speeds. UUID and ULID Why do we need some kind of incomprehensible types for IDs? I won’t talk about distributed systems, connectivity of services, sensitive data, and the like. If someone is interested in this, they can Google it - at the moment we are interested in performance. As the name suggests, we will talk about two types of keys: UUID and ULID. UUID has long been known to everyone, but ULID may be unfamiliar to some. The main advantage of ULID is that it is monotonically increasing and is a sortable type. Naturally, these are not all the differences. Personally, I also like the fact that there are no special characters in it. A small digression, I noticed a long time ago that many teams use the varchar(36) data type to store UUID in the PostgreSQL database and I don’t like this, since this database has a corresponding data type for UUID. A little later, we will see which type is preferable on the database side. Therefore, we will look not only at a comparison of the two data types on the backend side but also at the difference when storing UUID in different formats on the database side. Comparison So let's start comparing things. The UUID is 36 characters long and takes up 128 bits of memory. The ULID is 26 characters long and also takes up 128 bits of memory. For my examples, I created two tables in the database with three fields: SQL CREATE TABLE test.speed_ulid ( id varchar(26) PRIMARY KEY, name varchar(50), created timestamp ); CREATE TABLE test.speed_uuid ( id varchar(36) PRIMARY KEY, name varchar(50), created timestamp ); For the first comparison, I stored the UUID in varchar(36) format, as is often done. In the database, I recorded 1,000,000 in each of the tables. The test case will consist of 100 requests using identifiers previously pulled from the database; that is, when calling the test method, we will access the database 100 times and retrieve the entity by key. The connection will be created and warmed up before measurement. We will conduct two test runs and then 10 effective iterations. For your convenience, I will provide a link to the Java code at the end of the article. Sorry, but the measurements were taken on a standard MacBook Pro laptop and not on a dedicated server, but I don't believe there will be a significant difference in the results other than increased time spent on network traffic between the database and the backend. Here is some background information: # CPU I9-9980HK # CPU count: 16 # RAM: 32GB # JMH version: 1.37 # VM version: JDK 11.0.12, Java HotSpot(TM) 64-Bit Server VM, 11.0.12+8-LTS-237 # DB: PostgreSQL 13.4, build 1914, 64-bit Queries that will be used to obtain an entity by key: SQL SELECT * FROM test.speed_ulid where id = ? SELECT * FROM test.speed_uuid where id = ? Measurement Results Let's look at the measurement results. Let me remind you that each table has 1,000,000 rows. Both Types of Identifiers Are Stored in the Database as varchar I ran this test several times, and the result was about the same: either the ULID was a little faster, or the UUID. In percentage terms, the difference is practically zero. Well, you can disagree that there is no difference between these types. 
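For reference, this is roughly how the two identifier types are produced on the Java side. The UUID comes straight from the JDK; for the ULID I am assuming a third-party library here (ulid-creator), since the JDK has no built-in ULID type - the article's own code is linked at the end.
Java
import java.util.UUID;
import com.github.f4b6a3.ulid.Ulid;
import com.github.f4b6a3.ulid.UlidCreator;

// Illustrative only: ulid-creator is an assumed dependency, not necessarily
// the library used in the benchmarked project.
public class IdGenerationExample {
    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();      // 36 characters in text form, 128 bits of data
        Ulid ulid = UlidCreator.getUlid();  // 26 characters in text form, 128 bits, sortable by creation time
        System.out.println(uuid + " -> " + uuid.toString().length() + " chars");
        System.out.println(ulid + " -> " + ulid.toString().length() + " chars");
    }
}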
But we have not yet exhausted the data types available on the database side. UUID as uuid, ULID as varchar in DB For the next test, I changed the data type from varchar(36) to uuid in the test.speed_uuid table. In this case, the difference is obvious: 4.5% in favor of UUID. As you can see, it makes sense to use the uuid data type on the database side in the case of a type of the same name on the service side. The index for this format is very well optimized in PostgreSQL and shows good results. Well, now we can definitely part ways. Or not? If you look at the index search query plan, you can see the following ((id)::text = '01HEE5PD6HPWMBNF7ZZRF8CD9R'::text) in the case when we use varchar. In general, comparing two text variables is a rather slow operation, so maybe there is no need to store the ID in this format. Or are there other ways to speed up key comparison? First, let's create another index of the kind “hash” for the table with ULID. SQL create index speed_ulid_id_index on test.speed_ulid using hash (id); Let's look at the execution plan for our query: We will see that the database uses a hash index, and not a btree, in this case. Let's run our test and see what happens. varchar + index(hash) for ULID, uuid for UUID This combination gave an increase of 2.3% relative to uuid with its highly optimized native index. I'm not sure that keeping two indexes on one field can be justified, so it's worth considering whether there's more you can do. And here it's worth looking into the past and remembering how uuid or other string identifiers used to be stored. That's right: either as text or as a byte array. So let's try this option: I removed all the indexes for the ULID, cast it to bytea, and recreated the primary key. bytea for ULID, uuid for UUID As a result, we got approximately the same result as in the previous run with an additional index, but I personally like this option better. Measurement result with 2,000,000 rows in the database: Measurement result with 3,000,000 rows in the database: I think there is no point in continuing measurements further. The pattern remains: ULID saved as bytea slightly outperforms UUID saved as uuid in the DB. If we take the data from the first measurements, it is obvious that with a few small manipulations, you can increase performance by about 9% compared to storing everything as varchar. So, if you have read this far, I assume the article was interesting to you and you have already drawn some conclusions for yourself. It is worth noting that the measurements were made under ideal conditions for both the backend part and the database. We did not have any parallel processes running that write to the database, change records, or perform complex calculations on the back-end side. Conclusions Let's go over the material. What did you learn that was useful? Do not neglect the uuid data type on the PostgreSQL side. Perhaps someday extensions for ULID will appear in this database, but for now, we have what we have. Sometimes it is worth creating an additional index of the desired type manually, but there is an overhead to consider. If you are not afraid of extra work - namely, writing your own converters for types - then you should try bytea if there is no corresponding type for your identifier on the database side. What type of data should be used for the primary key, and in what format should it be stored? I don't have a definite answer to these questions: it all depends on many factors.
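Since the bytea option means writing your own converters, here is a minimal JDK-only sketch of what such a converter boils down to: packing the identifier's 128 bits into a 16-byte array (shown with java.util.UUID; the same idea applies to a ULID once you have its two 64-bit halves).
Java
import java.nio.ByteBuffer;
import java.util.UUID;

public class IdToBytesExample {
    // Packs the two 64-bit halves of a 128-bit identifier into 16 bytes,
    // suitable for a bytea primary key column.
    static byte[] toBytes(long mostSignificantBits, long leastSignificantBits) {
        return ByteBuffer.allocate(16)
                .putLong(mostSignificantBits)
                .putLong(leastSignificantBits)
                .array();
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        byte[] key = toBytes(id.getMostSignificantBits(), id.getLeastSignificantBits());
        System.out.println("bytea key length: " + key.length); // 16 bytes instead of 36 characters
        // With plain JDBC this would be bound as: preparedStatement.setBytes(1, key);
    }
}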
It is also worth noting that a competent choice of data type for ID, and not only for it, can at some point play an important role in your project. I hope this article was useful to you. Good luck! Project on GitHub
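For readers who want to reproduce the setup, the measurement harness described above (two warm-up runs, ten effective iterations, 100 lookups per call) corresponds to a JMH skeleton roughly like the following; the benchmark bodies are left as placeholders, and the complete code lives in the linked project.
Java
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 2)
@Measurement(iterations = 10)
@Fork(1)
public class IdLookupBenchmark {

    // A warmed-up JDBC connection and the 100 pre-fetched identifiers
    // would be prepared in a @Setup method here.

    @Benchmark
    public void findByUlid() {
        // 100 x SELECT * FROM test.speed_ulid WHERE id = ?
    }

    @Benchmark
    public void findByUuid() {
        // 100 x SELECT * FROM test.speed_uuid WHERE id = ?
    }
}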
In contemporary web development, a recurring challenge revolves around harmonizing the convenience and simplicity of using a database with a web application. My name is Viacheslav Aksenov, and in this article, I aim to explore several of the most popular approaches for integrating databases and web applications within the Kubernetes ecosystem. These examples are examined within the context of a testing environment, where constraints are more relaxed. However, these practices can serve as a foundation applicable to production environments as well. One Service, One Database. Why? Running a database alongside a microservice aligns with the principles outlined in the Twelve-Factor App methodology. One key factor is "Backing Services" (Factor III), which suggests treating databases, message queues, and other services as attached resources to be attached or detached seamlessly. By co-locating the database with the microservice, we adhere to the principle of having a single codebase that includes the application and its dependencies, making it easier to manage, scale, and deploy. Additionally, it promotes encapsulation and modularity, allowing the microservice to be self-contained and portable across different environments, following the principles of the Twelve-Factor App. This approach enhances the maintainability and scalability of the entire application architecture. For this task, you can leverage various tools, and one example is using KubeDB. What Is KubeDB? KubeDB is an open-source project that provides a database management framework for Kubernetes, an open-source container orchestration platform. KubeDB simplifies the deployment, management, and scaling of various database systems within Kubernetes clusters. We used the following benefits from using this tool: Database operators: Postgres operator to simplify the process of deploying and managing database instances on Kubernetes. Monitoring and alerts: KubeDB integrates with monitoring and alerting tools like Prometheus and Grafana, enabling you to keep an eye on the health and performance of your database instances. Security: KubeDB helps you set up secure access to your databases using authentication mechanisms and secrets management. And it is very easy to set up the deployment. 
deployment.yaml: YAML apiVersion: kubedb.com/v1alpha2 kind: PostgreSQL metadata: name: your-postgresql spec: version: "11" storageType: Durable storage: storageClassName: <YOUR_STORAGE_CLASS> accessModes: - ReadWriteOnce resources: requests: storage: 1Gi terminationPolicy: WipeOut databaseSecret: secretName: your-postgresql-secret databaseURLFromSecret: true replicas: 1 users: - name: <YOUR_DB_USER> passwordSecret: secretName: your-postgresql-secret passwordKey: password databaseName: <YOUR_DB_NAME> Then, you can use the credentials and properties of this database to connect your service's pod to it with deployment.yaml: YAML apiVersion: apps/v1 kind: Deployment metadata: name: your-microservice spec: replicas: 1 selector: matchLabels: app: your-microservice template: metadata: labels: app: your-microservice spec: containers: - name: your-microservice-container image: your-microservice-image:tag ports: - containerPort: 80 env: - name: DATABASE_URL value: "postgres://<YOUR_DB_USER>:<YOUR_DB_PASSWORD>@<YOUR_DB_HOST>:<YOUR_DB_PORT>/<YOUR_DB_NAME>" --- apiVersion: v1 kind: Service metadata: name: your-microservice-service spec: selector: app: your-microservice ports: - protocol: TCP port: 80 targetPort: 80 And if, for some reason, you are not ready to use KubeDB or don't require the full functional of their product, you can use the Postgresql container as a sidecar for your test environment. Postgres Container as a Sidecar In the context of Kubernetes and databases like PostgreSQL, a sidecar is a separate container that runs alongside the main application container within a pod. The sidecar pattern is commonly used to enhance or extend the functionality of the main application container without directly impacting its core logic. Let's see the example of a configuration for a small Spring Boot Kotlin service that handles cat names. deployment.yaml: YAML apiVersion: apps/v1 kind: Deployment metadata: name: cat-svc labels: app: cat-svc spec: replicas: 1 selector: matchLabels: app: cat-svc template: metadata: labels: app: cat-svc type: http spec: containers: - name: cat-svc image: cat-svc:0.0.1 ports: - name: http containerPort: 8080 protocol: TCP readinessProbe: httpGet: path: /actuator/health port: 8080 initialDelaySeconds: 30 timeoutSeconds: 10 periodSeconds: 10 livenessProbe: httpGet: path: /actuator/health port: 8080 initialDelaySeconds: 60 timeoutSeconds: 10 periodSeconds: 30 env: - name: PLACES_DATABASE value: localhost:5432/cats - name: POSTGRES_USER value: pwd - name: POSTGRES_PASSWORD value: postgres - name: cat-postgres image: postgres:11.1 ports: - name: http containerPort: 5432 protocol: TCP env: - name: POSTGRES_USER value: pwd - name: POSTGRES_PASSWORD value: postgres - name: POSTGRES_DB value: cats Dockerfile FROM gradle:8.3.0-jdk17 COPY . . EXPOSE 8080 CMD ["gradle", "bootRun"] And for local run, it is possible to use docker-compose with the following configuration. 
docker-compose.yaml: YAML version: '3.8' services: cat-postgres: image: postgres:12.13 restart: always ports: - "5432:5432" environment: POSTGRES_PASSWORD: postgres POSTGRES_USER: postgres POSTGRES_DB: cats # volumes: # - ./init.sql:/docker-entrypoint-initdb.d/create_tables.sql - if you want to run any script before an app # - ./db-data/:/var/lib/postgresql/data/ service: image: cat-svc:0.0.1 restart: always ports: - '8080:8080' environment: SPRING_PROFILES_ACTIVE: prod PLACES_DATABASE: cat-postgres:5432/cats POSTGRES_PASSWORD: postgres POSTGRES_USER: postgres Migrations The big thing that has to be decided before using this approach is the migration question. The best option in this approach is to delegate the migration process to any tool that can work within your app infrastructure. For example, for Java World, you could use Flyway or Liquibase. Flyway is a popular open-source database migration tool. It allows you to version control your database schema and apply changes in a structured manner. Flyway supports multiple databases, including PostgreSQL, MySQL, and Oracle. Liquibase is an open-source database migration tool that supports tracking, managing, and applying database changes. It provides a way to define database changes using XML, YAML, or SQL, and it supports various databases. Pros of Using a PostgreSQL Sidecar in Kubernetes Separation of concerns: Sidecars allow you to separate specific functionalities (e.g., database migrations, backups) from the main application logic. Сompliance with microservice architecture. Simplified deployment: Sidecars can be deployed and managed alongside the main application using the same deployment configurations, simplifying the overall deployment process. You don't need to support separated database for testing the environment. And it leads to decreasing the complexity of tests (you don't need to think about collisions while you are running many CI with tests for the same table) Cons of Using a PostgreSQL Sidecar in Kubernetes Resource overhead: Running additional containers consumes resources (CPU, memory) on the node, which may impact the overall performance and resource utilization of the Kubernetes cluster. It's best to use as few resources as possible. Startup order: The main application may become dependent on the sidecar for certain functionalities, potentially leading to issues if there are discrepancies or version mismatches between the main application and the sidecar. Arranging containers in a specific order without additional configuration can be somewhat challenging. However, this shouldn't pose a problem in test environments due to the quick startup of the PostgreSQL container. In most scenarios, the PostgreSQL container will initiate before any of your business applications. Even if the application attempts to run before PostgreSQL is ready, it will encounter a failure and be automatically restarted by the default Kubernetes mechanism until the database becomes available. Learning curve: Adopting the sidecar pattern may require a learning curve for development and operations teams, particularly if they are new to the concept of containerized sidecar architectures. Once the setup is complete, new team members should encounter no issues with this approach. Conclusion In conclusion, the choice between using KubDB and the PostgreSQL sidecar approach for integrating web applications and databases in a test environment ultimately depends on your specific requirements and preferences. 
KubeDB offers a comprehensive solution with Kubernetes-native features, streamlining the management of databases alongside web services. On the other hand, the PostgreSQL sidecar approach provides flexibility and fine-grained control over how databases and web applications interact. Whether you opt for the simplicity and seamless integration provided by KubeDB or the customization potential inherent in the sidecar pattern, both approaches lay a solid foundation for test environments. The key lies in understanding the unique demands of your project and selecting the method that aligns best with your development workflow, scalability needs, and overall application architecture. Whichever path you choose, the insights gained from exploring these approaches in a test setting can pave the way for a robust and efficient integration strategy in your production environment.
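As a footnote to the Migrations section above: if Flyway is the chosen tool, the programmatic bootstrap is small enough to run at service startup. Below is a minimal sketch, with the JDBC URL and credentials as placeholders loosely matching the cats sidecar example rather than the demo service's exact configuration.
Java
import org.flywaydb.core.Flyway;

public class MigrateOnStartup {
    public static void main(String[] args) {
        // Reads versioned SQL migrations from classpath:db/migration (Flyway's default)
        // and applies any that the sidecar database has not seen yet.
        Flyway flyway = Flyway.configure()
                .dataSource("jdbc:postgresql://localhost:5432/cats", "postgres", "postgres")
                .load();
        flyway.migrate();
    }
}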
In today's data-driven landscape, organizations are increasingly turning to robust solutions like AWS Data Lake to centralize vast amounts of structured and unstructured data. AWS Data Lake, a scalable and secure repository, allows businesses to store data in its native format, facilitating diverse analytics and machine learning tasks. One of the popular tools to query this vast reservoir of information is Amazon Athena, a serverless, interactive query service that makes it easy to analyze data directly in Amazon S3 using standard SQL. However, as the volume of data grows exponentially, performance challenges can emerge. Large datasets, complex queries, and suboptimal table structures can lead to increased query times and costs, potentially undermining the very benefits that these solutions promise. This article delves specifically into the details of how to harness the power of partition projections to address these performance challenges. Before diving into the advanced concept of partition projections in Athena, it's essential to grasp the foundational idea of partitions, especially in the context of a data lake. What Are Partitions in AWS Data Lake? In the realm of data storage and retrieval, a partition refers to a division of a table's data based on the values of one or more columns. Think of it as organizing a vast bookshelf (your data) into different sections (partitions) based on genres (column values). By doing so, when you're looking for a specific type of book (data), you only need to search in the relevant section (partition) rather than the entire bookshelf. In a data lake, partitions are typically directories that contain data files. Each directory corresponds to a specific value or range of values from the partitioning column(s). Why Are Partitions Important? Efficiency: Without partitions, querying vast datasets would involve scanning every single file, which is both time-consuming and costly. With partitions, only the relevant directories are scanned, significantly reducing the amount of data processed. Cost Savings: In cloud environments like AWS, where you pay for the amount of data scanned, partitions can lead to substantial cost reductions. Scalability: As data grows, so does the importance of partitions. They ensure that even as your data lake swells with more data, retrieval times remain manageable. Challenges With Partitions While partitions offer numerous benefits, they aren't without challenges: Maintenance: As new data comes in, new partitions might need to be created, and existing ones might need updates. Optimal Partitioning: Too few partitions can mean you're still scanning a lot of unnecessary data. Conversely, too many partitions can lead to a large number of small files, which can also degrade performance. With this foundational understanding of partitions in a data lake, we can now delve deeper into the concept of partition projections in Athena and how they aim to address some of these challenges. What Are Partition Projections? Partition pruning is a technique where only the relevant metadata, specific to a query, is selected, eliminating unnecessary data. This method often makes queries run faster. Athena employs this strategy for all tables that have partitioned columns. In a typical scenario, when Athena processes queries, it first communicates with the AWS Glue Data Catalog by making a GetPartitions request, after which it performs partition pruning. However, if a table has an extensive set of partitions, this call can slow things down. 
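To get a feel for how much partition metadata is involved, you can list a table's partitions straight from the Glue Data Catalog. Here is a sketch using the AWS SDK for Java v2; the database and table names are placeholders, and the request details should be treated as illustrative.
Java
import software.amazon.awssdk.services.glue.GlueClient;
import software.amazon.awssdk.services.glue.model.GetPartitionsRequest;
import software.amazon.awssdk.services.glue.model.GetPartitionsResponse;

public class CountPartitions {
    public static void main(String[] args) {
        try (GlueClient glue = GlueClient.create()) {
            int total = 0;
            String token = null;
            do {
                GetPartitionsResponse page = glue.getPartitions(GetPartitionsRequest.builder()
                        .databaseName("analytic_test")   // placeholder database name
                        .tableName("customer_events")    // placeholder table name
                        .nextToken(token)
                        .build());
                total += page.partitions().size();
                token = page.nextToken();
            } while (token != null);
            // Every entry counted here is metadata Athena would otherwise have to
            // fetch via GetPartitions before it can start pruning.
            System.out.println("Partitions registered in Glue: " + total);
        }
    }
}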
To avoid this expensive operation on a highly partitioned table, AWS has introduced the technique of partition projections. With partition projection, Athena doesn't need to make the GetPartitions call. Instead, the configuration provided in partition projection equips Athena with all it needs to create the partitions on its own. Benefits of Partition Projections Improved Query Performance: By reducing the amount of data scanned, queries run faster and more efficiently. Reduced Costs: With Athena, you pay for the data you scan. By scanning less data, costs are minimized. Simplified Data Management: Virtual partitions eliminate the need for continuous partition maintenance tasks, such as adding new partitions when new data arrives. Setting Up Partition Projections To utilize partition projections: 1. Define Projection Types: Athena supports several projection types, including `integer`, `enum`, `date`, and `injected`. Each type serves a specific use case, like generating a range of integers or dates. 2. Specify Projection Configuration: This involves defining the rules and patterns for your projections. For instance, for a date projection, you'd specify the start date, end date, and the date format. 3. Modify Table Properties: Once projections are defined, modify your table properties in Athena to use these projections. An Example Use-Case Let us take an example where our data is stored in the data lake and is partitioned by customer_id and dt. The data is stored in Parquet format, which is a columnar data format. s3://my-bucket/data/customer_id/yyyy-MM-dd/*.parquet In our example, let us have data for one year, i.e., 365 days and 100 customers. This would result in 365*100=36,500 partitions in the data lake. Let us benchmark the queries on this table with and without partition projections enabled. Let us get the count of all the records for the entire year for five customers. Query SQL SELECT count(*) FROM "analytic_test"."customer_events" where dt >= '2022-01-01' and customer_id IN ('Customer_001', 'Customer_002', 'Customer_003', 'Customer_004', 'Customer_005') Without Partition Projection Without partition projections enabled, the total query runtime is 7.3 seconds. Out of that, it spends 78% in planning and 20% executing the query. Query Results Planning: 78% = 5.6 seconds; Execution: 20% = 1.46 seconds With Partition Projections Now, let us enable partition projection for this table. Take a look at all the table properties that are prefixed with "projection.". In this example, since we have two partition columns, dt and customer_id, we will use a date type projection for dt and an enum type projection for customer_id. For enum types, you can build an automation job to update the table property whenever there are newer records for it. 
SQL CREATE EXTERNAL TABLE `customer_events`( `event_id` bigint COMMENT '', `event_text` string COMMENT '') PARTITIONED BY ( `customer_id` string COMMENT '', `dt` string COMMENT '') ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' LOCATION 's3://my-bucket/data/events.customer_events' TBLPROPERTIES ( 'has_encrypted_data'='false', 'parquet.compression'='SNAPPY', 'transient_lastDdlTime'='1698737433', 'projection.enabled'='true', 'projection.dt.type'='date', 'projection.dt.range'='NOW-1YEARS,NOW', 'projection.dt.format'='yyyy-MM-dd', 'projection.dt.interval'='1', 'projection.dt.interval.unit'='DAYS', 'projection.customer_id.type'='enum', 'projection.customer_id.values'='Customer_001,Customer_002,Customer_003,Customer_004,Customer_005,Customer_006,Customer_007,Customer_008,Customer_009,Customer_010,Customer_011,Customer_012,Customer_013,Customer_014,Customer_015,Customer_016,Customer_017,Customer_018,Customer_019,Customer_020,Customer_021,Customer_022,Customer_023,Customer_024,Customer_025,Customer_026,Customer_027,Customer_028,Customer_029,Customer_030,Customer_031,Customer_032,Customer_033,Customer_034,Customer_035,Customer_036,Customer_037,Customer_038,Customer_039,Customer_040,Customer_041,Customer_042,Customer_043,Customer_044,Customer_045,Customer_046,Customer_047,Customer_048,Customer_049,Customer_050,Customer_051,Customer_052,Customer_053,Customer_054,Customer_055,Customer_056,Customer_057,Customer_058,Customer_059,Customer_060,Customer_061,Customer_062,Customer_063,Customer_064,Customer_065,Customer_066,Customer_067,Customer_068,Customer_069,Customer_070,Customer_071,Customer_072,Customer_073,Customer_074,Customer_075,Customer_076,Customer_077,Customer_078,Customer_079,Customer_080,Customer_081,Customer_082,Customer_083,Customer_084,Customer_085,Customer_086,Customer_087,Customer_088,Customer_089,Customer_090,Customer_091,Customer_092,Customer_093,Customer_094,Customer_095,Customer_096,Customer_097,Customer_098,Customer_099,Customer_100') Query results Planning: 1.69 secondsExecution: 0.6 seconds Results We can see a roughly 70% improvement in the query performance. This is because Athena avoids a remote call to AWS glue to fetch the partitions, as with this feature, it is able to project the values for these partitions. Limitations and Considerations While powerful, partition projections do not solve all the problems. Complex Setups: Setting up projections requires a deep understanding of your data and the patterns it follows. Not Always the Best Fit: For datasets that don't follow predictable patterns or have irregular updates, traditional partitioning might be more suitable. Conclusion AWS's introduction of Partition Projections in Athena is a testament to their commitment to improving user experience and efficiency. By leveraging this feature, organizations can achieve faster query performance with minimal configuration changes. As with all tools, understanding its strengths and limitations is key to harnessing its full potential.
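If you want to reproduce the planning-versus-execution split outside the Athena console, the same statistics are exposed through the API. Below is a sketch with the AWS SDK for Java v2; the output location, database, and query text are placeholders based on the example above.
Java
import software.amazon.awssdk.services.athena.AthenaClient;
import software.amazon.awssdk.services.athena.model.GetQueryExecutionRequest;
import software.amazon.awssdk.services.athena.model.QueryExecution;
import software.amazon.awssdk.services.athena.model.QueryExecutionContext;
import software.amazon.awssdk.services.athena.model.QueryExecutionState;
import software.amazon.awssdk.services.athena.model.QueryExecutionStatistics;
import software.amazon.awssdk.services.athena.model.ResultConfiguration;
import software.amazon.awssdk.services.athena.model.StartQueryExecutionRequest;

public class QueryTimingExample {
    public static void main(String[] args) throws InterruptedException {
        String sql = "SELECT count(*) FROM customer_events "
                + "WHERE dt >= '2022-01-01' AND customer_id IN ('Customer_001', 'Customer_002')";
        try (AthenaClient athena = AthenaClient.create()) {
            String queryId = athena.startQueryExecution(StartQueryExecutionRequest.builder()
                    .queryString(sql)
                    .queryExecutionContext(QueryExecutionContext.builder().database("analytic_test").build())
                    .resultConfiguration(ResultConfiguration.builder()
                            .outputLocation("s3://my-bucket/athena-results/")  // placeholder output location
                            .build())
                    .build()).queryExecutionId();

            QueryExecution execution;
            do {
                Thread.sleep(500);  // simple polling; production code would back off and time out
                execution = athena.getQueryExecution(GetQueryExecutionRequest.builder()
                        .queryExecutionId(queryId).build()).queryExecution();
            } while (execution.status().state() == QueryExecutionState.QUEUED
                    || execution.status().state() == QueryExecutionState.RUNNING);

            QueryExecutionStatistics stats = execution.statistics();
            System.out.println("Planning (ms):  " + stats.queryPlanningTimeInMillis());
            System.out.println("Execution (ms): " + stats.engineExecutionTimeInMillis());
            System.out.println("Data scanned:   " + stats.dataScannedInBytes() + " bytes");
        }
    }
}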
In my previous article, I hinted at explaining how Ansible can be used to expose applications running inside a high availability K8s cluster to the outside world. This post will show how this can be achieved using a K8s ingress controller and load balancer. This example uses the same setup as last time around: virtual machines running under the default Windows Hypervisor (Hyper-V). To make room for the addition of a proxy, each VM had to give up some RAM. With the exception of the initial master and the Ansible runner, each of the remaining nodes received an allocation of 2000MB. A new version of the sample project is available at GitHub with a new playbook called k8s_boot.yml. This yaml boots up the entire cluster instead of having to run multiple playbooks one after the other. It configures the cluster according to the specification of the inventory file. The flow of execution can be better, but I changed the underlying playbooks as little as possible so readers of previous posts can still find their way. Since the architecture of this post might seem complex at first encounter, an architectural diagram is included towards the very bottom to clarify the landscape. Master and Commanders In the previous article, I alluded to the fact that a high availability cluster requires multiple co-masters to provide backup should the current master act up. We will start off by investigating how this redundancy is used to establish high availability. The moment a co-master loses comms with the master, it nominates itself to become the next master. Each of the remaining masters then has to acknowledge its claim upon receiving news of its candidacy. However, another co-master can also notice the absence of the current master before receiving word of a candidacy and nominate itself. Should 50% of the vote be the requirement to assume control, it is possible for two control planes to each attract 50% and think itself the master. Such a cluster will go split-brained, with two masters orchestrating a bunch of very confused worker nodes. For this reason, K8s implements the Raft protocol, which follows the typical requirement that a candidate should receive a quorum of 50%+1 before it gains the respect to boss all and sundry. Consequently, a high availability K8s cluster should always comprise an odd number of masters. For the project, this means that the inventory should always contain an even number of co-masters, with the initial master then making the total odd. The bootup playbook imports the older k8s_comasters.yml playbook into its execution to prepare and execute the well-known "kubeadm join" command on each of the co-masters: kubeadm join k8scp:6443 --token 9ei28c.b496t8c4vbjea94h --discovery-token-ca-cert-hash sha256:3ae7abefa454d33e9339050bb26dcf3a31dc82f84ab91b2b40e3649cbf244076 --control-plane --certificate-key 5d89284dee1717d0eff2b987f090421fb6b077c07cf21691089a369781038c7b Joining worker nodes to the cluster uses a similar join command but omits the --control-plane switch, as can be seen in k8s_workers.yml, also imported during bootup. After running the bootup playbook, the cluster will comprise both control-plane and worker nodes: Control At All Times At this point in time, all nodes refer to the original master by hostname, as can be seen from the "kubeadm init" command that starts the first master: kubeadm init --pod-network-cidr 10.244.0.0/16 --control-plane-endpoint k8scp:6443 --upload-certs Clearly, this node is currently the single point of failure of the cluster. 
Should it fall away, the cluster's nodes will lose contact with each other. The Ansible scripts mitigate for this by installing the kube config to all masters so kubectl commands can be run from any master by such designated user. Changing the DNS entry to map k8scp to one of the other control planes will hence restore service. While this is easy to do using the host file, additional complexities can arise when using proper DNS servers. Kubernetes orthodoxy, consequently, has that a load balancer should be put in front of the cluster to spread traffic across each of the master nodes. A control plane that falls out will be removed from the duty roster by the proxy. None will be the wiser. HAProxy fulfills this role perfectly. The Ansible tasks that make this happen are: - name: Install HAProxy become: true ansible.builtin.apt: name: haproxy=2.0.31-0ubuntu0.2 state: present - name: Replace line in haproxy.cfg1. become: true lineinfile: dest: /etc/haproxy/haproxy.cfg regexp: 'httplog' line: " option tcplog" - name: Replace line in haproxy.cfg2. become: true lineinfile: dest: /etc/haproxy/haproxy.cfg regexp: 'mode' line: " mode tcp" - name: Add block to haproxy.cfg1 become: true ansible.builtin.blockinfile: backup: false path: /etc/haproxy/haproxy.cfg block: |- frontend proxynode bind *:80 bind *:6443 stats uri /proxystats default_backend k8sServers backend k8sServers balance roundrobin server cp {{ hostvars['host1']['ansible_host'] }:6443 check {% for item in comaster_names -%} server {{ item } {{ hostvars[ item ]['ansible_host'] }:6443 check {% endfor -%} listen stats bind :9999 mode http stats enable stats hide-version stats uri /stats - name: (Re)Start HAProxy service become: true ansible.builtin.service: name: haproxy enabled: true state: restarted The execution of this series of tasks is triggered by the addition of a dedicated server to host HAProxy to the inventory file. Apart from installing and registering HAProxy as a system daemon, this snippet ensures that all control-plane endpoints are added to the duty roster. Not shown here is that the DNS name (k8scp) used in the "kubeadm join" command above is mapped to the IP address of the HAProxy during bootup. Availability and Accessibility Up to this point, everything we have seen constitutes the overhead required for high-availability orchestration. All that remains is to do a business Deployment and expose a K8s service to track its pods on whichever node they may be scheduled on: kubectl create deployment demo --image=httpd --port=80 kubectl expose deployment demo Let us scale this deployment to two pods, each running an instance of the Apache web server: This two-pod deployment is fronted by the demo Service. The other Service (kubernetes) is automatically created and allows access to the API server of the control plane. In a previous DZone article, I explained how this API can be used for service discovery. Both services are of type ClusterIP. This is a type of load balancer, but its backing httpd pods will only be accessible from within the cluster, as can be seen from the absence of an external ip. Kubernetes provides various other service types, such as NodePort and LoadBalancer, to open up pods and containers for outside access. A NodePort opens up access to the service on each Node. Although it is possible for clients to juggle IP addresses should a node fall out, the better way is to use a LoadBalancer. Unfortunately, Kubernetes does not provide an instance as it is typically provided by cloud providers. 
Similarly, an on-premise or bare-metal cluster has to find and run its own one. Alternatively, its clients have to make do as best they can by using NodePorts or implementing its own discovery mechanism. We will follow the first approach by using MetalLB to slot K8s load balancing into our high availability cluster. This is a good solution, but it is not the best solution. Since every K8s deployment will be exposed behind its own LoadBalancer/Service, clients calling multiple services within the same cluster will have to register the details of multiple load balancers. Kubernetes provides the Ingress API type to counter this. It enables clients to request service using the HTTP(S) routing rules of the Ingress, much the way a proxy does it. Enough theory! It is time to see how Ansible can declare the presence of an Ingress Controller and LoadBalancer: - hosts: masters gather_facts: yes connection: ssh vars_prompt: - name: "metal_lb_range" prompt: "Enter the IP range from which the load balancer IP can be assigned?" private: no default: 192.168.68.200-192.168.69.210 tasks: - name: Installing Nginx Ingress Controller become_user: "{{ ansible_user }" become_method: sudo # become: yes command: kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.0.5/deploy/static/provider/cloud/deploy.yaml run_once: true - name: Delete ValidatingWebhookConfiguration become_user: "{{ ansible_user }" become_method: sudo # become: yes command: kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission run_once: true - name: Install Metallb1. become_user: "{{ ansible_user }" become_method: sudo become: yes shell: 'kubectl -n kube-system get configmap kube-proxy -o yaml > /home/{{ ansible_user }/kube-proxy.yml' - name: Install Metallb2. become_user: "{{ ansible_user }" become_method: sudo become: yes command: kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.11/config/manifests/metallb-native.yaml - name: Prepare L2Advertisement. become_user: "{{ ansible_user }" become_method: sudo copy: dest: "~/l2advertisement.yml" content: | apiVersion: metallb.io/v1beta1 kind: L2Advertisement metadata: name: example namespace: metallb-system - name: Prepare address pool. become_user: "{{ ansible_user }" become_method: sudo copy: dest: "~/address-pool.yml" content: | apiVersion: metallb.io/v1beta1 kind: IPAddressPool metadata: name: first-pool namespace: metallb-system spec: addresses: - {{ metal_lb_range } - pause: seconds=30 - name: Load address pool become_user: "{{ ansible_user }" become_method: sudo command: kubectl apply -f ~/address-pool.yml - name: Load L2Advertisement become_user: "{{ ansible_user }" become_method: sudo command: kubectl apply -f ~/l2advertisement.yml ... First off, it asks for a range of IP addresses that are available for use by the LoadBalancers. It subsequently installs the Nginx Ingress Controller and, lastly, MetallLB to load balance behind the Ingress. MetalLB uses either the ARP (IPv4)/NDP(IPv6) or the BGP to announce the MAC address of the network adaptor. Its pods attract traffic to the network. BGP is probably better as it has multiple MetalLB speaker pods announcing. This might make for a more stable cluster should a node fall out. ARP/NDP only has one speaker attracting traffic. This causes a slight unresponsiveness should the master speaker fail and another speaker has to be elected. ARP is configured above because I do not have access to a router with a known ASN that can be tied into BGP. 
Next, we prepare to boot the cluster by designating co-masters and an HAProxy instance in the inventory. Lastly, booting with the k8s_boot.yml playbook ensures the cluster topology as declared in the inventory file is enacted: Each node in the cluster has one MetalLB speaker pod responsible for attracting traffic. As stated above, only one will associate one of the available IP addresses with its Mac address when using ARP. The identity of this live wire can be seen at the very bottom of the Ingress Controller service description: Availability in Action We can now test cluster stability. The first thing to do is to install an Ingress: kubectl create ingress demo --class=nginx --rule="www.demo.io/*=demo:80" Browse the URL, and you should see one of the Apache instances returning a page stating: "It works!": This IP address spoofing is pure magic. It routes www.demo.io to the Apache web server without it being defined using a DNS entry outside the cluster. The Ingress can be interrogated from kubectl: One sees that it can be accessed on one of the IP addresses entered during bootup. The same can also be confirmed using wget, the developer tools of any browser worth its salt, or by inspecting the ingress controller: Should the external IP remain in the pending state, Kubernetes could not provision the load balancers. The MetalLB site has a section that explains how to troubleshoot this. We confirmed that the happy case works, but does the web server regain responsiveness in case of failure? We start off by testing whether the IngressController is a single point of failure by switching the node where it ran: Kubernetes realized that the node was no longer in the cluster, terminated all the pods running on that cluster, and rescheduled them on the remaining worker node. This included the IngressController. The website went down for a while, but Kubernetes eventually recovered service. In other words, orchestration in action! Next up, we remove the MetalLB speaker by taking down the cluster where it runs: Another speaker will step up to the task! What about HAProxy? It runs outside the cluster. Surely, this is the single point of failure. Well... Yes and no. Yes, because one loses connection to the control planes: No, because all that is required is to map the IP address of k8scp from that of the HAProxy to that of one of the masters. The project has an admin playbook to do this. Run it and wait for the nodes to stabilize into a ready state. Ingress still routes, MetalLB still attracts, and httpd still serves: Due to the HAProxy being IAC, it is also no trouble to boot a new proxy and slot out the faulty/crashed one. The playbook used above to temporarily switch traffic to a master can also be used during such a proxy replacement. Unfortunately, this requires human interaction, but at least the human knows what to monitor with the utmost care and how to quickly recover the cluster. Final Architecture The final architecture is as follows: Note that all the MetalLB speakers work as a team to provide LoadBalancing for the Kubernetes Services and its Deployments. Conclusion There probably are other ways to install a high availability K8s cluster, but I like this double load balancer approach: HAProxy abstracts and encapsulates the redundancy of an unequal number of control planes, e.g., it ensures 99.9999% availability for cluster controlling commands coming from kubectl; MetalLB and Nginx Ingress Controller working together to track the scheduling of business pods. 
Keep in mind that the master can move a pod with its container(s) to any worker node depending on failure and resource availability. In other words, the MetalLB LoadBalancer ensures continuity of business logic in case of catastrophic node failure. In our sample, the etcd key-value store is located as part of the control-planes. This is called the stacked approach. The etcd store can also be removed from the control-planes and hosted inside its own nodes for increased stability. More on this here. Our K8s as Ansible project is shaping nicely for use as a local or play cloud. However, a few things are outstanding that one would expect in a cluster of industrial strength: Role based access control (RBAC); Service mesh to move security, observability, and reliability from the application into the platform; Availability zones in different locations, each with its one set of HAProxy, control-planes, and workers separated from each other using a service mesh; Secret management; Ansible lint needs to be run against the Ansible playbooks to identify bad and insecure practices requiring rectification; Choking incoming traffic when a high load of failure is experienced to allow business pods to continue service or recover gracefully. It should be noted, though, that nothing prevents one to add these to one's own cluster.
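To make the failover checks described above repeatable without a browser, a small JDK-only probe can poll the Ingress route once a second and log any gap in availability. This is a sketch; it assumes www.demo.io resolves to the MetalLB-assigned address, as set up earlier.
Java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.time.Instant;

public class IngressProbe {
    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://www.demo.io/"))
                .timeout(Duration.ofSeconds(2))
                .build();
        while (true) {
            Instant now = Instant.now();
            try {
                HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println(now + "  HTTP " + response.statusCode());
            } catch (Exception e) {
                // A run of these lines marks the window between a node failure and
                // Kubernetes rescheduling the Ingress Controller or MetalLB speaker.
                System.out.println(now + "  UNAVAILABLE: " + e.getMessage());
            }
            Thread.sleep(1000);
        }
    }
}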
One of our earlier blog posts discussed the initial steps for diving into Amazon Bedrock by leveraging the AWS Go SDK. Subsequently, our second blog post expanded upon this foundation, showcasing a Serverless Go application designed for image generation with Amazon Bedrock and AWS Lambda ("Generative AI Apps With Amazon Bedrock: Getting Started for Go Developers"). Amazon Bedrock is a fully managed service that makes base models from Amazon and third-party model providers (such as Anthropic, Cohere, and more) accessible through an API. The applications demonstrated in those blog posts accessed Amazon Bedrock APIs directly, thereby avoiding any additional layers of abstraction or frameworks/libraries. This approach is particularly effective for learning and crafting straightforward solutions. However, developing generative AI applications goes beyond simply using large language models (LLMs) via an API. You need to think about other parts of the solution, which include intelligent search (also known as semantic search, which often requires specialized data stores), orchestrating sequential workflows (e.g., invoking another LLM based on the previous LLM's response), loading data sources (text, PDFs, links, etc.) to provide additional context for LLMs, maintaining historical context (for conversational/chatbot/QA solutions), and much more. Implementing these features from scratch can be difficult and time-consuming. Enter LangChain, a framework that provides off-the-shelf components to make it easier to build applications with language models. It is supported in multiple programming languages. This obviously includes Python, but also JavaScript, Java, and Go. langchaingo is the LangChain implementation for the Go programming language. This blog post covers how to extend langchaingo to use foundation models from Amazon Bedrock. The code is available in this GitHub repository. LangChain Modules One of LangChain's strengths is its extensible architecture - the same applies to the langchaingo library as well. It supports components/modules, each with interface(s) and multiple implementations. Some of these include: Models: These are the building blocks that allow LangChain apps to work with multiple language models (such as ones from Amazon Bedrock, OpenAI, etc.). Chains: These can be used to create a sequence of calls that combine multiple models and prompts. Vector databases: They can store unstructured data in the form of vector embeddings. At query time, the unstructured query is embedded, and semantic/vector search is performed to retrieve the embedding vectors that are "most similar" to the embedded query. Memory: This module allows you to persist the state between chain or agent calls. By default, chains are stateless, meaning they process each incoming request independently (the same goes for LLMs). This provides ease of use, choice, and flexibility while building LangChain-powered Go applications. For example, you can change the underlying vector database by swapping the implementation with minimal code changes. Since langchaingo provides many large language model implementations, the same applies here as well. langchaingo Implementation for Amazon Bedrock As mentioned before, Amazon Bedrock provides access to multiple models, including those from Cohere, Anthropic, etc. We will cover how to extend langchaingo with a plugin for the Anthropic Claude (v2) model, but the guidelines apply to other models as well. Let's walk through the implementation at a high level.
Any custom model (LLM) implementation has to satisfy langchaingo LLM and LanguageModel interfaces. So it implements Call, Generate, GeneratePrompt and GetNumTokens functions. The key part of the implementation is in the Generate function. Here is a breakdown of how it works. The first step is to prepare the JSON payload to be sent to Amazon Bedrock. This contains the prompt/input along with other configuration parameters. //... payload := Request{ MaxTokensToSample: opts.MaxTokens, Temperature: opts.Temperature, TopK: opts.TopK, TopP: opts.TopP, StopSequences: opts.StopWords, } if o.useHumanAssistantPrompt { payload.Prompt = fmt.Sprintf(claudePromptFormat, prompts[0]) } else { } payloadBytes, err := json.Marshal(payload) if err != nil { return nil, err } It is represented by the Request struct which is marshalled into JSON before being sent to Amazon Bedrock. type Request struct { Prompt string `json:"prompt"` MaxTokensToSample int `json:"max_tokens_to_sample"` Temperature float64 `json:"temperature,omitempty"` TopP float64 `json:"top_p,omitempty"` TopK int `json:"top_k,omitempty"` StopSequences []string `json:"stop_sequences,omitempty"` } 2. Next Amazon Bedrock is invoked with the payload and config parameters. Both synchronous and streaming invocation modes are supported. The streaming/async mode will be demonstrated in an example below: //... if opts.StreamingFunc != nil { resp, err = o.invokeAsyncAndGetResponse(payloadBytes, opts.StreamingFunc) if err != nil { return nil, err } } else { resp, err = o.invokeAndGetResponse(payloadBytes) if err != nil { return nil, err } } This is how the asynchronous invocation path is handled - the first part involves using the InvokeModelWithResponseStream function and then handling InvokeModelWithResponseStreamOutput response in the ProcessStreamingOutput function. You can refer to the details in Using the Streaming API section in "Generative AI Apps With Amazon Bedrock: Getting Started for Go Developers," linked in the introduction of this article. //... func (o *LLM) invokeAsyncAndGetResponse(payloadBytes []byte, handler func(ctx context.Context, chunk []byte) error) (Response, error) { output, err := o.brc.InvokeModelWithResponseStream(context.Background(), &bedrockruntime.InvokeModelWithResponseStreamInput{ Body: payloadBytes, ModelId: aws.String(o.modelID), ContentType: aws.String("application/json"), }) if err != nil { return Response{}, err } var resp Response resp, err = ProcessStreamingOutput(output, handler) if err != nil { return Response{}, err } return resp, nil } func ProcessStreamingOutput(output *bedrockruntime.InvokeModelWithResponseStreamOutput, handler func(ctx context.Context, chunk []byte) error) (Response, error) { var combinedResult string resp := Response{} for event := range output.GetStream().Events() { switch v := event.(type) { case *types.ResponseStreamMemberChunk: var resp Response err := json.NewDecoder(bytes.NewReader(v.Value.Bytes)).Decode(&resp) if err != nil { return resp, err } handler(context.Background(), []byte(resp.Completion)) combinedResult += resp.Completion case *types.UnknownUnionMember: fmt.Println("unknown tag:", v.Tag) default: fmt.Println("union is nil or unknown type") } } resp.Completion = combinedResult return resp, nil } 3. Once the request is processed successfully, the JSON response from Amazon Bedrock is converted (un-marshaled) back in the form of a Response struct and a slice of Generation instances as required by the Generate function signature. //... 
generations := []*llms.Generation{ {Text: resp.Completion}, } Code Samples: Use the Amazon Bedrock Plugin in LangChain Apps Once the Amazon Bedrock LLM plugin for langchaingo has been implemented, using it is as easy as creating a new instance with claude.New(<supported AWS region>) and using the Call (or Generate) function. Here is an example: package main import ( "context" "fmt" "log" "github.com/build-on-aws/langchaingo-amazon-bedrock-llm/claude" "github.com/tmc/langchaingo/llms" ) func main() { llm, err := claude.New("us-east-1") input := "Write a program to compute factorial in Go:" opt := llms.WithMaxTokens(2048) output, err := llm.Call(context.Background(), input, opt) //.... Prerequisites Before executing the sample code, clone the GitHub repository and change to the right directory: git clone https://github.com/build-on-aws/langchaingo-amazon-bedrock-llm cd langchaingo-amazon-bedrock-llm/examples Refer to the Before You Begin section in "Generative AI Apps With Amazon Bedrock: Getting Started for Go Developers" to complete the prerequisites for running the examples. This includes installing Go, configuring Amazon Bedrock access, and providing the necessary IAM permissions. Run Basic Examples This example demonstrates tasks such as code generation, information extraction, and question-answering. You can refer to the code here. go run main.go Run Streaming Output Example In this example, we pass in the WithStreamingFunc option to the LLM invocation. This will switch to the streaming invocation mode for Amazon Bedrock. You can refer to the code here. //... _, err = llm.Call(context.Background(), input, llms.WithMaxTokens(2048), llms.WithTemperature(0.5), llms.WithTopK(250), llms.WithStreamingFunc(func(ctx context.Context, chunk []byte) error { fmt.Print(string(chunk)) return nil })) To run the program: go run streaming/main.go Conclusion LangChain is a powerful and extensible library that allows us to plug in external components as required. This blog demonstrated how to extend langchaingo to make sure it works with the Anthropic Claude model available in Amazon Bedrock. You can use the same approach to implement support for other Amazon Bedrock models such as Amazon Titan. The examples showed how to build simple LangChain apps using the Call function. In future blog posts, I will cover how to use them as part of chains for implementing functionality like a chatbot or QA assistant. Until then, happy building!
In the ever-evolving landscape of software engineering, the database stands as a cornerstone for storing and managing an organization's critical data. From ancient caves and temples that symbolize the earliest forms of information storage to today's distributed databases, the need to persistently store and retrieve data has been a constant in human history. In modern applications, the significance of a well-managed database is indispensable, especially as we navigate the complexities of cloud-native architectures and application modernization. Why a Database? 1. State Management in Microservices and Stateless Applications In the era of microservices and stateless applications, the database plays a pivotal role in housing the state and is crucial for user information and stock management. Despite the move towards stateless designs, certain aspects of an application still require a persistent state, making the database an integral component. 2. Seizing Current Opportunities The database is not just a storage facility; it encapsulates the current opportunities vital for an organization's success. Whether it's customer data, transaction details, or real-time analytics, the database houses the pulse of the organization's present, providing insights and supporting decision-making processes. 3. Future-Proofing for Opportunities Ahead As organizations embrace technologies like Artificial Intelligence (AI) and Machine Learning (ML), the database becomes the bedrock for unlocking new opportunities. Future-proofing involves not only storing current data efficiently but also structuring the database to facilitate seamless integration with emerging technologies. The Challenges of Database Management Handling a database is not without its challenges. The complexity arises from various factors, including modeling, migration, and the constant evolution of products. 1. Modeling Complexity The initial modeling phase is crucial, often conducted when a product is in its infancy, or the organization lacks the maturity to perform optimally. The challenge lies in foreseeing the data requirements and relationships accurately. 2. Migration Complexity Unlike code refactoring on the application side, database migration introduces complexity that surpasses application migration. The need for structural changes, data transformations, and ensuring data integrity makes database migration a challenging endeavor. 3. Product Evolution Products evolve, and so do their data requirements. The challenge is to manage the evolutionary data effectively, ensuring that the database structure remains aligned with the changing needs of the application and the organization. Polyglot Persistence: Exploring Database Options In the contemporary software landscape, the concept of polyglot persistence comes into play, allowing organizations to choose databases that best suit their specific scenarios. This approach involves exploring relational databases, NoSQL databases, and NewSQL databases based on the application's unique needs. Integrating Database and Application: Bridging Paradigms One of the critical challenges in mastering Java Persistence lies in integrating the database with the application. This integration becomes complex due to the mismatch between programming paradigms in Java and database systems. Patterns for Integration Several design patterns aid in smoothing the integration process. 
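To make this concrete, here is a minimal sketch of one such pattern: a DAO that hides the relational details behind an interface, with an immutable record carrying the data between layers. The Book table, its columns, and the class names are hypothetical illustrations (a real project would often let JPA, jOOQ, or Spring Data play this role):

Java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Immutable representation of a row: the data is kept separate from behavior.
record Book(long id, String title, String author) { }

// The DAO exposes behavior (queries) and hides how rows are fetched and mapped.
interface BookDao {
    List<Book> findByAuthor(String author) throws SQLException;
}

// A plain JDBC implementation; a JPA or jOOQ implementation could be swapped in
// without touching callers, which depend only on the BookDao interface.
class JdbcBookDao implements BookDao {
    private final Connection connection;

    JdbcBookDao(Connection connection) {
        this.connection = connection;
    }

    @Override
    public List<Book> findByAuthor(String author) throws SQLException {
        String sql = "SELECT id, title, author FROM book WHERE author = ?";
        try (PreparedStatement stmt = connection.prepareStatement(sql)) {
            stmt.setString(1, author);
            try (ResultSet rs = stmt.executeQuery()) {
                List<Book> books = new ArrayList<>();
                while (rs.next()) {
                    // The mapping from the relational schema to the application's
                    // representation lives here, so schema changes stay localized.
                    books.add(new Book(rs.getLong("id"), rs.getString("title"), rs.getString("author")));
                }
                return books;
            }
        }
    }
}

Because callers only see the BookDao interface and the immutable Book record, the underlying store can change (a different schema, another RDBMS, even a document store) without rippling through the business logic.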
Patterns like Driver, Active Record, Data Mapper, Repository, DAO (Data Access Object), and DTO (Data Transfer Object) provide blueprints for bridging the gap between the Java application and the database. Data-Oriented vs. Object-Oriented Programming While Java embraces object-oriented programming principles like inheritance, polymorphism, encapsulation, and types, the database world revolves around normalization, denormalization, and structural considerations. Bridging these paradigms requires a thoughtful approach. Principles of Database-Oriented Programming: Separating code (behavior) from data: encourage a clean separation between business logic and data manipulation. Representing data with generic data structures: use generic structures to represent data, ensuring flexibility and adaptability. Treating data as immutable: embrace immutability to enhance data consistency and reliability. Separating data schema from data representation: decouple the database schema from the application's representation of data to facilitate changes without affecting the entire system. Principles of Object-Oriented Programming: Expose behavior and hide data: maintain a clear distinction between the functionality of objects and their underlying data. Abstraction: utilize abstraction to simplify complex systems and focus on essential features. Polymorphism: leverage polymorphism to create flexible and reusable code. Conclusion Mastering Java Persistence requires a holistic understanding of these principles, patterns, and paradigms. The journey involves selecting the proper database technologies and integrating them seamlessly with Java applications while ensuring adaptability to future changes. In this dynamic landscape, success stories, documentation, and a maturity model serve as guiding beacons, aiding developers and organizations in their pursuit of efficient and robust database management for cloud-native applications and modernization initiatives.
In today's digital landscape, applications heavily rely on external HTTP/REST APIs for a wide range of functionalities. These APIs often orchestrate a complex web of internal and external API calls, which creates a network of dependencies. When a dependent API fails or undergoes downtime, the primary application-facing API needs to handle these disruptions gracefully. This article explores the implementation of retry mechanisms and fallback methods in Spring microservices, and highlights how these strategies can significantly bolster API integration reliability and notably improve user experience. Understanding Dependent API Failures Mobile and web applications consuming APIs that depend on other services for successful execution face unique challenges. Calls to dependent APIs can fail for a variety of reasons, including network issues, timeouts, internal server errors, or scheduled downtimes. Such failures can compromise user experience, disrupt crucial functionalities, and lead to data inconsistencies. Implementing strategies to gracefully handle these failures is therefore vital for maintaining system integrity. Retry Mechanisms As a primary solution, retry mechanisms serve to handle transient errors and temporary issues. By automatically reattempting an API call, this mechanism can often resolve problems related to brief network glitches or temporary server unavailability. It's crucial to differentiate between scenarios suitable for retries, such as network timeouts, and those where retries might be ineffective or even detrimental, like business logic exceptions or data validation errors. Retry Strategies Common approaches include: Fixed interval retries: Attempting retries at regular intervals Exponential backoff: Increasing the interval between retries exponentially, thereby reducing the load on the server and network. Both methods should be accompanied by a maximum retry limit to prevent infinite loops. Additionally, it's essential to monitor and log each retry attempt for future analysis and system optimization. Retry and Fallback in Spring Microservices There are two ways in which we can implement retries and fallback methods. 1. resilience4j The @Retry annotation is a declarative way to simplify the implementation of retry logic in applications. This annotation is available in the resilience4j package. By applying this annotation to service methods, Spring handles the retry process automatically when specified types of exceptions are encountered. The following is a real implementation example. The method calls an API to pull the bureau report for underwriting a loan application. If this method fails, the entire loan application underwriting process fails, impacting the consuming application, such as a mobile application.
So we have annotated this method with @Retry: Java @Override @Retry(name = "SOFT_PULL", fallbackMethod = "performSoftPull_Fallback") public CreditBureauReportResponse performSoftPull(SoftPullParams softPullParams, ErrorsI error) { CreditBureauReportResponse result = null; try { Date dt = new Date(); logger.info("UnderwritingServiceImpl::performSoftPull method call at :" + dt.toString() + ", for loan acct id:" + softPullParams.getUserLoanAccountId()); CreditBureauReportRequest request = this.getCreditBureauReportRequest(softPullParams); RestTemplate restTemplate = this.externalApiRestTemplateFactory.getRestTemplate("SOFT_PULL", error); HttpHeaders headers = this.getHttpHeaders(softPullParams); HttpEntity<CreditBureauReportRequest> entity = new HttpEntity<>(request, headers); long startTime = System.currentTimeMillis(); String uwServiceEndPoint = "/transaction"; String callUrl = String.format("%s%s", appConfig.getUnderwritingTransactionApiPrefix(), uwServiceEndPoint); ResponseEntity<CreditBureauReportResponse> responseEntity = restTemplate.exchange(callUrl, HttpMethod.POST, entity, CreditBureauReportResponse.class); result = responseEntity.getBody(); long endTime = System.currentTimeMillis(); long timeDifference = endTime - startTime; logger.info("Time taken for API call SOFT_PULL/performSoftPull call 1: " + timeDifference); } catch (HttpClientErrorException exception) { logger.error("HttpClientErrorException occurred while calling SOFT_PULL API, response string: " + exception.getResponseBodyAsString()); throw exception; } catch (HttpStatusCodeException exception) { logger.error("HttpStatusCodeException occurred while calling SOFT_PULL API, response string: " + exception.getResponseBodyAsString()); throw exception; } catch (Exception ex) { logger.error("Error occurred in performSoftPull. Detail error:", ex); throw ex; } return result; } We can define the other attributes like the number of retries and delays between retries in the application.yml file: YAML resilience4j.retry: configs: default: maxRetryAttempts: 3 waitDuration: 100 externalPartner: maxRetryAttempts: 2 waitDuration: 1000 instances: SOFT_PULL: baseConfig: externalPartner We specify the fallback method fallbackMethod = "performSoftPull_Fallback". This method is invoked if all the configured retry attempts fail; in this case, two. 
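The waitDuration values above retry at fixed intervals. If exponential backoff is preferred, resilience4j can also be configured programmatically; the sketch below is illustrative rather than part of the original service (it assumes the resilience4j-retry module is on the classpath and that the caller wraps the SOFT_PULL call in a Supplier):

Java
import io.github.resilience4j.core.IntervalFunction;
import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryConfig;

import java.util.function.Supplier;

public class SoftPullRetryFactory {

    // Decorates the given call so that it is attempted up to three times,
    // waiting 100 ms and then 200 ms between attempts (the initial interval doubles each time).
    public static Supplier<CreditBureauReportResponse> withExponentialBackoff(
            Supplier<CreditBureauReportResponse> softPullCall) {

        RetryConfig config = RetryConfig.custom()
                .maxAttempts(3)
                .intervalFunction(IntervalFunction.ofExponentialBackoff(100, 2.0))
                .build();

        Retry retry = Retry.of("SOFT_PULL", config);
        return Retry.decorateSupplier(retry, softPullCall);
    }
}

Whichever backoff strategy is chosen, the fallback path is unchanged; the performSoftPull_Fallback method that runs once the attempts are exhausted is shown next.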
Java public CreditBureauReportResponse performSoftPull_Fallback(SoftPullParams softPullParams, ErrorsI error, Exception extPartnerException) { logger.info("UnderwritingServiceImpl::performSoftPull_Fallback - fallback , method called for soft pull api call"); CreditBureauReportResponse creditBureauReportResponse = null; String loanAcctId = softPullParams.getUserLoanAccountId(); ApplicantCoBorrowerIdsMapping applicantCoBorrowerIdsMapping = this.uwCoBorrowerRepository.getApplicantCoBorrowerIdsMapping(loanAcctId); try { boolean result = this.partnerServiceExceptionRepository.savePartnerServiceException(applicantCoBorrowerIdsMapping.getApplicantUserId(), applicantCoBorrowerIdsMapping.getLoanId(), PartnerService.SOFT_PULL.getValue(), "GDS", null); if (!result) { logger.error("UnderwritingServiceImpl::performSoftPull_Fallback - Unable to save entry in the partner service exception table."); } LoanSubStatus loanSubStatus = LoanSubStatus.PARTNER_API_ERROR; result = this.loanUwRepository.saveLoanStatus(applicantCoBorrowerIdsMapping.getApplicantUserId(), applicantCoBorrowerIdsMapping.getLoanId(), IcwLoanStatus.INITIATED.getValue(), loanSubStatus.getName(), "Partner Service Down", null); if (!result) { logger.error("UnderwritingServiceImpl::performSoftPull_Fallback - Unable to update loan status, sub status when partner service is down."); } } catch (Exception ex) { logger.error("UnderwritingServiceImpl::performSoftPull_Fallback - An error occurred while calling softPullExtPartnerFallbackService, detail error:", ex); } creditBureauReportResponse = new CreditBureauReportResponse(); UnderwritingApiError underwritingApiError = new UnderwritingApiError(); underwritingApiError.setCode("IC-EXT-PARTNER-1001"); underwritingApiError.setDescription("Soft Pull API error"); List<UnderwritingApiError> underwritingApiErrors = new ArrayList<>(); underwritingApiErrors.add(underwritingApiError); creditBureauReportResponse.setErrors(underwritingApiErrors); return creditBureauReportResponse; } In this scenario, the fallback method returns the same response object as the original method. However, we also record in our data store that the service is down, save state, and relay an indicator back to the consuming service method. This indicator is then passed on to the consuming mobile application, alerting the user about issues with our partner services. Once the issue is rectified, we utilize the persisted state to resume the workflow and send a notification to the mobile application, indicating that normal operations can continue. 2. spring-retry In this case, we need to add the spring-retry and spring-aspects dependencies. For the same method as above, we replace the resilience4j annotation with the @Retryable annotation: Java @Retryable(retryFor = {HttpClientErrorException.class, HttpStatusCodeException.class, Exception.class}, maxAttempts = 2, backoff = @Backoff(delay = 100)) public CreditBureauReportResponse performSoftPull(SoftPullParams softPullParams, ErrorsI error) { The @Retryable annotation in Spring allows us to specify multiple exception types that should trigger a retry. We can list these exception types in the retryFor attribute of the annotation, as shown above. To write a fallback method for our @Retryable annotated method performSoftPull, we would use the @Recover annotation. This method is invoked when the performSoftPull method exhausts its retry attempts due to the specified exceptions (HttpClientErrorException, HttpStatusCodeException, Exception).
The @Recover method should have a matching signature to the @Retryable method, with the addition of the exception type as the first parameter. Java @Recover public CreditBureauReportResponse fallbackForPerformSoftPull(HttpClientErrorException ex, SoftPullParams softPullParams, ErrorsI error) { // Fallback Implementation } @Recover public CreditBureauReportResponse fallbackForPerformSoftPull(HttpStatusCodeException ex, SoftPullParams softPullParams, ErrorsI error) { // Fallback Implementation } @Recover public CreditBureauReportResponse fallbackForPerformSoftPull(Exception ex, SoftPullParams softPullParams, ErrorsI error) { // Fallback Implementation } Conclusion In summary, in Spring microservices, effectively handling API failures with retry mechanisms and fallback methods is essential for building robust, user-centric applications. These strategies ensure the application remains functional and provides a seamless user experience, even in the face of API failures. By implementing retries for transient issues and defining fallback methods for more persistent failures, Spring applications can offer reliability and resilience in today’s interconnected digital world.
Let's use ChatGPT to build a REST API microservice for a budgeting application. This needs to support multi-tenant security and include actual spending matched against budget categories. Of course, a Google Sheet or Excel would be the simple answer. However, I wanted a multi-user cloud solution and to use the new open-source REST API microservice platform API Logic Server (ALS). Our microservice needs an SQL database, an ORM, a server, a REST API, a react-admin UI, and a Docker container. AI Design of the Data Model I started by asking ChatGPT 3.5 to generate a budget application data model. Markdown ## Create MySQL tables to do a budgeting application with sample account data This gave me the basic starting point with budget, category, user, transaction (for actual spending), and account tables. However, we need to translate the spreadsheet model to a database design with rules to handle the sums, counts, and formulas. Starting from SQL GROUP BY queries like the ones below, I asked ChatGPT to add new tables for CategoryTotal, MonthTotal, and YearTotal. I renamed the tables and added a flag on the category table to separate expenses from income budget items. MySQL -- Month Total select month_id, count(*), sum(amount) as 'Budget Amount', sum(actual_amount) from budget where user_id = 1 and year_id = 2023 group by year_id, month_id -- Category Total select category_id, count(*), sum(amount) as 'Budget Amount', sum(actual_amount) from budget where user_id = 1 and year_id = 2023 group by year_id, category_id API Logic Server I installed Python and API Logic Server (an open-source Python microservice platform) and used the command line interface to connect to the MySQL database. This created a SQLAlchemy model, a react-admin UI, and an OpenAPI (Swagger). Command Line To Create a New Project Install ALS, create the sample project, and start VSCode (press F5 to run). Shell $python -m venv venv; venv\Scripts\activate # win $python3 -m venv venv; . venv/bin/activate # mac/linux $python -m pip install ApiLogicServer Collecting ApiLogicServer Downloading ApiLogicServer-9.5.0-py3-none-any.whl (11.2 MB) ━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━ 5.3/11.2 MB 269.0 kB/s eta 0:00:23 .... truncated .... $ApiLogicServer create --project_name=BudgetApp --db_url=BudgetApp $cd BudgetApp $code . SQLAlchemy Model API Logic Server created a SQLAlchemy class definition for each table. This shows the Budget entity (table: budget), its columns, and its relationships. If the database model changes, this can easily be regenerated as part of the development lifecycle process.
Python class Budget(SAFRSBase, Base): __tablename__ = 'budget' _s_collection_name = 'Budget' # type: ignore __bind_key__ = 'None' __table_args__ = ( ForeignKeyConstraint(['year_id', 'category_id', 'user_id'], ['category_total.year_id', 'category_total.category_id', 'category_total.user_id'], ondelete='CASCADE'), ForeignKeyConstraint(['year_id', 'month_id', 'user_id'], ['month_total.year_id', 'month_total.month_id', 'month_total.user_id'], ondelete='CASCADE') ) budget_id = Column(Integer, primary_key=True) year_id = Column(Integer, server_default="2023") month_id = Column(Integer, nullable=False) user_id = Column(ForeignKey('tenant_user.user_id'), nullable=False) category_id = Column(ForeignKey('categories.category_id'), nullable=False) description = Column(String(200)) amount : DECIMAL = Column(DECIMAL(10, 2), nullable=False) actual_amount : DECIMAL = Column(DECIMAL(10, 2), server_default="0") variance_amount : DECIMAL = Column(DECIMAL(10, 2), server_default="0") count_transactions = Column(Integer, server_default="0") budget_date = Column(DateTime, server_default=text("CURRENT_TIMESTAMP")) is_expense = Column(Integer, server_default="1") # parent relationships (access parent) category : Mapped["Category"] = relationship(back_populates=("BudgetList")) user : Mapped["TenantUser"] = relationship(back_populates=("BudgetList")) category_total : Mapped["CategoryTotal"] = relationship(back_populates=("BudgetList")) month_total : Mapped["MonthTotal"] = relationship(back_populates=("BudgetList")) # child relationships (access children) TransactionList : Mapped[List["Transaction"]] = relationship(back_populates="budget") OpenAPI Created for Each Table Declarative Rules API Logic Server rules are similar to spreadsheet definitions but derive (and persist) values at the column level when updates are submitted. And like a spreadsheet, the order of operations is determined based on the state dependency of the change. API Logic Server has an open-source rule engine (LogicBank) that monitors updates using SQLAlchemy before the flush event. That means rule invocation is automatic, multi-table, and eliminates an entire class of programming errors (i.e., rules execute for every insert, update, or delete). To aggregate a column, we need a parent table. Note that in a spreadsheet, the column totals are aggregated using a ‘sum’ or ‘count.’ The insert_parent flag allows the child row to create the parent row if it does not exist (using the multiple foreign keys) before doing the aggregations. This feature can do multi-level group-bys for all types of applications (e.g., accounting group by debit/credit for year, month, quarter). While an SQL group-by can yield a similar result, declarative rules adjust and persist the column values during insert, update, or delete.
Spreadsheet-like declarative rules are entered using code completion; examples are shown below: Sum: Rule.sum(derive=models.MonthTotal.budget_amount, as_sum_of=models.Budget.amount, where=lambda row: row.year_id == 2023) — derives a parent attribute as the sum of the designated child attribute, with an optional child qualification. Count: Rule.count(derive=models.Budget.transaction_count, as_count_of=models.Transaction, where=lambda row: row.year_id == 2023) — derives a parent attribute as the count of child rows, with an optional child qualification. Formula: Rule.formula(derive=models.Budget.variance, as_expression=lambda row: row.actual_amount - row.amount) — a lambda function computes the column value. Constraint: Rule.constraint(validate=models.Customer, as_condition=lambda row: row.Balance <= row.CreditLimit, error_msg="balance ({row.Balance}) exceeds credit ({row.CreditLimit})") — the boolean lambda function must be True, else the transaction is rolled back with the message. Copy: Rule.copy(derive=models.Transaction.month_id, from_parent=models.Budget.month_id) — the child value is copied from the parent column. Event: Rule.row_event(on_class=models.Budget, calling=my_function) — a Python function call (early event, row event, and commit event). Sum Rule These simple declarations will aggregate the budget and transaction amounts and calculate the variance in the CategoryTotal, MonthTotal, and YrTotal tables. Note the flag (insert_parent) will create the parent row if it does not exist before doing the aggregation... The code completion feature makes the rule declarations easy. The rules are optimized and will handle insert, update, and delete by adjusting the stored values instead of re-running an SQL group-by, formula, sum, or count each time a change is detected. (see logic/declare_logic.py) Python Rule.sum(derive=models.YrTotal.budget_total, as_sum_of=models.CategoryTotal.budget_total,insert_parent=True) Rule.sum(derive=models.CategoryTotal.budget_total, as_sum_of=models.Budget.amount,insert_parent=True) Rule.sum(derive=models.MonthTotal.budget_total, as_sum_of=models.Budget.amount,insert_parent=True) Note: rules are un-ordered and will create a runtime log of the firing sequence based on state dependencies. That makes iterations rapid (no need to review logic to determine where to insert new code) and less error-prone. Create a Custom API In addition to SQLAlchemy model creation, API Logic Server also creates a RESTful JSON API for the created endpoints. This unblocks UI developers immediately. Here, we create a new custom REST API to POST a batch of actual CSV transactions. While API Logic Server has already created endpoints for API/budget and API/transaction — this is a demonstration of how to extend the REST API. The new endpoints show up in the OpenAPI (Swagger) and allow testing directly. The SQLAlchemy and Flask/safrs JSON API allow a great deal of flexibility to perform complex filters and queries to shape REST APIs.
(see api/customize_api.py) Python class BatchTransactions(safrs.JABase): @classmethod @jsonapi_rpc(http_methods=["POST"]) def csv_transaction_insert(cls, *args, **kwargs): """ # yaml creates Swagger description args : budget_id: 1 amount: 100 category_id: 1 description: 'test transaction insert' """ db = safrs.DB session = db.session # we parse the POST *kwargs to handle multiple transactions - returns JSON # the csv has date, category, and amount for csv_row in get_csv_payload(kwargs): trans = models.Transaction() trans.category_id = lookup_category(csv_row, "category") trans.amount = csv_row.amount trans.transaction_date = csv_row.date session.add(trans) return {"transaction(s) insert done"} @classmethod @jsonapi_rpc(http_methods=["GET"]) def get_budget(cls, *args, **kwargs): ''' Use SQLAlchemy to get budget, category, month, and year total ''' db = safrs.DB # valid only after is initialized, above session = db.session user_id = Security.current_user().user_id budget_list = session.query(models.Budget).filter(models.Budget.year_id == 2023 and models.Budget.user_id == user_id).all() result = [] for row in budget_list: budget_row = (jsonify(row).json)['attributes'] month_total = (jsonify(row.month_total).json)['attributes'] category_total = (jsonify(row.category_total).json)['attributes'] year_total = (jsonify(row.category_total.yr_total).json)['attributes'] result.append({"budget":budget_row, "category_total":category_total, "month_total": month_total, "year_total": year_total}) return jsonify(result) Declarative Security We can initialize the API Logic Server to use a custom secure login, and this will enable declarative security. Security has two parts: authentication (login) and authorization (access). The security/declare_authorization.py file lets us declare a global tenant filter for all roles (except admin or sa). Adding a GlobalFilter will apply an additional where clause to any table that has a column named "user_id." The default role permission applies to the users' role and defines the global access setting. Grants can be applied to a role to further extend or remove access to an endpoint. Python class Roles(): ''' Define Roles here, so can use code completion (Roles.tenant) ''' tenant = "tenant" renter = "renter" admin = "admin" sa = "sa" DefaultRolePermission(to_role=Roles.tenant,can_read=True, can_delete=False) DefaultRolePermission(to_role=Roles.admin,can_read=True, can_delete=True) GlobalFilter(global_filter_attribute_name="user_id", roles_not_filtered = ["sa", "admin"], filter="{entity_class}.user_id == Security.current_user().id") Iterative Development The concept of API lifecycle management is critical. I added a variance column to each table (budget, month_total, category_total, and yr_total) to calculate the difference between the actual_amount minus budget_amount. I changed the SQL database (SQLite) and then asked the API Logic Server command line to rebuild-model-from-database. This will rebuild the database/model.py and the react-dmin UI, while preserving the logic and security we already defined. CLI to rebuild-from-database Shell ApiLogicServer rebuild-from-database --project_name=BudgetApp --db_url=BudgetApp Formula Rules operate at the column (aka field) level to calculate the variance between the budget entry and all the transaction actuals. The variance will be calculated if either the budget or the transaction's actual amounts change. 
Python Rule.formula(derive=models.Budget.variance_amount, as_expression=lambda row: row.actual_amount - row.amount) Rule.formula(derive=models.CategoryTotal.variance_amount, as_expression=lambda row: row.actual_amount - row.budget_total) Rule.formula(derive=models.MonthTotal.variance_amount, as_expression=lambda row: row.actual_amount - row.budget_total) Rule.formula(derive=models.YrTotal.variance_amount, as_expression=lambda row: row.actual_amount - row.budget_total) Testing The OpenAPI (Swagger) endpoint generates cURL commands to test inserting Budget and Transaction entries. We use the react-admin UI to view the YrTotal endpoint and check that the aggregation group-by worked correctly. There are some Behave (TDD) tests that do the same thing. The OpenAPI page will generate both a URL and a cURL entry for the API developers and for testing locally. Below is the react-admin UI showing the YrTotal budget, actual, and variance amounts. Example cURL command to post a budget entry: Shell $curl -X 'POST' \ 'http://localhost:5656/api/budget' \ -H 'accept: application/vnd.api+json' \ -H 'Content-Type: application/json' \ -d '{ "data": { "attributes": { "year_id": 2023, "month_id": 1, "user_id": 1, "category_id": 1, "description": "Budget Test", "amount": amount, }, "type": "Budget" } }' Tracing the Rules The VSCode debug window shows a detailed list of the rules that fired and the rule execution order. More detailed information is available in the logs. Like a spreadsheet, as data value changes are made, the runtime LogicBank will fire the rules in the correct order to adjust the sums, counts, constraints, events, and formulas for months, categories, and year totals. Docker Container The DevOps folder in the API Logic Server project has several subfolders to build and deploy this project as a Docker container (and an optional NGINX container) locally or to the cloud. This allows me to quickly deploy my application to the cloud for testing and immediate user feedback. Summary Using the open-source API Logic Server with SQLAlchemy, Flask, safrs/JSON API, and LogicBank to simulate the spreadsheet rules requires thinking of data as SQL tables and applying rules accordingly to do the automated group-bys for sums and counts on CategoryTotal for each category, MonthTotal for each month, and YrTotal to sum all budget expenses. This is a multi-tenant, secure, cloud-based application built in a day using ChatGPT and automated microservice generation with declarative, spreadsheet-like rules. The ability to write a custom endpoint to bring back all the budget, category, month, and year totals in a single endpoint gives the UI developer complete spreadsheet functionality. API Logic Server provides automation for iterative building and deployment of a REST API microservice with declarative logic and security. These declarative rules help turn any SQL database into a spreadsheet.
Amazon Elastic Compute Cloud (EC2) stands as a cornerstone of AWS's suite of cloud services, providing a versatile platform for computing on demand. Yet, the true power of EC2 lies in its diverse array of instance types, each meticulously crafted to cater to distinct computational requirements, underpinned by a variety of specialized hardware architectures. This article goes into detail, exploring the intricacies of these instance types and dissecting the hardware that drives them. Through this foundational approach, we aim to furnish a more profound comprehension of EC2's ecosystem, equipping you with the insights necessary to make the right decisions when selecting the most apt instance for your specific use case. Why Understand the Hardware Beneath the Instances? When diving into cloud computing, it's tempting to view resources like EC2 instances as abstracted boxes, merely serving our applications without much thought to their inner workings. However, having a fundamental understanding of the underlying hardware of your chosen EC2 instance is crucial. This knowledge not only empowers you to make more informed decisions, optimizing both performance and costs, but also ensures your applications run smoothly, minimizing unexpected disruptions. Just as a chef selects the right tools for a dish or a mechanic chooses the correct parts for a repair, knowing the hardware components of your EC2 instances can be the key to unlocking their full potential. In this article, we'll demystify the hardware behind the EC2 curtains, helping you bridge the gap between abstract cloud resources and tangible hardware performance. Major Hardware Providers and Their Backgrounds Intel For years, Intel has been the cornerstone of cloud computing, with its Xeon processors powering a vast majority of EC2 instances. Renowned for their robust general-purpose computing capabilities, Intel's chips excel in a wide array of tasks, from data processing to web hosting. Their Hyper-Threading technology allows for greater multitasking, making them versatile for varied workloads. However, premium performance often comes at a premium cost. AMD AMD instances, particularly those sporting the EPYC series of processors, have started gaining traction in the cloud space. They are often pitched as cost-effective alternatives to Intel without compromising much on performance. AMD's strength lies in providing a high number of cores, making them suitable for tasks that benefit from parallel processing. They can offer a balance between price and performance, particularly for businesses operating on tighter budgets. ARM (Graviton) AWS's Graviton and Graviton2 processors, built on the ARM architecture, represent a departure from traditional cloud computing hardware. These chips are known for their energy efficiency, derived from ARM's heritage in mobile computing. As a result, Graviton-powered instances can deliver a superior price-performance ratio, especially for scale-out workloads that can distribute tasks across multiple servers. They're steadily becoming the go-to choice for businesses prioritizing efficiency and cost savings. NVIDIA When it comes to GPU-intensive tasks, NVIDIA stands uncontested. Their Tesla and A100 GPUs, commonly found in EC2's GPU instances, are designed for workloads that demand heavy computational power. Whether it's machine learning training, 3D rendering, or high-performance computing, NVIDIA-powered instances offer accelerated performance.
However, the specialized nature of these instances means they might not be the best choice for general computing tasks and can be more expensive. In essence, while EC2 instance families provide a high-level categorization, the real differentiation in performance, cost, and suitability comes from these underlying hardware providers. By understanding the strengths and limitations of each, businesses can tailor their cloud deployments to achieve the desired balance of performance and cost. 1. General Purpose Instances Notable types: T3/T4g (Intel/ARM), M7i/M7g (Intel/ARM), etc. Primary use: Balancing compute, memory, and networking Practical application: Web servers: A standard web application or website that requires balanced resources can run seamlessly on general-purpose instances Developer environments: The burstable performance of t2 and t3 makes them ideal for development and testing environments where resource demand fluctuates. 2. Compute Optimized Instances Notable Types: C7i/C7g (Intel/ARM), etc. Primary Use: High computational tasks Practical application: High-performance web servers: Websites with massive traffic or services that require quick response times Scientific modeling: Simulating climate patterns, genomic research, or quantum physics calculations 3. Memory Optimized Instances Notable Types: R7i/R7g (Intel/ARM), X1/X1e (Intel), etc. Primary Use: Memory-intensive tasks Practical Application: Large-scale databases: Running applications like MySQL, PostgreSQL, or big databases like SAP HANA Real-time Big Data analytics: Analyzing massive data sets in real-time, such as stock market trends or social media sentiment analysis 4. Storage Optimized Instances Notable types: I3/I3en (Intel), D3/D3en (Intel), H1 (Intel), etc. Primary use: High random I/O access Practical Application: NoSQL databases: Deploying high-transaction databases like Cassandra or MongoDB Data warehousing: Handling and analyzing vast amounts of data, such as user data for large enterprises 5. Accelerated Computing Instances Notable types: P5 (NVIDIA/AMD), Inf1 (Intel), G5 (NVIDIA), etc. Primary use: GPU-intensive tasks Practical application: Machine Learning: Training complex models or neural networks Video rendering: Creating high-quality animation or special effects for movies 6. High-Performance Computing (HPC) Instances Notable types: Hpc7g, Hpc7a Primary use: Tasks requiring extremely high frequencies or hardware acceleration Practical Application: Electronic Design Automation (EDA): Designing and testing electronic circuits Financial simulations: Predicting stock market movements or calculating complex investment scenarios 7. Bare Metal Instances Notable types: m5.metal, r5.metal (Intel Xeon) Primary use: Full access to underlying server resources Practical application: High-performance databases: When databases like Oracle or SQL Server require direct access to server resources Sensitive workloads: Tasks that must comply with strict regulatory or security requirements Each EC2 instance family is tailored for specific workload requirements, and the underlying hardware providers further influence their performance. Users can achieve optimal performance and cost efficiency by aligning the workload with the appropriate instance family and hardware.
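To make the mapping from workload to instance family concrete, here is a deliberately simplified sketch of such a selection heuristic. The family names echo the ones above, but the thresholds and the WorkloadProfile type are illustrative assumptions rather than AWS guidance:

Java
// Simplified workload description used only for illustration.
record WorkloadProfile(int vcpus, int memoryGiB, boolean gpuAccelerated, boolean ioIntensive) { }

public class InstanceFamilyAdvisor {

    // Suggests an EC2 instance family based on the dominant resource need.
    static String suggestFamily(WorkloadProfile w) {
        if (w.gpuAccelerated()) {
            return "Accelerated computing (e.g., G5, P5)";
        }
        if (w.ioIntensive()) {
            return "Storage optimized (e.g., I3/I3en, D3)";
        }
        double memPerVcpu = (double) w.memoryGiB() / w.vcpus();
        if (memPerVcpu >= 8) {          // memory-heavy: large databases, in-memory analytics
            return "Memory optimized (e.g., R7i/R7g)";
        }
        if (memPerVcpu <= 2) {          // CPU-heavy: batch jobs, scientific modeling
            return "Compute optimized (e.g., C7i/C7g)";
        }
        return "General purpose (e.g., M7i/M7g, T3/T4g)";
    }

    public static void main(String[] args) {
        System.out.println(suggestFamily(new WorkloadProfile(8, 64, false, false)));  // Memory optimized
        System.out.println(suggestFamily(new WorkloadProfile(16, 32, false, false))); // Compute optimized
    }
}

In practice, a first cut like this would be refined with the hardware considerations discussed earlier (Intel vs. AMD vs. Graviton price-performance) and validated by benchmarking the actual workload.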