This article is part of a series exploring a workshop guiding you through the open source project Fluent Bit, what it is, a basic installation, and setting up the first telemetry pipeline project. Learn how to manage your cloud-native data from source to destination using the telemetry pipeline phases covering collection, aggregation, transformation, and forwarding from any source to any destination. The previous article in this series saw us building our first telemetry pipelines with Fluent Bit. In this article, we continue onwards with some more specific use cases that pipelines solve. You can find more details in the accompanying workshop lab. Let's get started with this use case. Before we get started it's important to review the phases of a telemetry pipeline. In the diagram below we see them laid out again. Each incoming event goes from input to parser to filter to buffer to routing before they are sent to their final output destination(s). For clarity in this article, we'll split up the configuration into files that are imported into a main fluent bit configuration file that we'll name workshop-fb.conf. Parsing Multiple Events One of the more common use cases for telemetry pipelines is having multiple event streams producing data that creates the situation that keys are not unique if parsed without some cleanup. Let's illustrate how Fluent Bit can easily provide us with a means to both parse and filter events from multiple input sources to clean up any duplicate keys before sending onward to a destination. To provide an example, we start with an inputs.conf file containing a configuration using the dummy plugin to generate two types of events, both using the same key to cause confusion if we try querying without cleaning them up first: # This entry generates a success message. [INPUT] Name dummy Tag event.success Dummy {"message":"true 200 success"} # This entry generates an error message. [INPUT] Name dummy Tag event.error Dummy {"message":"false 500 error"} Our configuration is tagging each successful event with event.success and failure events with event.error. The confusion will be caused by configuring the dummy message with the same key, message, for both event definitions. This will cause our incoming events to be confusing to deal with. The file called outputs.conf contains but one destination as shown in the following configuration: # This entry directs all tags (it matches any we encounter) # to print to standard output, which is our console. # [OUTPUT] Name stdout Match * With our inputs and outputs configured, we can now bring them together in a single main configuration file we mentioned at the start. Let's create a new file called workshop-fb.conf in our favorite editor. Add the following configuration; for now, just importing our other two files: # Fluent Bit main configuration file. # # Imports section, assumes these files are in the same # directory as the main configuration file. # @INCLUDE inputs.conf @INCLUDE outputs.conf To see if our configuration works, we can test run it with our Fluent Bit installation. Depending on the chosen install method used from the previous articles in this series, we have the option to run it from source, or using container images. First, we show how to run it using the source install execution from the directory we created to hold all our configuration files: # source install. 
# $ [PATH_TO]/fluent-bit --config=workshop-fb.conf The console output should look something like this - noting that we've cut out the ASCII logo at startup: ... [2024/04/05 16:49:33] [ info] [input:dummy:dummy.0] initializing [2024/04/05 16:49:33] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only) [2024/04/05 16:49:33] [ info] [input:dummy:dummy.1] initializing [2024/04/05 16:49:33] [ info] [input:dummy:dummy.1] storage_strategy='memory' (memory only) [2024/04/05 16:49:33] [ info] [output:stdout:stdout.0] worker #0 started [2024/04/05 16:49:33] [ info] [sp] stream processor started [0] event.success: [[1712328574.915990000, {}], {"message"=>"true 200 success"}] [0] event.error: [[1712328574.917728000, {}], {"message"=>"false 500 error"}] [0] event.success: [[1712328575.915732000, {}], {"message"=>"true 200 success"}] [0] event.error: [[1712328575.916608000, {}], {"message"=>"false 500 error"}] [0] event.success: [[1712328576.915161000, {}], {"message"=>"true 200 success"}] [0] event.error: [[1712328576.915288000, {}], {"message"=>"false 500 error"}] ... Also note the alternating generated event lines with messages that are hard to separate when using the same key. These events alternate in the console until exiting with CTRL_C. Next, we show how to run our telemetry pipeline configuration using a container image. First thing that is needed is a file called Buildfile. This is going to be used to build a new container image and insert our configuration files. Note this file needs to be in the same directory as your configuration files; otherwise, adjust the file path names: FROM cr.fluentbit.io/fluent/fluent-bit:3.0.1 COPY ./workshop-fb.conf /fluent-bit/etc/fluent-bit.conf COPY ./inputs.conf /fluent-bit/etc/inputs.conf COPY ./outputs.conf /fluent-bit/etc/outputs.conf Now we'll build a new container image as follows using the Buildfile, naming it with a version tag, and assuming you are in the same directory (using Podman as discussed in previous articles): $ podman build -t workshop-fb:v4 -f Buildfile STEP 1/4: FROM cr.fluentbit.io/fluent/fluent-bit:3.0.1 STEP 2/4: COPY ./workshop-fb.conf /fluent-bit/etc/fluent-bit.conf --> a379e7611210 STEP 3/4: COPY ./inputs.conf /fluent-bit/etc/inputs.conf --> f39b10d3d6d0 STEP 4/4: COPY ./outputs.conf /fluent-bit/etc/outputs.conf COMMIT workshop-fb:v4 --> b06df84452b6 Successfully tagged localhost/workshop-fb:v4 b06df84452b6eb7a040b75a1cc4088c0739a6a4e2a8bbc2007608529576ebeba Now, to run our new container image: $ podman run workshop-fb:v4 The output looks exactly like the source output above, just with different timestamps. Again you can stop the container using CTRL_C. Now we have dirty ingested data coming into our pipeline, showing that we have multiple messages on the same key. To be able to clean this up for usage before passing on to the backend (output), we need to make use of both the Parser and Filter phases. First, in the Parser phase, where unstructured data is converted into structured data, we'll make use of the built in REGEX parser plugin to structure the duplicate messages into something more usable. For clarity, this is where we are working in our telemetry pipeline: To set up the parser configuration, we create a new file called parsers.conf in our favorite editor. 
Add the following configuration, where we are defining a PARSER, naming the parser message_cleaning_parser, selecting the built-in regex parser, and applying the regular expression shown here to convert each message into a structured format (note this actually is applied to incoming messages in the next phase of the telemetry pipeline): # This parser uses the built-in parser plugin and applies the # regex to all incoming events. # [PARSER] Name message_cleaning_parser Format regex Regex ^(?<valid_message>[^ ]+) (?<code>[^ ]+) (?<type>[^ ]+)$ Next up is the Filter phase where we will apply the parser. For clarity, the following visual is provided: In the Filter phase, the previously defined parser is put to the test. To set up the filter configuration we create a new file called filters.conf in our favorite editor. Add the following configuration where we are defining a FILTER, naming the filter message_parser, matching all incoming messages to apply this filter, looking for the key message to select the value to be fed into the parser, and applying the parser message_cleaning_parser to it: # This filter is applied to all events and uses the named parser to # apply values found with the chosen key if it exists. # [FILTER] Name parser Match * Key_Name message Parser message_cleaning_parser To make sure the new filter and parser are included, we update our main configuration file workshop-fb.conf as follows: # Fluent Bit main configuration file. [SERVICE] parsers_file parsers.conf # Imports section. @INCLUDE inputs.conf @INCLUDE outputs.conf @INCLUDE filters.conf To verify that our configuration works we can test run it with our Fluent Bit installation. Depending on the chosen install method, here we show how to run it using the source installation followed by the container version. Below, the source install is shown from the directory we created to hold all our configuration files: # source install. # $ [PATH_TO]/fluent-bit --config=workshop-fb.conf The console output should look something like this - noting that we've cut out the ASCII logo at startup: ... [2024/04/09 16:19:42] [ info] [input:dummy:dummy.0] initializing [2024/04/09 16:19:42] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only) [2024/04/09 16:19:42] [ info] [input:dummy:dummy.1] initializing [2024/04/09 16:19:42] [ info] [input:dummy:dummy.1] storage_strategy='memory' (memory only) [2024/04/09 16:19:42] [ info] [output:stdout:stdout.0] worker #0 started [2024/04/09 16:19:42] [ info] [sp] stream processor started [0] event.success: [[1712672383.962198000, {}], {"valid_message"=>"true", "code"=>"200", "type"=>"success"}] [0] event.error: [[1712672383.964528000, {}], {"valid_message"=>"false", "code"=>"500", "type"=>"error"}] [0] event.success: [[1712672384.961942000, {}], {"valid_message"=>"true", "code"=>"200", "type"=>"success"}] [0] event.error: [[1712672384.962105000, {}], {"valid_message"=>"false", "code"=>"500", "type"=>"error"}] ... Be sure to scroll to the right in the above window to see the full console output. Note the alternating generated event lines with parsed messages that now contain keys to simplify later querying. This runs until exiting with CTRL_C. Let's now try testing our configuration by running it using a container image. The first thing that is needed is to open in our favorite editor the file Buildfile. This is going to be expanded to include the filters and parsers configuration files. 
Note this file needs to be in the same directory as our configuration files; otherwise, adjust the file path names: FROM cr.fluentbit.io/fluent/fluent-bit:3.0.1 COPY ./workshop-fb.conf /fluent-bit/etc/fluent-bit.conf COPY ./inputs.conf /fluent-bit/etc/inputs.conf COPY ./outputs.conf /fluent-bit/etc/outputs.conf COPY ./filters.conf /fluent-bit/etc/filters.conf COPY ./parsers.conf /fluent-bit/etc/parsers.conf Now we'll build a new container image, naming it with a version tag as follows using the Buildfile and assuming you are in the same directory (using Podman as discussed in previous articles): $ podman build -t workshop-fb:v4 -f Buildfile STEP 1/6: FROM cr.fluentbit.io/fluent/fluent-bit:3.0.1 STEP 2/6: COPY ./workshop-fb.conf /fluent-bit/etc/fluent-bit.conf --> 7eee3091e091 STEP 3/6: COPY ./inputs.conf /fluent-bit/etc/inputs.conf --> 53ff32210b0e STEP 4/6: COPY ./outputs.conf /fluent-bit/etc/outputs.conf --> 62168aa0c600 STEP 5/6: COPY ./filters.conf /fluent-bit/etc/filters.conf --> 08f0878ded1e STEP 6/6: COPY ./parsers.conf /fluent-bit/etc/parsers.conf COMMIT workshop-fb:v4 --> 92825169e230 Successfully tagged localhost/workshop-fb:v4 92825169e230a0cc36764d6190ee67319b6f4dfc56d2954d267dc89dab8939bd Now to run our new container image: $ podman run workshop-fb:v4 The output looks exactly like the source output above, noting that the alternating generated event lines with parsed messages now contain keys to simplify later querying. This completes our use cases for this article, be sure to explore this hands-on experience with the accompanying workshop lab. What's Next? This article walked us through a telemetry pipeline use case for multiple events using parsing and filtering. The series continues with the next step where we'll explore how to collect metrics using a telemetry pipeline. Stay tuned for more hands on material to help you with your cloud native observability journey.
Businesses can react quickly and effectively to user behavior patterns by using real-time analytics. This allows them to take advantage of opportunities that might otherwise pass them by and prevent problems from getting worse. Apache Kafka, a popular event streaming platform, can be used for real-time ingestion of data/events generated from various sources across multiple verticals such as IoT, financial transactions, inventory, etc. This data can then be streamed into multiple downstream applications or engines for further processing and eventual analysis to support decision-making. Apache Flink serves as a powerful engine for refining or enhancing streaming data by modifying, enriching, or restructuring it upon arrival at the Kafka topic. In essence, Flink acts as a downstream application that continuously consumes data streams from Kafka topics for processing, and then ingests the processed data into various Kafka topics. Eventually, Apache Druid can be integrated to consume the processed streaming data from Kafka topics for analysis, querying, and making instantaneous business decisions. Click here for an enlarged view In my previous write-up, I explained how to integrate Flink 1.18 with Kafka 3.7.0. In this article, I will outline the steps to transfer processed data from Flink 1.18.1 to a Kafka 2.13-3.7.0 topic. A separate article detailing the ingestion of streaming data from Kafka topics into Apache Druid for analysis and querying was published a few months ago. You can read it here. Execution Environment We configured a multi-node cluster (three nodes) where each node has a minimum of 8 GB RAM and 250 GB SSD along with Ubuntu-22.04.2 amd64 as the operating system. OpenJDK 11 is installed with JAVA_HOME environment variable configuration on each node. Python 3 or Python 2 along with Perl 5 is available on each node. A three-node Apache Kafka-3.7.0 cluster has been up and running with Apache Zookeeper -3.5.6. on two nodes. Apache Druid 29.0.0 has been installed and configured on a node in the cluster where Zookeeper has not been installed for the Kafka broker. Zookeeper has been installed and configured on the other two nodes. The Leader broker is up and running on the node where Druid is running. Developed a simulator using the Datafaker library to produce real-time fake financial transactional JSON records every 10 seconds of interval and publish them to the created Kafka topic. Here is the JSON data feed generated by the simulator. JSON {"timestamp":"2024-03-14T04:31:09Z ","upiID":"9972342663@ybl","name":"Kiran Marar","note":" ","amount":"14582.00","currency":"INR","geoLocation":"Latitude: 54.1841745 Longitude: 13.1060775","deviceOS":"IOS","targetApp":"PhonePe","merchantTransactionId":"ebd03de9176201455419cce11bbfed157a","merchantUserId":"65107454076524@ybl"} Extract the archive of the Apache Flink-1.18.1-bin-scala_2.12.tgz on the node where Druid and the leader broker of Kafka are not running Running a Streaming Job in Flink We will dig into the process of extracting data from a Kafka topic where incoming messages are being published from the simulator, performing processing tasks on it, and then reintegrating the processed data back into a different topic of the multi-node Kafka cluster. We developed a Java program (StreamingToFlinkJob.java) that was submitted as a job to Flink to perform the above-mentioned steps, considering a window of 2 minutes and calculating the average amount transacted from the same mobile number (upi id) on the simulated UPI transactional data stream. 
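As an aside, the simulator's source is not included in this article. Purely as an illustrative sketch of how such a Datafaker-based generator could look (the topic name, broker address, and reduced field set below are assumptions based on the sample feed shown earlier, not the actual implementation):

Java
// Hypothetical simulator sketch: emits one fake UPI transaction as JSON every 10 seconds.
import net.datafaker.Faker;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.time.Instant;
import java.util.Properties;

public class UPITransactionSimulator {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // leader broker address (placeholder)
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Faker faker = new Faker();
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            while (true) {
                String upiId = faker.number().digits(10) + "@ybl";
                // Only a subset of the fields from the sample feed is shown here.
                String json = String.format(
                    "{\"timestamp\":\"%s\",\"upiID\":\"%s\",\"name\":\"%s\",\"amount\":\"%s\",\"currency\":\"INR\"}",
                    Instant.now(), upiId, faker.name().fullName(),
                    faker.number().randomDouble(2, 100, 20000));
                producer.send(new ProducerRecord<>("upi-transactions", upiId, json));
                Thread.sleep(10_000); // one record every 10 seconds, as described above
            }
        }
    }
}

In practice, the simulator would emit all of the fields shown in the sample JSON feed above.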
The required jar files have been included on the project build path or classpath. Using the code below, we can get the Flink execution environment inside the developed Java class. Java Configuration conf = new Configuration(); StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf); Next, inside the Java program, we read the messages/stream that the simulator has already published to the Kafka topic. Here is the code block. Java KafkaSource kafkaSource = KafkaSource.<UPITransaction>builder() .setBootstrapServers(IKafkaConstants.KAFKA_BROKERS) // IP address with port 9092 where the leader broker is running in the cluster .setTopics(IKafkaConstants.INPUT_UPITransaction_TOPIC_NAME) .setGroupId("upigroup") .setStartingOffsets(OffsetsInitializer.latest()) .setValueOnlyDeserializer(new KafkaUPISchema()) .build(); To retrieve information from Kafka, setting up a deserialization schema within Flink is crucial for processing events in JSON format, converting raw data into a structured form. Importantly, setParallelism must match the number of Kafka topic partitions; otherwise, the watermark won't work for the source and data is not released to the sink. Java DataStream<UPITransaction> stream = env.fromSource(kafkaSource, WatermarkStrategy.forBoundedOutOfOrderness(Duration.ofMinutes(2)), "Kafka Source").setParallelism(1); With successful event retrieval from Kafka, we can enhance the streaming job by incorporating processing steps. The subsequent code snippet reads Kafka data, organizes it by mobile number (upiID), and computes the average amount per mobile number. To accomplish this, we developed a custom window function for calculating the average and implemented watermarking to handle event time semantics effectively. Here is the code snippet: Java SerializableTimestampAssigner<UPITransaction> sz = new SerializableTimestampAssigner<UPITransaction>() { @Override public long extractTimestamp(UPITransaction transaction, long l) { try { SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'"); Date date = sdf.parse(transaction.eventTime); return date.getTime(); } catch (Exception e) { return 0; } } }; WatermarkStrategy<UPITransaction> watermarkStrategy = WatermarkStrategy.<UPITransaction>forBoundedOutOfOrderness(Duration.ofMillis(100)).withTimestampAssigner(sz); DataStream<UPITransaction> watermarkDataStream = stream.assignTimestampsAndWatermarks(watermarkStrategy); // Instead of event time, we can use a window based on processing time by using TumblingProcessingTimeWindows DataStream<TransactionAgg> groupedData = watermarkDataStream.keyBy("upiId").window(TumblingEventTimeWindows.of(Time.milliseconds(2500), Time.milliseconds(500))).apply(new TransactionAgg()); Eventually, the processing logic (computing the average amount for the same UPI ID, i.e., mobile number, over the 2-minute window on the continuous transaction stream) is executed inside Flink. Here is the code block for the window function that calculates the average amount for each UPI ID or mobile number.
Java public class TransactionAgg implements WindowFunction<UPITransaction, TransactionAgg, Tuple, TimeWindow> { @Override public void apply(Tuple key, TimeWindow window, Iterable<UPITransaction> values, Collector<TransactionAgg> out) { Integer sum = 0; // Consider whole numbers int count = 0; String upiID = null; for (UPITransaction value : values) { sum += value.amount; upiID = value.upiID; count++; } TransactionAgg output = new TransactionAgg(); output.upiID = upiID; output.eventTime = window.getEnd(); output.avgAmount = (sum / count); out.collect(output); } } We have processed the data. The next step is to serialize the object and send it to a different Kafka topic. Add a KafkaSink in the developed Java code (StreamingToFlinkJob.java) to send the processed data from the Flink engine to a different Kafka topic created on the multi-node Kafka cluster. Here is the code snippet to serialize the object before sending/publishing it to the Kafka topic: Java public class KafkaTrasactionSinkSchema implements KafkaRecordSerializationSchema<TransactionAgg> { @Override public ProducerRecord<byte[], byte[]> serialize( TransactionAgg aggTransaction, KafkaSinkContext context, Long timestamp) { try { return new ProducerRecord<>( topic, null, // partition not specified, so setting null aggTransaction.eventTime, aggTransaction.upiID.getBytes(), objectMapper.writeValueAsBytes(aggTransaction)); } catch (Exception e) { throw new IllegalArgumentException( "Exception on serialize record: " + aggTransaction, e); } } } And below is the code block that sinks the processed data back to a different Kafka topic. Java KafkaSink<TransactionAgg> sink = KafkaSink.<TransactionAgg>builder() .setBootstrapServers(IKafkaConstants.KAFKA_BROKERS) .setRecordSerializer(new KafkaTrasactionSinkSchema(IKafkaConstants.OUTPUT_UPITRANSACTION_TOPIC_NAME)) .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE) .build(); groupedData.sinkTo(sink); // the DataStream of TransactionAgg created above env.execute(); Connecting Druid With Kafka Topic In this final step, we need to integrate Druid with the Kafka topic to consume the processed data stream that is continuously published by Flink. With Apache Druid, we can connect Apache Kafka directly so that real-time data can be ingested continuously and subsequently queried to make business decisions on the spot, without involving any third-party system or application. Another advantage of Apache Druid is that we need not configure or install any third-party UI application to view the data that lands on or is published to the Kafka topic. To condense this article, I omitted the steps for integrating Druid with Apache Kafka. However, a few months ago, I published an article on this topic (linked earlier in this article). You can read it and follow the same approach. Final Note The code snippets provided above are for understanding purposes only. They illustrate the sequential steps of obtaining messages/data streams from a Kafka topic, processing the consumed data, and eventually sending/pushing the modified data into a different Kafka topic, which allows Druid to pick up the modified data stream for querying and analysis as a final step. Later, we will upload the entire codebase to GitHub if you are interested in executing it on your own infrastructure. I hope you enjoyed reading this. If you found this article valuable, please consider liking and sharing it.
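One supporting class referenced above but not shown is KafkaUPISchema, the value-only deserializer passed to the KafkaSource builder. As a rough sketch only (not the author's actual implementation; UPITransaction is the POJO used throughout the article, with fields such as upiID, amount, and eventTime), it could be built on Flink's DeserializationSchema and Jackson:

Java
// Sketch of a value-only deserialization schema for the UPI transaction JSON.
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import java.io.IOException;

public class KafkaUPISchema implements DeserializationSchema<UPITransaction> {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    public UPITransaction deserialize(byte[] message) throws IOException {
        // Convert the raw JSON bytes published by the simulator into a typed object.
        return MAPPER.readValue(message, UPITransaction.class);
    }

    @Override
    public boolean isEndOfStream(UPITransaction nextElement) {
        return false; // the transaction stream is unbounded
    }

    @Override
    public TypeInformation<UPITransaction> getProducedType() {
        return TypeInformation.of(UPITransaction.class);
    }
}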
Unit testing is an essential practice in software development that involves testing individual codebase components to ensure they function correctly. In Spring-based applications, developers often use Aspect-Oriented Programming (AOP) to separate cross-cutting concerns, such as logging, from the core business logic, thus enabling modularization and cleaner code. However, testing aspects in Spring AOP pose unique challenges due to their interception-based nature. Developers need to employ appropriate strategies and best practices to facilitate effective unit testing of Spring AOP aspects. This comprehensive guide aims to provide developers with detailed and practical insights on effectively unit testing Spring AOP aspects. The guide covers various topics, including the basics of AOP, testing the pointcut expressions, testing around advice, testing before and after advice, testing after returning advice, testing after throwing advice, and testing introduction advice. Moreover, the guide provides sample Java code for each topic to help developers understand how to effectively apply the strategies and best practices. By following the guide's recommendations, developers can improve the quality of their Spring-based applications and ensure that their code is robust, reliable, and maintainable. Understanding Spring AOP Before implementing effective unit testing strategies, it is important to have a comprehensive understanding of Spring AOP. AOP, or Aspect-Oriented Programming, is a programming paradigm that enables the separation of cross-cutting concerns shared across different modules in an application. Spring AOP is a widely used aspect-oriented framework that is primarily implemented using runtime proxy-based mechanisms. The primary objective of Spring AOP is to provide modularity and flexibility in designing and implementing cross-cutting concerns in a Java-based application. The key concepts that one must understand in Spring AOP include: Aspect: An aspect is a module that encapsulates cross-cutting concerns that are applied across multiple objects in an application. Aspects are defined using aspects-oriented programming techniques and are typically independent of the application's core business logic. Join point: A join point is a point in the application's execution where the aspect can be applied. In Spring AOP, a join point can be a method execution, an exception handler, or a field access. Advice: Advice is an action that is taken when a join point is reached during the application's execution. In Spring AOP, advice can be applied before, after, or around a join point. Pointcut: A pointcut is a set of joint points where an aspect's advice should be applied. In Spring AOP, pointcuts are defined using expressions that specify the join points based on method signatures, annotations, or other criteria. By understanding these key concepts, developers can effectively design and implement cross-cutting concerns in a Java-based application using Spring AOP. Challenges in Testing Spring AOP Aspects Unit testing Spring AOP aspects can be challenging compared to testing regular Java classes, due to the unique nature of AOP aspects. Some of the key challenges include: Interception-based behavior: AOP aspects intercept method invocations or join points, which makes it difficult to test their behavior in isolation. To overcome this challenge, it is recommended to use mock objects to simulate the behavior of the intercepted objects. 
Dependency Injection: AOP aspects may rely on dependencies injected by the Spring container, which requires special handling during testing. It is important to ensure that these dependencies are properly mocked or stubbed to ensure that the aspect is being tested in isolation and not affected by other components. Dynamic proxying: Spring AOP relies on dynamic proxies, which makes it difficult to directly instantiate and test aspects. To overcome this challenge, it is recommended to use Spring's built-in support for creating and configuring dynamic proxies. Complex pointcut expressions: Pointcut expressions can be complex, making it challenging to ensure that advice is applied to the correct join points. To overcome this challenge, it is recommended to use a combination of unit tests and integration tests to ensure that the aspect is being applied correctly. Transaction management: AOP aspects may interact with transaction management, introducing additional complexity in testing. To overcome this challenge, it is recommended to use a combination of mock objects and integration tests to ensure that the aspect is working correctly within the context of the application. Despite these challenges, effective unit testing of Spring AOP aspects is crucial for ensuring the reliability, maintainability, and correctness of the application. By understanding these challenges and using the recommended testing approaches, developers can ensure that their AOP aspects are thoroughly tested and working as intended. Strategies for Unit Testing Spring AOP Aspects Unit testing Spring AOP Aspects can be challenging, given the system's complexity and the multiple pieces of advice involved. However, developers can use various strategies and best practices to overcome these challenges and ensure effective unit testing. One of the most crucial strategies is to isolate aspects from dependencies when writing unit tests. This isolation ensures that the tests focus solely on the aspect's behavior without interference from other modules. Developers can accomplish this by using mocking frameworks such as Mockito, EasyMock, or PowerMockito, which allow them to simulate dependencies' behavior and control the test environment. Another best practice is to test each piece of advice separately. AOP Aspects typically consist of multiple pieces of advice, such as "before," "after," or "around" advice. Testing each piece of advice separately ensures that the behavior of each piece of advice is correct and that it functions correctly in isolation. It's also essential to verify that the pointcut expressions are correctly configured and target the intended join points. Writing tests that exercise different scenarios helps ensure the correctness of point-cut expressions. Aspects in Spring-based applications often rely on beans managed by the ApplicationContext. Mocking the ApplicationContext allows developers to provide controlled dependencies to the aspect during testing, avoiding the need for a fully initialized Spring context. Developers should also define clear expectations for the behavior of the aspect and use assertions to verify that the aspect behaves as expected under different conditions. Assertions help ensure that the aspect's behavior aligns with the intended functionality. Finally, if aspects involve transaction management, developers should consider testing transactional behavior separately. 
This can be accomplished by mocking transaction managers or using in-memory databases to isolate the transactional aspect of the code for testing. By employing these strategies and best practices, developers can ensure effective unit testing of Spring AOP Aspects, resulting in robust and reliable systems. Sample Code: Testing a Logging Aspect To gain a better understanding of testing Spring AOP aspects, let's take a closer look at the sample code. We will analyze the testing process step-by-step, emphasizing important factors to take into account, and providing further information to ensure clarity. Let's assume that we will be writing unit tests for the following main class: Java import org.aspectj.lang.JoinPoint; import org.aspectj.lang.annotation.Aspect; import org.aspectj.lang.annotation.Before; import org.springframework.stereotype.Component; @Aspect @Component public class LoggingAspect { @Before("execution(* com.example.service.*.*(..))") public void logBefore(JoinPoint joinPoint) { System.out.println("Logging before " + joinPoint.getSignature().getName()); } } The LoggingAspect class logs method executions with a single advice method, logBefore, which executes before methods in the com.example.service package. The LoggingAspectTest class contains unit tests for the LoggingAspect. Let's examine each part of the test method testLogBefore() in detail: Java import org.aspectj.lang.JoinPoint; import org.aspectj.lang.Signature; import org.junit.jupiter.api.Test; import static org.mockito.Mockito.*; public class LoggingAspectTest { @Test void testLogBefore() { // Given LoggingAspect loggingAspect = new LoggingAspect(); // Creating mock objects JoinPoint joinPoint = mock(JoinPoint.class); Signature signature = mock(Signature.class); // Configuring mock behavior when(joinPoint.getSignature()).thenReturn(signature); when(signature.getName()).thenReturn("methodName"); // When loggingAspect.logBefore(joinPoint); // Then // Verifying interactions with mock objects verify(joinPoint, times(1)).getSignature(); verify(signature, times(1)).getName(); // Additional assertions can be added to ensure correct logging behavior } } In the above code, there are several sections that play a vital role in testing. Firstly, the Given section sets up the test scenario. We do this by creating an instance of the LoggingAspect and mocking the JoinPoint and Signature objects. By doing so, we can control the behavior of these objects during testing. Next, we create mock objects for the JoinPoint and Signature using the Mockito mocking framework. This allows us to simulate behavior without invoking real instances, providing a controlled environment for testing. We then use Mockito's when() method to specify the behavior of the mock objects. For example, we define that when thegetSignature() method of the JoinPoint is called, it should return the mock Signature object we created earlier. In the When section, we invoke the logBefore() method of the LoggingAspect with the mocked JoinPoint. This simulates the execution of the advice before a method call, which triggers the logging behavior. Finally, we use Mockito's verify() method to assert that specific methods of the mocked objects were called during the execution of the advice. For example, we verify that the getSignature() and getName() methods were called once each. Although not demonstrated in this simplified example, additional assertions can be added to ensure the correctness of the aspect's behavior. 
For instance, we could assert that the logging message produced by the aspect matches the expected format and content. Additional Considerations Testing pointcut expressions: Pointcut expressions define where advice should be applied within the application. Writing tests that verify the correctness of pointcut expressions, as sketched after the conclusion below, ensures that the advice is applied to the intended join points. Testing aspect behavior: Aspects may perform more complex actions beyond simple logging. Unit tests should cover every facet of the aspect's behavior to ensure its correctness, including handling method parameters, logging additional information, or interacting with other components. Integration testing: While unit tests focus on isolating aspects, integration tests may be necessary to verify the interactions between aspects and other components of the application, such as service classes or controllers. By following these principles and best practices, developers can create thorough and reliable unit tests for Spring AOP aspects, ensuring the stability and maintainability of their applications. Conclusion Unit testing Spring AOP aspects is crucial for reliable and correct aspect-oriented code. To create robust tests, isolate aspects, use mocking frameworks, test each piece of advice separately, verify pointcut expressions, and assert expected behavior. The sample code above provides a starting point for Java applications. With proper testing strategies in place, developers can confidently maintain and evolve AOP-based functionality in their Spring applications.
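As a closing illustration of the "testing pointcut expressions" consideration above, one possible approach, sketched here with a hypothetical UserService standing in for real service code, is to evaluate the expression directly with Spring's AspectJExpressionPointcut instead of bootstrapping a full context:

Java
// Sketch: verifying a pointcut expression without starting a Spring context.
package com.example.service;

import org.junit.jupiter.api.Test;
import org.springframework.aop.aspectj.AspectJExpressionPointcut;
import java.lang.reflect.Method;
import static org.junit.jupiter.api.Assertions.assertTrue;

// Hypothetical service used only to exercise the pointcut expression.
class UserService {
    public String findUser(String id) { return id; }
}

public class LoggingAspectPointcutTest {

    @Test
    void pointcutMatchesServiceMethods() throws NoSuchMethodException {
        AspectJExpressionPointcut pointcut = new AspectJExpressionPointcut();
        // Same expression as used in the LoggingAspect's @Before advice.
        pointcut.setExpression("execution(* com.example.service.*.*(..))");

        Method method = UserService.class.getMethod("findUser", String.class);
        // The expression should match methods declared in the com.example.service package.
        assertTrue(pointcut.matches(method, UserService.class));
    }
}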
The Data Story At the core of every software application, from the simplest to the most complex, operating at scale to serve millions of users with low-latency requests, lies a foundational element: data. For over three decades, relational database management systems (RDBMS) have been at the forefront of this domain. These systems, which began by simply storing data in a table format of rows for records and columns for attributes, have undergone significant advancements and innovations that have revolutionized structured and semi-structured data storage. Relational database models have established themselves as the foundation of structured data handling; they are renowned for their reliability and have been battle-tested at massive big-data scale in enterprise applications. However, as we move deeper into the era of big data and artificial intelligence (AI), the limitations of traditional RDBMS in handling unstructured data, such as images, videos, audio, and natural language, have become increasingly apparent. Enter the vector database, a cutting-edge innovation tailored for the age of AI that is significantly changing recommendation systems. Unlike RDBMS, which excel at managing structured data, vector databases are designed to handle and query high-dimensional vector embeddings, a form of unstructured data representation that is central to modern machine learning algorithms. Introduction: Vector DB Vector embeddings allow complex data like text, images, and sounds to be transformed into numerical vectors, capturing the essence of the data in a way that machines can process. This transformation is crucial for tasks such as similarity search, recommendation systems, and natural language processing, where understanding the nuanced relationships between data points is key. Vector databases leverage specialized indexing and search algorithms to efficiently query these embeddings, enabling applications that were previously challenging or impossible with traditional RDBMS. Fundamental Difference Between RDBMS and Vector Databases The application interacts with an RDBMS by executing various transactions and actions, which are stored in the form of rows and columns. When it comes to a vector database, the interaction looks a bit different. Below, you can see the different types of files that are read and processed by many types of AI models to create vector embeddings. Example in Action Consider the process of transforming a comprehensive movie database, such as IMDB, into a format where each movie is represented by vector embeddings and stored in a vector database. This transformation allows the database to leverage the power of vector embeddings to significantly enhance the user search experience. Because these vectors are organized within a high-dimensional space, search engineers can more efficiently perform queries across the movie database. This spatial organization not only streamlines the retrieval process but also enables the implementation of sophisticated search functionalities, such as finding movies with similar themes or genres, thereby creating a more intuitive and responsive search experience for users. Now, we will demonstrate in Python how to convert textual movie data, similar to the tables mentioned above, into vector representations using BERT (Bidirectional Encoder Representations from Transformers), a pre-trained deep learning model developed by Google.
This process entails several crucial steps for transforming the text into a format that the model can process, followed by the extraction of meaningful embeddings. Let's break down each step. Step 1 Python #Import Libraries import sqlite3 from transformers import BertTokenizer, BertModel import torch sqlite3: This imports the SQLite3 library, which allows Python to interact with SQLite databases. It's used here to access a database containing IMDB movie information. from transformers import BertTokenizer, BertModel: These imports from the Hugging Face transformers library bring in the necessary tools to tokenize text data (BertTokenizer) and to load the pre-trained BERT model (BertModel) for generating vector embeddings. import torch: This imports PyTorch, a deep learning framework that BERT and many other models in the transformers library are built on. It's used for managing tensors, which are multi-dimensional arrays that serve as the basic building blocks of data for neural networks. Step 2 Python #Initialize Tokenizer and Model tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') model = BertModel.from_pretrained('bert-base-uncased') tokenizer: This initializes the BERT tokenizer, configuring it to split input text into tokens that the BERT model can understand. The from_pretrained('bert-base-uncased') method loads a tokenizer trained in lowercase English text. model: This initializes the BERT model itself, also using the from_pretrained method to load a version trained in lowercase English. This model is what will generate the embeddings from the tokenized text. Step 3 Python # Connect to Database and Fetch Movie Data conn = sqlite3.connect('path/to/your/movie_database.db') cursor = conn.cursor() cursor.execute("SELECT name, genre, release_date, length FROM movies") movies = cursor.fetchall() conn = sqlite3.connect('path/to/your/movie_database.db'): Opens a connection to an SQLite database file that contains your movie data cursor = conn.cursor(): Creates a cursor object which is used to execute SQL commands through the connection cursor.execute(...): Executes an SQL command to select specific columns (name, genre, release date, length) from the movies table movies = cursor.fetchall(): Retrieves all the rows returned by the SQL query and stores them in the variable movies Step 4 Python #Convert Movie Data to Vector Embeddings movie_vectors = [] for movie in movies: movie_data = ', '.join(str(field) for field in movie) inputs = tokenizer(movie_data, return_tensors="pt", padding=True, truncation=True, max_length=512) with torch.no_grad(): outputs = model(**inputs) movie_vector = outputs.last_hidden_state[:, 0, :].numpy() movie_vectors.append(movie_vector) movie_vectors = []: Initializes an empty list to store the vector embeddings for each movie For loop: Iterates over each movie retrieved from the database movie_data = ', '.join(...): Concatenates the movie's details into a single string inputs = tokenizer(...): Uses the BERT tokenizer to prepare the concatenated string for the model, converting it into a tensor with torch.no_grad():: Temporarily disables gradient computation, which is unnecessary during inference (model.predict) outputs = model(**inputs): Feeds the tokenized input to the BERT model to get the embeddings movie_vector = ...: Extracts the embedding of the [CLS] token, which represents the entire input sequence movie_vectors.append(movie_vector): Adds the movie's vector embedding to the list Output movie_vectors: At the end of this script, you have a list of 
vector embeddings, one for each movie in your database. These vectors encapsulate the semantic information of the movies' names, genres, release dates, and durations in a form that machine learning models can work with. Conclusion In our example of vector database, movies such as "Inception" and "The Matrix" known for their action-packed, thought-provoking narratives, or "La La Land" and "Eternal Sunshine of the Spotless Mind," which explore complex romantic themes are transformed into high-dimensional vectors using BERT, a deep learning model. These vectors capture not just the overt categories like genre or release year, but also subtler thematic and emotional nuances encoded in their descriptions. Once stored in a vector database, these embeddings can be queried efficiently to perform similarity searches. When a user searches for a film with a particular vibe or thematic element, the streaming service can quickly identify and suggest films that are "near" the user's interests in the vector space, even if the user's search terms don't directly match the movie's title, genre, or other metadata. For instance, a search for "dream manipulation movies" might not only return "Inception" but also suggest "The Matrix," given their thematic similarities represented in the vector space. This method of storage and retrieval significantly enriches the user experience on streaming platforms, facilitating a discovery process that aligns content with both the user's interests and current mood. It’s designed to lead to "aha moments," where users uncover hidden gems, especially valuable when navigating the vast catalogs and offerings of streaming services. By detailing the creation and application of vector embeddings from textual movie data, we demonstrate the significant use of machine learning and vector databases in revolutionizing search capabilities and elevating the user experience in digital content ecosystems, particularly within streaming video services.
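The "nearness" described above ultimately reduces to a vector similarity computation. As a language-agnostic illustration (sketched here in Java with made-up, low-dimensional embeddings rather than real BERT output), cosine similarity between two embedding vectors can be computed as follows:

Java
// Illustrative only: cosine similarity between two embedding vectors, the core
// operation behind the "nearest in vector space" searches described above.
public final class CosineSimilarity {

    public static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical 4-dimensional embeddings standing in for real BERT vectors.
        double[] inception = {0.9, 0.1, 0.4, 0.3};
        double[] theMatrix = {0.8, 0.2, 0.5, 0.2};
        double[] laLaLand  = {0.1, 0.9, 0.2, 0.7};

        System.out.printf("Inception vs The Matrix: %.3f%n", cosine(inception, theMatrix));
        System.out.printf("Inception vs La La Land: %.3f%n", cosine(inception, laLaLand));
        // A higher score means the movies sit closer together in the vector space.
    }
}

Production vector databases apply the same idea at scale, using approximate nearest-neighbor indexes so that a query does not have to be compared against every stored embedding.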
What Is the C4 Model? The C4 model is a hierarchical framework designed to help software architects and developers visualize and communicate the essential aspects of software architecture in a clear and structured way. Unlike traditional diagramming approaches that often result in cluttered and overly complex diagrams, the C4 model focuses on simplicity and abstraction to convey architectural concepts effectively. The next question is which tool you use to create said diagrams. You can use Visio, draw.io, PlantUML, even PowerPoint, or whatever tool you normally use for creating diagrams. However, these tools do not check whether naming, relations, etc. are consistently used in the different diagrams. Besides that, it might be difficult to review new versions of diagrams because it is not clear which changes are made. In order to solve these problems, Simon Brown, the author of the C4 model, created Structurizr. What Is Structurizr? Structurizr allows you to create diagrams as code. Based on the code, Structurizr visualizes the diagrams for you and you can interact with the visualization. Because the diagrams are maintained in code, you can add them to your version control system (git), and changes in the diagrams are tracked and can be easily reviewed. In a previous article, some features of Structurizr are explored. Structurizr Lite was used, which supports only one workspace. However, if you have a more diverse system landscape, Structurizr Lite is not sufficient anymore. You will have multiple workspaces, one for every software system. You also probably want an overview of your entire system landscape. In this article, you will explore how you can use Structurizr to maintain not only the software architecture of one system but your entire system landscape as code. Sources used in this blog can be found at GitHub. Prerequisites Prerequisites for this blog are: Basic knowledge of the C4 model Basic knowledge of Docker Basic knowledge of Structurizr Linux is used — if you are using a different Operating System, you will need to adjust the commands accordingly Installation As mentioned before, Structurizr Lite cannot be used for this scenario. Instead, you need to install Structurizr on-premises. Create in the root of the repository a data directory. This directory will be mapped as a volume in the docker container. If you have executed the previous blog, ensure that you clean the data directory first. With Structurizr Lite, it is intended that you can edit files in this data directory, with Structurizr on-premises it is advised not to alter the files in the data directory. Structurizr on-premises should be run on a separate server and a normal user should not have access to the data directory anyway. Execute the following command from within the root of the repository: Shell $ docker run -it --rm -p 8080:8080 -v ./data:/usr/local/structurizr structurizr/onpremises Navigate in your browser to http://localhost:8080, log in with the default user structurizr and password password, and the Structurizr webpage is shown. Single Workspace First, let’s see how you can create a single workspace with Structurizr on-premises. Click New workspace, and an empty workspace is created. It is not possible anymore to edit files on your host machine, just like with Structurizr Lite. So, how can you upload your DSL files to the workspace? In order to do so, you need Structurizr CLI. At the moment of writing, v2024.02.22 is the latest version, which can be downloaded as a zip from GitHub. 
Unpack the zip file, and add the directory to your path. You will upload the latest version of the software system from the previous blog. The DSL is located in the workspaces/3-basic-styles directory. Navigate to this directory. To push the DSL to Structurizr, you will make use of the push command. The push command needs some parameters, which can be found in the settings of the Structurizr workspace. You need the information as shown under API details. Below this information, the parameters can easily be copied. Execute the following command, replacing the parameters for your situation: Shell $ structurizr.sh push -url http://localhost:8080/api -id 1 -key 2607de22-7ce0-4eb1-9f28-1e7e9979121a -secret 09528dfd-0c0a-4380-85cb-766b8da5e1dc -workspace workspace.dsl Pushing workspace 1 to http://localhost:8080/api - creating new workspace - parsing model and views from /<path to project directory>/MyStructurizrPlanet/workspaces/3-basic-styles/workspace.dsl - merge layout from remote: true - storing previous version of workspace in null - pushing workspace Getting workspace with ID 1 Putting workspace with ID 1 {"success":true,"message":"OK","revision":2} - finished If everything goes well, the DSL is pushed successfully. The System Context and Container diagram are now added to the workspace. Workspace Features In this section, some interesting features of Structurizr on-premises are shown. 5.1 Version Control Every upload automatically creates a new version. It is also possible to retrieve an older version. 5.2 Error Checking The Inspections in the left menu, gives you an overview of errors in your DSL. 5.3 Reviews When you open a diagram, you can create a review. When creating the review, you can choose which diagrams need to be reviewed, what kind of review you are requesting and whether unauthenticated access is allowed or not. The reviewer can add comments of course. Next to the Public review text, a link to a checklist is present which can help you executing the review. Create System Landscape Using DSL Only The above examples consist of diagrams for a single software system. Often, multiple software systems are used in an organization. These software systems interact with each other and thus form together a system landscape. Each team will be responsible for its own software system diagrams, but it is also necessary to have a diagram containing the larger picture. Let’s explore whether this is possible using Structurizr. You will be using an example based on the enterprise example provided at the Structurizr GitHub repository. The files can be found in workspaces/4-system-landscape. Create a new workspace via the UI, navigate to the 4-system-landscape directory, and push the customer-service DSL to this workspace. Shell $ structurizr.sh push -url http://localhost:8080/api -id 2 -key f24fe705-a508-4f8d-9cf7-3fc7b323f293 -secret 02c6597f-c750-47e0-9b88-f6e26fccdf38 -workspace customer-service/workspace.dsl In the same way, create a workspace for the invoice-service and the order-service. Push the corresponding DSL to each workspace. A separate system-landscape DSL is present, which uses a plugin to create the relationships between the software systems. Create a workspace for this DSL and push it. Shell $ structurizr.sh push -url http://localhost:8080/api -id 5 -key cb18cabb-61c7-4c3a-a58e-2e97ff0fa285 -secret a638aa99-73cd-427d-8188-3788e678129f -workspace system-landscape/workspace.dsl This creates the system landscape overview. 
However, two issues are encountered with this view: It is not possible to click on, for example, the Order Service in order to open the software system diagram for the Order Service. The DSL of the Customer Service does not define the relationship with the Order Service and the Invoice Service, as can be seen in the diagram below. It would be nice if this inconsistency were reported one way or the other. I asked a question about this on GitHub and used the answer to create a solution that can be found in the following paragraphs. Create System Landscape Using Java The solution to the absence of links to the different services is to make use of the Java Structurizr library. With this library, you have much more control to achieve the desired functionality. I used the source code from the example in the Structurizr repository and added it to the directory: workspaces/5-system-landscape. The pom file contains the necessary dependencies to run the code, and the maven-assembly-plugin is added to create a fat jar. The code executes the following steps: Create a workspace for the system landscape. Create workspaces for each service. Generate the system landscape by parsing the workspaces' metadata, creating the necessary relationships, adding a link to the services, and creating a view for the system landscape. Execute the following command from within the workspaces/5-system-landscape directory in order to build the fat jar. Shell $ mvn clean package Run the code and an error occurs. Shell $ java -jar target/mystructurizrplanet-1.0-SNAPSHOT-jar-with-dependencies.jar Mar 02, 2024 11:41:12 AM com.structurizr.api.AdminApiClient createWorkspace SEVERE: com.structurizr.api.StructurizrClientException: The API key is not configured for this installation - please refer to the documentation Exception in thread "main" com.structurizr.api.StructurizrClientException: com.structurizr.api.StructurizrClientException: The API key is not configured for this installation - please refer to the documentation at com.structurizr.api.AdminApiClient.createWorkspace(AdminApiClient.java:109) at com.mydeveloperplanet.mystructurizrplanet.CreateSystemLandscape.main(CreateSystemLandscape.java:30) Caused by: com.structurizr.api.StructurizrClientException: The API key is not configured for this installation - please refer to the documentation at com.structurizr.api.AdminApiClient.createWorkspace(AdminApiClient.java:105) ... 1 more To use the Java library, you need an API key. This API key is disabled by default. To enable it, you need to add a file structurizr.properties to your data directory. In the properties file, you set the API key to its bcrypt-encoded value. Properties files structurizr.apiKey=$2a$10$ekjju1h3fC1y2YAln7wqxuJ.q0gBjQoFPX/Wvmzr.L5aIdoqvUIwa Add read permissions to the file. Shell $ chmod o+r data/structurizr.properties Restart the Docker container and execute the jar file again.
Shell $ java -jar target/mystructurizrplanet-1.0-SNAPSHOT-jar-with-dependencies.jar Mar 02, 2024 11:50:03 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 7 Mar 02, 2024 11:50:04 AM com.structurizr.api.WorkspaceApiClient putWorkspace INFO: Putting workspace with ID 7 Mar 02, 2024 11:50:04 AM com.structurizr.api.WorkspaceApiClient putWorkspace INFO: {"success":true,"message":"OK","revision":2} Mar 02, 2024 11:50:04 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 8 Mar 02, 2024 11:50:04 AM com.structurizr.api.WorkspaceApiClient putWorkspace INFO: Putting workspace with ID 8 Mar 02, 2024 11:50:04 AM com.structurizr.api.WorkspaceApiClient putWorkspace INFO: {"success":true,"message":"OK","revision":2} Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 9 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient putWorkspace INFO: Putting workspace with ID 9 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient putWorkspace INFO: {"success":true,"message":"OK","revision":2} Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 1 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 2 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 3 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 4 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 5 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 6 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 7 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 8 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 9 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient getWorkspace INFO: Getting workspace with ID 6 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient putWorkspace INFO: Putting workspace with ID 6 Mar 02, 2024 11:50:05 AM com.structurizr.api.WorkspaceApiClient putWorkspace INFO: {"success":true,"message":"OK","revision":2} If you open the system landscape workspace, it is now possible to double-click one of the services, and you will be navigated to the corresponding service. Great, but there are some caveats to mention: This source code always creates new workspaces every time you run it. This is just an example of what is possible using the Java library. If you want to update existing workspaces, you will need to alter the source code for this purpose. The source code contains a hardcoded API key in plain text. You should not do this in a production environment. Validate Relationships Is it possible to validate the relationships using the Java library? Yes, it is. An example of the source code can be found in directory workspaces/6-validate-relationships. This code will validate offline whether the DSL contains the correct relationships. It is only intended to prove that the validation can be done. For using this in production, the source code needs to be made more robust. Build the code and run it. 
Shell $ mvn clean package $ java -jar target/validaterelationships-1.0-SNAPSHOT-jar-with-dependencies.jar missing relation in CustomerService {2 | Order Service | } ---[Manages customer data using]---> {4 | Customer Service | } missing relation in CustomerService {3 | Invoice Service | } ---[Gets customer data from]---> {4 | Customer Service | } The validation finds the two errors in the Customer Service. Add the relationships to the Customer Service DSL. Plain Text model { !extend customerService { api = container "Customer API" database = container "Customer Database" api -> database "Reads from and writes to" orderService -> customerService "Gets customer data from" invoiceService -> customerService "Gets customer data from" } } Build the code and run it. The errors are gone and the relationships are visible in the Customer Service if you run the code from the previous paragraph. Conclusion Structurizr offers many features to get a grip on your software architecture. It also allows you to generate a system landscape and to implement several customizations, e.g. custom validation checks. You need to learn the Java Structurizr library, but the learning curve is not very steep.
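To give a flavor of the Java library used throughout this article, here is a rough, self-contained sketch of my own (simplified compared to the workshop code, and assuming the structurizr-dsl dependency is on the classpath; the file name is hypothetical) that parses a DSL workspace and prints every model-level relationship — the raw material for the kind of validation described above:

Java
import java.io.File;

import com.structurizr.Workspace;
import com.structurizr.dsl.StructurizrDslParser;
import com.structurizr.model.Relationship;

public class PrintRelationships {

    public static void main(String[] args) throws Exception {
        // Parse the DSL file into a workspace (the path is illustrative).
        StructurizrDslParser parser = new StructurizrDslParser();
        parser.parse(new File("workspace.dsl"));
        Workspace workspace = parser.getWorkspace();

        // Print every relationship in the model so missing ones can be spotted
        // against the expected system landscape.
        for (Relationship relationship : workspace.getModel().getRelationships()) {
            System.out.printf("%s ---[%s]---> %s%n",
                    relationship.getSource().getName(),
                    relationship.getDescription(),
                    relationship.getDestination().getName());
        }
    }
}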
Serverless architectures have emerged as a paradigm-shifting approach to building fast, scalable, and cost-efficient applications. While serverless architectures provide unparalleled flexibility, they also introduce new challenges in terms of monitoring and troubleshooting. In this article, we'll explore how Quarkus integrates with AWS X-Ray and how using a Jakarta CDI Interceptor can keep your code clean while adding custom instrumentation. Quarkus and AWS Lambda Quarkus is a Java-based framework tailored for GraalVM and HotSpot, which results in an amazingly fast boot time while having an incredibly low memory footprint. It offers near-instant scale-up and high-density memory utilization, which can be very useful for container orchestration platforms like Kubernetes or serverless runtimes like AWS Lambda. Building AWS Lambda Functions can be as easy as starting a Quarkus project, adding the quarkus-amazon-lambda dependency, and defining your AWS Lambda handler function (a minimal handler sketch is shown at the end of this section). XML <dependency> <groupId>io.quarkus</groupId> <artifactId>quarkus-amazon-lambda</artifactId> </dependency> An extensive guide on how to develop AWS Lambda Functions with Quarkus can be found in the official Quarkus AWS Lambda Guide. Enabling X-Ray for Your Lambda Functions Quarkus provides out-of-the-box support for X-Ray, but you will need to add a dependency to your project and configure some settings to make it work with GraalVM/native compiled Quarkus applications. Let's first start by adding the quarkus-amazon-lambda-xray dependency. XML <!-- adds dependency on required x-ray classes and adds support for graalvm native --> <dependency> <groupId>io.quarkus</groupId> <artifactId>quarkus-amazon-lambda-xray</artifactId> </dependency> Don't forget to enable tracing for your Lambda function; otherwise, it won't work. One way of doing that is by setting the tracing property to Tracing.ACTIVE within your AWS CDK code. Java function = Function.Builder.create(this, "feed-parsing-function") ... .memorySize(512) .tracing(Tracing.ACTIVE) .runtime(Runtime.PROVIDED_AL2023) .logRetention(RetentionDays.ONE_WEEK) .build(); After the deployment of your function and a function invocation, you should be able to see the X-Ray traces from within the CloudWatch interface. By default, it will show you some basic timing information for your function, like the initialization and the invocation duration. Adding More Instrumentation Now that the dependencies are in place and tracing is enabled for our function, we can enrich the traces in X-Ray by leveraging the X-Ray SDK's TracingInterceptor. For instance, for the SQS and DynamoDB clients, you can explicitly set the interceptor inside the application.properties file. Plain Text quarkus.dynamodb.async-client.type=aws-crt quarkus.dynamodb.interceptors=com.amazonaws.xray.interceptors.TracingInterceptor quarkus.sqs.async-client.type=aws-crt quarkus.sqs.interceptors=com.amazonaws.xray.interceptors.TracingInterceptor After putting these properties in place, redeploying, and executing the function, the TracingInterceptor will wrap around each API call to SQS and DynamoDB and record the call information as part of the trace. This is very useful for debugging purposes as it will allow you to validate your code and check for any mistakes. Requests to AWS services are part of the pricing model, so if you make a mistake in your code and you make too many calls, it can become quite costly.
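As referenced above, a handler for the quarkus-amazon-lambda extension is just a CDI bean implementing the AWS RequestHandler interface. The following is a minimal, hypothetical sketch (the class name, payload types, and @Named value are illustrative and not taken from the original project):

Java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import jakarta.inject.Named;

// If more than one handler exists in the project, quarkus.lambda.handler selects it by this name.
@Named("feed-parsing")
public class FeedParsingHandler implements RequestHandler<String, String> {

    @Override
    public String handleRequest(String input, Context context) {
        // Business logic would go here; we simply echo the input for illustration.
        return "processed: " + input;
    }
}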
Custom Subsegments With the AWS SDK TracingInterceptor configured, we get information about the calls to the AWS APIs, but what if we want to see information about our own code or remote calls to services outside of AWS? The Java SDK for X-Ray supports the concept of adding custom subsegments to your traces. You can add subsegments to a trace by adding a few lines of code to your own business logic as you can see in the following code snippet. Java public void someMethod(String argument) { // wrap in subsegment Subsegment subsegment = AWSXRay.beginSubsegment("someMethod"); try { // Your business logic } catch (Exception e) { subsegment.addException(e); throw e; } finally { AWSXRay.endSubsegment(); } } Although this is trivial to do, it will become quite messy if you have a lot of methods you want to apply tracing to. This isn't ideal, and it would be better if we didn't have to mix our own code with the X-Ray instrumentation. Quarkus and Jakarta CDI Interceptors The Quarkus programming model is based on the Lite version of the Jakarta Contexts and Dependency Injection 4.0 specification. Besides dependency injection, the specification also describes other features like: Lifecycle Callbacks — A bean class may declare lifecycle @PostConstruct and @PreDestroy callbacks. Interceptors — Used to separate cross-cutting concerns from business logic. Decorators — Similar to interceptors, but because they implement interfaces with business semantics, they are able to implement business logic. Events and Observers — Beans may also produce and consume events to interact in a completely decoupled fashion. As mentioned, CDI Interceptors are used to separate cross-cutting concerns from business logic. As tracing is a cross-cutting concern, this sounds like a great fit. Let's take a look at how we can create an interceptor for our AWS X-Ray instrumentation. How to Create an Interceptor for AWS X-Ray Instrumentation We start with defining our interceptor binding, which we will call XRayTracing. Interceptor bindings are intermediate annotations that may be used to associate interceptors with target beans. Java package com.jeroenreijn.aws.quarkus.xray; import jakarta.annotation.Priority; import jakarta.interceptor.InterceptorBinding; import java.lang.annotation.Retention; import static java.lang.annotation.RetentionPolicy.RUNTIME; @InterceptorBinding @Retention(RUNTIME) @Priority(0) public @interface XRayTracing { } The next step is to define the actual Interceptor logic, which is the code that will add the additional X-Ray instructions for creating the subsegment and wrapping it around our business logic. Java package com.jeroenreijn.aws.quarkus.xray; import com.amazonaws.xray.AWSXRay; import jakarta.interceptor.AroundInvoke; import jakarta.interceptor.Interceptor; import jakarta.interceptor.InvocationContext; @Interceptor @XRayTracing public class XRayTracingInterceptor { @AroundInvoke public Object tracingMethod(InvocationContext ctx) throws Exception { AWSXRay.beginSubsegment("## " + ctx.getMethod().getName()); try { return ctx.proceed(); } catch (Exception e) { AWSXRay.getCurrentSubsegment().addException(e); throw e; } finally { AWSXRay.endSubsegment(); } } } An important part of the interceptor is the @AroundInvoke annotation, which means that this interceptor code will be wrapped around the invocation of our own business logic. Now that we've defined both our interceptor binding and our interceptor, it's time to start using it. 
Every method that we want to create a subsegment for can now be annotated with the @XRayTracing annotation. Java @XRayTracing public SyndFeed getLatestFeed() { InputStream feedContent = getFeedContent(); return getSyndFeed(feedContent); } @XRayTracing public SyndFeed getSyndFeed(InputStream feedContent) { try { SyndFeedInput feedInput = new SyndFeedInput(); return feedInput.build(new XmlReader(feedContent)); } catch (FeedException | IOException e) { throw new RuntimeException(e); } } That looks much better. Pretty clean, if I say so myself. Based on the hierarchy of subsegments for a trace, X-Ray will be able to show a nested tree structure with the timing information. Closing Thoughts The integration between Quarkus and X-Ray is quite simple to enable. The developer experience is really good out of the box with defining the interceptors on a per-client basis. With the help of CDI interceptors, you can keep your code clean without worrying too much about X-Ray-specific code inside your business logic. An alternative to building your own Interceptor might be to start using AWS PowerTools for Lambda (Java). Powertools for Java is a great way to boost your developer productivity, but it can be used for more than X-Ray, so I’ll save it for another post.
In Java programming, object creation or instantiation of a class is done with the "new" operator and a public constructor declared in the class, as below. Java Clazz clazz = new Clazz(); We can read the code snippet as follows: Clazz() is the default public constructor called with the "new" operator to create or instantiate an object of the Clazz class, which is assigned to the variable clazz, whose type is Clazz. While creating a singleton, we have to ensure that only one object is created, i.e., that only one instantiation of the class takes place. To ensure this, the following are the common prerequisites. All constructors need to be declared private. This prevents the creation of objects with the "new" operator outside the class. A private constant/variable object holder to hold the singleton object is needed; i.e., a private static or a private static final class variable needs to be declared. It holds the singleton object and acts as the single source of reference for it. By convention, the variable is named INSTANCE or instance. A static method to allow access to the singleton object by other objects is required. This static method is also called a static factory method, as it controls the creation of objects for the class. By convention, the method is named getInstance(). With this understanding, let us delve deeper into singletons. Following are six ways one can create a singleton object for a class. 1. Static Eager Singleton Class When we have all the instance properties in hand, and we would like a single object whose class provides structure and behavior for a group of related properties, we can use the static eager singleton class. This is well-suited for application configuration and application properties. Java public class EagerSingleton { private static final EagerSingleton INSTANCE = new EagerSingleton(); private EagerSingleton() {} public static EagerSingleton getInstance() { return INSTANCE; } public static void main(String[] args) { EagerSingleton eagerSingleton = EagerSingleton.getInstance(); } } The singleton object is created while loading the class itself in the JVM and assigned to the INSTANCE constant. getInstance() provides access to this constant. While compile-time dependencies over properties are good, sometimes run-time dependencies are required. In such a case, we can make use of a static block to instantiate the singleton. Java public class EagerSingleton { private static EagerSingleton instance; private EagerSingleton(){} // static block executed during Class loading static { try { instance = new EagerSingleton(); } catch (Exception e) { throw new RuntimeException("Exception occurred in creating EagerSingleton instance"); } } public static EagerSingleton getInstance() { return instance; } } The singleton object is created while loading the class itself in the JVM, as all static blocks are executed while loading. Access to the instance variable is provided by the getInstance() static method. 2. Dynamic Lazy Singleton Class The static eager singleton is well-suited for application configuration and application properties. However, consider heterogeneous container creation, object pool creation, layer creation, facade creation, flyweight object creation, context preparation per request and session, etc.: they all require dynamic construction of a singleton object for better "separation of concerns." In such cases, dynamic lazy singletons are required.
Java public class LazySingleton { private static LazySingleton instance; private LazySingleton(){} public static LazySingleton getInstance() { if (instance == null) { instance = new LazySingleton(); } return instance; } } The singleton object is created only when the getInstance() method is called. Unlike the static eager singleton class, this class is not thread-safe. Java public class LazySingleton { private static LazySingleton instance; private LazySingleton(){} public static synchronized LazySingleton getInstance() { if (instance == null) { instance = new LazySingleton(); } return instance; } } The getInstance() method needs to be synchronized to ensure that singleton instantiation is thread-safe. 3. Dynamic Lazy Improved Singleton Class Java public class LazySingleton { private static LazySingleton instance; private LazySingleton(){} public static LazySingleton getInstance() { if (instance == null) { synchronized (LazySingleton.class) { if (instance == null) { instance = new LazySingleton(); } } } return instance; } } Instead of locking the entire getInstance() method, we can lock only the inner block with double-checked locking to improve performance and reduce thread contention. Java public class EagerAndLazySingleton { private EagerAndLazySingleton(){} private static class SingletonHelper { private static final EagerAndLazySingleton INSTANCE = new EagerAndLazySingleton(); } public static EagerAndLazySingleton getInstance() { return SingletonHelper.INSTANCE; } } The singleton object is created only when the getInstance() method is called. It is a Java memory-safe singleton class. It is a thread-safe singleton and is lazily loaded. It is the most widely used and recommended approach. Despite these performance and safety improvements, the core objective of creating just one object per class is still challenged by memory visibility, reflection, and serialization in Java. Memory visibility: In a multithreaded environment, reads and writes on the instance variable can be reordered across threads, and a stale or partially constructed object can be observed if the variable is not declared volatile. Reflection: With reflection, the private constructor can be made accessible and a new instance can be created. Serialization: A serialized instance object can be used to create another instance of the same class. All of these affect both static and dynamic singletons. To overcome such challenges, we need to declare the instance holder as volatile, guard the private constructor against reflective instantiation, and implement readResolve() so that deserialization returns the existing instance (equals() and hashCode(), inherited from Object.class, can also be overridden where comparisons matter); a sketch follows below.
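As referenced above, here is a minimal sketch of my own (the class name is illustrative and not from the original article) of a double-checked-locking singleton that declares its holder volatile and implements readResolve() so that deserialization does not create a second instance; reflection would still need a separate guard, or an enum-based singleton as shown in the next section:

Java
import java.io.Serializable;

public class SafeSingleton implements Serializable {

    private static final long serialVersionUID = 1L;

    // volatile prevents a thread from observing a partially constructed instance.
    private static volatile SafeSingleton instance;

    private SafeSingleton() {}

    public static SafeSingleton getInstance() {
        if (instance == null) {
            synchronized (SafeSingleton.class) {
                if (instance == null) {
                    instance = new SafeSingleton();
                }
            }
        }
        return instance;
    }

    // Called by the serialization machinery during deserialization;
    // returning the existing instance prevents a second object from being created.
    private Object readResolve() {
        return getInstance();
    }
}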
While the supplier function provides a way to instantiate object producers, in our case, the producer must produce only one object and should keep returning the same object repeatedly after a single instantiation. We can memoize/cache the created object. Functions defined with lambdas are usually lazily invoked to instantiate objects and the memoization technique helps in lazily invoked dynamic singleton object creation. Java import com.google.common.base.Supplier; import com.google.common.base.Suppliers; public class SupplierSingleton { private SupplierSingleton() {} private static final Supplier<SupplierSingleton> singletonSupplier = Suppliers.memoize(()-> new SupplierSingleton()); public static SupplierSingleton getInstance() { return singletonSupplier.get(); } public static void main(String[] args) { SupplierSingleton supplierSingleton = SupplierSingleton.getInstance(); } } Functional programming, supplier function, and memoization help in the preparation of singletons with a cache mechanism. This is most useful when we don't want heavy framework deployment. 6. Singleton With Framework: Spring, Guice Why worry about even preparing an object via supplier and maintaining cache? Frameworks like Spring and Guice work on POJO objects to provide and maintain singleton. This is heavily used in enterprise development where many modules each require their own context with many layers. Each context and each layer are good candidates for singleton patterns. Java import org.springframework.beans.factory.config.ConfigurableBeanFactory; import org.springframework.context.annotation.AnnotationConfigApplicationContext; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Configuration; import org.springframework.context.annotation.Scope; class SingletonBean { } @Configuration public class SingletonBeanConfig { @Bean @Scope(value = ConfigurableBeanFactory.SCOPE_SINGLETON) public SingletonBean singletonBean() { return new SingletonBean(); } public static void main(String[] args) { AnnotationConfigApplicationContext applicationContext = new AnnotationConfigApplicationContext(SingletonBean.class); SingletonBean singletonBean = applicationContext.getBean(SingletonBean.class); } } Spring is a very popular framework. Context and Dependency Injection are the core of Spring. import com.google.inject.AbstractModule; import com.google.inject.Guice; import com.google.inject.Injector; interface ISingletonBean {} class SingletonBean implements ISingletonBean { } public class SingletonBeanConfig extends AbstractModule { @Override protected void configure() { bind(ISingletonBean.class).to(SingletonBean.class); } public static void main(String[] args) { Injector injector = Guice.createInjector(new SingletonBeanConfig()); SingletonBean singletonBean = injector.getInstance(SingletonBean.class); } } Guice from Google is also a framework to prepare singleton objects and an alternative to Spring. Following are the ways singleton objects are leveraged with "Factory of Singletons." Factory Method, Abstract Factory, and Builders are associated with the creation and construction of specific objects in JVM. Wherever we envision the construction of an object with specific needs, we can discover the singleton's need. Further places where one can check out and discover singleton are as follows. 
Prototype or Flyweight Object pools Facades Layering Context and class loaders Cache Cross-cutting concerns and aspect-oriented programming Conclusion Patterns appear when we solve use cases for our business problems and for non-functional requirements like performance, security, and CPU and memory constraints. A singleton object for a given class is such a pattern, and the requirements for its use tend to reveal themselves as those use cases are solved. A class is by nature a blueprint for creating multiple objects, yet the need for dynamic heterogeneous containers to prepare "contexts," "layers," "object pools," and "strategic functional objects" pushes us toward declaring globally accessible or contextually accessible objects. Thanks for your valuable time, and I hope you found something useful to revisit and discover.
The AIDocumentLibraryChat project has been extended to support questions for searching relational databases. The user can input a question, and an embeddings search then finds the relevant database tables and columns to answer the question. The AI/LLM then gets the database schemas of the relevant tables and, based on the found tables and columns, generates a SQL query to answer the question with a result table. Dataset and Metadata The open-source dataset that is used has 6 tables with relations to each other. It contains data about museums and works of art. To get useful queries from the questions, the dataset has to be supplied with metadata, and that metadata has to be turned into embeddings. To enable the AI/LLM to find the needed tables and columns, it needs to know their names and descriptions. For all data tables, like the museum table, metadata is stored in the column_metadata and table_metadata tables. Their data can be found in the files column_metadata.csv and table_metadata.csv. They contain a unique ID, the name, the description, etc. of the table or column. That description is used to create the embeddings that the question embeddings are compared with. The quality of the description makes a big difference in the results because the embedding is more precise with a better description. Providing synonyms is one option to improve the quality. The table metadata contains the schema of the table in order to add only the relevant table schemas to the AI/LLM prompt. Embeddings To store the embeddings in PostgreSQL, the vector extension is used. The embeddings can be created with the OpenAI endpoint or with the ONNX library that is provided by Spring AI. Three types of embeddings are created: Tabledescription embeddings Columndescription embeddings Rowcolumn embeddings The Tabledescription embeddings have a vector based on the table description, and the embedding has the tablename, the datatype = table, and the metadata id in the metadata. The Columndescription embeddings have a vector based on the column description, and the embedding has the tablename, the dataname with the column name, the datatype = column, and the metadata id in the metadata. The Rowcolumn embeddings have a vector based on the content of a row's column value. That is used for the style or subject of an artwork, so these values can be used in the question. The metadata has the datatype = row, the column name as dataname, the tablename, and the metadata id. Implement the Search The search has 3 steps: Retrieve the embeddings Create the prompt Execute query and return result Retrieve the Embeddings To read the embeddings from the PostgreSQL database with the vector extension, Spring AI uses the VectorStore class in the DocumentVSRepositoryBean: Java @Override public List<Document> retrieve(String query, DataType dataType) { return this.vectorStore.similaritySearch( SearchRequest.query(query).withFilterExpression( new Filter.Expression(ExpressionType.EQ, new Key(MetaData.DATATYPE), new Value(dataType.toString())))); } The VectorStore provides a similarity search for the query of the user. The query is turned into an embedding, and with the FilterExpression for the datatype in the metadata values, the results are returned.
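Writing the embeddings uses the same VectorStore abstraction. As a rough, hypothetical sketch (not the project's actual code — the metadata keys and the helper class are illustrative), a table-description document could be stored like this:

Java
import java.util.List;
import java.util.Map;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;

public class EmbeddingWriter {

    private final VectorStore vectorStore;

    public EmbeddingWriter(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public void storeTableDescription(String tableName, String description, Long metadataId) {
        // The description becomes the embedded content; the table name, datatype,
        // and metadata id travel along as metadata so they can be filtered on later.
        Map<String, Object> metadata = Map.of(
                "tableName", tableName,
                "datatype", "table",
                "metadataId", metadataId);
        Document document = new Document(description, metadata);
        this.vectorStore.add(List.of(document));
    }
}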
The TableService class uses the repository in the retrieveEmbeddings method: Java private EmbeddingContainer retrieveEmbeddings(SearchDto searchDto) { var tableDocuments = this.documentVsRepository.retrieve( searchDto.getSearchString(), MetaData.DataType.TABLE, searchDto.getResultAmount()); var columnDocuments = this.documentVsRepository.retrieve( searchDto.getSearchString(), MetaData.DataType.COLUMN, searchDto.getResultAmount()); List<String> rowSearchStrs = new ArrayList<>(); if(searchDto.getSearchString().split("[ -.;,]").length > 5) { var tokens = List.of(searchDto.getSearchString() .split("[ -.;,]")); for(int i = 0;i<tokens.size();i = i+3) { rowSearchStrs.add(tokens.size() <= i + 3 ? "" : tokens.subList(i, tokens.size() >= i +6 ? i+6 : tokens.size()).stream().collect(Collectors.joining(" "))); } } var rowDocuments = rowSearchStrs.stream().filter(myStr -> !myStr.isBlank()) .flatMap(myStr -> this.documentVsRepository.retrieve(myStr, MetaData.DataType.ROW, searchDto.getResultAmount()).stream()) .toList(); return new EmbeddingContainer(tableDocuments, columnDocuments, rowDocuments); } First, documentVsRepository is used to retrieve the document with the embeddings for the tables/columns based on the search string of the user. Then, the search string is split into chunks of 6 words to search for the documents with the row embeddings. The row embeddings are just one word, and to get a low distance, the query string has to be short; otherwise, the distance grows due to all the other words in the query. Then the chunks are used to retrieve the row documents with the embeddings. Create the Prompt The prompt is created in the TableService class with the createPrompt method: Java private Prompt createPrompt(SearchDto searchDto, EmbeddingContainer documentContainer) { final Float minRowDistance = documentContainer.rowDocuments().stream() .map(myDoc -> (Float) myDoc.getMetadata().getOrDefault(MetaData.DISTANCE, 1.0f)).sorted().findFirst().orElse(1.0f); LOGGER.info("MinRowDistance: {}", minRowDistance); var sortedRowDocs = documentContainer.rowDocuments().stream() .sorted(this.compareDistance()).toList(); var tableColumnNames = this.createTableColumnNames(documentContainer); List<TableNameSchema> tableRecords = this.tableMetadataRepository .findByTableNameIn(tableColumnNames.tableNames()).stream() .map(tableMetaData -> new TableNameSchema(tableMetaData.getTableName(), tableMetaData.getTableDdl())).collect(Collectors.toList()); final AtomicReference<String> joinColumn = new AtomicReference<String>(""); final AtomicReference<String> joinTable = new AtomicReference<String>(""); final AtomicReference<String> columnValue = new AtomicReference<String>(""); sortedRowDocs.stream().filter(myDoc -> minRowDistance <= MAX_ROW_DISTANCE) .filter(myRowDoc -> tableRecords.stream().filter(myRecord -> myRecord.name().equals(myRowDoc.getMetadata() .get(MetaData.TABLE_NAME))).findFirst().isEmpty()) .findFirst().ifPresent(myRowDoc -> { joinTable.set(((String) myRowDoc.getMetadata() .get(MetaData.TABLE_NAME))); joinColumn.set(((String) myRowDoc.getMetadata() .get(MetaData.DATANAME))); tableColumnNames.columnNames().add(((String) myRowDoc.getMetadata() .get(MetaData.DATANAME))); columnValue.set(myRowDoc.getContent()); this.tableMetadataRepository.findByTableNameIn( List.of(((String) myRowDoc.getMetadata().get(MetaData.TABLE_NAME)))) .stream().map(myTableMetadata -> new TableNameSchema( myTableMetadata.getTableName(), myTableMetadata.getTableDdl())).findFirst() .ifPresent(myRecord -> tableRecords.add(myRecord)); }); var 
messages = createMessages(searchDto, minRowDistance, tableColumnNames, tableRecords, joinColumn, joinTable, columnValue); Prompt prompt = new Prompt(messages); return prompt; } First, the min distance of the rowDocuments is filtered out. Then a list row of documents sorted by distance is created. The method createTableColumnNames(...) creates the tableColumnNames record that contains a set of column names and a list of table names. The tableColumnNames record is created by first filtering for the 3 tables with the lowest distances. Then the columns of these tables with the lowest distances are filtered out. Then the tableRecords are created by mapping the table names to the schema DDL strings with the TableMetadataRepository. Then the sorted row documents are filtered for MAX_ROW_DISTANCE and the values joinColumn, joinTable, and columnValue are set. Then the TableMetadataRepository is used to create a TableNameSchema and add it to the tableRecords. Now the placeholders in systemPrompt and the optional columnMatch can be set: Java private final String systemPrompt = """ ... Include these columns in the query: {columns} \n Only use the following tables: {schemas};\n %s \n """; private final String columnMatch = """ Join this column: {joinColumn} of this table: {joinTable} where the column has this value: {columnValue}\n """; The method createMessages(...) gets the set of columns to replace the {columns} placeholder. It gets tableRecords to replace the {schemas} placeholder with the DDLs of the tables. If the row distance was beneath the threshold, the property columnMatch is added at the string placeholder %s. Then the placeholders {joinColumn}, {joinTable}, and {columnValue} are replaced. With the information about the required columns the schemas of the tables with the columns and the information of the optional join for row matches, the AI/LLM is able to create a sensible SQL query. Execute Query and Return Result The query is executed in the createQuery(...) method: Java public SqlRowSet searchTables(SearchDto searchDto) { EmbeddingContainer documentContainer = this.retrieveEmbeddings(searchDto); Prompt prompt = createPrompt(searchDto, documentContainer); String sqlQuery = createQuery(prompt); LOGGER.info("Sql query: {}", sqlQuery); SqlRowSet rowSet = this.jdbcTemplate.queryForRowSet(sqlQuery); return rowSet; } First, the methods to prepare the data and create the SQL query are called and then queryForRowSet(...) is used to execute the query on the database. The SqlRowSet is returned. The TableMapper class uses the map(...) method to turn the result into the TableSearchDto class: Java public TableSearchDto map(SqlRowSet rowSet, String question) { List<Map<String, String>> result = new ArrayList<>(); while (rowSet.next()) { final AtomicInteger atomicIndex = new AtomicInteger(1); Map<String, String> myRow = List.of(rowSet .getMetaData().getColumnNames()).stream() .map(myCol -> Map.entry( this.createPropertyName(myCol, rowSet, atomicIndex), Optional.ofNullable(rowSet.getObject( atomicIndex.get())) .map(myOb -> myOb.toString()).orElse(""))) .peek(x -> atomicIndex.set(atomicIndex.get() + 1)) .collect(Collectors.toMap(myEntry -> myEntry.getKey(), myEntry -> myEntry.getValue())); result.add(myRow); } return new TableSearchDto(question, result, 100); } First, the result list for the result maps is created. Then, rowSet is iterated for each row to create a map of the column names as keys and the column values as values. This enables returning a flexible amount of columns with their results. 
createPropertyName(...) adds the index integer to the map key to support duplicate key names. Summary Backend Spring AI supports creating prompts with a flexible amount of placeholders very well. Creating the embeddings and querying the vector table is also very well supported. Getting reasonable query results needs the metadata that has to be provided for the columns and tables. Creating good metadata is an effort that scales linearly with the amount of columns and tables. Implementing the embeddings for columns that need them is an additional effort. The result is that an AI/LLM like OpenAI or Ollama with the "sqlcoder:70b-alpha-q6_K" model can answer questions like: "Show the artwork name and the name of the museum that has the style Realism and the subject of Portraits." The AI/LLM can within boundaries answer natural language questions that have some fit with the metadata. The amount of embeddings needed is too big for a free OpenAI account and the "sqlcoder:70b-alpha-q6_K" is the smallest model with reasonable results. AI/LLM offers a new way to interact with relational databases. Before starting a project to provide a natural language interface for a database, the effort and the expected results have to be considered. The AI/LLM can help with questions of small to middle complexity and the user should have some knowledge about the database. Frontend The returned result of the backend is a list of maps with keys as column names and values column values. The amount of returned map entries is unknown, because of that the table to display the result has to support a flexible amount of columns. An example JSON result looks like this: JSON {"question":"...","resultList":[{"1_name":"Portrait of Margaret in Skating Costume","2_name":"Philadelphia Museum of Art"},{"1_name":"Portrait of Mary Adeline Williams","2_name":"Philadelphia Museum of Art"},{"1_name":"Portrait of a Little Girl","2_name":"Philadelphia Museum of Art"}],"resultAmount":100} The resultList property contains a JavaScript array of objects with property keys and values. To be able to display the column names and values in an Angular Material Table component, these properties are used: TypeScript protected columnData: Map<string, string>[] = []; protected columnNames = new Set<string>(); The method getColumnNames(...) of the table-search.component.ts is used to turn the JSON result in the properties: TypeScript private getColumnNames(tableSearch: TableSearch): Set<string> { const result = new Set<string>(); this.columnData = []; const myList = !tableSearch?.resultList ? [] : tableSearch.resultList; myList.forEach((value) => { const myMap = new Map<string, string>(); Object.entries(value).forEach((entry) => { result.add(entry[0]); myMap.set(entry[0], entry[1]); }); this.columnData.push(myMap); }); return result; } First, the result set is created and the columnData property is set to an empty array. Then, myList is created and iterated with forEach(...). For each of the objects in the resultList, a new Map is created. For each property of the object, a new entry is created with the property name as the key and the property value as the value. The entry is set on the columnData map and the property name is added to the result set. The completed map is pushed on the columnData array and the result is returned and set to the columnNames property. Then a set of column names is available in the columnNames set and a map with column name to column value is available in the columnData. 
The template table-search.component.html contains the material table: HTML @if(searchResult && searchResult.resultList?.length) { <table mat-table [dataSource]="columnData"> <ng-container *ngFor="let disCol of columnNames" matColumnDef="{{ disCol }}"> <th mat-header-cell *matHeaderCellDef>{{ disCol }}</th> <td mat-cell *matCellDef="let element">{{ element.get(disCol) }}</td> </ng-container> <tr mat-header-row *matHeaderRowDef="columnNames"></tr> <tr mat-row *matRowDef="let row; columns: columnNames"></tr> </table> } First, the searchResult is checked for existence and for objects in the resultList. Then, the table is created with the columnData array as its dataSource. The table header row is set with <tr mat-header-row *matHeaderRowDef="columnNames"></tr> to contain the columnNames. The table rows and columns are defined with <tr mat-row *matRowDef="let row; columns: columnNames"></tr>. The cells are created by iterating the columnNames like this: <ng-container *ngFor="let disCol of columnNames" matColumnDef="{{ disCol }}">. The header cells are created like this: <th mat-header-cell *matHeaderCellDef>{{ disCol }}</th>. The table cells are created like this: <td mat-cell *matCellDef="let element">{{ element.get(disCol) }}</td>. element is the map of the columnData array element, and the map value is retrieved with element.get(disCol). Summary Frontend The new Angular syntax makes the templates more readable. The Angular Material table component is more flexible than expected and supports unknown numbers of columns very well. Conclusion Questioning a database with the help of an AI/LLM takes some effort for the metadata, and the users need a rough idea of what the database contains. AI/LLMs are not a natural fit for query creation because SQL queries require correctness. A pretty large model was needed to get the required query correctness, and GPU acceleration is required for productive use. A well-designed UI where the user can drag and drop the columns of the tables into the result table might be a good alternative for the requirements. Angular Material Components support drag and drop very well. Before starting such a project, the customer should make an informed decision on what alternative fits the requirements best.
In today’s text, I want to take a closer look at Server-Sent Events (or SSE for short) and WebSockets. Both are good and battle-tested approaches to data exchange. I will start with a short characterization of both tools — what they are and what they offer. Then, I will compare them according to eight categories, which, in my opinion, are the most crucial for modern-day systems. The categories are as follows: Communication Direction Underlying Protocol Security Simplicity Performance Message Structure Ease of Adoption Tooling In contrast to my previous comparison, which compared REST and gRPC, I will not proclaim any winner or grant points per category. Instead, in the Summary paragraph, you will find a kind of TL;DR table. The table contains the key differences between both technologies in the above-mentioned areas. The Why Unlike REST, both SSE and WebSockets are more use-case-focused. In this particular case (or cases), the main focus point of both concepts is providing a “real-time” communication medium. Because of their specific focus, they are less popular than REST, which is a more generic and one-size-fits-all type of tool. Nevertheless, both SSE and WebSockets offer an interesting set of possibilities and a slight refreshment from the classic REST approach to solving problems. In my opinion, it is good to be aware of them and find some space for them in our toolbox, as they may come in handy one day, providing you with a simpler solution to quite a complex problem - especially when you need “real-time” updates or when your app requires a more push-oriented approach. Besides comparing and describing them here, I also want to make them more popular. What Is WebSockets? In short, WebSockets is a communication protocol that provides bi-directional communication between server and client with the usage of a single long-lasting TCP connection. Thanks to this feature, we do not have to constantly poll the server for new data. Instead, the data is exchanged between interested parties in “real time”. Each message is either binary data or Unicode text. The protocol was standardized in 2011 by the IETF in the form of RFC 6455. The WebSocket protocol is distinct from HTTP, but both are located at layer 7 of the OSI model and depend on TCP at the 4th layer. The protocol has its own URI schemes, which work in a similar manner to the HTTP schemes "http" and "https": ws - Indicates that the connection is not secured with TLS wss - Indicates that the connection is secured with TLS What is more, non-secure WebSockets connections (ws) should not be opened from secure sites (https). Similarly, secure WebSockets connections (wss) should not be opened from non-secure sites (http). On the other hand, WebSocket, by design, works on HTTP ports 443 and 80 and supports HTTP concepts like proxies and intermediaries. Additionally, the WebSocket handshake uses an HTTP Upgrade header to upgrade the protocol from HTTP to WebSocket. The biggest disadvantage of WebSocket as a protocol is security. WebSocket is not restricted by the same-origin policy, which may make CSRF-like attacks a lot easier. What Is Server-Sent Events? SSE is a technology that allows a web server to send updates to a web page. It is part of the HTML5 specification and, similarly to WebSockets, utilizes a single long-lived HTTP connection to send data in “real-time." On a conceptual level, it is quite an old technology, with its theoretical background dating back to 2004. The first approach to implementing SSE was conducted in 2006 by the Opera team.
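To make the SSE side concrete before diving into the comparison, here is a minimal sketch of a server-sent-events endpoint using Spring MVC's SseEmitter (my own illustration, not part of the original comparison; the path and event names are arbitrary):

Java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.mvc.method.annotation.SseEmitter;

@RestController
public class TickController {

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    // A browser subscribes with: new EventSource("/ticks")
    @GetMapping(path = "/ticks", produces = "text/event-stream")
    public SseEmitter ticks() {
        SseEmitter emitter = new SseEmitter(0L); // 0 = no server-side timeout
        scheduler.scheduleAtFixedRate(() -> {
            try {
                emitter.send(SseEmitter.event()
                        .id(String.valueOf(System.currentTimeMillis()))
                        .name("tick")
                        .data("server time: " + System.currentTimeMillis()));
            } catch (Exception e) {
                // In a real application the scheduled task should also be cancelled here.
                emitter.completeWithError(e);
            }
        }, 0, 1, TimeUnit.SECONDS);
        return emitter;
    }
}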
SSE is supported by most modern-day browsers — Microsoft Edge added SSE support in January 2020. It can also take full advantage of HTTP/2, which addresses one of the biggest issues of SSE by practically removing the connection limit imposed by HTTP/1.1. By definition, Server-Sent Events has two basic building blocks: EventSource - An interface based on the WHATWG specification and implemented by the browser; it allows the client (a browser in this case) to subscribe to events. Event stream - A protocol that describes the standard plain-text format of events sent by the server that must be followed for the EventSource client to understand and propagate them. According to the specification, events can carry arbitrary text data, an optional ID, and are delimited by newlines. They even have their own MIME type: text/event-stream. Unfortunately, Server-Sent Events as a technology is designed to support only text-based messages, and although we can send events with a custom format, in the end, the message must be a UTF-8 encoded string. What is more, SSE provides two very interesting features: Automatic reconnection - If the client disconnects unexpectedly, EventSource periodically tries to reconnect. Automatic stream resume - EventSource remembers the last received message ID and automatically sends a Last-Event-ID header when trying to reconnect. The Comparison Communication Direction Probably the biggest difference between the two is their way of communication. SSE provides only one-way communication — events can only be sent from the server to the client. WebSockets provides full two-way communication, enabling interested parties to exchange information and react to any events from both sides. I would say that both of the approaches have their pros and cons, with a set of dedicated use cases for each. On one hand, if you just need to push a stream of constant updates to the client, then SSE would be a more suitable choice. On the other hand, if you need to react in any way to one of those events, then WebSocket may be more beneficial. In theory (and practice), all the things that can be done with SSE can also be done with WebSockets, but then we are entering areas like support, simplicity of the solution, or security. I will describe all of these areas and more in the following paragraphs. Additionally, using WebSocket in all cases can be significant overkill, and an SSE-based solution may just be simpler to implement. Underlying Protocol Here comes another big difference between both technologies. SSE fully relies on HTTP and has support for both HTTP/1.1 and HTTP/2. In contrast, WebSocket uses its own custom protocol — surprise, surprise — the WebSocket protocol. In the case of SSE, utilizing HTTP/2 solves one of the major issues of SSE — the maximum parallel connection limit. HTTP/1.1, by its specification, limits the number of parallel connections. This behavior may lead to a problem called head-of-line blocking. HTTP/2 addresses this issue via the introduction of multiplexing, which solves HOL blocking at the application layer. However, head-of-line blocking may still occur on the TCP level. As for the WebSocket protocol, I described it in some detail just a few lines above. Here, I would just reiterate the most important points. The protocol is somewhat different from classic HTTP despite using an HTTP Upgrade header to initialize the WebSocket connection and effectively change communication protocols.
Nevertheless, it also uses the TCP protocol as a base and is fully compatible with HTTP. The biggest drawback of the WebSocket protocol is its security. Simplicity In general, setting up an SSE-based integration is simpler than its WebSocket counterpart. The most important reason behind this is the nature of communication used by each technology. The one-directional nature of SSE and its push model make it easier on a conceptual level. Combined with the automatic reconnection and stream continuity support provided out of the box, the number of things that we have to take care of is significantly reduced. With all of these features, SSE may also be viewed as a way to reduce the coupling between client and server. Clients just need to know an endpoint that produces the events. Nevertheless, in such a case, the client can only receive messages, so if we want to send any type of information back to the server, we need to have another communication medium, which may greatly complicate things. In the case of WebSockets, things are somewhat more complicated. First of all, we need to handle the connection upgrade from HTTP to the WebSocket protocol. Although this is the simplest thing here, it is another thing we need to remember. The second problem comes from the bi-directional nature of WebSockets. We have to manage the state of a particular connection and handle all possible exceptions occurring while processing the message. For example, what if the processing of one of the messages throws an exception on the server side? Next comes the problem of handling reconnections, which, in the case of WebSockets, we have to handle ourselves. There is also a problem that impacts both technologies — long-running connections. Both technologies need to maintain long-lived open connections to send a continuous stream of events. Managing such connections, especially on a large scale, can be a challenge, as we can quite quickly run out of resources. Additionally, they may require special configurations like extended timeouts and are more exposed to any network connection problems. Security In the case of SSE, there is nothing special about security, as it utilizes the plain old HTTP protocol as the transport medium. All standard HTTP advantages and disadvantages apply to SSE, as simple as that. On the other hand, security is one of the biggest drawbacks of the WebSocket protocol as a whole. For a start, there is no such thing as a Same Origin Policy, so there are no restrictions as to where we can connect via WebSockets. There is even a specific type of attack aimed at exploiting this vulnerability, Cross-Origin WebSocket Hijacking. If you want to dive more into the topic of Same Origin Policy and WebSocket, here is an article that may be interesting for you. Besides that, there are no protocol-specific security loopholes in WebSockets. I would say that in both cases, all the standards and best security practices apply, so just be careful while you are implementing your solution. Performance I would say that both of the technologies are on equal footing as to performance. No theoretical performance limitations come from either of the technologies themselves. However, I would say that SSE can be faster in terms of the pure number of messages sent per second, as it essentially works on a fire-and-forget principle. WebSocket also needs to handle responses coming from the client.
The only thing that can impact the performance of both of them is the underlying client we are using in our application and its implementation. Check it, read its documentation, run custom stress tests, and you may end up with very interesting insights about the tool you are using or your entire system. Message Structure The message structure is probably another one of the most important differences between the protocols. As I mentioned above, SSE is a pure text protocol. We can send messages with different formats and contents, but in the end, everything ends up as UTF-8 encoded text. No complex format or binary data is possible. WebSocket, on the other hand, can handle both text and binary messages, giving us the possibility to send images, audio, or just regular files. Just remember that processing files can have a significant overhead. Ease of Adoption Here, both of the technologies are at a very similar stage. There are plenty of tools for adding WebSockets and Server-Sent Events support, on both the client and the server side. Most of the established programming languages have more than one such library. Without going into too much detail, I have prepared a table summarizing the basic libraries adding WebSockets and SSE support. Java: Spring SSE/WebSockets Quarkus SSE/WebSockets Scala: Akka SSE/WebSockets Play /WebSockets JavaScript EventSource total.js SSE/WebSockets Socket.io Python Starlette FastAPI As you can see, we have plenty of well-established choices if we want to add SSE or WebSockets integration to our application. Of course, this is only a minuscule example picked from all the libraries; many more are out there. The real problem may be finding the one most suitable for your particular use case. Tooling Automated Tests As far as I know, there are no automated testing tools for either SSE or WebSockets. However, you can relatively easily achieve similar functions with the use of Postman and collections. Postman supports both Server-Sent Events and WebSockets. With the use of some magic originating from Postman collections, you can prepare a set of tests verifying the correctness of your endpoints. Performance Tests In the case of performance tests, you can go with either JMeter or Gatling. As far as I am aware, these are the two most mature tools for overall performance testing. Of course, both of them also support SSE (JMeter, Gatling) and WebSockets (JMeter, Gatling). There are also other tools like sse-perf (SSE only), Testable, or k6 (WebSockets only). Out of all of these tools, I would personally recommend Gatling or k6. Both seem to have the best user experience and be the most production-ready. Documentation To a degree, there are no tools dedicated solely to documenting either SSE or WebSockets. On the other hand, there is a tool called AsyncAPI, which can be used in just this way for both concepts. Unfortunately, OpenAPI seems not to support either SSE or WebSockets. Summary As I promised, the summary will be quick and simple — look below. I think that the table above is quite a nice and compact summary of the topic and the article as a whole. The most important difference is the Communication Direction, as it determines the possible use cases of a particular technology. It will probably have the biggest impact on choosing one over the other. Message structure can also be a very important category when it comes to choosing a particular way of communication. Allowing only plain text messages is a very significant drawback for Server-Sent Events.
If you ever need help choosing SSE over WebSockets or dealing with any other kind of technical problem, just let me know. Thank you for your time. Reviewed by: Michał Matłoka, Michał Grabowski
We have a somewhat bare-bones chat service in our series so far. Our service exposes endpoints for managing topics and letting users post messages in topics. For a demo, we have been using a makeshift in-memory store that shamelessly provides no durability guarantees. A basic and essential building block in any (web) service is a data store (for storing, organizing, and retrieving data securely and efficiently). In this tutorial, we will improve the durability, organization, and persistence of data by introducing a database. There are several choices of databases: in-memory (a very basic form of which we have used earlier), object-oriented databases, key-value stores, relational databases, and more. We will not repeat an in-depth comparison of these here and instead defer to others. Furthermore, in this article, we will use a relational (SQL) database as our underlying data store. We will use the popular GORM library (an ORM framework) to simplify access to our database. There are several relational databases available, both free as well as commercial. We will use Postgres (a very popular, free, lightweight, and easy-to-manage database) for our service. Postgres is also an ideal choice for a primary source-of-truth data store because of the strong durability and consistency guarantees it provides. Setting Up the Database A typical pattern when using a database in a service is: |---------------| |-----------| |------------| |------| | Request Proto | <-> | Service | <-> | ORM/SQL | <-> | DB | |---------------| |-----------| |------------| |------| A gRPC request is received by the service (we have not shown the REST Gateway here). The service converts the model proto (e.g., Topic) contained in the request (e.g., CreateTopicRequest) into the ORM library. The ORM library generates the necessary SQL and executes it on the DB (and returns any results). Setting Up Postgres We could go the traditional way of installing Postgres (by downloading and installing its binaries for the specific platforms). However, this is complicated and brittle. Instead, we will start using Docker (and Docker Compose) going forward for a compact developer-friendly setup. Set Up Docker Set up Docker Desktop for your platform following the instructions. Add Postgres to Docker Compose Now that Docker is set up, we can add different containers to this so we can build out the various components and services OneHub requires. docker-compose.yml: version: '3.9' services: pgadmin: image: dpage/pgadmin4 ports: - ${PGADMIN_LISTEN_PORT}:${PGADMIN_LISTEN_PORT} environment: PGADMIN_LISTEN_PORT: ${PGADMIN_LISTEN_PORT} PGADMIN_DEFAULT_EMAIL: ${PGADMIN_DEFAULT_EMAIL} PGADMIN_DEFAULT_PASSWORD: ${PGADMIN_DEFAULT_PASSWORD} volumes: - ./.pgadmin:/var/lib/pgadmin postgres: image: postgres:15.3 environment: POSTGRES_DB: ${POSTGRES_DB} POSTGRES_USER: ${POSTGRES_USER} POSTGRES_PASSWORD: ${POSTGRES_PASSWORD} volumes: - ./.pgdata:/var/lib/postgresql/data ports: - 5432:5432 healthcheck: test: ["CMD-SHELL", "pg_isready -U postgres"] interval: 5s timeout: 5s retries: 5 That's it. A few key things to note are: The Docker Compose file is an easy way to get started with containers - especially on a single host without needing complicated orchestration engines (hint: Kubernetes). The main part of Docker Compose files are the service sections that describe the containers for each of the services that Docker Compose will be executing as a "single unit in a private network." 
This is a great way to package multiple related services needed for an application and bring them all up and down in one step instead of having to manage them one by one individually. The latter is not just cumbersome, but also error-prone (manual dependency management, logging, port checking, etc). For now, we have added one service - postgres - running on port 5432. Since the services are running in an isolated context, environment variables can be set to initialize/control the behavior of the services. These environment variables are read from a specific .env file (below). This file can also be passed as a CLI flag or as a parameter, but for now, we are using the default .env file. Some configuration parameters here are the Postgres username, password, and database name. .env: POSTGRES_DB=onehubdb POSTGRES_USER=postgres POSTGRES_PASSWORD=docker ONEHUB_DB_ENDOINT=postgres://postgres:docker@postgres:5432/onehubdb PGADMIN_LISTEN_PORT=5480 PGADMIN_DEFAULT_EMAIL=admin@onehub.com PGADMIN_DEFAULT_PASSWORD=password All data in a container is transient and is lost when the container is shut down. In order to make our database durable, we will store the data outside the container and map it as a volume. This way from within the container, Postgres will read/write to its local directory (/var/lib/postgresql/data) even though all reads/writes are sent to the host's file system (./.pgdata) Another great benefit of using Docker is that all the ports used by the different services are "internal" to the network that Docker creates. This means the same postgres service (which runs on port 5432) can be run on multiple Docker environments without having their ports changed or checked for conflicts. This works because, by default, ports used inside a Docker environment are not exposed outside the Docker environment. Here we have chosen to expose port 5432 explicitly in the ports section of docker-compose.yml. That's it. Go ahead and bring it up: docker compose up If all goes well, you should see a new Postgres database created and initialized with our username, password, and DB parameters from the .env file. The database is now ready: onehub-postgres-1 | 2023-07-28 22:52:32.199 UTC [1] LOG: starting PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit onehub-postgres-1 | 2023-07-28 22:52:32.204 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432 onehub-postgres-1 | 2023-07-28 22:52:32.204 UTC [1] LOG: listening on IPv6 address "::", port 5432 onehub-postgres-1 | 2023-07-28 22:52:32.209 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" onehub-postgres-1 | 2023-07-28 22:52:32.235 UTC [78] LOG: database system was shut down at 2023-07-28 22:52:32 UTC onehub-postgres-1 | 2023-07-28 22:52:32.253 UTC [1] LOG: database system is ready to accept connections The OneHub Docker application should now show up on the Docker desktop and should look something like this: (Optional) Setup a DB Admin Interface If you would like to query or interact with the database (outside code), pgAdmin and adminer are great tools. They can be downloaded as native application binaries, installed locally, and played. This is a great option if you would like to manage multiple databases (e.g., across multiple Docker environments). ... Alternatively ... If it is for this single project and downloading yet another (native app) binary is undesirable, why not just include it as a service within Docker itself!? 
With pgAdmin added as a service of its own, our docker-compose.yml now looks like this: docker-compose.yml: version: '3.9' services: pgadmin: image: dpage/pgadmin4 ports: - ${PGADMIN_LISTEN_PORT}:${PGADMIN_LISTEN_PORT} environment: PGADMIN_LISTEN_PORT: ${PGADMIN_LISTEN_PORT} PGADMIN_DEFAULT_EMAIL: ${PGADMIN_DEFAULT_EMAIL} PGADMIN_DEFAULT_PASSWORD: ${PGADMIN_DEFAULT_PASSWORD} volumes: - ./.pgadmin:/var/lib/pgadmin postgres: image: postgres:15.3 environment: POSTGRES_DB: ${POSTGRES_DB} POSTGRES_USER: ${POSTGRES_USER} POSTGRES_PASSWORD: ${POSTGRES_PASSWORD} volumes: - ./.pgdata:/var/lib/postgresql/data ports: - 5432:5432 healthcheck: test: ["CMD-SHELL", "pg_isready -U postgres"] interval: 5s timeout: 5s retries: 5 The accompanying environment variables are in our .env file: .env: POSTGRES_DB=onehubdb POSTGRES_USER=postgres POSTGRES_PASSWORD=docker ONEHUB_DB_ENDPOINT=postgres://postgres:docker@postgres:5432/onehubdb PGADMIN_LISTEN_PORT=5480 PGADMIN_DEFAULT_EMAIL=admin@onehub.com PGADMIN_DEFAULT_PASSWORD=password Now you can simply visit the pgAdmin web console in your browser (on localhost:5480, the PGADMIN_LISTEN_PORT from the .env file). Use the email and password specified in the .env file and off you go! To connect to the Postgres instance running in the Docker environment, simply create a connection to postgres (NOTE: container-local DNS names within the Docker environment are the service names themselves). On the left-side Object Explorer panel, (right) click on Servers >> Register >> Server... and give a name to your server ("postgres"). In the Connection tab, use the hostname "postgres" and set the database, username, and password as set in the .env file for the POSTGRES_DB, POSTGRES_USER, and POSTGRES_PASSWORD variables respectively. Click Save, and off you go! Introducing Object Relational Mappers (ORMs) Before we start updating our service code to access the database, you may be wondering why the gRPC service itself is not packaged in our docker-compose.yml file. Without this, we would still have to start our service from the command line (or a debugger). This will be detailed in a future post. In a typical database, initialization (after the user and DB setup) would entail creating and running SQL scripts to create tables, check for new versions, and so on. One example of a table creation statement (that can be executed via psql or pgAdmin) is: CREATE TABLE topics ( id TEXT NOT NULL PRIMARY KEY, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, name TEXT NOT NULL, users TEXT[] ); Similarly, an insertion would also require manually constructing SQL statements, e.g.: INSERT INTO topics ( id, name ) VALUES ( '1', 'Taylor Swift' ); ... followed by a verification of the saved results: select * from topics; This can get pretty tedious (and error-prone, with vulnerability to SQL injection attacks). SQL expertise is highly valuable but hard to come by - especially fluency with the different standards, different vendors, etc. Even though Postgres does a great job of being as standards-compliant as possible, developers still benefit from some ease of use when working with databases. Here ORM libraries are indispensable, especially for developers not dealing with SQL on a regular basis (e.g., yours truly). ORMs (Object Relational Mappers) provide an object-like interface to a relational database. This simplifies access to data in our tables (i.e., rows) as application-level classes (Data Access Objects). Table creations and migrations can also be managed by ORM libraries.
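To make the contrast concrete, here is a minimal sketch (not the code we will write for OneHub) of the same insert and lookup done through GORM instead of hand-written SQL. It assumes a trimmed-down Topic model - the real models come in the next section - and connects through the exposed port on localhost using the credentials from our .env file:

package main

import (
	"log"

	"gorm.io/driver/postgres"
	"gorm.io/gorm"
)

// Topic is a trimmed-down stand-in for the datastore model defined in the next section.
type Topic struct {
	Id   string `gorm:"primaryKey"`
	Name string
}

func main() {
	// Same credentials/DB as in .env; "localhost" works because port 5432 is exposed to the host.
	dsn := "postgres://postgres:docker@localhost:5432/onehubdb"
	db, err := gorm.Open(postgres.Open(dsn), &gorm.Config{})
	if err != nil {
		log.Fatal(err)
	}

	// Create/update the topics table from the struct definition (instead of hand-written DDL).
	if err := db.AutoMigrate(&Topic{}); err != nil {
		log.Fatal(err)
	}

	// Equivalent of: INSERT INTO topics (id, name) VALUES ('1', 'Taylor Swift');
	db.Create(&Topic{Id: "1", Name: "Taylor Swift"})

	// Equivalent of: SELECT * FROM topics WHERE id = '1' LIMIT 1;
	var out Topic
	db.First(&out, "id = ?", "1")
	log.Println("Fetched topic:", out.Name)
}

No SQL strings and no manual quoting of values - which is exactly the convenience (and the layer of indirection) we discuss next.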
Behind the scenes, ORM libraries are generating and executing SQL queries on the underlying databases they are accessing. There are downsides to using an ORM: ORMs still incur a learning cost for developers during adoption, and their interface design choices can impact developer productivity. ORMs can be thought of as a schema compiler. The underlying SQL generated by them may not be straightforward or efficient, so ORM access to a database can be slower than raw SQL, especially for complex queries. However, for complex queries or complex data access patterns, other scalability techniques may need to be applied anyway (e.g., sharding, denormalization, etc.). The queries generated by ORMs may not be clear or straightforward, resulting in increased debugging times on slow or complex queries. Despite these downsides, ORMs can be put to good use when not overly relied upon. We shall use a popular ORM library, GORM. GORM comes with a great set of examples and documentation, and the quick start is a great starting point. Create DB Models GORM models are our DB models. GORM models are simple Golang structs with struct tags on each member to identify the member's database type. Our User, Topic, and Message models are simply this: Topic, Message, User Models package datastore import ( "time" "github.com/lib/pq" ) type BaseModel struct { CreatedAt time.Time UpdatedAt time.Time Id string `gorm:"primaryKey"` Version int // used for optimistic locking } type User struct { BaseModel Name string Avatar string ProfileData map[string]interface{} `gorm:"type:json"` } type Topic struct { BaseModel CreatorId string Name string `gorm:"index:SortedByName"` Users pq.StringArray `gorm:"type:text[]"` } type Message struct { BaseModel ParentId string TopicId string `gorm:"index:SortedByTopicAndCreation,priority:1"` CreatedAt time.Time `gorm:"index:SortedByTopicAndCreation,priority:2"` SourceId string UserId string ContentType string ContentText string ContentData map[string]interface{} `gorm:"type:json"` } Why are these models needed when we have already defined models in our .proto files? Recall that the models we use need to reflect the domain they are operating in. For example, our gRPC structs (in .proto files) reflect the application's perspective. If/when we build a UI, view-models would reflect the UI/view perspectives (e.g., a FrontPage view model could be a merge of multiple data models). Similarly, when storing data in a database, the models need to convey intent and type information that can be understood and processed by the database. This is why GORM expects data models to have annotations on their (struct) member variables to convey database-specific information like column types, index definitions, index column orderings, etc. A good example of this in our data model is the SortedByTopicAndCreation index (which, as the name suggests, helps us list messages within a topic sorted by their creation timestamp). Database indexes are one or more (re)organizations of data in a database that speed up retrievals of certain queries (at the cost of increased write times and storage space). We won't go into indexes deeply. There are fantastic resources that offer a deep dive into the various internals of database systems in great detail (and are highly recommended). The increased writes and storage space must be considered when creating more indexes in a database.
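As a rough illustration (the exact DDL is ultimately up to GORM and the Postgres driver), the SortedByTopicAndCreation tags on Message boil down to a composite index along the lines of CREATE INDEX "SortedByTopicAndCreation" ON messages (topic_id, created_at). A small helper - hypothetical, not part of the OneHub code - can ask GORM's migrator to verify or create it from the struct tags:

package dbtools

import (
	"gorm.io/gorm"

	ds "github.com/panyam/onehub/datastore"
)

// EnsureMessageIndex checks whether the composite index declared via the struct
// tags on Message exists, and creates it if migrations have not been run yet.
func EnsureMessageIndex(db *gorm.DB) error {
	if db.Migrator().HasIndex(&ds.Message{}, "SortedByTopicAndCreation") {
		return nil
	}
	// CreateIndex reads the index columns and their priorities from the gorm tags.
	return db.Migrator().CreateIndex(&ds.Message{}, "SortedByTopicAndCreation")
}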
We have (in our service) been mindful about creating indexes and have kept them to the bare minimum (to suit certain types of queries). As we scale our services (in future posts) we will revisit how to address these costs by exploring asynchronous and distributed index-building techniques. Data Access Layer Conventions We now have DB models. We could at this point directly call the GORM APIs from our service implementation to read and write data from our (Postgres) database; but first, a brief note on the conventions we have chosen. Motivations Database use can be thought of as spanning a spectrum between two extremes: At one extreme, a "database" can be treated as a better filesystem with objects written by some key to prevent data loss. Any structure, consistency guarantees, optimization, or indexes are fully the responsibility of the application layer. This gets very complicated, error-prone, and hard to maintain very fast. At the other extreme, the database engine is treated as the undisputed brain (the kitchen sink) of your application. Every data access for every view in your application is offered (only) by one or very few (possibly complex) queries. This view, while localizing data access in a single place, also makes the database a bottleneck and its scalability daunting. In reality, vertical scaling (provisioning beefier machines) is the easiest, but most expensive, solution - which most vendors will happily recommend in such cases. Horizontal scaling (getting more machines) is hard, as increased data coupling and probabilities of node failures (network partitions) mean more complicated and careful tradeoffs between consistency and availability. Our sweet spot is somewhere in between. While ORMs (like GORM) provide almost 1:1 compatibility between SQL and application-level interfaces, being judicious with SQL remains advantageous and should be based on the (data and operational) needs of the application. For our chat application, some desirable (data) traits are: Messages from users must not be lost (durability). Ordering of messages is important (within a topic). Only a few standard query types: CRUD on Users, Topics, and Messages; message ordering by timestamp, limited to either within a topic or by a user (for the last N messages). Given that our data "shapes" are simple and that reads dominate writes (i.e., 1 message posted is read by many participants in a topic), we are choosing to optimize for write consistency, simplicity, and read availability, within reasonable latency. Now we are ready to look at the query patterns/conventions. Unified Database Object First, we will add a simple data access layer that will encapsulate all the calls to the database for each particular model (topics, messages, users). Let us create an overarching "DB" object that represents our Postgres DB (in db/db.go): type OneHubDB struct { storage *gorm.DB } This wraps the GORM database object (and its connection) to the underlying DB. The Topic Store, User Store, and Message Store modules all operate on this single DB instance (via GORM) to read/write data from their respective tables (topics, users, messages). Note that this is just one possible convention. We could have instead used three different DB (gorm.DB) instances, one for each entity type: e.g., TopicDB, UserDB, and MessageDB.
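We have not shown the constructor for this object; here is a minimal sketch of how it might be wired up - with a few assumptions rather than the repository's exact code: the NewOneHubDB name (which we call later from cmd/server.go), a MaxPageSize field used by the List methods below, and letting GORM auto-migrate the schema at startup:

package datastore

import (
	"log"

	"gorm.io/gorm"
)

// NewOneHubDB wraps a gorm.DB and ensures our tables and indexes exist.
func NewOneHubDB(storage *gorm.DB) *OneHubDB {
	// AutoMigrate creates/updates the users, topics, messages, and gen_ids tables
	// (and their indexes) from the struct tags on our models. GenId is introduced
	// in the next section.
	if err := storage.AutoMigrate(&User{}, &Topic{}, &Message{}, &GenId{}); err != nil {
		log.Fatal("AutoMigrate failed: ", err)
	}
	return &OneHubDB{
		storage:     storage,
		MaxPageSize: 1000, // assumed default cap for List queries
	}
}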
Use Custom IDs Instead of Auto-Generated Ones We are choosing to generate our own primary keys (IDs) for topics, users, and messages instead of depending on the auto-increment (or auto-id) generation by the database engine. This was for the following reasons: An auto-generated key is localized to the database instance that generates it. This means if/when we add more partitions to our databases (for horizontal scaling) these keys will need to be synchronized, and migrating existing keys to avoid duplications at a global level is much harder. Auto-increment keys offer reduced randomness, making it easy for attackers to "iterate" through all entities. Sometimes we may simply want string keys that are custom assignable if they are available (for SEO purposes). Auto-generated keys lack attribution (a central/global key server, by contrast, can also annotate keys for analytics purposes). For these purposes, we have added a GenId table that keeps track of all used IDs so we can perform collision detection, etc: type GenId struct { Class string `gorm:"primaryKey"` Id string `gorm:"primaryKey"` CreatedAt time.Time } Naturally, this is not a scalable solution when the data volume is large, but it suffices for our demo, and when needed, we can move this table to a different DB and still preserve the keys/IDs. Note that GenId itself is also managed by GORM and uses a combination of Class + Id as its primary key. An example of this is Class=Topic and Id=123. Random IDs are assigned by the application in a simple manner: func randid(maxlen int) string { max_id := int64(math.Pow(36, maxlen)) randval := rand.Int63() % max_id return strconv.FormatInt(randval, 36) } func (tdb *OneHubDB) NextId(cls string) string { for { // 4-character IDs (like "q43u" in the examples below); increase maxlen for a larger keyspace. gid := GenId{Id: randid(4), Class: cls, CreatedAt: time.Now()} err := tdb.storage.Create(&gid).Error if err == nil { return gid.Id } log.Println("ID Create Error (retrying): ", err) } } The method randid generates a maxlen-sized string of random characters. This is simply a random 63-bit integer modulo maxid (where maxid = 36 ^ maxlen), rendered in base 36. The NextId method is used by the different entity create methods (below) to repeatedly generate random IDs if collisions exist. In case you are worried about excessive collisions or are interested in understanding their probabilities, you can learn about them here. Judicious Use of Indexes Indexes are very beneficial to speed up certain data retrieval operations at the expense of increased writes and storage. We have limited our use of indexes to a small handful of cases where strong consistency was needed (and could be scaled easily): Topics sorted by name (for an alphabetical sorting of topics) Messages sorted by topic and creation timestamps (for the natural ordering of message lists) What is the impact of this on our application? Let us find out. Topic Creations and Indexes When a topic is created (or updated), an index write is required. Topic creations/updates are relatively low-frequency operations (compared to message postings), so a slightly increased write latency is acceptable. In a more realistic chat application, a topic creation is a bit more heavyweight due to the need to check permissions, apply compliance rules, etc. So this latency hit is acceptable. Furthermore, this index would only be needed when "searching" for topics, and even an asynchronous index update would have sufficed. Message Related Indexes To consider the usefulness of indexes related to messages, let us look at some usage numbers.
This is a very simple application, so these scalability issues most likely won't be a concern (so feel free to skip this section). If your goals are a bit more lofty, looking at Slack's usage numbers, we can estimate/project some usage numbers for our own demo to make it interesting: Number of daily active topics: 100 Number of active users per topic: 10 Messages sent by an active user in a topic: every 5 minutes (assume time to type, read other messages, research, think, etc.) Thus, the number of messages created each day is: = 100 * 10 * (1440 minutes in a day / 5 minutes) = 288k messages per day ~ 3 messages per second In the context of these numbers, at roughly 3 messages per second, even with an extra index (or three), we can handle this load comfortably in a typical database that can handle 10k IOPS, which is rather modest. It is easy to wonder if this scales as the number of topics, the number of active users per topic, or the creation frenzy increases. Let us consider a more intense setup (in a larger or busier organization). Instead of the numbers above, if we had 10k topics and 100 active users with a message every minute (instead of 5 minutes), our write QPS would be: WriteQPS: = 10000 * 100 * 1440 / 1 = 1.44B messages per day ~ 17k messages per second That is quite a considerable blow-up. We can solve this in a couple of ways: Accept a higher latency on writes - For example, instead of requiring a write to happen in a few milliseconds, accept an SLO of, say, 500ms. Update indexes asynchronously - This doesn't get us that much further, as the number of writes in the system has not changed - only the when has changed. Shard our data Let us look at sharding! Our write QPS is in aggregate. On a per-topic level, it is quite low (17k/10000 ~ 1.7 qps). However, user behavior for our application is that such activities on a topic are fairly isolated. We only want our messages to be consistent and ordered within a topic - not globally. We now have the opportunity to dynamically scale our databases (or the Messages tables) to be partitioned by topic IDs. In fact, we could build a layer (a control plane) that dynamically spins up database shards and moves topics around, reacting to load as and when needed. We will not go that extreme here, but this series is trending towards just that, especially in the context of SaaS applications. The annoyed reader might be wondering if this deep dive was needed right now! Perhaps not - but by understanding our data and user experience needs, we can make careful tradeoffs. Going forward, such mini-dives will help us quickly evaluate tradeoffs (e.g., when building/adding new features). Store-Specific Implementations Now that we have our basic DB and common methods, we can move on to each entity's store implementation. For each entity, we will create the basic CRUD methods: Create, Update, Get, Delete, List/Search. The Create and Update methods are combined into a single "Save" method that does the following: If an ID is not provided, then treat it as a create (assigning a new ID via the NextId method). If an ID is provided, treat it as an update-or-insert (upsert) operation. Since we have a base model, Create and Update will set the CreatedAt and UpdatedAt fields respectively. The Delete method is straightforward. The only key thing here is that instead of leveraging GORM's cascading delete capabilities, we delete the related entities in a separate call.
We will not worry about consistency issues resulting from this (e.g., errors in subsequent delete methods). For the Get method, we will fetch using a standard GORM get-query-pattern based on a common id column we use for all models. If an entity does not exist, then we return nil. Users DB Our user entity methods are pretty straightforward using the above conventions. The Delete method additionally deletes all Messages for/by the user first before deleting the user itself. This ordering ensures that if the deletion of messages fails, the user deletion won't proceed, giving the caller a chance to retry. package datastore import ( "errors" "log" "strings" "time" "gorm.io/gorm" ) func (tdb *OneHubDB) SaveUser(user *User) (err error) { db := tdb.storage user.UpdatedAt = time.Now() if strings.Trim(user.Id, " ") == "" { return InvalidIDError // an ID must be assigned (e.g., via NextId) before saving } result := db.Save(user) err = result.Error if err == nil && result.RowsAffected == 0 { user.CreatedAt = time.Now() err = tdb.storage.Create(user).Error } return } func (tdb *OneHubDB) DeleteUser(userId string) (err error) { // Delete the user's messages first, then the user itself. err = tdb.storage.Where("user_id = ?", userId).Delete(&Message{}).Error if err == nil { err = tdb.storage.Where("id = ?", userId).Delete(&User{}).Error } return } func (tdb *OneHubDB) GetUser(id string) (*User, error) { var out User err := tdb.storage.First(&out, "id = ?", id).Error if err != nil { log.Println("GetUser Error: ", id, err) if errors.Is(err, gorm.ErrRecordNotFound) { return nil, nil } else { return nil, err } } return &out, err } func (tdb *OneHubDB) ListUsers(pageKey string, pageSize int) (out []*User, err error) { query := tdb.storage.Model(&User{}).Order("name asc") if pageKey != "" { // A fuller implementation would decode pageKey into an offset; keep it at 0 for now. count := 0 query = query.Offset(count) } if pageSize <= 0 || pageSize > tdb.MaxPageSize { pageSize = tdb.MaxPageSize } query = query.Limit(pageSize) err = query.Find(&out).Error return out, err } Topics DB Our topic entity methods are also pretty straightforward using the above conventions. The Delete method additionally deletes all messages in the topic first before deleting the topic itself. This ordering ensures that if the deletion of messages fails, the topic deletion won't proceed, giving the caller a chance to retry.
Topic entity methods: package datastore import ( "errors" "log" "strings" "time" "gorm.io/gorm" ) /////////////////////// Topic DB func (tdb *OneHubDB) SaveTopic(topic *Topic) (err error) { db := tdb.storage topic.UpdatedAt = time.Now() if strings.Trim(topic.Id, " ") == "" { return InvalidIDError // an ID must be assigned (e.g., via NextId) before saving } result := db.Save(topic) err = result.Error if err == nil && result.RowsAffected == 0 { topic.CreatedAt = time.Now() err = tdb.storage.Create(topic).Error } return } func (tdb *OneHubDB) DeleteTopic(topicId string) (err error) { err = tdb.storage.Where("topic_id = ?", topicId).Delete(&Message{}).Error if err == nil { err = tdb.storage.Where("id = ?", topicId).Delete(&Topic{}).Error } return } func (tdb *OneHubDB) GetTopic(id string) (*Topic, error) { var out Topic err := tdb.storage.First(&out, "id = ?", id).Error if err != nil { log.Println("GetTopic Error: ", id, err) if errors.Is(err, gorm.ErrRecordNotFound) { return nil, nil } else { return nil, err } } return &out, err } func (tdb *OneHubDB) ListTopics(pageKey string, pageSize int) (out []*Topic, err error) { query := tdb.storage.Model(&Topic{}).Order("name asc") if pageKey != "" { // A fuller implementation would decode pageKey into an offset; keep it at 0 for now. count := 0 query = query.Offset(count) } if pageSize <= 0 || pageSize > tdb.MaxPageSize { pageSize = tdb.MaxPageSize } query = query.Limit(pageSize) err = query.Find(&out).Error return out, err } Messages DB Message entity methods: package datastore import ( "errors" "strings" "time" "gorm.io/gorm" ) func (tdb *OneHubDB) GetMessages(topic_id string, user_id string, pageKey string, pageSize int) (out []*Message, err error) { user_id = strings.Trim(user_id, " ") topic_id = strings.Trim(topic_id, " ") if user_id == "" && topic_id == "" { return nil, errors.New("either topic_id or user_id or both must be provided") } query := tdb.storage if topic_id != "" { query = query.Where("topic_id = ?", topic_id) } if user_id != "" { query = query.Where("user_id = ?", user_id) } if pageKey != "" { offset := 0 query = query.Offset(offset) } if pageSize <= 0 || pageSize > 10000 { pageSize = 10000 } query = query.Limit(pageSize) err = query.Find(&out).Error return out, err } // Get messages in a topic ordered by creation timestamp func (tdb *OneHubDB) ListMessagesInTopic(topic_id string, pageKey string, pageSize int) (out []*Message, err error) { err = tdb.storage.Where("topic_id = ?", topic_id).Order("created_at asc").Find(&out).Error return } func (tdb *OneHubDB) GetMessage(msgid string) (*Message, error) { var out Message err := tdb.storage.First(&out, "id = ?", msgid).Error if err != nil { if errors.Is(err, gorm.ErrRecordNotFound) { return nil, nil } else { return nil, err } } return &out, err } func (tdb *OneHubDB) ListMessages(topic_id string, pageKey string, pageSize int) (out []*Message, err error) { query := tdb.storage.Where("topic_id = ?", topic_id).Order("created_at asc") if pageKey != "" { count := 0 query = query.Offset(count) } if pageSize <= 0 || pageSize > tdb.MaxPageSize { pageSize = tdb.MaxPageSize } query = query.Limit(pageSize) err = query.Find(&out).Error return out, err } func (tdb *OneHubDB) CreateMessage(msg *Message) (err error) { msg.CreatedAt = time.Now() msg.UpdatedAt = time.Now() result := tdb.storage.Model(&Message{}).Create(msg) err = result.Error return } func (tdb *OneHubDB) DeleteMessage(msgId string) (err error) { err = tdb.storage.Where("id = ?", msgId).Delete(&Message{}).Error return } func (tdb *OneHubDB) SaveMessage(msg *Message) (err error) { db := tdb.storage q := db.Model(msg).Where("id = ?
and version = ?", msg.Id, msg.Version) msg.UpdatedAt = time.Now() result := q.UpdateColumns(map[string]interface{}{ "updated_at": msg.UpdatedAt, "content_type": msg.ContentType, "content_text": msg.ContentText, "content_data": msg.ContentData, "user_id": msg.SourceId, "source_id": msg.SourceId, "parent_id": msg.ParentId, "version": msg.Version + 1, }) err = result.Error if err == nil && result.RowsAffected == 0 { // Must have failed due to versioning err = MessageUpdateFailed } return } The Messages entity methods are slightly more involved. Unlike the other two, the Messages entity methods also include searching by topic and searching by user (for ease). This is done in the GetMessages method, which provides paginated (and ordered) retrieval of messages for a topic or by a user. Write Converters To/From Service/DB Models We are almost there. Our database is ready to read/write data. It just needs to be invoked by the service. Going back to our original plan: |---------------| |-----------| |--------| |------| | Request Proto | <-> | Service | <-> | GORM | <-> | DB | |---------------| |-----------| |--------| |------| We have our service models (generated by protobuf tools) and we have our DB models that GORM understands. We will now add converters to convert between the two. Converters for entity X will follow these conventions: A method XToProto of type func(input *datastore.X) (out *protos.X) A method XFromProto of type func(input *protos.X) (out *datastore.X) With that, our converter for Topics is quite simple (and boring): package services import ( "log" "github.com/lib/pq" ds "github.com/panyam/onehub/datastore" protos "github.com/panyam/onehub/gen/go/onehub/v1" "google.golang.org/protobuf/types/known/structpb" tspb "google.golang.org/protobuf/types/known/timestamppb" ) func TopicToProto(input *ds.Topic) (out *protos.Topic) { var userIds map[string]bool = make(map[string]bool) for _, userId := range input.Users { userIds[userId] = true } out = &protos.Topic{ CreatedAt: tspb.New(input.BaseModel.CreatedAt), UpdatedAt: tspb.New(input.BaseModel.UpdatedAt), Name: input.Name, Id: input.BaseModel.Id, CreatorId: input.CreatorId, Users: userIds, } return } func TopicFromProto(input *protos.Topic) (out *ds.Topic) { out = &ds.Topic{ BaseModel: ds.BaseModel{ CreatedAt: input.CreatedAt.AsTime(), UpdatedAt: input.UpdatedAt.AsTime(), Id: input.Id, }, Name: input.Name, CreatorId: input.CreatorId, } if input.Users != nil { var userIds []string for userId := range input.Users { userIds = append(userIds, userId) } out.Users = pq.StringArray(userIds) } return } The full set of converters can be found here - Service/DB Models Converters. Hook Up the Converters in the Service Definitions Our last step is to invoke the converters above in the service implementation. The methods are pretty straightforward. For example, for the TopicService we have: CreateTopic During creation we allow custom IDs to be passed in. If an entity with the ID exists, the request is rejected. If an ID is not passed in, a random one is assigned. The Creator and Name parameters are required fields. The topic is converted to a "DBTopic" model and saved by calling the SaveTopic method. UpdateTopic All our Update<Entity> methods follow a similar pattern: Fetch the existing entity (by ID) from the DB. Update the entity fields based on fields marked in the update_mask (so patches are allowed). Update with any extra entity-specific operations (e.g., AddUsers, RemoveUsers, etc.)
- these are just for convenience so the caller would not have to provide an entire "final" users list each time. Convert the updated proto to a "DB Model." Call SaveTopic on the DB. SaveTopic uses the "version" field in our DB to perform an optimistically concurrent write. This ensures that if another request/thread writes to the entity between our read and our write, our write fails instead of silently overwriting the newer version. The Delete, List, and Get methods are fairly straightforward. The UserService and MessageService are also implemented in a very similar way, with minor differences to suit specific requirements. Testing It All Out We have a database up and running (go ahead and start it with docker compose up). We have converters to/from service and database models. We have implemented our service code to access the database. We just need to connect to this (running) database and pass a connection object to our services in our runner binary (cmd/server.go): Add an extra flag to accept a DB endpoint. This can be used to point at a different database if needed. var ( addr = flag.String("addr", ":9000", "Address to start the onehub grpc server on.") gw_addr = flag.String("gw_addr", ":8080", "Address to start the grpc gateway server on.") db_endpoint = flag.String("db_endpoint", "", fmt.Sprintf("Endpoint of DB where all topics/messages state are persisted. Default value: ONEHUB_DB_ENDPOINT environment variable or %s", DEFAULT_DB_ENDPOINT)) ) Create a *gorm.DB instance from the db_endpoint value. We have already created a little utility method for opening a GORM-compatible SQL DB given an address: cmd/utils/db.go: package utils import ( "log" "strings" "github.com/panyam/goutils/utils" "gorm.io/driver/postgres" "gorm.io/driver/sqlite" "gorm.io/gorm" ) func OpenDB(db_endpoint string) (db *gorm.DB, err error) { log.Println("Connecting to DB: ", db_endpoint) if strings.HasPrefix(db_endpoint, "sqlite://") { dbpath := utils.ExpandUserPath(db_endpoint[len("sqlite://"):]) db, err = gorm.Open(sqlite.Open(dbpath), &gorm.Config{}) } else if strings.HasPrefix(db_endpoint, "postgres://") { db, err = gorm.Open(postgres.Open(db_endpoint), &gorm.Config{}) } if err != nil { log.Println("Cannot connect DB: ", db_endpoint, err) } else { log.Println("Successfully connected DB: ", db_endpoint) } return } Now let us create the method OpenOHDB, which is a simple wrapper that also checks for a db_endpoint value from an environment variable (if it is not provided) and subsequently opens the gorm.DB instance needed for a OneHubDB instance: func OpenOHDB() *ds.OneHubDB { if *db_endpoint == "" { *db_endpoint = cmdutils.GetEnvOrDefault("ONEHUB_DB_ENDPOINT", DEFAULT_DB_ENDPOINT) } db, err := cmdutils.OpenDB(*db_endpoint) if err != nil { log.Fatal(err) } return ds.NewOneHubDB(db) } With the above two, we need a simple change to our main method: func main() { flag.Parse() ohdb := OpenOHDB() go startGRPCServer(*addr, ohdb) startGatewayServer(*gw_addr, *addr) } Now we shall also pass the ohdb instance to the gRPC service creation methods. And we are ready to test our durability! Remember we set up auth in a previous part, so we need to pass login credentials, albeit fake ones (where password = login + "123"). Create a Topic First, list the topics we have: curl localhost:8080/v1/topics -u auser:auser123 | json_pp { "nextPageKey" : "", "topics" : [] } That's right. We do not have any topics yet, so let us create some.
curl -X POST localhost:8080/v1/topics \ -u auser:auser123 \ -H 'Content-Type: application/json' \ -d '{"topic": {"name": "First Topic"}}' | json_pp Yielding: { "topic" : { "createdAt" : "1970-01-01T00:00:00Z", "creatorId" : "auser", "id" : "q43u", "name" : "First Topic", "updatedAt" : "2023-08-04T08:14:56.413050Z", "users" : {} } } Let us create a couple more: curl -X POST localhost:8080/v1/topics \ -u auser:auser123 \ -H 'Content-Type: application/json' \ -d '{"topic": {"name": "First Topic", "id": "1"}}' | json_pp curl -X POST localhost:8080/v1/topics \ -u auser:auser123 \ -H 'Content-Type: application/json' \ -d '{"topic": {"name": "Second Topic", "id": "2"}}' | json_pp curl -X POST localhost:8080/v1/topics \ -u auser:auser123 \ -H 'Content-Type: application/json' \ -d '{"topic": {"name": "Third Topic", "id": "3"}}' | json_pp With a list query returning: { "nextPageKey" : "", "topics" : [ { "createdAt" : "1970-01-01T00:00:00Z", "creatorId" : "auser", "id" : "q43u", "name" : "First Topic", "updatedAt" : "2023-08-04T08:14:56.413050Z", "users" : {} }, { "createdAt" : "1970-01-01T00:00:00Z", "creatorId" : "auser", "id" : "dejc", "name" : "Second Topic", "updatedAt" : "2023-08-05T06:52:33.923076Z", "users" : {} }, { "createdAt" : "1970-01-01T00:00:00Z", "creatorId" : "auser", "id" : "zuoz", "name" : "Third Topic", "updatedAt" : "2023-08-05T06:52:35.100552Z", "users" : {} } ] } Get Topic by ID We can do a listing as in the previous section. We can also obtain individual topics: curl localhost:8080/v1/topics/q43u -u auser:auser123 | json_pp { "topic" : { "createdAt" : "1970-01-01T00:00:00Z", "creatorId" : "auser", "id" : "q43u", "name" : "First Topic", "updatedAt" : "2023-08-04T08:14:56.413050Z", "users" : {} } } Send and List Messages on a Topic Let us send a few messages on the "First Topic" (id = "q43u"): curl -X POST localhost:8080/v1/topics/q43u/messages -u 'auser:auser123' -H 'Content-Type: application/json' -d '{"message": {"content_text": "Message 1"}}' curl -X POST localhost:8080/v1/topics/q43u/messages -u 'auser:auser123' -H 'Content-Type: application/json' -d '{"message": {"content_text": "Message 2"}}' curl -X POST localhost:8080/v1/topics/q43u/messages -u 'auser:auser123' -H 'Content-Type: application/json' -d '{"message": {"content_text": "Message 3"}}' Now to list them: curl localhost:8080/v1/topics/q43u/messages -u 'auser:auser123' | json_pp { "messages" : [ { "contentData" : null, "contentText" : "Message 1", "contentType" : "", "createdAt" : "0001-01-01T00:00:00Z", "id" : "hlso", "topicId" : "q43u", "updatedAt" : "2023-08-07T05:00:36.547072Z", "userId" : "auser" }, { "contentData" : null, "contentText" : "Message 2", "contentType" : "", "createdAt" : "0001-01-01T00:00:00Z", "id" : "t3lr", "topicId" : "q43u", "updatedAt" : "2023-08-07T05:00:39.504294Z", "userId" : "auser" }, { "contentData" : null, "contentText" : "Message 3", "contentType" : "", "createdAt" : "0001-01-01T00:00:00Z", "id" : "8ohi", "topicId" : "q43u", "updatedAt" : "2023-08-07T05:00:42.598521Z", "userId" : "auser" } ], "nextPageKey" : "" } Conclusion Who would have thought setting up and using a database would have been such a meaty topic? We covered a lot of ground here that will both give us a good "functioning" service as well as a foundation when implementing new ideas in the future: We chose a relational database - Postgres - for its strong modeling capabilities, consistency guarantees, performance, and versatility.
We also chose an ORM library (GORM) to improve our velocity and portability if we need to switch to another relational data store. We wrote data models that GORM could use to read/write from the database. We eased the setup by hosting both Postgres and its admin UI (pgAdmin) in a Docker Compose file. We decided to use GORM carefully and judiciously to balance velocity with minimal reliance on complex queries. We discussed some conventions that will help us along in our application design and extensions. We also walked through a way to assess and analyze scalability challenges as they arise, and used that to guide our tradeoff decisions (e.g., the type and number of indexes). We wrote converter methods to convert between service and data models. We finally used the converters in our service to offer a "real" persistent implementation of a chat service where messages can be posted and read. Now that we have a "minimum usable app," there are a lot of useful features to add to our service to make it more and more realistic (and hopefully production-ready). Take a breather and see you soon in continuing the exciting adventure! In the next post, we will look at also including our main binary (with the gRPC service and REST Gateway) in the Docker Compose environment without sacrificing hot reloading and debugging.