Reactive Kafka With Spring Boot
Ending Microservices Chaos: How Architecture Governance Keeps Your Microservices on Track
Kubernetes in the Enterprise
In 2014, Kubernetes' first commit was pushed to GitHub. And 10 years later, it is now one of the most prolific open-source systems in the software development space. So what made Kubernetes so deeply entrenched within organizations' systems architectures? Its promise of scale, speed, and delivery, that is — and Kubernetes isn't going anywhere any time soon.

DZone's fifth annual Kubernetes in the Enterprise Trend Report dives further into the nuances and evolving requirements for the now 10-year-old platform. Our original research explored topics like architectural evolutions in Kubernetes, emerging cloud security threats, advancements in Kubernetes monitoring and observability, the impact and influence of AI, and more, results from which are featured in the research findings.

As we celebrate a decade of Kubernetes, we also look toward ushering in its future, discovering how developers and other Kubernetes practitioners are guiding the industry toward a new era. In the report, you'll find insights like these from several of our community experts; these practitioners guide essential discussions around mitigating the Kubernetes threat landscape, observability lessons learned from running Kubernetes, considerations for effective AI/ML Kubernetes deployments, and much more.
API Integration Patterns
Threat Detection
In this emerging generative AI era, it is your responsibility as a data architect to keep tabs on the architectures that cater to generative AI. From data management to data governance to data lineage, architectures need to evolve to handle ever-growing volumes of data. In this article, you will learn about emerging data architectures like data mesh, generative AI-centric, and quantum-based architectures, along with existing architectures like data fabric. The article concludes with the key differences between the existing and the emerging data architectures.

Generative AI and Data Architecture

As we started the article with generative AI, it makes sense to first talk about how generative AI, including large language models (LLMs) and other generative models, is transforming how organizations process and utilize data. Generative AI models require vast amounts of high-quality data for training and inference, driving the need for scalable, flexible data architectures.

Key Components of Generative AI Architecture

- Data processing layer: Collects, organizes, and processes data for generative AI models. It is responsible for data cleansing, standardization, and feature extraction.
- Generative model layer: Contains the AI models that generate new material or data and includes model selection, training, and fine-tuning.
- Feedback and improvement layer: Incorporates user feedback and interaction analysis to improve model performance.
- Application layer: Facilitates human-machine collaboration and makes AI models available via user interfaces or APIs.
- Model layer and hub: Consists of foundation models, fine-tuned models, and a centralized model hub for accessing and managing diverse AI models.

Modern Data Architecture Paradigms

Data Mesh

Data mesh is a decentralized architecture that treats data as a product and assigns responsibility for each data domain (e.g., sales, marketing, finance) to the relevant business units. Data mesh is more about distributing data ownership and enabling cross-functional teams to manage data in a way that aligns with the business needs of that domain.

Example: In a large healthcare organization, each department, like cardiology, radiology, and pathology, owns and manages its own datasets, exposing them as products that can be accessed by other departments as needed.

Key components: domain-oriented data products, a self-serve data platform, federated governance, and data discovery and catalog.
Tools: Apache Kafka, Kubernetes, Databricks Unity Catalog, Collibra Data Intelligence Cloud.

Data Fabric

Data fabric, as used by companies like IBM, is a unified architecture that aims to provide seamless, integrated data access, governance, and management across all environments (on-premise, cloud, hybrid) using a combination of technologies, tools, and processes. To ensure a consistent data experience across an organization, the data fabric architecture focuses on data integration, discovery, security, and orchestration. Data fabric can enable seamless data access and governance for customer data from multiple sources (websites, mobile apps, CRM systems) across different regions (Europe, Asia, North America) in a centralized manner.
Data Fabric Architecture

Key components: metadata management, a data integration layer, data virtualization, and an AI/ML engine for automated data management.
Tools: Informatica Intelligent Data Management Cloud, IBM Cloud Pak for Data, Talend Data Fabric.

Lakehouse Architecture

Lakehouse combines the best features of data lakes and data warehouses. Lakehouses provide a flexible foundation for storing and processing the large datasets required for generative AI.

Key components: object storage, a metadata layer, a query engine, and ACID transaction support.
Tools: Databricks Delta Lake, Apache Hudi, Snowflake, Google BigLake.

Cloud-Native and Real-Time Architectures

Cloud-native and real-time architectures are essential for supporting the computational demands and low-latency requirements of generative AI applications.

Key components: serverless computing, containerization, stream processing, and in-memory computing.
Tools: AWS Lambda, Azure Functions, Apache Kafka, Apache Flink, Redis.

AI and Machine Learning Integration

Specialized architectures for AI and ML workloads are crucial for supporting generative AI models.

Key components: feature store, model registry, experiment tracking, and GPU clusters.
Tools: MLflow, Kubeflow, Amazon SageMaker, Google Vertex AI, Weights & Biases.

Data Governance and Security

With the sensitive nature of data used in generative AI, robust governance and security measures are paramount.

Key components: data catalog, data lineage tracking, fine-grained access control, and data encryption.
Tools: Collibra, Alation, Apache Atlas, HashiCorp Vault.

Emerging Trends

Edge Computing

Edge computing is becoming increasingly important for deploying generative AI models closer to data sources, reducing latency, and improving privacy.

Tools: Azure IoT Edge, AWS IoT Greengrass, TensorFlow Lite.

Quantum Computing

While still in the early stages, quantum computing has the potential to revolutionize certain aspects of generative AI, particularly in areas like cryptography and complex optimization problems.

Tools: IBM Quantum, Google Cirq, Microsoft Quantum Development Kit.

Generative AI-Specific Architectures

Retrieval Augmented Generation (RAG)

RAG architectures combine retrieval systems with generative models to produce more accurate and contextually relevant outputs. A minimal sketch of this flow appears at the end of this section.

Key components: document retrieval system, vector database, an LLM for generation, and a prompt engineering layer.
Tools: Pinecone, Weaviate, LangChain, Haystack.

Fine-Tuning and Transfer Learning Architectures

Fine-tuning and transfer learning architectures support adapting pre-trained generative models to specific domains or tasks.

Key components: pre-trained model repository, fine-tuning pipeline, evaluation framework, and a model versioning system.
Tools: Hugging Face Transformers, OpenAI GPT-3 Fine-tuning API, Google T5.

Multimodal Generative AI Architectures

Architectures supporting generative AI across multiple modalities (text, image, audio, video) are becoming increasingly important.

Key components: modality-specific encoders and decoders, cross-modal attention mechanisms, and unified representation learning.
Tools: OpenAI DALL-E, Google Imagen, NVIDIA Omniverse.
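To make the RAG pattern above more concrete, here is a minimal, illustrative Java sketch of the retrieve-then-generate flow. The Retriever and TextGenerator interfaces are hypothetical placeholders, not part of any specific product; in a real system they would be backed by a vector database such as Pinecone or Weaviate and an LLM API, often wired together with a framework like LangChain or Haystack.

```java
import java.util.List;

// Hypothetical abstractions standing in for a vector database and an LLM endpoint.
interface Retriever {
    List<String> retrieve(String query, int topK);
}

interface TextGenerator {
    String generate(String prompt);
}

public class RagPipeline {

    private final Retriever retriever;
    private final TextGenerator generator;

    public RagPipeline(Retriever retriever, TextGenerator generator) {
        this.retriever = retriever;
        this.generator = generator;
    }

    public String answer(String question) {
        // 1. Document retrieval: fetch the passages most relevant to the question.
        List<String> context = retriever.retrieve(question, 3);

        // 2. Prompt engineering layer: augment the prompt with the retrieved context.
        String prompt = "Answer the question using only the context below.\n"
                + "Context:\n" + String.join("\n", context) + "\n"
                + "Question: " + question;

        // 3. Generation: the LLM produces an answer grounded in the retrieved documents.
        return generator.generate(prompt);
    }
}
```

The point of the pattern is the middle step: the model's output is grounded in retrieved, up-to-date documents rather than in its training data alone.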
Conclusion

To conclude, as a data architect, it's essential to understand these evolving architectures and how they can be applied to support generative AI initiatives within your organization. The choice of architecture may vary depending on the specific use cases, data volumes, performance requirements, and existing infrastructure. By leveraging these emerging architectures, tools, and technologies, you can design scalable, flexible, and efficient data systems that drive innovation in the era of generative AI.

| Aspect | Existing Architectures | Emerging Architectures |
|---|---|---|
| Data storage | Centralized (data warehouse, data lake) | Decentralized (blockchain, edge, quantum databases) |
| Data processing | ETL, batch processing, streaming | AI-driven automation, quantum computing, edge processing |
| Data ownership | Centralized (often by IT or a data team) | Domain-oriented (data mesh) or decentralized (blockchain) |
| Scalability | Vertical scaling (on-premise) or hybrid (cloud-based) | Horizontal scaling (quantum, edge) and distributed (blockchain) |
| Data governance | Centralized with manual interventions | AI-driven governance, automated compliance, decentralized governance |
| Real-time processing | Limited, often batch-driven, or near real-time in cloud | Real-time everywhere (edge, AI-driven automation) |
Repository and Data Access Object (DAO) play crucial roles in software development and data handling. However, their purposes and contexts differ, especially when we consider how they relate to the business logic of an application. Let's explore the key differences between these concepts, where they originate, and when you should choose one.

About the Repository and Data Access Object (DAO) Patterns

The Repository pattern originates from Domain-Driven Design (DDD), as described by Eric Evans in his book, "Domain-Driven Design: Tackling Complexity in the Heart of Software." Repositories are not just about managing data; they encapsulate business logic, ensuring that operations adhere to the Ubiquitous Language of the domain.

On the other hand, the Data Access Object (DAO) pattern comes from early enterprise application design, and its formalization can be found in the book "Core J2EE Patterns: Best Practices and Design Strategies" by Deepak Alur, John Crupi, and Dan Malks, published in 2001. DAO abstracts and encapsulates all access to a data source, separating persistence logic from business logic. This allows changes to the underlying data source without affecting the rest of the application, which is especially useful in enterprise contexts where systems might switch databases or data storage mechanisms over time.

The primary distinction between Repository and DAO lies in the semantics of the API they expose:

- DAO: Data-centric, providing database operations such as insert, update, delete, and find. These methods directly map to database actions, focusing on how data is stored and retrieved.
- Repository: Domain-driven and aligned with the business logic. It represents a collection of aggregate roots or entities. Instead of database operations, repositories offer operations that reflect the domain language. For instance, in a hotel system, a Repository might provide checkIn, checkOut, or checkReservation, abstracting away the actual data operations.

| Aspect | DAO | Repository |
|---|---|---|
| Focus | Database-centric | Domain-centric |
| Operation naming | Based on CRUD actions (insert, update) | Reflects business logic (checkIn, reserve) |
| Abstraction level | Low-level, close to the database | High-level, abstracted from database details |
| Origin | Core J2EE Patterns book (2001) | Domain-Driven Design (Eric Evans, 2003) |
| When to use | Simple applications or direct database access | Complex domains with rich business logic |
| Implementation context | Can vary between databases | Often maps to aggregate roots in DDD |

Illustration

Let's illustrate this difference with a simple example in Java. We'll model a Room entity and treat the Guest as a Value Object:

```java
public class Room {
    private int number;
    private boolean available;
    private boolean cleaned;
    private Guest guest;
    // Getters and setters omitted for brevity
}

public record Guest(String name, String document) {
}
```

Now, let's look at how the DAO and Repository would handle operations for this entity. In the case of a DAO, the method terminology reflects database operations, like insert, update, or delete:

```java
public interface RoomDAO {
    void insert(Room room);
    void update(Room room);
    void delete(Room room);
    Optional<Room> findById(int number);
}
```

The Repository, on the other hand, defines operations that align with the business domain.
The methods relate to actions that make sense in the context of managing a hotel:

```java
public interface Hotel {
    void checkIn(Room room);
    void checkOut(Room room);
    Optional<Room> checkReservation(int number);
}
```

Use DAO when your application primarily performs database operations without complex business logic. For example, if you're working on a simple CRUD app or your data source is the focus, DAO fits naturally. Use Repository when your application has complex domain logic that needs to be encapsulated. This is especially relevant when following Domain-Driven Design principles, where the Repository is an abstraction that ensures business operations are consistent with domain rules.

Understanding the distinction between DAO and Repository matters when building scalable and maintainable applications. While both patterns handle data, DAOs are more focused on the persistence layer, and Repositories encapsulate business logic, abstracting away how the data is stored. Choosing the correct pattern for your scenario ensures that your application remains flexible, robust, and easy to maintain.

The sample Java code provided reveals that Repositories operate at the domain level, interacting with entities and business operations, while DAOs focus on the underlying data storage and retrieval. This conceptual difference is important as it shapes the structure and maintainability of your code in the long run.
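To tie the two patterns together before closing, here is a minimal sketch (not from the original example, and with illustrative rules) of a Repository implementation that encapsulates the domain operations while delegating persistence to the RoomDAO shown earlier. It assumes conventional getters and setters on Room, which were omitted above for brevity.

```java
import java.util.Optional;

// Illustrative sketch: a domain-level Repository built on top of the data-centric DAO.
public class HotelRepository implements Hotel {

    private final RoomDAO roomDAO;

    public HotelRepository(RoomDAO roomDAO) {
        this.roomDAO = roomDAO;
    }

    @Override
    public void checkIn(Room room) {
        // The rule is expressed in the domain's ubiquitous language, not in CRUD terms.
        if (!room.isAvailable() || !room.isCleaned()) {
            throw new IllegalStateException("Room is not ready for check-in");
        }
        room.setAvailable(false);
        roomDAO.update(room); // persistence details stay hidden behind the DAO
    }

    @Override
    public void checkOut(Room room) {
        room.setGuest(null);
        room.setCleaned(false);
        room.setAvailable(true);
        roomDAO.update(room);
    }

    @Override
    public Optional<Room> checkReservation(int number) {
        return roomDAO.findById(number);
    }
}
```

Callers only ever see the Hotel abstraction; whether RoomDAO is backed by JDBC, JPA, or an in-memory map is an implementation detail that can change without touching the domain code.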
In previous weeks, I've analyzed several libraries and frameworks that augment the client with AJAX capabilities:

- Vue.js
- Alpine.js
- HTMX
- Vaadin

In this post, I'll compare them across several axes.

Analysis

- Frontend skills: Remember that I started this series from the point of view of a backend developer. In this section, I grade how much you need to know about client technologies to complete the job.
- Team organization: In the introduction, I hinted that the decoupling of frontend and backend teams profoundly impacted projects. Each team is fast on its own, and they can parallelize their work, but integrating the two can double the initial development time. Here, I grade how easy it is to integrate frontend and backend.
- Ease of setup: How easy it is to set up the tool from the backend.
- Ease of styling: Backend developers are not designers by default. Does the tool offer a default, at least average-looking style? How hard is it to create one?

For all intents and purposes, Vue.js and Alpine.js are similar; I'll refer to them as JavaScript frameworks.

Frontend Skills

- JavaScript frameworks: You need the full HTML, JavaScript (it's in the name), and CSS triptych.
- HTMX: You only need HTML and CSS; HTMX takes care of the JavaScript burden.
- Vaadin: No frontend skills needed; Vaadin takes care of everything.

Team Organization

- JavaScript frameworks: Depends on each developer's skills: either a separation between frontend and backend development, or they can develop their use case from the database to the UI.
- HTMX and Vaadin: Each developer can develop their use case from the database to the UI.

Ease of Setup

- JavaScript frameworks and HTMX: Thanks to WebJars, you can manage dependencies in the POM, and WebJars Locator allows not specifying the version number in the HTML. You still need to reference each library in the HTML page.
- Vaadin: Everything is in the POM. As Vaadin generates the whole frontend, you don't need additional setup.

Ease of Styling

- JavaScript frameworks and HTMX: No default; one needs to use an existing library, e.g., Bootstrap, or create their own.
- Vaadin: Vaadin comes bundled with the Lumo theme. Other themes are available in the Vaadin Add-ons Directory, such as the Parity Theme. Applying a theme is as easy as setting it as a dependency and adding an annotation. Creating a custom theme is no small potatoes, though. You can ease the task by starting from an existing one and changing it bit by bit. Vaadin, the company, also provides custom themes for a fee.
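As a rough illustration of what "no frontend skills needed" means in practice, here is a minimal Vaadin Flow view written entirely in Java. The route and class names are my own, and the snippet assumes a standard Vaadin starter project rather than anything from the original series.

```java
import com.vaadin.flow.component.button.Button;
import com.vaadin.flow.component.notification.Notification;
import com.vaadin.flow.component.orderedlayout.VerticalLayout;
import com.vaadin.flow.component.textfield.TextField;
import com.vaadin.flow.router.Route;

// A minimal Vaadin Flow view: the component tree is declared in Java,
// and Vaadin generates and serves the corresponding frontend.
@Route("hello")
public class HelloView extends VerticalLayout {

    public HelloView() {
        TextField name = new TextField("Your name");
        Button greet = new Button("Greet", event ->
                Notification.show("Hello, " + name.getValue()));
        add(name, greet); // no hand-written HTML, JavaScript, or CSS
    }
}
```

Styling-wise, a view like this already picks up the bundled Lumo theme mentioned above without any extra work.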
Time to Choose

If you are still unsure how to proceed, here are my recommendations:

- If you're working on a regular business app, e.g., forms, choose Vaadin. Business apps are Vaadin's primary use case, and it will shine there, immensely increasing productivity.
- If your app requires good-looking visualization widgets, choose Vaadin as well. For example, its Vaadin Charts component is truly amazing. Note that it's commercially licensed, though.
- If you want to offer an API from the start, choose Vue or Alpine. While it's possible to use HTMX or Vaadin, it doesn't make sense in this context. I also emphasize "from the start": everybody plans to offer an API at some point, but most never do. The possible productivity potential you plan to have in the future is not worth the guaranteed productivity in the next months. The same goes for distributing your app over several channels — from the start (bis repetita placent).
- If you're in none of these situations, it's time to go into more detail. Are your developers skilled in frontend technologies? Are they willing to learn to close the gap? Will you need these skills in the near future?

These are a few questions that can help you decide which way to go.

Conclusion

This post concludes my series on AJAX and SSR. I hope you had as much fun reading it as I did writing it. The complete source code for this post can be found on GitHub.
In this article, I aim to discuss modern approaches used to reduce the time required to establish a data transmission channel between two nodes. I will be examining both plain TCP and TLS-over-TCP.

What Is a Handshake?

First, let's define what a handshake is, and for that, an illustration of the TCP handshake serves very well. The handshake is the process of exchanging messages between two nodes (such as a client and a server) to establish a connection and negotiate communication parameters before data transmission begins. The purpose of a handshake is to:

- Confirm the readiness of both nodes to communicate
- Agree on connection parameters: protocol versions, encryption methods, authentication, and other technical details
- Ensure the security and reliability of the connection before transmitting sensitive or important data

The handshake is a critically important step in network interaction because it sets the foundation for subsequent communication, guaranteeing that both nodes "understand" each other and can securely exchange information. Thus, no data can be exchanged between the communicating parties until the handshake is done.

Why Is Optimizing the Handshake Time Important?

Even if a handshake looks like a very short and simple procedure, it can take a very long time for long-distance communication. It's well known that the maximum distance between any two points on the Earth's surface is about 20,000 km, which is half the Earth's circumference. If you take into account the speed of light (~300,000 km/s), the maximum RTT (round-trip time) between two points should be no more than about 134 ms (a 40,000 km round trip at the speed of light).

Unfortunately, that is not how global networks work. Imagine we have a client in the Netherlands and a server in Australia. The RTT between them in practice will be around 250 ms because the network will send your packets first to the US, then to Singapore, and only after that deliver them to Australia. The channels are also usually quite loaded, which can delay the packets even more. It means that a basic TCP handshake could easily take, let's say, 300 ms. And this is probably more than the server will spend to generate an answer.

To illustrate all that, let me start a simple HTTP server somewhere in Sydney. The client will be the Apache Benchmark utility, used to generate a lot of sequential requests and to see the resulting timings.

```
$ ab -n 1000 -c 1 http://localhost/
…
Concurrency Level:      1
Time taken for tests:   0.087 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      205000 bytes
Requests per second:    11506.16 [#/sec] (mean)
Time per request:       0.087 [ms] (mean)
```

As we can see, the service is blazingly fast and serves 11.5K RPS (requests per second). Now let's move the client to Amsterdam and recheck the same service.

```
$ ab -n 1000 -c 1 http://service-in-sydney/
…
Concurrency Level:      1
Time taken for tests:   559.284 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      205000 bytes
Requests per second:    1.79 [#/sec] (mean)
Time per request:       559.284 [ms] (mean)
```

The service is roughly 6,400 times slower! And this is all about the latency between the nodes. Of course, if your servers communicate a lot, they'll probably have a pool of established connections and the problem won't be that awful, but even in this situation, you have to reconnect from time to time. Also, no one keeps connections open to every partner service it uses, just the major ones. So, it can show severe delays from time to time.

Another approach is to multiplex multiple requests in a single HTTP connection. That can also work great, but in HTTP/2, for instance, a single stream can slow down all the other streams together. So, it is not always the best way to go.
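Before moving on: if you want to see the handshake cost from your own code rather than through ab, a rough way is to time the TCP connect separately from the request itself. The sketch below is only illustrative, and the host name is a placeholder.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Rough illustration: the TCP three-way handshake alone costs a full RTT,
// which connect() makes directly visible.
public class HandshakeCost {
    public static void main(String[] args) throws IOException {
        InetSocketAddress address =
                new InetSocketAddress("service-in-sydney.example", 80); // placeholder host; DNS resolved here
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(address, 5_000); // returns once the three-way handshake completes
            long connected = System.nanoTime();
            System.out.printf("TCP handshake took %.1f ms%n",
                    (connected - start) / 1_000_000.0);
            // Only now can the first byte of the HTTP request leave the client.
        }
    }
}
```

On a LAN this prints a fraction of a millisecond; against a server on another continent it shows the full round trip spent before a single byte of the request can be sent.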
Trusted Environments

We'll talk about trusted environments only, and this is for a reason. For each of the methods we'll discuss, I'll mention the security or privacy issues it has, if any. To mitigate them, additional measures should be taken, which can be very complicated. That is why I'll stick to an environment where no one will send any malicious packets to the services' endpoints. Consider a VPN or at least a network with good firewall protection. Local area networks are also fine, but handshake time optimizations there are usually not that beneficial.

TCP Fast Open

First, we will discuss the base-level optimization, suitable for every TCP-based higher-level protocol: TCP Fast Open (TFO).

History

In 2011, a group of engineers at Google, as part of efforts to improve network performance, published a project proposing an extension to the TCP protocol that allowed clients to add data to the SYN packet of the three-way handshake. In other words, for small requests, clients could include them directly in the very first TCP packet sent to the server and start receiving data one RTT faster. To understand how important this is, remember that companies like Google have numerous data centers on most continents, and the software services operating there need to communicate actively with each other. It is not always possible to stay within a single data center, and network packets often have to cross oceans.

The technology was first used within Google itself, and in the same year (2011), the first patches were submitted to the Linux kernel code. After checks and refinements, TCP Fast Open was included in the 3.6 release of the kernel in 2012. Subsequently, over the next few years, the extension was actively tested under real conditions and improved in terms of security and compatibility with devices based on other operating systems. Finally, in December 2014, TFO was standardized and published as RFC 7413.

Today, almost all of the biggest IT companies use TFO in their data centers; you can find articles about it from Google, Meta, Cloudflare, Netflix, Amazon, Microsoft, etc. It means the technology works, is useful to have, and is quite reliable.

How It Works

There are a lot of good articles describing how TFO works. My favorite is this one, which clarifies in much detail all the possible scenarios. I will describe it very briefly.

First Connection: Explanation of Steps

1. Client sends SYN (TFO cookie request): The client initiates a TCP connection by sending a SYN packet with a TFO option indicating a cookie request. No application data is sent at this stage because the client does not have the TFO cookie yet.
2. Server responds with SYN-ACK (including TFO cookie): The server replies with a SYN-ACK packet, including a freshly generated TFO cookie in the TFO option. The cookie is a small piece of data used to validate future TFO connections from this client.
3. Client sends ACK: The client acknowledges the receipt of the SYN-ACK by sending an ACK packet, completing the TCP three-way handshake. At this point, the TCP connection is established.

Now that the connection is established, the client sends the HTTP request over the TCP connection. The server processes the HTTP request and sends back the HTTP response to the client.
Later Connections: Explanation of Steps

1. The client initiates a new TCP connection by sending a SYN packet. The SYN packet includes:
   - The TFO cookie obtained from the previous connection
   - HTTP request data (e.g., a GET request) attached to the SYN packet; the size of the payload is limited to the MSS and usually varies from 500 bytes to 1 KB
2. Upon receiving the SYN packet, the server:
   - Validates the TFO cookie to ensure it is legitimate
   - Accepts the early data (HTTP request) included in the SYN packet
3. The server sends a SYN-ACK packet back to the client. The server may send early HTTP response data if it processes the request quickly enough. Alternatively, the server may send the HTTP response after the handshake is complete.
4. The client sends an ACK.

The Experiment

As you can see, all the later connections can save 1 RTT on getting the data. Let's test it with our Amsterdam/Sydney client/server lab. So as not to modify either the client or the server, I'll add a reverse proxy service next to each of them.

System-Wide Configuration

Before doing anything, you need to ensure TFO is enabled on your server and the client. To do that on Linux, take a look at the net.ipv4.tcp_fastopen bitmask parameter. Usually, it will be set by default to one or three.

```
# enabling client and server support
$ sudo sysctl -w net.ipv4.tcp_fastopen=3
net.ipv4.tcp_fastopen = 3
```

According to the kernel documentation, the possible bits for the parameter are as follows:

- 0x1 (client): Enables sending data in the opening SYN on the client
- 0x2 (server): Enables the server support, i.e., allowing data in a SYN packet to be accepted and passed to the application before the 3-way handshake finishes
- 0x4 (client): Send data in the opening SYN regardless of cookie availability and without a cookie option
- 0x200 (server): Accept data-in-SYN without any cookie option present
- 0x400 (server): Enable all listeners to support Fast Open by default without an explicit TCP_FASTOPEN socket option

Note the bits 0x4 and 0x200. They allow bypassing the need for the client to have a cookie altogether. This enables accelerating even the first TCP handshake with a previously unknown client, which can be acceptable and useful in a trusted environment. Option 0x400 allows forcing the server code, even if it knows nothing about TFO, to accept data received in the SYN packet. This is generally not safe but may be necessary if modifying the server code is not feasible.

Reverse Proxy Configuration

After configuring the system, let's install and configure a reverse proxy. I will use HAProxy for that, but you can choose any.

HAProxy configuration on the server host:

```
frontend fe_main
    bind :8080 tfo                  # TFO enabled on the listening socket
    default_backend be_main

backend be_main
    mode http
    server server1 127.0.0.1:8081   # No TFO towards the server
```

HAProxy configuration on the client's host:

```
frontend fe_main
    bind 127.0.0.1:8080             # No TFO on the listening socket
    default_backend be_main

backend be_main
    mode http
    retry-on all-retryable-errors
    http-request disable-l7-retry if METH_POST
    server server1 server-in-sydney:8080 tfo   # TFO enabled towards the proxy on the server
```

And let's test again:

```
$ ab -n 1000 -c 1 http://localhost:8080/
…
Concurrency Level:      1
Time taken for tests:   280.452 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      205000 bytes
Requests per second:    3.57 [#/sec] (mean)
Time per request:       280.452 [ms] (mean)
```

As you can see, we got a result twice as good as before. And except for the URL, we had nothing to change in the client or the server themselves.
The reverse proxy handled all the TFO-related work, and that is a great way to go.

Downsides of TFO

Now the sad part about TFO. The first and biggest issue is that it is vulnerable to so-called replay attacks. Imagine an intruder who sniffs the handshake of some client and then starts sending malicious requests on behalf of that client. Yes, the attacker won't receive the results of the requests, but it can break something, mount an amplification attack toward the client, or just exhaust the server's resources.

Another big issue is privacy. On the web, clients get cookies, and we all know how those cookies can be used to track clients. The same applies to TFO, unfortunately. As soon as your device gets a cookie from the server, it will use it for many hours for all outgoing connections to that server, even if no early data is sent there. And the cookie becomes a unique ID for your machine. A good paper with an investigation of these security flaws can be found here.

Also, even though the technology is quite mature, a lot of network devices can still block and drop SYN packets with a payload, which causes issues. That said, TFO is dangerous to use on the open Internet, and it is more suitable for safer internal networks.

TLS

TLS (Transport Layer Security) is a cryptographic protocol designed to provide end-to-end security for data transmitted over a network. It achieves this by encrypting the data between two communicating applications so eavesdroppers can't intercept or tamper with the information. During the TLS handshake, the client and server agree on encryption algorithms and exchange keys, setting up a secure channel for data transfer. And, yes, there's the "handshake" word again.

First, let's take a look at the sequence diagram of a TLS 1.2 connection.

Protected HTTP Request: Explanation of Steps

1. TCP SYN: The client initiates a TCP connection by sending a SYN packet to the server.
2. TCP SYN-ACK: The server responds to the client's SYN with a SYN-ACK packet to acknowledge the client's SYN and indicate readiness to establish a connection.
3. TCP ACK: The client acknowledges the server's SYN-ACK by sending an ACK packet to complete the TCP three-way handshake, establishing the TCP connection.
4. ClientHello: The client initiates the TLS handshake by sending a ClientHello message over the established TCP connection. It contains the following:
   - Protocol version: Highest TLS version supported (e.g., TLS 1.2)
   - Client random: A random value used in key generation
   - Session ID: For session resumption (may be empty)
   - Cipher suites: List of supported encryption algorithms
   - Compression methods: Usually null (no compression)
   - Extensions: Optional features (e.g., Server Name Indication)
5. ServerHello: The server responds with a ServerHello message, which contains:
   - Protocol version: Selected TLS version (must be ≤ client's version)
   - Server random: Another random value for key generation
   - Session ID: Chosen session ID (matches the client's if resuming a session)
   - Cipher suite: Selected encryption algorithm from the client's list
   - Compression method: Selected compression method
   - Extensions: Optional features agreed upon
6. Certificate: The server sends its X.509 certificate, which contains the server's public key and identity information, to authenticate itself to the client. This could be a chain of certificates.
7. ServerKeyExchange (optional): Sent if additional key exchange parameters are needed (e.g., for a Diffie-Hellman key exchange).
8. ServerHelloDone: Indicates that the server has finished sending its initial handshake messages and awaits the client's response.
9. ClientKeyExchange: The client sends key exchange information to the server to allow both client and server to compute the shared premaster secret.
10. ChangeCipherSpec: The client signals that it will start using the newly negotiated encryption and keys.
11. Finished: The client sends a Finished message encrypted with the new cipher suite. It contains a hash of all previous handshake messages. It allows the server to verify that the handshake integrity is intact and that the client has the correct keys.
12. ChangeCipherSpec: The server signals that it will start using the negotiated encryption and keys.
13. Finished: The server sends a Finished message encrypted with the new cipher suite. It contains a hash of all previous handshake messages and allows the client to verify that the server has successfully completed the handshake and that the keys match.
14. HTTP Request: Now that the connection is established, the client sends the HTTP request over the encrypted TLS connection.
15. HTTP Response: The server processes the HTTP request and sends back the encrypted HTTP response to the client.

A big one, isn't it? If you analyze the diagram, you can see that at least 4 RTTs are required to establish a connection and get a response to the client's request. Let's check it in our test lab with the well-known curl utility.

The Experiment

To get various useful timings from curl, I usually create the following file somewhere in the filesystem of a server:

```
$ cat /var/tmp/curl-format.txt
\n
time_namelookup: %{time_namelookup}\n
time_connect: %{time_connect}\n
time_appconnect: %{time_appconnect}\n
time_pretransfer: %{time_pretransfer}\n
time_redirect: %{time_redirect}\n
time_starttransfer: %{time_starttransfer}\n
----------\n
time_total: %{time_total}\n
\n
size_request: %{size_request}\n
size_upload: %{size_upload}\n
size_header: %{size_header}\n
size_download: %{size_download}\n
speed_upload: %{speed_upload}\n
speed_download: %{speed_download}\n
\n
```

Now let's use it (the -w parameter) and make a call using TLS 1.2 from Amsterdam to Sydney:

```
$ curl -w "@/var/tmp/curl-format.txt" https://server-in-sydney/ --http1.1 --tlsv1.2 --tls-max 1.2 -o /dev/null
time_namelookup: 0.001645
time_connect: 0.287356
time_appconnect: 0.867459
time_pretransfer: 0.867599
time_redirect: 0.000000
time_starttransfer: 1.154564
----------
time_total: 1.154633

size_request: 86
size_upload: 0
size_header: 181
size_download: 33
speed_upload: 0
speed_download: 28
```

More than a second for a single request! Remember that the service is capable of serving 11.5K requests per second. Let's try to do something about it.

TLSv1.3

This is the newest stable version of the TLS protocol, defined in "RFC 8446 - The Transport Layer Security (TLS) Protocol Version 1.3." And you know what? It is great and absolutely safe to use everywhere you can set it up. The only downside is that not every client supports it yet, but servers can be configured to support both TLS 1.2 and 1.3, so it is not a big deal.

Now, let's take a look at what is so great in TLSv1.3 for our handshake optimization. For that, here's a new sequence diagram.

Protected HTTP Request: Explanation of Steps

1. TCP SYN: The client initiates a TCP connection by sending a SYN packet to the server.
2. TCP SYN-ACK: The server responds to the client's SYN with a SYN-ACK packet to acknowledge the client's SYN and indicate readiness to establish a connection.
3. TCP ACK: The client acknowledges the server's SYN-ACK by sending an ACK packet to complete the TCP three-way handshake, establishing the TCP connection.
4. ClientHello: The client initiates the TLS handshake by sending a ClientHello message. It proposes security parameters and provides key material for key exchange. The message contains:
   - Protocol version: Indicates TLS 1.3
   - Random value: A random number for key generation
   - Cipher suites: List of supported cipher suites
   - Key share extension: Contains the client's ephemeral public key for key exchange (e.g., ECDHE)
   - Supported versions extension: Indicates support for TLS 1.3
   - Other extensions: May include Server Name Indication (SNI), signature algorithms, etc.
5. ServerHello: The server responds with a ServerHello message to agree on security parameters and complete the key exchange. Its contents include:
   - Protocol version: Confirms TLS 1.3
   - Random value: Another random number for key generation
   - Cipher suite: Selected cipher suite from the client's list
   - Key share extension: Contains the server's ephemeral public key
   - Supported versions extension: Confirms use of TLS 1.3
6. EncryptedExtensions: The server sends EncryptedExtensions containing extensions that require confidentiality, like server parameters.
7. Certificate: The server provides its certificate and any intermediate certificates to authenticate itself to the client.
8. CertificateVerify: The server proves possession of the private key corresponding to its certificate by signing all the previous handshake messages.
9. Finished: The server signals the completion of its handshake messages with an HMAC over the handshake messages, ensuring integrity.
10. Finished: The client responds with its own Finished message with an HMAC over the handshake messages, using keys derived from the shared secret.
11. HTTP Request: Now that the connection is established, the client sends the HTTP request over the encrypted TLS connection.
12. HTTP Response: The server processes the HTTP request and sends back the encrypted HTTP response to the client.

As you can see, TLSv1.3 reduces the handshake to one round trip (1-RTT) compared with the previous version of the protocol. It also allows only ephemeral key exchanges and simplified cipher suites. While that is very good for security, it is not our main concern here. For us, it means less data needs to be sent between the parties.

The Experiment

Let's try it out with our curl command:

```
$ curl -w "@/var/tmp/curl-format.txt" https://server-in-sydney/ --http1.1 --tlsv1.3 -o /dev/null
time_namelookup: 0.003245
time_connect: 0.265230
time_appconnect: 0.533588
time_pretransfer: 0.533673
time_redirect: 0.000000
time_starttransfer: 0.795738
----------
time_total: 0.795832

size_request: 86
size_upload: 0
size_header: 181
size_download: 33
speed_upload: 0
speed_download: 41
```

And it is all true: we cut one RTT from the response. Great!
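Outside of curl, the protocol version is usually pinned in the HTTP client configuration. As a hedged example, here is one way to restrict the standard Java 11+ HttpClient to TLS 1.3; the URL is a placeholder for the test service and is not part of the original experiment.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import javax.net.ssl.SSLParameters;

// Restrict the JDK HTTP client to TLS 1.3 so the 1-RTT handshake is always used,
// provided the server supports it.
public class Tls13Client {
    public static void main(String[] args) throws Exception {
        SSLParameters tls13Only = new SSLParameters();
        tls13Only.setProtocols(new String[] {"TLSv1.3"});

        HttpClient client = HttpClient.newBuilder()
                .sslParameters(tls13Only)
                .build();

        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("https://server-in-sydney.example/")) // placeholder URL
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Status: " + response.statusCode());
    }
}
```

With only TLSv1.3 enabled, a server that offers nothing newer than TLS 1.2 will fail the handshake instead of silently falling back, which is usually what you want when measuring.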
Mixing the Things

We already cut one RTT just by upgrading the TLS protocol version to 1.3. Let's remember that we also have TCP Fast Open available. With that, we can send the ClientHello message directly inside a SYN packet. Will it work? Let's find out. Curl supports an option to enable TCP Fast Open towards the target (the target still has to support TFO):

```
$ curl -w "@/var/tmp/curl-format.txt" https://server-in-sydney/ --http1.1 --tlsv1.3 -o /dev/null --tcp-fastopen
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    33    0    33    0     0     57      0 --:--:-- --:--:-- --:--:--    57
time_namelookup: 0.002215
time_connect: 0.002272
time_appconnect: 0.292444
time_pretransfer: 0.292519
time_redirect: 0.000000
time_starttransfer: 0.574951
----------
time_total: 0.575200

size_request: 86
size_upload: 0
size_header: 181
size_download: 33
speed_upload: 0
speed_download: 57
```

And it does! We have now halved the original TLSv1.2 timings. But there is one more thing to consider.

Zero Round-Trip Time (0-RTT)

TLS 1.3 doesn't just significantly streamline the handshake process, enhancing both security and performance. It also provides a new feature called Zero Round-Trip Time (0-RTT). 0-RTT allows a client to start transmitting data to a server immediately, without waiting for the full TLS handshake to complete. For that, a previous TLS connection must have existed, and its keys are reused. Let's take a look at the sequence diagram.

Zero Round-Trip Time Request: Explanation of Steps

1. TCP SYN: The client initiates a TCP connection by sending a SYN packet to the server.
2. TCP SYN-ACK: The server responds to the client's SYN with a SYN-ACK packet to acknowledge the client's SYN and indicate readiness to establish a connection.
3. TCP ACK: The client acknowledges the server's SYN-ACK by sending an ACK packet to complete the TCP three-way handshake, establishing the TCP connection.
4. ClientHello with early data (0-RTT data): The client initiates the handshake and immediately sends early data (here, the HTTP request). It contains the following:
   - Protocol version: Indicates support for TLS 1.3
   - Cipher suites: List of supported cipher suites
   - Key share extension: Contains the client's ephemeral public key for key exchange (e.g., ECDHE)
   - Pre-shared key (PSK): Includes a session ticket or PSK obtained from a previous connection
   - Early data indication: Signals the intention to send 0-RTT data
   - Early data (0-RTT data): Application data (e.g., the HTTP request) sent immediately, encrypted using keys derived from the PSK
   The server analyzes the early data and can pass it to a backend for processing.
5. ServerHello: The server responds with its ServerHello, agreeing on protocol parameters. The response contains the following:
   - Protocol version: Confirms TLS 1.3
   - Cipher suite: Selected cipher suite from the client's list
   - Key share extension: Server's ephemeral public key
   - Pre-shared key extension: Indicates acceptance of the PSK
   - Early data indication (optional): Confirms acceptance or rejection of 0-RTT data
6. EncryptedExtensions: The server sends additional handshake parameters securely: ALPN, supported groups, etc. The early data indication (optional) officially accepts or rejects the early data.
7. Finished: The server signals the completion of its handshake messages with an HMAC over the handshake messages to ensure integrity.
8. Finished: The client completes the handshake with an HMAC over the handshake messages.
9. HTTP Response: The server responds to the request received as early data.

There is no complete RTT saving here, but the request is transferred as soon as possible. That gives the server more time to process it, and there is a better chance that the response will be sent out right after the handshake completes. That is also a great thing.
Unfortunately, this approach reduces security: by TLS 1.3's design, session keys should not be reused at all, and it also opens up a surface for replay attacks. I'd suggest not using this technique in a non-secure network environment unless you carefully implement security measures to compensate for those risks.

The Experiment That Didn't Happen

I wasn't able to run this experiment, as my service responds very quickly and doesn't benefit from 0-RTT. But if you want to test it yourself, keep in mind that at the time of preparing this article, curl didn't support early data. You can do the testing with the openssl s_client utility instead. This is how it can be done:

Save the TLS session to disk with the following command:

```
openssl s_client -connect example.com:443 -tls1_3 -sess_out /tmp/session.pem < /dev/null
```

Create a file with the request:

```
echo -e "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n" > /tmp/request.txt
```

Use the saved session to query the server:

```
openssl s_client -connect example.com:443 -tls1_3 -sess_in /tmp/session.pem -early_data /tmp/request.txt -trace
```

HTTP/3

And the last thing I wanted to talk about is HTTP/3. First, it doesn't use TCP and thus has no need for a TCP handshake at all. Second, it supports the same early data approach we've just seen in 0-RTT. Lastly, all the congestion and retransmission control is now outside of your kernel and depends entirely on how the server's and the client's developers built it. For handshake latency, this means you still get the two RTTs that TLSv1.3 + TFO provide, but with a new high-level protocol encapsulated in UDP. Still, give it a try; maybe it can help. I will give a brief sequence diagram for HTTP/3, but without deep details, as they are very close to what we've seen for TLS 1.3 and 0-RTT.

Conclusion

We took a dive into four available techniques intended to decrease the time spent on handshaking between a client and a server. The handshakes appear at the TCP and TLS levels. The common idea behind them is to send the data to the server as early as possible. Let's recap:

- TCP Fast Open allows saving one RTT by putting the request into the SYN packet. It is not very safe on the open Internet, as it is prone to replay attacks, but it is stable and good to use in protected network environments. Be cautious with non-idempotent services; it is better to combine it with TLS.
- TLSv1.3 is a newer version of the TLS protocol. It saves one RTT by reducing the amount of information exchanged during the handshake, is very safe, and is great to use inside and outside of trusted networks. It can be combined with TFO to save two RTTs instead of one.
- Zero Round-Trip Time (0-RTT) is an extension for TLSv1.3. It allows a client's request to be sent very early, thus giving the server more time to process it. It comes with some security concerns but is mostly safe to use inside a trusted network perimeter for idempotent services.
- HTTP/3 is the newest version of the HTTP protocol. It uses UDP and thus has no need for a TCP handshake. It can give you a response after two RTTs, similar to TLSv1.3 + TFO.

I hope this article will be useful in making your service slightly faster. Thanks for reading!
Choosing the perfect server stack for launching a product is a decision that carries a lot of weight. This choice influences not just the initial deployment but the long-term adaptability and efficiency of your app. If you're a senior developer or leading a team, you shoulder the responsibility of these architecture decisions, sifting through a sea of languages and frameworks to find the perfect fit for your project's unique needs. Your task here is to make an important choice, one that will hold up as your project evolves and expands.

I am Grigorii Novikov, a Senior Backend Developer with years of experience in sculpting and rolling out software architectures. Throughout my career, I've been faced with plenty of critical decisions on server stack selection. Each decision has added layers to my understanding of how to align technology with the requirements of a growing project. In this article, I will share with you some of those hard-earned insights, helping you pick a server stack that will fit your project's current needs and support its future growth. I invite you to explore with me the ins and outs of making tech decisions that pave the way for success, making sure your project stands on a ground ripe for growth, flexibility, and innovation.

1. Autogenerating Documentation

Although not related to code per se, this point is so important it should be discussed first. Robust documentation is a cornerstone of efficient development, especially when it comes to client-side development and app testing. Tools for autogenerating documentation have revolutionized this process, ensuring that documentation keeps pace with the latest API changes, streamlining development workflows, and cutting down on the manual effort of keeping your project's documentation up to date.

Among the tools available to a developer, I recommend Swagger for its versatility, widespread adoption, and powerful community support. Another popular option is Redoc, which offers an attractive, customizable interface for API documentation. For projects requiring more extensive customization, tools like Apiary provide flexibility alongside documentation capabilities, though they may demand more initial setup. Whichever tool you choose, the objective should be to optimize the documentation process for efficiency without allowing the tool itself to become a significant time sink. Opt for a solution that minimizes manual documentation efforts while offering the flexibility to adapt to your project's unique requirements.

2. Bug Tracker Support

Efficient bug tracking is critical for maintaining the health of your application. For effective bug-tracking integration, I use tools like Jira and Bugzilla, both boasting a rich feature set and flexibility. Jira, in particular, offers robust integration capabilities with many development environments; Bugzilla, on the other hand, is known for its simplicity and effectiveness, especially in open-source projects where straightforward bug tracking is a priority.

Here's an insight for you: integrating bug trackers with instant messengers and version control systems will boost your team's collaboration and efficiency. For instance, the Jira+Bitbucket combo streamlines workflows, allowing for seamless issue tracking within the version control environment.
This pairing facilitates a transparent, agile development process, where code updates and issue resolutions are closely linked, enabling faster iterations and improved code quality. Another powerful integration is Mattermost+Focalboard, which offers a comprehensive collaboration platform. It combines the direct communication benefits of Mattermost with the project and task management capabilities of Focalboard, empowering teams with real-time updates on bug tracking, alongside the flexibility to manage tasks and workflows within a unified interface. Such integrations not only optimize the bug resolution process but also foster a more cohesive and agile development environment, ultimately enhancing productivity and project outcomes.

3. Scaling as You Grow

When your product starts to catch on, you will face the challenge of scaling. And I don't mean simply a surging number of users. Scaling involves fitting in new features, handling a growing database, and keeping the performance levels of your codebase and database optimal. This is when the architecture you chose for your server stack really comes into play.

For instance, at the launch of your project, going for a monolithic architecture might seem like a balanced approach. But as your product grows and changes, you'll start to see where it falls short. Transitioning to a microservices architecture or bringing in scalable cloud services can give you much finer control over different aspects of your application. For scalable server stack solutions, I lean towards technologies like Kubernetes and Docker. These tools will give you the flexibility to scale services independently, manage deployments efficiently, and ensure consistency across your environments. Furthermore, cloud service providers like Amazon Web Services, Google Cloud, and Microsoft Azure offer stellar managed services that can really simplify your scaling journey.

Choosing a scalable architecture means balancing the perks of scalability with the complexities of managing a distributed system. Ultimately, your aim here is to pick a server stack that meets your present needs and has the flexibility to handle future growth.

4. Finding the Perfect Fit: Between Community and Security

There is no shortage of programming languages and frameworks available, each with its own set of perks like community support, resource availability, and even security features. This diversity allows a broad choice of solutions that not only address immediate development challenges but also align with long-term project goals, including security and scalability.

Technologies backed by large communities and abundant resources, such as Python and JavaScript, along with their respective frameworks like Django or React, provide a wealth of knowledge and ready-to-use code examples. This wealth significantly cuts down the time you'd otherwise spend on troubleshooting, given the slim odds of encountering an issue not tackled by someone before you. Conversely, newer or niche technologies may bring unique perks to the table, but will often leave you bracing for a tougher time when it comes to finding quick solutions.

Another crucial consideration is balancing security and usability. For projects where source code protection is a major concern, consider using languages and technologies that support easy obfuscation and secure packaging. For instance, Java and .NET have established tools and ecosystems for obfuscating code. Containerization technologies like Docker will also help you here.
By packaging the application and its environment into a container, you ensure that the client receives everything needed to run the app without directly accessing your code. This method not only secures the code but also simplifies the deployment process.

5. Cost

Cost considerations are critical in the selection of a technology stack. It's not just about the cost of the initial setup; you also have to think long-term about what it'll cost to maintain and scale your system. Open-source technologies come with the sweet perk of zero licensing fees upfront. For startups or any project on a tight budget, this can be a major draw. Additionally, the vast pool of adept developers will help you keep labor costs more manageable.

On the other hand, more complex and specialized technologies, such as blockchain or advanced data analytics platforms, may require a higher initial investment. While they offer significant pros in terms of performance and security, you should weigh the total cost of ownership against the projected benefits. Furthermore, cloud services, while reducing the need for physical infrastructure, come with their own set of costs. The above-mentioned AWS, Google Cloud, and Azure offer various pricing models that can scale with your usage; yet without careful management, these costs can spiral as your project grows.

6. Code Delivery

Ensuring efficient code delivery focuses on the deployment process, primarily through Continuous Integration/Continuous Deployment (CI/CD) pipelines. This method underscores the importance of automating the transfer of code into various environments, streamlining development and production workflows. Tools such as GitLab CI and CircleCI offer robust solutions for automating testing and deployment processes. Additionally, the use of scripting tools like Ansible and Terraform further enhances this automation, allowing for the provisioning and management of infrastructure through code. These technologies will help you build a seamless pipeline that moves code from development to production with precision and reliability. By integrating these tools into your workflow, you establish a framework that not only accelerates development cycles but also ensures consistency and stability across environments.

7. Environment

Creating and managing the development environment is a foundational yet complex aspect of any project's lifecycle. Designing a scalable and maintainable environment can seem daunting, especially for teams with no dedicated DevOps specialist. For many teams, the answer to the question of the best approach to environment management lies in leveraging cloud-based services and containerization. Again, AWS, Google Cloud, and Azure offer a range of services that can be tailored to fit the size and complexity of your project. These platforms provide the tools necessary to create flexible, scalable environments without the need for extensive infrastructure management. Furthermore, the adoption of technologies like Docker and Kubernetes makes deployment across different stages of development, testing, and production consistent and reliable.

Building an effective and comfortable environment is not only about the server setup but also about the configuration of local environments for developers. This aspect is crucial for DevOps, as they often craft scripts to simplify the process of launching projects locally. However, this task is not always an easy one.
For instance, preparing local environments in .NET can be quite challenging, highlighting the need for choosing technologies and tools that streamline both server and local setups. Ensuring developers have seamless access to efficient local development environments is essential for maintaining productivity and facilitating a smooth workflow.

Choosing the right server stack for your project is like setting the foundations for a building: it requires careful consideration, foresight, and a balance between current needs and future growth. Each choice you make impacts your project's success and its capacity to adapt and flourish in the dynamic technological landscape. With this article, I aimed to guide you through these critical decisions, equipping you with the insights to handle the complexities ahead. I hope that the insights you gained today will help you make informed choices that lead you to the success of your current and future projects!

Case Study A: Mass Lie Detector Project

In the development of a groundbreaking lie detector designed for mass testing, a project marked as the first of its kind in Eastern Europe, I was faced with the server stack choice as the development team's lead. The project's core requirements — a vast number of microservice connections and extensive file operations to process diverse sensor outputs — required a robust yet flexible backend solution.

We opted for Python with FastAPI over other contenders like Python/Django and Go/Fiber. The decision hinged on FastAPI's superior support for asynchronous programming, a critical feature for handling the project's intensive data processing needs efficiently. Django, while powerful, was set aside due to its synchronous nature, which could not meet our requirements for high concurrency and real-time data handling. Similarly, Go was considered for its performance but ultimately passed over in favor of FastAPI's rapid development capabilities and its built-in support for Swagger documentation, which was invaluable for our tight MVP development timeline.

At the same time, the project demanded the creation of a softcam feature capable of managing webcam connections and directing the video stream across various channels. C++ became the language of choice for this task, thanks to its unparalleled execution speed and cross-platform compatibility. The decisions we made on that project have not only facilitated the project's initial success but have laid a solid foundation for its continuous growth and adaptation.

Case Study B: Martial Arts Club CRM

For this project, I initially opted for Python and Django, choosing them for their rapid development capabilities essential for a swift launch. This choice proved effective in the early stages, directly contributing to increased club revenue through improved attendance management. As the project's scope expanded to include features like employee management, analytics, and an internal messaging system, the limitations of Django for handling complex, concurrent processes became apparent.

This realization led me to integrate Go, leveraging its goroutines and Fasthttp for the development of our internal messenger. Go's performance in managing concurrent tasks helped us expand the CRM's functionality, allowing us to maintain high performance with minimal overhead. The decision to use a hybrid technology approach, utilizing Django for core functionalities and Go for high-performance components, proved to be a critical one.
This strategy allowed me to balance rapid development and scalability, ensuring the CRM could evolve to meet the growing needs of the club.
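As a small, hedged illustration of the asynchronous handling that drove the FastAPI choice in Case Study A, here is a minimal sketch; the endpoint and function names are invented for the example and are not from the original project. While one request awaits slow sensor I/O, the event loop remains free to serve other requests, which is the property that mattered for high-concurrency, real-time data handling.
Python
import asyncio
from fastapi import FastAPI

app = FastAPI()

async def read_sensor_stream(sensor_id: str) -> dict:
    # Stand-in for non-blocking I/O such as reading a sensor feed or calling a microservice.
    await asyncio.sleep(0.1)
    return {"sensor_id": sensor_id, "status": "ok"}

@app.get("/sensors/{sensor_id}")
async def get_sensor(sensor_id: str):
    # While this coroutine awaits, FastAPI's event loop keeps handling other requests.
    return await read_sensor_stream(sensor_id)

A side benefit noted in Case Study A comes for free here as well: FastAPI automatically generates interactive Swagger documentation for this endpoint, which is what made it so convenient under a tight MVP timeline.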
In today's data-driven world, businesses must adapt to rapid changes in how data is managed, analyzed, and utilized. Traditional centralized systems and monolithic architectures, while historically sufficient, are no longer adequate to meet the growing demands of organizations that need faster, real-time access to data insights. A revolutionary framework in this space is event-driven data mesh architecture, and when combined with AWS services, it becomes a robust solution for addressing complex data management challenges. The Data Dilemma Many organizations face significant challenges when relying on outdated data architectures. These challenges include: Centralized, Monolithic, and Domain-Agnostic Data Lakes A centralized data lake is a single storage location for all your data, making it easy to manage and access but potentially causing performance issues if not scaled properly. A monolithic data lake combines all data handling processes into one integrated system, which simplifies setup but can be hard to scale and maintain. A domain-agnostic data lake is designed to store data from any industry or source, offering flexibility and broad applicability, but it may be complex to manage and less optimized for specific uses. Traditional Architecture Failure Pressure Points Centralized Data Architecture In traditional data systems, several problems can occur. Data producers may send large volumes of data or data with errors, creating issues downstream. As data complexity increases and more diverse sources contribute to the system, the centralized data platform can struggle to handle the growing load, leading to crashes and slow performance. Increased demand for rapid experimentation can overwhelm the system, making it hard to quickly adapt and test new ideas. Data response times may become a challenge, causing delays in accessing and using data, which affects decision-making and overall efficiency. Divergence Between Operational and Analytical Data Landscapes In software architecture, issues like siloed ownership, unclear data use, tightly coupled data pipelines, and inherent limitations can cause significant problems. Siloed ownership occurs when different teams work in isolation, leading to coordination issues and inefficiencies. A lack of clear understanding of how data should be used or shared can result in duplicated efforts and inconsistent results. Coupled data pipelines, where components are too dependent on each other, make it difficult to adapt or scale the system, leading to delays. Finally, inherent limitations in the system can slow down the delivery of new features and updates, hindering overall progress. Addressing these pressure points is crucial for a more efficient and responsive development process. Challenges With Big Data Online Analytical Processing (OLAP) systems organize data in a way that makes it easier for analysts to explore different aspects of the data. To answer queries, these systems must transform operational data into a format suitable for analysis while handling large volumes of data. Traditional data warehouses use ETL (Extract, Transform, Load) processes to manage this. Big data technologies like Apache Hadoop improved on traditional data warehouses by addressing scaling issues, and because they were open source, any company could use them as long as it could manage the infrastructure. Hadoop introduced a new approach by allowing unstructured or semi-structured data, rather than enforcing a strict schema upfront.
This flexibility, where data could be written without a predefined schema and structured later during querying, made it easier for data engineers to handle and integrate data. Adopting Hadoop often meant forming a separate data team: data engineers handled data extraction, data scientists managed cleaning and restructuring, and data analysts performed analytics. This setup sometimes led to problems because communication between the data team and application developers was limited, often deliberately so, to avoid impacting production systems. Problem 1: Issues With Data Model Boundaries The data used for analysis is closely linked to its original structure, which can be problematic with complex, frequently updated models. Changes to the data model affect all users, making them vulnerable to these changes, especially when the model involves many tables. Problem 2: Bad Data and the Costs of Ignoring the Problem Bad data often goes unnoticed until it causes issues in a schema, leading to problems like incorrect data types. Since validation is often delayed until the end of the process, bad data can spread through pipelines, resulting in expensive fixes and inconsistent solutions. Bad data can lead to significant business losses, such as billing errors costing millions. Research indicates that bad data costs businesses trillions annually, wasting substantial time for knowledge workers and data scientists. Problem 3: Lack of Single Ownership Application developers, who are experts in the source data model, typically do not communicate this information to other teams. Their responsibilities often end at their application and database boundaries. Data engineers, who manage data extraction and movement, often work reactively and have limited control over data sources. Data analysts, far removed from developers, face challenges with the data they receive, leading to coordination issues and the need for separate solutions. Problem 4: Custom Data Connections In large organizations, multiple teams may use the same data but create their own processes for managing it. This results in multiple copies of data, each managed independently, creating a tangled mess. It becomes difficult to track ETL jobs and ensure data quality, leading to inaccuracies due to factors like synchronization issues and less secure data sources. This scattered approach wastes time, money, and opportunities. Data mesh addresses these issues by treating data as a product with clear schemas, documentation, and standardized access, reducing bad data risks and improving data accuracy and efficiency. Data Mesh: A Modern Approach Data Mesh Architecture Data mesh redefines data management by decentralizing ownership and treating data as a product, supported by self-service infrastructure. This shift empowers teams to take full control over their data while federated governance ensures quality, compliance, and scalability across the organization. In simpler terms, it is an architectural framework designed to resolve complex data challenges through decentralized ownership and distributed methods. It is used to integrate data from various business domains for comprehensive data analytics, and it is built on top of strong data sharing and governance policies. Goals of Data Mesh Data mesh helps organizations extract valuable insights from data at scale; in short, it addresses an ever-changing data landscape, a growing number of data sources and users, the variety of data transformations needed, and the need to adapt to change quickly.
Data mesh solves all the above-mentioned problems by decentralizing control, so teams can manage their own data without it being isolated in separate departments. This approach improves scalability by distributing data processing and storage, which helps avoid slowdowns in a single central system. It speeds up insights by allowing teams to work directly with their own data, reducing delays caused by waiting for a central team. Each team takes responsibility for its own data, which boosts quality and consistency. By using easy-to-understand data products and self-service tools, data mesh ensures that all teams can quickly access and manage their data, leading to faster, more efficient operations and better alignment with business needs. Key Principles of Data Mesh Decentralized data ownership: Teams own and manage their data products, making them responsible for their quality and availability. Data as a product: Data is treated like a product with standardized access, versioning, and schema definitions, ensuring consistency and ease of use across departments. Federated governance: Policies are established to maintain data integrity, security, and compliance, while still allowing decentralized ownership. Self-service infrastructure: Teams have access to scalable infrastructure that supports the ingestion, processing, and querying of data without bottlenecks or reliance on a centralized data team. How Do Events Help Data Mesh? Events help a data mesh by allowing different parts of the system to share and update data in real time. When something changes in one area, an event notifies other areas about it, so everyone stays up to date without needing direct connections. This makes the system more flexible and scalable because it can handle lots of data and adapt to changes easily. Events also make it easier to track how data is being used and managed, and they let each team handle its own data without relying on others. Finally, let us look at the event-driven data mesh architecture. Event-Driven Data Mesh Architecture This event-driven approach lets us separate the producers of data from the consumers, making the system more scalable as domains evolve over time without needing major changes to the architecture. Producers are responsible for generating events, which are then sent to a data-in-transit system. The streaming platform ensures these events are delivered reliably. When a producer microservice or datastore publishes a new event, it gets stored in a specific topic. This triggers listeners on the consumer side, like Lambda functions or Kinesis consumers, to process the event and use it as needed. Leveraging AWS for Event-Driven Data Mesh Architecture AWS offers a suite of services that complement the event-driven data mesh model, allowing organizations to scale their data infrastructure, ensure real-time data delivery, and maintain high levels of governance and security. Here's how various AWS services fit into this architecture: AWS Kinesis for Real-Time Event Streaming In an event-driven data mesh, real-time streaming is a crucial element. AWS Kinesis provides the ability to collect, process, and analyze real-time streaming data at scale. Kinesis offers several components: Kinesis Data Streams: Ingests real-time events and processes them concurrently across multiple consumers. Kinesis Data Firehose: Delivers event streams directly to S3, Redshift, or Elasticsearch for further processing and analysis.
Kinesis Data Analytics: Processes data in real time to derive insights on the fly, allowing immediate feedback loops in data processing pipelines. AWS Lambda for Event Processing AWS Lambda is the backbone of serverless event processing in the data mesh architecture. With its ability to automatically scale and process incoming data streams without requiring server management, Lambda is an ideal choice for: Processing Kinesis streams in real time Handling API Gateway requests in response to specific events Interacting with DynamoDB, S3, or other AWS services to store, process, or analyze data AWS SNS and SQS for Event Distribution AWS Simple Notification Service (SNS) acts as the primary event broadcasting system, sending real-time notifications across distributed systems. AWS Simple Queue Service (SQS) ensures that messages between decoupled services are delivered reliably, even in the event of partial system failures. These services allow decoupled microservices to interact without direct dependencies, ensuring that the system remains scalable and fault-tolerant. AWS DynamoDB for Real-Time Data Management In decentralized architectures, DynamoDB provides a scalable, low-latency NoSQL database that can store event data in real time, making it ideal for storing the results of data processing pipelines. It supports the Outbox pattern, where events generated from the application are stored in DynamoDB and consumed by the streaming service (e.g., Kinesis or Kafka). AWS Glue for Federated Data Catalog and ETL AWS Glue offers a fully managed data catalog and ETL service, essential for federated data governance in the data mesh. Glue helps catalog, prepare, and transform data in distributed domains, ensuring discoverability, governance, and integration across the organization. AWS Lake Formation and S3 for Data Lakes While the data mesh architecture moves away from centralized data lakes, S3 and AWS Lake Formation play a crucial role in storing, securing, and cataloging data that flows between various domains, ensuring long-term storage, governance, and compliance. Event-Driven Data Mesh in Action With AWS and Python Event Producer: AWS Kinesis + Python In this example, we use AWS Kinesis to stream events when a new customer is created:
Python
import boto3
import json

kinesis = boto3.client('kinesis')

def send_event(event):
    kinesis.put_record(
        StreamName="CustomerStream",
        Data=json.dumps(event),
        PartitionKey=event['customer_id']
    )

def create_customer_event(customer_id, name):
    event = {
        'event_type': 'CustomerCreated',
        'customer_id': customer_id,
        'name': name
    }
    send_event(event)

# Simulate a new customer creation
create_customer_event('123', 'ABC XYZ')

Event Processing: AWS Lambda + Python This Lambda function consumes Kinesis events and processes them in real time.
Python
import base64
import json
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('CustomerData')

def lambda_handler(event, context):
    for record in event['Records']:
        # Kinesis record payloads arrive base64-encoded in the Lambda event
        payload = json.loads(base64.b64decode(record['kinesis']['data']))
        if payload['event_type'] == 'CustomerCreated':
            process_customer_created(payload)

def process_customer_created(event):
    table.put_item(
        Item={
            'customer_id': event['customer_id'],
            'name': event['name']
        }
    )
    print(f"Stored customer data: {event['customer_id']} - {event['name']}")

Conclusion By leveraging AWS services such as Kinesis, Lambda, DynamoDB, and Glue, organizations can fully realize the potential of event-driven data mesh architecture.
This architecture provides agility, scalability, and real-time insights, ensuring that organizations remain competitive in today’s rapidly evolving data landscape. Adopting an event-driven data mesh architecture is not just a technical enhancement but a strategic imperative for businesses that want to thrive in the era of big data and distributed systems.
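As a final addendum to the producer and consumer examples above, here is a minimal sketch of the SNS/SQS event distribution described earlier. It is illustrative only: the topic ARN, queue URL, and event fields are hypothetical placeholders, and a real deployment would create these resources and the topic-to-queue subscription separately, for example with infrastructure as code.
Python
import boto3
import json

# Hypothetical placeholders; replace with resources from your own account.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:customer-events"
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/billing-customer-events"

sns = boto3.client('sns')
sqs = boto3.client('sqs')

def broadcast_event(event):
    # Fan one domain event out to every subscribed queue via SNS.
    sns.publish(TopicArn=TOPIC_ARN, Message=json.dumps(event))

def poll_events():
    # A consuming domain reads from its own SQS subscription at its own pace.
    response = sqs.receive_message(
        QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=10
    )
    for msg in response.get('Messages', []):
        envelope = json.loads(msg['Body'])       # SNS wraps the payload in an envelope
        event = json.loads(envelope['Message'])  # unless raw message delivery is enabled
        print("Received:", event['event_type'])
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg['ReceiptHandle'])

broadcast_event({'event_type': 'CustomerCreated', 'customer_id': '123'})
poll_events()

The design choice mirrors the decoupling argument made above: the producer only knows the topic, each consuming domain owns its queue, and either side can change or scale without coordinating with the other.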
As organizations put artificial intelligence and machine learning (AI/ML) workloads into continuous development and production deployment, they need the same levels of manageability, speed, and accountability as regular software code. The most popular way to deploy these workloads is Kubernetes, and the Kubeflow and KServe projects enable them there. Recent innovations in this ecosystem, such as the Model Registry, the Modelcars feature, and TrustyAI integrations, are delivering these capabilities to users who rely on AI/ML. These and other improvements have made open source AI/ML ready for use in production, with more on the way. Better Model Management AI/ML analyzes data and produces output using machine learning "models," which consist of code, data, and tuning information. In 2023, the Kubeflow community identified a key requirement: better ways of distributing tuned models across large Kubernetes clusters. Engineers working on Red Hat's OpenShift AI agreed and started work on a new Kubeflow component, Model Registry. "The Model Registry provides a central catalog for developers to index and manage models, their versions, and related artifacts metadata," explained Matteo Mortari, Principal Software Engineer at Red Hat and Kubeflow contributor. "It fills a gap between model experimentation and production activities, providing a central interface for all users to effectively collaborate on ML models." The AI/ML model development journey, from initial experimentation to deployment in production, requires coordination between data scientists, operations staff, and users. Before Model Registry, this involved coordinating information scattered across many places in the organization – even email! With Model Registry, system owners can implement efficient machine learning operations (MLOps), letting them deploy directly from a dedicated component. It's an essential tool for researchers looking to run many instances of a model across large Kubernetes clusters. The project is currently in Alpha and was included in the recent Kubeflow 1.9 release. Faster Model Serving Kubeflow makes use of the KServe project to "serve," or run, models on each server in the Kubernetes cluster. Users care a great deal about latency and overhead when serving models: they want answers as quickly as possible, and there's never enough GPU power. Many organizations have service level objectives (SLOs) for response times, particularly in regulated industries. "One of the challenges that we faced when we first tried out LLMs on Kubernetes was to avoid unnecessary data movements as much as possible," said Roland Huss, Senior Principal Software Engineer at Red Hat and KServe and Knative contributor. "Copying over a multi-gigabyte model from an external storage can take several minutes which adds to the already lengthy startup of an inference service. Kubernetes itself knows how to deal with large amounts of data when it comes to container images, so why not piggyback on those matured techniques?" This thinking led to the development of Modelcars, a passive "sidecar" container holding the model data for KServe. That way, a model needs to be present only once on a cluster node, regardless of how many replicas are accessing it. Container image handling is a well-explored area in Kubernetes, with sophisticated caching and performance optimizations already in place. The result has been faster startup times for serving models and greatly reduced disk space requirements for cluster nodes.
Huss also pointed out that Kubernetes 1.31 recently introduced an image volume type that allows the direct mount of OCI images. When that feature is generally available, which may take a year, it can replace Modelcars for even better performance. Right now, Modelcars is available in KServe v0.12 and above. Safer Model Usage AI/ML systems are complex, and it can be difficult to figure out how they arrive at their output. Yet it's important to ensure that unexpected bias or logic errors don't create misleading results. TrustyAI is a new open source project that aims to bring "responsible AI" to all stages of the AI/ML development lifecycle. "The TrustyAI community strongly believes that democratizing the design and research of responsible AI tooling via an open source model is incredibly important in ensuring that those affected by AI decisions – nowadays, basically everyone – have a say in what it means to be responsible with your AI," stated Rui Vieira, Senior Software Engineer at Red Hat and TrustyAI contributor. The project uses an approach where a core of techniques and algorithms, mostly focused on AI explainability, metrics, and guardrails, can be integrated at different stages of the lifecycle. For example, the Python TrustyAI library can be used through Jupyter notebooks during the model experimentation stage to identify biases. The same functionality can also be used for continuous bias detection of production models by incorporating the tool as a pipeline step before model building or deployment. TrustyAI is in its second year of development, and KServe already integrates with it. Future AI/ML Innovations With these features and tools, and others, development and deployment of AI/ML models is becoming more consistent, reliable, efficient, and verifiable. As with other generations of software, this allows organizations to adopt and customize their own open source AI/ML stacks, something that would have been too difficult or risky before. The Kubeflow and KServe community is working hard on the next generation of improvements, usually in the Kubernetes Serving Working Group (WG Serving). This includes the LLM Serving Catalog, intended to provide working examples for popular model servers and explore recommended configurations and patterns for inference workloads. WG Serving is also exploring the LLM Instance Gateway to more efficiently serve distinct LLM use cases on shared model servers running the same foundation model, allowing requests to be scheduled across pools of model servers. The KServe project is working on features to support very large models. One is multi-host/multi-node support for models that are too big to run on a single node/host. Support for "Speculative Decoding," another in-development feature, speeds up large model execution and improves inter-token latency in memory-bound LLM inference. The project is also developing "LoRA adapter" support, which permits serving already-trained models with in-flight modifications via adapters to support distinct use cases, instead of retraining each of them from scratch before serving. The KServe community is also working on an Open Inference Protocol extension for GenAI Task APIs, providing community-maintained protocols for various GenAI task-specific APIs. The community is also working closely with WG Serving to integrate with efforts like the LLM Instance Gateway and to provide KServe examples in the Serving Catalog. These and other features are in the KServe Roadmap.
The author will be delivering a keynote about some of these innovations at KubeCon's Cloud Native AI Day in Salt Lake City. Thanks to all of the ingenuity and effort being poured into open source AI/ML, users will find that building, running, and training models keeps getting more manageable and performant for many years to come. This article was shared as part of DZone's media partnership with KubeCon + CloudNativeCon.
When we talk about security in cloud-native applications, broken access control remains one of the most dangerous vulnerabilities. The OWASP Top 10 lists it as the most prevalent security risk today, and for good reason: the impact of mismanaged permissions can lead to catastrophic outcomes like data breaches or ransomware attacks. For CISOs, addressing broken access control isn't just a technical challenge—it's a strategic priority that touches nearly every aspect of an organization's security posture. As part of my job as VP of Developer Relations at Permit.io, I have consulted with dozens of CISOs and security engineering leaders, from founders of small garage startups to Fortune 100 enterprise security staff. This article distills the perspective I gathered from those conversations, guiding you through the broken access control challenges in cloud-native applications. Understanding the Threat At its core, broken access control occurs when unauthorized users gain access to parts of an application they shouldn't be able to see or modify. This vulnerability can manifest in several ways: from users gaining admin privileges they shouldn't have to attackers exploiting weak session management to move laterally within a system. What makes this threat particularly dangerous in cloud-native environments is the complexity of modern application architectures. Microservices, third-party APIs, and distributed resources create a multifaceted ecosystem where data flows across various services. Each connection is a potential point of failure. CISOs must ensure that access control mechanisms are ironclad—every request to access sensitive data or perform critical operations must be carefully evaluated and tightly controlled. The Three Pillars of Access Control Addressing broken access control requires a comprehensive strategy built on three key pillars: authentication, permissions, and session management. Each plays a critical role in securing cloud-native applications: Authentication: This is the first line of defense, ensuring that users are who they claim to be. Strong authentication methods like multi-factor authentication (MFA) can drastically reduce the risk of unauthorized access. Permissions: Even after authentication, not all users should have equal access. Permissions dictate what authenticated users can do. In cloud-native apps, fine-grained permissions are essential to prevent privilege escalation and data leakage. Session Management: Proper session management ensures that once a user is authenticated and authorized, their activities are monitored and their access remains limited to the session's scope. Poor session management can allow attackers to hijack sessions or escalate privileges. Why Permissions Matter More Than Ever While all three pillars are crucial, permissions are the backbone of modern access control. In a cloud-native environment, where services and resources are distributed across different infrastructures, managing permissions becomes exponentially more challenging. A one-size-fits-all approach, like assigning simple roles (e.g., Admin, User), isn't sufficient. Today's applications require a more nuanced approach to permissions management. Fine-Grained Authorization To prevent unauthorized access, organizations should implement fine-grained authorization models. These models allow for more precise control by evaluating multiple attributes—such as a user's role, location, or even payment method—before granting access.
This granular level of control is necessary to avoid both horizontal and vertical privilege escalation. For example, imagine a SaaS product with different pricing tiers. A user’s access to features shouldn’t just depend on their role (e.g., admin or regular user) but also on their subscription level, which should automatically update based on their payment status in an external payment application. Implementing fine-grained permissions ensures that only users who have paid for premium features can access them, even if they have elevated roles within the system. The Importance of Least Privilege A critical part of permissions management is enforcing the principle of least privilege. Simply put, users should have the minimal level of access required to perform their tasks. This principle is especially important in cloud-native applications, where microservices may expose sensitive data across various parts of the system. For example, a developer working on one service shouldn’t have full access to every service in the environment. Limiting access in this way reduces the risk of an attacker exploiting one weak point to gain broader access. It also prevents insider threats, where an internal user might misuse their privileges. Managing Sessions to Contain Threats While permissions control access to features and data, session management ensures that users’ activities are properly constrained during their session. Strong session management practices include limiting session duration, detecting unusual behavior, and ensuring that session tokens are tightly secured. Session hijacking, where attackers steal a user’s session token and take over their session, is a common attack vector in cloud-native environments. Implementing session timeouts, MFA for high-risk actions, and token revocation mechanisms can help mitigate these risks. Effective session management also includes ensuring that users cannot escalate their privileges within the session. For example, a user who starts a session with standard permissions shouldn’t be able to gain admin-level privileges without re-authenticating. The CISO’s Role in Securing Access Control For a CISO, the challenge of preventing broken access control goes beyond simply setting policies. It involves fostering collaboration between security teams, developers, and product managers. This ensures that access control isn’t just a checkbox in compliance reports but a living, adaptive process that scales with the organization’s needs. A Strategic Approach to Collaboration CISOs must ensure that developers have the resources and tools they need to build secure applications without becoming bottlenecks in the process. Traditional access control systems often put too much burden on developers, requiring them to manually write permission logic into the code. This not only slows down development, but also introduces the risk of human error. Instead, CISOs should promote a culture of collaboration where security, development, and product teams can work together on defining and managing access control policies. By implementing automated and scalable tools, CISOs can empower teams to enforce security policies effectively while maintaining agility in the development process. Authorization-as-a-Service One of the most effective ways to manage permissions in a scalable and secure manner is through authorization-as-a-service solutions. 
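To make this concrete, here is a minimal, illustrative sketch of the kind of fine-grained, attribute-based check such a service would evaluate; the actions, roles, and subscription tiers are invented for the example. Access depends on an attribute synced from the billing system, not just on the role, which is exactly the SaaS-tier scenario described above.
Python
from dataclasses import dataclass

@dataclass
class User:
    id: str
    role: str           # e.g., "admin" or "member"
    subscription: str   # e.g., "free" or "premium", synced from the billing system

# Illustrative policy: an action is allowed only if both the role and the tier permit it.
POLICY = {
    "export_reports": {"roles": {"admin", "member"}, "tiers": {"premium"}},
    "delete_project": {"roles": {"admin"}, "tiers": {"free", "premium"}},
}

def is_allowed(user: User, action: str) -> bool:
    rule = POLICY.get(action)
    if rule is None:
        return False  # default deny, in line with least privilege
    return user.role in rule["roles"] and user.subscription in rule["tiers"]

print(is_allowed(User("u1", "admin", "free"), "export_reports"))     # False: right role, wrong tier
print(is_allowed(User("u2", "member", "premium"), "export_reports")) # True

Externalizing this kind of policy is precisely what authorization-as-a-service platforms do, so it can be updated and audited without touching application code.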
These platforms can provide a centralized, no-code interface for defining and managing authorization policies, making it easier for non-technical stakeholders to be involved in the process. By leveraging these tools, organizations can reduce their reliance on developers to manually manage permissions. This not only speeds up the process but also ensures that permissions are consistently enforced across all services. With real-time policy updates, automated monitoring, and auditability features, authorization-as-a-service platforms allow organizations to stay agile while maintaining strong access control measures. The flexibility of these solutions also allows for easier scaling as the application and user base grow, ensuring that permission models can evolve without requiring significant re-engineering. Additionally, having a no-code UI allows for rapid adjustments to access policies in response to changing business needs or security requirements, without creating unnecessary dependencies on development teams. Conclusion Preventing broken access control vulnerabilities in cloud-native applications is a critical priority for CISOs. It requires a strategic focus on fine-grained permissions, the principle of least privilege, and robust session management. Collaboration across teams and the adoption of modern tools like authorization-as-a-service platforms can greatly simplify this complex challenge, enabling organizations to secure their environments without sacrificing speed or flexibility. By addressing these areas, CISOs can help ensure that their organizations remain resilient to access control vulnerabilities while empowering their teams to manage permissions effectively and securely. This article was shared as part of DZone's media partnership with KubeCon + CloudNativeCon.
Virtual Threads Java 21 introduced full support for virtual threads. Unlike regular Java threads (which usually correspond to OS threads), virtual threads are incredibly lightweight; indeed, an application can create and use 100,000 or more virtual threads simultaneously. This magic is achieved by two major changes to the JVM: A virtual thread is managed by the JVM, not the OS. If it is executing, it is bound to a platform thread (known as a carrier); if it is not executing (say it is blocked waiting for some form of notification), the JVM "parks" the virtual thread and frees the carrier thread so it can schedule a different virtual thread. A platform thread typically has about 1 megabyte of memory preassigned to it for its stack, etc. In contrast, a virtual thread's stack is managed in the heap and can be as little as a few hundred bytes — growing and shrinking as needed. The API for managing cooperation and communication between virtual threads is exactly the same as for legacy platform threads. This has good and bad points: The good: Implementers are familiar with the interface. The bad: You are still faced with all the usual "hard" parts of multi-threaded applications — synchronized blocks, race conditions, etc. — only now the problem is increased by orders of magnitude. Moreover, a virtual thread that blocks inside a synchronized block cannot be parked off its carrier thread – so the more synchronized blocks are used, the less efficient virtual threads become. What is needed is a new approach: one that can exploit the ability to run millions of virtual threads in a meaningful way, but do so while making multi-threaded programming easier. In fact, such a model exists, and it was first discussed 50 years ago: Actors. Actors and Dust The Actor concept arose during the 1970s at MIT with research by Carl Hewitt. It is at the core of languages like Erlang and Elixir and frameworks like Dust: an open-source (Apache 2 license) implementation of Actors for Java 21+. Different implementations of Actors vary in the details, so from now on we will describe the specific Dust Actor model: An Actor is a Java object associated with exactly one virtual thread. An Actor has a "mailbox" that receives and queues messages from other Actors. The thread wait()s on this mailbox, retrieves a message, processes it, and returns to waiting for its next message. How the Actor processes messages is called its Behavior. Note that if the Actor has no pending messages then, since the mailbox thread is virtual, the JVM will "park" the Actor and reuse its thread. When a message is received, the JVM will un-park the Actor and give it a thread to process the message. This is all transparent to the developer, whose only concerns are messages and behaviors. An Actor may have its own mutable state, which is inaccessible outside the Actor. In response to receipt of a message an Actor may: Mutate its state Send immutable messages to other Actors Create or destroy other Actors Change its Behavior That's it. Note that an Actor is single-threaded, so there are no locking/synchronization issues within an Actor. The only way an Actor can influence another Actor is by sending it an immutable message – so there are no synchronization issues between Actors. The order of messages sent by one Actor to another is preserved by the receiving Actor, but continuity is not guaranteed: if two Actors send messages to the same Actor at the same time, the messages may be interleaved, but the order of each stream is preserved. Actors are managed by an ActorSystem.
It has a name and, optionally, a port number. If the port is specified, then Actors in the ActorSystem can receive messages sent remotely — either from another port or another host entirely. The ActorSystem takes care of (de)serialization of messages in the remote case. An Actor has a unique address which resembles a URL: dust://host:port/actor-system-name/a1/a2/a3. If you are communicating with Actors in the same ActorSystem, the URL can be reduced to /a1/a2/a3. This is more than a pathname, though: it expresses a parent/child relationship between Actors. An Actor was created with the name a1; a1 then created an Actor called a2, so a1 is the "parent" and a2 the "child" of a1; Actor a2 then created a child of its own called a3. Actors can create many children. The only requirement is that their names be distinct from those of their "siblings." Actor Structure Actors extend the Actor class. It is important to note that Actors are not created directly with "new" but use a different mechanism. This is needed to set up correct parent-child relationships. We use the Props class for this, as in the following simple example:
/**
 * A very simple Actor
 */
public class PingPongActor extends Actor {
    private int max;

    /**
     * Used internally to call the appropriate constructor
     */
    public static Props props(int max) {
        return Props.create(PingPongActor.class, max);
    }

    public PingPongActor(int max) {
        this.max = max;
    }

    // Define the initial Behavior
    @Override
    protected ActorBehavior createBehavior() {
        return message -> {
            switch (message) {
                case PingPongMsg p -> {
                    sender.tell(message, self);
                    if (0 == --max) stopSelf();
                }
                default -> System.out.println("Strange message ...");
            }
        };
    }
}

Actors are created from their Props (see below), which can also include initialization parameters. So in the above, our PingPongActor initialization includes a max count, whose use we will show shortly. Actors are created by other Actors, but that chain has to begin somewhere. When an ActorSystem is created, it creates several default top-level Actors, including one called /user. An application can then create children of this Actor via the ActorSystem:
ActorSystem system = new ActorSystem("PingPong");
ActorRef ping = system.context.actorOf(PingPongActor.props(1000000), "ping");

The context of an ActorSystem provides the actorOf() method, which creates children of the /user Actor. Actors themselves have an identical actorOf() for creating their children. If we now looked into the ActorSystem, we would see a new PingPongActor whose name is ping and whose path is /user/ping. The value returned by this creation step is an ActorRef — a "handle" to that particular Actor. Let's build another:
ActorRef pong = system.context.actorOf(PingPongActor.props(1000000), "pong");

So now we have two instances of PingPongActor, with their "max" state set to 1000000, both waiting to receive messages in their mailboxes. When an Actor receives a message, it passes it to the lambda returned by createBehavior(), which implements our behavior. So what does this behavior do? First, we need a nice message class to get things fired up:
public class PingPongMsg implements Serializable {}

The only constraint on messages is that they must be serializable.
So now let's look at our setup:
ActorSystem system = new ActorSystem("PingPong");
ActorRef ping = system.context.actorOf(PingPongActor.props(1000000), "ping");
ActorRef pong = system.context.actorOf(PingPongActor.props(1000000), "pong");
pong.tell(new PingPongMsg(), ping);

ActorRefs have a tell() method which takes a Serializable message object and a (nullable) ActorRef. Thus, in the above, an instance of PingPongMsg is delivered to the Actor at pong. Since the second argument was not null, that ActorRef (ping) is available as the "sender" variable in the recipient's behavior. Recall that the part of the behavior that dealt with a PingPongMsg was:
case PingPongMsg p -> {
    sender.tell(message, self);
    if (0 == --max) stopSelf();
}

The sender of this message passed along its ActorRef (ping), so the recipient (pong) simply sends the message back, identifying itself as the sender via the self variable. Rinse, lather, and repeat one million times. So the same message will have been passed back and forth two million times in total between the two Actors, and once their counters hit 0, each Actor will destroy itself. Beyond PingPong PingPongActor is just about the simplest example capable of giving a feel for Actors and Dust, but it is clearly of limited value otherwise. GitHub contains several Dust repos which constitute a small library around the Dust framework: dust-core – The heart of Dust: Actors, persistent Actors, various structural Actors for building pipelines, scalable servers, etc., plus programmer documentation. dust-http – A small library to make it easy for Actors to access Internet endpoints. dust-html – A small library to make manipulating web page content easy in idiomatic Dust. dust-feeds – Actors to access RSS feeds, crawl websites, and use SearXNG for web searches. dust-nlp – Actors to access ChatGPT (and similar) endpoints and the Hugging Face embeddings API. The Actor paradigm is an ideal match for event-driven scenarios. Dust has been used to create systems such as: an intelligent news reader using LLMs to identify and follow trending topics; building occupancy management using WiFi signal strengths as proxies for people; a digital twin of a toy town – 8,000 Actors just to simulate flocking birds; and a system to find and analyze data for M&A activities.
This article explores two major approaches to artificial intelligence: symbolic AI, based on logical rules, and connectionist AI, inspired by neural networks. Beyond the technical aspects, the aim is to question concepts such as perception and cognition and to reflect on the challenges AI must take on to better manage contradictions and come closer to imitating human thought. Preamble French researcher Sébastien Konieczny was recently named EurAI Fellow 2024 for his groundbreaking work on belief fusion and inconsistency management in artificial intelligence. His research, focused on reasoning modeling and knowledge revision, opens up new perspectives for enabling AI systems to reason more reliably in the face of contradictory information and thus better manage the complexity of the real world. Konieczny's work is part of a wider context of reflection and fundamental questioning about the very nature of artificial intelligence. These questions are at the root of the long-standing debate between symbolic and connectionist approaches to AI, where technical advances and philosophical reflections are intertwined. Introduction In the field of artificial intelligence, we can observe two extreme perspectives: on the one hand, boundless enthusiasm for AI's supposedly unlimited capabilities, and on the other, deep concerns about its potential negative impact on society. For a clearer picture, it makes sense to go back to the basics of the debate between symbolic and connectionist approaches to AI. This debate, which goes back to the origins of AI, pits two fundamental visions against each other. On the one hand, the symbolic approach sees intelligence as the manipulation of symbols according to logical rules. On the other, the connectionist approach is inspired by the neuronal functioning of the human brain. By refocusing the discussion on the relationship between perception, cognition, learning, generalization, and common sense, we can elevate the debate beyond speculation about the alleged consciousness of today's AI systems. The Symbolic Approach The symbolic approach sees the manipulation of symbols as fundamental to the formation of ideas and the resolution of complex problems. According to this view, conscious thought relies on the use of symbols and logical rules to represent knowledge and, from there, to reason. "Although recently connectionist AI has started addressing problems beyond narrowly defined recognition and classification tasks, this mostly remains a promise: it remains to be seen if connectionist AI can accomplish complex tasks that require commonsense reasoning and causal reasoning, all without including knowledge and symbols." - Ashok K. Goel, Georgia Institute of Technology The Connectionist Approach This vision, projected by the symbolic approach, is contested by proponents of the connectionist approach, who maintain that intelligence emerges from complex interactions between numerous simple units, much like the neurons in the brain. They argue that current AI models, based on deep learning, demonstrate impressive capabilities without explicit symbol manipulation. Konieczny's work on reasoning modeling and knowledge revision provides food for thought in this debate. By focusing on the ability of AI systems to handle uncertain and contradictory information, this research highlights the complexity of autonomous reasoning.
It highlights what is perhaps the real challenge: how to enable an AI to revise its knowledge in the face of new information while maintaining internal consistency. Experiencing the World Now, we know very well that seeing, touching, and hearing the world (in other words, experiencing the world through the body) are essential for humans to build cognitive structures. When we consider the evolution of AI systems, particularly in the field of Generative AI (GenAI), democratized by the release of ChatGPT in 2022, some argue that these systems are approaching a form of "thinking" in their own way. This theory rests on the idea that, from massive datasets collected by real-world sensors, advanced systems, such as autonomous systems, can already build models that mimic forms of understanding. Mimicking Cognitive Abilities Although AI lacks consciousness, these systems process and react to their environment in a way that suggests a data-driven imitation of cognitive abilities. This imitation of cognition raises fascinating questions about the nature of intelligence and consciousness. Are we creating truly intelligent entities, or simply extremely sophisticated systems of imitation? "In many computer science fields, one needs to synthesize a coherent belief from several sources. The problem is that, in general, these sources contradict each other. So the merging of belief sources is a non-trivial issue" - Sébastien Konieczny, CNRS Konieczny's observation reinforces the idea that AI needs to reconcile contradictory information. This problem, far from being purely technical, opens the way to deeper reflections on the nature of reasoning and understanding. The theme of managing inconsistencies echoes philosophical debates on experience and common sense in AI. Indeed, the ability to detect and resolve contradictions is a fundamental quality of human reasoning and a key element in our understanding of the world. The Concept of Experience If we were to transpose Kant's concept of experience onto AI technologies, we might suggest that the risk for some — and the opportunity for others — lies in how our understanding of "experience" itself is evolving. If, for Kant, experience is the result of a synthesis between sensible data — i.e., the raw information perceived by our senses — and the concepts of understanding, on what criteria can we base our assertion that machines can acquire experience? This transposition prompts us to reflect on the very nature of experience, and on the possibility of machines gaining access to a form of understanding comparable to that of humans. It would, however, be quite a leap from this reflection to asserting that machines can truly acquire experience... The Concept of Common Sense If we now turn to the concept of "common sense," we can conceive of it as a form of practical wisdom derived from everyday experience. In the context of our thinking on AI, common sense could be seen as an intuitive ability to navigate the real world, to make rapid inferences without resorting to formal reasoning. We can attribute to common sense the ability to form a bridge between perception and cognition. This suggests that lived experience is crucial to understanding the world. So how can a machine, devoid of direct sensory experience, develop this kind of intuitive intelligence? This question raises another challenge for AI: to reproduce not only formal intelligence but also that form of wisdom that we humans often take for granted.
We need to understand that, when machines integrate data from our human experience, even though they haven't experienced it for themselves, they are making the closest thing we have to a "decision," not a choice. Decision vs. Choice It's necessary here to make a clear distinction between "decision" and "choice." A machine can make decisions by executing algorithms and analyzing data, but can it really make choices? Whereas a decision involves a logical process of selection among options based on predefined criteria, a choice involves an extra dimension of free will, common sense, self-awareness, and moral responsibility. When an AI "decides," it follows a logical path determined by its programming and data. But a real choice, like those made by humans, implies a deep understanding of the consequences, an ability to reason abstractly about values, and potentially even an intuition that goes beyond mere logic. This distinction highlights a fundamental limitation of today's AI: although it can make extremely complex and sophisticated decisions, it remains devoid of the ability to make choices in the fully human sense of the term. While this distinction is far more philosophical than technical, it comes up frequently in debates on artificial intelligence, consciousness, and the capacity to think. Konieczny's research into knowledge fusion and revision sheds interesting light on this distinction. By working on methods enabling AI to handle conflicting information and estimate the reliability of sources, this work could help develop systems capable of making more nuanced decisions, perhaps coming closer to the notion of "choice" as we conceive it for humans. See and Act in the World AI, in processing data, is not endowed with consciousness or experience. As Dr. Fei-Fei Li, Co-Director of Stanford's Human-Centered AI Institute, puts it: "To truly understand the world, we must not only see but also act in it." She thus highlights the fundamental limitation of machines, which, deprived of autonomous action, subjectivity, and the ability to "choose," cannot truly experience the world as humans do. In her lecture "What We See and What We Value: AI with a Human Perspective," Dr. Li addresses the issue of visual intelligence as an essential component of animal and human intelligence. She argues that it is necessary to enable machines to perceive the world in a similar way, while raising fundamental ethical questions about the implications of developing AI systems capable of seeing and interacting with the world around us. This reflection is fully in line with the wider debate on perception and cognition in AI, suggesting that while AI can indeed process visual data with remarkable efficiency, it still lacks the human values and subjective experience that characterize our understanding of the world. This perspective brings us back to the central question of the experience and "lived experience" of machines, highlighting once again the gap that exists between data processing, however sophisticated, and the true understanding of the world as we humans conceive it.
One More Thing It's important to stress that this article is not intended to suggest that machines might one day acquire true consciousness, comparable to that of humans. Rather, by exploring philosophical concepts such as experience and choice, the intention is to open up avenues of reflection on how to improve artificial intelligence. These theoretical reflections offer a framework for understanding how AI could, through advanced data processing methods, better mimic certain aspects of human cognition without claiming to achieve consciousness. It’s in this search for better techniques, and not in speculation about artificial consciousness, that the purpose of this exploration lies.