Over 100,000 organizations use Apache Kafka for data streaming. However, there is a problem: The broad ecosystem lacks a mature client framework and managed cloud service for Python data engineers. Quix Streams is a new technology on the market trying to close this gap. This blog post discusses this Python library, its place in the Kafka ecosystem, and when to use it instead of Apache Flink or other Python- or SQL-based substitutes.
Why Python and Apache Kafka Together?
Python is a high-level, general-purpose programming language. It has many use cases for scripting and development. But there is one fundamental reason for its success: Data engineers and data scientists use Python. Period. Yes, there is R as another excellent programming language for statistical computing, and there are many low-code/no-code visual coding platforms for machine learning (ML). SQL usage is ubiquitous amongst data engineers and data scientists, but it’s a declarative formalism that isn’t expressive enough to specify all necessary business logic. When data transformation or non-trivial processing is required, data engineers and data scientists use Python. If you don’t give them Python, you will find either shadow IT or Python scripts embedded into the coding box of a low-code tool.
Apache Kafka is the de facto standard for data streaming. It combines real-time messaging, storage for true decoupling and replayability of historical data, data integration with connectors, and stream processing for data correlation. All in a single platform. At scale for transactions and analytics.
Python and Apache Kafka for Data Engineering and Machine Learning
In 2017, I wrote a blog post about “How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka.” The article is still accurate and explores how data streaming and AI/ML are complementary: Machine learning requires a lot of infrastructure for data collection, data engineering, model training, model scoring, monitoring, and so on. Data streaming with the Kafka ecosystem provides these capabilities in a real-time, reliable, and scalable way. DevOps, microservices, and other modern deployment concepts merged the job roles of software developers and data engineers/data scientists. The focus is much more on data products solving a business problem, operated by the team that develops it. Therefore, the Python code needs to be production-ready and scalable. As mentioned above, the data engineering and ML tasks are usually realized with Python APIs and frameworks. Here is the problem: The Kafka ecosystem is built around Java and the JVM. Therefore, it lacks good Python support. Let’s explore the options and why Quix Streams is a brilliant opportunity for data engineering teams for machine learning and similar tasks.
What Options Exist for Python and Kafka?
Many alternatives exist for data engineers and data scientists to leverage Python with Kafka.
Python Integration for Kafka
Here are a few common alternatives for integrating Python with Kafka and their trade-offs:
Python Kafka client libraries: Produce and consume via Python. This is solid but insufficient for advanced data engineering as it lacks processing primitives, such as the filtering and joining operations found in Kafka Streams and other stream processing libraries.
Kafka REST APIs: Confluent REST Proxy and similar components enable producing and consuming to/from Kafka.
It works well for gluing interfaces together but is not ideal for ML workloads with low latency and critical SLAs.
SQL: Stream processing engines like ksqlDB or Flink SQL allow querying of data in SQL. ksqlDB and Flink are additional systems that need to be operated, and SQL isn’t expressive enough for all use cases.
Instead of just integrating Python and Kafka via APIs, native stream processing provides the best of both worlds: the simplicity and flexibility of dynamic Python code for rapid prototyping with Jupyter notebooks and serious data engineering, AND stream processing for stateful data correlation at scale, whether for data ingestion or model scoring.
Stream Processing With Python and Kafka
In the past, we had two suboptimal open-source options for stream processing with Kafka and Python:
Faust: A stream processing library, porting the ideas from Kafka Streams (a Java library and part of Apache Kafka) to Python. The feature set is much more limited compared to Kafka Streams. Robinhood open-sourced Faust, but it lacks maturity and community adoption. I saw several customers evaluating it but then moving to other options.
Apache Flink’s Python API: Flink’s adoption grows significantly yearly in the stream processing community. This API is a Python version of the DataStream API, which allows Python users to write Python DataStream API jobs. Developers can also use the Table API, including SQL, directly in there. It is an excellent option if you have a Flink cluster and some folks want to run Python instead of Java or SQL against it for data engineering. The Kafka-Flink integration is very mature and battle-tested.
As you can see, all the alternatives for combining Kafka and Python have trade-offs. They work for some use cases but are imperfect for most data engineering and data science projects. A new open-source framework to the rescue? Introducing a brand-new stream processing library for Python: Quix Streams…
What Is Quix Streams?
Quix Streams is a stream processing library focused on ease of use for Python data engineers. The library is open source under the Apache 2.0 license and available on GitHub. Instead of a database, Quix Streams uses a data streaming platform such as Apache Kafka. You can process data with high performance and save resources without introducing a delay. Some of the Quix Streams differentiators are defined as being lightweight and powerful, with no JVM and no need for separate clusters or orchestrators. It sounds like the pitch for why to use Kafka Streams in the Java ecosystem, minus the JVM; this is a positive comment! :-) Quix Streams does not use any domain-specific language or embedded framework. It’s a library that you can use in your code base. This means that with Quix Streams, you can use any external library for your chosen language. For example, data engineers can leverage Pandas, NumPy, PyTorch, TensorFlow, Transformers, and OpenCV in Python. So far, so good. This was more or less a copy and paste of Quix Streams’ marketing (and it makes sense to me)… Now, let’s dig deeper into the technology.
The Quix Streams API and Developer Experience
The following is my initial feedback after playing around, doing code analysis, and speaking with some Confluent colleagues and the Quix Streams team.
The Good
The Quix API and tooling persona is the data engineer (that’s at least my understanding). Hence, it does not directly compete with other offerings, say a Java developer using Kafka Streams.
Again, the beauty of microservices and data mesh is the focus on an application or data product per use case. Choose the right tool for the job!
The API is mostly sleek, with a few weird or unintuitive parts. But it is still in beta, so hopefully, it will get more refined in subsequent releases. No worries at this early stage of the project.
The integration with other data engineering and machine learning Python frameworks is excellent. Being able to combine stream processing with Pandas, NumPy, and similar libraries is a massive benefit for the developer experience.
The Quix library and SaaS platform are compatible with open-source Kafka and commercial offerings and cloud services like Cloudera, Confluent Cloud, or Amazon MSK. Quix’s commercial UI provides out-of-the-box integration with self-managed Kafka and Confluent Cloud. The cloud platform also provides a managed Kafka for testing purposes (for a few cents per Kafka topic and not meant for production).
The Improvable
The stream processing capabilities (like powerful sliding windows) are still pretty limited and not comparable to advanced engines like Kafka Streams or Apache Flink. The roadmap includes enhanced features.
The architecture is complex since executing the Python API jumps through three languages: Python -> C# -> C++. Does it matter to the end user? It depends on the use case, security requirements, and more. The reasoning for this architecture is Quix’s background: the team comes from the McLaren F1 team and ultra-low-latency use cases, and it is building a polyglot platform for different programming environments. It would be interesting to see a benchmark for throughput and latency versus Faust, which is Python top to bottom. There is a trade-off between inter-language marshalling/unmarshalling and the performance boost of lower-level compiled languages. This should be fine if we trust Quix’s marketing and business model. I expect they will provide some public content soon, as this question will arise regularly.
The Quix Streams Data Pipeline Low-Code GUI
The commercial product provides a user interface for building data pipelines and code, MLOps, and a production infrastructure for operating and monitoring the built applications. Here is an example:
Tiles are Kubernetes containers. Each purple (transformation) and orange (destination) node is backed by a Git project containing the application code.
The three blue (source) nodes on the left are replay services used to test the pipeline by replaying specific streams of data.
Arrows are individual Kafka topics in Confluent Cloud (green = live data).
The first visible pipeline node (bottom left) joins data from different physical sites (see the three input topics; one was receiving data when I took the image).
There are three modular transformations in the visible pipeline (two rolling windows and one interpolation).
There are two real-time apps (one is a real-time Streamlit dashboard, and the other is an integration with a Twilio SMS service).
Quix Streams vs. Apache Flink for Stream Processing With Python
The Quix team wrote a detailed comparison of Apache Flink and Quix Streams. I don’t think it’s an entirely fair comparison as it compares open-source Apache Flink to a Quix SaaS offering. Nevertheless, for the most part, it is a good comparison. Flink was always Java-first and added support for Python for its DataStream and Table APIs at a later stage. On the contrary, Quix Streams is brand new. Hence, it lacks maturity and customer case studies.
Having said all this, I think Quix Streams is a great choice for some stream processing projects in the Python ecosystem!
Should You Use Quix Streams or Apache Flink?
TL;DR: There is a place for both! Choose the right tool… Modern enterprise architectures built with concepts like data mesh, microservices, and domain-driven design allow this flexibility per use case and problem.
I recommend using Flink if the use case makes sense with SQL or Java, and if the team is willing to operate its own Flink cluster or has a platform team or a cloud service taking over the operational burden and complexity. Conversely, I would use Quix Streams for Python projects if I want to go to production with a more microservice-like architecture building Python applications. However, beware that Quix currently only has a few built-in stateful functions or JOINs. More advanced stream processing use cases cannot be done with Quix (yet). This is likely to change in the coming months as more capabilities are added. Hence, make sure to read Quix’s comparison with Flink, but keep in mind whether you want to evaluate the open-source Quix Streams library or the Quix SaaS platform. If you are in the public cloud, you might combine Quix Streams SaaS with other fully managed cloud services like Confluent Cloud for Kafka. On the other hand, in your own private VPC or on-premises, you need to build your own platform with technologies like the Quix Streams library, Kafka or Confluent Platform, and so on.
The Current State and Future of Quix Streams
If you build a new framework or product for data streaming, you need to make sure that it does not overlap with existing established offerings. You need differentiators and/or innovation in a new domain that does not exist today. Quix Streams accomplishes this essential requirement for success: The target audience is data engineers with Python backgrounds. No serious and mature tool or vendor exists in this space today. And the demand for Python will grow more and more with the focus on leveraging data for solving business problems in every company.
Maturity: Making the Right (Marketing) Choices in the Early Stage
Quix Streams is in an early maturity stage. Hence, a lot of decisions can still be strengthened or revamped. The following buzzwords come to mind when I think about Quix Streams: Python, data streaming, stream processing, Python, data engineering, machine learning, open source, cloud, Python, .NET, C#, Apache Kafka, Apache Flink, Confluent, MSK, DevOps, Python, governance, UI, time series, IoT, Python, and a few more.
TL;DR: I see a massive opportunity for Quix Streams to become a great data engineering framework (and SaaS offering) for Python users.
I am not a fan of polyglot platforms. They require finding the lowest common denominator. I was never a fan of Apache Beam for that reason; the Kafka Streams community chose not to implement the Beam API because of its many limitations. Similarly, most people do not care about the underlying technology. Yes, Quix Streams’ core is C++. But is the goal to roll out stream processing for various programming languages, only starting with Python, then going to .NET, and then to another one? I am skeptical. Hence, I would like to see a change in the marketing strategy already: Quix Streams started with the pitch of being designed for high-frequency telemetry services when you must process high volumes of time-series data with up to nanosecond precision.
It is now being revamped to focus mainly on Python and data engineering.
Competition: Friends or Enemies?
Getting market adoption is still hard. Intuitive use of the product, building a broad community, and the right integrations and partnerships (can) make a new product such as Quix Streams successful. Quix Streams is on the right track here. For instance, integrating serverless Confluent Cloud and other Kafka deployments works well: This is a native integration, not a connector. Everything in the pipeline image runs as a direct Kafka protocol connection using raw TCP/IP packets to produce and consume data to topics in Confluent Cloud. The Quix platform orchestrates the management of the Confluent Cloud Kafka cluster (create/delete topics, topic sizing, topic monitoring, etc.) using Confluent APIs.
However, one challenge for these kinds of startups is the decision to complement versus compete with existing solutions, cloud services, and vendors. For instance, how much time and money do you invest in data governance? Should you build this yourself, use the complementing streaming platform, or use a separate independent tool (like Collibra)? We will see where Quix Streams goes here: building its cloud platform for addressing Python engineers or overlapping with other streaming platforms. My advice is to integrate properly with partners that lead in their space. Having worked with Confluent for over six years, I know what I am talking about: We do one thing, data streaming, but we are the best at that one thing. We don’t even try to compete with other categories. Yes, a few overlaps always exist, but instead of competing, we strategically partner and integrate with other vendors like Snowflake (data warehouse), MongoDB (transactional database), HiveMQ (IoT with MQTT), Collibra (enterprise-wide data governance), and many more. Additionally, we extend our offering with more data streaming capabilities, i.e., improving our core functionality and business model. The latest example is our integration of Apache Flink into the fully managed cloud offering.
Kafka for Python? Look at Quix Streams!
In the end, a data engineer or developer has several options for stream processing deeply integrated into the Kafka ecosystem:
Kafka Streams: Java client library
ksqlDB: SQL service
Apache Flink: Java, SQL, Python service
Faust: Python client library
Quix Streams: Python client library
All have their pros and cons. The persona of the data engineer or developer is a crucial criterion. Quix Streams is a nice new open-source framework for the broader data streaming community. If you cannot or do not want to use just SQL but need native Python, then watch the project (and the company/cloud service behind it).
bytewax is another open-source stream processing library for Python integrating with Kafka. It is implemented in Rust under the hood. I have not seen it in the field yet, but a few comments mentioned it after I shared this blog post on social networks. I think it is worth a mention. Let’s see if it gets more traction in the coming months.
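To make the trade-offs above a bit more concrete, here is a minimal sketch of the plain "Python Kafka client library" option using the confluent-kafka package. The broker address, topic names, and filter rule are hypothetical; the point is what is missing, because windows, joins, and state management are exactly what libraries like Faust, Quix Streams, or Flink add on top of this baseline.

```python
# A minimal sketch of the "plain client library" option using confluent-kafka.
# Broker address, topic names, and the filter rule are hypothetical examples.
import json

from confluent_kafka import Consumer, Producer

BROKER = "localhost:9092"       # hypothetical broker address
IN_TOPIC = "sensor-readings"    # hypothetical input topic
OUT_TOPIC = "sensor-alerts"     # hypothetical output topic

producer = Producer({"bootstrap.servers": BROKER})
consumer = Consumer({
    "bootstrap.servers": BROKER,
    "group.id": "python-demo",
    "auto.offset.reset": "earliest",
})
consumer.subscribe([IN_TOPIC])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        reading = json.loads(msg.value())
        # The "processing" is hand-rolled: no windows, joins, or state,
        # which is exactly what a stream processing library adds on top.
        if reading.get("temperature", 0) > 80:
            producer.produce(OUT_TOPIC, key=msg.key(), value=msg.value())
            producer.flush()
finally:
    consumer.close()
```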
Developed by Microsoft, .NET MAUI (Multi-platform App UI) is an open-source framework for building native mobile and desktop applications for multiple platforms, including Android, iOS, macOS, Windows, and more, using a single codebase, unlike Xamarin.Forms, where developers have to maintain a separate codebase for each targeted platform.
An Overview of .NET Framework
If you are aware of what the .NET Framework is and how it works, then you can skip this section and jump to “How It Works.” The .NET Framework is a software development platform created by Microsoft. It provides a comprehensive and consistent programming model for building and running applications across various Microsoft Windows platforms, including desktop, web, and mobile. Here are some key features and components of the .NET Framework:
Common Language Runtime (CLR): The CLR is the foundation of the .NET Framework. It provides core services like memory management, garbage collection, and exception handling. It also enables language interoperability, allowing multiple programming languages to be used together within the same application.
Base Class Library (BCL): The BCL is a collection of pre-built classes, types, and APIs that provide a wide range of functionality to developers. It includes classes for file I/O, networking, data access, cryptography, threading, and much more. The BCL offers reusable components that simplify application development.
Language support: The .NET Framework supports multiple programming languages, including C#, Visual Basic .NET (VB.NET), and F#. Developers can choose their preferred language to write code and take advantage of the rich tooling and libraries available for each language.
Framework Class Library (FCL): The FCL is a collection of libraries and frameworks built on top of the BCL. It provides additional functionality for specific types of applications, such as Windows Presentation Foundation (WPF) for desktop applications, ASP.NET for web development, and Windows Communication Foundation (WCF) for building service-oriented applications.
Deployment and execution: .NET Framework applications are typically deployed as compiled assemblies (.exe or .dll files) along with any required dependencies. At runtime, the assemblies are executed by the CLR, which just-in-time (JIT) compiles the IL (Intermediate Language) code into native machine instructions.
How .NET MAUI Works
After you write .NET MAUI code (mostly in C#), the following steps occur:
Building the Application
The code, including the references to the BCL classes, is compiled using the .NET build tools or an integrated development environment (IDE) like Visual Studio. During the build process, the C# code is transformed into an Intermediate Language (IL), or Common Intermediate Language (CIL). This IL is platform-agnostic and can run on any platform with a compatible .NET runtime. We can also install and enable Ahead-of-Time compilation (AOT). If this is enabled, then steps 2 and 3 below will be skipped, and .NET MAUI will generate native code for the targeted environment from IL. When AOT is activated, the Mono runtime is utilized to transform .NET MAUI code into native code for the iOS and Android platforms, while the .NET CLR is employed for Windows to achieve the same conversion.
Platform-Specific Compilation
The IL code, along with the referenced BCL assemblies, is then compiled into platform-specific binaries for each target platform.
For Android, the .NET MAUI tooling packages up the IL code and XAML, along with any necessary .NET libraries, into an Android Package (APK). This is an archive file used to distribute Android apps.
For iOS, Apple doesn’t allow JIT compilation to run on iOS devices, so IL code cannot be compiled at runtime there; use Ahead-of-Time (AOT) compilation instead to produce native code for iOS.
For Windows, the IL code can be Just-In-Time (JIT) compiled at runtime or Ahead-of-Time (AOT) compiled into native instructions using the .NET Native compiler with the Windows UI 3 (WinUI 3) library.
Note: The Android toolchain (.NET for Android) and the Windows UI 3 (WinUI 3) library are part of the .NET framework (6 or greater).
Execution and Interaction With BCL
The compiled application, along with the required BCL assemblies, is deployed and executed on the respective target platforms. When the app is launched on an Android device, the .NET MAUI runtime handles just-in-time (JIT) compilation of the IL code into native machine code that can be executed. At runtime, the application can interact with the BCL classes and APIs, leveraging their functionality to perform various tasks. For example, if your application needs to read from a file, you can utilize BCL classes like `System.IO.File` to access file I/O operations. Similarly, if your application requires networking capabilities, you can use classes from the `System.Net` namespace provided by the BCL to handle networking tasks. The .NET MAUI rendering engine uses the XAML markup to construct a native user interface for the Android platform. During the platform-specific compilation process, any platform-specific resources and assets (such as images, layouts, and configuration files) specified in the code are also included in the output. Additionally, the necessary platform-specific runtime libraries and dependencies are bundled with the application to ensure it runs correctly on each target platform. Below is a diagram illustrating the steps mentioned previously (diagram prepared using Draw.io).
Installation and Setup
We can create a .NET MAUI project using the Visual Studio Community edition. Click here to download it. Please note that we must install Visual Studio 2022 to work with .NET MAUI. The link will download the Visual Studio Installer. Double-click the file to run the program and select “.NET Multi-platform UI App” under the “Desktop and Mobile” option (as shown in the image below in the red square). This will start downloading and installing all the required tools for building a mobile app. After the setup is finished, you are ready to develop a mobile app. Isn’t it easy? Open the Visual Studio Community edition and click on the “Create a new project” link as shown below. Give the project and solution the name “MauiApp1.”
Project Structure of a .NET MAUI Application
The typical structure of the application will look something like this. Let’s discuss each folder and its contents.
Dependencies: This folder contains all the necessary SDKs, NuGet packages, libraries, and components that are required to run and develop the application.
Properties: It contains only one file, “launchSettings.json,” which is used to launch and debug the application. Generally, it is used to add environment variables, arguments, etc. See an example below: The application has been configured to run in the browser on port 80 with the environment variable “ASPNETCORE_ENVIRONMENT” set to the value “Development.” The command “dotnet” with the argument “run” will be used when we run the application.
Please note that this file may be very simple initially. However, as we progress in development, we have to frequently update the file to change/add configuration values. Platforms This folder contains subfolders for each environment. Inside these subfolders, we have platform-specific code, resources, dependencies, and configurations. This separation allows developers to customize the behavior and appearance of the application on different platforms while maintaining the shared codebase for common functionality. The common usage can include conditional compilation to exclude or include specific sections of code, performance optimization configuration based on the platform, and platform-specific testing configuration. Resources This folder holds various non-code assets, such as images, icons, styles, and localization files. Note that assets stored in these folders are platform-independent. This means that the same resources can be used across different target platforms (iOS, Android, Windows) without duplicating files or writing platform-specific code. Now let’s see what those independent root-level files are for: App.xaml: This file serves as a central location for defining shared resources, configurations, and behaviors that apply across the entire application. It helps ensure consistency, simplifies initialization and cleanup, and allows you to handle important lifecycle events effectively. AppShell.xaml: This file is used to define the navigation for the app. The order of the child elements in the Shell element determines the order in which the pages are displayed. MainPage.xaml: This file acts as the canvas for your app's initial user interface and provides the starting point for navigation and user interaction. It's the place where you set the tone for your app's user experience and define the first impressions that users will have when they launch your application. For example, it can include containers like StackLayout, Grid, or RelativeLayout to arrange various UI elements, such as buttons, labels, images, and other controls. MauiProgram.cs: This file contains the main class “MauiProgram,” which serves as the start of the application and is used to configure fonts, services, and other third-party libraries. Conclusion In essence, .NET MAUI empowers developers to create versatile, efficient, and stunning applications that target multiple platforms without compromising on native performance. Its project structure, rendering engine, and seamless integration into the .NET ecosystem set the stage for a new era of cross-platform development, inviting developers to craft applications that delight users across diverse devices and operating systems. As the technology landscape continues to evolve, .NET MAUI emerges as an indispensable tool in the modern developer's toolkit, bridging the gap between platform diversity and code simplicity.
Normally, when we use debuggers, we set a breakpoint on a line of code, we run our code, execution pauses on our breakpoint, we look at the values of variables and maybe the call stack, and then we manually step forward through our code's execution. In time-travel debugging, also known as reverse debugging, we can step backward as well as forward. This is powerful because debugging is an exercise in figuring out what happened: traditional debuggers are good at telling you what your program is doing right now, whereas time-travel debuggers let you see what happened. You can wind back to any line of code that executed and see the full program state at any point in your program’s history.
History and Current State
It all started with Smalltalk-76, developed in 1976 at Xerox PARC. It had the ability to retrospectively inspect checkpointed places in execution. Around 1980, MIT added a "retrograde motion" command to its DDT debugger, which gave a limited ability to move backward through execution. In a 1995 paper, MIT researchers released ZStep 95, the first true reverse debugger, which recorded all operations as they were performed and supported stepping backward, reverting the system to the previous state. However, it was a research tool and not widely adopted outside academia. ODB, the Omniscient Debugger, was a Java reverse debugger introduced in 2003, marking the first instance of time-travel debugging in a widely used programming language. GDB (perhaps the most well-known command-line debugger, used mostly with C/C++) added reverse debugging support in 2009. Now, time-travel debugging is available for many languages, platforms, and IDEs, including:
Replay for JavaScript in Chrome, Firefox, and Node, and Wallaby for tests in Node
WinDbg for Windows applications
rr for C, C++, Rust, Go, and others on Linux
Undo for C, C++, Java, Kotlin, Rust, and Go on Linux
Various extensions (often rr- or Undo-based) for Visual Studio, VS Code, JetBrains IDEs, Emacs, etc.
Implementation Techniques
There are three main approaches to implementing time-travel debugging:
Record and replay: Record all non-deterministic inputs to a program during its execution. Then, during the debug phase, the program can be deterministically replayed using the recorded inputs in order to reconstruct any prior state.
Snapshotting: Periodically take snapshots of a program's entire state. During debugging, the program can be rolled back to these saved states. This method can be memory-intensive because it involves storing the entire state of the program at multiple points in time.
Instrumentation: Add extra code to the program that logs changes in its state. This extra code allows the debugger to step the program backward by reverting changes. However, this approach can significantly slow down the program's execution.
rr uses the first (the rr name stands for Record and Replay), as does Replay. WinDbg uses the first two, and Undo uses all three (see how it differs from rr).
Time-Traveling in Production
Traditionally, running a debugger in prod doesn't make much sense. Sure, we could SSH into a prod machine and start the request-handling process under a debugger with a breakpoint, but once we hit the breakpoint, we're delaying responses to all current requests and unable to respond to new requests. Also, debugging non-trivial issues is an iterative process: we get a clue, we keep looking and find more clues; discovering each clue typically means rerunning the program and reproducing the failure.
So, instead of debugging in production, what we do is replicate on our dev machine whatever issue we're investigating, use a debugger locally (or, more often, add log statements), and re-run as many times as required to figure it out. Replicating takes time (and in some cases a lot of time, and in some cases infinite time), so it would be really useful if we didn't have to. While running traditional debuggers doesn't make sense, time-travel debuggers can record a process execution on one machine and replay it on another machine. So we can record (or snapshot or instrument) production and replay it on our dev machine for debugging (depending on the tool, our machine may need to have the same CPU instruction set as prod). However, the recording step generally doesn't make sense to use in prod given the high amount of overhead — if we set up recording and then have to use ten times as many servers to handle the same load, whoever pays our AWS bill will not be happy. But there are a couple of scenarios in which it does make sense: Undo only slows down execution 2–5x, so while we don't want to leave it on just in case, we can turn it on temporarily on a subset of prod processes for hard-to-repro bugs until we have captured the bug happening, and then we turn it off. When we're already recording the execution of a program in the normal course of operation. The rest of this post is about #2, which is a way of running programs called durable execution. Durable Execution What's That? First, a brief backstory. After Amazon (one of the first large adopters of microservices) decided that using message queues to communicate between services was not the way to go (hear the story first-hand here), they started using orchestration. Once they realized defining orchestration logic in YAML/JSON wasn't a good developer experience, they created AWS Simple Workflow Service to define logic in code. This technique of backing code by an orchestration engine is called durable execution, and it spread to Azure Durable Functions, Cadence (used at Uber for > 1,000 services), and Temporal (used by Stripe, Netflix, Datadog, Snap, Coinbase, and many more). Durable execution runs code durably — recording each step in a database so that when anything fails, it can be retried from the same step. The machine running the function can even lose power before it gets to line 10, and another process is guaranteed to pick up executing at line 10, with all variables and threads intact. It does this with a form of record and replay: all input from the outside is recorded, so when the second process picks up the partially executed function, it can replay the code (in a side-effect–free manner) with the recorded input in order to get the code into the right state by line 10. Durable execution's flavor of record and replay doesn't use high-overhead methods like software JIT binary translation, snapshotting, or instrumentation. It also doesn't require special hardware. It does require one constraint: durable code must be deterministic (i.e., given the same input, it must take the same code path). So, it can't do things that might have different results at different times, like use the network or disk. However, it can call other functions that are run normally ("volatile functions," as we like to call them), and while each step of those functions isn't persisted, the functions are automatically retried on transient failures (like a service being down). 
Only the steps that require interacting with the outside world (like calling a volatile function or calling sleep(30 days), which stores a timer in the database) are persisted. Their results are also persisted, so when you replay the durable function that died on line ten, and it previously called the volatile function on line five that returned "foo," then during replay "foo" is immediately returned (instead of the volatile function getting called again). While it adds latency to save things to the database, Temporal supports extremely high throughput (tested up to a million recorded steps per second). In addition to function recoverability and automatic retries, it comes with many more benefits, including extraordinary visibility into, and debuggability of, production.
Debugging Prod
With durable execution, we can read through the steps that every single durable function took in production. We can also download the execution’s history, check out the version of the code that's running in prod, and pass the file to a replayer (Temporal has runtimes for Go, Java, JavaScript, Python, .NET, and PHP) so we can see in a debugger exactly what the code did during that production function execution. Being able to debug any past production code is a huge step up from the other option (finding a bug, trying to repro locally, failing, turning on Undo recording in prod until it happens again, turning it off, then debugging locally). It's also a (sometimes necessary) step up from distributed tracing.
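To illustrate the record/replay idea described above, here is a toy sketch. This is deliberately not Temporal's (or any other vendor's) API; all names and the persistence format are made up. Non-deterministic calls go through a context that records their results on first execution and returns the recorded results during replay, which is how a recovered process can reach "line ten" with the same state without re-running side effects.

```python
# Toy illustration of the record/replay idea behind durable execution.
# Not the API of Temporal or any other product; all names are made up.
import json


class ReplayContext:
    """Records results of side-effecting calls; on replay, returns them instead."""

    def __init__(self, history=None):
        self.history = list(history) if history else []
        self.position = 0
        self.replaying = history is not None

    def side_effect(self, fn, *args):
        if self.replaying and self.position < len(self.history):
            result = self.history[self.position]   # replay the recorded result
        else:
            result = fn(*args)                     # first execution: run and record
            self.history.append(result)
            self.replaying = False
        self.position += 1
        return result


def workflow(ctx):
    # Deterministic "durable" code: every external interaction goes through ctx.
    order = ctx.side_effect(lambda: {"id": 42, "amount": 99.0})        # e.g., call a service
    receipt = ctx.side_effect(lambda o: f"charged {o['amount']}", order)
    return receipt


# First run: side effects execute, and their results are recorded.
first = ReplayContext()
print(workflow(first))

# Simulated crash and recovery: a new process replays from the recorded history
# without re-executing the side effects, reaching the same state deterministically.
recovered = ReplayContext(history=json.loads(json.dumps(first.history)))
print(workflow(recovered))
```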
As the global community grapples with the urgent challenges of climate change, the role of technology and software becomes increasingly pivotal in the quest for sustainability. There exist optimization approaches at multiple levels that can help: Algorithmic efficiency: Algorithms that require fewer computations and resources can reduce energy consumption. A classic example here is optimized sorting algorithms in data processing. Cloud efficiency: Cloud services are energy-efficient alternatives to on-premises data centers. Migrating to cloud platforms that utilize renewable energy sources can significantly reduce the carbon footprint. Code optimization: Well-optimized code requires less processing power, reducing energy demand. Code reviews focusing on efficient logic, unit testing, and integration testing can lead to cleaner, greener software. Energy-aware architectural design: Energy-efficient design principles can be incorporated into software architecture. Ensuring, for example, that software hibernates when inactive or scales resources dynamically can save energy. Distributed, decentralized, and centralized options like choreography and orchestration can be evaluated. Renewable energy: Data centers and computing facilities can be powered with renewable energy sources to minimize reliance on fossil fuels and mitigate emissions. Green Software Standards: Industry standards and certifications for green software design can drive developers to create energy-efficient solutions. In this article, we will focus on code optimization via software testing. Software testing, a fundamental component of software development, can play a significant role in mitigating the environmental impact of technology. We explore the intersection of software testing and climate change, highlighting how testing can contribute to a more sustainable technological landscape. We begin by summarizing the role of software in the energy footprint of a number of industries. We then explore basic types of software testing that can be applied, giving specific examples. These types are by no means exhaustive. Other types of testing may well be used according to the energy optimization scenario. From Bytes to Carbon Telecommunications Telecommunication networks, including cellular and fixed-line networks, require software to manage signal routing, call routing, data transmission, and network optimization. The software that governs communication protocols, such as 4G, 5G, Wi-Fi, and other wireless technologies, also plays a crucial role in determining network efficiency and energy consumption. Traffic management, billing and customer management, service provisioning, remote monitoring, and management also involve software. As the telecommunications industry continues to evolve, a focus on sustainable software development and energy-efficient practices will be crucial to minimizing its environmental impact. E-Commerce Online shopping platforms require data centers to process transactions and manage inventories. Every click, search, and transaction contributes to the digital carbon footprint. Streamlining software operations and optimizing server usage can reduce energy consumption. Finance Financial institutions rely on software for trading, risk management, and customer service. High-frequency trading algorithms, for instance, demand significant computational power. By optimizing these algorithms and reducing unnecessary computations, the energy footprint can be curbed. 
Healthcare
Electronic health records and medical imaging software are vital in healthcare. Reducing the processing power needed for rendering medical images or utilizing cloud services for storage can mitigate carbon emissions.
Transportation
Ride-sharing and navigation apps require real-time data processing, contributing to energy-intensive operations. Implementing efficient route optimization algorithms, for example, can reduce the carbon footprint of such apps.
Entertainment
Streaming platforms, gaming, and content delivery networks rely on data centers to provide seamless experiences. Employing content delivery strategies that minimize data transfer and optimize streaming quality can alleviate energy consumption.
Manufacturing
Industrial automation systems use software to control production processes. By optimizing these systems for energy efficiency, manufacturers can decrease the carbon footprint associated with production.
Agriculture
Precision farming relies on software for data analysis and decision-making. Ensuring that sensors and software are finely tuned reduces energy waste in the field.
Education
Online education platforms, virtual classrooms, and digital learning materials consume energy. Optimizing code, minimizing background processing, and encouraging offline access can lower energy consumption.
Aerospace and Defense
Aerospace and defense industries rely on sophisticated software for designing aircraft, simulations, and defense systems. Reducing resource-intensive calculations and optimizing software design can lower energy consumption and carbon emissions.
Testing Types That Can Be Used
As software is omnipresent, powering devices, applications, and infrastructure across industries, the development and deployment of software are not devoid of environmental implications. Unoptimized software that requires excessive computational resources can exacerbate this problem. Ensuring software is efficient and optimized through rigorous testing can significantly reduce its carbon footprint.
Performance, Load, and Stress Testing for Efficiency and Scalability
Performance testing evaluates how a system responds under varying workloads. By assessing software performance, developers can identify resource-intensive processes that may lead to energy waste. Optimizing resource utilization and minimizing processing time may lead to reduced energy consumption and, consequently, a smaller carbon footprint. Bear in mind, however, that the relationship between resource optimization and energy consumption is not always linear or straightforward. Optimizing certain processes might lead to more efficient energy use but can also involve complex trade-offs. For instance, reducing processing time might lead to more rapid completion of tasks, but it could also lead to higher peak resource usage, potentially negating the energy savings from the shorter task duration. Such cases may be addressed by performance testing focused on peak resource usage and its optimization. Under the performance testing umbrella, load and stress testing can be useful, examining software's ability to handle increasing loads of traffic. When software is capable of efficiently accommodating user demands, it reduces the need for over-provisioning resources, which can lead to energy inefficiencies. A well-tested application that scales seamlessly promotes resource efficiency and sustainability.
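As a small, hypothetical illustration of load testing in this spirit, here is a sketch using the locust package; the host and endpoints are made up. The sustainability angle is simply that tests like this reveal resource-hungry endpoints before teams compensate for them by over-provisioning servers.

```python
# A minimal load-test sketch using locust; endpoints and host are hypothetical.
# Run with:  locust -f loadtest.py --host https://shop.example.com
from locust import HttpUser, between, task


class ShopperUser(HttpUser):
    # Simulated users pause 1-3 seconds between actions.
    wait_time = between(1, 3)

    @task(3)
    def search(self):
        # Frequent, cheap operation: search queries should stay fast under load.
        self.client.get("/search", params={"q": "laptop"})

    @task(1)
    def checkout(self):
        # Less frequent, heavier operation: watch latency and error rates here.
        self.client.post("/cart/checkout", json={"items": [{"sku": "A-1", "qty": 1}]})
```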
When the load increases beyond certain limits, stress testing can identify breaking points of a system or its capacity to handle excessive load, which could lead to performance degradation or failure. For example, in an e-commerce platform, a sudden surge in user traffic during a sale event overwhelms the website. By subjecting the platform to simulated high loads, developers can pinpoint areas that require optimization. Ensuring swift load times, efficient search queries, and seamless transaction processing not only enhances user experience but also reduces the energy required for server processing. In the financial sector, high-frequency trading platforms are reliant on efficient software to process complex calculations in microseconds. Performance testing identifies latency issues and helps developers optimize trading algorithms. By ensuring faster execution times and minimizing unnecessary computations, energy consumption is reduced, contributing to a more sustainable financial ecosystem. Streaming platforms witness varying levels of usage throughout the day. Load testing ensures that the platform can handle numerous concurrent viewers without buffering or quality degradation. Scalability ensures that the platform can allocate resources efficiently, reducing energy consumption during high-demand periods. In the transportation sector, load testing is essential for navigation apps that provide real-time route information to users. As users request navigation guidance during peak traffic hours, scalability testing ensures that the app can handle simultaneous queries without lag. A well-scaling app minimizes the need for over-provisioned servers during peak hours, promoting energy-efficient operations. Continuous Integration and Deployment (CI/CD) Implementing CI/CD pipelines with automated testing ensures that code changes are tested rigorously before deployment. By catching bugs early and preventing faulty code from entering production, CI/CD practices contribute to efficient software development, reducing the carbon footprint associated with bug fixes and maintenance. Industrial automation systems demand reliability to prevent production line interruptions. CI/CD guarantees that changes to these systems undergo comprehensive testing to avoid disruptions. Reduced downtimes translate to energy-efficient manufacturing processes. In healthcare software, where data accuracy and patient safety are paramount, CI/CD plays a vital role. Updates to electronic health records systems or medical imaging software undergo rigorous automated testing to prevent data corruption or processing errors. By avoiding situations that necessitate prolonged maintenance, CI/CD practices reduce energy consumption associated with emergency patches. Security Testing for Data Efficiency Security testing verifies the resilience of software against cyber threats. A secure system prevents data breaches and unauthorized access, reducing the risk of compromised data that could lead to unnecessary energy expenditure in data restoration and breach resolution. Security testing ensures that simulation software used in aerospace and defense remains impervious to hacking attempts. By protecting sensitive data, this practice prevents the energy-intensive task of identifying and repairing compromised simulations. In healthcare, finance, and e-commerce, for example, sensitive patient data, financial data, and e-commerce data can be protected. Restoring trust and credibility can be challenging and energy costly. 
Regression Testing for Code Stability Regression testing confirms that new code does not break existing functionality. Preventing regressions reduces the need for repeated testing and bug fixes, optimizing the software development lifecycle and minimizing unnecessary computational resources. Precision farming software relies on consistent data processing for optimal decision-making. Regression testing verifies that new code changes do not introduce inaccuracies in sensor data analysis. By preventing regressions, energy is conserved by avoiding the need to address erroneous data. Online education platforms introduce new features to enhance user experiences. Regression testing ensures these changes do not disrupt existing lessons or content delivery. By maintaining stability, energy is saved by minimizing the need for post-deployment fixes. Suppose a telecommunications company is rolling out a software update for its network infrastructure to improve data transmission efficiency and reduce latency. The update includes changes to the routing algorithms used to direct data traffic across the network. While the primary goal is to enhance network performance, there is a potential risk of introducing regressions that could disrupt existing services. Before deploying the software update to the entire network, the telecommunications company conducts thorough regression testing. Test cases cover various functionalities and scenarios, including those related to call routing, data transmission, and network optimization. The tests ensure that the new code does not break existing functionalities. If existing functionalities are compromised by the new code, the company may prevent the deployment of faulty updates that could lead to network disruptions. Avoiding such disruptions reduces the need for emergency fixes, saving computational resources that would otherwise be expended in resolving network outages. By ensuring that software updates are stable and do not introduce regressions, the telecommunications company maintains optimal network performance without frequent energy-intensive rollbacks or fixes. Emerging Trends Emerging trends that could shape the future of software development's impact on energy consumption include the rise of edge computing, where processing happens closer to the data source, reducing data transmission and energy consumption. Additionally, advancements in machine learning and artificial intelligence could lead to more sophisticated energy optimization algorithms, improving the efficiency of software operations. Quantum computing might also play a role in addressing complex optimization challenges. Wrapping Up There exist estimation scenarios that by 2030, the information and communications technology (ICT) sector could account for up to 23% of global energy consumption. This surge is fueled by various devices and software applications integral to modern life. The urgency of addressing climate change demands multifaceted approaches across various sectors, including technology. Software, being a core component of our modern lives, has a critical role to play in this endeavor. By integrating sustainable practices into software testing, developers can contribute to a more environmentally conscious technological landscape. As we continue to innovate and develop software solutions, it is important that we remain mindful of the environmental impact of our creations. 
Embracing a testing paradigm that focuses on performance optimization and resource efficiency can help reduce the carbon footprint of software. Ultimately, software development and testing could help towards harnessing the potential of technology to address climate change while delivering efficient, effective, and sustainable solutions to a rapidly changing world.
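To close with one concrete illustration of the "algorithmic efficiency" and "code optimization" levers mentioned at the start of this article, here is a small, hypothetical micro-benchmark using Python's timeit module. The data and numbers are made up and the absolute timings depend on hardware, but the gap between the two approaches is the point: fewer computations generally mean less CPU time and, by extension, less energy.

```python
# Hypothetical micro-benchmark: linear list lookups vs. set-based lookups.
import random
import timeit

orders = [random.randint(0, 1_000_000) for _ in range(10_000)]
lookups = [random.randint(0, 1_000_000) for _ in range(1_000)]

def linear_membership():
    # O(n) scan per lookup: simple, but wasteful at scale.
    return sum(1 for x in lookups if x in orders)

order_set = set(orders)

def set_membership():
    # O(1) average lookup after a one-time set build.
    return sum(1 for x in lookups if x in order_set)

print("list lookups:", timeit.timeit(linear_membership, number=10))
print("set lookups: ", timeit.timeit(set_membership, number=10))
```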
Over the past several years, Zero Trust Architecture (ZTA) has gained increased interest from the global information security community, and several organizations have adopted it and experienced considerable security improvements. One such example is Google, which implemented its BeyondCorp initiative embodying ZTA principles. The tech giant removed trust assumptions from its internal network, focusing instead on verifying users and devices for every access request, regardless of their location. This transformation has allowed Google to offer its workforce more flexibility while maintaining robust security. We also see relevant guidelines emerging from commercial entities and government bodies. Specifically, a memorandum was released detailing recommendations for US agencies and departments on how to transition to a "Zero Trust" architecture. Let's delve into a brief overview of ZTA.
Key Considerations in Adopting a Zero Trust Architecture
The core idea of this architecture is not to mindlessly trust any entity, system, network, or service, whether they are within or outside the security perimeter. Instead of granting access freely, every interaction should be rigorously checked. This marks a significant shift in the way we approach the protection of our infrastructure, networks, and data: from a single perimeter check to a continuous, detailed inspection of every device, user, application, and transaction. This ensures that the targeted information system always possesses comprehensive information about the party involved during the authentication/authorization phase. Furthermore, applications should not depend on network perimeter security to prevent unauthorized access. Users should log directly into applications and not into entire networks/systems. In the immediate future, we should consider every application as potentially accessible over the Internet from a security standpoint. As organizations adopt this mindset, it is anticipated that the requirement to access applications through specific networks will no longer be necessary. Numerous tools can assist with ZTA implementation, such as network security solutions like Next-Generation Firewalls (NGFWs), Secure Access Service Edge (SASE), and Identity and Access Management (IAM) software. Additionally, resources like NIST's SP 800-207 Zero Trust Architecture document can provide further in-depth understanding and guidelines for ZTA adoption.
Several approaches to building a ZTA exist: advanced identity management, logical micro-segmentation, and network-based segmentation. All approaches aim to isolate systems as much as possible so that an attacker who compromises one app cannot travel within the organization and compromise other sectors. The transition of an organization to Zero Trust Architecture looks like this:
The process of managing employee accounts ensures they have all the necessary resources to perform their duties while following the principle of least privilege.
The devices that employees utilize for their job tasks are under constant supervision and control. The security status of these devices (configuration, patch level, integrity, etc.) plays a significant role when it comes to granting access to internal resources.
The organization's systems are kept isolated from one another, and any network traffic circulating between or within these systems is both encrypted and authenticated.
Applications used within the enterprise undergo both internal and external testing.
Platforms such as GitLab are essential for upholding the top standards of DevSecOps principles.
The organization's security teams are responsible for establishing data categories and setting security rules in order to automatically identify and prevent any unauthorized access to sensitive information.
The transition to ZTA should be considered through the prism of the following key areas: identities, devices, networks, applications, and data. Let's briefly review each of them.
Identities
A centralized identity management system needs to be implemented across the organization. It is crucial to apply robust multi-factor authentication (MFA) throughout the enterprise. When granting users access to resources, at least one device-level signal should be taken into account, along with the authenticated user's identity information. The level of risk associated with accessing an application from a specific corporate network should be seen as no less than accessing it from the Internet.
Devices
The organization must keep a comprehensive inventory of all authorized devices currently in use. Moreover, it is vital that the organization can effectively prevent, detect, and respond to any incidents involving these devices.
Network
Organizations should aim to encrypt all traffic whenever possible, even when data travels within internal networks and client portals. It is important to actively use strong encryption protocols like TLS 1.3. The underlying principles of these protocols should be taken into account, especially for minimizing the number of long-term keys. A leak of any of these keys could pose a significant risk to the entire system's operation.
Applications
Organizations need to operate dedicated programs for testing application security. In case of a shortage of expertise, it is always a good idea to seek high-quality, specialized software testing services for independent third-party evaluations of application security. It is crucial for organizations to manage a responsive and open public vulnerability disclosure program. While deploying services and products, organizations should strive to use immutable workloads, especially when dealing with cloud-based infrastructure.
Data
It is vital to set up defenses that utilize comprehensive data categorization. Leverage cloud security services and tools to identify, classify, and safeguard your sensitive data while also implementing logging and information sharing across the entire enterprise. Companies should try to automate their data categorization and security responses, particularly when regulating access to sensitive information. Regularly audit access to any data that is at rest or while it is being transmitted on commercial cloud infrastructure.
Common Challenges and Solutions
The transition to ZTA is not without its hurdles. One significant challenge is the potential for increased complexity and operational overhead. Managing numerous security configurations, encryption protocols, and access control lists can be daunting. However, automated security solutions and centralized management systems can help streamline the process and reduce human error. Another common issue is resistance to change within the organization. The shift to ZTA can be disruptive, requiring changes in company culture and workflows. This challenge can be mitigated through comprehensive training programs, clear communication about the benefits of ZTA, and gradual implementation strategies.
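As one small, concrete illustration of the encryption guidance above, here is a minimal sketch that enforces TLS 1.3 as the minimum protocol version for an outbound connection using Python's standard ssl module; the URL is a hypothetical internal service. In a full ZTA rollout, the same principle is typically applied to service-to-service traffic as well, for example via mutual TLS enforced at the platform layer.

```python
# Minimal sketch of the "encrypt all traffic with strong protocols" guidance:
# enforce TLS 1.3 as the minimum version for an outbound client connection.
# The URL below is a hypothetical internal service.
import ssl
import urllib.request

context = ssl.create_default_context()             # verifies certificates by default
context.minimum_version = ssl.TLSVersion.TLSv1_3   # refuse anything older than TLS 1.3

with urllib.request.urlopen("https://internal.example.com/health", context=context) as resp:
    print(resp.status, resp.read(64))
```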
Conclusion Traditional security architectures operate on the assumption that all data and transactions are secure by default. Yet, incidents such as data breaches and other compromises can shatter this trust. Zero Trust Architecture revolutionizes this trust model, starting with the presumption that all data and transactions are potentially untrustworthy right from the outset. Adopting ZTA provides numerous benefits, such as an improved security posture, a reduced risk of data breaches, and flexibility in accommodating remote work or BYOD policies. However, it does come with potential drawbacks. The cost and complexity associated with the initial implementation can be high, and there is a risk of service disruption during the transition. To mitigate these drawbacks, companies considering ZTA should begin by assessing their current security posture, then identify areas where ZTA principles could be initially applied while also building a roadmap for a full transition.
Three primary methods for measuring team productivity are the SPACE framework, DORA metrics, and Goals/Signals/Metrics (GSM). The SPACE Framework for Team Productivity In their research paper "The SPACE of Developer Productivity" (2021), Nicole Forsgren and her colleagues define a framework as a systematic approach to measuring, understanding, and optimizing engineering productivity. It encourages leaders to take a comprehensive approach to productivity, communicating measurements with one another and connecting them to team objectives. Engineering productivity is categorized along five dimensions, which give the SPACE framework its name. S: Satisfaction and Well-Being Here, we measure whether our team members are fulfilled and happy, usually through surveys. Why do we do this? Because satisfaction is correlated with productivity. Unhappy teams that are productive will burn out sooner rather than later. P: Performance This is hard to quantify because producing more code in a unit of time is not a measure of high-quality code or productivity. Here, we can measure defect rates or change failure rates, i.e., the percentage of deployments causing a failure in production. Every loss of output harms a team's productivity. The number of merged PRs over time also correlates with productivity. A: Activity Activities are usually visible. Here, we can measure the number of commits per day or deployment frequency, i.e., how often we push new features to production. C: Collaboration and Communication We want extensive and effective collaboration between individuals and groups for a productive team. In addition, productive teams usually rely on high transparency and awareness of other people's work. Here, we can measure PR review time, quality of meetings, and knowledge sharing. E: Efficiency and Flow With flow, we measure whether an individual can complete work quickly and without interruption; efficiency means the same, but at the team level. Our goal is to foster an environment where developers can reach and keep flow for the longest possible period each day while also helping them feel content with their routines. To implement the SPACE framework, the authors recommend capturing metrics from at least three of the dimensions and aligning them with company goals and team priorities. The measures a team selects reflect its values. Here, we want to start with team-level metrics, and when we succeed, we can roll them out to the broader organization. Example metrics ("The SPACE of Developer Productivity," N. Forsgren et al., 2021) DORA Metrics for Team Productivity Another way to measure team productivity is DORA metrics. With these metrics, we evaluate team performance based on the following: Lead time for changes is the time between a commit and production. Elite performers do this in less than one hour, while medium performers need one day to one week. Deployment frequency is how often we ship changes. Elite performers do this multiple times per day, while medium ones do it once a month to once every six months. The mean time to recovery is the average time it takes your team to restore service when there's an outage. Elite performers do this in less than one hour, while medium ones do this in a day to one week. The change failure rate is the percentage of releases that result in downtime. Elite performers are at 0-15%, while medium performers are at 16-30%.
The lead time for changes and the deployment frequency reveal a team's velocity and how quickly it reacts to the constantly changing needs of consumers. The mean time to recovery and the change failure rate indicate the stability of the service and how responsive the team is to service outages or failures. By comparing all four key metrics, one can assess how successfully a company balances speed and stability. Goals/Signals/Metrics (GSM) For Measuring Developer Productivity There are other productivity frameworks, too, such as the "Goals/Signals/Metrics (GSM)" method from Google. In this framework, you first agree that there is a problem worth solving, then you set a goal for what you want to achieve and decide which statements, if true, would indicate that you are making progress (signals). Finally, you arrive at metrics you want to measure, focusing more on the desired outcome than on the metric itself. For example, the goal could be "Make sure that engineers have more focus time," a signal could be "Engineers report fewer cases of meeting overload," and a metric could be "Engineer focus time." For metrics, you can build a team dashboard that collects them in one place, so it's easy to analyze them. You can check out this video from Google if you'd like to learn more about this method.
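To make the DORA definitions above concrete, here is a minimal, hypothetical sketch (the record fields and sample numbers are invented for illustration) that derives deployment frequency, change failure rate, and average lead time from a list of deployment records:

Java
import java.time.Duration;
import java.time.LocalDate;
import java.util.List;

// Hypothetical deployment record: date, whether it caused a production failure,
// and the lead time from commit to production.
record Deployment(LocalDate date, boolean causedFailure, Duration leadTime) {}

public class DoraMetrics {
    public static void main(String[] args) {
        List<Deployment> deployments = List.of(
                new Deployment(LocalDate.of(2023, 9, 1), false, Duration.ofHours(2)),
                new Deployment(LocalDate.of(2023, 9, 2), true,  Duration.ofHours(30)),
                new Deployment(LocalDate.of(2023, 9, 4), false, Duration.ofMinutes(45)));

        long observedDays = 30; // length of the observation window (assumed)

        double deploymentFrequency = (double) deployments.size() / observedDays;
        double changeFailureRate = 100.0 * deployments.stream()
                .filter(Deployment::causedFailure).count() / deployments.size();
        double avgLeadTimeHours = deployments.stream()
                .mapToLong(d -> d.leadTime().toMinutes()).average().orElse(0) / 60.0;

        System.out.printf("Deployment frequency: %.2f per day%n", deploymentFrequency);
        System.out.printf("Change failure rate: %.1f%%%n", changeFailureRate);
        System.out.printf("Average lead time for changes: %.1f hours%n", avgLeadTimeHours);
    }
}

In practice, these numbers would come from your CI/CD and incident-tracking systems rather than hard-coded values, and they would feed the kind of team dashboard mentioned above.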
I recently discussed with some of my senior colleagues how we use Copilot and ChatGPT for programming. We discussed our experiences, how and when it helps, and what to expect from it in the future. In this article, I will briefly write about what I imagine the future of programming with AI will be. This is not about what AI will do in programming in the future. I have no idea about that, and based on my mood, I either look forward to it amazed or in fear. This article is more about how we, programmers, will do our work in the future. The Past To predict the future, we have to understand the past. If you know only the current state, you cannot reliably extrapolate. Extrapolation needs at least two points and knowledge about the speed of change in the future (maximum and minimum of the derivative function). So, here we go, looking a bit at the past, focusing on the aspects that I feel are most important to predict how we will work in the future with AI in programming. Machine Code When computers were first introduced, we programmed them in machine code. This sentence should read, "Your father/mother programmed them in machine code," for most of you. I had the luck to program a Polish clone of the PDP-11 in machine code. To create a program, we used assembly language. We wrote that on a piece of checkerboard paper (and then we typed it in). No. Note: As I write this article, Copilot is switched on, suggesting the sentences' ends. In 10% of the cases, I accept the suggestion. Copilot suggested the part between the parentheses in the last sentence. It's funny that even Copilot cannot imagine a system where we did not have an assembler. We wrote the assembly on the left side of the paper and the machine code on the right. We used printed code tables to look up the machine codes, and we had to calculate the addresses. After that, we entered the code. This was also a cumbersome process. There were switches on the front panel of the computer. There were 11 switches for the address (as far as I remember) and eight switches for the data. A switch flipped up meant a bit with a value of 1, and a switch flipped down meant a bit with a value of 0. We set the address and the desired value, and then we had to push a button to write the value into the memory. The memory consisted of ferrite rings that kept their value even after the power was switched off. Assembly It was a relief when we got the assembler. It was already on a different machine and a different processor. We looked at the machine code that the assembler generated a few times, but not many times. The mapping between the assembly and the machine code was strictly one-to-one. The next step was higher-level languages, like C. I wrote a lot of C code as a hobby before I started my second professional career as a programmer at 40. Close to the Metal The mapping from C to machine code is not one-to-one. There is room for optimization, and different compiler versions may create different code. Still, the functionality of the generated code is very much guaranteed. You do not need to look at the machine code to understand what the program does. I can recall that I only did it a few times. One of those times, we found a bug in the Sun C compiler (1987, while I was on a summer program at TU Delft). It was my mistake the other time, and I had to modify my C code. The compiler knew better than I did what the C construct I wrote meant. I do not have a recollection of the specifics.
We do not need to look at the generated code; we write on a high level and debug on a high level. High-Level As we advance in time, we get Java. Java uses two-level compilation. It compiles the Java code to byte code, which the Java Virtual Machine interprets and JIT-compiles. I looked at the generated byte code only once, to learn the intricacies of the ternary operator's type-casting rules, and I never looked at the generated machine code. The first case could have been avoided by reading the language spec, but who reads manuals? The same is true here: we step to higher levels of abstraction and do not need to look at the generated code. DSL and Generated Code Even as we advance towards higher levels, we can have Domain Specific Languages (DSLs). DSLs are interpreted, generate high-level code, or generate byte code and machine code. The third case is rare because generating low-level code is expensive, requires much work, and is not worth the effort. Generating high-level code is more common. As an example, we can take the Java::Geci fluent API generator. It reads a regular-expression-like definition of the fluent API, creates a finite state machine from it, and generates the Java code containing all the interfaces and classes that implement the fluent API. The Java compiler then compiles the generated code, and the JVM interprets the resulting byte code. Should we look at the generated code? Usually not. I actually did a lot because I wrote the generator, and so I had to debug it, but that is an exception. The generated code should perform as the definition says. The Present and the Future The next step is AI languages. This is where we are now, and it is only starting. We use AI to write code based on some natural language description. The code is generated, and we have to look at it. This is different from any earlier step in the evolution of programming languages. The reason is that the language the AI interprets is not well-defined the way Java, C, or any DSL is. It can be ambiguous. It is a human language, usually English. Or something resembling English when non-native speakers like me write it. Syntax-Free This is the advantage of AI programming. I do not need to remember the actual syntax. I can program in a language I rarely use and whose exact syntax I have forgotten. I vaguely remember it, but it is not in my muscle memory. Library-Free It can also help me with my usual programming tasks: things that other people have written many times before. The AI has them in its memory and can help me. Conventional programming languages offer something similar, but with limited scope. There are language constructs for the usual data structures and algorithms. There are libraries for the usual tasks. The problem is that you have to remember which one to use. Sometimes, writing a few lines is easier than finding the library and the function that does it. It is the same philosophy as the Unix command line versus VMS. (You may not know VMS. It was the operating system of the VAX and Alpha machines from DEC.) If you needed to do something in VMS, there was a command for it. In Unix, you had simpler commands, but you could combine them. With AI programming, you can write down what you want using natural language, and the AI will find the code fragments in its memory that fit best and adapt them. AI-Language Today, AI is generating and helping to write the code. In the future, we will tell the AI what to do, and it will execute it for us. We may not need to care about the data structures in which it stores the data or the algorithms it applies to manage them.
Today, we think of databases when we talk about structured data. That is because databases are the tools that support the limited functionality a computer can manage. Before computers, we just told the accountant to calculate last year's figures: profit, balance sheet, whatnot, and they did it. The data was on paper, and the managers did not care how it was organized. It was expensive because accountants are expensive. The intelligence they applied, extracting data from the different paper-based documents, was their strong point; calculation was just a mechanical task. Computers came, and they were strong at doing the calculations. They were weak at extracting data from the documents. The solution was to organize the data into databases. It needed more processing on the input side, but it was still cheaper than having accountants do the calculations. With AI, computers can do calculations and extract data from documents. If that can be done cheaply, there is no longer a reason to keep the data in a structured way. It can be structured on the fly when we need it for a calculation. The advantage is that we can do any calculation, and we may not face the issue that the data structure is unsuitable for the calculation we need. We just tell the AI program what we want using natural language. Is there a new patient coming to the practice? Just tell the program all the data, and it will remember it like an assistant with unlimited memory who never forgets. Do you want to know when a patient last visited? Just ask the program. You do not need to care how the artificial simulated neurons store the information. It certainly will use more computing power and energy than a well-tuned database, but on the other hand, it will have higher flexibility, and the development cost will be significantly lower. This is when we will talk to the computers, and they will help us universally. I am not shy about predicting this future because it will arrive when I am no longer around. But what should we expect in the near future? The Near Future Today, AI tools are interactive. We write some comments or code, and the AI generates the code for us, and that is the end of the story. From that point on, our "source code" is the generated code. You can feel the contradiction in the previous sentence. It is as if we wrote the code in Java once, compiled it into byte code, and then maintained the byte code. We do not do that. Source code is what we write. Generated code is never source code. I expect meta-programming tools for various existing languages to extend them. You insert some meta-code (presumably into comments) into your application, and the tool generates the code for you. However, the generated code is generated and not the source. You do not touch it. If you need to maintain the application, you modify the comment, and the tool generates the code again. It will be similar to what Java::Geci is doing. You insert some comments into your code, and the code generator inserts the generated code into the editor-fold block following the comment. Java::Geci currently does not have an AI-based code generator, or at least I do not know about any. It is an open-source framework for code generators; anyone could write a code generator utilizing AI tools. Later languages will include the possibility from the start. These languages will be some kind of hybrid solution.
There will be some code described in human language, probably describing business logic, and some technical parts that look more like a conventional programming language. It is similar to how we apply DSLs today, with the difference that the DSL will be AI-processed. As time goes on, the AI part will grow, and the conventional programming part will shrink to the point where it disappears from the application code. However, it will remain in the frameworks and AI tools, just like today's machine code and assembly. Nobody codes in assembly anymore. But wait: there are still people who do, those who write the code generators. And those who, 200 years from now, will still maintain the IBM mainframe assembly and COBOL programs. Conclusion and Takeaway I usually write a conclusion and a takeaway at the end of the article. So I do it now. That is all, folks.
Debugging complex code in Java is an essential skill for every developer. As projects grow in size and complexity, the likelihood of encountering bugs and issues increases. Debugging, however, is not just about fixing problems; it's also a valuable learning experience that enhances your coding skills. In this article, we'll explore effective strategies and techniques for debugging complex Java code, along with practical examples to illustrate each point. 1. Use a Debugger One of the most fundamental tools for debugging in Java is the debugger. Modern integrated development environments (IDEs) like IntelliJ IDEA, Eclipse, and NetBeans provide powerful debugging features that allow you to set breakpoints, inspect variables, and step through your code line by line.

Java
public class DebugExample {
    public static void main(String[] args) {
        int num1 = 10;
        int num2 = 0;
        int result = num1 / num2; // Set a breakpoint here
        System.out.println("Result: " + result);
    }
}

Here's a more detailed explanation of how to effectively use a debugger:
Setting breakpoints: Breakpoints are markers you set in your code where the debugger will pause execution. This allows you to examine the state of your program at that specific point in time. To set a breakpoint, you typically click on the left margin of the code editor next to the line you want to pause at.
Inspecting variables: While your code is paused at a breakpoint, you can inspect the values of variables. This is incredibly helpful for understanding how your program's state changes during execution. You can hover over a variable to see its current value or add it to a watch list for constant monitoring.
Stepping through code: Once paused at a breakpoint, you can step through the code step by step. This means you can move forward one line at a time, seeing how each line of code affects the state of your program. This helps you catch any unintended behavior or logical errors.
Call stack and call hierarchy: A debugger provides information about the call stack, showing the order in which methods were called and their relationships. This is especially useful in identifying the sequence of method calls that led to a specific error.
Conditional breakpoints: You can set breakpoints that trigger only when certain conditions are met. For instance, if you're trying to identify why a loop is behaving unexpectedly, you can set a breakpoint to pause only when the loop variable reaches a specific value.
Changing variable values: Some debuggers allow you to modify variable values during debugging. This can help you test different scenarios without having to restart your program.
Exception breakpoints: You can set breakpoints that pause your program whenever an exception is thrown. This is particularly useful when dealing with runtime exceptions.
2. Print Statements Good old-fashioned print statements can be surprisingly effective. By strategically placing print statements in your code, you can trace the flow of execution and the values of variables at different stages.

Java
public class PrintDebugging {
    public static void main(String[] args) {
        int x = 5;
        int y = 3;
        int sum = x + y;
        System.out.println("x: " + x);
        System.out.println("y: " + y);
        System.out.println("Sum: " + sum);
    }
}

3. Isolate the Problem If you encounter an issue, try to create a minimal example that reproduces the problem. This can help you isolate the troublesome part of your code and make it easier to find the root cause.
Isolating the problem through a minimal example is a powerful technique in debugging complex Java code. Let's explore this concept with a practical example: Imagine you're working on a Java program that calculates the factorial of a number using recursion. However, you've encountered a StackOverflowError when calculating the factorial of a larger number. To isolate the problem, you can create a minimal example that reproduces the issue. Here's how you could go about it:

Java
public class FactorialCalculator {
    public static void main(String[] args) {
        int number = 10000; // A larger number that causes StackOverflowError
        long factorial = calculateFactorial(number);
        System.out.println("Factorial of " + number + " is: " + factorial);
    }

    public static long calculateFactorial(int n) {
        if (n == 0) {
            return 1;
        } else {
            return n * calculateFactorial(n - 1);
        }
    }
}

In this example, the calculateFactorial method calculates the factorial of a number using recursion. However, it's prone to a StackOverflowError for larger numbers due to the excessive number of recursive calls. To isolate the problem, you can create a minimal example by simplifying the code:

Java
public class MinimalExample {
    public static void main(String[] args) {
        int number = 5; // A smaller number to debug
        long factorial = calculateFactorial(number);
        System.out.println("Factorial of " + number + " is: " + factorial);
    }

    public static long calculateFactorial(int n) {
        if (n == 0) {
            return 1;
        } else {
            return n * calculateFactorial(n - 1);
        }
    }
}

By reducing the value of number, you're creating a scenario where the recursive calls are manageable and won't lead to a StackOverflowError. This minimal example helps you focus on the core problem and isolate it from other complexities present in your original code. Once you've identified the issue (in this case, the excessive recursion causing a StackOverflowError), you can apply your debugging techniques to understand why the original code behaves unexpectedly for larger numbers. In real-world scenarios, isolating the problem through minimal examples helps you narrow down the root cause, saving you time and effort in identifying complex issues within your Java code. 4. Rubber Duck Debugging Explaining your code to someone (or something) else, like a rubber duck, can help you spot mistakes. This technique forces you to break down your code step by step and often reveals hidden bugs. The rubber duck debugging technique is a simple yet effective method to debug your Java code. Let's delve into it with a practical example: Imagine you're working on a Java program that calculates the Fibonacci sequence using recursion. However, you've noticed that the calculation becomes impractically slow for certain inputs. To use the rubber duck debugging technique, you'll explain your code step by step as if you were explaining it to someone else or, in this case, a rubber duck. Here's how you could apply the rubber duck debugging technique:

Java
public class FibonacciCalculator {
    public static void main(String[] args) {
        int n = 5; // Input for calculating the Fibonacci sequence
        long result = calculateFibonacci(n);
        System.out.println("Fibonacci number at position " + n + " is: " + result);
    }

    public static long calculateFibonacci(int n) {
        if (n <= 1) {
            return n;
        } else {
            return calculateFibonacci(n - 1) + calculateFibonacci(n - 2);
        }
    }
}

Now, let's imagine you're explaining this code to a rubber duck: "Hey, rubber duck! I'm trying to calculate the Fibonacci sequence for a given position n.
First, I'm checking if n is less than or equal to 1. If it is, I return n because the Fibonacci sequence starts with 0 and 1. If not, I recursively calculate the sum of the Fibonacci numbers at positions n - 1 and n - 2. This should give me the Fibonacci number at position n. Hmmm, I think I just realized that for larger values of n, this recursion recomputes the same values over and over, so it becomes terribly inefficient!" By explaining your code to the rubber duck, you've broken down the logic step by step. This process often helps in revealing hidden bugs or logical errors. In this case, you might have identified that the recursive approach for calculating the Fibonacci sequence becomes very inefficient for larger values of n. The rubber duck debugging technique encourages you to articulate your thought process and identify issues that might not be immediately apparent. It's a valuable method for tackling complex problems in your Java code and improving its quality.
Version control: Version control systems like Git allow you to track changes and collaborate with others. Using descriptive commit messages can help you remember why you made a certain change, making it easier to backtrack and identify the source of a bug.
Unit testing: Writing unit tests for your code helps catch bugs early in the development process. When debugging, you can use these tests to pinpoint the exact part of your code that's causing the issue (a small example appears at the end of this article).
Review documentation and stack traces: Error messages and stack traces can be overwhelming, but they contain valuable information about what went wrong. Understanding the stack trace can guide you to the specific line of code that triggered the error.
Binary search debugging: If you have a large codebase, narrowing down the source of a bug can be challenging. Using a binary search approach, you can comment out sections of code until you identify the problematic portion.
Conclusion Debugging complex Java code is a skill that requires patience, practice, and a systematic approach. By leveraging tools like debuggers, print statements, and version control systems, you can effectively identify and fix bugs. Remember that debugging is not just about solving immediate issues; it's also a way to deepen your understanding of your codebase and become a more proficient Java developer. So, the next time you encounter a complex bug, approach it as an opportunity to refine your coding skills and create more robust and reliable Java applications.
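Following up on the unit-testing tip above, here is a minimal JUnit 5 sketch (assuming JUnit 5 is on the classpath and the FactorialCalculator class from the earlier example) showing how a test can pinpoint the factorial problem before it reaches production:

Java
import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class FactorialCalculatorTest {

    @Test
    void smallInputIsCorrect() {
        // 5! = 120, so this passes with the recursive implementation.
        assertEquals(120, FactorialCalculator.calculateFactorial(5));
    }

    @Test
    void largeInputDoesNotBlowTheStack() {
        // With the deeply recursive implementation above, this test is expected to
        // fail with a StackOverflowError, pointing directly at where to debug.
        assertDoesNotThrow(() -> FactorialCalculator.calculateFactorial(10_000));
    }
}

A failing test like the second one narrows the search space in exactly the way the minimal-example technique described earlier does, but it keeps doing so automatically on every build.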
The rise of large language models like OpenAI's GPT series has brought forth a whole new level of capability in natural language processing. As people experiment with these models, they realize that the quality of the prompt can make a big difference to the results and some people call this “prompt engineering.” To be clear: there is no such thing. At best it is “prompt trial and error.” Prompt “engineering” assumes that by tweaking and perfecting input prompts, we can predict and control the outputs of these models with precision. The Illusion of Control The idea of prompt engineering relies on the belief that, by carefully crafting input prompts, we can achieve the desired response from a language model. This assumes there is a deterministic relationship between the input and output of LLMs, which are complex statistical text models – this makes it impossible to predict the outcome of changing the prompt with any certainty. Indeed, the unpredictability of neural networks in general is one of the things that limits their ability to work without human supervision. The Butterfly Effect in Language Models The sensitivity of large language models to slight changes in input prompts, often compared to the butterfly effect in chaos theory, is another factor that undermines the concept of prompt engineering. The butterfly effect illustrates how small changes in initial conditions can have significantly different outcomes in dynamic systems. In the context of language models, altering a single word or even a punctuation mark can lead to drastically different responses, making it challenging to pinpoint the best prompt modification for a specific result. The Role of Bias and Variability Language models, including GPT series models, are trained on vast quantities of human-generated text data. As a result, they inherit the biases, inconsistencies, and idiosyncrasies present in these datasets. This inherent bias and variability in the training data contribute to the unpredictability of the model's outputs. The Uncertainty of Generalization Language models are designed to generalize across various domains and tasks, which adds another layer of complexity to the challenge of prompt engineering. While the models are incredibly powerful, they may not always have the detailed domain-specific knowledge required for generating an accurate and precise response. Consequently, crafting the "perfect" prompt for every possible situation is an unrealistic goal. The Cost of Trial and Error Given the unpredictability of language model outputs, editing prompts often becomes a time-consuming process of trial and error. Adjusting a prompt multiple times to achieve the desired response can take so long that it negates the efficiency gains these models are supposed to provide. In many cases, performing the task manually might be more efficient than investing time and effort in refining prompts to elicit the perfect output. The concept of prompt engineering in large language models is a myth rather than a practical reality. The inherent unpredictability of these models, combined with the impact of small changes in input prompts, the presence of biases and variability in training data, the models' generalization abilities, and the costly trial-and-error nature of editing prompts, make it impossible to predict and control their outcomes with certainty. 
Instead of focusing on prompt engineering as a magic bullet, it is essential to approach these models with a healthy dose of skepticism, recognizing their limitations while appreciating their remarkable capabilities in natural language processing.
Of late, I have started to pen down summaries of my talks for those not interested in sitting through a 45-minute video or staring at slides without getting much context. Here is one for AWS DevDay Hyderabad 2023, where I spoke about "Simplifying Your Kubernetes Infrastructure With CDK8s." CDK for Kubernetes, or CDK8s, is an open-source CNCF project that helps represent Kubernetes resources and applications as code (not YAML!). You can check out this link for the slides, and the recording is coming soon! Working on Kubernetes? Many of you work on Kubernetes, perhaps in different capacities: architect, DevOps engineer, SRE, platform engineer, software engineer, etc. No matter what we tell our friends, family, or even recruiters (we need to convince them how hard it is to work across the entire CNCF landscape, don't we?), the real question is... ...What do we actually do? (This is all light-hearted stuff, please don't take this personally.) Can you relate to this as well? Working on Kubernetes does not have to mean turning into a YAML engineer (unless that becomes the "next big thing"). But the reality is that there is... ...YAML Everywhere It has become the "lingua franca" of the Kubernetes world. Every tool we use, including Helm, Kustomize, GitOps tooling, Kubernetes operators, etc., involves using (and debugging) YAML. Avoiding YAML! But what I covered in the talk was not about removing or getting rid of YAML. To be honest, that's almost impossible. It's more about circumventing it or pushing it behind the scenes, using tools and frameworks - and one of those is CDK for Kubernetes. Helpful Content There is only so much I can cover in 30-40 minutes. If you are interested in learning more, you are more than welcome to check out this. It's kind of a mini-book along with a GitHub repo that has all the code. I think you will find it useful! From Infra-As-Code to Infra-Is-Code You may have used Terraform or AWS CloudFormation. These fall into the Infrastructure as Code (IaC) category, where you write configuration to define your resources, be it YAML, JSON, or a custom templating language. On the other end of the spectrum, there are tools like AWS CDK or Pulumi. This is where you step into the Infrastructure Is Code zone, where configuration takes a back seat and code dominates. Hello, CDK8s! And that's where CDK8s comes into the picture, since it embraces the Infrastructure Is Code paradigm for Kubernetes applications. I am a Go developer and that's what I used for the demos and code samples, but CDK8s also supports TypeScript/JavaScript, Python, and Java. And the best part is that you can use it anywhere. It's not specific to AWS or any other provider: it's an open-source CNCF project. Demo Time! I showed how to bootstrap a Go CDK8s application using the CLI (cdk8s init go-app), define an NGINX application, convert it to a YAML manifest (yes, we can't get rid of it!), and deploy it using kubectl apply -f dist/. CDK8s Components Before we move on, let's understand these concepts: Every CDK8s application consists of a tree of components, also called constructs. At the top of the tree is what you call an App. We create it using the NewApp function. An App can contain multiple Charts, and we create a chart using the NewChart function. Within that Chart is where you would define your basic building blocks. In the demo, we defined a Kubernetes Deployment and Service, but it could be other types of constructs as well (covered later).
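The demo code from the talk is in Go (see the linked repo). As a rough illustration of the App, Chart, and constructs tree just described, here is a minimal sketch using the Java bindings (class names are taken from the org.cdk8s package; treat the details as an approximation rather than the talk's actual code):

Java
// A hedged sketch of the construct tree: App -> Chart -> building blocks.
import org.cdk8s.App;
import org.cdk8s.Chart;

public class HelloCdk8s {
    public static void main(String[] args) {
        App app = new App();                   // root of the construct tree
        Chart chart = new Chart(app, "hello"); // one Chart maps to one YAML manifest
        // Inside the chart you would add the basic building blocks, e.g. the
        // KubeDeployment / KubeService classes produced by `cdk8s import`
        // (names assumed from the generated bindings, not shown here).
        app.synth();                           // writes the manifests to dist/
    }
}

Whatever the language, the shape is the same: one App, one or more Charts, and the Kubernetes objects hanging off each Chart.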
Improve Productivity With CDK8s+ and Custom Constructs Since CDK8s is a low-level API, it can be a bit verbose, and that's where the CDK8s+ library comes in. The difference is clear in the example from the CDK8s documentation, though the savings can vary depending on the language (that example is Node.js; the talk used a Go equivalent). More than the lines of code and verbosity, I think the more important point to consider is the level of abstraction. CDK8s+ provides a higher-level abstraction: in most cases you will have fewer lines of code, an ergonomic API with better defaults, and overall less mental overhead. As developers, we all know how that can improve productivity! Custom Constructs Imagine you are building a complex application on Kubernetes using CDK8s, and it is required by many teams in the company - perhaps each with a different configuration. Instead of each dev team writing their own version, you can externalize this in the form of a custom construct and share it across the company. It's nothing fancy, to be honest - it's the same as using a dependency or library - but this is still very powerful. The great thing is that there is already a Construct Hub with constructs published by AWS, the community, and other providers as well! Demo Time, Again! In this demo, I walked through a "WordPress" deployment on Kubernetes to demonstrate the ease of use and intuitiveness of the CDK8s+ API, and then demonstrated custom constructs in the context of the same "WordPress" deployment, packaged as a custom "WordPress" construct. Integrating With Kubernetes Custom Resource Definitions (CRDs) Then we moved into a slightly advanced topic. You can use the CDK8s library to represent native Kubernetes objects like a Deployment, Service, etc. But in any real-world implementation, it's highly likely that you are using custom resource definitions (CRDs). So if you have existing CRDs, can you manage them as code using CDK8s? A good example of a CRD and operator is AWS Controllers for Kubernetes (ACK). With the ACK project, you can manage AWS services from Kubernetes. One example is an Amazon RDS database, but the same applies to an Amazon DynamoDB table, an Amazon S3 bucket, an Amazon ElastiCache cluster, or even an AWS Lambda function. The ACK project has different controllers/operators for each service. When you create an ACK resource, these operators do what's needed to create and manage the corresponding service(s) in AWS. How Do You Use CRDs With CDK8s? To use any CRD with CDK8s: The first step is to run cdk8s import, pointing it to the CRD. This will result in APIs being auto-generated. Write your code using these APIs. Then business as usual: use cdk8s synth to generate the manifest for your Kubernetes application. Yes, Another Demo! It's a simple URL shortener app that is deployed to Amazon EKS and uses DynamoDB as the database. The interesting part here is the fact that both the application and DynamoDB are defined using CDK8s. CDK8s and AWS CDK So far, we have had one thing to think about. We brought in CDK8s and were able to represent both the AWS resources and the respective Kubernetes applications in the form of code. But what if we don't want to use ACK and would rather stick to something like AWS CDK for AWS resources, while using CDK8s for Kubernetes apps? CDK8s is great, and AWS CDK is awesome. So what's stopping us from using them together?
Deploying Apps to EKS With AWS CDK – The "Hard" Way Here is a glimpse of using only AWS CDK to deploy an app to an Amazon EKS cluster (note that this is just AWS CDK code, not CDK8s). First, you declare the Kubernetes Deployment. Then, the Kubernetes Service. And finally, you call the addManifest function declared on the EKS cluster object - this is equivalent to calling kubectl, but programmatically. addCdk8sChart to the Rescue! The addCdk8sChart method acts like a bridge between AWS CDK and CDK8s - check out the CDK reference docs for details. This makes life much easier! Recap: We’re Almost Done! Key takeaways: It's not about removing YAML; it's about reducing it and embracing the Infrastructure Is Code paradigm. Use CDK8s with regular programming languages and make use of the related tools and best practices (testing, OOP, etc.). Optimize with CDK8s+, share and reuse common components as constructs, and extend with CRDs. Combine CDK8s with AWS CDK, or with existing workflows and practices like GitOps, or even Helm charts. Hope to see you next time!