The Modern DevOps Lifecycle
While DevOps is here to stay, as the years pass, we must continuously assess and seek improvements to our existing software processes, systems, and culture — and DevOps is no exception to that rule. With business needs and customer demands constantly shifting, our technology, mindsets, and architecture must shift as well in order to keep pace. Now is the time for this movement that's all about "shifting left" to essentially shift. In our annual DevOps Trend Report, we explore both its fundamental principles and the emerging topics, methodologies, and challenges surrounding the engineering ecosystem. Within our "Key Research Findings" and featured articles from our expert community members, readers will find information on core DevOps topics as well as new insights on what's next for DevOps in 2024 and beyond. Join us to learn about the state of CI/CD pipelines, the impact of technical debt, patterns for supply chain management and DevOps, the rise of platform engineering, and even more!
Angular, a powerful framework for building dynamic web applications, is known for its component-based architecture. However, one aspect that often puzzles new developers is the fact that Angular components do not have a display: block style by default. This article explores the implications of this design choice, its impact on web development, and how developers can effectively work with it.

The world of front-end development is replete with frameworks that aim to provide developers with robust tools to build interactive and dynamic web applications. Among these, Angular stands out as a powerful platform, known for its comprehensive approach to constructing applications' architecture. Particularly noteworthy is the way Angular handles components — the fundamental building blocks of Angular applications.

Understanding Angular Components

In Angular, components are the fundamental building blocks that encapsulate data binding, logic, and template rendering. They play a crucial role in defining the structure and behavior of your application's interface.

Definition and Role

A component in Angular is a TypeScript class decorated with @Component(), where you can define its application logic. Accompanying this class is a template, typically an HTML file, that determines the component's visual representation, and optionally CSS files for styling. The component's role is multifaceted: it manages the data and state necessary for the view, handles user interactions, and can also be reused throughout the application.

```typescript
import { Component } from '@angular/core';

@Component({
  selector: 'app-my-component',
  templateUrl: './my-component.component.html',
  styleUrls: ['./my-component.component.css']
})
export class MyComponent {
  // Component logic goes here
}
```

Angular's Shadow DOM

Angular components use style encapsulation modeled on the Shadow DOM: by default, Angular emulates Shadow DOM scoping, and it can use the browser's native Shadow DOM when configured to do so. Either way, a component's markup and styles are encapsulated so that styles defined in one component will not leak out and affect other parts of the application. This encapsulation creates a boundary around the component.

As a developer, it's essential to understand the structure and capabilities of Angular components to fully leverage the power of the framework. Recognizing the inherent encapsulation provided by Angular's view encapsulation is particularly important when considering how components are displayed and styled within an application.

Display Block: The Non-Default in Angular Components

Angular components are different from standard HTML elements in many ways, one of which is their default display property. Unlike basic HTML elements, which often come with a browser default of block or inline, Angular components ship with no display style of their own: their host elements are custom elements that the browser knows nothing about, so they fall back to the CSS initial value, display: inline. This decision is intentional and plays an important role in Angular's encapsulation philosophy and component rendering process.

Comparison With HTML Elements

Standard HTML elements like <div>, <p>, and <h1> come with default styling that can include the CSS display: block property. This means that when you drop a <div> into your markup, it naturally takes up the full width available to it, creating a "block" on the page.

```html
<!-- Standard HTML div element -->
<div>This div is a block-level element by default.</div>
```

In contrast, Angular components start without any assumptions about their display property. That is, they don't inherently behave as block elements; they are essentially "display-agnostic" until a display style is specified.
Rationale Behind the Non-Block Default

Angular's choice to diverge from the typical block behavior of HTML elements is deliberate. One reason for this is to encourage developers to consciously decide how each component should be displayed within the application's layout. It prevents unexpected layout shifts and the overriding of global styles that may occur when components with block-level styles are introduced into existing content.

By not having a display property set by default, Angular invites developers to think responsively and adapt their components to various screen sizes and layout requirements by setting explicit display styles that suit the component's purpose within the context of the application. In the following section, we will explore how to work with the display properties of Angular components, ensuring that they fit seamlessly into your application's design with explicit and intentional styling choices.

Working With Angular's Display Styling

When building applications with Angular, understanding and properly implementing display styling is crucial for achieving the desired layout and responsiveness. Since Angular components come without a preset display rule, it's up to the developer to define how each component should be displayed within the context of the application.

1. Explicitly Setting Display Styles

You have complete control over how an Angular component is displayed by explicitly setting the CSS display property. This can be defined inline, within the component's stylesheet, or even dynamically through component logic.

```css
/* app-example.component.css */
:host {
  display: block;
}
```

```html
<!-- Inline style -->
<app-example-component style="display: block;"></app-example-component>
```

```typescript
// Component logic setting display dynamically
export class ExampleComponent implements OnInit {
  @HostBinding('style.display') displayStyle: string = 'block';
}
```

Choosing to set your component's display style via the stylesheet ensures that you can leverage CSS's full power, including media queries for responsiveness.

2. Responsive Design Considerations

Angular's adaptability allows you to create responsive designs by combining explicit display styles with modern CSS techniques. Using media queries, flexbox, and CSS Grid, you can responsively adjust the layout of your components based on the viewport size.

```css
/* app-example.component.css */
:host {
  display: grid;
  grid-template-columns: repeat(auto-fill, minmax(150px, 1fr));
}

@media (max-width: 768px) {
  :host {
    display: block;
  }
}
```

By setting explicit display values in stylesheets and using Angular's data-binding features, you can create a responsive and adaptive user interface. This level of control over styling reflects the thoughtful consideration that Angular brings to the development process, enabling you to create sophisticated, maintainable, and scalable applications. Next, we will wrap up our discussion and revisit the key takeaways from working with Angular components and their display styling strategies.

Conclusion

Throughout this exploration of Angular components and their display properties, it has become apparent that Angular's choice of a non-block default for components is a purposeful design decision. This approach promotes a more thoughtful application of styles and supports encapsulation, a core principle within Angular's architecture. It steers developers toward crafting intentional and adaptive layouts, a necessity in the diverse landscape of devices and screen sizes.
By understanding Angular's component architecture and the reasoning behind its display styling choices, developers are better equipped to make informed decisions. Explicit display settings and responsive design considerations are not afterthoughts but integral parts of the design and development process when working with Angular. Embracing these concepts allows developers to fully leverage the framework's capabilities, leading to well-structured, maintainable, and responsive applications that stand the test of time and technology evolution. The information provided in this article aims to help Angular developers harness these tools effectively, ensuring that the user experiences they create are as robust as the components that compose them.
In Part 1 of this series, we looked at MongoDB, one of the most reliable and robust document-oriented NoSQL databases. Here in Part 2, we'll examine another quite unavoidable NoSQL database: Elasticsearch. More than just a popular and powerful open-source distributed NoSQL database, Elasticsearch is first of all a search and analytics engine. It is built on top of Apache Lucene, the most famous search engine Java library, and is able to perform real-time search and analysis operations on structured and unstructured data. It is designed to handle large amounts of data efficiently.

Once again, we need to disclaim that this short post is by no means an Elasticsearch tutorial. Accordingly, the reader is strongly advised to extensively use the official documentation, as well as the excellent book "Elasticsearch in Action" by Madhusudhan Konda (Manning, 2023), to learn more about the product's architecture and operations. Here, we're just reimplementing the same use case as previously, but this time using Elasticsearch instead of MongoDB. So, here we go!

The Domain Model

The diagram below shows our customer-order-product domain model:

This diagram is the same as the one presented in Part 1. Like MongoDB, Elasticsearch is also a document data store and, as such, it expects documents to be presented in JSON notation. The only difference is that, to handle its data, Elasticsearch needs to get it indexed. There are several ways that data can be indexed in an Elasticsearch data store; for example, piping it from a relational database, extracting it from a filesystem, streaming it from a real-time source, etc. But whatever the ingestion method might be, it eventually consists of invoking the Elasticsearch RESTful API via a dedicated client. There are two categories of such dedicated clients:

REST-based clients like curl, Postman, HTTP modules for Java, JavaScript, Node.js, etc.
Programming language SDKs (Software Development Kits): Elasticsearch provides SDKs for the most-used programming languages, including but not limited to Java and Python.

Indexing a new document with Elasticsearch means creating it using a POST request against a special RESTful API endpoint named _doc. For example, the following request will create a new Elasticsearch index and store a new customer instance in it.

```plaintext
POST customers/_doc/
{
  "id": 10,
  "firstName": "John",
  "lastName": "Doe",
  "email": {
    "address": "john.doe@gmail.com",
    "personal": "John Doe",
    "encodedPersonal": "John Doe",
    "type": "personal",
    "simple": true,
    "group": true
  },
  "addresses": [
    {
      "street": "75, rue Véronique Coulon",
      "city": "Coste",
      "country": "France"
    },
    {
      "street": "Wulfweg 827",
      "city": "Bautzen",
      "country": "Germany"
    }
  ]
}
```

Running the request above using curl or the Kibana console (as we'll see later) will produce the following result:

```plaintext
{
  "_index": "customers",
  "_id": "ZEQsJI4BbwDzNcFB0ubC",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}
```

This is the Elasticsearch standard response to a POST request. It confirms that the index named customers was created and that a new customer document was stored in it, identified by an automatically generated ID (in this case, ZEQsJI4BbwDzNcFB0ubC). Other interesting parameters appear here as well, like _version and especially _shards. Without going into too much detail, Elasticsearch creates indexes as logical collections of documents.
Just like keeping paper documents in a filing cabinet, Elasticsearch keeps documents in an index. Each index is composed of shards, which are physical instances of Apache Lucene, the engine behind the scenes responsible for getting data in and out of storage. They might be either primary shards, storing documents, or replicas, storing, as the name suggests, copies of primary shards. More on that in the Elasticsearch documentation - for now, we need to notice that our index named customers is composed of two shards, one of which, of course, is the primary.

A final note: the POST request above doesn't mention an ID value, as it is automatically generated. While this is probably the most common use case, we could have provided our own ID value. In that case, the HTTP request to use isn't POST anymore, but PUT.

To come back to our domain model diagram, as you can see, its central document is Order, stored in a dedicated collection named Orders. An Order is an aggregate of OrderItem documents, each of which points to its associated Product. An Order document also references the Customer who placed it. In Java, this is implemented as follows:

```java
public class Customer
{
  private Long id;
  private String firstName, lastName;
  private InternetAddress email;
  private Set<Address> addresses;
  ...
}
```

The code above shows a fragment of the Customer class. This is a simple POJO (Plain Old Java Object) having properties like the customer's ID, first and last name, email address, and a set of postal addresses. Let's look now at the Order document.

```java
public class Order
{
  private Long id;
  private String customerId;
  private Address shippingAddress;
  private Address billingAddress;
  private Set<String> orderItemSet = new HashSet<>();
  ...
}
```

Here you can notice some differences compared to the MongoDB version. As a matter of fact, with MongoDB we were using a reference to the customer instance associated with this order. This notion of reference doesn't exist with Elasticsearch and, hence, we use the document ID to create an association between the order and the customer who placed it. The same applies to the orderItemSet property, which creates an association between the order and its items. The rest of our domain model is quite similar and based on the same normalization ideas. For example, the OrderItem document:

```java
public class OrderItem
{
  private String id;
  private String productId;
  private BigDecimal price;
  private int amount;
  ...
}
```

Here, we need to associate the product which is the object of the current order item. Last but not least, we have the Product document:

```java
public class Product
{
  private String id;
  private String name, description;
  private BigDecimal price;
  private Map<String, String> attributes = new HashMap<>();
  ...
}
```

The Data Repositories

Quarkus Panache greatly simplifies the data persistence process by supporting both the active record and the repository design patterns. In Part 1, we used the Quarkus Panache extension for MongoDB to implement our data repositories, but there is not yet an equivalent Quarkus Panache extension for Elasticsearch. Accordingly, while waiting for a possible future Quarkus extension for Elasticsearch, here we have to implement our data repositories manually using the dedicated Elasticsearch client. Elasticsearch is written in Java and, consequently, it is no surprise that it offers native support for invoking the Elasticsearch API using the Java client library.
This library is based on fluent API builder design patterns and provides both synchronous and asynchronous processing models. It requires Java 8 at a minimum. So, what do our fluent API builder-based data repositories look like? Below is an excerpt from the CustomerServiceImpl class, which acts as a data repository for the Customer document.

```java
@ApplicationScoped
public class CustomerServiceImpl implements CustomerService
{
  private static final String INDEX = "customers";

  @Inject
  ElasticsearchClient client;

  @Override
  public String doIndex(Customer customer) throws IOException
  {
    return client.index(IndexRequest.of(ir -> ir.index(INDEX).document(customer))).id();
  }
  ...
```

As we can see, our data repository implementation must be a CDI bean having an application scope. The Elasticsearch Java client is simply injected, thanks to the quarkus-elasticsearch-java-client Quarkus extension. This way, we avoid lots of bells and whistles that we would have had to deal with otherwise. The only thing we need in order to inject the client is to declare the following property:

```properties
quarkus.elasticsearch.hosts = elasticsearch:9200
```

Here, elasticsearch is the DNS (Domain Name System) name that we associate with the Elasticsearch database server in the docker-compose.yaml file, and 9200 is the TCP port number used by the server to listen for connections.

The method doIndex() above creates a new index named customers if it doesn't exist and indexes (stores) into it a new document representing an instance of the class Customer. The indexing process is performed based on an IndexRequest accepting as input arguments the index name and the document body. As for the document ID, it is automatically generated and returned to the caller for further reference.

The following method allows us to retrieve the customer identified by the ID given as an input argument:

```java
...
@Override
public Customer getCustomer(String id) throws IOException
{
  GetResponse<Customer> getResponse = client.get(GetRequest.of(gr -> gr.index(INDEX).id(id)), Customer.class);
  return getResponse.found() ? getResponse.source() : null;
}
...
```

The principle is the same: using the fluent API builder pattern, we construct a GetRequest instance in a similar way to the IndexRequest, and we run it against the Elasticsearch Java client. The other endpoints of our data repository, allowing us to perform full search operations or to update and delete customers, are designed the same way. Please take some time to look at the code to understand how things work.
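For illustration only, here is a hedged sketch of what a match-query-based search endpoint could look like with the same fluent API. The searchCustomers method and the matched field are hypothetical, not taken from the project's code, and imports (from co.elastic.clients.elasticsearch.core and its search package) are elided as in the other excerpts:

```java
@Override
public List<Customer> searchCustomers(String term) throws IOException
{
  // Build a match query on a document field (the field name is an assumption)
  SearchResponse<Customer> response = client.search(
      SearchRequest.of(sr -> sr.index(INDEX)
          .query(q -> q.match(m -> m.field("firstName").query(term)))),
      Customer.class);

  // Collect the matching documents from the search hits
  return response.hits().hits().stream()
      .map(Hit::source)
      .collect(Collectors.toList());
}
```

The same request-builder shape (a lambda passed to a static of() factory) recurs across the whole client API, which is why the remaining endpoints follow naturally once doIndex() and getCustomer() are understood.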
The REST API

Our MongoDB REST API interface was simple to implement, thanks to the quarkus-mongodb-rest-data-panache extension, in which the annotation processor automatically generated all the required endpoints. With Elasticsearch, we don't yet benefit from the same comfort and, hence, we need to implement it manually. That's not a big deal, as we can inject the data repositories shown previously:

```java
@Path("customers")
@Produces(APPLICATION_JSON)
@Consumes(APPLICATION_JSON)
public class CustomerResourceImpl implements CustomerResource
{
  @Inject
  CustomerService customerService;

  @Override
  public Response createCustomer(Customer customer, @Context UriInfo uriInfo) throws IOException
  {
    return Response.accepted(customerService.doIndex(customer)).build();
  }

  @Override
  public Response findCustomerById(String id) throws IOException
  {
    return Response.ok().entity(customerService.getCustomer(id)).build();
  }

  @Override
  public Response updateCustomer(Customer customer) throws IOException
  {
    customerService.modifyCustomer(customer);
    return Response.noContent().build();
  }

  @Override
  public Response deleteCustomerById(String id) throws IOException
  {
    customerService.removeCustomerById(id);
    return Response.noContent().build();
  }
}
```

This is the customer REST API implementation. The other ones, associated with orders, order items, and products, are similar. Let's see now how to run and test the whole thing.

Running and Testing Our Microservices

Now that we have looked at the details of our implementation, let's see how to run and test it. We chose to do it with the docker-compose utility. Here is the associated docker-compose.yml file:

```yaml
version: "3.7"
services:
  elasticsearch:
    image: elasticsearch:8.12.2
    environment:
      node.name: node1
      cluster.name: elasticsearch
      discovery.type: single-node
      bootstrap.memory_lock: "true"
      xpack.security.enabled: "false"
      path.repo: /usr/share/elasticsearch/backups
      ES_JAVA_OPTS: -Xms512m -Xmx512m
    hostname: elasticsearch
    container_name: elasticsearch
    ports:
      - "9200:9200"
      - "9300:9300"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - node1-data:/usr/share/elasticsearch/data
    networks:
      - elasticsearch
  kibana:
    image: docker.elastic.co/kibana/kibana:8.6.2
    hostname: kibana
    container_name: kibana
    environment:
      - elasticsearch.url=http://elasticsearch:9200
      - csp.strict=false
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - 5601:5601
    networks:
      - elasticsearch
    depends_on:
      - elasticsearch
    links:
      - elasticsearch:elasticsearch
  docstore:
    image: quarkus-nosql-tests/docstore-elasticsearch:1.0-SNAPSHOT
    depends_on:
      - elasticsearch
      - kibana
    hostname: docstore
    container_name: docstore
    links:
      - elasticsearch:elasticsearch
      - kibana:kibana
    ports:
      - "8080:8080"
      - "5005:5005"
    networks:
      - elasticsearch
    environment:
      JAVA_DEBUG: "true"
      JAVA_APP_DIR: /home/jboss
      JAVA_APP_JAR: quarkus-run.jar
volumes:
  node1-data:
    driver: local
networks:
  elasticsearch:
```

This file instructs the docker-compose utility to run three services:

A service named elasticsearch running the Elasticsearch 8.12.2 database
A service named kibana running the multipurpose web console, providing options such as executing queries, creating aggregations, and developing dashboards and graphs
A service named docstore running our Quarkus microservice

Now, you may check that all the required processes are running:

```shell
$ docker ps
CONTAINER ID  IMAGE                                                    COMMAND                 CREATED     STATUS     PORTS                                                                                          NAMES
005ab8ebf6c0  quarkus-nosql-tests/docstore-elasticsearch:1.0-SNAPSHOT  "/opt/jboss/containe…"  3 days ago  Up 3 days  0.0.0.0:5005->5005/tcp, :::5005->5005/tcp, 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp, 8443/tcp  docstore
9678c0a04307  docker.elastic.co/kibana/kibana:8.6.2                    "/bin/tini -- /usr/l…"  3 days ago  Up 3 days  0.0.0.0:5601->5601/tcp, :::5601->5601/tcp                                                       kibana
805eba38ff6c  elasticsearch:8.12.2                                     "/bin/tini -- /usr/l…"  3 days ago  Up 3 days  0.0.0.0:9200->9200/tcp, :::9200->9200/tcp, 0.0.0.0:9300->9300/tcp, :::9300->9300/tcp            elasticsearch
$
```

To confirm that the Elasticsearch server is available and able to run queries, you can connect to Kibana at http://localhost:5601. After scrolling down the page and selecting Dev Tools in the preferences menu, you can run queries as shown below. In order to test the microservices, proceed as follows:

1. Clone the associated GitHub repository:

```shell
$ git clone https://github.com/nicolasduminil/docstore.git
```

2. Go to the project:

```shell
$ cd docstore
```

3. Check out the right branch:

```shell
$ git checkout elastic-search
```

4. Build:

```shell
$ mvn clean install
```

5. Run the integration tests:

```shell
$ mvn -DskipTests=false failsafe:integration-test
```

This last command will run the 17 provided integration tests, which should all succeed. You can also use the Swagger UI for testing purposes by pointing your preferred browser at http://localhost:8080/q/swagger-ui. Then, in order to test endpoints, you can use the payload in the JSON files located in the src/resources/data directory of the docstore-api project. Enjoy!
Parameterized tests allow developers to efficiently test their code with a range of input values. In the realm of JUnit testing, seasoned users have long grappled with the complexities of implementing these tests. But with the release of JUnit 5.7, a new era of test parameterization begins, offering developers first-class support and enhanced capabilities. Let's delve into the exciting possibilities that JUnit 5.7 brings to the table for parameterized testing!

Parameterization Samples From JUnit 5.7 Docs

Let's see some examples from the docs:

```java
@ParameterizedTest
@ValueSource(strings = { "racecar", "radar", "able was I ere I saw elba" })
void palindromes(String candidate) {
    assertTrue(StringUtils.isPalindrome(candidate));
}

@ParameterizedTest
@CsvSource({
    "apple,         1",
    "banana,        2",
    "'lemon, lime', 0xF1",
    "strawberry,    700_000"
})
void testWithCsvSource(String fruit, int rank) {
    assertNotNull(fruit);
    assertNotEquals(0, rank);
}

@ParameterizedTest
@MethodSource("stringIntAndListProvider")
void testWithMultiArgMethodSource(String str, int num, List<String> list) {
    assertEquals(5, str.length());
    assertTrue(num >= 1 && num <= 2);
    assertEquals(2, list.size());
}

static Stream<Arguments> stringIntAndListProvider() {
    return Stream.of(
        arguments("apple", 1, Arrays.asList("a", "b")),
        arguments("lemon", 2, Arrays.asList("x", "y"))
    );
}
```

The @ParameterizedTest annotation has to be accompanied by one of several provided source annotations describing where to take the parameters from. The source of the parameters is often referred to as the "data provider." I will not dive into their detailed description here: the JUnit user guide does it better than I could, but allow me to share several observations:

@ValueSource is limited to providing a single parameter value only. In other words, the test method cannot have more than one argument, and the types one can use are restricted as well.
Passing multiple arguments is somewhat addressed by @CsvSource, which parses each string into a record that is then passed as arguments field by field. This can easily get hard to read with long strings and/or plentiful arguments. The types one can use are also restricted — more on this later.
All the sources that declare the actual values in annotations are restricted to values that are compile-time constants (a limitation of Java annotations, not JUnit).
@MethodSource and @ArgumentsSource provide a stream/collection of (un-typed) n-tuples that are then passed as method arguments. Various actual types are supported to represent the sequence of n-tuples, but none of them guarantee that they will fit the method's argument list. This kind of source requires additional methods or classes, but it places no restriction on where and how to obtain the test data.

As you can see, the source types available range from the simple ones (simple to use, but limited in functionality) to the ultimately flexible ones that require more code to get working.

Sidenote — This is generally a sign of good design: a little code is needed for essential functionality, and adding extra complexity is justified when used to enable a more demanding use case.

What does not seem to fit this hypothetical simple-to-flexible continuum is @EnumSource. Take a look at this non-trivial example of four parameter sets with two values each.

Note — While @EnumSource passes the enum value as a single test method parameter, conceptually, the test is parameterized by the enum's fields, which poses no restriction on the number of parameters.
```java
enum Direction {
    UP(0, '^'), RIGHT(90, '>'), DOWN(180, 'v'), LEFT(270, '<');

    private final int degrees;
    private final char ch;

    Direction(int degrees, char ch) {
        this.degrees = degrees;
        this.ch = ch;
    }
}

@ParameterizedTest
@EnumSource
void direction(Direction dir) {
    assertEquals(0, dir.degrees % 90);
    assertFalse(Character.isWhitespace(dir.ch));

    int orientation = player.getOrientation();
    player.turn(dir);
    assertEquals((orientation + dir.degrees) % 360, player.getOrientation());
}
```

Just think of it: the hardcoded list of values restricts its flexibility severely (no external or generated data), while the amount of additional code needed to declare the enum makes this quite a verbose alternative to, say, @CsvSource. But that is just a first impression. We will see how elegant this can get when leveraging the true power of Java enums.

Sidenote — This article does not address the verification of enums that are part of your production code. Those, of course, have to be declared no matter how you choose to verify them. Instead, it focuses on when and how to express your test data in the form of enums.

When To Use It

There are situations when enums perform better than the alternatives:

Multiple Parameters per Test

When all you need is a single parameter, you likely do not want to complicate things beyond @ValueSource. But as soon as you need multiple — say, inputs and expected results — you have to resort to @CsvSource, @MethodSource/@ArgumentsSource, or @EnumSource. In a way, an enum lets you "smuggle in" any number of data fields. So when you need to add more test method parameters in the future, you simply add more fields to your existing enums, leaving the test method signatures untouched. This becomes priceless when you reuse your data provider in multiple tests. For other sources, one has to employ ArgumentsAccessors or ArgumentsAggregators for the flexibility that enums have out of the box.

Type Safety

For Java developers, this should be a big one. Parameters read from CSV (files or literals), @MethodSource, or @ArgumentsSource provide no compile-time guarantee that the parameter count and types are going to match the signature. Obviously, JUnit is going to complain at runtime, but forget about any code assistance from your IDE. Same as before, this adds up when you reuse the same parameters for multiple tests. Using a type-safe approach is a huge win when extending the parameter set in the future.

Custom Types

This is mostly an advantage over text-based sources, such as the ones reading data from CSV — the values encoded in the text need to be converted to Java types. If you have a custom class to instantiate from the CSV record, you can do it using an ArgumentsAggregator. However, your data declaration is still not type-safe — any mismatch between the method signature and the declared data will pop up at runtime when "aggregating" arguments. Not to mention that declaring the aggregator class adds more support code needed for your parameterization to work. And we favored @CsvSource over @EnumSource to avoid extra code in the first place.

Documentable

Unlike the other methods, the enum source has Java symbols for both parameter sets (enum instances) and all the parameters they contain (enum fields). They provide a straightforward place to attach documentation in its most natural form — the JavaDoc. It is not that documentation cannot be placed elsewhere, but it will be — by definition — placed further from what it documents, and thus be harder to find and easier to become outdated.
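To make the "documentable" point concrete, here is a minimal hedged sketch. The enum, its field, and the file names are hypothetical, not drawn from any real test suite; the point is only where the JavaDoc can live:

```java
/** Test fixtures for the conversion tests: each constant is one parameter set. */
enum Sample {
    /** A file that is already valid in the target format; conversion must be a no-op. */
    ALREADY_CONVERTED("clean.target.txt"),
    /** A file using deprecated constructs; conversion must rewrite them. */
    DEPRECATED_CONSTRUCTS("legacy.source.txt");

    /** Classpath location of the input file for this scenario. */
    final String inFile;

    Sample(String inFile) {
        this.inFile = inFile;
    }
}
```

Both the scenario (the constant) and every value it carries (the field) get documentation that the IDE surfaces on hover, right where the data is declared.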
But There Is More!

Now: Enums. Are. Classes. It feels like many junior developers are yet to realize how powerful Java enums truly are. In other programming languages, they really are just glorified constants. But in Java, they are convenient little implementations of the Flyweight design pattern with (much of) the advantages of full-blown classes. Why is that a good thing?

Test Fixture-Related Behavior

As with any other class, enums can have methods added to them. This becomes handy if enum test parameters are reused between tests — same data, just tested a little differently. To effectively work with the parameters without significant copy and paste, some helper code needs to be shared between those tests as well. It is not something a helper class and a few static methods would not "solve."

Sidenote — Notice that such a design suffers from Feature Envy. Test methods — or worse, helper class methods — would have to pull the data out of the enum objects to perform actions on that data. While this is the (only) way in procedural programming, in the object-oriented world, we can do better. By declaring the "helper" methods right in the enum declaration itself, we move the code to where the data is. Or, to put it in OOP lingo, the helper methods become the "behavior" of the test fixtures implemented as enums. This would not only make the code more idiomatic (calling sensible methods on instances over static methods passing data around), but it would also make it easier to reuse enum parameters across test cases.

Inheritance

Enums can implement interfaces with (default) methods. When used sensibly, this can be leveraged to share behavior between several data providers — several enums. An example that easily comes to mind is separate enums for positive and negative tests. If they represent a similar kind of test fixture, chances are they have some behavior to share. (A hedged sketch of this idea appears at the end of this article.)

The Talk Is Cheap

Let's illustrate this with a test suite for a hypothetical converter of source code files, not quite unlike the one performing Python 2 to 3 conversion. To have real confidence in what such a comprehensive tool does, one would end up with an extensive set of input files manifesting various aspects of the language, and matching files to compare the conversion result against. Beyond that, one needs to verify what warnings/errors are served to the user for problematic inputs. This is a natural fit for parameterized tests due to the large number of samples to verify, but it does not quite fit any of the simple JUnit parameter sources, as the data is somewhat complex. See below:

```java
enum Conversion {
    CLEAN("imports-correct.2.py", "imports-correct.3.py", Set.of()),
    WARNINGS("problematic.2.py", "problematic.3.py", Set.of(
        "Using module 'xyz' that is deprecated"
    )),
    SYNTAX_ERROR("syntax-error.py", new RuntimeException("Syntax error on line 17"));
    // Many, many others ...

    @Nonnull
    final String inFile;
    @CheckForNull
    final String expectedOutput;
    @CheckForNull
    final Exception expectedException;
    @Nonnull
    final Set<String> expectedWarnings;

    Conversion(@Nonnull String inFile, @Nonnull String expectedOutput,
               @Nonnull Set<String> expectedWarnings) {
        this(inFile, expectedOutput, null, expectedWarnings);
    }

    Conversion(@Nonnull String inFile, @Nonnull Exception expectedException) {
        this(inFile, null, expectedException, Set.of());
    }

    Conversion(@Nonnull String inFile, String expectedOutput,
               Exception expectedException, @Nonnull Set<String> expectedWarnings) {
        this.inFile = inFile;
        this.expectedOutput = expectedOutput;
        this.expectedException = expectedException;
        this.expectedWarnings = expectedWarnings;
    }

    public File getV2File() { ... }

    public File getV3File() { ... }
}

@ParameterizedTest
@EnumSource
void upgrade(Conversion con) {
    try {
        File actual = convert(con.getV2File());
        if (con.expectedException != null) {
            fail("No exception thrown when one was expected", con.expectedException);
        }
        assertEquals(con.expectedWarnings, getLoggedWarnings());
        new FileAssert(actual).isEqualTo(con.getV3File());
    } catch (Exception ex) {
        assertTypeAndMessageEquals(con.expectedException, ex);
    }
}
```

The usage of enums does not restrict how complex the data can be. As you can see, we can define several convenient constructors in the enum, so declaring new parameter sets is nice and clean. This prevents the usage of long argument lists that often end up filled with many "empty" values (nulls, empty strings, or collections) that leave one wondering what argument #7 — you know, one of the nulls — actually represents. Notice how enums enable the use of complex types (Set, RuntimeException) with no restrictions or magical conversions. Passing such data is also completely type-safe.

Now, I know what you think. This is awfully wordy. Well, up to a point. Realistically, you are going to have a lot more data samples to verify, so the amount of boilerplate code will be less significant in comparison. Also, see how related tests can be written leveraging the same enums and their helper methods:

```java
@ParameterizedTest
@EnumSource
// Upgrading files already upgraded always passes, makes no changes, issues no warnings.
void upgradeFromV3toV3AlwaysPasses(Conversion con) throws Exception {
    File actual = convert(con.getV3File());
    assertEquals(Set.of(), getLoggedWarnings());
    new FileAssert(actual).isEqualTo(con.getV3File());
}

@ParameterizedTest
@EnumSource
// Downgrading files created by the upgrade procedure is expected to always pass without warnings.
void downgrade(Conversion con) throws Exception {
    File actual = convert(con.getV3File());
    assertEquals(Set.of(), getLoggedWarnings());
    new FileAssert(actual).isEqualTo(con.getV2File());
}
```

Some More Talk After All

Conceptually, @EnumSource encourages you to create a complex, machine-readable description of individual test scenarios, blurring the line between data providers and test fixtures. One other great thing about having each data set expressed as a Java symbol (an enum element) is that they can be used individually, completely outside of data providers/parameterized tests. Since they have a reasonable name and they are self-contained (in terms of data and behavior), they contribute to nice and readable tests.
```java
@Test
void warnWhenNoEventsReported() throws Exception {
    FixtureXmls.Invalid events = FixtureXmls.Invalid.NO_EVENTS_REPORTED;
    // read() is a helper method that is shared by all FixtureXmls
    try (InputStream is = events.read()) {
        EventList el = consume(is);
        assertEquals(Set.of(...), el.getWarnings());
    }
}
```

Now, @EnumSource is not going to be one of your most frequently used argument sources, and that is a good thing, as overusing it would do no good. But in the right circumstances, it comes in handy to know how to use all it has to offer.
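As promised, here is a hedged sketch of the inheritance idea: sharing a read() helper between positive and negative fixture enums through an interface with a default method. The type and constant names are illustrative, loosely echoing the FixtureXmls example above rather than reproducing real code:

```java
import java.io.InputStream;

/** Behavior shared by every file-based fixture, valid or invalid. */
interface FixtureFile {
    String fileName();

    /** Opens the fixture from the test classpath (shared default behavior). */
    default InputStream read() {
        return FixtureFile.class.getResourceAsStream("/" + fileName());
    }
}

/** Parameter sets for positive tests. */
enum ValidXml implements FixtureFile {
    MINIMAL("minimal.xml"),
    FULL("full.xml");

    private final String fileName;

    ValidXml(String fileName) { this.fileName = fileName; }

    @Override
    public String fileName() { return fileName; }
}

/** Parameter sets for negative tests, reusing read() from the interface. */
enum InvalidXml implements FixtureFile {
    TRUNCATED("truncated.xml"),
    NO_EVENTS_REPORTED("no-events.xml");

    private final String fileName;

    InvalidXml(String fileName) { this.fileName = fileName; }

    @Override
    public String fileName() { return fileName; }
}
```

Each enum can then back its own @EnumSource-parameterized tests while both share the same loading behavior, with no static helper class in sight.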
"The Mixtral-8x7B Large Language Model (LLM) is a pre-trained generative Sparse Mixture of Experts." When I saw this come out, it seemed pretty interesting and accessible, so I gave it a try. With the proper prompting, it seems good. I am not sure if it's better than Google Gemma, Meta Llama 2, or Ollama Mistral for my use cases.

Today I will show you how to utilize the new Mixtral LLM with Apache NiFi. It will require only a few steps to run Mixtral against your text inputs. This model can be run via the lightweight serverless REST API or the transformers library. You can also use this GitHub repository. The context can have up to 32k tokens, and you can enter prompts in English, Italian, German, Spanish, and French.

You have a lot of options for how to utilize this model, but I will show you how to build a real-time LLM pipeline utilizing Apache NiFi. One key thing to decide is what kind of input you are going to have (chat, code generation, Q&A, document analysis, summary, etc.). Once you have decided, you will need to do some prompt engineering and will need to tweak your prompt. In the following section, I include a few guides to help you improve your prompt-building skills. I will give you some basic prompt engineering in my walk-through tutorial.

Guides To Build Your Prompts Optimally

Mixtral: Prompt Engineering Guide
Getting Started with Mixtral 8X7B

The construction of the prompt is very critical to making this work well, so we are building it with NiFi.

Overview of the Flow

Step 1: Build and Format Your Prompt

In building our application, the following is the basic prompt template that we are going to use:

```plaintext
{
  "inputs": "<s>[INST]Write a detailed complete response that appropriately answers the request.[/INST] [INST]Use this information to enhance your answer: ${context:trim():replaceAll('"',''):replaceAll('\n', '')}[/INST] User: ${inputs:trim():replaceAll('"',''):replaceAll('\n', '')}</s>"
}
```

You will enter this prompt in a ReplaceText processor, in the Replacement Value field.

Step 2: Build Our Call to the HuggingFace REST API To Classify Against the Model

Add an InvokeHTTP processor to your flow, setting the HTTP URL to the Mixtral API URL. (For readers who want to reproduce this REST call outside NiFi, a hedged plain-Java sketch appears after the Resources section at the end of this article.)

Step 3: Query To Convert and Clean Your Results

We use the QueryRecord processor to clean and convert the HuggingFace results, grabbing the generated_text field.

Step 4: Add Metadata Fields

We use the UpdateRecord processor, with JSON readers and writers and the Literal Value Replacement Value Strategy, to add metadata fields. The values for these fields come from flow file attributes.

Overview of Send to Kafka and Slack

Step 5: Add Metadata to Stream

We use the UpdateAttribute processor to set the correct Content Type (application/json) and set the model type to Mixtral.

Step 6: Publish This Cleaned Record to a Kafka Topic

We send it to our local Kafka broker (could be Docker or another) and to our flank-mixtral8x7B topic. If the topic doesn't exist, NiFi and Kafka will automagically create it for you.

Step 7: Retry the Send

If something goes wrong, we will try to resend three times, then fail.

Overview of Pushing Data to Slack

Step 8: Send the Same Data to Slack for User Reply

The first step is to split the results into single records so we can send one at a time. We use the SplitRecord processor for this. As before, reuse the JSON Tree Reader and JSON Record Set Writer, and choose "1" as the Records Per Split.

Step 9: Make the Generated Text Available for Messaging

We utilize EvaluateJsonPath to extract the generated text from Mixtral (on HuggingFace).
Step 10: Send the Reply to Slack

We use the PublishSlack processor, which is new in Apache NiFi 2.0. It requires your channel name or channel ID. We choose the Publish Strategy of Use 'Message Text' Property. For Message Text, use the Slack response template below.

For the final reply to the user, we will need a Slack response template formatted for how we wish to communicate. Below is an example that covers the basics.

Slack Response Template

```plaintext
===============================================================================================================
HuggingFace ${modelinformation} Results on ${date}:

Question: ${inputs}

Answer: ${generated_text}

=========================================== Data for nerds ====

HF URL: ${invokehttp.request.url}
TXID: ${invokehttp.tx.id}

== Slack Message Meta Data ==
ID: ${messageid}
Name: ${messagerealname} [${messageusername}]
Time Zone: ${messageusertz}

== HF ${modelinformation} Meta Data ==
Compute Characters/Time/Type: ${x-compute-characters} / ${x-compute-time} / ${x-compute-type}
Generated/Prompt Tokens/Time per Token: ${x-generated-tokens} / ${x-prompt-tokens} : ${x-time-per-token}
Inference Time: ${x-inference-time} // Queue Time: ${x-queue-time}
Request ID/SHA: ${x-request-id} / ${x-sha}
Validation/Total Time: ${x-validation-time} / ${x-total-time}
===============================================================================================================
```

When this is run, it will look like the image below in Slack. You have now sent a prompt to HuggingFace, had it run against Mixtral, sent the results to Kafka, and responded to the user via Slack. We have now completed a full Mixtral application with zero code.

Conclusion

You have now built a full round trip utilizing Apache NiFi, HuggingFace, and Slack to build a chatbot utilizing the new Mixtral model.

Summary of Learnings

Learned how to build a decent prompt for HuggingFace Mixtral
Learned how to clean up streaming data
Built a HuggingFace REST call that can be reused
Processed HuggingFace model call results
Sent your first Kafka message
Formatted and built Slack calls
Built a full DataFlow for GenAI

If you need additional tutorials on utilizing the new Apache NiFi 2.0, check out: Apache NiFi 2.0.0-M2 Out!

For additional information on building Slack bots:

Building a Real-Time Slackbot With Generative AI
Building an LLM Bot for Meetups and Conference Interactivity

Also, thanks for following my tutorial. I am working on additional Apache NiFi 2 and Generative AI tutorials that will be coming to DZone. Finally, if you are in Princeton, Philadelphia, or New York City, please come out to my meetups for in-person hands-on work with these technologies.

Resources

Mixtral of Experts
Mixture of Experts Explained
mistralai/Mixtral-8x7B-v0.1
Mixtral Overview
Invoke the Mixtral 8x7B model on Amazon Bedrock for text generation
Running Mixtral 8x7b on M1 16GB
Mixtral-8x7B: Understanding and Running the Sparse Mixture of Experts by Mistral AI
Retro-Engineering a Database Schema: Mistral Models vs. GPT4, LLama2, and Bard (Episode 3)
Comparison of Models: Quality, Performance & Price Analysis
A Beginner's Guide to Fine-Tuning Mixtral Instruct Model
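As referenced in Step 2, here is a hedged, minimal plain-Java sketch of the HuggingFace serverless inference call that the InvokeHTTP processor performs in the flow. The endpoint follows HuggingFace's standard serverless inference URL pattern; the HF_TOKEN environment variable and the hardcoded prompt are assumptions for illustration, not part of the NiFi flow itself:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class MixtralCall {
    public static void main(String[] args) throws Exception {
        // Serverless inference endpoint for the model (standard HF URL pattern)
        String url = "https://api-inference.huggingface.co/models/mistralai/Mixtral-8x7B-Instruct-v0.1";
        // Same shape as the NiFi prompt template, with the variables already substituted
        String body = "{\"inputs\": \"<s>[INST]Write a detailed complete response that "
                + "appropriately answers the request.[/INST] User: What is Apache NiFi?</s>\"}";

        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Bearer " + System.getenv("HF_TOKEN")) // your HF API token
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        // The response is a JSON array whose first element carries "generated_text"
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```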
Why Do Organizations Need Secure Development Environments?

The need to secure corporate IT environments is common to all functions of organizations, and software application development is one of them. At its core, the need for securing IT environments in organizations arises from the digital corporate assets that they carry. These are often data attached to privacy concerns, typically under regulations such as GDPR or HIPAA, or application source code, credentials, and, most recently, operational data that can have strategic significance. Threat scenarios attached to corporate data are not only about data leaking to outsiders, but also about insiders with nefarious intent exfiltrating it. Hence the security problem is multifaceted: it spans from careless asset handling to willful mishandling.

In the case of environments for software application development, the complexity of the security problem lies in addressing the diversity of these environments' settings. They range from data access needs and environment configuration to the developer's relationship with the company; e.g., internal employee, consultant, temporary worker, etc. Security left aside, development environments have notoriously complex setups and often require significant maintenance because many applications and data are locally present on the device's internal storage; for example, the integrated development environment (IDE) and the application's source code. Hence, for these environments, data protection against leaks will target locally stored assets such as source code, credentials, and potentially sensitive data.

Assessing the Risk of Locally Stored Data

Let's first take a quick step back in ICT history and look at an oft-cited 2010 benchmark study named "The Billion Dollar Lost Laptop Problem." The study looks at 329 organizations over 12 months and reports that over 86,000 laptops were stolen or lost, resulting in a loss of 2.1 billion USD, an average of 6.4 million USD per organization. In 2010, the use of the cloud as a storage medium for corporate data was nascent; hence, today, the metrics to determine the cost and impact of the loss of a corporate laptop would likely look very different. For example, for many of the business functions that were likely to be impacted at that time, cloud applications have since brought a solution by removing sensitive data from employees' laptops. This has mostly shifted the discussion on laptop security to protecting the credentials required to access cloud (or self-hosted) business resources, rather than protecting locally stored data itself.

Figure 1: In 2024, most business productivity data has already moved to the cloud. Back in the 2010s, a notable move was CRM data, which ended up greatly reducing the risk of corporate data leaks.

There is, though, a notable exception to the above shift in technology: the environments used for code development. For practical reasons, devices used for development today have a replica of projects' source code, in addition to corporate secrets such as credentials, web tokens, and cryptographic keys, and perhaps strategic data used to train machine learning models or to test algorithms. In other words, there is still plenty of interesting data stored locally in development environments that warrants protection against loss or theft. Therefore, the interest in securing development environments has not waned.
There are a variety of reasons for malicious actors to go after assets in these environments, from accessing corporate intellectual property (see the hack of Grand Theft Auto 6) to understanding the existing vulnerabilities of an application in order to compromise it in operation. Once compromised, the application might provide access to sensitive data such as personal user information, including credit card numbers. See, for example, the source code hack at Samsung. The final intent here is, again, to leak potentially sensitive or personal data. Recent and notorious hacks of this kind hit password manager company LastPass and, in early 2024, Mercedes.

Despite all these potential downfalls resulting from the hacking of a single developer's environment, few companies today can accurately determine where the replicas of their source code, secrets, and data are (hint: likely all over the devices of their distributed workforce), and they are poorly shielded against the loss of a laptop or a looming insider threat. Recall that using an online or self-hosted source code repository such as GitHub does not get rid of any of the replicas in developers' environments. This is because local replicas are needed for developers to update the code before sending it back to the online Git repository. Hence, protecting these environments is a problem that grows with the number of developers working in the organization.

Use Cases for Virtual Desktops and Secure Developer Laptops

The desire to remove data from developers' environments is prevalent across many regulated industries such as finance and insurance. One of the most common approaches is the use of development machines accessed remotely. Citrix and VMware have been key actors in this market by enabling developers to remotely access virtual machines hosted by the organization. In addition, these platforms implement data loss prevention mechanisms that monitor user activities to prevent data exfiltration.

Figure 2: Left - Developers remotely access virtual machines hosted by the organization. Right - Virtualization has evolved from emulating machines to processes, which is used as a staple for DevOps.

Running and accessing a virtual machine remotely for development has many drawbacks, in particular for the developer's productivity. One reason is that the streaming mechanism used to access the remote desktop requires significant bandwidth to be truly usable and often results in irritating lags when typing code. The entire apparatus is also complex to set up, as well as costly to maintain and operate for the organization. In particular, a virtual machine is quite a heavy mechanism that requires significant computational resources (hence cost) to run. Finally, such a setup is general-purpose; i.e., it is not designed specifically for code development and requires the installation of the entire development tool suite.

For the reasons explained above, many organizations have reverted to securing developer laptops using endpoint security mechanisms implementing data loss prevention measures. As with the VDI counterpart, this is also often a costly solution because such laptops have complex setups. When onboarding remote development teams, organizations often send these laptops through the mail at great expense, which complicates the maintenance and monitoring process.
The Case for Secure Cloud Development Environments

Recently, virtualization has evolved from emulating entire machines to the granularity of single processes, with the technology of software containers. Containers are well suited for code development because they provide a minimal and sufficient environment to compile typical applications, in particular web-based ones. Notably, in comparison to virtual machines, containers start in seconds instead of minutes and require far fewer computational resources to execute. Containers are typically tools used locally by developers on their devices to isolate the software dependencies related to a specific project, so that the source code can be compiled and executed without interference from potentially unwanted settings.

The great thing about containers is that they don't have to remain a locally used development tool. They can be run online and used as an alternative to a virtual machine. This is the basic mechanism used to implement a Cloud Development Environment (CDE).

Figure 3: Containers can be run online and become a lightweight alternative to a virtual machine. This is the basic mechanism to implement a Cloud Development Environment.

Running containers online has been one of the most exciting recent trends in virtualization, aligned with DevOps practices where containers are critical to enable efficient testing and deployments. CDEs are accessed online with an IDE via a network connection (Microsoft Visual Studio Code has such a feature, as explained here) or using a Cloud IDE (an IDE running in a web browser, such as Microsoft Visual Studio Code, Eclipse Theia, and others). A Cloud IDE allows a developer to access a CDE with the benefit that no environment needs to be installed on the local device. Access to the remote container is done transparently. Compared to a remotely executing desktop, as explained before, the discomfort of a streamed environment does not apply here, since the IDE executes as a web application in the browser. Hence, the developer will not suffer display lags, particularly in low-bandwidth environments, as is the case with VDI and DaaS. Bandwidth requirements between the IDE and the CDE are low because only text information is exchanged between the two.

Figure 4: Access to the remote container is done with an IDE running in a web browser; hence, developers will not suffer display lags, particularly in low bandwidth environments.

As a result, in the specific context of application development, the use of CDEs is a lightweight mechanism to remove development data from local devices. However, this still does not achieve the security delivered by Citrix and other VDI platforms, because CDEs are designed for efficiency, not for security. They do not provide any data loss prevention mechanism. This is where the case for secure Cloud Development Environments lies: CDEs with data loss prevention provide a lightweight alternative to VDI or secure development laptops, with the additional benefit of an improved developer experience. The resulting platform is a secure Cloud Development platform. Using such a platform, organizations can significantly reduce the cost of provisioning secure development environments for their developers.

Figure 5: To become a replacement for VDIs or secure laptops, Cloud Development Environments need to include security measures against data leaks.
Moving From Virtual Desktops to Secure Cloud Development Environments

As a conclusion to this discussion, below I briefly retrace the different steps that build the case for a secure cloud-based development platform, one that combines the efficient infrastructure of CDEs with end-to-end protection against data exfiltration, leading to a secure CDE.

Initially, secure developer laptops were used to directly access corporate resources, sometimes using a VPN when outside the IT perimeter. According to the benchmark study mentioned at the beginning of this article, 41% of laptops routinely contained sensitive data. Then, the use of virtual machines and early access to web applications allowed organizations to remove data from local laptop storage. But code development on remote virtual machines was, and remains, strenuous. Recently, the use of lightweight virtualization based on containers has allowed quicker access to online development environments, but current vendors in this space do not address data security, since their primary use case is productivity.

Figure 6: A representation of the technological evolution of mechanisms used by organizations to provision secure development environments across the last decade.

Finally, a secure Cloud Development Environment platform (shown in the rightmost part of the figure) illustrates the closest incarnation of the secure development laptop. Secure CDEs benefit from the experiences of pioneering companies like Citrix, seizing the chance to separate development environments from traditional hardware. This separation allows for a blend of infrastructure efficiency and security without compromising the developers' experience.
As a developer, you may encounter situations where your application's database must handle large amounts of data. One way to manage this data effectively is through database sharding, a technique that distributes data horizontally across multiple servers or databases. Sharding can improve performance, scalability, and reliability by breaking up a large database into smaller, more manageable pieces called shards. In this article, we'll explore the concept of database sharding, discuss various sharding strategies, and provide a step-by-step guide to implementing sharding in MongoDB, a popular NoSQL database.

Understanding Database Sharding

Database sharding involves partitioning a large dataset into smaller subsets called shards. Each shard contains a portion of the total data and operates independently from the others. By executing queries and transactions on a single shard rather than the entire dataset, response times are faster, and resources are utilized more efficiently.

Sharding Strategies

There are several sharding strategies to choose from, depending on your application's requirements:

Range-based sharding: Data is partitioned based on a specific range of values (e.g., users with IDs 1-1000 in Shard 1, users with IDs 1001-2000 in Shard 2).
Hash-based sharding: A hash function is applied to a specific attribute (e.g., user ID), and the result determines which shard the data belongs to. This method ensures a balanced distribution of data across shards.
Directory-based sharding: A separate lookup service or table is used to determine which shard a piece of data belongs to. This approach provides flexibility in adding or removing shards but may introduce an additional layer of complexity.
Geolocation-based sharding: Data is partitioned based on the geographical location of the users or resources, reducing latency for geographically distributed users.

Implementing Sharding in MongoDB

MongoDB supports sharding out of the box, making it a great choice for developers looking to implement sharding in their applications. Here's a step-by-step guide to setting up sharding in MongoDB. We will use the MongoDB shell, which uses JavaScript syntax for writing commands and interacting with the database.

1. Set Up a Config Server

The config server stores metadata about the cluster and shard locations. For production environments, use a replica set of three config servers.

```shell
mongod --configsvr --dbpath /data/configdb --port 27019 --replSet configReplSet
```

2. Initialize the Config Server Replica Set

This command initiates a new replica set on the MongoDB instance running on port 27019.

```shell
mongo --port 27019
> rs.initiate()
```

3. Set Up Shard Servers

Start each shard server with the --shardsvr option and a unique --dbpath.

```shell
mongod --shardsvr --dbpath /data/shard1 --port 27018
mongod --shardsvr --dbpath /data/shard2 --port 27017
```

4. Start the mongos Process

The mongos process acts as a router between clients and the sharded cluster.

```shell
mongos --configdb configReplSet/localhost:27019
```

5. Connect to the mongos Instance and Add the Shards

```shell
mongo
> sh.addShard("localhost:27018")
> sh.addShard("localhost:27017")
```

6. Enable Sharding for a Specific Database and Collection

```shell
> sh.enableSharding("myDatabase")
> sh.shardCollection("myDatabase.myCollection", {"userId": "hashed"})
```

In this example, we've set up a MongoDB sharded cluster with two shards and used hash-based sharding on the userId field. From the application's point of view, the cluster is reached through mongos just like a single MongoDB instance, as the hedged sketch below illustrates.
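The following minimal Java sketch uses the MongoDB synchronous Java driver; the connection string, database, and document fields are assumptions to adapt to your own setup (in particular, point the URI at wherever your mongos router actually listens). It shows that application code targets mongos exactly as it would a standalone server, with the shard routing handled transparently:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class ShardedClusterDemo {
    public static void main(String[] args) {
        // Connect to mongos (adjust host/port to your router's address)
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> collection = client
                    .getDatabase("myDatabase")
                    .getCollection("myCollection");

            // Writes are routed to the right shard based on the hashed userId key
            collection.insertOne(new Document("userId", 42).append("name", "Alice"));

            // Reads that include the shard key can be served by a single shard
            Document doc = collection.find(new Document("userId", 42)).first();
            System.out.println(doc);
        }
    }
}
```

Queries that omit the shard key still work, but mongos must scatter them to every shard and gather the results, which is why including the shard key in frequent queries pays off.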
Now, data in the "myCollection" collection will be distributed across the two shards, improving performance and scalability.

Conclusion
Database sharding is an effective technique for managing large datasets in your application. By understanding different sharding strategies and implementing them using MongoDB, you can significantly improve your application's performance, scalability, and reliability. With this guide, you should now have a solid understanding of how to set up sharding in MongoDB and apply it to your own projects. Happy learning!
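Before moving on, it can be reassuring to confirm that the hashed shard key is actually spreading documents across both shards. Below is a minimal verification sketch in the same MongoDB shell; it assumes the cluster and the myDatabase.myCollection namespace from the example above, and that some documents have already been inserted:

Shell
mongo
> use myDatabase
> sh.status()                               // overview of shards, databases, and chunk ranges
> db.myCollection.getShardDistribution()    // per-shard document counts and data sizes

sh.status() prints the cluster topology, while getShardDistribution() reports how documents and data size are split per shard; under hashed sharding, the distribution should trend toward an even balance as data grows.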
In today's fast-paced digital landscape, DevOps has emerged as a critical methodology for organizations looking to streamline their software development and delivery processes. At the heart of DevOps lies the concept of collaboration between development and operations teams, enabled by a set of practices and tools aimed at automating and improving the efficiency of the software delivery lifecycle. One of the key enablers of DevOps practices is platform engineering. Platform engineers are responsible for designing, building, and maintaining the infrastructure and tools that support the development, deployment, and operation of software applications. In essence, they provide the foundation upon which DevOps practices can thrive.

The Foundations of Platform Engineering
Platform Engineering in the Context of DevOps
Platform engineering in the context of DevOps encompasses the practice of designing, building, and maintaining the underlying infrastructure, tools, and services that facilitate efficient software development processes. Platform engineers focus on creating a robust platform that provides developers with the necessary tools, services, and environments to streamline the software development lifecycle. Below are the key aspects, responsibilities, and objectives of platform engineering in DevOps:
- Infrastructure Management and Infrastructure as Code (IaC): Designing, building, and maintaining the infrastructure that supports software development, testing, and deployment; implementing Infrastructure as Code practices to manage infrastructure using code, enabling automated provisioning and management of resources
- Automation: Automating repetitive tasks such as builds, tests, deployments, and infrastructure provisioning to increase efficiency and reduce errors
- Tooling selection and management: Selecting, configuring, and managing the tools and technologies used throughout the software development lifecycle, including version control systems, CI/CD pipelines, and monitoring tools
- Containerization and orchestration: Utilizing containerization technologies like Docker and orchestration tools such as Kubernetes to create scalable and portable environments for applications
- Continuous Integration and Continuous Deployment (CI/CD) pipelines: Designing, implementing, and maintaining CI/CD pipelines to automate the build, test, and deployment processes, enabling rapid and reliable software delivery
- Observability: Implementing monitoring and logging solutions to track the performance, health, and behavior of applications and infrastructure, enabling quick detection and resolution of issues
- Security and compliance: Ensuring that the platform adheres to security best practices and complies with relevant regulations and standards, such as GDPR or PCI DSS
- Scalability and resilience: Designing the platform to be scalable and resilient, capable of handling increasing loads and recovering from failures gracefully
- Collaboration and communication: Facilitating collaboration between development, operations, and other teams to streamline workflows and improve communication for enhanced productivity

Overall, the primary objective of platform engineering is to establish and maintain a comprehensive platform that empowers development teams to deliver high-quality software efficiently. This involves ensuring the platform's security, scalability, compliance, and reliability while leveraging automation and modern tooling to optimize the software development lifecycle within a DevOps framework.
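To make the Infrastructure as Code aspect above concrete, here is a minimal, hypothetical Terraform sketch; the resource name, region, and AMI ID are illustrative placeholders rather than anything prescribed by this article. It declares a CI build agent as code, so the machine can be version-controlled and provisioned automatically:

HCL
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1" # placeholder region
}

# A single CI build agent declared as code instead of provisioned by hand
resource "aws_instance" "ci_build_agent" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.micro"

  tags = {
    Role      = "ci-build-agent"
    ManagedBy = "terraform"
  }
}

Because this definition lives in version control, changes go through review like any other code, and terraform plan / terraform apply make provisioning repeatable across environments.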
Stand-Alone DevOps vs Platform-Enabled DevOps Model

| Characteristic | Stand-Alone DevOps Model | Platform-Enabled DevOps Model |
| --- | --- | --- |
| Infrastructure management | Each team manages its own infrastructure independently. | Infrastructure is managed centrally and shared across teams. |
| Tooling | Teams select and manage their own tools and technologies. | Common tools and technologies are provided and managed centrally. |
| Standardization | Limited standardization across teams, leading to variation | Standardization of tools, processes, and environments |
| Collaboration | Teams work independently, with limited collaboration. | Encourages collaboration and sharing of best practices |
| Scalability | Limited scalability due to disparate and manual processes | Easier scalability through shared and automated processes |
| Efficiency | May lead to inefficiencies due to duplication of efforts | Promotes efficiency through shared resources and automation |
| Flexibility | More flexibility in tool and process selection for each team | Requires adherence to standardized processes and tools |
| Management overhead | Higher management overhead due to disparate processes | Lower management overhead with centralized management |
| Learning curve | Each team must learn and manage their chosen tools. | Teams can focus more on application development and less on tool management. |
| Costs | Costs may be higher due to duplication of infrastructure. | Costs can be optimized through shared infrastructure and tools. |

These characteristics highlight the differences between the Stand-Alone DevOps model, where teams operate independently, and the Platform-Enabled DevOps model, where a centralized platform provides tools and infrastructure for teams to collaborate and work more efficiently.

Infrastructure as Code (IaC)
Infrastructure as Code (IaC) plays a crucial role in platform engineering and DevOps by providing a systematic approach to managing and provisioning infrastructure. It allows teams to define their infrastructure using code, which can be version-controlled, tested, and deployed in a repeatable and automated manner. This ensures consistency across environments, reduces the risk of configuration errors, and increases the speed and reliability of infrastructure deployment. IaC also promotes collaboration between development, operations, and other teams by enabling them to work together on infrastructure configurations. By treating infrastructure as code, organizations can achieve greater efficiency, scalability, and agility in their infrastructure management practices, ultimately leading to improved software delivery and operational excellence.

Automation and Orchestration
Automation and orchestration are foundational to DevOps, providing the framework and tools to streamline and optimize the software development lifecycle. Automation eliminates manual, repetitive tasks, reducing errors and increasing efficiency. It accelerates the delivery of software by automating build, test, and deployment processes, enabling organizations to release new features and updates quickly and reliably. Orchestration complements automation by coordinating and managing complex workflows across different teams and technologies. Together, automation and orchestration improve collaboration, scalability, and reliability, ultimately helping organizations deliver better software faster and more efficiently. Platform engineers perform various automation and orchestration tasks to streamline the software development lifecycle and manage infrastructure efficiently.
Here are some examples:
- Infrastructure provisioning: Using tools like Terraform or AWS CloudFormation, platform engineers automate the provisioning of infrastructure components such as virtual machines, networks, and storage.
- Configuration management: Tools like Ansible, Chef, or Puppet are used to automate the configuration of servers and applications, ensuring consistency and reducing manual effort.
- Continuous Integration/Continuous Deployment (CI/CD): Platform engineers design and maintain CI/CD pipelines using tools like Jenkins, GitLab CI/CD, or CircleCI to automate the build, test, and deployment processes.
- Container orchestration: Platform engineers use Kubernetes, Docker Swarm, or similar tools to orchestrate the deployment and management of containers, ensuring scalability and high availability.
- Monitoring and alerting: Automation is used to set up monitoring and alerting systems such as Prometheus, Grafana, or the ELK stack to monitor the health and performance of infrastructure and applications.
- Scaling and auto-scaling: Platform engineers automate the scaling of infrastructure based on demand using tools provided by cloud providers or custom scripts.
- Backup and disaster recovery: Automation is used to set up and manage backup and disaster recovery processes, ensuring data integrity and availability.
- Security automation: Platform engineers automate security tasks such as vulnerability scanning, patch management, and access control to enhance the security posture of the infrastructure.
- Compliance automation: Tools are used to automate compliance checks and audits to ensure that infrastructure and applications comply with regulatory requirements and internal policies.
- Self-service portals: Platform engineers create self-service portals or APIs that allow developers to provision resources and deploy applications without manual intervention.

These examples illustrate how platform engineers leverage automation and orchestration to improve efficiency, reliability, and scalability in managing infrastructure and supporting the software development lifecycle.

Implementing and Managing Containers and Microservices
Platform engineers play a crucial role in implementing and managing containers and microservices, which are key components of modern, cloud-native applications. They are responsible for designing the infrastructure and systems that support containerized environments, including selecting the appropriate container orchestration platform (such as Kubernetes) and ensuring its proper configuration and scalability. Platform engineers also work closely with development teams to define best practices for containerization and microservices architecture, including image management, networking, and service discovery. They are responsible for monitoring the health and performance of containerized applications, implementing automated scaling and recovery mechanisms, and ensuring that containers and microservices are deployed securely and in compliance with organizational standards.

Manage CI/CD Pipelines
Platform engineers design and manage CI/CD pipelines to automate the build, test, and deployment processes, enabling teams to deliver software quickly and reliably. Here's how they typically do it (a minimal pipeline sketch follows the conclusion of this article):
- Pipeline design: Platform engineers design CI/CD pipelines to meet the specific needs of the organization, including defining the stages of the pipeline (such as build, test, and deploy) and the tools and technologies to be used at each stage.
- Integration with version control: They integrate the CI/CD pipeline with version control systems (such as Git) to trigger automated builds and deployments based on code changes.
- Build automation: Platform engineers automate the process of building software artifacts (such as executables or container images) using build tools (such as Jenkins, GitLab CI/CD, or CircleCI).
- Testing automation: They automate the execution of tests (such as unit tests, integration tests, and performance tests) to ensure that code changes meet quality standards before deployment.
- Deployment automation: Platform engineers automate the deployment of software artifacts to various environments (such as development, staging, and production) using deployment tools (such as Kubernetes, Docker, or AWS CodeDeploy).
- Monitoring and feedback: They integrate monitoring and logging tools into the CI/CD pipeline to provide feedback on the health and performance of deployed applications, enabling teams to quickly detect and respond to issues.
- Security and compliance: Platform engineers ensure that the CI/CD pipeline adheres to security and compliance requirements, such as scanning for vulnerabilities in dependencies and enforcing access controls.

Monitoring and Logging Solutions
Platform engineers implement and maintain monitoring and logging solutions to ensure the health, performance, and security of applications and infrastructure. They select and configure monitoring tools (such as Prometheus, Grafana, or ELK stack) to collect and visualize metrics, logs, and traces from various sources. Platform engineers set up alerting mechanisms to notify teams about issues or anomalies in real time, enabling them to respond quickly. They also design and maintain logging solutions to centralize logs from different services and applications, making it easier to troubleshoot issues and analyze trends. Platform engineers continuously optimize monitoring and logging configurations to improve performance, reduce noise, and ensure compliance with organizational policies and standards.

Role of Platform Engineers in Ensuring Security and Compliance
Platform engineers play a vital role in ensuring security and compliance in a DevOps environment by implementing and maintaining robust security practices and controls. They are responsible for designing secure infrastructure and environments, implementing security best practices, and ensuring compliance with regulatory requirements and industry standards. Platform engineers configure and manage security tools and technologies, such as firewalls, intrusion detection systems, and vulnerability scanners, to protect against threats and vulnerabilities. They also work closely with development and operations teams to integrate security into the software development lifecycle, including secure coding practices, regular security testing, and security training. Additionally, platform engineers monitor and audit infrastructure and applications for compliance with internal security policies and external regulations, taking proactive measures to address any issues that arise.

Conclusion
Platform engineering is a cornerstone of DevOps, providing the foundation upon which organizations can build efficient, scalable, and reliable software delivery pipelines. By designing and managing the infrastructure, tools, and processes that enable DevOps practices, platform engineers empower development teams to focus on delivering value to customers quickly and efficiently.
Through automation, standardization, and collaboration, platform engineering drives continuous improvement and innovation, helping organizations stay competitive in today's fast-paced digital landscape. As organizations continue to embrace DevOps principles, the role of platform engineering will only become more critical, ensuring that they can adapt and thrive in an ever-changing environment.
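As promised above, here is a minimal, hypothetical GitLab CI/CD pipeline sketch illustrating the build, test, and deploy stages that platform engineers typically design. The job names, make targets, and branch rule are illustrative placeholders, not something prescribed by this article:

YAML
stages:
  - build
  - test
  - deploy

build-job:
  stage: build
  script:
    - echo "Compiling the application..."
    - make build          # placeholder build command

test-job:
  stage: test
  script:
    - echo "Running the test suite..."
    - make test           # placeholder test command

deploy-job:
  stage: deploy
  script:
    - echo "Deploying to staging..."
    - make deploy         # placeholder deploy command
  environment: staging
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'   # deploy only from the main branch

A pipeline like this runs automatically on every push, which is exactly the kind of repeatable, hands-off workflow described above.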
Moving from coding wizard to the captain of the tech ship is no walk in the park. I've been on this wild ride for over 12 years, climbing from a newbie manager to calling the shots in senior management. It's more than just leveling up your tech game; it's a total mindset overhaul, a dance with the bigger picture, and a crash course in bossing up. This shift is a rollercoaster, demanding some serious soul-searching, skill buffing, and a promise to keep learning as the world keeps changing. This article spills the beans on the ins and outs of this evolution, dishing out a no-nonsense guide for anyone itching to ride this transformation wave.

Introduction
Swapping the coding grind for a leadership opportunity is a game-changer in your career. It's not just a job switch; it's a full-blown makeover that needs you to see the whole picture. Beyond just knowing your way around tech stuff, leading the tech pack needs a flair for teamwork, a knack for sparking innovation, and a hunger for hitting goals together. In the next bits, we're diving into the nitty-gritty, giving you the lowdown for those gunning to jump from solo coder to tech leader.

Soul-Searching and Skill Buffing
1. Soul-Searching
Start this ride by looking at your career goals, strengths, and spots that need some TLC. Check if you're genuinely into guiding and helping others. Leadership is more than just tech smarts; it's about teaming up, pushing boundaries, and scoring wins together.
2. Skill Buffing
Grabbing and polishing the skills you need to lead is the name of the game. Talking well, thinking big, making calls, and handling a crew are your building blocks. Get your hands dirty with projects that let you flex these skills, even before you officially make the jump. Leading isn't just learned from a book; it's a hands-on experience.

Plotting Your Career Path
3. Mentor Magic
Tap the wisdom of senior leaders in your company or your field. A mentor's tips can be gold, giving you the lowdown on where your career can go and being your go-to for bouncing around ideas. Learning from experienced hands can speed up your growth and keep you from tripping on common hurdles.
4. Leadership Bootcamp
Sign up for leadership bootcamps, workshops, and shindigs. Lots of companies have in-house workshops to whip up future leaders. These workshops teach you the A to Z of leadership, spill the beans on what works, and throw in real-life stories that show you the ropes.
5. Project Commander
Throw your hat in the ring to lead projects or big moves in your current organization. Taking charge, wrangling teams, and showing off your leadership chops shouts loud and clear that you're game for the next level. Go looking for chances to show off your leadership game, even before you've got the official title.

Elevating Your Leadership Swagger
6. Spotlight Moments and Handshakes
Be front and center at industry events, conferences, and other places where senior leaders hang out. Building a network can open doors to leadership opportunities and let you see how different leaders operate. Join the scene where you can learn from the pros, swap stories with peers, and stay in the loop about what's hot in the tech town.
7. Grab That Leadership Role
When you're really set on this path, go all in for leadership roles. That might mean chasing promotions in-house or sniffing around for opportunities that match your career dreams. Tweak your resume to show off the good stuff that screams leadership potential.
Be ready to talk up your vision for leadership when the interviews roll in.
8. Always Tweak, Always Grow
The journey doesn't hit the brakes when you grab that leadership spot. Keep an eye out for chances to get better, soak up feedback, and adjust your leadership style as your team and company shake things up. Leading is a never-ending lesson, and being able to roll with the punches is key for sticking around at the top.

Riding the Leadership Wave
Switching from coding guru to tech leader is a wild ride that needs guts, some serious self-reflection, and a promise to keep growing. Every win or faceplant adds up to making you a leader worth following. Embrace the twists, learn from the faceplants, and savor the rush of guiding a team to victory. As you roll into this leadership adventure, remember that going from code jockey to tech leader is different for everyone. Sure, there are some rules and steps, but your journey's shaped by your story, your company culture, and the buzz in your team. Stay locked on your goals, stay flexible, and enjoy the ride of becoming a leader who gets tech and pushes others to win too.

Conclusion
Yeah, it's a bumpy ride going from coder to tech leader, but the wins at the end are worth the grind. This guide spills the tea for those eyeing leadership spots in the tech game. Mix in some soul-searching, skill flexing, mentor chats, and a habit of always looking to improve, and you'll fast-track your tech leadership career. Remember, the journey's just as important as the finish line, and each step gets you closer to being the senior leader making waves.
In the dynamic landscape of microservices, managing communication and ensuring robust security and observability becomes a Herculean task. This is where Istio, a revolutionary service mesh, steps in, offering an elegant solution to these challenges. This article delves deep into the essence of Istio, illustrating its pivotal role in a Kubernetes (KIND) based environment, and guides you through a Helm-based installation process, ensuring a comprehensive understanding of Istio's capabilities and its impact on microservices architecture.

Introduction to Istio
Istio is an open-source service mesh that provides a uniform way to secure, connect, and monitor microservices. It simplifies configuration and management, offering powerful tools to handle traffic flows between services, enforce policies, and aggregate telemetry data, all without requiring changes to microservice code.

Why Istio?
In a microservices ecosystem, each service may be developed in a different programming language, have different versions, and require unique communication protocols. Istio provides a layer of infrastructure that abstracts these differences, enabling services to communicate with each other seamlessly. It introduces capabilities like:
- Traffic management: Advanced routing, load balancing, and fault injection
- Security: Robust ACLs, RBAC, and mutual TLS to ensure secure service-to-service communication
- Observability: Detailed metrics, logs, and traces for monitoring and troubleshooting

Setting Up a KIND-Based Kubernetes Cluster
Before diving into Istio, let's set up a Kubernetes cluster using KIND (Kubernetes IN Docker), a tool for running local Kubernetes clusters using Docker container "nodes." KIND is particularly suited for development and testing purposes.

# Install KIND
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.11.1/kind-$(uname)-amd64
chmod +x ./kind
mv ./kind /usr/local/bin/kind

# Create a cluster
kind create cluster --name istio-demo

This code snippet installs KIND and creates a new Kubernetes cluster named istio-demo. Ensure Docker is installed and running on your machine before executing these commands.

Helm-Based Installation of Istio
Helm, the package manager for Kubernetes, simplifies the deployment of complex applications. We'll use Helm to install Istio on our KIND cluster.

1. Install Helm
First, ensure Helm is installed on your system:

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh

2. Add the Istio Helm Repository
Add the Istio release repository to Helm:

helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update

3. Install Istio Using Helm
Now, let's install the Istio base chart, the istiod service, and the Istio Ingress Gateway:

# Install the Istio base chart
helm install istio-base istio/base -n istio-system --create-namespace

# Install the Istiod service
helm install istiod istio/istiod -n istio-system --wait

# Install the Istio Ingress Gateway
helm install istio-ingress istio/gateway -n istio-system

This sequence of commands sets up Istio on your Kubernetes cluster, creating a powerful platform for managing your microservices. To enable Istio sidecar injection for the target namespace, use the following command:
kubectl label namespace default istio-injection=enabled

Exploring Istio's Features
To demonstrate Istio's powerful capabilities in a microservices environment, let's use a practical example involving a Kubernetes cluster with Istio installed, and deploy a simple weather application. This application, running in the Docker container brainupgrade/weather-py, serves weather information. We'll illustrate how Istio can be utilized for traffic management, specifically demonstrating a canary release strategy, which is a method to roll out updates gradually to a small subset of users before rolling them out to the entire infrastructure.

Step 1: Deploy the Weather Application
First, let's deploy the initial version of our weather application using Kubernetes. We will deploy two versions of the application to simulate a canary release. Create a Kubernetes Deployment and Service for the weather application:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: weather-v1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: weather
      version: v1
  template:
    metadata:
      labels:
        app: weather
        version: v1
    spec:
      containers:
        - name: weather
          image: brainupgrade/weather-py:v1
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: weather-service
spec:
  ports:
    - port: 80
      name: http
  selector:
    app: weather

Apply this configuration with kubectl apply -f <file-name>.yaml.

Step 2: Enable Traffic Management With Istio
Now, let's use Istio to manage traffic to our weather application. We'll start by deploying a Gateway and a VirtualService to expose our application.

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: weather-gateway
spec:
  selector:
    istio: ingress
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: weather
spec:
  hosts:
    - "*"
  gateways:
    - weather-gateway
  http:
    - route:
        - destination:
            host: weather-service
            port:
              number: 80

This setup routes all traffic through the Istio Ingress Gateway to our weather-service.

Step 3: Implementing Canary Release
Let's assume we have a new version (v2) of our weather application that we want to roll out gradually. We'll adjust our Istio VirtualService to route a small percentage of the traffic to the new version.

1. Deploy version 2 of the weather application:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: weather-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: weather
      version: v2
  template:
    metadata:
      labels:
        app: weather
        version: v2
    spec:
      containers:
        - name: weather
          image: brainupgrade/weather-py:v2
          ports:
            - containerPort: 80

2. Adjust the Istio VirtualService to split traffic between v1 and v2:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: weather
spec:
  hosts:
    - "*"
  gateways:
    - weather-gateway
  http:
    - match:
        - uri:
            prefix: "/"
      route:
        - destination:
            host: weather-service
            port:
              number: 80
            subset: v1
          weight: 90
        - destination:
            host: weather-service
            port:
              number: 80
            subset: v2
          weight: 10

This configuration routes 90% of the traffic to version 1 of the application and 10% to version 2, implementing a basic canary release. For the v1 and v2 subsets to resolve, a DestinationRule is needed as well.
See the following:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: weather-service
  namespace: default
spec:
  host: weather-service
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2

This example illustrates how Istio enables sophisticated traffic management strategies like canary releases in a microservices environment. By leveraging Istio, developers can ensure that new versions of their applications are gradually and safely exposed to users, minimizing the risk of introducing issues. Istio's service mesh architecture provides a powerful toolset for managing microservices, enhancing both the reliability and flexibility of application deployments.

Istio and Kubernetes Services
Istio and Kubernetes Services are both crucial components in the cloud-native ecosystem, but they serve different purposes and operate at different layers of the stack. Understanding how Istio differs from Kubernetes Services is essential for architects and developers looking to build robust, scalable, and secure microservices architectures.

Kubernetes Services
Kubernetes Services are a fundamental part of Kubernetes, providing an abstract way to expose an application running on a set of Pods as a network service. With Kubernetes Services, you can utilize the following:
- Discoverability: Assign a stable IP address and DNS name to a group of Pods, making them discoverable within the cluster.
- Load balancing: Distribute network traffic or requests among the Pods that constitute a service, improving application scalability and availability.
- Abstraction: Decouple the front-end service from the back-end workloads, allowing back-end Pods to be replaced or scaled without reconfiguring the front-end clients.

Kubernetes Services focus on internal cluster communication, load balancing, and service discovery. They operate at the L4 (TCP/UDP) layer, primarily dealing with IP addresses and ports.

Istio Services
Istio, on the other hand, extends the capabilities of Kubernetes Services by providing a comprehensive service mesh that operates at a higher level. It is designed to manage, secure, and observe microservices interactions across different environments. Istio's features include:
- Advanced traffic management: Beyond simple load balancing, Istio offers fine-grained control over traffic with rich routing rules, retries, failovers, and fault injection. It operates at L7 (HTTP/HTTPS/gRPC), allowing behavior to be controlled based on HTTP headers and URLs.
- Security: Istio provides end-to-end security, including strong identity-based authentication and authorization between services, transparently encrypting communication with mutual TLS, without requiring changes to application code.
- Observability: It offers detailed insights into the behavior of the microservices, including automatic metrics, logs, and traces for all traffic within a cluster, regardless of the service language or framework.
- Policy enforcement: Istio allows administrators to enforce policies across the service mesh, ensuring compliance with security, auditing, and operational policies.

Key Differences
Scope and Layer
Kubernetes Services operate at the infrastructure layer, focusing on L4 (TCP/UDP) for service discovery and load balancing. Istio operates at the application layer, providing L7 (HTTP/HTTPS/gRPC) traffic management, security, and observability features.
Capabilities
While Kubernetes Services provide basic load balancing and service discovery, Istio offers advanced traffic management (like canary deployments and circuit breakers), secure service-to-service communication (with mutual TLS), and detailed observability (tracing, monitoring, and logging).

Implementation and Overhead
Kubernetes Services are integral to Kubernetes and require no additional installation. Istio, being a service mesh, is an add-on layer that introduces additional components (like Envoy sidecar proxies) into the application pods, which can add overhead but also provides enhanced control and visibility.

Kubernetes Services and Istio complement each other in the cloud-native ecosystem. Kubernetes Services provide the basic functionality necessary for service discovery and load balancing within a Kubernetes cluster. Istio extends these capabilities, adding advanced traffic management, enhanced security features, and observability into microservices communications. For applications requiring fine-grained control over traffic, secure communication, and deep observability, integrating Istio with Kubernetes offers a powerful platform for managing complex microservices architectures.

Conclusion
Istio stands out as a transformative force in the realm of microservices, providing a comprehensive toolkit for managing the complexities of service-to-service communication in a cloud-native environment. By leveraging Istio, developers and architects can significantly streamline their operational processes, ensuring a robust, secure, and observable microservices architecture. Incorporating Istio into your microservices strategy not only simplifies operational challenges but also paves the way for innovative service management techniques. As we continue to explore and harness the capabilities of service meshes like Istio, the future of microservices looks promising, characterized by enhanced efficiency, security, and scalability.
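As a quick, practical footnote to the canary example above: once both subsets are receiving traffic, the 90/10 split can be sanity-checked from a terminal. The sketch below makes two assumptions not spelled out in the article: that the ingress gateway service created by the Helm install earlier is reachable as istio-ingress in the istio-system namespace, and that the weather application's response body contains its version label. Adjust names to match your cluster:

# Forward the ingress gateway to a local port
kubectl -n istio-system port-forward svc/istio-ingress 8080:80 &

# Send 100 requests and tally which version responded
for i in $(seq 1 100); do
  curl -s http://localhost:8080/ | grep -o 'v[12]'
done | sort | uniq -c

The counts should hover around 90 for v1 and 10 for v2, confirming the weights declared in the VirtualService.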