During project and product development, software engineering teams need to make architectural decisions to reach their goals. These decisions can be technical or process-related. Technical: Deciding to use JBOSS Data Grid as a caching solution vs Amazon Elasticache or deciding to use the AWS Network Load Balancer (NLB) vs AWS Application Load Balancer (ALB). Process: Deciding to use a Content Management portal for sharing documents or project-related artifacts. Making these decisions is a time-consuming and difficult process, and it's essential that teams justify, document, and communicate these decisions to relevant stakeholders. Three major anti-patterns often emerge when making architectural decisions: No decision is made at all out of fear of making the wrong choice. A decision is made without any justification, and most of the time, people don’t understand why it was made and the use case or the scenario that has been considered. This results in the same topic being discussed multiple times. The decision isn’t captured in an architectural decision repository, so team members forget or don’t know that the decision was made. What Is an ADR? An Architecture Decision Record (ADR) is a document that captures a decision, including the context of how the decision was made and the consequences of adopting the decision. When Will You Write an ADR? ADRs are typically written when a significant architectural decision needs to be made, such as when selecting a new technology, framework, or design pattern or when making a trade-off between different architectural goals, such as performance, scalability, and maintainability. ADRs are also useful for documenting decisions that have already been made to ensure that the rationale behind them is clear to all members of the development team. ADRs also ensure that you are aligned with the organization’s IT strategies. ADRs typically include information such as the problem being addressed, the options considered, the decision made, the reasons behind the decision, and any relevant technical details. They may also include any implications or risks associated with the decision, as well as any future work that may be required because of the decision. Writing ADRs can help to promote transparency and collaboration within a development team, as well as provide a valuable resource for future developers who may need to understand the reasoning behind past decisions. Best Practices When Writing an ADR When writing an Architecture Decision Record (ADR), it's important to follow some best practices to ensure that the ADR is clear, useful, and easy to understand. Here are some best practices for writing an ADR: Start with a clear title: The title of the ADR should be clear and concise and should summarize the decision being made. Define the problem: Begin the ADR by clearly defining the problem or challenge that the decision is addressing. This helps to provide context for the decision and ensures that everyone understands the problem being solved. Describe the decision: Clearly describe the decision that has been made, including the alternatives considered and the reasons for selecting the chosen option. This should include any trade-offs or compromises that were made, as well as any technical details that are relevant. Explain the rationale: Provide a clear and detailed explanation of the rationale behind the decision. This should include any relevant business or technical considerations, as well as any risks or potential drawbacks. 
Document any implications: Document any implications of the decision, including any dependencies on other parts of the system, any impacts on performance or scalability, and any risks or issues that may arise because of the decision.

Keep it concise: ADRs should be concise and easy to read. Avoid including unnecessary information or technical jargon and focus on providing clear and concise explanations of the decision-making process and its rationale.

Keep it up-to-date: ADRs should be kept up-to-date as the project progresses. If new information or considerations arise that impact the decision, the ADR should be updated to reflect these changes.

By following these best practices, ADRs can provide a clear and useful record of important architectural decisions and help to ensure that everyone on the team is aligned and informed about the reasoning behind those decisions.

Example ADR

Now that we have defined what an ADR is and the best practices to be followed when writing one, let's put those best practices into practice by writing an ADR. For this example, we will document one of the solutions described in the blog on migrating unstructured data (files) from on-premises storage to AWS. The blog describes three scenarios and a solution for each of them. For this ADR example, we will pick the solution for migrating from NAS to AWS using AWS DataSync.

Plain Text

Title: Migrating from NAS to AWS using AWS DataSync

Status: Accepted

Date: 6th October 2023

Context: Application A picks up incoming files from an Application X, processes them, and generates data files that are 50–300 GB. That output then becomes the input for another Application Y to consume. The data is shared by means of an NFS storage accessible to all three applications. Application A is being migrated to AWS, and Applications X and Y continue to remain on-premises. We used AWS Elastic File System (EFS) to replace NFS on AWS. However, that makes it difficult for the applications to read/write from a common storage solution, and network latency slows down Application X and Application Y.

Decision: We will use the AWS DataSync service to perform the initial migration of nearly 1 TB of data from the on-premises NFS storage to AWS EFS. AWS DataSync can transfer data between any two network storage or object storage systems. These could be network file systems (NFS), server message block (SMB) file servers, Hadoop distributed file systems (HDFS), self-managed object storage, AWS Snowcone, Amazon Simple Storage Service (Amazon S3) buckets, Amazon Elastic File System (Amazon EFS) file systems, Amazon FSx for Windows File Server file systems, Amazon FSx for Lustre file systems, and Amazon FSx for OpenZFS file systems. To solve the need for the applications to read/write from a common storage solution and to address the network latency involved during read/write operations across the Direct Connect, we scheduled a regular synchronization of the specific input and output folders using the AWS DataSync service between the NFS and EFS. This means that all three applications see the same set of files after the sync is complete.

Consequences:

Positive
• No fixed/upfront cost and only $0.0125 per gigabyte (GB) for data transferred.

Negative
• Syncs can be scheduled at minimum one-hour intervals. This soft limit can be modified for up to 15-minute intervals; however, that leads to performance issues and subsequent sync schedules getting queued up, which forms a loop.
• Bidirectional syncs were configured to run in a queued fashion.
That is, only one-way sync can be executed at a time. Applications will have to read the files after the sync interval is completed. In our case, files are generated only once per day, so this challenge was mitigated by scheduling the read/writes in a timely fashion.
• AWS DataSync Agent (virtual appliance) must be installed on a dedicated VM on-premises.

Compliance:

Notes:

Author(s): Rakesh Rao and Santhosh Kumar Ramabadran

Version: 0.1

Changelog:
0.1: Initial proposed version

While the above is one format, an ADR can be created in any format agreed with the stakeholders. It could be as simple as a Word document, a spreadsheet, or a presentation.

When Will You Not Write an ADR?

While Architecture Decision Records (ADRs) can be helpful in documenting important architectural decisions, there may be some cases where writing an ADR is not necessary or appropriate. Here are a few examples:

Minor decisions: If a decision has minimal impact on the architecture of the system or is relatively straightforward, it may not be necessary to write an ADR. For example, if a team decides to update a library to a newer version, and the update is expected to have little impact on the overall architecture, an ADR may not be necessary.

Temporary decisions: If a decision is expected to be temporary or is only applicable to a specific context or situation, it may not be necessary to write an ADR. For example, if a team decides to implement a temporary workaround for a bug, and the workaround is not expected to be a long-term solution, an ADR may not be necessary.

Routine decisions: If a team makes routine decisions that are not particularly significant or require little discussion or debate, an ADR may not be necessary. For example, if a team decides to follow an established design pattern or uses a commonly used technology, an ADR may not be necessary.

Existing documentation: If the decision has already been documented elsewhere, such as in project requirements or design documentation, it may not be necessary to create an ADR specifically for that decision.

Ultimately, the decision of whether to write an ADR depends on the significance of the decision and the context in which it is being made. If the decision has a significant impact on the architecture of the system, involves trade-offs or alternatives, or is likely to have long-term implications, it is generally a good idea to create an ADR to document the decision-making process.

Alternatives to ADR

While Architecture Decision Records (ADRs) are a common and effective way to document important architectural decisions, there are several alternative approaches that can be used depending on the specific context and needs of a project. Here are a few alternatives to ADRs:

Code comments: One simple alternative to ADRs is to use code comments to document architectural decisions directly within the codebase. This can be a useful approach for smaller projects or for teams that prefer a more lightweight approach to documentation. However, code comments can become difficult to manage and may not provide enough context or detail for more complex decisions.

Design documents: Design documents can provide a more comprehensive and detailed way to document architectural decisions. These documents can include diagrams, flowcharts, and other visual aids to help explain the architecture of a system. However, design documents can be time-consuming to create and may become outdated as the project evolves.
Wikis or knowledge bases: Wikis or knowledge bases can be used to document architectural decisions in a more flexible and searchable way than ADRs. This approach can be particularly useful for large or complex projects, as it allows teams to easily find and reference information related to specific architectural decisions. However, wikis and knowledge bases can also become difficult to manage and may require additional effort to keep up-to-date. Meetings and discussions: Another approach to documenting architectural decisions is to hold regular meetings or discussions to review and document decisions. This approach can be useful for teams that prioritize face-to-face communication and collaboration but may not be as effective for remote teams or those with members in different time zones. Ultimately, the best approach to documenting architectural decisions depends on the specific needs and context of a project. Teams should consider factors such as project size, team size, and communication preferences when deciding which approach to use.
When doing unit tests, you have probably found yourself in the situation of having to create objects over and over again. To do this, you must call the class constructor with the corresponding parameters. So far, nothing unusual, but most probably, there have been times when the values of some of these fields were irrelevant for testing or when you had to create nested "dummy" objects simply because they were mandatory in the constructor. All this has probably generated some frustration at some point and made you question whether you were doing it right or not; if that is really the way to do unit tests, then it would not be worth the effort. That is to say, typically, a test must have a clear objective. Therefore, it is expected that within the SUT (system under test) there are fields that really are the object of the test and, on the other hand, others are irrelevant. Let's take an example. Let's suppose that we have the class "Person" with the fields Name, Email, and Age. On the other hand, we want to do the unit tests of a service that, receiving a Person object, tells us if this one can travel for free by bus or not. We know that this calculation only depends on the age. Children under 14 years old travel for free. Therefore, in this case, the Name and Email fields are irrelevant. In this example, creating Person objects would not involve too much effort, but let's suppose that the fields of the Person class grow or nested objects start appearing: Address, Relatives (List of People), Phone List, etc. Now, there are several issues to consider: It is more laborious to create the objects. What happens when the constructor or the fields of the class change? When there are lists of objects, how many objects should I create? What values should I assign to the fields that do not influence the test? Is it good if the values are always the same, without any variability? Two well-known design patterns are usually used to solve this situation: Object Mother and Builder. In both cases, the idea is to have "helpers" that facilitate the creation of objects with the characteristics we need. Both approaches are widespread, are adequate, and favor the maintainability of the tests. However, they still do not resolve some issues: When changing the constructors, the code will stop compiling even if they are fields that do not affect the tests. When new fields appear, we must update the code that generates the objects for testing. Generating nested objects is still laborious. Mandatory and unused fields are hard coded and assigned by default, so the tests have no variability. One of the Java libraries that can solve these problems is "EasyRandom." Next, we will see details of its operation. What is EasyRandom? EasyRandom is a Java library that facilitates the generation of random data for unit and integration testing. The idea behind EasyRandom is to provide a simple way to create objects with random values that can be used in tests. Instead of manually defining values for each class attribute in each test, EasyRandom automates this process, automatically generating random data for each attribute. This library handles primitive data types, custom classes, collections, and other types of objects. It can also be configured to respect specific rules and data generation restrictions, making it quite flexible. 
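To make the rest of the discussion concrete, here is a minimal sketch of the Person class, a PersonService, and an Object Mother-style helper of the kind mentioned above. All three are assumptions for illustration only; the article does not show them, and the exact field names and rules (such as the adult threshold) are hypothetical.

Java

// Person.java - a minimal sketch of the class used in the examples below (an assumption for illustration).
public class Person {
    private String name;
    private String email;
    private int age;

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
    public void setName(String name) { this.name = name; }
    public void setEmail(String email) { this.email = email; }
}

// PersonService.java - a hypothetical service like the one tested later in this article.
public class PersonService {
    public boolean isAdult(Person person) {
        return person.getAge() >= 18;
    }
}

// PersonMother.java - an Object Mother-style helper: handy, but it must be updated every time
// Person changes, and the "irrelevant" fields are always hard coded to the same values.
public class PersonMother {
    public static Person aPersonOfAge(int age) {
        Person person = new Person();
        person.setName("Any Name");
        person.setEmail("any@email.com");
        person.setAge(age);
        return person;
    }
}

This is exactly the kind of boilerplate that EasyRandom aims to remove.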
Here is a basic example of how EasyRandom can be used to generate a random object:

Java

public class EasyRandomExample {
    public static void main(String[] args) {
        EasyRandom easyRandom = new EasyRandom();
        Person randomPerson = easyRandom.nextObject(Person.class);
        System.out.println(randomPerson);
    }
}

In this example, Person is a dummy class, and easyRandom.nextObject(Person.class) generates an instance of Person with random values for its attributes. As can be seen, the generation of these objects does not depend on the class constructor, so the test code will continue to compile even if there are changes in the SUT. This solves one of the biggest problems in maintaining an automated test suite.

Why Is It Interesting?

Using the EasyRandom library for testing your applications has several advantages:

Simplified random data generation: It automates generating random data for your objects, saving you from writing repetitive code for each test.
Facilitates unit and integration testing: By automatically generating test objects, you can focus on testing the code's behavior instead of worrying about manually creating test data.
Data customization: Although it generates random data by default, EasyRandom also allows you to customize certain fields or attributes if necessary, allowing you to adjust the generation according to your needs.
Reduced human error: Manual generation of test data can lead to errors, especially when dealing with many fields and combinations. EasyRandom helps minimize human errors by generating consistent random data.
Simplified maintenance: If your class requirements change (new fields, types, etc.), you do not need to manually update your test data, as EasyRandom will generate it automatically.
Improved readability: Using EasyRandom makes your tests cleaner and more readable since you do not need to define test values explicitly in each case.
Faster test development: By reducing the time spent creating test objects, you can develop tests faster and more effectively.
Ease of use: Adding this library to our Java projects is practically immediate, and it is extremely easy to use.

Where Can You Apply It?

This library allows us to simplify the creation of objects for our unit tests, but it can also be of great help when we need to generate a set of test data. This can be achieved by using the DTOs of our application and generating random objects to later dump them into a database or file. Where it is not recommended: this library may not be worthwhile in projects where object generation is not complex or where we need precise control over all the fields of the objects involved in the test.

How To Use EasyRandom

Let's see EasyRandom in action with a real example, including the environment used and the prerequisites.

Prerequisites

Java 8+
Maven or Gradle

Initial Setup

Inside our project, we must add a new dependency. The pom.xml file would look like this:

XML

<dependency>
    <groupId>org.jeasy</groupId>
    <artifactId>easy-random-core</artifactId>
    <version>5.0.0</version>
</dependency>

Basic Use Case

The most basic use case has already been shown above. In that example, values are assigned to the fields of the Person class in a completely random way. Obviously, when testing, we will need to have control over some specific fields. Let's see this with an example. Recall that EasyRandom can also be used with primitive types. Therefore, our example could look like this.
Java

public class PersonServiceTest {

    private final EasyRandom easyRandom = new EasyRandom();
    private final PersonService personService = new PersonService();

    @Test
    public void testIsAdult() {
        Person adultPerson = easyRandom.nextObject(Person.class);
        adultPerson.setAge(18 + easyRandom.nextInt(80));

        assertTrue(personService.isAdult(adultPerson));
    }

    @Test
    public void testIsNotAdult() {
        Person minorPerson = easyRandom.nextObject(Person.class);
        minorPerson.setAge(easyRandom.nextInt(17));

        assertFalse(personService.isAdult(minorPerson));
    }
}

As we can see, this way of generating test objects protects us from changes in the Person class and allows us to focus only on the field we are interested in. We can also use this library to generate lists of random objects.

Java

@Test
void generateObjectsList() {
    EasyRandom generator = new EasyRandom();

    // Generate a list of 5 persons
    List<Person> persons = generator.objects(Person.class, 5)
        .collect(Collectors.toList());

    assertEquals(5, persons.size());
}

This test, in itself, is not very useful. It simply demonstrates the ability to generate lists, which could be used to dump data into a database.

Generation of Parameterized Data

Let's see now how to use this library to have more precise control over the generation of the objects themselves. This can be done through parameterization.

Set the value of a field: Let's imagine that for our tests we want to keep certain values constant (an ID, a name, an address, etc.). To achieve this, we configure the initialization of objects using EasyRandomParameters and locate the parameters by their name. Let's see how:

Java

EasyRandomParameters params = new EasyRandomParameters();

// Assign a value to the field by means of a lambda function
params.randomize(named("age"), () -> 5);

EasyRandom easyRandom = new EasyRandom(params);

// The object will always have an age of 5
Person person = easyRandom.nextObject(Person.class);

Of course, the same could be done with collections or complex objects. Let's suppose that our Person class contains an Address class inside and that, in addition, we want to generate a list of two persons. Let's see a more complete example:

Java

EasyRandomParameters parameters = new EasyRandomParameters()
    .randomize(Address.class, () -> new Address("Random St.", "Random City"));

EasyRandom easyRandom = new EasyRandom(parameters);

return Arrays.asList(
    easyRandom.nextObject(Person.class),
    easyRandom.nextObject(Person.class)
);

Suppose now that a person can have several addresses. This would mean the Address field will be a list inside the Person class. With this library, we can also make our collections have a variable size. This is something that we can also do using parameters.

Java

EasyRandomParameters parameters = new EasyRandomParameters()
    .randomize(Address.class, () -> new Address("Random St.", "Random City"))
    .collectionSizeRange(2, 10);

EasyRandom easyRandom = new EasyRandom(parameters);

// The object will have a list of between 2 and 10 addresses
Person person = easyRandom.nextObject(Person.class);

Setting Pseudo-Random Fields

As we have seen, setting values is quite simple and straightforward. But what if we want to control the randomness of the data? Say we want to generate random names of people, but realistic names and not just strings of unconnected characters. This need is perhaps clearer when we are interested in having randomness in fields such as email, phone number, ID number, card number, city name, etc.
For this purpose, it is useful to combine EasyRandom with other data generation libraries. One of the best-known is Faker. Combining both libraries, we could get code like this:

Java

EasyRandomParameters params = new EasyRandomParameters();

// Generate a number between 0 and 17
params.randomize(named("age"), () -> Faker.instance().number().numberBetween(0, 17));

// Generate random "real" names
params.randomize(named("name"), () -> Faker.instance().name().fullName());

EasyRandom easyRandom = new EasyRandom(params);
Person person = easyRandom.nextObject(Person.class);

There are a multitude of parameters that allow us to control the generation of objects.

Closing

EasyRandom is a library that should be part of your toolbox if you write unit tests, as it helps keep them maintainable. In addition, and although it may seem strange, establishing some controlled randomness in tests may not be a bad thing. In a way, it automatically generates new test cases and increases the probability of finding bugs in your code.
"Is it working now?" asked the Product Owner. "Well... I hope so. You know this bug, it can not really be reproduced locally, therefore I can not really test if it works now. The best I can do is deploy it to prod and wait." The answer did not make the Product Owner particularly happy, but he also knew the bug appears only when an API called by his application has a quick downtime exactly at the time when the user clicks on a specific button. The daily stand-up, which was the environment of the small conversation, went on, and nobody wanted to dedicate much time or attention to the bug mentioned - except for Jack, the latest addition to the team, who was concerned about this "hit deploy and roll the dice" approach. Later that day, he actually reached out to Bill - the one who fixed the bug. "Can you tell me some details? Can not we write some unit tests or so?" "Well, we can not. I actually did not really write much code. Still, I have strong faith, because I added @Retryable to ensure the API call is being re-tried if it fails. What's more, I added @Cacheable to reduce the amount of calls fired up against the API in the first place. As I said in the daily, we can not really test it, but it will work on prod." With that Bill wanted to close this topic and focus on the new task he picked up, but Jack was resistant: "I would still love to have automated tests on that," stated Jack. "On what? You should not unit-test Spring. Those guys know what they are doing." "Well, to be honest, I am not worried about Spring not working. I am worried about us not using it the right way." The Challenge This is the point: when we can abandon Jack and Bill, as we arrived at in the main message of this article, I have seen the following pattern multiple times. Someone resolves an issue by utilizing some framework-provided, out-of-the-box functionality. In many cases, it is just applying an annotation to a method or a class and the following sequence happens: The developer argues there is nothing to write automated tests for, as it is a standard feature of the framework that is being used. The developer might or might not at least test it manually (and like in the example above, the manual test might happen on a test environment, or might happen only on a prod environment). At some point, it breaks, and half of the team does not know why it is broken, the other half does not know why it used to work at all. Of course, this scenario can apply to any development, but my observation is that framework-provided features (such as re-try something, cache something, etc.) are really tempting the developers to skip writing automated tests. On a side note, you can find my more generic thoughts on testing in a previous article. Of course, I do not want to argue for testing a framework itself (although no framework is bug-free, you might find actual bugs within the framework). But I am strongly arguing for testing that you are using the framework properly. In many cases it can be tricky, therefore, in this tutorial, you will find a typically hard-to-test code, tips about how to rework and test it, and the final reworked version of the same code. 
Code That Is Hard To Test

Take a look at the following example:

Java

@Slf4j
@Component
public class Before {

    @Retryable
    @Cacheable("titlesFromMainPage")
    public List<String> getTitlesFromMainPage(final String url) {
        final RestTemplate restTemplate = new RestTemplate();
        log.info("Going to fire up a request against {}", url);
        final var responseEntity = restTemplate.getForEntity(url, String.class);
        final var content = responseEntity.getBody();
        final Pattern pattern = Pattern.compile("<p class=\"resource-title\">\n(.*)\n.*</p>");
        final Matcher matcher = pattern.matcher(content);
        final List<String> result = new ArrayList<>();
        while (matcher.find()) {
            result.add(matcher.group(1).trim());
        }
        log.info("Found titles: {}", result);
        return result;
    }
}

It is fair to say it is tricky to test. Probably your best shot would be to set up a mock server to respond to your call (for example, by using WireMock) as follows:

Java

@WireMockTest
public class BeforeTest {

    private static final Before BEFORE_INSTANCE = new Before();

    @Test
    public void testCall(final WireMockRuntimeInfo wmRuntimeInfo) {
        stubFor(get("/test-url").willReturn(ok(
                "<p class=\"resource-title\">\nFirst Title\n.*</p><p class=\"resource-title\">\nOther Title\n.*</p>")));
        final var titles = BEFORE_INSTANCE.getTitlesFromMainPage("http://localhost:" + wmRuntimeInfo.getHttpPort() + "/test-url");
        assertEquals(List.of("First Title", "Other Title"), titles);
    }
}

Many average developers would be happy with this test, especially after noticing that it generates 100% line coverage. And some of them would entirely forget to add @EnableCaching, or to add @EnableRetry, or to create a CacheManager bean. That would not only lead to multiple rounds of deployment and manual testing, but if the developers are not ready to admit (even to themselves) that it is their mistake, excuses like "Spring does not work" are going to be made.

Let's Make Life Better!

Although my plan is to describe code changes, the point is not only to have nicer code but also to help developers be more reliable and lower the number of bug tickets. Let's not forget that a couple of unforeseen bug tickets can ruin even the most carefully established sprint plans and can seriously damage the reputation of the developer team. Just think about the experience that the business has:

They got something delivered that does not work as expected.
The new (in-progress) features are likely not to be delivered on time because the team is busy fixing bugs from previous releases.

Back to the code: you can easily identify a couple of problems, such as the only method in the class doing multiple things (a.k.a. having multiple responsibilities), and the test entirely ignoring the fact that the class serves as a Spring bean and actually depends on Spring's features. Here is how to rework it:

Step 1: Rework the class to have more methods with less responsibility. In cases where you depend on something brought by annotations, I would suggest having a method that serves only as a proxy to another method: it will make your life seriously easier when you are writing tests to find out whether you used the annotation properly.

Step 2: Step 1 probably led you to have one public method which is going to be called by the actual business callers, and a group of private methods (called by each other and the public method). Let's make them default visibility. This enables you to call them from classes that are in the same package - just like your unit test class.

Step 3: Split your unit test based on what aspect is tested.
Although in several cases it is straightforward to have exactly one test class for each business class, nothing actually stops you from having multiple test classes for the same business class.

Step 4: Define what you are expecting. For example, in the test methods where you want to ensure that retry is working, you do not care about the actual call (as for how the business result is created, you will test that logic in a different test anyway). There you have expectations such as: if I call the method and it throws an exception once, it will be called again; if it fails X times, an exception is thrown. You can also define your expectations against the cache: you expect subsequent calls on the public method to lead to only one call of the internal method.

Final Code

After performing Steps 1 and 2, the business class becomes:

Java

@Slf4j
@Component
public class After {

    @Retryable
    @Cacheable("titlesFromMainPage")
    public List<String> getTitlesFromMainPage(final String url) {
        return getTitlesFromMainPageInternal(url);
    }

    List<String> getTitlesFromMainPageInternal(final String url) {
        log.info("Going to fire up a request against {}", url);
        final var content = getContentsOf(url);
        final var titles = extractTitlesFrom(content);
        log.info("Found titles: {}", titles);
        return titles;
    }

    String getContentsOf(final String url) {
        final RestTemplate restTemplate = new RestTemplate();
        final var responseEntity = restTemplate.getForEntity(url, String.class);
        return responseEntity.getBody();
    }

    List<String> extractTitlesFrom(final String content) {
        final Pattern pattern = Pattern.compile("<p class=\"resource-title\">\n(.*)\n.*</p>");
        final Matcher matcher = pattern.matcher(content);
        final List<String> result = new ArrayList<>();
        while (matcher.find()) {
            result.add(matcher.group(1).trim());
        }
        return result;
    }
}

On a side note: you can, of course, split the original class into multiple classes. For example:

One proxy class which only contains @Retryable and @Cacheable (contains only the getTitlesFromMainPage method)
One class that only focuses on REST calls (contains only the getContentsOf method)
One class that is responsible for extracting the titles from HTML content (contains only the extractTitlesFrom method)
One class which orchestrates fetching and processing the HTML content (contains only the getTitlesFromMainPageInternal method)

I am convinced that although in that case the scope of the classes would be even stricter, the overall readability and understandability of the code would suffer from having many classes with 2-3 lines of business code.
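Before moving on to the tests, note that the Spring-based test below only passes if retry and caching are actually enabled in the context - exactly the part that is easy to forget. The article does not show that configuration, but a minimal sketch (assuming Spring Boot with spring-retry and AOP support on the classpath; the class name is an illustrative assumption) might look roughly like this:

Java

import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.concurrent.ConcurrentMapCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.annotation.EnableRetry;

// A sketch of the application configuration the Spring-based test relies on.
// The class and bean names are illustrative assumptions, not taken from the article.
@Configuration
@EnableRetry    // without this, @Retryable is silently ignored
@EnableCaching  // without this, @Cacheable is silently ignored
public class CachingRetryConfig {

    @Bean
    public CacheManager cacheManager() {
        // A simple in-memory cache manager backed by a ConcurrentHashMap,
        // exposing the cache name used by @Cacheable("titlesFromMainPage").
        return new ConcurrentMapCacheManager("titlesFromMainPage");
    }
}

With such a configuration in place, deleting either annotation (or the CacheManager bean) is exactly the kind of regression the Spring-aware test below is meant to catch.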
Steps 3 and 4 lead you to the following test classes:

Java

@ExtendWith(MockitoExtension.class)
public class AfterTest {

    @Spy
    private After after = new After();

    @Test
    public void mainFlowFetchesAndExtractsContent() {
        doReturn("contents").when(after).getContentsOf("test-url");
        doReturn(List.of("title1", "title2")).when(after).extractTitlesFrom("contents");

        assertEquals(List.of("title1", "title2"), after.getTitlesFromMainPage("test-url"));
    }

    @Test
    public void extractContent() {
        final String htmlContent = "<p class=\"resource-title\">\nFirst Title\n.*</p><p class=\"resource-title\">\nOther Title\n.*</p>";

        assertEquals(List.of("First Title", "Other Title"), after.extractTitlesFrom(htmlContent));
    }
}

Java

@WireMockTest
public class AfterWireMockTest {

    private final After after = new After();

    @Test
    public void getContents_firesUpGet_andReturnsResultUnmodified(final WireMockRuntimeInfo wmRuntimeInfo) {
        final String testContent = "some totally random string content";
        stubFor(get("/test-url").willReturn(ok(testContent)));

        assertEquals(testContent, after.getContentsOf("http://localhost:" + wmRuntimeInfo.getHttpPort() + "/test-url"));
    }
}

Java

@SpringBootTest
public class AfterSpringTest {

    @Autowired
    private EmptyAfter after;

    @Autowired
    private CacheManager cacheManager;

    @BeforeEach
    public void reset() {
        after.reset();
        cacheManager.getCache("titlesFromMainPage").clear();
    }

    @Test
    public void noException_oneInvocationOfInnerMethod() {
        after.getTitlesFromMainPage("any-test-url");

        assertEquals(1, after.getNumberOfInvocations());
    }

    @Test
    public void oneException_twoInvocationsOfInnerMethod() {
        after.setNumberOfExceptionsToThrow(1);

        after.getTitlesFromMainPage("any-test-url");

        assertEquals(2, after.getNumberOfInvocations());
    }

    @Test
    public void twoExceptions_threeInvocationsOfInnerMethod() {
        after.setNumberOfExceptionsToThrow(2);

        after.getTitlesFromMainPage("any-test-url");

        assertEquals(3, after.getNumberOfInvocations());
    }

    @Test
    public void threeExceptions_threeInvocationsOfInnerMethod_andThrows() {
        after.setNumberOfExceptionsToThrow(3);

        assertThrows(RuntimeException.class, () -> after.getTitlesFromMainPage("any-test-url"));
        assertEquals(3, after.getNumberOfInvocations());
    }

    @Test
    public void noException_twoPublicCalls_InvocationsOfInnerMethod() {
        assertEquals(0, ((Map) cacheManager.getCache("titlesFromMainPage").getNativeCache()).size());

        after.getTitlesFromMainPage("any-test-url");

        assertEquals(1, after.getNumberOfInvocations());
        assertEquals(1, ((Map) cacheManager.getCache("titlesFromMainPage").getNativeCache()).size());

        after.getTitlesFromMainPage("any-test-url");

        assertEquals(1, after.getNumberOfInvocations());
        assertEquals(1, ((Map) cacheManager.getCache("titlesFromMainPage").getNativeCache()).size());
    }

    @TestConfiguration
    public static class TestConfig {
        @Bean
        public EmptyAfter getAfter() {
            return new EmptyAfter();
        }
    }

    @Slf4j
    public static class EmptyAfter extends After {

        @Getter
        private int numberOfInvocations = 0;

        @Setter
        private int numberOfExceptionsToThrow = 0;

        void reset() {
            numberOfInvocations = 0;
            numberOfExceptionsToThrow = 0;
        }

        @Override
        List<String> getTitlesFromMainPageInternal(String url) {
            numberOfInvocations++;
            if (numberOfExceptionsToThrow > 0) {
                numberOfExceptionsToThrow--;
                log.info("EmptyAfter throws exception now");
                throw new RuntimeException();
            }
            log.info("EmptyAfter returns empty list now");
            return List.of();
        }
    }
}

Note that the usage of the various test frameworks is separated: the class that actually tests if the Spring features are used correctly runs with the Spring test support, but is not aware of
WireMock, and vice versa. There is no "dangling" extra configuration in the test classes that is used only by a fraction of the test methods in a given test class.

On a side note to AfterSpringTest: using @DirtiesContext on the class could be an alternative to manually resetting the cache in the reset() method, but doing the clean-up manually is more performant. My advice on this question is:

Do a manual reset if the scope of what to reset is small (this is normally the case in unit tests).
Reset the beans by annotation if many beans are involved or cleaning the context would require complex logic (this is the case in many integration and system tests).

You can find the complete code on GitHub.

Final Thoughts

After all the reworking and the extra test classes, what would happen now if you deleted @EnableRetry or @EnableCaching from the configuration? What would happen if someone deleted or even modified @Retryable or @Cacheable on the business method? Go ahead and try it out! Or trust me when I say the unit tests would fail. And what would happen if a new member joined the team to work on such code? Based on the tests, they would know the expected behavior of the class.

Tests are important. Quality tests can help you produce code that others can understand better, can help you be more reliable, and can help you identify bugs faster. Tests can be tricky, and tests can be hard to write. But never forget that if someone says, "That can not be tested," in the overwhelming majority of cases it only means, "I don't know how to test it, and I don't care enough to figure it out."
Allure Report is a utility that processes test results collected by a compatible test framework and produces an HTML report. Allure Report is indeed a powerful open-source tool designed to generate clear and concise test execution reports. It provides a detailed and visually appealing representation of test results, making it easier for teams to understand the test outcomes and identify issues. Cypress is a popular JavaScript testing framework that is known for its ease of use and fast execution speed. Integration of Allure with Cypress can be used to generate comprehensive and visually appealing reports of the test cases. Allure reports include detailed information about each test case, including its status, duration, steps, and screenshots. About Allure Report? Allure is an open-source framework designed for creating interactive and comprehensive test reports. It is commonly used in software testing to generate detailed and visually appealing reports for test execution results. Allure provides a clear and concise way to represent test outcomes, making it easier for both technical and non-technical stakeholders to understand the test results. Allure reports allow everyone participating in the development process to extract maximum useful information from the everyday execution of tests. Why Allure Reports? An open-source framework designed to create test execution reports that are clear to everyone on the team. From the dev/qa perspective, Allure reports shorten common defect life cycles: test failures can be divided into bugs and broken tests, also logs, steps, fixtures, attachments, timings, history, and integrations with TMS and bug-tracking systems can be configured, so the responsible developers and testers will have all information at hand. Reasons Why To Use Allure Reports Below are some key reasons why you choose Allure Reports: Clear Visualization: Allure creates dynamic reports with eye-catching graphs, charts, and other graphical features. Test results are provided in a way that is easy for stakeholders of both technical and non-technical backgrounds to understand. Detailed Test Steps: Each test step in Allure is well described, including the input data, anticipated findings, and actual results. The clarity of test execution can be improved by including screenshots and logs related to each stage. History and Trends: You may compare current results with previous runs using Allure’s historical test execution data storage feature. Teams may monitor the development of testing efforts over time by analyzing historical patterns. Automated Reports: Allure reports may be created automatically as an integral part of Continuous Integration (CI) pipelines, ensuring that the most recent reports are accessible during each test run. The reporting process is made simpler by integration with CI platforms, which also provide the development team with real-time feedback. Support for Multiple Languages and Frameworks: Programming languages supported by Allure include Java, Python, Ruby, and JavaScript. It easily integrates with well-known testing frameworks like JUnit, TestNG, NUnit, and others. Set-up Cypress Installing Cypress Below are the steps to install Cypress. However, you can go through this blog to get started with Cypress testing. Step 1: Create a folder and Generate package.json Create a project, naming it ‘talent500_Allure-Cypress’ Use the npm init command to create a package.json file. 
Step 2: Run the below command to install Cypress

In the project folder, run:

npm install --save-dev cypress@11.2.0

You can verify that Cypress version 11.2.0 is reflected in package.json.

Set Up Allure Reports

Step 1: Allure Installation

Enter the below commands in your command line to install Allure.

Using yarn:

yarn add -D @shelex/cypress-allure-plugin

Or using npm:

npm i -D @shelex/cypress-allure-plugin
npm install --save-dev mocha-allure-reporter allure-commandline

Step 2: Cypress Config File

Update the Cypress config file 'cypress.config.js' with the below code:

/// <reference types="@shelex/cypress-allure-plugin" />
const { defineConfig } = require("cypress");
const allureWriter = require("@shelex/cypress-allure-plugin/writer");

module.exports = defineConfig({
  e2e: {
    setupNodeEvents(on, config) {
      allureWriter(on, config);
      return config;
    }
  }
});

Here's a breakdown of what this configuration file is doing:

/// <reference types="@shelex/cypress-allure-plugin" />: This line is a reference directive for TypeScript, indicating that the TypeScript compiler should include type definitions from the @shelex/cypress-allure-plugin package.

const { defineConfig } = require("cypress"): Imports the defineConfig function from the Cypress package.

const allureWriter = require("@shelex/cypress-allure-plugin/writer"): Imports the allureWriter function from the @shelex/cypress-allure-plugin/writer module. This function is used to set up Allure reporting.

module.exports = defineConfig({ /* ... */ }): Exports a Cypress configuration object. Inside this object, the e2e property defines the setupNodeEvents hook, which hooks into Cypress' test execution process.

setupNodeEvents(on, config) { /* ... */ }: This is where you configure the setupNodeEvents event. The allureWriter function is passed the on and config objects; this is how Allure reporting is wired into Cypress.

allureWriter(on, config): This line invokes the allureWriter function, passing the on and config objects. The specifics of what this function does are determined by the @shelex/cypress-allure-plugin package; it is responsible for integrating Allure reporting into your Cypress tests.

return config: Finally, the modified config object is returned. This ensures that any changes made within the setupNodeEvents function are applied to the overall Cypress configuration.

Step 3: Update index.js

Then register the Allure plugin command in the cypress/support/index.js file:

import '@shelex/cypress-allure-plugin';

Step 4: Update package.json with the below scripts

{
  "name": "talent500_allure-cypress",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "clean:folders": "rm -R -f allure-report/* && rm -R -f allure-results/*",
    "cy:run-test": "cypress run --env allure=true",
    "generate-allure:report": "allure generate allure-results --clean -o allure-report && allure open allure-report",
    "tests": "npm run clean:folders && npm run cy:run-test && npm run generate-allure:report"
  },
  "author": "Kailash Pathak",
  "license": "ISC",
  "devDependencies": {
    "@shelex/cypress-allure-plugin": "^2.40.0",
    "allure-commandline": "^2.24.1",
    "cypress": "^11.2.0",
    "mocha-allure-reporter": "^1.4.0"
  }
}

Walkthrough of the package.json File

Here's a breakdown of the key information in this file:

Name: 'talent500_allure-cypress' - This is the name of your project or package.

Version: 1.0.0 - This indicates the version of your project.
Versions are typically in the format of major.minor.patch.

Description: (empty) - A brief description of your project could be placed here.

Main: index.js - This specifies the entry point of your application (the main JavaScript file).

Scripts:
clean:folders: This script removes the contents of the allure-report and allure-results folders.
cy:run-test: This script runs the Cypress tests with Allure integration enabled.
generate-allure:report: This script generates an Allure report using the results from the tests.
tests: This is a custom script that runs in sequence: cleaning folders, running tests, and generating the Allure report.

Author: Kailash Pathak - The author of the project.

License: ISC - The type of license for your project. ISC is a permissive open-source license similar to the MIT License.

DevDependencies: Inside this section we have placed all devDependencies, i.e., Cypress (^11.2.0), the Allure plugin for Cypress (^2.40.0), the Allure command-line tool (^2.24.1), and the Mocha Allure reporter (^1.4.0).

Create Test Case

Let's create a test case that will:

Open the URL.
Enter the email and password.
Verify the user is logged in by verifying the text "PROFILE".
Log out from the application.
Verify the text "Opportunities favor the bold" after logout.

/// <reference types="cypress" />
describe("https://talent500.co/ , Login & Logout ", () => {
  it("Logs in successfully and verifies links in header", () => {
    cy.visit("https://talent500.co/auth/signin");
    cy.get('[name="email"]').focus();
    cy.get('[name="email"]').type("applitoolsautomation@yopmail.com");
    cy.get('[name="password"]').focus();
    cy.get('[name="password"]').type("Test@123");
    cy.get('[type="submit"]').eq(1).click({ force: true });
    cy.contains("PROFILE").should("be.visible");
    cy.get('img[alt="DropDown Button"]').click({ force: true });
    cy.contains("Logout").click();
    cy.contains("Opportunities favor the bold").should("be.visible");
  });
});

Execute the Test Case

Let's run the command:

npm run tests

As we run the test cases using the command 'npm run tests', you can see the below command start executing in the terminal:

npm run clean:folders && npm run cy:run-test && npm run generate-allure:report

The commands run in the below sequence:

1. The first command, 'npm run clean:folders', cleans the folders (allure-report, allure-results) if they were already created.
2. The second command, 'npm run cy:run-test', runs the test cases from the e2e folder.
3. The third command, 'npm run generate-allure:report', generates the Allure report.

In the below screenshot, you can see one test case found for execution, and test case execution is started. In the next screenshot, you can see the test cases are executed successfully. Finally, you can see the below command run and generate the Allure report:

allure generate allure-results --clean -o allure-report && allure open allure-report

Fail Scenario

Let's add some more test cases and fail some of them. In the below screenshot, you can see I have added one more test case, and initially, both test cases are passing.

Fail Case

Let's fail one of the test cases to see how the report looks when test cases fail. In the report below, you can see we have a total of three test cases. Among the three test cases, one test case is failing, and two test cases are passing.

Wrapping Up

Integrating Allure with Cypress elevates the quality of test reporting by generating comprehensive, visually appealing summaries of test cases.
Allure reports provide a thorough analysis of each test, including crucial information like status, execution time, specific actions taken throughout the test, and supplemental screenshots. This degree of detail gives stakeholders and testers a thorough understanding. By leveraging this integration, development teams gain valuable insights into the performance and reliability of their applications, fostering continuous improvement and ensuring the delivery of high-quality software products.
When it comes to software development, application integration testing often finds itself in a somewhat nebulous space. While most are familiar with the processes of the unit and functional testing, integration testing remains an elusive subject for some. We often ask ourselves, "Why is it important?" or "How can it help me in my overall development workflow?" With a shift towards microservices and highly modular architectures, the importance of application integration testing has never been greater. In this discussion, we take a different approach to dissect this subject by emphasizing the need for strategic planning, scalability considerations, and ROI metrics. Strategy Before Code Strategies play an integral role in any software development lifecycle, including integration testing. While the common practice has been to delve into the code first, ask yourself: "Do I understand the dependencies and interactions my code will encounter in a live environment?" Your code doesn't exist in a vacuum; it interacts with other modules, third-party services, and databases. Thus, the lack of a strategic approach to integration testing is the root of many issues that materialize in production environments. Scale With Your Architecture As companies shift toward microservices and distributed architectures, integration testing has been burdened with complexity. However, look at this as an opportunity rather than a hindrance. The modular architecture of microservices actually makes it easier to pinpoint issues if your integration testing is in place. That's the difference between finding a needle in a haystack and finding it in a well-organized toolbox. Your integration testing methods should evolve in tandem with your architecture, ensuring that each new service or module doesn't jeopardize the system's integrity. Measuring ROI in Application Integration Testing In the software development world, the term 'Return on Investment' (ROI) is not commonly associated with testing practices, especially something as technical as Application Integration Testing. But understanding the ROI of this critical process can have profound implications on a company's efficiency, effectiveness, and overall profitability. Why Measure ROI in Integration Testing? Cost Savings: At its core, application integration testing is about identifying and fixing issues before they reach a production environment. Every defect caught during this phase saves multiple hours, and sometimes days, of debugging post-deployment. By avoiding potential rollbacks, hotfixes, or system outages, the organization can realize substantial cost savings. Enhanced Productivity: Efficient integration testing means that developers spend less time troubleshooting and more time on productive tasks like building new features or optimizing existing ones. This boost in productivity can speed up release cycles and foster innovation. Improved Reputation: Software glitches, especially ones in a live environment, can tarnish a company's reputation. Effective integration testing reduces the chances of such occurrences, ensuring that the company's public image remains untarnished. Metrics To Consider for ROI Calculation Defect Detection Rate: By tracking the number of defects identified during integration testing versus post-deployment, you can understand the efficacy of your tests. A higher rate during the testing phase indicates a better ROI. 
Time to Market: By measuring the release cycle duration before and after optimizing integration tests, companies can gauge the efficiency gains. Faster release cycles, without compromising on quality, indicate a favorable ROI. Downtime Metrics: Any downtime post-deployment can result in lost revenue, especially for businesses that heavily rely on online platforms. By measuring downtime before and after refining integration tests, companies can quantify the financial benefits. Cost of Testing vs. Cost of Failure: Calculate the resources spent on integration testing (tools, man-hours, etc.) and compare it against the estimated cost of potential failures, outages, or rollbacks in the absence of such testing. Customer Experience Scores: By monitoring metrics like Net Promoter Score (NPS) or Customer Satisfaction (CSAT) post-release, companies can gauge the tangible benefits of flawless software integrations on user experience. A Case in Point Imagine a company that, after implementing a comprehensive application integration testing regime, notices a 50% reduction in post-deployment defects, a 20% faster release cycle, and a significant decrease in downtimes. While the costs associated with setting up the testing processes might have been high (hiring experts, purchasing tools, etc.), the subsequent benefits, quicker time to market, enhanced user satisfaction, and reduced post-deployment firefighting all contribute to a positive ROI. The Takeaway While application integration testing may not be the most glamorous aspect of software development, its role is undeniable. The subject has a depth that's often not entirely appreciated until we take a step back to look at it from these three different angles—strategic planning, scalability, and ROI. Once we shift our approach to focus on these aspects, the significance and methods of application integration testing become evident. Whether you're a developer or a CTO, this shift in perspective can open new doors and create opportunities for both efficiency and innovation in your software development process.
In the world of software architecture, which is still in its infancy in absolute terms, change is still rapid and structural. Deep paradigms can be overturned, changing the structure of information systems. As coupling has become a key issue for many information systems, new architectures have emerged, notably SOA (Service-Oriented Architecture), followed by microservices. These two fairly widespread architectures certainly make it possible to solve coupling problems, but as always there are certain trade-offs to be made. The complexity of testing is one of them.

Like a balanced equation, an information system is an equivalence between a technical expression and a functional expression, and if one changes, we need to be able to guarantee the validity of the equivalence between the two. To attain this, we need to be able to test all parties using tools that can establish this equality. When releasing a new version of a microservice, it's fairly easy to run unit and even integration tests before and during deployment to validate its internal operation. But how do you guarantee that its integration with the other microservices will always be valid? How can you be sure that a system made up of a constellation of microservices will behave as expected if you only run tests on each component? How can we guarantee that the new technical expression of our equation will always give the same functional expression? By testing it. It can be manual, of course, but it can also be automated.

Let's take a look at this automation, using two technologies we've used: Selenium and Karate. The aim of this study is not to make a theoretical comparison, of which there are so many, but a concrete one. If a developer today wants to use behavior-driven development, what will he have to do with one of these options? The study will first provide a quick analysis of the functionalities offered by both frameworks. We will then delve into the technical aspects, using a specific use case with a focus on programming and CI/CD. Finally, we will examine the communities surrounding both frameworks.

Selenium will not be studied on its own; in order to compare a level of functionality equivalent to that of Karate, it will be used with Cucumber. This will make it possible to test technical packages that allow automated tests to be written in natural language, thus satisfying a BDD requirement. In our case, we will opt for the Java version of Selenium, although other alternatives do exist.

Features

Selenium/Cucumber

Selenium

Selenium IDE: Enables recording of actions performed on a browser. This Firefox plug-in saves recorded scenarios as ".side" files for future use.

Selenium WebDriver: This is a toolkit for interacting with different web browsers using drivers such as Gecko and ChromeDriver. We'll be using this toolkit if we opt for Selenium. It's available in several languages, including Java, JavaScript, and Python.

Selenium Grid: Enables WebDriver scripts to be executed on remote (or real) machines by sending commands from the client to remote browser instances. The aim is to provide a straightforward method of running tests concurrently on multiple machines.

Cucumber

Cucumber is an open-source tool for behavior-driven development (BDD). It describes expected software behavior in a natural language that can be understood by both technical and non-technical stakeholders. This language is referred to as Gherkin and is used to explain functionalities in a clear and structured manner.
Each test can be automated through code (automated behavior using Selenium). This program is known as glue code and can be written in various languages such as Java, C#, and Ruby, among others. However, adhering to the specifics outlined in the introduction, we will focus solely on the Java implementation. It can also produce comprehensive execution reports to facilitate the reading of test execution outcomes. Karate This framework was originally based on Cucumber until its release 0.8.0, when it was separated from it. This decision proved to be beneficial. Nevertheless, it still uses Gherkin expressions for improved clarity, readability, and test organization similar to Cucumber. API test automation: Karate's initial foundation is the creation of API tests from Gherkin files. Other features have been subsequently integrated to enhance its capabilities. It is a direct competitor to REST Assured. Mocks: This section facilitates the generation of API mocks, which are highly advantageous in microservice scenarios or for separating front-end and back-end teams. Performance testing: Based on API testing, Karate incorporates Gatling to avoid having to rewrite user flows, by applying performance testing to pre-existing API tests. UI automation: Finally, Karate provides UI tests that automate user behavior by interacting with the DOM. These tests are written in Karate DSL, based on the Gherkin language. Programming Use Case Description Open a Google browser page. Search for Martin Fowler. Click on the first result that contains Martin Fowler. Check that you are on "https://martinfowler.com/". Selenium/Cucumber Here, we will work in three stages. Write the Gherkin Cucumber scenarios that describe the test cases. Create the glue code to link the previous scenario steps to the code using the Cucumber framework. Use the Selenium library to interact with the browser and write any necessary utility functions. Gherkin:

Plain Text

Feature: Demonstration use case

  Scenario: search for Martin Fowler website
    Given I navigate to 'https://google.com'
    And I search 'Martin Fowler' in google search bar
    When I click on result containing 'Martin Fowler'
    Then the current url is 'https://martinfowler.com/'

Glue Code:

Java

@Given("^I navigate to '([^']*)'$")
public void navigate_to_url(final String urlToTest) {
    navigateToUrl(urlToTest);
}

@And("^I search '([^']*)' in google search bar$")
public void search_data_in_google(final String searchText) {
    WebElement element = getElementByName("q");
    fillElementWithText(element, searchText);
    clickElementByContain("Recherche Google");
}

@When("^I click on result containing '([^']*)'$")
public void click_on_first_result(final String searchText) {
    clickElementByContain(searchText);
}

@Then("^the current url is '([^']*)'$")
public void current_url_test(final String urlToTest) {
    checkCurrentUrl(urlToTest);
}

Then we will utilize the Selenium library. The Selenium Toolbox includes this.config.getDriver(), allowing access to functions like navigate() or findElement(…).
Java

public void navigateToUrl(final String url) {
    this.config.getDriver().navigate().to(url);
}

public WebElement getElementByName(final String name) {
    return this.config.getDriver().findElement(By.name(name));
}

public void checkCurrentUrl(final String urlToCheck) {
    assertEquals(this.config.getDriver().getCurrentUrl(), urlToCheck);
}

public void clickElementByContain(final String contain) {
    WebElement element = this.config.getDriver().findElement(
        By.xpath(String.format("//*[contains(text(),'%s')]", contain)));
    element.click();
}

Karate With Karate, tasks are much faster: all you need is a scenario file using the Karate DSL (domain-specific language) to achieve the same desired outcome.

Plain Text

Feature: Demonstration use case

  Scenario: search for Martin Fowler website
    Given driver 'https://google.com'
    And input('input[name=q]', 'Martin Fowler')
    And click('{^}Recherche Google')
    When waitForText('Martin Fowler', 'APPEARED').click('{^h3}Martin Fowler')
    Then match driver.url == 'https://martinfowler.com/'

Analysis Here we can see a distinct difference in the amount of code required. Karate has already integrated the DOM interaction functions down to the Gherkin language level, which is a significant advantage in terms of development speed. However, this may affect the readability of the scenario file, particularly in a BDD context. As a result, it is reasonable to question whether BDD can be effectively executed using Karate. The answer may vary depending on the project's context, its users, and the technical expertise of those involved. However, using Karate can greatly reduce maintenance expenses due to having less code and fewer bugs. This is a critical factor in the profitability of automated testing, which is contingent upon its simplicity, maintainability, and durability. CI/CD, Performance and Scalability In both cases, we presume that we will be using the following process: Basic automation process The issue of data is not relevant in our case. Although it is an important factor when discussing test automation, both Selenium and Karate encounter the same problem and it is unrelated to their core functionality. So, our main focus will be on how both technologies can be integrated into a CI/CD environment. Selenium Here we will explore the use of Selenium Grid to compare the full range of features offered by Selenium. Required Components Selenium Grid Hub: The central control point of the Selenium Grid architecture, which manages the distribution of test execution to different nodes (machines or virtual environments). The hub receives test requests from test scripts and routes them to available nodes according to desired capabilities, such as browser, platform, version, etc. Nodes: Individual machines or virtual environments that are responsible for executing the tests. Each node registers with the hub and advertises its capabilities, including supported browsers and operating systems. Test scripts connect to the hub, which in turn redirects them to appropriate nodes for execution based on the desired capabilities. WebDriver Instances: WebDriver instances are indispensable for interacting with browsers and automating UI tests. The Remote WebDriver instance is used in the test script to send commands to browsers running on the nodes. These instances act as a bridge between the test script and the browser, enabling actions like clicking, inputting text, and validating content.
The architecture of the aforementioned components is as follows: Selenium Grid components architecture Another option is to use Selenoid, an open-source project that offers a lightweight and efficient method for implementing Selenium Grid through Docker containers. It simplifies the process of running Selenium tests across various browsers and versions. Selenoid brings containerization benefits to Selenium Grid, which facilitates the handling of test execution environments and reduces resource overhead. Selenoid also offers built-in video recordings of test sessions. This is especially helpful for diagnosing test failures, as you can watch the video to comprehend the failure context. Selenium tests execution on Selenoid The key distinction lies in the technology employed. Selenoid utilizes Docker containers to achieve browser isolation, whereas Selenium Grid relies on separate nodes with Remote WebDriver instances. The objective of both approaches is to furnish uniform and replicable browser environments for test execution, alleviating problems that may crop up due to shared browser instances. In summary, both Selenium Grid and Selenoid utilize specialized browser instances for every test session to guarantee a stable and separate testing environment. Though approaches may vary, the fundamental principle of browser isolation persists. Karate For Karate, things are much simpler. Two Docker images are available and should be deployed on the CI server in order to emulate the browser. Then you can drop in your Karate scenarios and launch them in different ways: using a standalone version of Karate (in that case, you will prefer to use this Docker image), or using a Java jar containing the Karate library. Karate CI/CD architecture with standalone Karate jar It is important to note that Karate enables native multithreading. Instead of using multiple browser instances to run tests, tests can be executed concurrently by adding a custom parameter. The figure below shows multi-threading inside a container with three threads. Karate multi-threading Communities and Usage It is noteworthy that when it comes to e2e testing, Selenium is the leading framework and enjoys widespread adoption in the community. Therefore, we will commence by conducting a comparative analysis, followed by a closer examination of the activity surrounding Karate. Comparative Analysis GitHub Stars This initial metric measures the number of "stars" granted to various repositories by the GitHub community. However, this criterion alone is not conclusive, as bots may artificially inflate the value. As a result, we utilized the Astronomer tool, which provides a confidence score for GitHub repositories based on the following criteria: the average number of lifetime contributions among stargazers; the average number of private contributions; the average number of publicly created issues; the average number of publicly authored commits; the average number of publicly opened pull requests; the average number of public code reviews; the average weighted contribution score (weighted by making older contributions more trustworthy); every 5th percentile, from 5 to 95, of the weighted contribution score; and the average account age (older is more trustworthy). Analysis of the Intuit/Karate repository with Astronomer The achieved grade of "A" confirms the quality of the information analyzed within the repository. Therefore, we consider the "stars" criterion as reliable.
The figure below displays Cypress and Cucumber for additional comparison points besides the two examined frameworks. The y-axis represents the number of GitHub stars, and the x-axis shows the date. As expected, Selenium surpasses its rivals. However, it is worth mentioning that Karate has gained significant ground and even outperformed Cucumber, which is a highly prevalent framework utilized for BDD development with Gherkin. Cypress remains popular, particularly within the JavaScript community, due to its significant reputation. Comparison of the number of GitHub Stars — Star history Stack Overflow Trends We will now examine the "trends" criterion on Stack Overflow to gauge the activity of the community involved in a particular technology. By correlating the number of users with the corresponding tag on Stack Overflow, we can assess the level of support available for the technology, as the site is extensively used by the developer community. This ensures varying levels of support (courtesy of the community, given that these are open-source projects). The greater the frequency of occurrences, the simpler it is to discover solutions to specific problems. The initial graph examines the following technologies: Selenium, Cucumber, Cypress, and Karate. The y-axis presents the proportion of questions posted on Stack Overflow that contain the corresponding tag, while the x-axis displays the months/years. Stack Overflow trends - including Selenium Once again, Selenium is in the lead, confirming the previous result. To improve our analysis, we will display the same graph without Selenium to avoid compressing the curves (the drop in the Selenium curve is due to the tag having been moved by Stack Exchange to another website dedicated to software quality). Stack Overflow trends — without Selenium Karate has a high percentage, having experienced a significant rise since its inception. Cucumber has remained stable, closely trailing Karate. Cypress is still on top but seems to be experiencing a marked decline. A correlation can be established between the acceleration depicted on the "GitHub Stars" chart and the level of occurrence here. Conclusion We note that Karate is a more code-efficient framework, enabling simple writing due to its design to avoid Selenium's complexity. Its CI/CD capabilities are powerful for most projects. However, Selenium Grid still offers specific features that Karate does not for certain integrations. The established and strong community around Selenium is a valuable aspect, as is the variety of supported programming languages. On the other hand, Karate only offers one "language" — its own DSL. Whilst this is quite easy to learn and intuitive for programmers, it can still be a bit complicated for non-technical users, especially in a BDD context. The community around this framework is growing, and many improvements have been made since its inception. Peter Thomas is very responsive on Stack Overflow and his own GitHub, answering questions promptly, and the extensive documentation is clear and exhaustive. However, the project is still very closely linked to Peter Thomas for the moment. Also, it's crucial to bear in mind that we're solely referring to Karate-UI and that this framework provides several other functionalities, such as API testing and performance testing using Gatling based on these API tests, which is highly engaging. Karate is a contemporary and intriguing testing framework that presents a viable option to consider for your project, depending on its specific characteristics.
To ensure traffic safety in railway transport, non-destructive inspection of rails is regularly carried out using various approaches and methods. One of the main approaches to determining the operational condition of railway rails is ultrasonic non-destructive testing [1]. Currently, the search for images of rail defects in the recorded flaw patterns is performed by a human operator. The successful development of algorithms for searching and classifying data makes it possible to propose the use of machine learning methods to identify rail defects and reduce the workload on humans by creating expert systems. The complexity of creating such systems is described in [1, 3-6, 22] and is due, on the one hand, to the variety of graphic images obtained during multi-channel ultrasonic inspection of rails, and on the other hand, to the small number of defective data instances (an imbalanced data set). One of the possible ways to create expert systems in this area is an approach based on the decomposition of the complex task of analyzing the entire multichannel defectogram into individual channels or their sets, characterizing individual types of defects. One of the most common rail defects is a radial bolt hole crack, referred to in the literature as a "Star Crack" (Fig. 1). This type of defect is mainly detected by a flaw detector channel with a preferred central inclined angle of ultrasound input in the range of 38°-45° [1-2]. Despite the systematic introduction of continuous track on the railway network, the diagnosis of bolt holes remains an important task [1-2], which was the reason for its identification and consideration in this work. Figure 1 – Example of a radial crack in a rail bolt hole Purpose of the work: to compare the effectiveness of various machine learning models in solving the problem of classifying the states of bolt holes of railway rails during their ultrasonic examination. This purpose is achieved by solving the following problems: data set generation and preparation; exploratory data analysis; selection of protocols and metrics for evaluating the operation of the algorithms; selection and synthesis of implementations of classification models; and evaluation of the effectiveness of the models on test data. Data Set Generation and Preparation When a flaw detector equipped with piezoelectric transducers (PZTs) travels along a railway track, ultrasonic pulses are emitted into the rail at a specified interval. At the same time, the receiving PZTs register the reflected waves. The detection of defects by the ultrasonic method is based on the principle of reflection of waves from metal inhomogeneities, since cracks and other inhomogeneities differ in their acoustic impedance from the rest of the metal [1-5]. During ultrasonic scanning of rails, their structural elements and defects produce acoustic responses, which are displayed on the defectogram in the form of characteristic graphic images (scans). Figure 2 shows examples of defectograms in the form of B-scans ("brightness scans") of sections of rails with a bolted connection, which were obtained by measuring systems of various types at an inclined angle of ultrasound input. Figure 2 - Examples of individual flaw detection channels (B-scan) when scanning bolt holes of rails by flaw detectors of various companies Individual scans of bolt holes (frames) can be separated from such flaw patterns, for example, by applying amplitude criteria (Fig. 3). Figure 3 - Selection of frames with alarms of bolt holes from B-scan Width W and length L of each frame are the same (Fig.
3), and are selected based on the maximum possible dimensions of bolt hole signals and their flaws. Each such frame (instance) represents a part of the B-scan and therefore contains data on the coordinates, measurements, and sizes of each point, with data from each of the two ultrasonic signal input angles on the rail (±40°). In [3], such data frames are converted into a grayscale matrix of shape (60, 75), that is, 60 * 75 = 4500 elements, and a classification network based on deep learning methods is built and successfully trained on them. However, that work does not consider alternative and less capacious data frame formats and does not show the capabilities of basic methods and machine learning models, so the present work is intended to fill this gap. Various forms of radial cracks of rail bolt holes, their locations, and the reflective properties of the surface lead to changing graphic images and, together with a defect-free state, generate a data set in which 7 classes can be distinguished. In binary classification practice, it is common to assign class "1" to the rarer outcomes or conditions of interest, and class "0" to the common condition. In relation to the identification of defects, we will assign the common, frequently encountered defect-free state to class "0" and the defective states to classes "1"-"6". Each defect class is displayed on the flaw pattern as a characteristic image that is visible to experts during data interpretation (Fig. 4). Although the presence or absence of a flaw (binary classification) is crucial during the operation of the railway track, we will consider the possibilities of classification algorithms and quantify which types of defects or flaws are more likely to be falsely classified as defect-free, which is a dangerous case in the diagnosis of rails. Therefore, in this work, the problem is treated explicitly as a multiclass classification. Figure 4 - Examples of B-scan frames (60, 75) with characteristic images of bolt holes with different radial cracks assigned to one of the 7 classes Each class instance can be represented by a basic structure: rectangular (tabular) data. To equalize instance sizes, the table length is set to k = 60 records (30% more than the maximum possible), and empty cells are filled with zero values (Fig. 5a). The original instance can then have the form (6, 60) or be reduced to a flat array and described in a 6 * 60 = 360-dimensional space (Fig. 5c); on the B-scan graph it will look like Fig. 5b. Figure 5 – Representation of a rectangular data instance Selecting an Assessment Protocol Collecting and annotating data from ultrasonic testing of rails is associated with significant difficulties, which are described in [3], so we will use a synthesized data set obtained using mathematical modeling. The essence of this approach is reflected in Fig. 6, and its applicability is shown in [3]. The term "synthesized data" is widely discussed in the context of creating realistic visual objects, for example, on the NVIDIA blog [23]. This work extends the application of synthesized data into the field of non-destructive testing. Figure 6 - Application of ML model A sufficiently large number of data instances obtained on the basis of mathematical modeling allows us to avoid the problem of a rare class and to choose a protocol for evaluating models in the form of separate balanced sets: training, validation, and testing. Let's limit the data sets: training data = 50,000, test data = 50,000, and validation data = 10,000 instances.
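To make the protocol concrete, here is a minimal sketch of preparing balanced splits. It assumes the synthesized frames have already been flattened into 360-value vectors X with integer class labels y; the helper name and the stand-in data are illustrative only, and the article's own data generator is not reproduced here.

Python

import numpy as np
from sklearn.model_selection import train_test_split

def balanced_splits(X, y, n_train=50_000, n_test=50_000, n_val=10_000, seed=0):
    """Split flattened frames of shape (N, 360) into stratified train/test/validation sets."""
    # Carve out the training set first, keeping the 7 classes balanced via stratify.
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, train_size=n_train, stratify=y, random_state=seed)
    # Split the remainder into test and validation sets, again stratified by class.
    X_test, X_val, y_test, y_val = train_test_split(
        X_rest, y_rest, train_size=n_test, test_size=n_val,
        stratify=y_rest, random_state=seed)
    return (X_train, y_train), (X_test, y_test), (X_val, y_val)

# Example with random stand-in data (the real frames come from the simulator):
X = np.random.rand(110_000, 360)
y = np.random.randint(0, 7, size=110_000)
train, test, val = balanced_splits(X, y)
print(train[0].shape, test[0].shape, val[0].shape)  # (50000, 360) (50000, 360) (10000, 360)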
Choosing a Measure of Success The absence of a difference in the relative sizes of the classes (class balance) allows us to choose accuracy as the measure of success when training the algorithms: the ratio of the number of correctly classified instances to their total number. A single metric cannot evaluate all aspects of a model's applicability for a situation, so the model testing step uses a confusion matrix and the precision and recall measures for each class classifier. Exploratory Data Analysis Information about the balance of classes in the training, test, and validation sets is presented in Fig. 7. Figure 7 - Data Quantity Summary The distribution of normalized alarm depths and their coordinates for the positive measuring channel Ch+40° and classes 0, 2, 3, and 6 is shown in Fig. 8. The distributions for Ch-40° and classes 0, 1, 4, and 5 have a symmetrical pattern. Figure 8 – Distribution of normalized values of coordinates and depths of data from the Ch+40° measurement channel for classes 0, 2, 3, 6 The Principal Component Analysis (PCA) method was used for exploratory data analysis and for assessing data redundancy; its two-dimensional representation is shown in Fig. 9. The central class is class 0, from which classes 2, 3, 6 and 1, 4, 5 are located on opposite sides, which corresponds to their graphical display on the B-scan. Figure 9 - Visualization of the two-dimensional representation of the PCA method In general, the two-dimensional representation of the classes shows weak clustering, which indicates the need to use a higher data dimensionality for classification, up to the original flat size of 6 * 60 = 360. The graph of the cumulative explained variance as a function of the number of PCA components (Figure 10a) shows that with 80 components, 98% of the variance is already explained, which indicates a high level of redundancy in the original data. This can be explained by the sparseness of the data: the 80 retained PCA components are essentially independent of the zero-valued cells (Fig. 10b). Figure 10 – PCA: a) cumulative explained variance as a function of the number of components of the PCA method, b) contributions of predictive variables of a flat data array in projections on the PCA axes Let's consider the assessment of the occupancy of data instances with non-zero values for each class (Fig. 11). Fig. 11 - Assessment of occupancy with non-zero values of class instances Note the similarity of the ranges and quartiles of the classes. Class 0 has the lowest median, as the defect-free condition of the bolt hole is devoid of additional crack alarms. Classes 5 and 6 have the highest median values, indicating high data occupancy due to the presence of alarms from both the lower and upper radial cracks of the bolt hole. Classes 1-4 have similar median values, indicating that they are filled with data due to the presence of alarms from only the upper or lower radial crack of the bolt hole. Classes 1 and 2, 3 and 4, and 5 and 6, respectively, have similar medians and distributions, due to the symmetry of the data relative to the center of the bolt hole. The level of 80 PCA components is lower than the median for classes 1-6 but is sufficient to describe 98% of the variance, which may indicate redundancy caused not only by zero values in the data. A possible explanation is that the amplitude values of the alarms do not change much within each class and have a weak effect on the variance.
This fact is confirmed by the practice of searching for defects, in which flaw detector operators rarely use the amplitude parameter. To assess the complexity of the upcoming classification task, the multidimensional structure of the input data was studied using manifold learning techniques: random projection embedding, Isometric Mapping (Isomap), standard Locally Linear Embedding (LLE), modified Locally Linear Embedding (LLE), Local Tangent Space Alignment (LTSA) embedding, Multidimensional Scaling (MDS), spectral embedding, and t-distributed Stochastic Neighbor Embedding (t-SNE). In addition, supervised techniques that allow data to be projected into a lower dimension were applied: truncated SVD embedding, random trees embedding, Neighborhood Components Analysis (NCA), and Linear Discriminant Analysis (LDA). The results of the algorithms for embedding data from 3,000 samples of the original shape (6, 60) into two-dimensional space are presented in Fig. 12. Figure 12 – Embedding data into two-dimensional space using various techniques (the color of the dots represents the class) For the manifold learning methods, the data in the graphs are poorly separated in the parametric space, which indicates the expected difficulty of classifying the data with simple supervised algorithms. Note also that the supervised dimensionality-reduction method Linear Discriminant Analysis shows good data grouping and can be a candidate for a classification model. Development of Data Classification Models Basic Model The prediction accuracy of each class out of seven possible with a random classifier is 1 / 7 = 0.143 and is the starting point for assessing the statistical power (quality) of future models. As a base model, we will choose Gaussian Naive Bayes, which is often used in such cases. A code fragment for fitting the model on the training data and predicting on the test data:

Python

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from time import time
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

batch = 50000
# Gen_2D_Orig_Arr, train_dir, and test_dir are defined elsewhere in the author's project.
train_gen = Gen_2D_Orig_Arr(train_dir, batch_size=batch, num_classes=7)
Xtrain, ytrain = next(train_gen)
# Xtrain.shape = (50000, 6, 60, 1), ytrain.shape = (50000, 7)
Xtrain = np.reshape(Xtrain, (batch, Xtrain.shape[1] * Xtrain.shape[2]))  # (50000, 360)
ytrain = np.argmax(ytrain, axis=1)  # ytrain.shape = (50000,)

test_gen = Gen_2D_Orig_Arr(test_dir, batch_size=batch, num_classes=7)
Xtest, ytest = next(test_gen)
Xtest = np.reshape(Xtest, (batch, Xtest.shape[1] * Xtest.shape[2]))
ytest = np.argmax(ytest, axis=1)  # ytest.shape = (50000,)

model = GaussianNB()
model.fit(Xtrain, ytrain)

start_time = time()
y_model = model.predict(Xtest)  # y_model.shape = (50000,)
timing = time() - start_time

acc = accuracy_score(ytest, y_model)
print('acc = ', acc)
print(classification_report(ytest, y_model, digits=4))

mat = confusion_matrix(ytest, y_model)  # mat.shape = (7, 7)
fig = plt.figure(figsize=(10, 6), facecolor='1')
ax = plt.axes()
sns.heatmap(mat, square=True, annot=True, cbar=False, fmt="d", linewidths=0.5)
plt.xlabel('predicted class')
plt.ylabel('true class')
print(f'{timing:.3f} s - Predict time GaussianNB')

The resulting confusion matrix and summary report on model quality are presented in Fig. 13 a and b. The trained model has statistical power, as it has an overall accuracy of 0.5819, which is 4 times higher than the accuracy of a random classifier.
Despite the rather low accuracy of the model, we will consider the specific relationship between the qualitative indicators of its work and the graphical representation of the data projected using the Linear Discriminant Analysis method (Fig. 13c). Figure 13 - Summary Quality Assessments of the Gaussian Naïve Bayes Model: a) confusion matrix showing model misclassification counts; b) a report on the qualitative indicators of the model's performance in the form of various accuracy metrics; c) projection of the data into two-dimensional space using the Linear Discriminant Analysis method The projected data of class 6 are the most distant from most points of the other classes (Fig. 13c), which is reflected in the high precision of its classifier, equal to 0.9888; however, the closeness of the representation of class 3 reduced the recall of classifier 6 to 0.5688 due to false negative predictions, which appear as an error count of 2164 in the confusion matrix. The projection of class 5 is also distant, which is reflected in the high precision of its classifier, equal to 0.9916; however, it intersects with classes 1 and 4, which reduced the recall of the classifier to 0.4163, due to erroneous predictions with frequencies of 2726 and 1268 for classifiers 1 and 4, respectively. The projection of class 1 intersects with classes 5, 4, and 0; accordingly, classifier 1 has false positives with a frequency of 2726 for class 5 and false negatives with frequencies of 2035 and 3550 in favor of classifiers 0 and 4. Similar relationships are observed for the other classes. One of the interesting cases is the behavior of classifier 0. Defect-free class 0 is in the middle of the projections, which corresponds to its graphic image, which is closest to classes 1, 2, 3, and 4 and most distinguishable from classes 5 and 6 (Fig. 4). Classifier 0 recognizes its own data class well, which yields the highest recall score of 0.9928, but it has numerous false positives in classes 1, 2, 3, and 4, with a precision of 0.4224; that is, classes with defects are often classified as the defect-free class (class 0), which makes the Gaussian Naive Bayes model completely unsuitable for flaw detection purposes. The resulting Gaussian Naive Bayes classifier is too simple to describe the complex structure of the data. Linear Discriminant Analysis (LDA) Classifier Model Preliminary analysis based on data dimensionality reduction showed a good grouping of classes with the Linear Discriminant Analysis method (Fig. 12), which served as the motivation for its use as one of the next models:

Python

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda_model = LinearDiscriminantAnalysis(solver='svd', store_covariance=True)
lda_model.fit(Xtrain, ytrain)
y_model = lda_model.predict(Xtest)

acc = accuracy_score(ytest, y_model)
print('acc = ', acc)
print(classification_report(ytest, y_model, digits=4))

mat = confusion_matrix(ytest, y_model)
fig = plt.figure(figsize=(10, 5), facecolor='1')
sns.heatmap(mat, square=True, annot=True, cbar=False, fmt='d', linewidths=0.5)
plt.xlabel('predicted class')
plt.ylabel('true class')

The results of its training and prediction are presented in Fig. 14. The overall accuracy of the model was 0.9162, which is 1.57 times better than the accuracy of the basic model. However, classifier 0 has a large number of false positives for classes 2 and 4, and its precision is only 0.8027, which is also not a satisfactory indicator for the purposes of practical application.
Figure 14 – Summary assessments of the quality of work of the Linear Discriminant Analysis (LDA) classifier The hypothesis that a lack of training data limits the accuracy of the LDA model is not confirmed, since the constructed "learning curve" presented in Fig. 15 shows a high convergence of the training and testing accuracy curves at the level of 0.92 with a training set size of 5,000-6,000 items:

Python

from sklearn.model_selection import learning_curve

gen = Gen_2D_Orig_Arr(train_dir, batch_size=8000, num_classes=7)
x1, y1 = next(gen)  # x1.shape = (8000, 6, 60, 1), y1.shape = (8000, 7)
x1 = np.reshape(x1, (x1.shape[0], x1.shape[1] * x1.shape[2]))  # x1.shape = (8000, 360)
y1 = np.argmax(y1, axis=1)  # y1.shape = (8000,)

N, train_sc, val_sc = learning_curve(lda_model, x1, y1, cv=10,
                                     train_sizes=np.linspace(0.04, 1, 20))
std_tr = np.std(train_sc, axis=1)
std_vl = np.std(val_sc, axis=1)

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(N, np.mean(train_sc, 1), '-b', marker='o', label='training')
ax.plot(N, np.mean(val_sc, 1), '-r', marker='o', label='validation')
ax.hlines(np.mean([train_sc[-1], val_sc[-1]]), 0, N[-1],
          color='gray', linestyle='dashed')
ax.fill_between(N, np.mean(train_sc, 1) - 3 * std_tr,
                np.mean(train_sc, 1) + 3 * std_tr, color='blue', alpha=0.2)
ax.fill_between(N, np.mean(val_sc, 1) - 3 * std_vl,
                np.mean(val_sc, 1) + 3 * std_vl, color='red', alpha=0.2)
ax.legend(loc='best')
ax.set_xlabel('training size')
ax.set_ylabel('score')
ax.set_xlim(0, 6500)
ax.set_ylim(0.7, 1.02)

Figure 15: High convergence of the training and testing accuracy curves at the level of 0.92 with a training set size of 5,000-6,000 items When trying to create such a classification system, manufacturers of flaw detection equipment are faced with the difficulty of assessing the dependence of the predictive accuracy of the system on the number of items in the data set that need to be obtained in the process of rail diagnostics. The resulting learning curve based on model data allows us to estimate this number in the range of 5,000-6,000 instances to achieve an accuracy of 0.92 within the framework of the LDA algorithm. The decreasing training-score curve (blue in Fig. 15) of the LDA classifier shows that it is too simple for the complex data structure, and there is a justified need to find a more complex model to improve prediction accuracy. Dense Network One of the options for increasing prediction accuracy is the use of fully connected networks. A variant of such a structure, with optimal parameters found using the Keras Tuner tool, is shown in Figure 16a; the accuracy of the model increased to 0.974 compared to the previous method (Figure 16b), and the precision of the class 0 classifier increased to 0.912. Fig. 16 – Model structure and accuracy indicators The steady gain in prediction accuracy obtained from more complex (computationally costly) machine learning algorithms justifies building increasingly complex models. Support Vector Machine (SVM) The use of the support vector algorithm with a radial basis function (RBF) kernel and a grid search over the model hyperparameters using the GridSearchCV class from the scikit-learn machine learning library made it possible to obtain a model with improved quality indicators (Fig. 17). Fig.
17 - Summary of Classifier Performance Estimates Based on Support Vector Method (SVM) The use of the SVM method increased both the overall prediction accuracy, to 0.9793, and the precision of the class 0 classifier, to 0.9447. However, the average running time of the algorithm on a test data set of 50,000 instances, each with an initial dimension of 360, was 9.2 s, the maximum among the considered classifiers. Attempts to reduce the model's running time with pipelines that combine dimensionality-reduction techniques with the SVM algorithm did not preserve the achieved accuracy. Random Forest Classifier The RandomForestClassifier, based on an ensemble of random trees and implemented in the scikit-learn package, is another candidate for increasing the classification accuracy of the data under consideration:

Python

from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=50)

The performance estimates of the random forest algorithm of 50 trees on the test set are shown in Fig. 18. The algorithm made it possible to increase both the overall prediction accuracy, to 0.9991, and the important indicator of class 0 precision, to 0.9968. The class 0 classifier makes the most mistakes in classes 1-4, which are similar in graphical representation. The precision of the classifiers for classes 1-4 is high and is reduced only by errors in favor of classes 5 and 6, which is not critical for identifying flaws. The average running time of the algorithm for predicting the test data set on the CPU was 0.7 s, which is 13 times less than the running time of SVM, while also slightly improving accuracy (by about 0.02). Fig. 18 – Summary assessments of the quality of the classifier based on the random trees method The learning curve of the RandomForestClassifier, presented in Fig. 19, shows a high level of optimality of the constructed model: With increasing training data, the efficiency of the model does not decrease, which indicates no sign of underfitting. The efficiency scores at the training and testing stages converge to high values with a difference of no more than 0.028, which indicates that the model is not overfitted. Fig. 19 – Learning curve of the RandomForestClassifier The resulting learning curve allows us to estimate the minimum required number of samples of each class to achieve acceptable accuracy at the level of 0.98: 1550 data instances, that is, 1550 / 7 ≈ 220 samples for each of the 7 classes. The high accuracy and speed of the random forest algorithm make it possible to assess the magnitude of the influence (importance) of all 360 predictive variables on the overall accuracy of the model. The evaluation was carried out by measuring the average drop in model accuracy when one variable at a time is randomly shuffled, which removes its predictive power. Fig.
20 shows the result of a code fragment for assessing the importance of the variables for the accuracy of the model:

Python

from sklearn.model_selection import train_test_split

rf = RandomForestClassifier(n_estimators=10)
gen = Gen_2D_Orig_Arr(train_dir, batch_size=8000)
x1, y1 = next(gen)  # x1.shape = (8000, 6, 60, 1), y1.shape = (8000, 7)
x1 = np.reshape(x1, (x1.shape[0], x1.shape[1] * x1.shape[2]))  # x1.shape = (8000, 360)
y1 = np.argmax(y1, axis=1)  # y1.shape = (8000,)

Xtrain, Xtest, ytrain, ytest = train_test_split(x1, y1, test_size=0.5)
rf.fit(Xtrain, ytrain)
acc = accuracy_score(ytest, rf.predict(Xtest))
print(acc)

scores = np.array([], dtype=float)
for _ in range(50):
    train_X, valid_X, train_y, valid_y = train_test_split(x1, y1, test_size=0.5)
    rf.fit(train_X, train_y)
    acc = accuracy_score(valid_y, rf.predict(valid_X))
    for column in range(x1.shape[1]):
        # Shuffle one column at a time to remove its predictive power.
        X_t = valid_X.copy()
        X_t[:, column] = np.random.permutation(X_t[:, column])
        shuff_acc = accuracy_score(valid_y, rf.predict(X_t))
        scores = np.append(scores, (acc - shuff_acc) / acc)

scores = np.reshape(scores, (50, 360))
sc = scores.mean(axis=0)

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(sc, '.-r')
ax.set_xlabel('Predictive variables')
ax.set_ylabel('Importance')
ax.set_xlim([0, 360])
ax.xaxis.set_major_locator(plt.MaxNLocator(6))

Figure 20 - Assessment of the importance of predictive variables by RandomForestClassifier-based model accuracy The graph in Fig. 20 shows that the most important predictive variables are those from 60 to 85 for channel +40° and from 240 to 265 for channel -40°, which determine the depth of the alarm. The presence of peaks at the beginning and end of each range indicates an even greater predictive importance of the depths of the beginning and end of the alarms. The total number of important variables can be estimated at 50. The importance of the variables determining the coordinates and amplitude of the alarm in each data instance is much lower. This assessment is consistent with the assumptions made during the exploratory analysis. Training RandomForestClassifier on the entire training data set without amplitudes showed an overall accuracy of 0.9990, and without amplitudes and coordinates, 0.9993. Excluding the amplitude and coordinate parameters from each data instance reduces the size of the data under consideration to (2, 60) = 120 predictive variables without reducing accuracy. The obtained result allows us to use only the alarm depth parameter for the purpose of data classification. The accuracy achieved by RandomForestClassifier is sufficient and solves the problem of classifying defects in bolt holes. However, for the purpose of generalizing the capabilities, let's consider a class of deep learning models based on a convolutional neural network. Deep Learning (DL) Model Synthesis and training of a convolutional network require an iterative process of searching for the best structures and optimizing their hyperparameters. Fig. 21 shows the final version of a simple network structure in the form of a linear stack of layers and the process of its training. Figure 21 – Model structure and report on its training process The prediction results of the convolutional neural network are presented in Fig. 22. The overall accuracy of the model on the test data was 0.9985, which is 1.71 times better than the accuracy of the base model. The number of false positives of classifier 0 is 2 + 24 + 6 + 2 = 34 out of all 42,893 defective instances (Fig. 22a). The average time for predicting the test data on the CPU was 4.55 s.
Figure 22 – Summary assessments of the quality of work of a trained network based on CNN on test data One of the important tasks of the resulting classifier in its practical use will be the accurate determination of the defect-free class (class 0), which will eliminate the false classification of defective samples as non-defective. It is possible to reduce the number of false positives for the defect-free class by changing the probability threshold. To estimate the applicable threshold cut-off level, a binarization of the multi-class problem was carried out, separating the defect-free state from all defective states, which corresponds to the "One vs Rest" strategy. By default, the threshold value for binary classification is set to 0.5 (50%). With this approach, the binary classifier has the quality indicators shown in Fig. 23. Fig. 23 – Qualitative indicators of a binary classifier at a cutoff threshold of 0.5 The resulting precision for the "No defect" class was 0.9952, the same as for the multiclass classifier for class "0". The sklearn.metrics.precision_recall_curve function makes it possible to plot the changes in the precision and recall of a binary classifier as the cutoff threshold changes (Fig. 24a); a minimal sketch of this procedure appears at the end of this section. At a cutoff threshold of 0.5, the number of false positives is 34 samples (Fig. 24b). The maximum combined level of precision and recall of the classifier is achieved at the intersection point of their graphs, which corresponds to a cutoff threshold of 0.66. At this point, the classifier reduces the number of false positives for the "No defect" class to 27 (Fig. 24b). Increasing the threshold to 0.94 reduces the false positives to 8, at the cost of an increase in false negatives to 155 samples (Fig. 24b), that is, a lower recall. A further increase in the cutoff threshold significantly reduces the recall of the classifier to an unacceptable level (Fig. 24a). Figure 24 – Effect of the cut-off threshold: a) graph of changes in precision and recall depending on the cut-off threshold value (precision-recall curve); b) confusion matrices at different cutoff thresholds With the cutoff threshold set to 0.94, the qualitative assessments of the classifier are shown in Fig. 25. Precision for the "No defect" class increased to 0.9989. Fig. 25 – Qualitative indicators of a binary classifier with a cutoff threshold of 0.94 The eight false-positive classified data samples, with the characteristic graphical signs of defects highlighted, are shown in Fig. 26. Fig. 26 - Eight false-positive classified samples Among these images, the samples marked "controversial" show a very short radial crack that is difficult to classify as a defect; the remaining six samples are classifier errors. Note that no samples with defects in the form of a long radial crack were falsely classified; such samples are also the easiest to classify during manual analysis of flaw patterns. A further increase in model accuracy is possible through ensembles of the resulting models (DL and RandomForestClassifier). The considered models, trained on different input data formats, including the direct B-scan format as shown in [3], can also be added to such an ensemble.
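As a minimal sketch of the threshold-selection procedure above: it assumes a trained network that exposes class probabilities (for example, the CNN's softmax output) and treats class 0 ("no defect") as the positive class of the binarized problem. The names cnn_model and Xtest_cnn are placeholders for the trained Keras model and its (n_samples, 6, 60, 1) test input, which are not shown in this excerpt, and the intersection-point rule is only one possible way to pick the threshold.

Python

import numpy as np
from sklearn.metrics import precision_recall_curve

# Probability that each test frame is defect-free (class 0); `proba` is assumed to be
# the (n_samples, 7) matrix of class probabilities returned by the trained network.
proba = cnn_model.predict(Xtest_cnn)
p_no_defect = proba[:, 0]
y_binary = (ytest == 0).astype(int)  # 1 = "no defect", 0 = any defect

precision, recall, thresholds = precision_recall_curve(y_binary, p_no_defect)

# Pick the threshold where precision and recall are closest to each other
# (the intersection point discussed above), then apply it.
idx = np.argmin(np.abs(precision[:-1] - recall[:-1]))
chosen = thresholds[idx]
y_pred_no_defect = (p_no_defect >= chosen).astype(int)
print(f"chosen threshold = {chosen:.2f}")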
Conclusions and Discussion The main quality indicators of the developed models for classifying defects in bolt holes are summarized in the diagram in Fig. 27. The gradual and reasonable complication of the classification models is reflected in the diagram as an increase in both the overall accuracy of the models (blue) and the important indicator of class 0 precision (orange). Maximum accuracy rates above 0.99 were achieved by the models based on the random forest and the convolutional neural network. At the same time, the random forest model has the advantage of less time spent on prediction. Fig. 27 – Highlighted quality indicators of the considered classification models In this work: The possibility of searching for defects in an ultrasonic flaw pattern is shown by decomposing it into separate data channels and isolating individual diagnostic sections. An assessment is made of the influence of the predictive variables, in the form of amplitude and coordinates, on the quality of classification. An estimate is given of the data set size required to build a classification model of bolt hole defects with an accuracy of 98%, which can serve as a guide for manufacturers of flaw detection equipment when creating automatic expert systems. The possibility of achieving high accuracy rates for classifying the states of rail bolt holes based on classical machine learning algorithms is shown. Qualitative assessments of the operation of the deep learning model are obtained and show the possibility and feasibility of using a convolutional neural network architecture for the synthesis of segmentation networks for searching for defects in continuous flaw patterns (B-scan). References [1] Markov AA, Kuznetsova EA. Rail flaw detection. Formation and analysis of signals. Book 2. Decoding of defectograms. Saint Petersburg: Ultra Print; 2014. [2] Markov AA, Mosyagin VV, Shilov MN, Fedorenko DV. AVICON-11: New Flaw-Detector for One Hundred Percent Inspection of Rails. NDT World Review. 2006; 2 (32): 75-78. Available from: http://www.radioavionica.ru/activities/sistemy-nerazrushayushchego-kontrolya/articles/files/razrab/33.zip [Accessed 14th March 2023]. [3] Kaliuzhnyi A. Application of Model Data for Training the Classifier of Defects in Rail Bolt Holes in Ultrasonic Diagnostics. Artificial Intelligence Evolution [Internet]. 2023 Apr. 14 [cited 2023 Jul. 28]; 4(1): 55-69. DOI: https://doi.org/10.37256/aie.4120232339 [4] Kuzmin EV, Gorbunov OE, Plotnikov PO, Tyukin VA, Bashkin VA. Application of Neural Networks for Recognizing Rail Structural Elements in Magnetic and Eddy Current Defectograms. Modeling and Analysis of Information Systems. 2018; 25(6): 667-679. Available from: doi:10.18255/1818-1015-2018-6-667-679 [5] Bettayeb F, Benbartaoui H, Raouraou B. The reliability of the ultrasonic characterization of welds by the artificial neural network. 17th World Conference on Nondestructive Testing; 2008; Shanghai, China. [Accessed 14th March 2023] [6] Young-Jin C, Wooram C, Oral B. Deep Learning-Based Crack Damage Detection Using Convolutional Neural Networks. Computer-Aided Civil and Infrastructure Engineering. 2017; 32(5): 361-378. Available from: doi: 10.1111/mice.12263 [7] Heckel T, Kreutzbruck M, Rühe S. High-Speed Non-Destructive Rail Testing with Advanced Ultrasound and Eddy-Current Testing Techniques. 5th International workshop of NDT experts - NDT in progress 2009 (Proceedings). 2009; 5: 101-109. [Accessed 14th March 2023]. [8] Papaelias M, Kerkyras S, Papaelias F, Graham K.
The future of rail inspection technology and the INTERAIL FP7 project. 51st Annual Conference of the British Institute of Non-Destructive Testing 2012, NDT 2012. 2012 [Accessed 14th March 2023]. [9] Rizzo P, Coccia S, Bartoli I, Fateh M. Non-contact ultrasonic inspection of rails and signal processing for automatic defect detection and classification. Insight. 2005; 47(6): 346-353. Available from: doi: 10.1784/insi.47.6.346.66449 [10] Nakhaee MC, Hiemstra D, Stoelinga M, van Noort M. The Recent Applications of Machine Learning in Rail Track Maintenance: A Survey. In: Collart-Dutilleul S, Lecomte T, Romanovsky A. (eds.) Reliability, Safety, and Security of Railway Systems. Modelling, Analysis, Verification, and Certification. RSSRail 2019. Lecture Notes in Computer Science, vol 11495. Springer, Cham; 2019. pp. 91-105. Available from: doi: 10.1007/978-3-030-18744-6_6 [11] Jiaxing Y, Shunya I, Nobuyuki T. Computerized Ultrasonic Imaging Inspection: From Shallow to Deep Learning. Sensors. 2018; 18(11): 3820. Available from: doi:10.3390/s18113820 [12] Jiaxing Y, Nobuyuki T. Benchmarking Deep Learning Models for Automatic Ultrasonic Imaging Inspection. IEEE Access. 2021; 9: 36986-36994. Available from: doi:10.1109/ACCESS.2021.3062860 [13] Cantero-Chinchilla S, Wilcox PD, Croxford AJ. Deep learning in automated ultrasonic NDE - developments, axioms, and opportunities. Eprint arXiv:2112.06650. 2021. Available from: doi: 10.48550/arXiv.2112.06650 [14] Cantero-Chinchilla S, Wilcox PD, Croxford AJ. A deep learning-based methodology for artifact identification and suppression with application to ultrasonic images. NDT & E International. 202; 126, 102575. Available from: doi: 10.1016/j.ndteint.2021.102575 [15] Chapon A, Pereira D, Toews M, Belanger P. Deconvolution of ultrasonic signals using a convolutional neural network. Ultrasonics. 2021; 111: 106312. Available from: doi: 10.1016/j.ultras.2020.106312 [16] Medak D, Posilović L, Subasic M, Budimir M. Automated Defect Detection From Ultrasonic Images Using Deep Learning. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control. 2021; 68(10): 3126-3134. Available from: doi: 10.1109/TUFFC.2021.3081750 [17] Virkkunen I, Koskinen T. Augmented Ultrasonic Data for Machine Learning. Journal of Nondestructive Evaluation. 2021; 40: 4. Available from: doi:10.1007/s10921-020-00739-5 [18] Veiga JLBC, Carvalho AA, Silva IC. The use of artificial neural network in the classification of pulse-echo and TOFD ultra-sonic signals. Journal of the Brazilian Society of Mechanical Sciences and Engineering. 2005; 27(4): 394-398. Available from: doi:10.1590/S1678-58782005000400007 [19] Posilović L, Medak D, Subašić M, Budimir M, Lončarić S. Generative adversarial network with object detector discriminator for enhanced defect detection on ultrasonic B-scans. Eprint arXiv:2106.04281v1 [eess.IV]. 2021. Available from: https://arxiv.org/pdf/2106.04281.pdf [20] Markov AA, Mosyagin VV, Keskinov MV. A program for 3D simulation of signals for ultrasonic testing of specimens. Russian Journal of Nondestructive Testing. 2005; 41: 778-789. Available from: doi: 10.1007/s11181-006-0034-3 [21] Shilov MN. Methodological, algorithmic, and software support for registration and analysis of defectograms during ultrasonic testing of rails [dissertation]. [Saint-Petersburg]: Saint-Petersburg State University of Aerospace Instrumentation; 2007. p. 153. [22] Kaliuzhnyi A. Using Machine Learning To Detect Railway Defects. [23] NVIDIA Blog
Microservices architecture has become extremely popular in recent years because it allows complex applications to be created as a collection of discrete, independent services, improving scalability, flexibility, and resilience in complex software systems. The distributed nature of microservices, however, presents special difficulties for testing and quality control, and comprehensive testing is essential to guarantee the reliability and scalability of the software. In this guide, we'll delve into the world of microservices testing and examine its significance, methodologies, and best practices to guarantee the smooth operation of these interconnected parts. Understanding Microservices The functionality of an application is provided by a collection of independent, loosely coupled microservices. Each microservice runs independently, has its own database, and implements its own business logic. This architecture supports continuous delivery, scalability, and flexibility. In order to build a strong foundation, we must first understand the fundamentals of microservices architecture. Microservices are small, independent services that work together to form a complete software application. Each service carries out a particular business function and communicates with other services using clear APIs. Organizations can more effectively develop, deploy, and scale applications using this modular approach. However, as the number of services grows, thorough testing is essential to find and fix any potential problems. Challenges in Microservices Testing Testing microservices introduces several unique challenges, including: Distributed nature: Microservices are distributed across different servers, networks, and even geographical locations. This requires testing to account for network latency, service discovery, and inter-service communication. Dependency management: Microservices often rely on external dependencies such as databases, third-party APIs, and message queues. Testing must consider these dependencies and ensure their availability during testing. Data consistency: Maintaining data consistency across multiple microservices is a critical challenge. Changes made in one service should not negatively impact the functionality of other services. Deployment complexity: Microservices are typically deployed independently, and coordinating testing across multiple services can be challenging. Versioning, rollbacks, and compatibility testing become vital considerations. Integration testing: Microservices architecture demands extensive integration testing to ensure seamless communication and proper behavior among services. Importance of Microservices Testing Microservices testing plays a vital role in guaranteeing the overall quality, reliability, and performance of the system. The following points highlight its significance: Isolation and Independence: Testing each microservice individually ensures that any issues or bugs within a specific service can be isolated, minimizing the impact on other services. Continuous Integration and Delivery (CI/CD): Microservices heavily rely on CI/CD pipelines to enable frequent deployments. Effective testing enables faster feedback loops, ensuring that changes and updates can be delivered reliably without causing disruptions.
Fault Isolation and Resilience: By testing the interactions between microservices, organizations can identify potential points of failure and design resilient strategies to handle failures gracefully. Scalability and Performance: Testing enables organizations to simulate high loads and stress scenarios to identify bottlenecks, optimize performance, and ensure that microservices can scale seamlessly. Types of Microservices Testing Microservices testing involves various types of testing to ensure the quality, functionality, and performance of individual microservices and the system as a whole. Here are some important types of testing commonly performed in microservices architecture: Unit Testing Unit testing focuses on testing individual microservices in isolation. It verifies the functionality of each microservice at a granular level, typically at the code level. Unit tests ensure that individual components or modules of microservices behave as expected and meet the defined requirements. Mocking frameworks are often used to isolate dependencies and simulate interactions for effective unit testing. Integration Testing Integration testing verifies the interaction and integration between multiple microservices. It ensures that microservices can communicate correctly and exchange data according to the defined contracts or APIs. Integration tests validate the interoperability and compatibility of microservices, identifying any issues related to data consistency, message passing, or service coordination. Contract Testing Contract testing validates the contracts or APIs exposed by microservices. It focuses on ensuring that the contracts between services are compatible and adhere to the agreed-upon specifications. Contract testing verifies the request and response formats, data structures, and behavior of the services involved. This type of testing is essential for maintaining the integrity and compatibility of microservices during development and evolution. End-to-End Testing End-to-end (E2E) testing evaluates the functionality and behavior of the entire system, including multiple interconnected microservices, databases, and external dependencies. It tests the complete flow of a user request through various microservices and validates the expected outcomes. E2E tests help identify issues related to data consistency, communication, error handling, and overall system behavior. Performance Testing Performance testing assesses the performance and scalability of microservices. It involves testing the system under different loads, stress conditions, or peak usage scenarios. Performance tests measure response times, throughput, resource utilization, and other performance metrics to identify bottlenecks, optimize performance, and ensure that the microservices can handle expected loads without degradation. Security Testing Security testing is crucial in microservices architecture due to the distributed nature and potential exposure of sensitive data. It involves assessing the security of microservices against various vulnerabilities, attacks, and unauthorized access. Security testing encompasses techniques such as penetration testing, vulnerability scanning, authentication, authorization, and data protection measures. Chaos Engineering Chaos engineering is a proactive testing approach where deliberate failures or disturbances are injected into the system to evaluate its resilience and fault tolerance. 
By simulating failures or stress scenarios, chaos engineering validates the system’s ability to handle failures, recover gracefully, and maintain overall stability. It helps identify weaknesses and ensures that microservices can handle unexpected conditions without causing a system-wide outage. Data Testing Data testing focuses on validating the accuracy, integrity, and consistency of data stored and processed by microservices. It involves verifying data transformations, data flows, data quality, and data integration between microservices and external systems. Data testing ensures that data is correctly processed, stored, and retrieved, minimizing the risk of data corruption or inconsistency. These are some of the key types of testing performed in microservices architecture. The selection and combination of testing types depend on the specific requirements, complexity, and characteristics of the microservices system being tested. A comprehensive testing strategy covering these types of testing helps ensure the reliability, functionality, and performance of microservices-based applications. Best Practices for Microservices Testing Microservices testing presents unique challenges due to the distributed nature of the architecture. To ensure comprehensive testing and maintain the quality and reliability of microservices, it’s essential to follow best practices. Here are some key best practices for microservices testing: Test at Different Levels Microservices testing should be performed at multiple levels, including unit testing, integration testing, contract testing, end-to-end testing, performance testing, and security testing. Each level of testing verifies specific aspects of the microservices and their interactions. Comprehensive testing at various levels helps uncover issues early and ensures the overall functionality and integrity of the system. Prioritize Test Isolation Microservices are designed to be independent and loosely coupled. It’s crucial to test each microservice in isolation to identify and resolve issues specific to that service without impacting other services. Isolating tests ensures that failures or changes in one microservice do not cascade to other parts of the system, enhancing fault tolerance and maintainability. Use Mocking and Service Virtualization Microservices often depend on external services or APIs. Mocking and service virtualization techniques allow for testing microservices independently of their dependencies. By replacing dependencies with mocks or virtualized versions of the services, you can control the behavior and responses during testing, making it easier to simulate different scenarios, ensure test repeatability, and avoid testing delays caused by external service availability. Implement Contract Testing Microservices rely on well-defined contracts or APIs for communication. Contract testing verifies the compatibility and compliance of these contracts between services. By testing contracts, you ensure that services can communicate effectively, preventing integration issues and reducing the risk of breaking changes. Contract testing tools like Pact or Spring Cloud Contract can assist in defining and validating contracts. Automate Testing Automation is crucial for effective microservices testing. Implementing a robust test automation framework and CI/CD pipeline allows for frequent and efficient testing throughout the development lifecycle. 
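Automated checks of this kind are what such pipelines execute on every change. As one concrete, hypothetical illustration of the test isolation and mocking practices described above, the sketch below uses Jest (an assumption; the article itself mentions JUnit and similar tools) to unit test a single service function with its repository dependency mocked. The module names and data are invented for illustration.

```javascript
// Hypothetical Jest unit test for an order service, with its repository
// dependency mocked so the service's logic is verified in isolation.
// './orderService' and './orderRepository' are assumed, illustrative modules.
const { getOrderTotal } = require('./orderService');
const orderRepository = require('./orderRepository');

// Auto-mock the repository so no real database is touched.
jest.mock('./orderRepository');

test('getOrderTotal sums the line items returned by the repository', async () => {
  orderRepository.findItems.mockResolvedValue([
    { sku: 'A', price: 10, quantity: 2 },
    { sku: 'B', price: 5, quantity: 1 },
  ]);

  await expect(getOrderTotal('order-123')).resolves.toBe(25);
  expect(orderRepository.findItems).toHaveBeenCalledWith('order-123');
});
```

Because the repository is mocked, the test runs in milliseconds and fails only when the service's own logic regresses, which is exactly the fast feedback a CI/CD pipeline needs.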
Automated testing enables faster feedback, reduces human error, and facilitates the continuous delivery of microservices. Tools like Cucumber, Postman, or JUnit can be leveraged for automated testing at different levels. Emphasize Performance Testing Scalability and performance are vital aspects of microservices architecture. Conduct performance testing to ensure that microservices can handle expected loads and perform optimally under various conditions. Load testing, stress testing, and performance profiling tools like Gatling, Apache JMeter, or Locust can help assess the system’s behavior, identify bottlenecks, and optimize performance. Implement Chaos Engineering Chaos engineering is a proactive testing methodology that involves intentionally injecting failures or disturbances into a microservices environment to evaluate its resilience. By simulating failures and stress scenarios, you can identify weaknesses, validate fault tolerance mechanisms, and improve the overall robustness and reliability of the system. Tools like Chaos Monkey, Gremlin, or Pumba can be employed for chaos engineering experiments. Include Security Testing Microservices often interact with sensitive data and external systems, making security testing crucial. Perform security testing to identify vulnerabilities, ensure data protection, and prevent unauthorized access. Techniques such as penetration testing, vulnerability scanning, and adherence to security best practices should be incorporated into the testing process to mitigate security risks effectively. Monitor and Analyze System Behavior Monitoring and observability are essential during microservices testing. Implement monitoring tools and techniques to gain insights into the behavior, performance, and health of microservices. Collect and analyze metrics, logs, and distributed traces to identify issues, debug problems, and optimize the system’s performance. Tools like Prometheus, Grafana, ELK stack, or distributed tracing systems aid in monitoring and analyzing microservices. Test Data Management Managing test data in microservices testing can be complex. Ensure proper test data management by using techniques like data virtualization or synthetic data generation. These approaches allow for realistic and consistent test scenarios, minimizing dependencies on production data and external systems. By following these best practices, organizations can establish a robust testing process for microservices, ensuring quality, reliability, and performance in distributed systems. Adapting these practices to specific project requirements, technologies, and organizational needs is important to achieve optimal results. Test Environment and Infrastructure Creating an effective test environment and infrastructure is crucial for successful microservices testing. A well-designed test environment ensures that the testing process is reliable and efficient and replicates the production environment as closely as possible. Here are some key considerations for setting up a robust microservices test environment and infrastructure: Containerization and Orchestration Containerization platforms like Docker and orchestration tools such as Kubernetes provide a flexible and scalable infrastructure for deploying and managing microservices. By containerizing microservices, you can encapsulate each service and its dependencies, ensuring consistent environments across testing and production. 
Container orchestration tools enable efficient deployment, scaling, and management of microservices, making it easier to replicate the production environment for testing purposes. Environment Configuration Management Maintaining consistent configurations across different testing environments is crucial. Configuration management tools like Ansible, Chef, or Puppet help automate the setup and configuration of test environments. They allow you to define and manage environment-specific configurations, such as database connections, service endpoints, and third-party integrations, ensuring consistency and reproducibility in testing. Test Data Management Microservices often interact with databases and external systems, making test data management complex. Proper test data management ensures that test scenarios are realistic and cover different data scenarios. Techniques such as data virtualization, where virtual test data is generated on the fly, or synthetic data generation, where realistic but non-sensitive data is created, can be employed. Additionally, tools like Flyway or Liquibase help manage database schema migrations during testing. Service Virtualization Service virtualization allows you to simulate or virtualize the behavior of dependent microservices that are not fully developed or available during testing. It helps decouple testing from external service dependencies, enabling continuous testing even when certain services are unavailable or undergoing changes. Tools like WireMock, Mountebank, or Hoverfly provide capabilities for creating virtualized versions of dependent services, allowing you to define custom responses and simulate various scenarios. Continuous Integration and Delivery (CI/CD) Pipeline A robust CI/CD pipeline is essential for continuous testing and seamless delivery of microservices. The CI/CD pipeline automates the build, testing, and deployment processes, ensuring that changes to microservices are thoroughly tested before being promoted to higher environments. Tools like Jenkins, GitLab CI/CD, or CircleCI enable the automation of test execution, test result reporting, and integration with version control systems and artifact repositories. Test Environment Provisioning Automated provisioning of test environments helps in reducing manual effort and ensures consistency across environments. Infrastructure-as-Code (IaC) tools like Terraform or AWS CloudFormation enable the provisioning and management of infrastructure resources, including virtual machines, containers, networking, and storage, in a programmatic and reproducible manner. This allows for quick and reliable setup of test environments with the desired configurations. Monitoring and Log Aggregation Monitoring and log aggregation are essential for gaining insights into the behavior and health of microservices during testing. Tools like Prometheus, Grafana, or ELK (Elasticsearch, Logstash, Kibana) stack can be used for collecting and analyzing metrics, logs, and traces. Monitoring helps identify performance bottlenecks, errors, and abnormal behavior, allowing you to optimize and debug microservices effectively. Test Environment Isolation Isolating test environments from production environments is crucial to prevent any unintended impact on the live system. Test environments should have separate infrastructure, networking, and data resources to ensure the integrity of production data. 
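Isolation also applies to individual dependencies. As a minimal, hypothetical JavaScript counterpart to the WireMock-style virtualization described above, the sketch below uses the nock library to stub an external HTTP service so a test can run even when that service is unavailable; the host, route, and payload are invented for illustration.

```javascript
// Hypothetical example: virtualizing an external user-profile API with nock
// so a service can be tested without the real dependency being reachable.
const nock = require('nock');
const axios = require('axios');

// Stand-in for the service's real HTTP client call (an assumed helper).
async function fetchUserName(userId) {
  const response = await axios.get(`https://users.internal.example/users/${userId}`);
  return response.data.name;
}

test('returns the user name from the virtualized user service', async () => {
  // Intercept the outbound call and return a canned response.
  nock('https://users.internal.example')
    .get('/users/42')
    .reply(200, { id: 42, name: 'Ada Lovelace' });

  await expect(fetchUserName(42)).resolves.toBe('Ada Lovelace');

  nock.cleanAll();
});
```

The stubbed response never changes, so CI runs stay deterministic and a failing test points at the service under test rather than at a flaky dependency.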
Techniques like containerization, virtualization, or cloud-based environments provide effective isolation and sandboxing of test environments. Scalability and Performance Testing Infrastructure Microservices architecture emphasizes scalability and performance. To validate these aspects, it is essential to have a dedicated infrastructure for load testing and performance testing. This infrastructure should include tools like Gatling, Apache JMeter, or Locust, which allow simulating high loads, measuring response times, and analyzing system behavior under stress conditions. By focusing on these considerations, organizations can establish a robust microservices test environment and infrastructure that closely mirrors the production environment. This ensures accurate testing, faster feedback cycles, and reliable software delivery while minimizing risks and ensuring the overall quality and reliability of microservices-based applications. Test Automation Tools and Frameworks Microservices testing can be significantly enhanced by utilizing various test automation tools and frameworks. These tools help streamline the testing process, improve efficiency, and ensure comprehensive test coverage. In this section, we will explore some popular microservices test automation tools and frameworks: Cucumber Cucumber is a widely used tool for behavior-driven development (BDD) testing. It enables collaboration between stakeholders, developers, and testers by using a plain-text format for test scenarios. With Cucumber, test scenarios are written in a Given-When-Then format, making it easier to understand and maintain test cases. It supports multiple programming languages and integrates well with other testing frameworks and tools. Postman Postman is a powerful API testing tool that allows developers and testers to create and automate tests for microservices APIs. It provides a user-friendly interface for sending HTTP requests, validating responses, and performing functional testing. Postman supports scripting and offers features like test assertions, test data management, and integration with CI/CD pipelines. Rest-Assured Rest-Assured is a Java-based testing framework specifically designed for testing RESTful APIs. It provides a rich set of methods and utilities to simplify API testing, including support for request and response specification, authentication, data validation, and response parsing. Rest-Assured integrates well with popular Java testing frameworks like JUnit and TestNG. WireMock WireMock is a flexible and easy-to-use tool for creating HTTP-based mock services. It allows you to simulate the behavior of external dependencies or unavailable services during testing. WireMock enables developers and testers to stub out dependencies, define custom responses, and verify requests made to the mock server. It supports features like request matching, response templating, and record/playback of requests. Pact Pact is a contract testing framework that focuses on ensuring compatibility and contract compliance between microservices. It enables teams to define and verify contracts, which are a set of expectations for the interactions between services. Pact supports various programming languages and allows for generating consumer-driven contracts that can be used for testing both the provider and consumer sides of microservices. Karate Karate is an open-source API testing framework that combines API testing, test data preparation, and assertions in a single tool. 
It uses a simple and expressive syntax for writing tests and supports features like request chaining, dynamic payloads, and parallel test execution. Karate also provides capabilities for testing microservices built on other protocols, such as SOAP and GraphQL. Gatling Gatling is a popular open-source tool for load and performance testing. It allows you to simulate high user loads, measure response times, and analyze system behavior under stress conditions. Gatling provides a domain-specific language (DSL) for creating test scenarios and supports distributed load generation for scalability. It integrates well with CI/CD pipelines and offers detailed performance reports. Selenium Selenium is a widely used web application testing framework that can also be leveraged for testing microservices with web interfaces. It provides a range of tools and APIs for automating browser interactions and performing UI-based tests. Selenium supports various programming languages and offers capabilities for cross-browser testing, test parallelization, and integration with test frameworks like TestNG and JUnit. These are just a few examples of the many tools and frameworks available for microservices test automation. The choice of tool depends on factors such as project requirements, programming languages, team expertise, and integration capabilities with the existing toolchain. It's essential to evaluate the features, community support, and documentation of each tool to select the most suitable one for your specific testing needs. Monitoring and Observability Monitoring and observability are essential for gaining insights into the health, performance, and behavior of microservices. Key monitoring aspects include: Log Aggregation and Analysis: Collecting and analyzing log data from microservices helps in identifying errors, diagnosing issues, and understanding the system's behavior. Metrics and Tracing: Collecting and analyzing performance metrics and distributed traces provides visibility into the end-to-end flow of requests and highlights bottlenecks or performance degradation (a small sketch of exposing such metrics follows this list). Alerting and Incident Management: Establishing effective alerting mechanisms enables organizations to proactively respond to issues and incidents. Integrated incident management workflows ensure timely resolution and minimize disruptions. Distributed Tracing: Distributed tracing techniques allow for tracking and visualizing requests as they traverse multiple microservices, providing insights into latency, dependencies, and potential bottlenecks.
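To ground the metrics side of this, here is a small, hypothetical sketch of how a Node.js microservice might expose Prometheus-style metrics using the prom-client library with Express; the metric names, labels, and port are illustrative assumptions rather than part of any tool discussed above.

```javascript
// Hypothetical sketch: exposing Prometheus-style metrics from an Express microservice.
// Metric names, labels, and the port are illustrative assumptions.
const express = require('express');
const client = require('prom-client');

const app = express();
const register = new client.Registry();

// Default Node.js process metrics (CPU, memory, event loop lag, ...).
client.collectDefaultMetrics({ register });

// A custom histogram for request durations, labeled by route and status code.
const httpRequestDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  registers: [register],
});

// Record a duration sample for every request.
app.use((req, res, next) => {
  const stopTimer = httpRequestDuration.startTimer();
  res.on('finish', () =>
    stopTimer({ method: req.method, route: req.path, status_code: res.statusCode })
  );
  next();
});

app.get('/orders', (req, res) => res.json([]));

// Prometheus scrapes this endpoint; Grafana can then visualize the series.
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});

app.listen(3000);
```

A Prometheus server could scrape /metrics during performance or chaos experiments, and alerts can then be defined on the resulting time series.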
Conclusion The performance, scalability, and reliability of complex distributed systems depend on how thoroughly their microservices are tested. Organizations can manage the difficulties introduced by microservices architecture by adopting a comprehensive testing strategy that spans unit testing, integration testing, contract testing, performance testing, security testing, chaos engineering, and end-to-end testing, finding and fixing problems at every level, from individual microservices to end-to-end scenarios. Incorporating best practices such as test automation, containerization, CI/CD, service virtualization, scalability testing, and effective monitoring further improves the overall quality and resilience of microservices-based applications, resulting in better user experiences and successful deployments. Teams can successfully validate microservices throughout their lifecycle with the right test environment, infrastructure, and monitoring tools, facilitating quicker and more dependable software delivery. In today's fast-paced technological environment, adopting best practices and using the appropriate testing tools and frameworks will enable organizations to create robust, scalable, and resilient microservices architectures, ultimately improving customer satisfaction and business success.
Automation testing is one of those technical fields where you never stop learning. Whether you've been doing it for a few months, a couple of years, or over a decade, there is always something new and exciting to find out. This is why this roundtable is a must-read for anyone working with automation testing or even considering it. To get a comprehensive idea about the state of the automation industry and how to be good at automation testing, we've talked to some of the finest QA minds. Here are the industry leaders we interviewed for this article: Jesse Friess, Director of Software Engineering & Administration, Triad Financial Services; Gokul Sridharan, Director of Solution Engineering, Katalon; Oksana Wojtkiewicz, Head of Sales and Marketing, SOLWIT SA; Nitish Bhardwaj, Senior Product Manager, SmartBear; Marcus Merrell, Vice President of Technology Strategy, Sauce Labs; Yarden Ingber, Head of Quality, Applitools; and Andrew Knight, Software Quality Advocate, AutomationPanda.com. And this is what we learned from them. Automation Testing: Why Is It Important and What's a Good Strategy for Automating Testing? How do you know when you need to automate a testing project and how do you approach it in the most sensible way? Our experts know exactly how! 1. What Are the Short-Term and Long-Term Business Benefits of Automating Testing? Jesse Friess: Automated testing is crucial because it allows you to cover a large number of scenarios quickly. However, it is not the be-all and end-all of QA. There still needs to be unit testing done by the developer. This ensures the quality of the code at a micro level, while automation covers the macro. Gokul Sridharan: Some of the short-term benefits of automation testing I've noted include upgrading skills within the team, making regression testing more repeatable, getting the ability to do different types of testing, and better use of the application inventory. Long-term benefits of automation include a broader strategy with a broader goal, a mission and vision statement on what to accomplish using automation, better planning for your finances as an organization, more variety in choosing the technology or tool to complete your goal, and better accountability across the organization. Oksana Wojtkiewicz: Business benefits can only be weighed by considering automation, technology, and tools used in creating the product. In some cases, we can find out the results of tests even within a few hours using commercial automation tools. In the case of more complex test items and the multitude of automation tools or customized frameworks used, this time will significantly expand, even up to 2-3 months. Despite this, it is worth noting that this is not time wasted for business. The time saved by automated testing can be used to perform other, more complex tasks. The most substantial long-term benefits include quick feedback on the quality of an application, the ability to test more often, and freeing up manual testers' time and resources. 2. Is Testing Automation Required for Every Software Project, or Are There Situations Where It's Not Needed? Jesse Friess: Automation is not always needed. For instance, if you purchase a COTS (commercial off-the-shelf) solution, then there is no need to spend on testing. If you're developing your own solutions in-house, then there obviously needs to be testing before you roll it out. Gokul Sridharan: In my experience, there are specific application candidates that you can choose for automation.
There are some really complex applications with multiple layers of elements and objects where you will spend a ton of time creating the script in the first place. Script creation time and time to value are two important things to think about when considering automation. Other candidates include legacy applications written in legacy programming languages. The benefits of automating these applications outweigh the cost. Oksana Wojtkiewicz: Automation is not always required, and there is no business case for doing so in many cases. This is particularly true for short-term projects, where automation can take longer than product development. Another example would include strictly hardware-related projects requiring manual actions, e.g., replacing a chip, rewiring expansion cards, etc. In this situation, the cost of setting up automated testing equipment needs to be weighed against the benefits. A project focused only on the graphical part of the application (UI), which changes frequently, would also be an example. 3. Are There Some Criteria, Maturity or Otherwise, a Company Should Meet To Be Able To Take Advantage of Automation? Gokul Sridharan: In my experience, maturity occurs when scale occurs. In general, you want to start automating when a few things happen: your application under test is extensive and it takes a lot of time to get test coverage; you have a large team, but ineffective usage of your resources has led you to believe that automation is a necessity; you have multiple applications and only one team to manage it all; you have multiple teams using multiple technologies. 4. Do We Always Need an Automation Testing Strategy, or Can We Do Without It? Gokul Sridharan: Smaller teams may not need the formality of building a complete end-to-end automation strategy and probably don't require a strategy with an end game in mind. They have the luxury of experimenting. Any company that is serious about user experience, however, should pay the same attention to its automation strategy and have an end goal: a clear definition of what success means to them. Oksana Wojtkiewicz: The automation strategy should include, among other things: the definition of the scope of automation and the level of testing, the definition of the framework and tools for automation, the identification of test environments, and the creation and execution of the tests themselves. It seems impossible to complete automation by skipping any of the above steps. It is worthwhile to keep in mind that not having a strategy can also be a strategy. 5. What Features Should Companies Consider When Designing an Automation Testing Strategy? Gokul Sridharan: These are the questions I suggest answering when creating a strategy: What's the test coverage? How do you ensure edge cases are covered? What types of applications can be tested? How frequently do you test and how long does it take? How complex is it to build? How can it be maintained and built upon? How can it be effective long-term? How can you drive an effective process? How do you increase accountability and visibility across the organization and show why automation is so important? If one approach doesn't work, what's the alternative, and how do you evolve as tech evolves? Oksana Wojtkiewicz: These features follow from the steps included in the strategy: the test type and level, the test team's resources and skills, the desired features of the framework and automation tools, and the purpose of test automation. 6. How Do You Know Your Automation Testing Strategy Is Actually Good?
Gokul Sridharan: I would suggest the team ask itself the following questions about an automation strategy: Is it scalable? Does it consider all the difficult parts of your organization and growth potential? Will it work? As your organization grows, will the test framework also grow along with your maturity? Did we consider the time to learn and time to evolve when considering an automation strategy? Oksana Wojtkiewicz: A strategy is appropriate when it achieves the defined objectives with an acceptable ROI. In light of these two criteria, we can say without a doubt that the automation path chosen is the right one. 7. What Are Some Best Practices or Trends You Are Witnessing in the Software Testing Automation Field? Gokul Sridharan: Some of the trends I've noticed lately in testing automation include: incorporating AI into automation; analytics and insights are everything; applications are getting complex, even more so than 10 years ago; visual testing is much more than image comparison; running your tests in the cloud and scaling to multiple parallel runs; and DDT and BDD are still popular. Oksana Wojtkiewicz: The context of recent trends suggests that many tools are adopting AI and taking a codeless approach to capture the market. It would also be worth mentioning the Playwright framework (open source), something the testing community has been hearing more and more about. Automation Testing Team: Who's in Charge of the Operation? Next, here's a quick guide to building an automated testing team and helping it achieve maximum efficiency through management and communication. 1. Should There Be a Separate Automation Testing Team, or Does It Need To Operate as Part of the General QA Department? Andrew Knight: It really depends on the organization's needs and who they expect to develop the test automation. For example, if they are developing a web application and it's a small startup with only developers, it probably doesn't make sense for them to bring four or five QA engineers into the company rather than really bake testing into their development process. On the other hand, if you've got a larger organization where you already have teams of developers and a whole QA division, it can be very hard to shift left because organizational structures are already in place. This may be the case where you need to work with a QA organization and shift left from there. Yarden Ingber: Depends on the product and the organization. What I want to see in an automation team is a team of developers. I think the most important task of automation is writing code. And in order to write code, you need to be a developer. So, I think a team of automation developers should be software developers. Nitish Bhardwaj: In my opinion, there should be a dedicated automation testing team (or a smaller squad if working on a tight budget) that operates as part of the general QA department. While the QA team can also have some level of automation testing expertise, having a dedicated automation testing team ensures that automation testing is given the attention and resources it needs to be successful. 2. What Are the Typical Roles in an Ideal Automation QA Team? What Are Their Responsibilities? Nitish Bhardwaj: This depends on the size of the team, the budget for automation, and the maturity of the existing test automation system. If an organization is just starting automation, it should have at least one automation lead and a couple of automation engineers to start with.
With bigger organizations or more mature setups, it will be a bigger team that has similar roles like a development team, an engineering manager, an architect, senior automation engineers, automation engineers, and operations engineers. 3. Which Qualities Are You Looking For in Candidates To Fill Each Typical Automation QA Team Role? Yarden Ingber: If we are looking at automation developers as software developers, I would like to see software programming capabilities. I would test them as you would test a software developer. Another side I think is very important is the DevOps side. Also, I want to know if they have experience specifically with the product that I’m using. For example, testing mobile native apps is very different from testing web apps. So if I have a mobile native application, I could hire someone that worked in testing web applications, but it will take them much more time to get to the point. Andrew Knight: If I’m specifically looking to hire an automation engineer or SDET, I want to make sure they can actually do the job of test automation because many people inflate their skills in this area. I look for people who actually have decent programming skills — that’s a bare minimum. Then I want to see that they actually worked on test automation projects somewhere. I need to know if they have done some sort of unit testing, API testing, and web UI testing. With entry-level specialists who don’t have a large portfolio in testing, I would gauge their interest and their aptitude for testing automation. Because some people want to get their foot in the door, I don’t want to hire someone who thinks of testing automation as a stepping stone. Nitish Bhardwaj: For all the roles, it’s very important that they understand the role of a QA and test automation and they are passionate about test automation. One mistake many organizations make is hiring developers to write test automation frameworks and systems. These developers might have better technical knowledge and more coding skills, but their lack of empathy for QA and its importance doesn’t work for QA teams in the long term. For an Automation QA Lead, I would look for in-depth knowledge of automation testing tools and frameworks, experience developing and implementing automation testing strategies, and strong leadership and project management skills. For an Automation QA Engineer, I want to see good programming skills, experience developing automation testing scripts and frameworks, and an ability to troubleshoot issues and bugs with automation testing tools and scripts. 4. What Is the Smallest Possible Team Size for an Automation Testing Project? Yarden Ingber: It really depends on the size of the company and the scale of the project. It also depends on the number of developers. So if you have fifteen software developers and one or two automation developers, the automation developers can become a bottleneck in the project pipeline. It’s very hard to decide on a perfect number, but there should be some balance between the number of developers and the number of automation developers. I would say it’s one QA for every five developers, but it can be more or less than that. Andrew Knight: One automation engineer. It’s not ideal, but having at least one SDET or test automation engineer can add significant value. I know because that’s exactly what I did at my previous job. I was the first SDET they hired. And even though I was working alone for a long time, I was still able to add value. 
It’s not like we were aiming for 100% completion, but the tests I wrote added significant value, even if they only had a fraction of coverage, they were still catching bugs every time. Nitish Bhardwaj: As I mentioned, it depends on the scope, complexity, and budget. However, if you have to start small, start with an automation QA lead. Do not make the mistake of hiring one or two QA engineers first and a lead later on. One QA lead with one or two QA automation engineers would be a great start. 5. Who Do You Think Makes Better Automation QAs: People Who Switch From Software Development or From Manual QA? Yarden Ingber: I think a good strategy is “shift left”. The manual QAs can write the test cases manually. The developers will then write the code based on the cases that the manual QAs created. I think that’s a good starting point for small companies. And when you grow into bigger companies with more products and more teams that develop multiple projects at once, I think having a strong automation team that mostly consists of software developers can be a good approach for bigger companies that have multiple teams and would like to have a shared CI system for multiple development teams. Andrew Knight: I would say that engineers who have a developer mindset are better at automation initially. As an automation QA, you create software that is meant to test other software. You need to have the same development skills like code reviews, reading other people’s code, and testing your own code. People with a development background already have that skill set, they are just applying it to a different domain. Whereas when you have engineers who are traditionally manual testers, they know how to write and execute good tests, but they don’t necessarily have those development-oriented skills. 6. Let’s Say You Are Hired To Build an Automation Process From Scratch at an Organization, and There Is No Concern About the Budget. What Is Your Ideal Team Setup? Andrew Knight: My ideal setup is a small core team of SDETs who really have that developer mindset and background to approach a problem in testing automation and exist within a greater organization or company to empower other teams to have success with testing automation and quality. They would almost be like an internal consultancy team, the ones who are providing tools and recommendations, and coming alongside teams to help them get started with it. But I also believe that it’s the individual product teams that need to be responsible for their own quality. They should not be writing code, throwing it over the fence, and having somebody else take care of the software quality. 7. How to Effectively Manage an Automation QA Team and Who Should Do It? Nitish Bhardwaj: The automation QA team should be managed by the automation QA Lead, who should provide guidance, support, and feedback to team members. It is essential to ensure that team members have the necessary resources and tools to perform their jobs effectively. Regular communication, team meetings, and performance reviews can help to ensure that the team is meeting its goals and objectives. The QA automation lead can report to the QA manager/director, an engineering manager, or a technology leader. 8. When You Already Have a Team, How Do You Evaluate It? How Do You Know It’s Efficient? Are There Some Metrics You Can Use? 
Yarden Ingber: It can be challenging to track the success of an automation developer because they don’t develop the product itself, meaning you don’t have bug reports from the customers. Still, there are definitely some metrics you can use. For example, positive feedback from product developers — If they are happy with the test suite that the automation developers created — that’s a sign of a good automation developing team. I think the stability of the test suite is another very important metric because if the test suite is unstable, then the trust of the entire company in the test suite fails. At the same time, I don’t like the idea of testing or measuring the team by the number of tests created per week or the number of tests fixed, or similarly hard metrics. I would rather have softer metrics. Andrew Knight: You essentially want to build a scorecard that indicates the health and maturity of the team in their testing journey. These metrics shouldn’t be punitive, but rather a way to reveal opportunities for areas of improvement. If you make the metrics punitive, then the whole thing is off. You’ve got to make sure that you message that appropriately. I believe metrics should reflect part of the story, but not the whole story. Some of the things I would look for when scoring a team would be things like: How long does it take you to develop a new test? Does the framework help you or hinder you from making progress? How long does it take to run a test suite —15 minutes, an hour, 20 hours? What is the amount of time for the feedback, from the developer pushing the change to getting the result? 9. Is It Possible To Run a Distributed Automation Testing Team? How To Build Efficient Communication With a Distributed/Offshore Team? Yarden Ingber: All of my automation developers are outsourced and do not share an office with me. On the other hand, most of the software developers do share an office with me. What I see is that there is a very big advantage for people that work together. And you can see it in day-to-day actions, when I sit with someone to eat lunch. We talk about work and ideas can pop up. This is something you can hardly have if you just communicate through text or Zoom meetings. But we’re trying! It’s very important for me to open each meeting with at least 15-20 minutes of talking about everything except work. We talk about the weather in everyone’s country, personal matters, family stuff, traveling, and even how their weekend was. It can be complicated to have this personal communication when everyone is in different locations, so these personal discussions absolutely help. Andrew Knight: I definitely enjoy working with an in-house team more, but we also successfully adjusted to the shift to remote work. Plus, working in tech, we were more prepared to go remote than many other industries. But there isn’t much I’m doing differently than with in-house teams. You do have to be more intentional with communication to make sure that you can build both personal and professional connections. I would love to be able to meet up in person regularly, but when that is not available, you can still make it work. Nitish Bhardwaj: I think with remote working and effective collaboration tools, it’s not going to be a challenge. It is essential to establish clear communication channels and set expectations for communication and feedback. 
Additionally, it is important to ensure that the team members have the necessary resources and support to perform their jobs effectively, regardless of their location. For example, it's vital to ensure equal access to the test environment from all locations, and take into account things like network lags. 10. What Is Team Culture to You When We're Talking About Automation Testing? Yarden Ingber: I think that both manual and automation developers are very dependent on the product developers. So both for manual and automation QAs, I value the ability to communicate well. They can explain their ideas and what is blocking the progress. When they have a request from the developers and are able to communicate their needs, the developers can quickly help. As for other aspects of team culture, I don't think a company should be absolutely homogenous. Different people with different views and different personalities can benefit from one another, so differences should not be an issue. Andrew Knight: I think there are certain things that people on a test automation team would understand that people from other teams would not, such as test automation diversity or different testing conferences and events. That's part of the team culture. You also want a healthy team culture. You want people who are respectful of each other, optimistic, energetic, with a go-getter attitude, who love what they're doing and enjoy working with people on the team, who are not afraid to give critical feedback but do so in a healthy manner. Those are the qualities I look for in a team. Automation Testing Tools: Choosing the Right One for the Job Find out how to pick the perfect tool and technology for your testing automation project. 1. What Are Your Favorite Automation Testing Tools and Why? Marcus Merrell: I don't tend to choose tools until I understand what the project is and what they'll be used for. I guess my answer is that my favorite automation testing tool is the one that fits best into the culture of the team and the rest of the stack the developers are using. The tools I've used the most often in my automation career have been some variations of Java/TestNG/Selenium, but I would never say that those are adequate for all testing purposes. It usually takes over a dozen different tools sprinkled throughout the SDLC to make automated testing happen. 2. Do You Pick Tools Based on the Project Specifics, or Do You Have a Universal Go-To Tool Set? Marcus Merrell: I think there's no such thing as a universal go-to tool set. Even if you're going from one eCommerce web app to another, the tool selection should reflect the team, the programming languages they prefer, the CI/CD product they're using, and a bunch of other factors. I tend to gravitate toward projects that use Java, just because I've found that to be the best language for distributed, multinational teams of variable skill levels working on the same stack at the same time. But I understand that's not the right answer for everyone, and that the market is rapidly shifting toward JavaScript, TypeScript, Rust, and other, newer languages and philosophies. 3. Who Is Involved in the Process of Selecting the Tools at Your Organization? Marcus Merrell: It's generally left up to the teams, but in some cases, we don't have many choices. Our teams are oriented towards microservices to provide APIs to internal and external customers. We have some flexibility, but Sauce Labs is a software company that's fundamentally in the business of managing hardware.
When you have several thousand iOS and Android devices in a lab, you’re somewhat limited to the choices offered by Apple and Google, not to mention all the low-level networking and USB cable-level code we have to deal with. 4. What Are the Features You Are Looking For in an Ideal Tool? Marcus Merrell: Personally, I start by looking for open-source tools with a strong community. The community around an open-source project is a lot like the corporation behind a commercial product: a strong community means strong support and ongoing maintenance/feature work. A weak community is similar to what you get when you buy a small product from a large corporation. When choosing commercial tools, I rely on references and my network to help me understand the product’s roadmap, stability, support, etc. I also favor tools that do one thing and do it well. I’ve been involved with projects where people ask us to put some random features in, because “competitive projects have XYZ features, why don’t you?”. My preference is always to explain why they don’t actually want us to implement a poor version of XYZ, but to integrate with an amazing tool that specializes in XYZ. This way, we can continue to be experts in our thing, while they stay experts in theirs. I’m a big fan of specializing. 5. What Other Principles Do You Use in Your Decision-Making Process? Marcus Merrell: I come from the open-source world, so I have a natural inclination toward open-source tooling. I like projects that have a rich ecosystem of commercial support that surrounds an open source product — this means I pay people to help me use a product that’s free, rather than paying some seat or consumption price for the software. There are pros and cons to this approach, and I’ve definitely bought my share of commercial software, but this is my initial approach. 6. Do the Available Automation Tools Fully Meet Your Needs, or Do You Feel Like the Market Is Missing Something? Marcus Merrell: The market is absolutely missing some critical things, and as far as I can tell, nobody is even seriously trying to attain them. While the entire ecosystem of testing products is trying to automate the human tester out of a job, we’re not getting answers to simple questions like “What is the value of my automated tests?”, “How can I make my testing better, more effective, and faster?”, “What kind of testing am I missing?”. The analysis of test signals is only being considered by a couple of products, and it drives me a bit crazy because this is such a solvable problem. We have all this data but no idea how to analyze it. In my opinion, this is more valuable than an effort at low-code test automation.
In today's tech world, where speed is the key to modern software development, we should aim to get quick feedback on the impact of any change, and that is where CI/CD comes into play. Setting up CI/CD is crucial for automated test cases because it gives more consistent results, agility, and efficient assessment of minor changes. With shorter releases and fast development cycles, we need to check that automated tests are passing on every code change, which can be easily done by configuring the CI/CD pipeline. Let's try to understand the impact of CI/CD with an example here: You want to execute the automation suite whenever anyone pushes code to the Git repository. So, how will you do that? With no CI/CD set up for your project: To run the automation suite, you have to pull the latest code and then run it locally on your machine, which is a tedious and lengthy process. With CI/CD implemented: You will set up your project's CI/CD pipeline, so your automation test suite runs automatically whenever anyone pushes code or creates a pull request. It will take the latest code from the Git repository and run the tests. Setting up CI/CD helps to save time. Over the past few years, DevOps has become an important part of the software life cycle, leading to the growth of various DevOps tools and practices. There are many CI/CD tools available in the market used by automation testers, but these days, most people are using GitHub for version control and collaboration. So, GitHub Actions is becoming the favorite choice for automation testers and developers for CI/CD. GitHub Actions is offered by GitHub as a SaaS solution. With GitHub Actions, you can achieve scheduling and parallelization, run your tests in Docker containers, and do much more. In this blog on Cypress with GitHub Actions, we will learn how to set up a CI/CD pipeline for Cypress test cases using GitHub Actions. What Is Cypress? Cypress is an open-source framework that is built on Node.js and comes packaged as an npm module. It uses JavaScript for writing the test cases. As it runs tests directly inside the browser, it is fast compared to other automation testing tools (or frameworks). The Cypress automation tool has received huge acceptance from QA and developers and is constantly gaining popularity due to its features like easy installation, easy debugging, real-time reloads, and ease of use. Below are some of the features that set it apart from other web automation tools in the market: Easy installation. Easy debugging. In-built support for retries, screenshots, and videos (no extra code needs to be written). Automatic reload (Cypress reloads your test case whenever you make any changes to your script). Can easily run your tests on different viewports (iPad, iPhone, Samsung). Flake resistant (Cypress waits for commands and assertions before moving on to the next step, which avoids flakiness in a test). What Is GitHub Actions? GitHub Actions is a CI/CD platform covering both Continuous Integration (build, test, merge) and Continuous Delivery (automatic release to the repository). It automates and executes the software development workflows right from GitHub. With GitHub Actions, you can build, test, and publish across multiple platforms, operating systems, and languages all within the same workflow and then see the status checks displayed within the pull request. All GitHub Actions are handled via workflows, which are .yml files placed under the .github/workflows directory of a repository that define automated processes.
It brings automation directly into the software development lifecycle via event-driven triggers such as creating a pull request or pushing code to a remote branch. With GitHub Actions, the pain of keeping your plugins updated is gone (a common issue with other CI/CD tools like Jenkins). You don't have to go to a separate tab for running CI/CD and pushing your code. It is all done in GitHub. Some of the features of GitHub Actions are: Ease of setup. There is no installation required as it is on the cloud. Support of multiple languages and frameworks. Provides actions for every GitHub event. It can be shared via GitHub Marketplace. No explicit plugin is required for caching. You can write your own caching mechanism if you require caching. Asynchronous CI/CD is achieved using GitHub Actions. At present, GitHub Actions is free to use for public repositories, and for private ones, it has a pay-as-you-go mechanism. How To Set Up GitHub Actions in Your Git Repository GitHub Actions lets you run workflows when a certain triggering event occurs, such as a code push or pull request. A GitHub repository can have multiple workflows. For example, we can create one workflow to run our test case whenever there is a code push and one workflow to test the pull requests. As part of this Cypress with GitHub Actions tutorial, we will learn how to set up the GitHub Actions pipeline to run whenever there is a code push to the remote branch. Below are the steps to set up GitHub Actions in the Git repository: 1. Log in to your GitHub account. 2. Create a workflow: click on Add file > Create new file. 3. Enter .github/workflows/main.yml (the path has to follow this exact sequence); main.yml is the file name, and you can name it as per your choice. 4. Create the workflow file. Follow the sample code below to create a GitHub Actions workflow (.yml file):

```yaml
name: Add Action
on:
  push:
    branches:
      - main
jobs:
  Cypress-Test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout GitCode
        uses: actions/checkout@v2
```

name: The name of the workflow. It appears in the "Actions" tab of the GitHub repository with the same name. on: The trigger condition. It specifies the trigger for the workflow. We have to add the events (push, pull_request, etc.) as subsections. In the above example, the workflow would be triggered whenever anyone pushes a change to the repository. jobs: Groups all the jobs/tasks that are defined in the workflow. The name of each job is defined on the next line (here, Cypress-Test). runs-on: ubuntu-latest: Configures the job to run on the latest version of Ubuntu. The job will execute on a fresh virtual machine hosted by GitHub. You can configure any OS on which you would like the GitHub Actions workflow to execute. steps: Groups together all the steps defined for a job; there can be multiple steps in a .yml file. It is nested under each job, at the same level as runs-on. uses: The syntax for referencing a GitHub Action in the .yml file. You can pass the GitHub Action that checks out the code or runs the test cases. Now, if you add this simple workflow to your repo and push your code on the main branch, you can check that a workflow has been added under the Actions tab. If you click on the workflow run, you'll see some details such as the commit, the status, the time duration, as well as the jobs that have run. In this example, we have only one job, i.e., Cypress-Test. We can also view the logs about setting up the job and completing it just by clicking Cypress-Test.
How To Run Cypress Tests Using GitHub Actions Cypress is a test automation tool for web applications, and GitHub Actions is a service that allows developers to automate various tasks, such as building, testing, and deploying code. Using Cypress with GitHub Actions allows us to automate the testing of a web application as part of the continuous integration and continuous deployment process. This can help ensure that the application is working properly and save time and effort for the development team. To use Cypress with GitHub Actions, you need to create a workflow file that defines the steps for running your Cypress tests. This file is stored in your GitHub repository and is executed by GitHub Actions whenever a specified event, such as pushing code to the repository, occurs. We will see, step by step, how to run the tests using Cypress with GitHub Actions. Prerequisites: a GitHub repository with working Cypress code and knowledge of Cypress. Test scenario: open the URL, log in to the application, search for the product, and verify the searched product. Below is the Cypress end-to-end test code where the user opens the browser, searches for a product, and verifies it in the end.

```javascript
describe("Automation using Cypress", () => {
  it("Open website and enter username, password", () => {
    cy.visit(
      "https://ecommerce-playground.lambdatest.io/index.php?route=account/login"
    );
  });
  it("Login into the application using credentials", () => {
    cy.get('[id="input-email"]').type("lambdatest.Cypress@disposable.com");
    cy.get('[id="input-password"]').type("Cypress123!!");
    cy.get('[type="submit"]').eq(0).click();
  });
  it("Search the Product", () => {
    cy.get('[name="search"]').eq(0).type("Macbook");
    cy.get('[type="submit"]').eq(0).click();
  });
  it("Verify Product after search ", () => {
    cy.contains("Macbook");
  });
});
```

To run the above code locally through the terminal, we would run the command "npx cypress run". It runs the Cypress test automation and shows you the test result in the terminal. Let's learn how to run the above tests using Cypress with GitHub Actions. If you have followed the setup of GitHub Actions in the above section of this blog on Cypress with GitHub Actions, by now, you will know how to create a workflow .yml file. Follow the code below to run the tests using Cypress with GitHub Actions:

```yaml
name: Cypress Tests
on: [push]
jobs:
  Cypress-Test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout GitCode
        uses: actions/checkout@v2
      - name: Run Cypress Test
        uses: cypress-io/github-action@v4
        with:
          command: npx cypress run
          browser: chrome
```

To run the tests using the Cypress with GitHub Actions workflow, we use the official Cypress GitHub Action (cypress-io/github-action@v4), referenced in the last step of the workflow above. 1. We have provided the name of the workflow as "Cypress Tests." 2. We provided the trigger condition. In the above case, it is push, so whenever there is a code push to the remote branch, this workflow is executed. 3. The job's name is "Cypress-Test," under which we provide the OS info to run it on the latest Ubuntu OS. 4. There are two steps added as part of the shared workflow .yml file. The first step checks out the code from the GitHub repository using the GitHub Action "actions/checkout@v2". The second step runs the Cypress tests. Cypress has an official GitHub Action for running Cypress tests, which is "cypress-io/github-action@v4".
This action provides npm installation, custom caching, and additional configuration options, and simplifies the setup of advanced workflows using Cypress with the GitHub Actions platform. The above workflow would be called whenever there is a code push to your remote GitHub repository. Once the workflow is run, you can view the workflow run status from the Actions tab in your GitHub project. You can also check the step-by-step summary of each GitHub Action just by clicking the file name, and it will take you to the summary screen. cypress-io/github-action@v4.x.x is the official Cypress GitHub Action. By default, it: installs the npm dependencies, builds the project (npm run build), starts the project web server (npm start), and runs the Cypress tests within our GitHub repository in Electron. How To Run the Cypress Test Case on a Cloud Platform Using GitHub Actions There are multiple approaches to performing Cypress testing on a cloud testing platform like LambdaTest using GitHub Actions. Continuous quality cloud testing platforms such as LambdaTest let you perform manual and automation testing of your websites and mobile applications on an online device farm of 3000+ real browsers, devices, and platform combinations. It provides developers with the ability to run their automated tests using different automation testing frameworks, including Selenium, Cypress, Playwright, and more. One of the ways is to pass the command as part of the Cypress with GitHub Actions workflow. Below is the Cypress end-to-end testing code, which I would run on the LambdaTest cloud platform.

```javascript
describe("Automation using Cypress", () => {
  it("Open website and enter username, password", () => {
    cy.visit(
      "https://ecommerce-playground.lambdatest.io/index.php?route=account/login"
    );
  });
  it("Login into the application using credentials", () => {
    cy.get('[id="input-email"]').type("lambdatest.Cypress@disposable.com");
    cy.get('[id="input-password"]').type("Cypress123!!");
    cy.get('[type="submit"]').eq(0).click();
  });
  it("Search the Product", () => {
    cy.get('[name="search"]').eq(0).type("Macbook");
    cy.get('[type="submit"]').eq(0).click();
  });
  it("Verify Product after search ", () => {
    cy.contains("Macbook");
  });
});
```

To run the Cypress UI tests on the LambdaTest platform, we need to complete the configuration in three steps. Step 1: Install LambdaTest CLI. Install the LambdaTest CLI using npm with the command below: npm install lambdatest-cypress-cli Step 2: Set up the Config. Once the LambdaTest CLI is installed, we need to set up the configuration using the command below: lambdatest-cypress init After running the command, a file named "lambdatest-config.json" will be created in your project. We need to set up the configuration in order to run our test case on different browsers on LambdaTest. auth: We need to set up the LambdaTest credentials, which will be used in lambdatest-config.json to run the test case on the cloud platform. browsers: We need to specify the browsers and OS versions on which we want our test case to run. run_settings: We need to set up the run configuration in run_settings. Set up the config file name, which differs based on the Cypress version, and set up the spec files. For Cypress version 10 and above, you can follow the below code to set up lambdatest-config.json.
JSON
{
  "lambdatest_auth": {
    "username": "user.name",
    "access_key": "access.key"
  },
  "browsers": [
    {
      "browser": "Chrome",
      "platform": "Windows 10",
      "versions": ["latest-1"]
    },
    {
      "browser": "Firefox",
      "platform": "Windows 10",
      "versions": ["latest-1"]
    }
  ],
  "run_settings": {
    "config_file": "cypress.config.js",
    "reporter_config_file": "base_reporter_config.json",
    "build_name": "Cypress-Test",
    "parallels": 1,
    "specs": "./cypress/e2e/*/*.cy.js",
    "ignore_files": "",
    "network": false,
    "headless": false,
    "npm_dependencies": {
      "cypress": "11.2.0"
    }
  },
  "tunnel_settings": {
    "tunnel": false
  }
}

Step 3: Execute Test Case

Once the config is done, you can execute the Cypress test case on the LambdaTest cloud platform. You just need to run the command below:

lambdatest-cypress run

After the command has run and the test cases have executed successfully, you can view the Cypress test run on the LambdaTest cloud platform (as shown in the screenshot below). You can also view the logs by navigating to the logs section.

So far, we have learned how to run Cypress automation tests on the LambdaTest cloud platform from the IDE terminal. Let's see how to run the same flow using Cypress with GitHub Actions.

To run it in a GitHub Actions workflow, we again use the official Cypress GitHub Action (cypress-io/github-action@v4), which either runs a script defined in package.json or picks up the Cypress command passed directly in the .yml file. So, we will create a script in package.json. To run the Cypress tests on the LambdaTest cloud platform, we need the command "lambdatest-cypress run", and we add that command as a script in package.json, as shown below:

JSON
{
  "name": "cypress_github-actions-example",
  "version": "1.0.0",
  "description": "test automation using cypress",
  "main": "index.js",
  "scripts": {
    "test": "lambdatest-cypress run"
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/Anshita-Bhasin/Cypress_Github-Actions.git"
  },
  "author": "Anshita Bhasin",
  "license": "ISC",
  "devDependencies": {
    "cypress": "^11.2.0"
  },
  "dependencies": {
    "lambdatest-cypress-cli": "^3.0.7"
  }
}

To verify that the script works, go to the terminal and execute "npm run <scriptname>". For the above code, the command is npm run test. If you see the test result shown below, your script is working fine.

To run the same using GitHub Actions, go to the workflow .yml file (we covered above how to create one) and pass the script name as the command:

YAML
name: Cypress Cloud Tests
on: [push]
jobs:
  Cypress-Test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout GitCode
        uses: actions/checkout@v2
      - name: Run Cypress Test
        uses: cypress-io/github-action@v4
        with:
          command: npm run test

The command runs the Cypress tests through the script: it checks package.json for a script named "test" and runs it; otherwise, it throws an error. After creating the workflow, you can check the Actions tab in your GitHub repository, which shows the summary of the workflow run. Then navigate to the LambdaTest cloud platform to confirm whether the tests were executed; you should see the latest build on LambdaTest.
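One note on credentials: the sample lambdatest-config.json above stores the username and access key in plain text. For CI runs it is usually better to keep them out of the repository. The workflow below is a sketch of mine, not from the original tutorial; it assumes the LambdaTest CLI reads LT_USERNAME and LT_ACCESS_KEY environment variables (verify the exact names against the LambdaTest documentation) and feeds them from GitHub repository secrets.

YAML
# Sketch: pass LambdaTest credentials from GitHub secrets instead of
# hard-coding them. LT_USERNAME / LT_ACCESS_KEY are assumed variable
# names -- confirm them in the LambdaTest CLI documentation.
name: Cypress Cloud Tests (secrets)
on: [push]
jobs:
  Cypress-Test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout GitCode
        uses: actions/checkout@v2
      - name: Run Cypress Test on LambdaTest
        uses: cypress-io/github-action@v4
        with:
          command: npm run test
        env:
          LT_USERNAME: ${{ secrets.LT_USERNAME }}
          LT_ACCESS_KEY: ${{ secrets.LT_ACCESS_KEY }}

If the CLI picks up the environment variables, the lambdatest_auth block in lambdatest-config.json can hold placeholder values; that behavior is worth confirming in the docs.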
How To Run Cypress Tests for a Specific Browser Using GitHub Actions

You can also run tests on different browsers using Cypress with GitHub Actions. Just provide the Cypress-supported browser name on which you want to run your test case. In the code below, I have passed Firefox, so the test case runs on the Firefox browser, but you can pass Chrome, Electron, or any other browser that Cypress supports.

YAML
name: Cypress Test on Firefox
on: [push]
jobs:
  cypress-run:
    runs-on: ubuntu-20.04
    name: E2E Test on Firefox
    steps:
      - uses: actions/checkout@v2
      - uses: cypress-io/github-action@v4
        with:
          browser: firefox

Once the above workflow has executed, go to the Actions tab and view the logs; you should see the browser name, and it should match the one passed in the workflow file. Below is a sample screenshot of the workflow logs:

How To Run Cypress Tests in a Docker Container Using GitHub Actions

You can also run the Cypress tests in a Docker container using GitHub Actions. There is already a Docker image available with Chrome version 78 and Firefox version 70 pre-installed. You pass the container image using the container keyword in the job definition; after the test cases have executed, the container is stopped. Below is the sample code to run the Cypress tests in a Docker container using GitHub Actions:

YAML
name: Cypress Test in custom container
on: [push]
jobs:
  cypress-test:
    runs-on: ubuntu-20.04
    name: Cypress Tests on docker container
    # Cypress Docker image with Chrome v78
    # and Firefox v70 pre-installed
    container: cypress/browsers:node12.13.0-chrome78-ff70
    steps:
      - uses: actions/checkout@v2
      - uses: cypress-io/github-action@v4
        with:
          browser: chrome

How To Get the Cypress Test Run Results Using GitHub Actions Artifacts

You can get the Cypress test results as artifacts after your workflow has executed, storing the generated videos and screenshots as CI artifacts. For example, suppose you want videos of every test run but screenshots only when a test fails. This can be done by adding conditions to the workflow. We use the actions/upload-artifact@v2 action to store the artifacts and pass a condition to control when each artifact is uploaded. Below is the sample code:

YAML
name: Cypress Artifacts
on: [push]
jobs:
  cypress-run:
    runs-on: ubuntu-latest
    name: Test Artifacts
    steps:
      - uses: actions/checkout@v2
      - uses: cypress-io/github-action@v4
      - uses: actions/upload-artifact@v2
        if: failure()
        with:
          name: cypress-screenshots
          path: ./cypress/screenshots
      - uses: actions/upload-artifact@v2
        if: always()
        with:
          name: cypress-videos
          path: ./cypress/videos

After your workflow has executed, you can check the artifacts from the workflow summary:

1. Click the Actions tab.
2. Select the desired workflow run.
3. Click the workflow; it navigates to the summary page, where the artifacts are listed.
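Before wrapping up, one related pattern worth noting: the single-browser workflow from the earlier section can be fanned out across several browsers with a GitHub Actions build matrix. This is an illustrative sketch of mine, not part of the original tutorial; the browser list is just an example.

YAML
# Sketch: run the same Cypress job once per browser using a matrix.
name: Cypress Tests on Multiple Browsers
on: [push]
jobs:
  cypress-run:
    runs-on: ubuntu-20.04
    strategy:
      matrix:
        browser: [chrome, firefox, electron]   # example browser list
    name: E2E Test on ${{ matrix.browser }}
    steps:
      - uses: actions/checkout@v2
      - uses: cypress-io/github-action@v4
        with:
          browser: ${{ matrix.browser }}

Each matrix entry appears as its own job in the Actions tab, so per-browser failures are easy to spot.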
Conclusion

Cypress is compatible with many of the CI tools available in the market, and GitHub Actions is one of the leading ones. Cypress even has an official GitHub Action (cypress-io/github-action@v4.x.x) that installs the npm dependencies and runs the Cypress tests. Running your Cypress test cases on a CI/CD platform gives you the leverage to schedule them and run them on different trigger events, which saves a lot of time. In this blog on Cypress with GitHub Actions, we covered scenarios for executing Cypress test cases using GitHub Actions and also covered running them on a cloud platform.

Stay tuned for the next topics in the Cypress with GitHub Actions series: test case scheduling, Slack notifications, and uploading results to an AWS S3 bucket using GitHub Actions.