Testing Legacy JSP Code

In this article, learn how to test JSP with the least effort while getting the most out of the automated tests, and keep focus on what matters.

Zoltán Csorba

Feb. 18, 26 · Tutorial

JSP may be old and no longer trendy, but many legacy systems still use it, and development teams are still tasked with maintaining and extending systems with a JSP frontend (see https://webtechsurvey.com/technology/javaserver-pages). What can you do when the Java code has unit tests, but a significant part of the code base lives in untested frontend code that is prone to failures?

You can rely on code reviews or pull requests, but that seems insufficient to catch even trivial issues. You can wait for manual testers or automated UI tests to find problems after the change has been deployed to the QA environment, but that is far too late and cumbersome.

Alternatively, you can write an integrated unit test, which seems like an enormous task, considering the benefits may be minimal, and there are hundreds of JSP files to test to make it worthwhile.

The answer is to progress in small steps. Start with something simple and minimal that gives you some benefit, and add more checks as testing matures.

This is how I approached this question and where it led.

Step 1

Start with a single file.

After some quick research, I found that many others were asking the same question, and I found some libraries that could help with the initial steps. I ended up using an embedded Jetty server, configured to process JSP files from the webapp folder of the project.

Here you can find code samples to set it up. Then, using the HttpTester component of the jetty-http library, I wrote a test case that targets a specific JSP file and extracts the HTML output.

    Java
   
 

protected HttpTester.Response getJspResponse(String pathToJsp) {
  // Build a raw GET request for the JSP and send it through the embedded server's local connector
  HttpTester.Request request = HttpTester.newRequest();
  request.put(HttpHeader.HOST, "localhost");
  request.setURI(pathToJsp);
  request.setMethod("GET");
  return HttpTester.parseResponse(localConnector.getResponse(request.generate()));
}
  

I find approval testing a great approach that fits this use case perfectly. The output is saved in a test folder and placed under source control. Each subsequent test run compares the fresh result with this baseline.

    Java
   
 

protected void assertHtmlStatusAndResponseBody(Response jspResponse, String pathToHtml, int expectedStatusCode) {
  assertThat(jspResponse.getStatus()).withFailMessage("Expecting HTTP response code to be: <%d>, but was: <%d>",
      expectedStatusCode, jspResponse.getStatus()).isEqualTo(expectedStatusCode);
  // Normalize line endings so the baseline does not depend on the platform
  String htmlOutput = jspResponse.getContent().trim().replace("\r\n", "\n");
  assertTestFileEquals(pathToHtml, htmlOutput);
}
  

I just needed to add some more input data to make the output more meaningful. At this point, I was already able to progress in a controlled manner: as I polished the test data, the test reported every difference in the output.
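The assertTestFileEquals helper is not shown above. As a rough sketch of how such a baseline comparison might work, the example below stores the output under a baseline folder and compares later runs against it. The class name ApprovalFiles, the configurable root folder, and the choice to create a missing baseline and fail on the first run are assumptions made for illustration, not the implementation used in the project.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical approval-test helper; an illustration only. */
final class ApprovalFiles {
    private final Path baselineRoot;

    ApprovalFiles(Path baselineRoot) {
        this.baselineRoot = baselineRoot;
    }

    /**
     * Compares fresh output with the stored baseline. If no baseline exists yet,
     * the output is stored as the new baseline and the check fails, so that the
     * new file gets reviewed and committed.
     */
    void assertTestFileEquals(String relativePath, String actual) {
        String cleanPath = relativePath.startsWith("/") ? relativePath.substring(1) : relativePath;
        Path baseline = baselineRoot.resolve(cleanPath);
        String normalized = actual.replace("\r\n", "\n");
        try {
            if (Files.notExists(baseline)) {
                Path parent = baseline.getParent();
                if (parent != null) {
                    Files.createDirectories(parent);
                }
                Files.write(baseline, normalized.getBytes(StandardCharsets.UTF_8));
                throw new AssertionError("No baseline found; created " + baseline + " for review");
            }
            String expected = new String(Files.readAllBytes(baseline), StandardCharsets.UTF_8)
                    .replace("\r\n", "\n");
            if (!expected.equals(normalized)) {
                throw new AssertionError("Output differs from baseline " + baseline);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Failing on the first run is a design choice: it forces someone to look at the generated HTML before it becomes the accepted baseline.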

Step 2

Add one more test and see the similarities; weed out the changing variables. If it’s working for one file, then it should be easy to extend to another.

In fact, these tests are very similar; a handful of "variables" can describe a test case: the path to the JSP file, the class of the data model, and the name under which the model is referenced.

There's a great article about parameterized tests using EnumSource.

    Java
   
 

   @RequiredArgsConstructor
private enum JspTestCases {
  SETTINGS_USER("/settings/user.jsp", UserSettingsForm.class, "userForm"),
  SEARCH("/search.jsp", SearchQueryForm.class, "searchForm");
  private final String pathToJsp;
  private final Class<?> formClass;
  private final String formName;
}
  

And the parameterized test case can be as simple as:

    Java
   
 

   @ParameterizedTest
@EnumSource(JspTestCases.class)
public void testJsp(JspTestCases testCase) throws Exception {
  // given
  String htmlOutputPath = testCase.pathToJsp.replace(".jsp", ".html");
  Object form = SmartValueFactory.createValue(testCase.formClass);
  setModelAttribute(testCase.formName, form);
  // when
  Response actual = getJspResponse(testCase.pathToJsp);
  // then
  assertHtmlStatusAndResponseBody(actual, htmlOutputPath, 200);
}
  

Creating a good input is not trivial, but for that, I already had a generic test value factory. This itself is a big topic; I’ll talk about it in a separate post, but the basic idea is that we use a "smart mock" object where each method returns a deterministic non-null value.
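To make the "smart mock" idea more concrete, here is a minimal sketch built on JDK dynamic proxies. It is not the SmartValueFactory used in the tests above: it only handles interfaces, and the names SmartValueSketch and UserSettingsView, as well as the rules for deriving values, are assumptions chosen for this example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

/** Hypothetical view interface used only to demonstrate the sketch. */
interface UserSettingsView {
    String getUserName();
    int getPageSize();
    boolean isAdmin();
}

/** Hypothetical "smart mock": every method returns a deterministic, non-null value. */
final class SmartValueSketch {

    @SuppressWarnings("unchecked")
    static <T> T createValue(Class<T> type) {
        InvocationHandler handler = (proxy, method, args) -> valueFor(method);
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    /** Derives the value from the method name and return type, so repeated runs agree. */
    private static Object valueFor(Method method) {
        Class<?> returnType = method.getReturnType();
        String name = method.getName();
        if (returnType == void.class) return null;
        if (returnType == String.class) return name + "-value";
        if (returnType == int.class || returnType == Integer.class) return name.length();
        if (returnType == long.class || returnType == Long.class) return (long) name.length();
        if (returnType == boolean.class || returnType == Boolean.class) return Boolean.TRUE;
        if (returnType.isInterface()) return createValue(returnType); // nested model objects
        throw new UnsupportedOperationException("No test value defined for " + returnType);
    }
}
```

Because the values depend only on the method name and return type, the rendered HTML stays stable between runs, which is what the baseline comparison needs.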

In some cases, dynamic values appeared in the output, e.g., a session ID or the current timestamp. These needed to be normalized and replaced with placeholders so that the output stays consistent.

    Java
   
   public static String replaceJsessionId(String html) {
  return html.replaceAll("jsessionid=[^\"]+", "jsessionid=%JSESSIONID%");
}
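
Several such replacements can be chained into one normalization step. The sketch below reuses the session ID pattern shown above and adds a timestamp rule; the class name HtmlNormalizer, the %TIMESTAMP% placeholder, and the assumed timestamp format (yyyy-MM-dd HH:mm:ss, with a space or "T" separator) are illustrative assumptions and would need to match whatever the pages actually print.

```java
import java.util.regex.Pattern;

/** Hypothetical normalizer; the timestamp format is an assumption for this example. */
final class HtmlNormalizer {
    private static final Pattern JSESSIONID = Pattern.compile("jsessionid=[^\"]+");
    private static final Pattern TIMESTAMP =
            Pattern.compile("\\d{4}-\\d{2}-\\d{2}[T ]\\d{2}:\\d{2}:\\d{2}");

    /** Replaces run-specific values with fixed placeholders. */
    static String normalize(String html) {
        String result = JSESSIONID.matcher(html).replaceAll("jsessionid=%JSESSIONID%");
        return TIMESTAMP.matcher(result).replaceAll("%TIMESTAMP%");
    }
}
```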

Step 3

Find a way to deal with hundreds of test cases with a clever solution.

At this point, I could see that the approach worked, but covering everything looked like an enormous task. Writing a test case by hand for hundreds of files is not feasible, and even if it were done once, it would be nearly impossible to maintain.

Yet again, the tests are actually very similar. We just need to list all the paths and determine the data model. Writing a path-traversal algorithm and text extraction is much easier, faster, and more reliable than doing so manually.

The extraction logic came with a nice additional benefit: I wrote it as a unit test itself. First, it collects the list of existing test cases (by finding all JSP path references in the already implemented tests). Then it searches the webapp folder and checks each JSP file for a corresponding test case. If there is none, it analyzes the JSP to figure out the data model and lists all such files as a "test failure", with a handy little code snippet for the missing test cases.

    Java
   
 

   public static interface JspTestModule {
        String getBaseFolder();
        Class<? extends AbstractJspTestBase> getBaseClass();
}
public void testCoverage(JspTestModule testModule) {
  // given
  logger.info("##########################################################################");
  logger.info("Check JSP test coverage for {}", testModule.getBaseFolder());
  Set<String> existingJspFiles = collectJspFilesFromWebAppFolder(testModule.getBaseFolder());
  // when
  Set<String> jspFileReferences = collectJspFileReferencesFromTestClasses(testModule.getBaseClass(), testModule.getBaseFolder());
  // then
  Collection<String> differences = new TreeSet<>(
    CollectionUtils.disjunction(existingJspFiles, jspFileReferences));
  logger.info("Currently tested JSP files under {}: {}", testModule.getBaseFolder(), jspFileReferences.size());
  Set<String> missing = new TreeSet<>(existingJspFiles);
  missing.removeAll(jspFileReferences);
  if (!missing.isEmpty()) {
    logger.warn("Number of untested JSP files under {}: {}", testModule.getBaseFolder(), missing.size());
    List<String> missingTestCaseDeclarations = new LinkedList<>();
    for (String jspReference : missing) {
      Optional<String> formName = determineCorrectFormDeclarationForJspReference(jspReference, testModule.getBaseFolder());
      if (formName.isPresent()) {
        missingTestCaseDeclarations.add(convertToEnumName(jspReference) + "(\"" + jspReference + "\", " + formName.get() + ".class),");
      } else {
        missingTestCaseDeclarations.add(convertToEnumName(jspReference) + "(\"" + jspReference + "\"),");
      }
    }
    logger.warn("Add following test cases to the parameterized test:\n{}", String.join("\n", new TreeSet<>(missingTestCaseDeclarations)));
  } else {
    logger.info("All JSP files under {} have a corresponding unit test case", testModule.getBaseFolder());
  }
  assertThat(differences).isEmpty();
  assertThat(jspFileReferences).containsExactlyInAnyOrderElementsOf(existingJspFiles);
}
  

This way, I was able to create the necessary test cases from the test output, and then the test passed. It also meant that the maintenance became very easy, since anytime we’re adding a new JSP file, this test would ensure that we won’t forget to add a new test case.
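The helper methods called in the coverage test are not shown. As a rough illustration, the folder scan and the enum name conversion could look like the sketch below. The class name JspDiscovery and the exact naming rule are assumptions for this example; the project's real helpers may behave differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helpers for discovering JSP files; an illustration only. */
final class JspDiscovery {

    /** Returns webapp-relative paths such as "/settings/user.jsp", sorted alphabetically. */
    static Set<String> collectJspFiles(Path webAppFolder) {
        try (Stream<Path> paths = Files.walk(webAppFolder)) {
            return paths.filter(Files::isRegularFile)
                    .map(p -> "/" + webAppFolder.relativize(p).toString().replace('\\', '/'))
                    .filter(p -> p.endsWith(".jsp"))
                    .collect(Collectors.toCollection(TreeSet::new));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Converts a path like "/settings/user.jsp" to an enum constant name like "SETTINGS_USER". */
    static String convertToEnumName(String jspPath) {
        return jspPath.replaceFirst("\\.jsp$", "")
                .replaceAll("[^A-Za-z0-9]+", "_")
                .replaceAll("^_+|_+$", "")
                .toUpperCase(Locale.ROOT);
    }
}
```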

Step 4

See the common development issues and how those can be prevented. One of the repeating mistakes was simple typos in the code. For example, a misplaced quotation mark, a missing closing tag, and so on.

These syntax errors might remain hidden, since browsers are forgiving enough to render malformed HTML. Since HTML validation is easy and relatively fast to execute in a test, adding it as an additional check on the output was trivial. The question was what to do with the result. There can be surprisingly many validation errors in a legacy code base, and it's not feasible to "just get in and fix all of them." So, the test should not fail, but it should give a clear report of the issues detected.

The solution was to append the report to the end of the HTML output in an HTML comment block, which became part of the baseline. This exposed the problems without forcing immediate action. Still, it made us aware of them, and any time a file was touched for further development, we could chip away at a few, starting with the fatal errors (XML syntax issues, like unclosed tags, missing quotation marks, etc.) and then moving on to the critical issues (HTML syntax issues).
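Appending the report could be as simple as the following sketch. It assumes the validation messages have already been produced by whichever HTML validator is in use; the class name ValidationReportAppender and the report layout are hypothetical.

```java
import java.util.List;

/** Hypothetical helper that stores validation findings inside the baseline file. */
final class ValidationReportAppender {

    /** Appends the messages as a trailing HTML comment; returns the HTML unchanged if there are none. */
    static String appendReport(String html, List<String> messages) {
        if (messages.isEmpty()) {
            return html;
        }
        StringBuilder out = new StringBuilder(html).append("\n<!-- HTML validation report\n");
        for (String message : messages) {
            // "--" must not appear inside an HTML comment, so consecutive hyphens are spaced out
            out.append("  ").append(message.replaceAll("-(?=-)", "- ")).append('\n');
        }
        return out.append("-->").toString();
    }
}
```

Since the comment is part of the compared output, fixing a validation error changes the baseline just like any other intended change, and a newly introduced error shows up as a test difference.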

Step 5

Detect and fix potential XSS issues.

Automated vulnerability checks and penetration tests can reveal many vulnerabilities related to end-user-controlled input data that poses a potential threat to the system. But even with these reports at hand, it takes a lot of time to trace each problem back to specific lines of code and fix them one by one, each time going through exhausting manual testing cycles just to confirm that nothing is broken and that the malicious values no longer end up in the output.

What if we could detect these issues without deploying the application, and also check that the implemented fix did not introduce regressions?

The test value factory mentioned above provides a convenient way to handle this. The dynamic data model made it possible to inject any value into the fields referenced in the JSP code base. Any time a field of type String (or, more accurately, a method with a String return value) was referenced, the test value provider injected a value like this: <script>alert('XSS:fieldname');</script>.

Here, fieldname is the actual name of the field or method used. The test analyzed the output and detected whether the value appeared unchanged or properly escaped. The result of this analysis was then written to a report file whose location is derived from the path of the JSP file: src/test/resources/jsptest/xss/path-to-jsp/name-of-jsp.report.

When everything is fine, the report contains a reassuring line that “All fields are protected”, along with the names of these fields. Otherwise, the list of the vulnerable field references and the corresponding line is provided to make it easier to find them.
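The core of such an analysis can be sketched in a few lines. The example below is a simplification under stated assumptions: the class name XssProbe and the three-state result are hypothetical, and a field counts as protected whenever its marker is present but the raw script tag is not, which ignores context-specific cases such as values written into attributes or inline scripts.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

/** Hypothetical XSS output check; a simplified illustration. */
final class XssProbe {
    enum Status { VULNERABLE, PROTECTED, NOT_RENDERED }

    /** Builds the payload in the shape described above, tagged with the field name. */
    static String payloadFor(String fieldName) {
        return "<script>alert('XSS:" + fieldName + "');</script>";
    }

    /** Classifies each field by how its payload shows up in the rendered HTML. */
    static Map<String, Status> analyze(String html, List<String> fieldNames) {
        Map<String, Status> result = new TreeMap<>();
        for (String field : fieldNames) {
            // The lookahead keeps "XSS:getName" from matching inside "XSS:getNameSuffix"
            Pattern marker = Pattern.compile(Pattern.quote("XSS:" + field) + "(?![A-Za-z0-9_])");
            if (html.contains(payloadFor(field))) {
                result.put(field, Status.VULNERABLE);   // raw markup reached the page
            } else if (marker.matcher(html).find()) {
                result.put(field, Status.PROTECTED);    // marker present, markup was escaped
            } else {
                result.put(field, Status.NOT_RENDERED); // field never appeared in the output
            }
        }
        return result;
    }
}
```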

The most important part of this is that the normal HTML output should remain intact after the fix, since "normal" values should not change the output (i.e., escaping a value without special characters leaves it unchanged). This means that the test shows there's no regression issue.

Step 6

How do you know that all lines are exercised by the tests?

Conditional blocks (if or choose-when statements), sometimes nested, can make it tough to see which parts are actually active in the test output. It would be nice to see a line coverage report for the JSP code, just like the ones EclEmma and JaCoCo provide for Java code.

With a neat trick, the line coverage can be extracted automatically in the same run as the normal test. By adding an HTML comment containing the line number to each line of the JSP source, we can see in the output exactly which JSP lines were used and which ones are missing.

The lineCoverageAnalyzer below is a custom implementation that inserts these comments into the JSP source, with additional logic to skip <script> blocks, HTML comment blocks, and multiline tags.

    Java
   
 

protected HttpTester.Response getJspResponse(String pathToJsp) {
  HttpTester.Request request = HttpTester.newRequest();
  request.put(HttpHeader.HOST, "localhost");
  // Request an instrumented temporary copy of the JSP instead of the original file
  String tempJspFilePath = lineCoverageAnalyzer.addLineNumberCommentsToJsp(pathToJsp);
  request.setURI(tempJspFilePath);
  request.setMethod("GET");
  return HttpTester.parseResponse(localConnector.getResponse(request.generate()));
}
  

And we extract and delete this information from the output before comparing it to the baseline.

    Java
   
 

protected void assertHtmlStatusAndResponseBody(Response jspResponse, String pathToHtml, int expectedStatusCode) {
  assertThat(jspResponse.getStatus()).withFailMessage("Expecting HTTP response code to be: <%d>, but was: <%d>",
      expectedStatusCode, jspResponse.getStatus()).isEqualTo(expectedStatusCode);
  // Record the covered lines and strip the line markers before the baseline comparison
  String htmlOutput = lineCoverageAnalyzer.collectCoverageDataFromHtml(jspResponse.getContent().trim().replace("\r\n", "\n"));
  assertTestFileEquals(pathToHtml, htmlOutput);
}
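
The analyzer itself is not listed here. A stripped-down sketch of the two operations, inserting the markers and collecting them again, might look like this. The marker format <!--jsp-line:N--> and the class name LineMarkerSketch are assumptions for this example, and the sketch works on strings and leaves out the special handling of <script> blocks, HTML comments, and multiline tags.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, simplified line marker logic; the marker format is an assumption. */
final class LineMarkerSketch {
    private static final Pattern MARKER = Pattern.compile("<!--jsp-line:(\\d+)-->");

    /** Appends a line-number comment to every non-blank line of the JSP source. */
    static String addLineNumberComments(String jspSource) {
        String[] lines = jspSource.split("\n", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            out.append(lines[i]);
            if (!lines[i].trim().isEmpty()) {
                out.append("<!--jsp-line:").append(i + 1).append("-->");
            }
            if (i < lines.length - 1) {
                out.append('\n');
            }
        }
        return out.toString();
    }

    /** Records the line numbers found in the output and returns the HTML without markers. */
    static String collectCoverage(String html, Set<Integer> coveredLines) {
        Matcher matcher = MARKER.matcher(html);
        while (matcher.find()) {
            coveredLines.add(Integer.parseInt(matcher.group(1)));
        }
        return MARKER.matcher(html).replaceAll("");
    }
}
```

A marker placed inside a conditional block is only rendered when the condition holds, so a missing line number in the output indicates a branch that the test data did not reach.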
  

The line coverage information is best collected for multiple test cases and printed as a report at the end of the test class.

    Java
   
 

private static AbstractJspTestBase currentTestRun;
// JUnit 5 lifecycle annotations, matching the @ParameterizedTest used earlier
@BeforeEach
public void setupTest() {
  currentTestRun = this;
}
@AfterAll
public static void stopJettyServer() throws Exception {
  try {
    logger.info("Stopping Jetty server...");
    jettyServer.stop();
  } finally {
    // Compare the accumulated coverage report with its baseline
    String lineCoverageReport = lineCoverageAnalyzer.getLineCoverageReport();
    String className = currentTestRun.getClass().getSimpleName();
    currentTestRun = null;
    assertTestFileEquals("jsptest/coverage/" + className + ".report", lineCoverageReport);
  }
}

  

This results in a report file like this: src/test/resources/jsptest/coverage/UserPagesParameterizedTest.report.

    Plain Text
   
   Uncovered lines in JSP files:
    /settings/user.jsp: 23-27
    /search.jsp: 12-16, 42-43

Or just a single line, "All lines covered in JSP files", which is the equivalent of 100% coverage.
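Collapsing the uncovered line numbers into ranges such as 12-16 is a small formatting step. The following sketch shows one way to do it; the class name CoverageRanges is hypothetical, and single lines are printed as a plain number, which is an assumption about the report format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

/** Hypothetical formatter that turns sorted line numbers into compact ranges. */
final class CoverageRanges {

    /** Formats, for example, [12, 13, 14, 15, 16, 42, 43] as "12-16, 42-43". */
    static String format(SortedSet<Integer> uncoveredLines) {
        List<String> ranges = new ArrayList<>();
        int start = -1;
        int previous = -1;
        for (int line : uncoveredLines) {
            if (start != -1 && line == previous + 1) {
                previous = line;          // still inside the current run of lines
                continue;
            }
            if (start != -1) {
                ranges.add(range(start, previous));
            }
            start = line;                 // a new run begins here
            previous = line;
        }
        if (start != -1) {
            ranges.add(range(start, previous));
        }
        return String.join(", ", ranges);
    }

    private static String range(int start, int end) {
        return start == end ? String.valueOf(start) : start + "-" + end;
    }
}
```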

Conclusion

With that, we have a test package with numerous benefits:

Provides consistent results with immediate feedback on the changes
Fails only if the JSP code is changing or the data model becomes corrupted (e.g., a field was renamed in Java but not in the JSP)
Exposes existing problems without forcing an immediate fix, enabling an incremental approach to improving quality
Highlights new issues, which helps avoid the slow degradation of the code base (no more new XSS vulnerabilities or untested code blocks)

The same approach could be used for other server-side technologies, like JSF or Thymeleaf.
