The Return of the Scala Rule Tutorial: The Execution

Learn about the Scala rule in this neat tutorial, including producing an actual executable!

This post builds on the first part of this tutorial. In it, we will make the rule actually produce an executable.

Capturing the Output from scalac

At the end of the tutorial last time, we were calling scalac, but ignoring the result:

(cd /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg && \
  exec env - \
  /bin/bash -c 'external/scala/bin/scalac HelloWorld.scala; echo '\''blah'\'' > bazel-out/local_darwin-fastbuild/bin/hello-world.sh')

If you look at the directory where the action is running (/private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg in my case), you can see that HelloWorld.class and HelloWorld$.class are created. This directory is called the execution root: it is where Bazel executes build actions. Bazel uses separate directory trees for source code, executing build actions, and output files (bazel-out/). Files won’t get moved from the execution root to the output tree unless we tell Bazel we want them.

We want our compiled Scala program to end up in bazel-out/, but there’s a small complication. With languages like Java (and Scala), a single source file might contain inner classes that cause a single compile action to generate multiple .class files. Bazel cannot know how many class files will be generated until it runs the action. However, Bazel requires that each action declare, in advance, what its outputs will be. The way around this is to package up the .class files and make the resulting archive the build output.
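To make the packaging idea concrete outside of Bazel, here is a minimal Python sketch. It is not part of the rule: package_class_files is a hypothetical helper, and it relies only on the fact that a jar is a zip archive. However many .class files the compiler emits, the caller ends up with one output path it can name in advance.

```python
import zipfile
from pathlib import Path


def package_class_files(root, jar_path):
    """Bundle every .class file under root into one archive at jar_path.

    Returns the archive member names. A jar is a zip archive, so the
    standard zipfile module is enough for this illustration; unlike the
    jar tool, it does not add a META-INF/MANIFEST.MF entry.
    """
    root = Path(root)
    # The number of .class files is only known once compilation has run.
    class_files = sorted(p for p in root.rglob("*.class") if p.is_file())
    with zipfile.ZipFile(jar_path, "w") as jar:
        for path in class_files:
            # Store paths relative to root so package directories survive.
            jar.write(path, arcname=path.relative_to(root).as_posix())
    return [p.relative_to(root).as_posix() for p in class_files]
```

The rule in this tutorial does the same job with find and the jar tool inside the build action.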

In this example, we’ll add the .class files into a .jar. Let’s add that to the outputs, which should now look like this:

  outputs = {
    'jar': "%{name}.jar",
    'sh': "%{name}.sh",
  },

In the impl function, our command is getting a bit complicated, so I’m going to change it to a list of commands and then join them on “\n” in the action:

def impl(ctx):
    cmd = [
        "%s %s" % (ctx.file._scalac.path, ctx.file.src.path),
        "find . -name '*.class' -print > classes.list",
        "jar cf %s @classes.list" % (ctx.outputs.jar.path),
    ]

    ctx.action(
        inputs = [ctx.file.src],
        command = "\n".join(cmd),
        outputs = [ctx.outputs.jar]
    )

This will compile the src, find all of the .class files, and add them to the output jar. If we run this, we get:

$ bazel build -s :hello-world
INFO: Found 1 target...
>>>>> # //:hello-world [action 'Unknown hello-world.jar']
(cd /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg && \
  exec env - \
  /bin/bash -c 'external/scala/bin/scalac HelloWorld.scala
find . -name '\''*.class'\'' -print > classes.list
jar cf bazel-out/local_darwin-fastbuild/bin/hello-world.jar @classes.list')
Target //:hello-world up-to-date:
  bazel-bin/hello-world.jar
INFO: Elapsed time: 4.774s, Critical Path: 4.06s

Let’s take a look at what hello-world.jar contains:

$ jar tf bazel-bin/hello-world.jar
META-INF/
META-INF/MANIFEST.MF
HelloWorld$.class
HelloWorld.class

Looks good! However, we cannot actually run this jar, because java doesn’t know what the main class should be:

$ java -jar bazel-bin/hello-world.jar 
no main manifest attribute, in bazel-bin/hello-world.jar

Similar to the java_binary rule, let’s add a main_class attribute to scala_binary and put it in the jar’s manifest. Add 'main_class' : attr.string(), to scala_binary’s attrs and change cmd to the following:

    cmd = [
        "%s %s" % (ctx.file._scalac.path, ctx.file.src.path),
        "echo Manifest-Version: 1.0 > MANIFEST.MF",
        "echo Main-Class: %s >> MANIFEST.MF" % ctx.attr.main_class,
        "find . -name '*.class' -print > classes.list",
        "jar cfm %s MANIFEST.MF @classes.list" % (ctx.outputs.jar.path),
    ]
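For reference, the two echo lines produce a two-line manifest. The plain-Python sketch below reproduces that text so you can see exactly what ends up in MANIFEST.MF. build_manifest is a hypothetical helper for illustration, not something the rule calls.

```python
def build_manifest(main_class):
    """Return manifest text equivalent to the two echo lines in cmd."""
    lines = [
        "Manifest-Version: 1.0",
        "Main-Class: %s" % main_class,
    ]
    # Each echo ends its line with a newline. Keep the trailing one: a
    # final manifest line without a line ending may not be parsed.
    return "\n".join(lines) + "\n"


print(build_manifest("HelloWorld"), end="")
# prints:
# Manifest-Version: 1.0
# Main-Class: HelloWorld
```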

Remember to update your actual BUILD file to add a main_class attribute:

# BUILD
load("/scala", "scala_binary")

scala_binary(
    name = "hello-world",
    src = "HelloWorld.scala",
    main_class = "HelloWorld",
)

Now building and running gives you:

$ bazel build :hello-world
INFO: Found 1 target...
Target //:hello-world up-to-date:
  bazel-bin/hello-world.jar
INFO: Elapsed time: 4.663s, Critical Path: 4.05s
$ java -jar bazel-bin/hello-world.jar 
Exception in thread "main" java.lang.NoClassDefFoundError: scala/Predef$
    at HelloWorld$.main(HelloWorld.scala:4)
    at HelloWorld.main(HelloWorld.scala)
Caused by: java.lang.ClassNotFoundException: scala.Predef$
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 2 more

Closer! Now it cannot find some Scala libraries it needs. You can add the Scala library jar manually on the command line to see that our jar actually does work:

$ java -cp $(bazel info output_base)/external/scala/lib/scala-library.jar:bazel-bin/hello-world.jar HelloWorld
Hello, world!

So we need our rule to generate an executable that basically runs this command, which can be accomplished by adding another action to our build. First we’ll add a dependency on scala-library.jar by adding it as a hidden attribute:

        '_scala_lib': attr.label(
            default=Label("@scala//:lib/scala-library.jar"),
            allow_files=True,
            single_file=True),

Making scala_binarys Executable

Let’s pause here for a moment and switch gears: we’re going to tell Bazel that scala_binarys are binaries. To do this, we add executable = True to the rule definition (alongside attrs, not inside it) and get rid of the reference to hello-world.sh in the outputs:

...
    outputs = {
        'jar': "%{name}.jar",
    },
    implementation = impl,
    executable = True,
)

This says that scala_binary(name = "foo", ...) should have an action that creates a binary called foo, which can be referenced via ctx.outputs.executable in the implementation function. We can now use bazel run :hello-world (instead of bazel build :hello-world; ./bazel-bin/hello-world.sh).

The executable we want to create is the java command from above, so we add the second action to impl, this one a file action (since we’re just generating a file with certain content, not executing a series of commands to generate a .jar):

    cp = "%s:%s" % (ctx.outputs.jar.basename, ctx.file._scala_lib.path)
    content = [
        "#!/bin/bash",
        "echo Running from $PWD",
        "java -cp %s %s" % (cp, ctx.attr.main_class),
    ]
    ctx.file_action(
        content = "\n".join(content),
        output = ctx.outputs.executable,
    )

Note that I also added a line to the file to echo where it is being run from. If we now use bazel run, you’ll see:

$ bazel run :hello-world
INFO: Found 1 target...
Target //:hello-world up-to-date:
  bazel-bin/hello-world.jar
  bazel-bin/hello-world
INFO: Elapsed time: 2.694s, Critical Path: 0.08s

INFO: Running command line: bazel-bin/hello-world
Running from /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg/bazel-out/local_darwin-fastbuild/bin/hello-world.runfiles
Error: Could not find or load main class HelloWorld
ERROR: Non-zero return code '1' from command: Process exited with status 1.

Whoops, it’s not able to find the jars! And what is that hello-world.runfiles path it’s running the binary from?

The runfiles Directory

bazel run runs the binary from the runfiles directory, a directory that is different from the source root, execution root, and output tree mentioned above. The runfiles directory should contain all of the resources needed by the executable during execution. Note that this is not the execution root, which is used during the bazel build step: when you actually execute something created by Bazel, its resources need to be in the runfiles directory.
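One way to think about the failed run: the launcher’s classpath entries are relative paths, so they are looked up under the runfiles directory, and anything not placed there is invisible to java. The Python sketch below illustrates that check. missing_runfiles is a hypothetical helper, and the relative paths used with it are assumed for illustration.

```python
from pathlib import Path


def missing_runfiles(runfiles_dir, needed):
    """Return the entries of needed (relative paths) absent from runfiles_dir."""
    base = Path(runfiles_dir)
    # A relative classpath entry is only usable if it exists under the
    # directory the launcher runs from.
    return [rel for rel in needed if not (base / rel).exists()]
```

Calling it with the runfiles path from the output above and a list such as ["hello-world.jar", "external/scala/lib/scala-library.jar"] would report whichever jars have not been made available to the binary.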

In this case, our executable needs to access hello-world.jar and scala-library.jar. To add these files, the API is somewhat strange. You must return a struct containing a runfiles object from the rule implementation. Thus, add the following as the last line of your impl function:

    return struct(runfiles = ctx.runfiles(files = [ctx.outputs.jar, ctx.file._scala_lib]))

Now if you run it again, it’ll print:

$ bazel run :hello-world
INFO: Found 1 target...
Target //:hello-world up-to-date:
  bazel-bin/hello-world.jar
  bazel-bin/hello-world
INFO: Elapsed time: 0.416s, Critical Path: 0.00s

INFO: Running command line: bazel-bin/hello-world
Running from /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg/bazel-out/local_darwin-fastbuild/bin/hello-world.runfiles
Hello, world!

Hooray!

However! If we run it as bazel-bin/hello-world, it won’t be able to find the jars (because we’re not in the runfiles directory). To find the runfiles directory regardless of where the binary is run from, change your content variable to the following:

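    content = [
        "#!/bin/bash",
        # Make $0 absolute so the runfiles directory can be found from anywhere.
        "case \"$0\" in",
        "/*) self=\"$0\" ;;",
        "*)  self=\"$PWD/$0\";;",
        "esac",
        # Quote the path and only run java if the cd succeeded.
        "(cd \"$self.runfiles\" && java -cp %s %s)" % (cp, ctx.attr.main_class),
    ]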

This way, if it’s run from bazel run, $0 will be the absolute path to the binary (in my case, /private/var/tmp/_bazel_kchodorow/92df5f72e3c78c053575a1a42537d8c3/blerg/bazel-out/local_darwin-fastbuild/bin/hello-world). If it’s run via bazel-bin/hello-world, $0 will be just that: bazel-bin/hello-world. Either way, we’ll end up in the runfiles directory before executing the command.
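The same resolution logic, written as a small Python function, may make the two cases easier to follow. This is only a sketch mirroring the shell case statement. runfiles_dir is a hypothetical name, and the cwd argument stands in for $PWD.

```python
def runfiles_dir(argv0, cwd):
    """Mirror the launcher's case statement: make $0 absolute, add .runfiles."""
    if argv0.startswith("/"):
        self_path = argv0              # /*) self="$0"
    else:
        self_path = cwd + "/" + argv0  # *)  self="$PWD/$0"
    return self_path + ".runfiles"
```

With an absolute argv0 the working directory is ignored. With bazel-bin/hello-world it is prefixed, so both invocations resolve to a hello-world.runfiles directory next to the binary.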

Now our rule is successfully generating a binary. You can see the full code for this example on GitHub.

In the final part of this tutorial, we’ll fix the remaining issues:

  • No support for multiple source files, never mind dependencies.
  • [action 'Unknown hello-world.jar'] is pretty ugly.

Until next time!

Topics:
performance, scala, scale rule, java

Published at DZone with permission of Kristina Chodorow, DZone MVB. See the original article here.
