A Look Inside JBoss Microcontainer, Part 3 - the Virtual File System

By Ales Justin · Sep. 24, 09 · News

We're finally back with the next article in the Microcontainer series. The first two articles introduced Microcontainer's core features. Before we can explain Classloading and Deployers, we first need to get familiar with VFS.

VFS stands, as you might expect, for Virtual File System. What problem does it solve for us, and why is it useful?

Here at JBoss we saw that a lot of similar resource handling code was scattered and duplicated all over the place.
In most cases the code was trying to determine what type of resource it was dealing with: a file, a directory, or a jar whose resources are loaded through URLs. Processing of nested archives was also reimplemented again and again in different libraries.

Read the other parts in DZone's exclusive JBoss Microcontainer Series: 

  • Part 4 -- ClassLoading Layer

 

Example:
 

// Typical pre-VFS scanning code. The helpers getAlternativeJarFile, searchJar and
// searchFromURL live elsewhere in the original class and are not shown here.
public static URL[] search(ClassLoader cl, String prefix, String suffix) throws IOException
{
   Enumeration<?>[] e = new Enumeration<?>[]{
         cl.getResources(prefix),
         cl.getResources(prefix + "MANIFEST.MF")
   };
   Set<URL> all = new LinkedHashSet<URL>();
   for (int i = 0; i < e.length; ++i)
   {
      while (e[i].hasMoreElements())
      {
         URL url = (URL)e[i].nextElement();
         URLConnection conn = url.openConnection();
         conn.setUseCaches(false);
         conn.setDefaultUseCaches(false);
         // First guess: the resource lives inside a jar
         JarFile jarFile;
         if (conn instanceof JarURLConnection)
         {
            jarFile = ((JarURLConnection)conn).getJarFile();
         }
         else
         {
            jarFile = getAlternativeJarFile(url);
         }
         if (jarFile != null)
         {
            searchJar(cl, all, jarFile, prefix, suffix);
         }
         else
         {
            // Second guess: a directory; last resort: scan through the URL itself
            boolean searchDone = searchDir(all, new File(URLDecoder.decode(url.getFile(), "UTF-8")), suffix);
            if (searchDone == false)
            {
               searchFromURL(all, prefix, suffix, url);
            }
         }
      }
   }
   return all.toArray(new URL[all.size()]);
}

// Recursively collects files ending with the suffix; returns false if 'file' is not a readable directory.
private static boolean searchDir(Set<URL> result, File file, String suffix) throws IOException
{
   if (file.exists() && file.isDirectory())
   {
      File[] fc = file.listFiles();
      if (fc == null)
      {
         return false; // listFiles() returns null when the directory cannot be read
      }
      for (int i = 0; i < fc.length; i++)
      {
         String path = fc[i].getAbsolutePath();
         if (fc[i].isDirectory())
         {
            searchDir(result, fc[i], suffix);
         }
         else if (path.endsWith(suffix))
         {
            // File.toURL() is deprecated because it does not escape special characters
            result.add(fc[i].toURI().toURL());
         }
      }
      return true;
   }
   return false;
}


There were also many problems with file locking on Windows systems. To avoid locking archives in the deploy folders, which would prevent their deletion and therefore filesystem based undeploy, we were forced to copy all hot-deployable archives to another location.


File locking was a major problem that could only be addressed by centralizing all the resource loading code in one place.


Recognizing a need to deal with all of these issues in one place, wrapping it all into a simple and useful API, we created the VFS project.



VFS public API


Basic usage of VFS can be split into two parts:


  • simple resource navigation
  • visitor pattern API


As mentioned, navigating over resources with plain JDK resource handling is far from trivial. You must always check what kind of resource you're currently handling, which quickly becomes cumbersome.


With VFS we wanted to limit this to a single resource type - VirtualFile.


public class VirtualFile implements Serializable
{
  /**
   * Get certificates.
   *
   * @return the certificates associated with this virtual file
   */
   Certificate[] getCertificates()

  /**
   * Get the simple VF name (X.java)
   *
   * @return the simple file name
   * @throws IllegalStateException if the file is closed
   */
   String getName()

  /**
   * Get the VFS relative path name (org/jboss/X.java)
   *
   * @return the VFS relative path name
   * @throws IllegalStateException if the file is closed
   */
   String getPathName()

  /**
   * Get the VF URL (file://root/org/jboss/X.java)
   *
   * @return the full URL to the VF in the VFS.
   * @throws MalformedURLException if a url cannot be parsed
   * @throws URISyntaxException if a uri cannot be parsed
   * @throws IllegalStateException if the file is closed
   */
   URL toURL() throws MalformedURLException, URISyntaxException

  /**
   * Get the VF URI (file://root/org/jboss/X.java)
   *
   * @return the full URI to the VF in the VFS.
   * @throws URISyntaxException if a uri cannot be parsed
   * @throws IllegalStateException if the file is closed
   * @throws MalformedURLException for a bad url
   */
   URI toURI() throws MalformedURLException, URISyntaxException

  /**
   * When the file was last modified
   *
   * @return the last modified time
   * @throws IOException for any problem accessing the virtual file system
   * @throws IllegalStateException if the file is closed
   */
   long getLastModified() throws IOException

  /**
   * Returns true if the file has been modified since this method was last called
   * Last modified time is initialized at handler instantiation.
   *
   * @return true if modified, false otherwise
   * @throws IOException for any error
   */
   boolean hasBeenModified() throws IOException

  /**
   * Get the size
   *
   * @return the size
   * @throws IOException for any problem accessing the virtual file system
   * @throws IllegalStateException if the file is closed
   */
   long getSize() throws IOException

  /**
   * Tests whether the underlying implementation file still exists.
   * @return true if the file exists, false otherwise.
   * @throws IOException - thrown on failure to detect existence.
   */
   boolean exists() throws IOException

  /**
   * Whether it is a simple leaf of the VFS,
   * i.e. whether it can contain other files
   *
   * @return true if a simple file.
   * @throws IOException for any problem accessing the virtual file system
   * @throws IllegalStateException if the file is closed
   */
   boolean isLeaf() throws IOException

  /**
   * Whether the file is an archive.
   *
   * @return true if archive, false otherwise
   * @throws IOException for any error
   */
   boolean isArchive() throws IOException

  /**
   * Whether it is hidden
   *
   * @return true when hidden
   * @throws IOException for any problem accessing the virtual file system
   * @throws IllegalStateException if the file is closed
   */
   boolean isHidden() throws IOException

  /**
   * Access the file contents.
   *
   * @return an InputStream for the file contents.
   * @throws IOException for any error accessing the file system
   * @throws IllegalStateException if the file is closed
   */
   InputStream openStream() throws IOException

  /**
   * Do file cleanup.
   *
   * e.g. delete temp files
   */
   void cleanup()

  /**
   * Close the file resources (stream, etc.)
   */
   void close()

  /**
   * Delete this virtual file
   *
   * @return true if file was deleted
   * @throws IOException if an error occurs
   */
   boolean delete() throws IOException

  /**
   * Delete this virtual file
   *
   * @param gracePeriod max time to wait for any locks (in milliseconds)
   * @return true if file was deleted
   * @throws IOException if an error occurs
   */
   boolean delete(int gracePeriod) throws IOException

  /**
   * Get the VFS instance for this virtual file
   *
   * @return the VFS
   * @throws IllegalStateException if the file is closed
   */
   VFS getVFS()

  /**
   * Get the parent
   *
   * @return the parent or null if there is no parent
   * @throws IOException for any problem accessing the virtual file system
   * @throws IllegalStateException if the file is closed
   */
   VirtualFile getParent() throws IOException

  /**
   * Get a child
   *
   * @param path the path
   * @return the child or <code>null</code> if not found
   * @throws IOException for any problem accessing the VFS
   * @throws IllegalArgumentException if the path is null
   * @throws IllegalStateException if the file is closed or it is a leaf node
   */
   VirtualFile getChild(String path) throws IOException

  /**
   * Get the children
   *
   * @return the children
   * @throws IOException for any problem accessing the virtual file system
   * @throws IllegalStateException if the file is closed
   */
   List<VirtualFile> getChildren() throws IOException

  /**
   * Get the children
   *
   * @param filter to filter the children
   * @return the children
   * @throws IOException for any problem accessing the virtual file system
   * @throws IllegalStateException if the file is closed or it is a leaf node
   */
   List<VirtualFile> getChildren(VirtualFileFilter filter) throws IOException

  /**
   * Get all the children recursively<p>
   *
   * This always uses {@link VisitorAttributes#RECURSE}
   *
   * @return the children
   * @throws IOException for any problem accessing the virtual file system
   * @throws IllegalStateException if the file is closed
   */
   List<VirtualFile> getChildrenRecursively() throws IOException

  /**
   * Get all the children recursively<p>
   *
   * This always uses {@link VisitorAttributes#RECURSE}
   *
   * @param filter to filter the children
   * @return the children
   * @throws IOException for any problem accessing the virtual file system
   * @throws IllegalStateException if the file is closed or it is a leaf node
   */
   List<VirtualFile> getChildrenRecursively(VirtualFileFilter filter) throws IOException

  /**
   * Visit the virtual file system
   *
   * @param visitor the visitor
   * @throws IOException for any problem accessing the virtual file system
   * @throws IllegalArgumentException if the visitor is null
   * @throws IllegalStateException if the file is closed
   */
   void visit(VirtualFileVisitor visitor) throws IOException
}


As you can see, you have all of the usual read-only file system operations, plus a few options to clean up or delete the resource. Cleanup and deletion handling is needed when we're dealing with internal temporary files, e.g. those created while handling nested jars.


To switch from the JDK's File or URL resource handling to the new VirtualFile we need a root. The VFS class knows how to create one from a URL or URI parameter.


public class VFS
{
  /**
   * Get the virtual file system for a root uri
   *
   * @param rootURI the root URI
   * @return the virtual file system
   * @throws IOException if there is a problem accessing the VFS
   * @throws IllegalArgumentException if the rootURI is null
   */
   static VFS getVFS(URI rootURI) throws IOException

  /**
   * Create new root
   *
   * @param rootURI the root uri
   * @return the virtual file
   * @throws IOException if there is a problem accessing the VFS
   * @throws IllegalArgumentException if the rootURI is null
   */
   static VirtualFile createNewRoot(URI rootURI) throws IOException

  /**
   * Get the root virtual file
   *
   * @param rootURI the root uri
   * @return the virtual file
   * @throws IOException if there is a problem accessing the VFS
   * @throws IllegalArgumentException if the rootURI is null
   */
   static VirtualFile getRoot(URI rootURI) throws IOException

  /**
   * Get the virtual file system for a root url
   *
   * @param rootURL the root url
   * @return the virtual file system
   * @throws IOException if there is a problem accessing the VFS
   * @throws IllegalArgumentException if the rootURL is null
   */
   static VFS getVFS(URL rootURL) throws IOException

  /**
   * Create new root
   *
   * @param rootURL the root url
   * @return the virtual file
   * @throws IOException if there is a problem accessing the VFS
   * @throws IllegalArgumentException if the rootURL is null
   */
   static VirtualFile createNewRoot(URL rootURL) throws IOException

  /**
   * Get the root virtual file
   *
   * @param rootURL the root url
   * @return the virtual file
   * @throws IOException if there is a problem accessing the VFS
   * @throws IllegalArgumentException if the rootURL is null
   */
   static VirtualFile getRoot(URL rootURL) throws IOException

  /**
   * Get the root file of this VFS
   *
   * @return the root
   * @throws IOException for any problem accessing the VFS
   */
   VirtualFile getRoot() throws IOException
}


You can see three methods that look a lot alike: getVFS, createNewRoot and getRoot. The getVFS method returns a VFS instance and, importantly, doesn't yet create a VirtualFile instance. This matters because there are methods that let us configure a VFS instance (see the VFS class API javadocs) before telling it to create a VirtualFile root.

The other two methods, on the other hand, use default settings for root creation. The difference between createNewRoot and getRoot lies in caching details, which we'll delve into later on.

URL rootURL = ...; // get root url
VFS vfs = VFS.getVFS(rootURL);
// configure vfs instance
VirtualFile root1 = vfs.getRoot();
// or you can get root directly
VirtualFile root2 = VFS.createNewRoot(rootURL);
VirtualFile root3 = VFS.getRoot(rootURL);


The other useful thing about the VFS API is its implementation of a proper visitor pattern. This makes it very simple to recursively gather different resources, something that is very awkward to do with plain JDK resource loading.

public interface VirtualFileVisitor
{
  /**
   * Get the search attributes for this visitor
   *
   * @return the attributes
   */
  VisitorAttributes getAttributes();

  /**
   * Visit a virtual file
   *
   * @param virtualFile the virtual file being visited
   */
  void visit(VirtualFile virtualFile);
}

VirtualFile root = ...; // get root
VirtualFileVisitor visitor = new SuffixVisitor(".class"); // get all classes
root.visit(visitor);
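
To make the visitor idea more concrete, here is a minimal, self-contained sketch of how a suffix-matching visitor could work. It deliberately does not use the VFS library: Node, NodeVisitor, SuffixCollector and the sample tree are simplified stand-ins invented for this illustration, and the real VirtualFileVisitor additionally exposes VisitorAttributes to control the traversal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified stand-ins for illustration only; these are not the real VFS types. */
final class VisitorSketch {

    /** Callback invoked once for every node reached during traversal. */
    interface NodeVisitor {
        void visit(Node node);
    }

    /** A tiny tree node playing the role of a virtual file. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            children.addAll(Arrays.asList(kids));
        }

        boolean isLeaf() {
            return children.isEmpty();
        }

        /** Depth-first traversal: this node first, then each subtree in order. */
        void visit(NodeVisitor visitor) {
            visitor.visit(this);
            for (Node child : children) {
                child.visit(visitor);
            }
        }
    }

    /** Collects the names of all leaves ending with the given suffix. */
    static final class SuffixCollector implements NodeVisitor {
        private final String suffix;
        final List<String> matches = new ArrayList<>();

        SuffixCollector(String suffix) {
            this.suffix = suffix;
        }

        @Override
        public void visit(Node node) {
            if (node.isLeaf() && node.name.endsWith(suffix)) {
                matches.add(node.name);
            }
        }
    }

    /** Hypothetical archive layout used only for the demo. */
    static Node sampleTree() {
        return new Node("app.jar",
                new Node("META-INF", new Node("MANIFEST.MF")),
                new Node("org", new Node("A.class"), new Node("B.class"), new Node("notes.txt")));
    }

    public static void main(String[] args) {
        SuffixCollector collector = new SuffixCollector(".class");
        sampleTree().visit(collector);
        System.out.println(collector.matches);
    }
}
```

The point of the pattern is that the traversal logic lives in one place (the node), while each visitor only decides what to do with the node it is handed.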


VFS Architecture

While the public API is quite intuitive, the real implementation details are a bit more complex. We'll try to explain the concepts in a quick pass.

Each time you create a VFS instance, its matching VFSContext instance is created. This creation is done via a VFSContextFactory. Different protocols map to different VFSContextFactory instances - e.g. file/vfsfile map to FileSystemContextFactory, while zip/vfszip map to ZipEntryContextFactory.

Also, each time a VirtualFile instance is created, its matching VirtualFileHandler is created. It's this VirtualFileHandler instance that knows how to handle different resource types properly - the VirtualFile API just delegates invocations to its VirtualFileHandler reference.

As one would expect, the VFSContext instance is the one that knows how to create VirtualFileHandler instances according to the resource type - e.g. ZipEntryContextFactory creates ZipEntryContext, which then creates ZipEntryHandler.
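
The chain described above (protocol to factory, factory to context, context to handler, with the public file object delegating to its handler) can be sketched in a few lines of plain Java. This is only a rough model under stated assumptions: Handler, Context, ContextFactory, SimpleFile and the register and open methods are hypothetical names chosen to echo the article, and the handlers here merely describe what they would handle instead of touching any real resource.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Illustrative stand-ins; the names echo the article but are not the real VFS classes. */
final class ContextSketch {

    /** Knows how to deal with one concrete kind of resource. */
    interface Handler {
        String describe();
    }

    /** Created once per root; produces handlers suited to its resource type. */
    interface Context {
        Handler createHandler(String path);
    }

    /** One factory per protocol family. */
    interface ContextFactory {
        Context create(URI root);
    }

    /** The public-facing object holds no resource logic; it delegates to its handler. */
    static final class SimpleFile {
        private final Handler handler;

        SimpleFile(Handler handler) {
            this.handler = handler;
        }

        String describe() {
            return handler.describe();
        }
    }

    private static final Map<String, ContextFactory> FACTORIES = new HashMap<>();

    /** Registers a factory under one or more protocol names. */
    static void register(ContextFactory factory, String... protocols) {
        for (String protocol : protocols) {
            FACTORIES.put(protocol, factory);
        }
    }

    /** Builds a factory whose handlers merely report what they would handle. */
    private static ContextFactory describing(String kind, String separator) {
        return root -> path -> () -> kind + ":" + root.getPath() + separator + path;
    }

    static {
        register(describing("file", "/"), "file", "vfsfile");
        register(describing("zip", "!/"), "zip", "vfszip");
    }

    /** Protocol -> factory -> context -> handler, wrapped in the public-facing file. */
    static SimpleFile open(URI root, String path) {
        ContextFactory factory = FACTORIES.get(root.getScheme());
        if (factory == null) {
            throw new IllegalArgumentException("No factory for protocol: " + root.getScheme());
        }
        return new SimpleFile(factory.create(root).createHandler(path));
    }

    public static void main(String[] args) {
        System.out.println(open(URI.create("vfszip:/tmp/app.jar"), "org/acme/A.class").describe());
        System.out.println(open(URI.create("file:/tmp/classes"), "org/acme/A.class").describe());
    }
}
```

In this model, supporting another protocol amounts to one more register call with a factory for it; callers of open never need to know which handler type they ended up with.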

Existing implementations

Apart from files, directories (FileHandler) and zip archives (ZipEntryHandler), we also support other, more exotic usages.

The first one is Assembled, which is similar to what Eclipse calls Linked Resources. The idea is to take existing resources from different trees and "mock" them into a single resource tree.

AssembledDirectory sar = AssembledContextFactory.getInstance().create("assembled.sar");

URL url = getResource("/vfs/test/jar1.jar");
VirtualFile jar1 = VFS.getRoot(url);
sar.addChild(jar1);

url = getResource("/tmp/app/ext.jar");
VirtualFile ext1 = VFS.getRoot(url);
sar.addChild(ext1);

AssembledDirectory metainf = sar.mkdir("META-INF");

url = getResource("/config/jboss-service.xml");
VirtualFile serviceVF = VFS.getRoot(url);
metainf.addChild(serviceVF);

AssembledDirectory app = sar.mkdir("app.jar");
url = getResource("/app/someapp/classes");
VirtualFile appVF = VFS.getRoot(url);
app.addPath(appVF, new SuffixFilter(".class"));


Another implementation is in-memory files. In our case this came out of a need to easily handle AOP generated bytes. Instead of mucking around with temporary files, we simply drop bytes into in-memory VirtualFileHandlers.

URL url = new URL("vfsmemory://aopdomain/org/acme/test/Test.class");
byte[] bytes = ...; // some AOP generated class bytes
MemoryFileFactory.putFile(url, bytes);

VirtualFile classFile = VFS.getVirtualFile(new URL("vfsmemory://aopdomain"), "org/acme/test/Test.class");
InputStream bis = classFile.openStream(); // e.g. load class from input stream
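
The underlying idea is simple enough to sketch without the library: keep the bytes in a map keyed by path and hand out streams over them. The class below is a hypothetical illustration of that idea, not the actual MemoryFileFactory implementation; MemoryStoreSketch and its methods are names made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory file store; a sketch of the idea, not the VFS implementation. */
final class MemoryStoreSketch {

    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    /** Stores a private copy so later changes to the caller's array have no effect. */
    void putFile(String path, byte[] contents) {
        files.put(path, contents.clone());
    }

    /** Returns a copy of the stored bytes, or null if nothing is stored under the path. */
    byte[] read(String path) {
        byte[] contents = files.get(path);
        return contents == null ? null : contents.clone();
    }

    /** Opens a stream over the stored bytes; no temporary file is involved. */
    InputStream openStream(String path) {
        byte[] contents = files.get(path);
        if (contents == null) {
            throw new IllegalArgumentException("No in-memory file at: " + path);
        }
        return new ByteArrayInputStream(contents);
    }

    /** Removes the entry; returns true if something was actually stored there. */
    boolean delete(String path) {
        return files.remove(path) != null;
    }

    public static void main(String[] args) {
        MemoryStoreSketch store = new MemoryStoreSketch();
        store.putFile("org/acme/test/Test.class", new byte[] {1, 2, 3});
        System.out.println(store.read("org/acme/test/Test.class").length + " bytes stored");
    }
}
```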


Extension hooks

It's quite easy to extend VFS with a new protocol, similar to what we've done with Assembled and Memory.
All you need is a combination of VFSContextFactory, VFSContext, VirtualFileHandler, FileHandlerPlugin and URLStreamHandler implementations. The first one is trivial, while the others depend on the complexity of your task - e.g. you could implement rar, tar, gzip or even remote access.


In the end you simply register this new VFSContextFactory with VFSContextFactoryLocator.

See this article's demo for a simple gzip example.

 

Features

One of the first major problems we stumbled upon was the proper handling of nested resources, more precisely nested jar files.

Take a normal ear deployment, for example: gema.ear/ui.war/WEB-INF/lib/struts.jar

In order to read the contents of struts.jar we have two options:

  • handle resources in memory
  • create top level temporary copies of nested jars, recursively


The first option is easier to implement, but it's very memory-consuming; just imagine huge apps held entirely in memory.
The other approach leaves a bunch of temporary files, which should be invisible to the plain user, who therefore expects them to disappear once the deployment is undeployed.
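
As a rough illustration of the first option, the sketch below reads an entry from an archive nested inside other archives using only java.util.zip and byte arrays. It is not how VFS itself is implemented; NestedArchiveSketch and its helper methods are hypothetical, and the sample archive is built in memory purely for the demo. Note how every nesting level is fully unpacked into a byte array, which is exactly where the memory cost comes from.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper showing in-memory handling of nested archives. */
final class NestedArchiveSketch {

    /** Builds a zip archive with a single entry, entirely in memory (demo helper). */
    static byte[] zipOf(String entryName, byte[] contents) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
                zip.putNextEntry(new ZipEntry(entryName));
                zip.write(contents);
                zip.closeEntry();
            }
            return buffer.toByteArray();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Returns the bytes of the named entry, or null if the archive has no such entry. */
    static byte[] readEntry(byte[] archive, String entryName) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                if (entry.getName().equals(entryName)) {
                    return readFully(zip); // reads up to the end of the current entry
                }
            }
            return null;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    private static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        for (int n = in.read(chunk); n != -1; n = in.read(chunk)) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    /** Follows a path of entry names, unpacking each nesting level into memory. */
    static byte[] readNested(byte[] outerArchive, String... entryPath) {
        byte[] current = outerArchive;
        for (String name : entryPath) {
            current = readEntry(current, name);
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        byte[] jar = zipOf("struts.properties", "x=1".getBytes(StandardCharsets.UTF_8));
        byte[] war = zipOf("WEB-INF/lib/struts.jar", jar);
        byte[] ear = zipOf("ui.war", war);
        byte[] found = readNested(ear, "ui.war", "WEB-INF/lib/struts.jar", "struts.properties");
        System.out.println(new String(found, StandardCharsets.UTF_8));
    }
}
```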

Now imagine the following scenario: a user gets hold of a VFS URL instance that points to some nested resource.

The way plain VFS would handle this is to re-create the whole path from scratch, meaning it would unpack nested resources over and over again. This would (and did) lead to a huge pile of temporary files.
How do we avoid this? Our approach uses VFSRegistry, VFSCache and TempInfo.

When you ask for a VirtualFile over VFS (getRoot, not createNewRoot), VFS asks the VFSRegistry implementation to provide the file. The existing DefaultVFSRegistry first checks whether a matching root VFSContext exists for the provided URI. If it does, the registry first tries to navigate to an existing TempInfo (a link to temporary files), falling back to regular navigation if no such temporary file exists. This way we completely re-use any already unpacked temporary files, saving time and disk space. If no matching VFSContext is found in the cache, we create a new VFSCache entry and continue with default navigation.

It's then up to the VFSCache implementation in use how it handles cached VFSContext entries. VFSCache is configurable via VFSCacheFactory - by default we don't cache anything, but there are a few useful existing VFSCache implementations, ranging from LRU to timed cache.
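
For readers unfamiliar with the LRU strategy mentioned above, here is a generic sketch of a least-recently-used cache built on the JDK's LinkedHashMap. It says nothing about how the actual VFSCache implementations are written; LruCacheSketch is a hypothetical class, and in the VFS setting the keys would presumably be root URIs and the values cached contexts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Generic LRU cache sketch; not the actual VFSCache implementation. */
final class LruCacheSketch<K, V> {

    private final LinkedHashMap<K, V> entries;

    LruCacheSketch(final int maxEntries) {
        // accessOrder = true keeps iteration order from least to most recently used
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after every insertion; evict once the limit is exceeded
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, marking the entry as recently used, or null if absent. */
    synchronized V get(K key) {
        return entries.get(key);
    }

    synchronized void put(K key, V value) {
        entries.put(key, value);
    }

    synchronized int entryCount() {
        return entries.size();
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("vfszip:/deploy/a.ear", "context A");
        cache.put("vfszip:/deploy/b.ear", "context B");
        cache.get("vfszip:/deploy/a.ear");              // touch A so that B becomes the eldest
        cache.put("vfszip:/deploy/c.ear", "context C"); // exceeds the limit and evicts B
        System.out.println(cache.get("vfszip:/deploy/b.ear"));
    }
}
```

A timed cache would differ only in its eviction rule: entries would be dropped based on age instead of on how recently they were used.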


API Use case

There is a class called VFSUtils which is part of the public API and is something of a dumping ground for useful functionality. It contains a bunch of helpful methods and configuration settings (system property keys, actually). Check the API javadocs for more details.

Existing issues / workarounds

Another issue that came up - as expected - was the inability of some frameworks to work properly on top of VFS. The problem lay in custom VFS URL protocols such as vfsfile, vfszip and vfsmemory.

In most cases you could still work around it with plain URL or URLConnection usage, but a lot of frameworks do a strict match on the file or jar protocol, which of course fails.

We were able to patch some frameworks (e.g. Facelets) and provide extensions to others (e.g. Spring).
If you are a library developer and your library has a simple pluggable resource loading mechanism, we suggest you simply extend it with a VFS based implementation. If there are no hooks, try to limit your assumptions to more general usage based on URL or URLConnection.
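
The difference between the two styles can be shown with plain JDK classes. In the sketch below, strictlySupported mimics the fragile protocol check, while read simply lets the URL's own stream handler do the work. ProtocolAgnosticSketch and its methods are hypothetical names, and the demo reads a JDK class file because its URL protocol (jar or jrt, depending on the Java version) is outside the caller's control, much like a vfszip URL would be.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URL;

/** Hypothetical illustration of strict protocol matching versus protocol-agnostic reading. */
final class ProtocolAgnosticSketch {

    /** The fragile approach: anything other than file or jar is rejected up front. */
    static boolean strictlySupported(URI location) {
        String scheme = location.getScheme();
        return "file".equals(scheme) || "jar".equals(scheme);
    }

    /** The tolerant approach: let the URL's own stream handler open the resource. */
    static byte[] read(URL location) {
        try (InputStream in = location.openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n = in.read(chunk); n != -1; n = in.read(chunk)) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        // A JDK class file; depending on the Java version its URL uses the jar or jrt protocol
        URL classFile = String.class.getResource("String.class");
        System.out.println(classFile.getProtocol() + " -> " + read(classFile).length + " bytes");
        // A VFS style location fails the strict check even though it may be perfectly readable
        System.out.println(strictlySupported(URI.create("vfszip:/deploy/app.ear/lib.jar")));
    }
}
```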


Conclusion


While VFS is very nice to use, it comes at a price. It adds an additional layer on top of the JDK's resource handling, meaning extra invocations are always present when you're dealing with resources.

We also keep some of the jar handling info in memory to make it easy to get hold of a specific resource, at the expense of some extra memory consumption.

Overall, VFS has proved to be a very useful library: it hides away many use cases that are painful with the plain JDK and provides a comprehensive API for working with resources, e.g. the visitor pattern implementation.
We constantly follow user feedback on the VFS issues they encounter, making each version a bit better.

Now that we've got to know VFS, it's time to move on to MC's new Classloading layer!

 

About the Author

Ales Justin was born in Ljubljana, Slovenia and graduated with a degree in mathematics from the University of Ljubljana. He fell in love with Java seven years ago and has spent most of his time developing information systems, ranging from customer service to energy management. He joined JBoss in 2006 to work full time on the Microcontainer project, currently serving as its lead. He also contributes to JBoss AS and is a Seam and Spring integration specialist. He represents JBoss on the 'JSR-291 Dynamic Component Support for Java SE' and 'OSGi' expert groups.

 
