Why Nexus and not Artifactory? Compliance, Standards, Security, and Quality


We (Sonatype) recently received some support requests from a company switching from Artifactory to Nexus. In the evaluation and system design phase, they were setting up Nexus to proxy their internal Artifactory instance and were having some trouble with the integration. Our support staff did some digging, and the results were unexpected.

Before I get into the details, I just want to say that I don't derive much satisfaction from pointing out problems in Artifactory, and I won't claim Nexus is perfect either, but we pay very close attention to key areas like stability, performance, and, most importantly, interoperability. Frankly, this isn't something I'd like to be spending my time on, but I've read so much hyperbole from JFrog about how configuring mirrorOf is "lazy and dirty", and so much trash talk about Sonatype just being "all talk", that I think it is time to start answering the criticism.

POM Rewriting and License Compliance

The customer was configuring their system to use the Procurement support in Nexus, and it was choking while validating the signatures of a lot of artifacts coming from their legacy Artifactory system. Upon investigation, we found that Artifactory completely rewrites the POM files, presumably as part of a new feature to strip out repository entries from the POMs. To see for yourself, compare the results of these two URLs:




Notice first that this POM has no repository element in it, so there is no need to modify the file at all. A closer look reveals that the POM being “proxied” by Artifactory is completely rewritten, with all comments removed and elements reordered. I personally don’t think it’s a good idea to muck around with files being proxied, but it’s probably fine assuming all the parsing is done correctly. It does introduce yet another place for things to go wrong, though. I mean, comments aren’t really that important, are they? Well, if you care about open source licensing, they are. Take a look at this POM from Central:


Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at


Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
<project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">

Now take a look at the first few lines of the same POM from JFrog's public repository:

<?xml version="1.0" encoding="UTF-8"?>

The license header of the file has been completely stripped away. I suspected that this might be a violation of the license itself, so I checked the Apache License at http://www.apache.org/licenses/LICENSE-2.0.

4.2  You must cause any modified files to carry prominent notices stating that You changed the files; and

4.3 You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works

I am not a lawyer, but I interpret this to mean that if you have this option turned on, and you are distributing these POMs to anyone else, you may be violating the license of the artifacts being proxied. This example involves the Apache License, a rather liberal license as they go, but as an active participant in the ASF, I can tell you that the organization takes licensing issues very seriously. The fact is that headers from any POM would be dumped, and most licenses out there probably frown upon this. Some of you are going to shrug this off as a minor problem, and maybe it is, but this is the sort of minor issue that will make a legal compliance department go berserk. Stripping licenses off of POMs wasn't really the main issue, though; it was just something I stumbled on while trying to find a solution to the problem with PGP signatures.
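A quick way to check whether a given copy of a POM still carries its header is to parse it and look at the comments that precede the root element. The following is a minimal sketch in Python, not something Nexus or Artifactory does; the function names, the marker string, and the two shortened POMs are assumptions made up for illustration.

```python
from xml.dom import Node, minidom


def leading_comments(pom_text: str) -> list[str]:
    """Return the text of every XML comment that appears before the root element."""
    doc = minidom.parseString(pom_text.encode("utf-8"))
    comments = []
    for node in doc.childNodes:
        if node.nodeType == Node.COMMENT_NODE:
            comments.append(node.data.strip())
        elif node.nodeType == Node.ELEMENT_NODE:
            break  # reached the root element; nothing after it is a header
    return comments


def has_license_header(
    pom_text: str,
    marker: str = "Licensed to the Apache Software Foundation",
) -> bool:
    """True if any leading comment contains the (assumed) license marker."""
    return any(marker in comment for comment in leading_comments(pom_text))


# Hypothetical, shortened POMs standing in for the upstream and proxied copies.
UPSTREAM_POM = """<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.
-->
<project><modelVersion>4.0.0</modelVersion></project>
"""

PROXIED_POM = """<?xml version="1.0" encoding="UTF-8"?>
<project><modelVersion>4.0.0</modelVersion></project>
"""

if __name__ == "__main__":
    print(has_license_header(UPSTREAM_POM))  # True
    print(has_license_header(PROXIED_POM))   # False
```

Running the same check against the upstream and the proxied copy of a real POM would show whether the proxy dropped the header along the way.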

POM Rewriting and PGP Signatures

Setting aside the license issue for a moment, let’s go back to the procurement issue that was reported. Try getting the signature file for this artifact so you can validate that it hasn’t been tampered with. The asc file should contain a GPG signature that was created with a publicly accessible key. Click on the following URL on Central to see an example.


OK so far? (That's my signature, FWIW.) Here’s the crux of the issue. Click on the same artifact in the Artifactory proxy of Central below:


At the time of writing, I get:

HTTP Status 500 -

type Exception report


description The server encountered an internal error () that prevented it from fulfilling this request.


java.lang.IllegalArgumentException: Checksum type not found for path org/apache/maven/apache-maven/2.0.9/apache-maven-2.0.9.pom.asc
    org.artifactory.engine.DownloadServiceImpl.respondForChecksumRequest(DownloadServiceImpl.java:214)
    org.artifactory.engine.DownloadServiceImpl.respond(DownloadServiceImpl.java:176)
    org.artifactory.engine.DownloadServiceImpl.process(DownloadServiceImpl.java:122)
    sun.reflect.GeneratedMethodAccessor93.invoke(Unknown Source)

My valid signature that exists on Central can no longer be retrieved through the proxy. I have no doubt they will fix the crash. However, the underlying problem still stands: how can you have a web of trust that links back to the original developer when proxies in the middle are rewriting the artifact and stripping (or regenerating) the PGP signature? Even if you trust your own instance, how can you validate that the signature was correct for the inbound artifact before it was rewritten? What if you’re proxying from someone else who happens to be using Artifactory? Did they Trojan you, or just unwittingly break the web of trust?
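The reason a rewrite is fatal to a detached signature is that the signature covers a digest of the exact bytes the developer published. The sketch below illustrates only that byte-level point in Python with SHA-1 from the standard library; it does not perform real PGP verification, and the `strip_comments` helper and the sample POM are hypothetical stand-ins for what a rewriting proxy might do.

```python
import hashlib
import re


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 digest of the given bytes."""
    return hashlib.sha1(data).hexdigest()


def strip_comments(xml_bytes: bytes) -> bytes:
    """Crude stand-in for a proxy that rewrites a POM by dropping XML comments."""
    return re.sub(rb"<!--.*?-->\s*", b"", xml_bytes, flags=re.DOTALL)


# Hypothetical, shortened POM as the original developer published it.
ORIGINAL = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- Licensed to the Apache Software Foundation (ASF) -->
<project><modelVersion>4.0.0</modelVersion></project>
"""

# The same POM after the proxy has "cleaned" it.
REWRITTEN = strip_comments(ORIGINAL)

if __name__ == "__main__":
    print("original :", sha1_hex(ORIGINAL))
    print("rewritten:", sha1_hex(REWRITTEN))
    # The project element is untouched, yet the digests no longer agree.
    print("digests match:", sha1_hex(ORIGINAL) == sha1_hex(REWRITTEN))
```

With real files the equivalent check would be something like `gpg --verify apache-maven-2.0.9.pom.asc apache-maven-2.0.9.pom`, which can only succeed if the POM is byte-for-byte what was signed.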

If you download things from the internet, validating PGP signatures isn't something you should think about doing; it is something you need to do. It is the only way to guarantee that the artifacts from a remote repository are sound, and Sonatype has invested a great deal of time into making sure that artifacts added to the Central Maven repositories, the Apache repositories, and the Codehaus repositories are all accompanied by valid PGP signatures made with keys that are on a public keyserver. In addition to that, the ASF takes the idea of building a web of trust very seriously. You shouldn't sign an ASF release unless you've had your key signed by someone in the ASF's web of trust at a key signing event (PGP keys are best signed only if you can verify someone's identity face-to-face). It seems a shame to throw away all of that work just to "clean" the POM of repository elements.

Again, JFrog has written publicly that the only reason this POM rewriting is necessary is that they think Maven is broken by design. But their fix throws away the web of trust that makes it possible to validate the contents of a repository using the original PGP keys from project developers. We've considered similar changes in the past, but because we are responsible for maintaining some of these source repositories, we are forced to think about the ramifications of our changes for the community. Building a repository manager that just "throws out" PGP signatures for POMs seems irresponsible to me when we're starting to gain traction on the difficult job of making sure that new artifacts added to Central have PGP signatures.

Artifactory Produces Non-standard Indexes

We also had some reports of odd indexing behavior. The original index format was a Lucene 2.3 binary file zipped up in a convenient archive. This created a problem: if you upgrade to a newer version of Lucene, you can no longer produce the older format, because newer versions of Lucene cannot generate backwards-compatible binary index files. Because the community needed to maintain backwards compatibility for all older clients, the standard index produced by the major public repositories is now a new binary layout completely separate from Lucene. All of this work was done in the Nexus Indexer project, a separate open-source project that is available under the Eclipse Public License (EPL) and is already integrated into all repository managers. This new .gz format, in addition to being a neutral format, also supports incremental indexes. The indexes produced by Artifactory use the old-style Lucene zip, but with a newer version of Lucene. This means they are non-standard and cannot be consumed by all IDE plugins or other index clients.

Another problem we found was that the indexes presented by the "virtual" repos (equivalent to Nexus group indexes) serve up only the index of the last repository in the list. This means that in an enterprise you cannot get an index that contains all artifacts available to you, both internal and external. While you can certainly use the Artifactory search interface, the promise of a repository index is that tools like m2eclipse and other Maven plugins can use it to quickly locate artifacts that contain particular classes or to quickly generate a list of versions for a particular artifactId.

Because it is important for all repository managers to produce interoperable repository indexes, we've decided to donate the Nexus Indexer code to the Apache Software Foundation. The Nexus index is the standard format for a Maven repository and is integrated into Archiva, Nexus, and Artifactory, so it just makes sense that the code that creates this index be moved to an open, transparent community like Apache. This will increase the visibility of the Nexus Indexer code for people who actively participate in the Maven community.

Artifactory Breaks Wagon

Maven and the Maven Ant Tasks use something called Wagon to transfer files to and from a repository. It is the "transport abstraction that is used in Maven's artifact and repository handling code", and it has providers for SCP, HTTP, FTP, and file. Any time Maven sees a URL, the Maven Wagon component handles the transfer. I won't go into the gory details of this component, but one of the things that a repository manager needs to do is provide some sort of file list for a directory. All of the other protocols with Wagon providers have some way to get a directory listing. The basic subset of HTTP that is supported by all web servers does not have such a command, so the HTTP wagon relies on the repository returning a list of links to the folder's contents.
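To make the idea of "a list of links" concrete, here is a rough sketch in Python of how a client could turn an HTML directory page into a file list. It is not Wagon's actual implementation; the `LinkCollector` class, the `list_files` function, the filtering rules, and the sample page are all assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def list_files(index_html: str) -> list[str]:
    """Return the relative entries linked from a directory listing page."""
    parser = LinkCollector()
    parser.feed(index_html)
    parser.close()
    # Skip parent-directory links, absolute paths, sort links, and full URLs.
    return [
        href
        for href in parser.links
        if not href.startswith(("..", "/", "?")) and "://" not in href
    ]


# Hypothetical listing page of the kind a plain web server might return.
SAMPLE_LISTING = """
<html><body>
<a href="../">../</a>
<a href="apache-maven-2.0.10.pom">apache-maven-2.0.10.pom</a>
<a href="apache-maven-2.0.10.pom.asc">apache-maven-2.0.10.pom.asc</a>
<a href="apache-maven-2.0.10.pom.sha1">apache-maven-2.0.10.pom.sha1</a>
</body></html>
"""

if __name__ == "__main__":
    for name in list_files(SAMPLE_LISTING):
        print(name)
```

A page that redirects to a UI instead of listing links gives a client like this nothing to work with, which is why the shape of the response matters.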

Instead of returning such a list of folder contents, Artifactory tries to redirect the client to the UI. It doesn't return a file list, so anything in Maven that relies on Wagon's ability to get a file list will fail. In other words, anything in Maven or any Maven plugin that uses the wagon.getFileList() method will break when you are using Artifactory. You can see it here, first against Central and then against the JFrog repository:

[INFO] Scanning remote file system: http://repo1.maven.org/maven2/org/apache/maven/apache-maven/2.0.10/ ...
[INFO] apache-maven-2.0.10-bin.tar.bz2
[INFO] apache-maven-2.0.10-bin.tar.bz2.asc
[INFO] apache-maven-2.0.10-bin.tar.bz2.asc.md5
[INFO] apache-maven-2.0.10-bin.tar.bz2.asc.sha1
[INFO] apache-maven-2.0.10-bin.tar.bz2.md5
[INFO] apache-maven-2.0.10-bin.tar.bz2.sha1
[INFO] apache-maven-2.0.10-bin.tar.gz
[INFO] apache-maven-2.0.10-bin.tar.gz.asc
[INFO] apache-maven-2.0.10-bin.tar.gz.asc.md5
[INFO] apache-maven-2.0.10-bin.tar.gz.asc.sha1
[INFO] apache-maven-2.0.10-bin.tar.gz.md5
[INFO] apache-maven-2.0.10-bin.tar.gz.sha1
[INFO] apache-maven-2.0.10-bin.zip
[INFO] apache-maven-2.0.10-bin.zip.asc
[INFO] apache-maven-2.0.10-bin.zip.asc.md5
[INFO] apache-maven-2.0.10-bin.zip.asc.sha1
[INFO] apache-maven-2.0.10-bin.zip.md5
[INFO] apache-maven-2.0.10-bin.zip.sha1
[INFO] apache-maven-2.0.10-sources.jar
[INFO] apache-maven-2.0.10-sources.jar.asc
[INFO] apache-maven-2.0.10-sources.jar.asc.md5
[INFO] apache-maven-2.0.10-sources.jar.asc.sha1
[INFO] apache-maven-2.0.10-sources.jar.md5
[INFO] apache-maven-2.0.10-sources.jar.sha1
[INFO] apache-maven-2.0.10.jar
[INFO] apache-maven-2.0.10.jar.asc
[INFO] apache-maven-2.0.10.jar.asc.md5
[INFO] apache-maven-2.0.10.jar.asc.sha1
[INFO] apache-maven-2.0.10.jar.md5
[INFO] apache-maven-2.0.10.jar.sha1
[INFO] apache-maven-2.0.10.pom
[INFO] apache-maven-2.0.10.pom.asc
[INFO] apache-maven-2.0.10.pom.asc.md5
[INFO] apache-maven-2.0.10.pom.asc.sha1
[INFO] apache-maven-2.0.10.pom.md5
[INFO] apache-maven-2.0.10.pom.sha1
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 32 seconds
[INFO] Finished at: Mon Jan 04 17:17:11 EST 2010
[INFO] Final Memory: 8M/47M
[INFO] ------------------------------------------------------------------------

C:\svn\staging-test>mvn validate
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Building staging-test
[INFO] task-segment: [validate]
[INFO] ------------------------------------------------------------------------
[INFO] [wagon:list {execution: upload-javadoc}]
[INFO] Scanning remote file system: http://repo.jfrog.org/artifactory/libs-releases/org/apache/maven/apache-maven/2.0.10/ ...
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Error handling resource

Embedded error: Error transferring file
Server redirected too many times (20)
[INFO] ------------------------------------------------------------------------
[INFO] For more information, run Maven with the -e switch


Last time I wrote something about Artifactory, the founders of the company came back and called me biased and not objective. Judge for yourself: I've presented some concrete facts in this post.

I have to tell you that the thing that really struck a chord with me and the other engineers at Sonatype was the idea that someone could write a blog post saying that Sonatype is "all talk". It just doesn't make any sense: as a corporation, we've poured resources into the foundational technologies that our competitors use. I spend a great deal of my time working on the Maven project and stopping Denial of Service attacks on Central. I'm on the PMC, and a lot of that time is spent trying to make Maven a better product. A lot of this work involves talking to our competitors about ways to improve Maven and related technologies. To hear someone come at us because we're "all talk" is, frankly, insulting given the hours (no, years) we've put into this open source community.


