
Maven and the one true repository

July 14, 2009

One of the things that first attracted me to Maven was the idea that almost everything you want in terms of dependencies would be available from a few repositories. Or at least that was the general idea.

Fast forward to today, and we seem to have a proliferation of repositories. In fact, we have so many repositories that we even have tools to help us manage them all. I’m using one of these tools (Nexus, a very nice tool BTW) and I have noticed my Nexus configuration is getting quite complex! I’m wondering if we have traded jar dependency hell for repository management hell?

To my mind, tools such as Nexus are treating the symptoms, and not the disease.

One of the symptoms: Maven pom files that contain repository configuration. This seems like a big fat anti-pattern (violates separation of concerns, and all that lot). Surely the pom should tell us what artifacts to use, not where to go find them.

What if we had fewer repositories? In fact, what if we had only one big giant global repository in the sky? Let’s call it the BARGE (Big Ass Repository for Global Enlightenment). Would that not vastly simplify the whole issue of dependency management (and not just for Maven, but for any tool that wants to leverage a similar mechanism)?

Now this sounds like crazy talk; and perhaps it is. But it seems to be at least possible – in a nice hand waving kind of way.

Here’s how it ought to work:

  • There is one global name space for publishing artifacts
  • Artifacts have controlled scope of visibility (i.e. who can see them). For example, a corporation may choose to restrict its artifacts to a private group
  • Artifacts have a TTL. Example: snapshot releases live for 30 days, and then are purged from the repository.
  • Artifacts have a replication scope. They might be global (of interest to everyone, for example the Apache Tomcat jar files), group wide (e.g. just the Acme engineering organization), or they might be local to a developer’s machine. While the mechanism for publishing would be the same in all cases, the degree to which artifacts get replicated to the global name space would depend on the replication scope.
  • Anyone can publish artifacts to the global repository. Anyone. Whether or not those artifacts are replicated is another question.
  • Trust and identity are key to making this all work. The degree to which artifacts are replicated would depend on the level of trust that people place on the publisher.
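
To make the list above a little more concrete, here is a rough sketch in Java of the metadata a BARGE node might keep for each artifact. Everything in it is hypothetical: the type names (ReplicationScope, PublishedArtifact, ReplicationPolicy), the fields, and the rule that a publisher's trust level simply caps the requested replication scope are my assumptions for illustration, not part of Maven or Nexus.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Set;

/** How far an artifact may travel from where it was published (hypothetical). */
enum ReplicationScope { LOCAL, GROUP, GLOBAL }

/** Hypothetical per-artifact metadata for a BARGE node. */
final class PublishedArtifact {
    final String coordinates;              // e.g. "com.acme:widget:1.0-SNAPSHOT"
    final String publisher;                // identity of whoever published it
    final ReplicationScope requestedScope; // how far the publisher would like it to go
    final Set<String> visibleTo;           // groups allowed to see it; empty means everyone
    final Instant publishedAt;
    final Duration ttl;                    // null means "keep forever"

    PublishedArtifact(String coordinates, String publisher,
                      ReplicationScope requestedScope, Set<String> visibleTo,
                      Instant publishedAt, Duration ttl) {
        this.coordinates = coordinates;
        this.publisher = publisher;
        this.requestedScope = requestedScope;
        this.visibleTo = visibleTo;
        this.publishedAt = publishedAt;
        this.ttl = ttl;
    }

    /** True once the TTL has elapsed; artifacts without a TTL never expire. */
    boolean isExpired(Instant now) {
        return ttl != null && !now.isBefore(publishedAt.plus(ttl));
    }

    /** Controlled visibility: an empty set means the artifact is public. */
    boolean isVisibleTo(String group) {
        return visibleTo.isEmpty() || visibleTo.contains(group);
    }
}

final class ReplicationPolicy {
    /**
     * Assumed rule: the artifact replicates as far as the publisher asked,
     * but never further than the trust placed in that publisher allows.
     */
    static ReplicationScope effectiveScope(PublishedArtifact artifact,
                                           ReplicationScope maxTrustedScope) {
        return artifact.requestedScope.compareTo(maxTrustedScope) <= 0
                ? artifact.requestedScope
                : maxTrustedScope;
    }
}
```

With this model, a snapshot published with a 30 day TTL reports itself as expired from day 30 onwards, and an artifact that asks for GLOBAL replication from a publisher trusted only at GROUP level stays at GROUP.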

Conceptually this should work in a fashion similar to DNS. You point to an upstream provider that sends you updates and accepts your published artifacts (assuming your provider trusts you enough to accept updates to your chunk of the name space).
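
The DNS analogy can be sketched as a chain of nodes, each with a local cache and at most one upstream. This is a toy in-memory model under my own assumptions: RepositoryNode and its methods are made-up names, trust is reduced to a set of name space prefixes each node is willing to accept, and nothing here talks to a real repository.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical BARGE node: a local cache plus an optional upstream provider. */
final class RepositoryNode {
    private final String name;
    private final RepositoryNode upstream;                     // null at the top of the chain
    private final Map<String, byte[]> store = new HashMap<>();
    private final Set<String> trustedPrefixes = new HashSet<>();

    RepositoryNode(String name, RepositoryNode upstream) {
        this.name = name;
        this.upstream = upstream;
    }

    /** Declare that this node accepts publications under the given name space prefix. */
    void trustNamespace(String prefix) {
        trustedPrefixes.add(prefix);
    }

    /** Look locally first, then ask upstream and cache whatever comes back. */
    Optional<byte[]> resolve(String coordinates) {
        byte[] local = store.get(coordinates);
        if (local != null) {
            return Optional.of(local);
        }
        if (upstream == null) {
            return Optional.empty();
        }
        Optional<byte[]> fetched = upstream.resolve(coordinates);
        fetched.ifPresent(bytes -> store.put(coordinates, bytes));
        return fetched;
    }

    /**
     * Store locally, then offer the artifact up the chain. Replication stops at
     * the first node that does not trust the name space. Returns the names of
     * the nodes that now hold the artifact.
     */
    List<String> publish(String coordinates, byte[] content) {
        List<String> acceptedBy = new ArrayList<>();
        store.put(coordinates, content);
        acceptedBy.add(name);
        RepositoryNode next = upstream;
        while (next != null && next.accepts(coordinates)) {
            next.store.put(coordinates, content);
            acceptedBy.add(next.name);
            next = next.upstream;
        }
        return acceptedBy;
    }

    boolean holds(String coordinates) {
        return store.containsKey(coordinates);
    }

    private boolean accepts(String coordinates) {
        return trustedPrefixes.stream().anyMatch(coordinates::startsWith);
    }
}
```

Chain a laptop node to an Acme node to a global node, have only the Acme node trust the "com.acme:" prefix, and a publish from the laptop lands on the laptop and the Acme node but goes no further. In the other direction, anything held by the global node reaches the laptop on first request and is cached at each hop along the way.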

Above all, this system should be dead simple to use, especially for the 99% of developers who just want to consume artifacts. For most developers, this should be a one-time configuration pointing their build management tool at the closest repository cache.

The implementation of the BARGE is trivial and is left as an exercise for the reader 🙂

Only the true repository would deny its divinity!

