Going offline with Maven
Jan 14, 2013

At Lateral-Thoughts, at least once a year we organize what we call a "Timeoff", where we get together in a nice place and hack on whatever we want. It can be a learning period or a startup-weekend-like event where we hack on a product or an idea. Last time it was in a nice house in Guérande where we had everything we needed: internet access, rooms, tables, lots of space, an indoor swimming pool and a barbecue!
But when you look for a nice place in France, it's not always easy to also get decent internet access. So, as we're starting to plan the next event, we asked ourselves what we could do if there was no internet access. Is there a way to plan for what we would need, so that we wouldn't suffer from having no contact with the outside world? :) In a Java/Python environment, where you rely heavily on Maven and PyPI and you don't know in advance what you'll be working on, the one thing you can't (and shouldn't) plan is the libraries/dependencies you'll need.
So what do we do? The simplest way is to download every dependency you can from a Maven repository, but that seems like the most inefficient way ever, and with more than 30 GB of data each, it can take a while...
In the last article I extracted all the libraries' metadata and dependency links, so we know what depends on what. To build a copy of the repository more efficiently, I decided to filter that metadata according to two simple rules:
- Only keep the latest version of each artifact;
- Also keep the artifacts/versions that the latest versions of other artifacts depend on.
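To make the two rules concrete, here is a minimal Python sketch of the selection. It is not the script I actually ran on the extracted data: the names `versions`, `depends_on`, `version_key` and `minimal_repository` are made up for the illustration, the version comparison is deliberately naive (real Maven version ordering is richer), and I assume dependencies are followed transitively.

```python
from collections import deque


def version_key(version):
    # Naive ordering (assumption): compare dot-separated numeric parts,
    # treating any non-numeric part as -1. Maven's real rules are richer.
    return tuple(int(p) if p.isdigit() else -1 for p in version.split("."))


def minimal_repository(versions, depends_on):
    """Return the set of (artifact, version) pairs to keep.

    versions:   dict mapping artifact name -> list of version strings
    depends_on: dict mapping (artifact, version) -> list of (artifact, version)
    """
    # Rule 1: keep only the latest version of each artifact.
    keep = {(name, max(vs, key=version_key)) for name, vs in versions.items()}

    # Rule 2: also keep whatever those kept versions depend on
    # (followed transitively here, which is an assumption).
    queue = deque(keep)
    while queue:
        node = queue.popleft()
        for dep in depends_on.get(node, ()):
            if dep not in keep:
                keep.add(dep)
                queue.append(dep)
    return keep


# Tiny hypothetical repository, just to exercise the function.
versions = {
    "app": ["1.0", "2.0"],
    "lib": ["1.0", "1.1", "1.2"],
    "util": ["0.9"],
}
depends_on = {
    ("app", "1.0"): [("lib", "1.0")],
    ("app", "2.0"): [("lib", "1.1")],
    ("lib", "1.1"): [("util", "0.9")],
}
kept = minimal_repository(versions, depends_on)
```

In this toy example, `lib` 1.1 is kept even though it is not the latest `lib`, because the latest `app` needs it, while `lib` 1.0 and `app` 1.0 are dropped.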
With those simple rules, we can create a "minimum" repository containing only what we would need to start a new project :). The data I extracted is not perfect, so don't take my word for it. This is a first draft of work that I (or someone else) may continue. The result is a simpler graph, containing only 25 553 nodes and 52 916 edges (compared to the 186 384 nodes and 1 229 083 edges of the full repository), that we can almost comprehend: [caption id="attachment_1003" align="aligncenter" width="640"]