1 May 2017

hg share: Sharing a Mercurial repository between different clones / checkouts

Our starting point

At our company, we developed a product based on Django. To manage code changes we use Mercurial, and to manage dependencies we use buildout + setuptools. buildout recipes are wonderful if you need to do more than just pull code and resolve and build library dependencies. These things could be:
  • Building any binary from source. We use it to build Nginx, which is part of the product
  • Generating config files. We use it to generate configuration files for nginx, supervisor, etc.
  • Generating SSL certificates
  • etc.
Our deployments use a shared product base with its own Mercurial repo, plus customer-specific project customizations which are held in separate Mercurial repos. Managing changes with Mercurial (or any other SCM system) allows us to:
  • Deploy hot fixes quickly
  • Share code changes easily by merging or "grafting" changesets between branches.
  • Get exhaustive change history information.

The problem

When working on several projects at the same time, it's not easy to share the same "buildout" project because each one has its own settings, customizations, and so on. That led me to keep a separate copy for each customer. Each buildout is about 1GB.

As the number of customers rises, the space required to hold all the buildouts is getting quite big.

The solution

Using a shared Mercurial repository

A Mercurial repository can be divided into:
  • a history-tracking store, where all changesets reside
  • the state, which is basically a pointer to an entry in the history
  • a local (working) copy, which holds any changes that are not yet committed
The store can be shared between several clones / checkouts / repositories. This is exactly what the Mercurial share extension (hg share) does.

The syntax is similar to the "hg clone" command: hg share <local source repo> [<dest name>]
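
As a rough illustration, the command could be driven from Python as sketched below. The helper names build_share_command and share_repository are hypothetical, and the sketch assumes that hg is on the PATH. Because share is an extension bundled with Mercurial that has to be enabled, the sketch enables it on the command line with --config extensions.share= instead of relying on the user's hgrc.

```python
import subprocess


def build_share_command(source, dest=None):
    """Return the argument list for `hg share <source> [<dest>]`."""
    # "--config extensions.share=" enables the bundled share extension
    # for this single invocation only.
    command = ["hg", "--config", "extensions.share=", "share", source]
    if dest is not None:
        command.append(dest)
    return command


def share_repository(source, dest=None):
    """Run `hg share`; raises CalledProcessError if hg exits with an error."""
    subprocess.run(build_share_command(source, dest), check=True)
```

For example, share_repository("product-base", "customer-a/src/product") would create a new checkout that reuses the store of product-base; both paths are made up for the example.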

One advantage is that a changeset committed or pulled in one clone is immediately visible in all the others, because they all read from the same store. This saves a lot of pulls. Care should be taken, though: strips and rollbacks apply to all of them as well, which could leave a clone pointing to a changeset that no longer exists.

Using shared eggs and download-cache directories

These directories hold nearly the same content in every buildout, so it's easy to share them. The solution I used is simply to replace them with symbolic links to globally shared directories. Another solution would be to set the eggs and download-cache directories explicitly in the buildout parameters (e.g. using a "develop.cfg" invoked from "buildout.cfg" which inherits from a "base.cfg").
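
The symbolic-link approach could look roughly like the sketch below. It is only an illustration, not the exact procedure used here: the function name link_to_shared is hypothetical, and it assumes a platform where os.symlink is available to the current user (e.g. Linux). Entries already present in the local directory are moved into the shared one first, so nothing that was downloaded or built is lost.

```python
import os
import shutil


def link_to_shared(local_dir, shared_dir):
    """Replace local_dir with a symbolic link pointing at shared_dir."""
    os.makedirs(shared_dir, exist_ok=True)
    if os.path.islink(local_dir):
        # Already a link (possibly to another place): just drop the link.
        os.unlink(local_dir)
    elif os.path.isdir(local_dir):
        # Move entries the shared directory does not have yet, then
        # remove what is left (duplicates of already shared entries).
        for name in os.listdir(local_dir):
            target = os.path.join(shared_dir, name)
            if not os.path.exists(target):
                shutil.move(os.path.join(local_dir, name), target)
        shutil.rmtree(local_dir)
    os.symlink(os.path.abspath(shared_dir), local_dir)
```

It would be called once per directory and per buildout, for instance link_to_shared("customer-a/eggs", "/srv/shared/eggs"), where both paths are placeholders.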

A + B

I worked out a little script which automatically replaces each Mercurial repository with a shared one and unifies the eggs and download-cache directories.
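
The script itself is not reproduced here. As a hedged sketch of what its first step might look like, the function below (a hypothetical name, find_hg_repositories) lists every Mercurial checkout under a buildout directory by looking for .hg folders. Replacing each one with a shared checkout is deliberately left out, since that step has to deal with uncommitted changes and the currently checked-out revision.

```python
import os


def find_hg_repositories(buildout_root):
    """Yield every directory under buildout_root that contains a .hg folder."""
    for current, dirnames, _filenames in os.walk(buildout_root):
        if ".hg" in dirnames:
            yield current
            # Skip the repository metadata itself, but keep walking the
            # working copy so nested checkouts are found as well.
            dirnames.remove(".hg")
```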

Applying both changes to each of my buildouts reduces their size by more than 65%, including the shared part of eggs and download-cache. This is quite a good saving.
