GitHub Spring Cleaning - the Deprecation Hack
By Kevin van Zonneveld (@kvz)
Almost spring here! Birds are chirping and we start cleaning out our kitchens and backyards and closets and GitHub accounts. Let's trash some legacy!
Why? Because
- We're ashamed of old code
- We want to save money by having a lower (private) repo count
- We want to improve the signal-to-noise on our profiles before a job interview
- Spring
But wait, what if your co-worker wants to access some of those commits again? You probably don't feel like peeling archives from crashed backup drives in the basement of your previous building.
Renan and I faced this at true.nl and we started looking for simple solutions.
After going over several, we (ok, Renan) came up with the idea of pushing every old repository's master branch to a branch named after that repository, inside a single repository called deprecated.
Here's what it might look like for three sample repos:
github.com/kvz/eggshell/tree/master -> github.com/kvz/deprecated/tree/eggshell
github.com/kvz/submin/tree/master -> github.com/kvz/deprecated/tree/submin
github.com/kvz/Elastica/tree/master -> github.com/kvz/deprecated/tree/Elastica
Hack? Yes. But the advantages of this method are clear. You get to:
- Preserve paths, commits, users
- Use GitHub's web interface to quickly traverse the archives and link to them
- Make it very clear to people that the code they're looking at is indeed deprecated
- Make the deprecated repo private if need be and enjoy GitHub's access control
- Check out a deprecated repo branch, force-push it to a fresh repo's master and be back in business
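The last point, bringing an archived repository back to life, could look roughly like the sketch below. The function name restore_from_deprecated and the example URLs are hypothetical, and the fresh repository is assumed to exist already and to be safe to overwrite.

```shell
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# restore_from_deprecated DEPRECATED_URL BRANCH FRESH_URL
# Hypothetical helper: copies one archived branch from the deprecated
# repo to the master branch of a fresh repository.
restore_from_deprecated() {
  local deprecated_url="$1" branch="$2" fresh_url="$3"
  local workdir
  workdir="$(mktemp -d)"

  # Fetch only the archived branch, not the whole archive
  git clone --quiet --single-branch --branch "${branch}" \
    "${deprecated_url}" "${workdir}/${branch}"

  # Force-push the archived branch to the fresh repo as master
  git -C "${workdir}/${branch}" push --force "${fresh_url}" \
    "${branch}:refs/heads/master"

  rm -rf "${workdir}"
}

# Example (assumed URLs):
# restore_from_deprecated git@github.com:kvz/deprecated.git eggshell git@github.com:kvz/eggshell.git
```

Because the push is forced, anything already on the fresh repo's master is overwritten, so this is only meant for an empty or throwaway target.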
It's limited in that we only preserve the master branch, but we figured that would suffice for repos whose code history would otherwise just have been made inaccessible in far worse ways.
Starting is simple. You create a container repo on GitHub named deprecated, add it as a remote to each existing repo, force-push the current master to a branch named after that repo, and you're done.
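Done by hand for a single repository, those steps might look like the sketch below. The function name archive_master, the remote name deprecated and the example path are assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# archive_master REPO_DIR DEPRECATED_URL BRANCH
# Hypothetical helper: pushes REPO_DIR's local master to DEPRECATED_URL
# as a branch named BRANCH.
archive_master() {
  local repo_dir="$1" deprecated_url="$2" branch="$3"

  # Add the container repo as an extra remote; if a remote with that
  # name already exists, just point it at the given URL
  git -C "${repo_dir}" remote add deprecated "${deprecated_url}" 2>/dev/null \
    || git -C "${repo_dir}" remote set-url deprecated "${deprecated_url}"

  # Force-push local master to the named branch in the container repo
  git -C "${repo_dir}" push --force deprecated "master:refs/heads/${branch}"
}

# Example (assumed local path):
# archive_master ~/code/eggshell git@github.com:kvz/deprecated.git eggshell
```

The refspec master:refs/heads/eggshell pushes without first creating a local branch, so the working copy is left exactly as it was.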
However, spring cleaning is no fun without automation, so we wrote a script to do this for you. Just change the repo_user, repo_sources and repo_destiny variables.
If you don't understand what this does, please don't run it
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# Repositories to deprecate and their destination
repo_user="kvz"
repo_sources=(eggshell submin Elastica)
repo_destiny="deprecated"

todo=""

# Work in a fresh temporary directory
myTemp="$(mktemp -d)" && cd "${myTemp}"

# Run the commands below for every repository you want to archive
for repo_source in "${repo_sources[@]}"; do
  git clone "git@github.com:${repo_user}/${repo_source}.git" || true
  pushd "${repo_source}"
  # Make sure we are on a clean, up-to-date master
  git clean -fd
  git reset --hard
  git checkout master
  git pull -f origin master
  # Create (or reset) a branch named after the repository,
  # then force-push it to the container repo
  git checkout -B "${repo_source}"
  git push -f "git@github.com:${repo_user}/${repo_destiny}.git" "${repo_source}"
  popd
  todo="${todo}--> Feel free to delete the repository at https://github.com/${repo_user}/${repo_source}/settings\n"
done

todo="${todo}--> Saved ${repo_sources[*]} as branches in https://github.com/${repo_user}/${repo_destiny}/branches\n"
echo -e "${todo}"
Let us know what you think!
Legacy Comments (3)
These comments were imported from the previous blog system (Disqus).
Great idea and execution!
I would make REPO_SRCS an array and use lower-case names to avoid confusing it with an environment variable:
repo_sources=(kvz/eggshell kvz/submin kvz/Elastica)
…
for repo_source in "${repo_sources[@]}"; do
Also, if you're automating anyway, why not transfer every branch anyway?
kvz/eggshell's "master" branch becomes kvz/deprecated's "eggshell-master" branch.
kvz/eggshell's "development" branch becomes kvz/deprecated's "eggshell-development" branch.
Et cetera.
Hi Jan,
True. We chose not to do it as master really is the only branch we're interested in, and it would get messy as some repos would really need to be cleaned up before attempting something like that : ) It's trivial to add backup for all branches this way though.
As for your other changes, I like it! Just updated the post to reflect them. Thanks!
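For anyone who does want Jan's variant, archiving every branch under a repo-branch name could be sketched roughly as follows. This is not part of the script above; the function name archive_all_branches and the example URLs are hypothetical.

```shell
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# archive_all_branches SOURCE_URL REPO_NAME DEPRECATED_URL
# Hypothetical helper: pushes every branch X of the source repo to the
# deprecated repo as a branch named REPO_NAME-X.
archive_all_branches() {
  local source_url="$1" repo_name="$2" deprecated_url="$3"
  local workdir ref branch
  workdir="$(mktemp -d)"

  # A bare clone copies all branches of the source into refs/heads/*
  git clone --quiet --bare "${source_url}" "${workdir}/${repo_name}.git"

  # Push each branch under its prefixed name; stdin is redirected so a
  # push over ssh cannot swallow the rest of the branch list
  while read -r ref; do
    branch="${ref#refs/heads/}"
    git -C "${workdir}/${repo_name}.git" push --force "${deprecated_url}" \
      "${ref}:refs/heads/${repo_name}-${branch}" < /dev/null
  done < <(git -C "${workdir}/${repo_name}.git" for-each-ref --format='%(refname)' refs/heads)

  rm -rf "${workdir}"
}

# Example (assumed URLs):
# archive_all_branches git@github.com:kvz/eggshell.git eggshell git@github.com:kvz/deprecated.git
```

With this naming, eggshell's master ends up as eggshell-master and its development branch as eggshell-development, matching the scheme Jan describes.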