This article in 50 words: I used to prefer spaces over tabs. Now I don't care so much, and think it's more important that you can easily switch on a per-project basis. I have some thoughts on how conventions should be established, and I'll demonstrate Bash code that can convert your codebase to a new standard.
Back in the Day
I used to prefer spaces over tabs so code would look consistent throughout the monospace universe. No matter what crappy viewer people ended up using: as long as it supported a monospace font, your code looked as intended. I also did some work for PEAR, which enforced spaces in its thorough Coding Standards. Those were later adopted by many frameworks, sometimes with small deviations.
Quest for the Holy Convention
After years of spaces in my code I started using CakePHP, whose standard was tabs. Nothing to get hung up over, but after a while my code started intermingling with other Cake developers' code, and that's when it gets a little hairy.
So I started using tabs there, because in my view conventions are much like traffic rules, as I mentioned before in SQL Formatting:
It's irrelevant whether people drive on the right or left side of the road, as long as they all do the same.
At Transloadit we started using 2 spaces for JavaScript as it's the way of node.js. And then there's a little Ruby project I started hacking on and they also like 2 spaces.
Adopt Many Conventions
Coding standards change: within a project, an organization, a framework, and even a language. Or they change for you simply because you contribute to several of these.
Instead of trying to enforce one preference throughout all of my projects, I adopt the rules of the domain at hand. In this order:
Project > Organization > Framework > Language
(where they conflict, left wins over right)
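As a minimal sketch of that precedence rule, the function below returns the first non-empty rule it is given, most specific layer first. The name resolve_convention and the idea of passing rules as plain strings are assumptions made for illustration only.

```shell
# Hypothetical helper: pick the most specific convention that is defined.
resolve_convention() {
  # Arguments are expected in precedence order:
  # project, organization, framework, language. Empty means "no rule here".
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# No project or organization rule, so the framework rule beats the language default.
resolve_convention "" "" "tabs" "4 spaces"
```

With no project or organization rule set, the call above prints tabs.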
On a side note, I think it's the convention designer's responsibility to align their conventions with the layer they're building on as much as possible. So frameworks should look at their language, and companies should look at their framework. This makes for consistent-looking codebases, and that helps encourage involvement. Nobody likes messing with code that suffers from poor housekeeping.
In order to be flexible about this, it helps a lot if your IDE supports per-project settings (I currently use both NetBeans and Vim, and they do a fine job at that). In NetBeans it's easy to mess up, though, because it's pretty much indentation-agnostic. So sometimes you won't notice you're filling a 4-space file with tabs, ruining the code in other views and editors.
Once that happens, or maybe if you're porting big chunks of 'legacy' code to a new standard that's closer to your layer, you'll need decent conversion scripts.
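To spot that kind of mix-up before it spreads, a rough check along these lines may help. It is only a sketch: the function name list_mixed_indent is made up, and it treats "at least one tab-indented line plus at least one line starting with two or more spaces" as mixed, which is a heuristic rather than a rule.

```shell
# Hypothetical helper: list files that contain both tab-indented lines and
# lines indented with two or more spaces.
list_mixed_indent() {
  local dir="${1:-.}"
  local tab
  local file
  tab=$(printf '\t')
  # grep -rl prints the names of files with at least one tab-indented line.
  # File names containing newlines are not handled by this simple loop.
  grep -rl "^${tab}" "$dir" | while IFS= read -r file; do
    # Two spaces, not one, so that " * docblock" lines in tab-indented
    # files are not reported as space indentation.
    if grep -q "^  " "$file"; then
      printf '%s\n' "$file"
    fi
  done
}
```

Run it as list_mixed_indent path/to/project. It also descends into directories such as .git, so you may want to point it at your source directory only.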
Switch Conventions
There are many pages on converting spaces to tabs on Linux or Mac, but I wasn't satisfied as they:
- Also change non-leading whitespace (which may not be what you want, e.g. a tab-indented document could still use spaces to promote readability around assignments, or inside big strings)
- Don't support multiple levels of indentation
- Can't be run from the command line (e.g. depend on an IDE)
- Are specific to a language (indent/astyle)
- Messed up my indentation (expand/unexpand)
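The first two complaints can be addressed by anchoring the substitution to the start of the line and repeating it once per indentation level. Here is a small sketch of that idea as a stdin filter, so you can try it on a single line first. The function name leading_spaces_to_tabs and the width parameter are assumptions for illustration, and it assumes GNU sed.

```shell
# Hypothetical filter: turn each leading group of N spaces into one tab.
# Assumes GNU sed (for \t and for labels inside a one-line script).
leading_spaces_to_tabs() {
  local width="${1:-4}"
  # ^\(\t*\) matches the tabs already produced at the start of the line, so
  # each pass converts exactly one more level. Spaces that come after the
  # first non-whitespace character never match and are left alone.
  sed -e ":repeat; s/^\(\t*\) \{${width}\}/\1\t/; t repeat"
}

# Eight leading spaces become two tabs; the alignment spaces before "=" stay.
printf '        $total    = 1;\n' | leading_spaces_to_tabs 4
```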
In an attempt to come up with a reliable tabs vs spaces converter that you can simply run inside a directory and will traverse your source files, I'd like to share a couple of lines of Bash.
Warnings:
- Only do this when your source is under version control, these snippets make no backups! So execute, test, verify, commit. Or hit git reset --hard if you don't like it (leave a comment for improvement!)
- Currently processes .php, .ctp, .js, .css, .sh. But can easily be modified to do other extensions as well.
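Since nothing here makes backups, it may be worth putting a guard in front of the conversion commands. The sketch below refuses to continue unless you are inside a git work tree with nothing uncommitted. The function name require_clean_worktree is made up, and it assumes git is your version control system.

```shell
# Hypothetical guard: succeed only inside a git work tree without
# uncommitted or untracked changes.
require_clean_worktree() {
  if ! git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    echo "Not inside a git repository, aborting." >&2
    return 1
  fi
  # --porcelain prints one line per changed or untracked file,
  # so empty output means the tree is clean.
  if [ -n "$(git status --porcelain)" ]; then
    echo "Uncommitted changes found, commit or stash them first." >&2
    return 1
  fi
}
```

You could then write require_clean_worktree && find ... so the conversion only starts from a state that git reset --hard can restore.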
Ubuntu
$ # 4 spaces to tabs
$ find -P . -type f -regextype egrep -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 sed -i"" -e ':repeat; s/^\(\t*\) \{4\}/\1\t/; t repeat'
$ # extra: Strip any trailing whitespace
$ find -P . -type f -regextype egrep -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 sed -i"" -e 's/[[:blank:]]*$//g'
$ # extra: Strip any trailing blank lines (https://www.eng.cam.ac.uk/help/tpl/unix/sed.html)
$ find -P . -type f -regextype egrep -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 sed -i"" -e :a -e '/^\n*$/{$d;N;ba' -e '}'
$ # extra: Strip any trailing PHP closing tags
$ find -P . -type f -regextype egrep -regex '.*\.(php|ctp)$' -print0 | xargs -0 sed -i"" -e :a -e '/^?>\n*$/{$d;N;ba' -e '}'
$ # extra: Check the PHP files for syntax errors
$ find -P . -type f -regextype egrep -regex '.*\.(php|ctp)$' -exec php -l {} \; > /dev/null
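The Mac list below also has a tabs-to-2-spaces step. For GNU/Linux, a sketch of the same idea as a stdin filter could look like this. The function name leading_tabs_to_spaces and the width parameter are assumptions, and unlike the Mac one-liner it accepts any number of leading spaces before the tab rather than only whole pairs.

```shell
# Hypothetical filter: turn each leading tab into N spaces.
# Assumes GNU sed (for \t and for labels inside a one-line script).
leading_tabs_to_spaces() {
  local width="${1:-2}"
  local pad
  # Build a string of exactly $width spaces.
  pad=$(printf "%${width}s" '')
  # ^\( *\) matches the spaces already produced at the start of the line,
  # so each pass replaces the next leading tab. Tabs after the first
  # non-whitespace character are left alone.
  sed -e ":repeat; s/^\( *\)\t/\1${pad}/; t repeat"
}
```

To apply it to a whole tree you would swap the sed expression into the find and xargs pipeline used above, after trying the filter on one file.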
Mac
On a Mac? You need GNU sed! Read below.
$ # 4 spaces to tabs
$ find -P -E . -type f -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 gsed -i"" -e ':repeat; s/^\(\t*\) \{4\}/\1\t/; t repeat'
$ # tabs to 2 spaces
$ find -P -E . -type f -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 gsed -i"" -e ':repeat; s/^\(\(  \)*\)\t/\1  /; t repeat'
$ # extra: Strip any trailing whitespace
$ find -P -E . -type f -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 gsed -i"" -e 's/[[:blank:]]*$//g'
$ # extra: Strip any trailing blank lines (https://www.eng.cam.ac.uk/help/tpl/unix/sed.html)
$ find -P -E . -type f -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 gsed -i"" -e :a -e '/^\n*$/{$d;N;ba' -e '}'
$ # extra: Strip any trailing PHP closing tags
$ find -P -E . -type f -regex '.*\.(php|ctp)$' -print0 | xargs -0 gsed -i"" -e :a -e '/^?>\n*$/{$d;N;ba' -e '}'
$ # extra: Check the PHP files for syntax errors
$ find -P -E . -type f -regex '.*\.(php|ctp)$' -exec php -l {} \; > /dev/null
Run Into Problems?
Please let me know and I'll update the article so that these lines become the perfect converters.
On a Mac? You need gnu-sed
Mac OS X (BSD) has a crippled sed. This illustrates my point:
$ # On Mac:
$ echo "1|2|||5||7|" | sed -e ': repeat; s/||/|NULL|/; t repeat'
1|2|||5||7|
$ # On Linux:
$ echo "1|2|||5||7|" | sed -e ': repeat; s/||/|NULL|/; t repeat'
1|2|NULL|NULL|5|NULL|7|
Luckily you can get GNU sed for Mac OS X just as well. Get Homebrew, then run:
$ brew install gnu-sed
And change all sed references to gsed.
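If you want the same script to work on both Linux and Mac, one option is to detect a GNU sed at runtime instead of editing the commands by hand. This is only a sketch: the function name pick_gnu_sed is made up, and it assumes GNU sed identifies itself with "(GNU sed)" in its --version output.

```shell
# Hypothetical helper: print the name of a GNU sed binary, preferring gsed.
pick_gnu_sed() {
  if command -v gsed >/dev/null 2>&1; then
    echo gsed
  elif sed --version 2>/dev/null | grep -q '(GNU sed)'; then
    # The plain sed on this system is already GNU sed (typical on Linux).
    echo sed
  else
    echo "No GNU sed found; try: brew install gnu-sed" >&2
    return 1
  fi
}
```

Usage would be something like SED=$(pick_gnu_sed), after which you call "$SED" wherever the commands above say sed or gsed.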
That's it
How do you deal with changing/different coding standards?