Occasionally, I’ll run across a case where some code (usually a background process) is looping, maintaining some kind of state, and checking for state change. For example, I have an application which monitors a file for changes. It does this by checking the file timestamp, and, if it has changed from the last run, operating on the file.
When I find this code, it is frequently one long loop with all of the logic packed inside it. Infinite loops are essentially impossible to test, so whenever you have a case like that, your first goal is to move as much code as possible outside of the loop. The best way I’ve found to do this is to separate the state management from the operation.
First, identify all the state variables within the method. One way to think about this is to ask which variables would break the code if you reset them at the top of every iteration. Take all of these variables and create a value object for them. Then take the rest of the code and move it to a new method which takes the value object as an argument and returns one. Finally, change the loop to just call the method, passing the value object, and assign the output back to the same variable.
By doing this, the loop now has minimal code which can be visually audited. The rest of the state logic is now encapsulated inside of a method (which can be refactored easily into one or more command objects if needed) which is easy to test. You can now write tests which create a state situation, pass it into the method, and then examine the resulting state.
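As a rough illustration of this refactoring, here is a minimal Ruby sketch of the file-watching case. The names WatchState, step, and process are made up for the example, and the timestamp and the change handler are passed in as arguments purely so that a test can supply fakes; treat it as one possible shape, not the only one.

```ruby
# Hypothetical value object: everything that must survive between iterations.
WatchState = Struct.new(:last_mtime, :runs)

# One iteration of the old loop body. Takes the current state plus the
# observed timestamp and returns the next state. `on_change` is any callable.
def step(state, mtime, on_change)
  return state if mtime == state.last_mtime # nothing changed, keep the state

  on_change.call
  WatchState.new(mtime, state.runs + 1)
end

# The loop itself shrinks to a few lines. It is shown as a comment because it
# never returns; `path` and `process` are assumed to exist in the real program.
#
#   state = WatchState.new(nil, 0)
#   loop do
#     state = step(state, File.mtime(path), -> { process(path) })
#     sleep 1
#   end

# Exercising single steps without any loop, using made-up timestamps:
calls = 0
start = WatchState.new(100, 0)
unchanged = step(start, 100, -> { calls += 1 }) # same mtime: handler not called
changed   = step(start, 200, -> { calls += 1 }) # new mtime: handler called once
```

Because step touches neither the clock nor the file system, each test is just a matter of building a state, calling the method, and inspecting what comes back.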
I’ve been dealing lately with issues around project estimation and timelines, and I recently realized something. Developers tend to be optimists, and are willing to tackle cool new projects which are very greenfield.
Greenfield projects are important and can sometimes offer huge payoffs. However, their downside is that they are by their nature unscoped. The goals can be scoped, but the development involved, the learning needed, and the unknowns presented leave the true scope undefined.
These are cool! It is exciting to jump into the great unknown and figure out ways to do things which have never been done before.
However, 99% of what is built today has already been done thousands of times by other developers, and I bet that you have done it already once or twice. The long term sustainability of your company is based on your ability to not get in over your head.
So I propose that you try to find the simplest, most boring way to solve your problem. If it is turning out to be hard, ask whether there is a way to shift requirements such that it is made easy. The vast majority of the time, you can.
For the remaining 1% of projects, keep their scope as small as possible. Since you cannot estimate the depth of the project, keep the breadth limited.
However, feel free to try new tech, as long as you follow these rules:
Only try one new technology in any given project. A new database is a new technology. A new templating language is a new technology. Et cetera.
Set a fallback plan. If you were not allowed to use any new technology, how would you do it?
Set a cutoff. If this takes longer than X time, you will fall back to the old tech.
Set a stretch cutoff. If you think you are really close, you are allowed Y more time to get it there.
If you hit these cutoffs, do not, under any circumstances, continue down this road on this project. You have to ship and move on to other things. Learn from the test and feel free to try again with the new-found knowledge.
Since caching is one of the two hard things in Computer Science (along with naming things and off-by-one errors), and I’ve been dealing with tons of caching issues of late, I here offer my personal rules of caching:
Caching should be done as little as possible.
You should not cache until you know you need to.
When possible, caching should be done by dedicated applications (Varnish, Akamai, et cetera) outside of the application.
When possible, HTTP APIs should set sane cache headers to help aforementioned dedicated applications.
Try to keep cache lifetimes as short as your applications can handle. If your data rarely changes, try an ultra-long-lived cache, keeping in mind the rules below.
Ultra-long-lived caches which require forced expiration on change should only be used inside the application which generates the changes.
Never depend on polling to tell you when to expire a cache.
Always turn off caching in your test suite except when testing your cache.
When consuming slow data you do not control, you may cache the response in your application. However, you should not cache the processed result, only the source, and as early as possible.
You should not cache intermediate steps in your application’s program flow.
This will be updated as I bang my head against more issues.
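To make the rule about slow data you do not control a little more concrete, here is a small Ruby sketch. SourceCache, report, and the comma-separated feed are all invented for illustration, and the clock is injectable only so the expiry can be exercised without sleeping. The point is the ordering: the raw response is cached as early as possible, and the processing runs after the cache on every call.

```ruby
# Hypothetical in-process cache for raw responses from a slow external source.
class SourceCache
  def initialize(ttl:, clock: -> { Time.now.to_f })
    @ttl = ttl
    @clock = clock
    @entries = {} # key => [raw_value, expires_at]
  end

  # Returns the cached raw value for `key`; the block runs only on a miss
  # or once the entry has expired.
  def fetch(key)
    raw, expires_at = @entries[key]
    return raw if expires_at && @clock.call < expires_at

    raw = yield
    @entries[key] = [raw, @clock.call + @ttl]
    raw
  end
end

# Processing happens after the cache, so changing this code can never leave
# stale derived data behind. `fetch_raw` stands in for the slow call.
def report(cache, fetch_raw)
  raw = cache.fetch(:feed) { fetch_raw.call }
  raw.split(",").map(&:strip)
end
```

In this sketch a ttl of 0 means every call misses, which is one simple way to turn caching off in a test suite.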
I’ve been using rbenv for development for a while, but recently I started using rbenv on my Webfaction hosting as well. Here is a basic tutorial.
First, create a new Passenger app in the Webfaction control panel. You can create a Rails app, but it adds some extra cruft you don’t need.
Log into the server, and install rbenv locally, using the standard rbenv tutorial. You will want to also install ruby-build in order to set up your ruby versions. Compile and install the desired version of Ruby using rbenv install 2.2.2 (change the version to match).
I suggest that you update rubygems and install bundler at this point by switching to the desired version of ruby (rbenv global 2.2.2), and running gem update --system and gem install bundler. You may want to switch back to system Ruby, although this is up to you (rbenv global system).
In the app directory (webapps/yourappname), edit the Nginx config file at nginx/conf/nginx.conf. Set passenger_ruby to be /home/[yourusername]/.rbenv/shims/ruby.
After a restart, you should be running on your rbenv ruby.
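If you want to confirm which interpreter the app is actually running on, one option is a tiny Ruby check like the one below, run from inside the app (for example from a temporary route or a console started with the same Ruby). The helper name ruby_runtime_info is made up for this sketch. RUBY_VERSION and RbConfig.ruby are part of standard Ruby, and under rbenv the reported executable would normally sit somewhere below your .rbenv directory rather than being the shim itself.

```ruby
require "rbconfig"

# Hypothetical helper: reports which Ruby interpreter is executing this code.
def ruby_runtime_info
  { version: RUBY_VERSION, executable: RbConfig.ruby }
end

puts ruby_runtime_info
```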
Closing track on the new album, “With Heart and Voice.” This festival arrangement was commissioned for the CUC 2010-11 Academic year. Produced and Released by Concordia University Chicago Wind Symphony, 2011.