LSpace: Dynamic scope for Ruby

At Rapportive we have one master database that’s always up-to-date, and a read slave that may be a little bit behind. Usually a database read can use either connection, but if a user is viewing their own page we need to use the master database. This is so that if they’ve just edited their information, their edits are guaranteed to show up, even if the read slave is lagging.

LSpace is a library designed for holding implicit state like this. We can tell the database layer to use the master database at the top level, and that decision will be transparently passed down through the page loading logic:

require 'lspace'
class DatabaseConnection
  def get_read_connection
    LSpace[:database] || slave_db
  end

  def self.use_master(&block)
    LSpace.with(:database => master_db) do
      block.call
    end
  end
end

DatabaseConnection.use_master do
  load_data_for_page
end

As you can see, load_data_for_page doesn’t need to take an argument to tell it which database to use. It just uses the LSpace for implicit context.

Why not just a global?

In the example above I clearly could have used a global variable, though that gets a bit messy if there are multiple threads. I could even have used a Thread-local, but LSpace is better than Thread-locals for two reasons:

  1. LSpace locals are changed only for the duration of a block. This means that you won’t permanently clobber the value that someone else set, and that your value won’t leak once you no-longer need it.
  2. LSpace can automatically extend the notion of thread when you’re using eventmachine or celluloid. Values in LSpace are preserved even when your code jumps between actors or continues after a callback.

As I’m sure the old-time-lispers are eager to point out, the first makes LSpace an implementation of dynamic scoping and the second exists to avoid the confusion caused by the upwards funarg problem.

Why is dynamic scope useful?

LSpace is good for anything that needs implicit context. You’ll notice that to demonstate how it works, I did not need to show you the implementation of load_data_for_page. By the same token, load_data_for_page doesn’t need to care which database connection it’s using; it just relies on the fact that a connection will be available when it needs one.

Let’s take another example. When logging I want to ensure related log messages have the same prefix, so I can tie them back together again:

def log(str)
  puts "#{LSpace[:log_prefix]}: #{str}"
end

LSpace.with(:log_prefix => "example") do
  log "hello world"
end
# => example: hello world

I could have just passed log_prefix into the log method directly, but I chose not to in order to keep the API clean. The benefits of this become more obvious in a larger example:

def fetch(url)
  url = URI(url)
  log "Fetching #{url}"
  Net::HTTP.get(url)
end

def get_title(url)
  html = fetch(url)
  (Nokogiri::HTML(html) / 'title').to_s
end

LSpace.with(:log_prefix => "example") do
  log get_title('http://www.google.com')
end

If I didn’t have LSpace (or thread locals) then I’d have had to pass log_prefix as an argument to get_title. This works, but it’s ugly:

def get_title(url, log_prefix)
  html = fetch(url, log_prefix)
  (Nokogiri::HTML(html) / 'title').to_s
end

I don’t just mean “it’s ugly because it now takes two parameters”, though that certainly doesn’t help. It’s ugly because get_title shouldn’t depend on how fetch is implemented; in fact, fetch shouldn’t even depend on how the logger is implemented.

This tight coupling means that if I ever want to add a new log call to my application, I’m going to have to change a lot of code to pass the log_prefix all the way down to the logger. This is clearly bad for maintainability, as it’s adding a lot of extra work for the programmer.

Should I use LSpace?

LSpace makes part of your context implicit. This is good because it reduces the amount of boilerplate state-passing you need, but can be confusing if over-used (this in Javascript is perhaps the most infamous example of the problem).

We use LSpace mostly for framework-level concerns, rather than directly for application logic:

  1. Setting a logging prefix (just like in the example above). It makes logs much easier to read when you can associate lines that came from the same request.
  2. Being able to use a particular database connection for the duration of a block. When we show the user their own information, we force the code to read from our master database so that any changes they just made will appear.
  3. Maintaining state in our metrics library to enable Zipkin-style traces of which slownesses are caused by which others.

Other people are using similar ideas for some really cool stuff. In particular you should look at how the reactive programming is implemented in meteor.js, or why dynamic scope is used in Emacs lisp.

I’d encourage you to read the README, and see whether LSpace can work its magic for you.