Ruby metaprogramming step-by-step

Instance, class, superclass, metaclass

Q: What does metaprogramming mean exactly?

It means writing programs that modify themselves and/or write other programs, and doing so at runtime! In Ruby the most common example of metaprogramming is the shorthand for creating attribute readers/writers/accessors (i.e. getters/setters):

class Person
  attr_accessor :name
end

# expands to (and is therefore equivalent to)
class Person
  def name=(val)
    @name = val
  end
  def name
    @name
  end
end

attr_accessor is an ordinary class method that accepts symbols (or strings) and, based on them, defines methods such as name= and name. In other words, the program modifies itself!
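To make the "ordinary class method" claim concrete, here is a minimal sketch of how such a method could be written by hand. The names MyAccessors, my_attr_accessor and Employee are made up for this illustration, and this is not how Ruby implements attr_accessor internally; it relies on define_method, which is covered later in this post.

```ruby
# Hypothetical re-implementation of attr_accessor, for illustration only.
module MyAccessors
  def my_attr_accessor(*names)
    names.each do |name|
      ivar = "@#{name}"
      # reader, e.g. employee.name
      define_method(name) { instance_variable_get(ivar) }
      # writer, e.g. employee.name = value
      define_method("#{name}=") { |val| instance_variable_set(ivar, val) }
    end
  end
end

class Employee
  extend MyAccessors                # makes my_attr_accessor a class method
  my_attr_accessor :name, 'title'   # symbols and strings both work
end

e = Employee.new
e.name = 'John'
e.title = 'Editor'
e.name   # => "John"
e.title  # => "Editor"
```

The methods do not exist in the source of Employee; they are created when the class body runs.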

Q: Is this it? Is this all we can do with it?

Not at all. Pretty much anything you can do statically (written out in the source code) you can also do dynamically (at runtime). This includes declaring new classes, adding class and instance methods to those classes, setting their instance variables and so on.

But if you want to do that you need to understand a few things about the inner workings of Ruby. Specifically:

  1. How to create new classes at runtime
  2. Where methods are stored
  3. How to programmatically define a method
  4. How instance_eval and class_eval work

How to create new classes at runtime

Creating new classes at runtime is actually the easiest of the four. It’s just a call to the new method of the class Class (that’s quite a mouthful).

Person = Class::new

# is equivalent to
class Person
end

Notice the double colon ( :: ) after Class. It’s there to remind us that we are calling a class method; Class.new, written with a dot, does exactly the same thing and is the more common spelling. Class::new creates an anonymous class. When we assign it to a constant we effectively give it a name. We can also pass a superclass to Class::new to create a subclass of that superclass.
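A short sketch of the superclass argument and of naming by constant assignment. The class names Animal and Dog are made up for the example; the block form of Class.new is used here to give the classes a body.

```ruby
# A class created at runtime, with a body supplied as a block.
Animal = Class.new do
  def speak
    'generic noise'
  end
end

# Passing a superclass creates a subclass of it.
Dog = Class.new(Animal) do
  def speak
    'Woof'
  end
end

# Not assigned to a constant, so it stays anonymous.
anonymous = Class.new(Animal)

Dog.superclass       # => Animal
Dog.new.speak        # => "Woof"
Dog.name             # => "Dog"
anonymous.name       # => nil
anonymous.new.speak  # => "generic noise"
```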

Where methods are stored

Q: Creating classes is great and all, but we would really like them to do something useful, not just sit around. So how do we define methods?

The real question is: which methods? Instance methods or class methods? This is a crucial point because instance methods actually reside somewhere other than class methods. The picture below (just an approximation) illustrates how method calls are resolved.

Instance method call

The “source code” of instance methods resides in classes. For an instance of a class to run such a method, the instance needs to reach into its class, find out whether the class has the method and then run it. Imagine we want to call the instance method im on instanceOfCustom (which is an instance of our Custom class, duh):

  1. instanceOfCustom follows the klass pointer to its class
  2. searches for the method im in the repository of methods inside the class
  3. invokes the method
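The three steps above can be observed from Ruby itself. This is a small sketch using the reflection methods instance_methods, instance_method and singleton_methods; Custom and im mirror the names used in the text, and the klass pointer corresponds to what the class method reports.

```ruby
class Custom
  def im
    'instance method'
  end
end

instance_of_custom = Custom.new

instance_of_custom.class               # => Custom   (step 1: follow the klass pointer)
Custom.instance_methods(false)         # => [:im]    (step 2: the class holds the method)
Custom.instance_method(:im).owner      # => Custom
instance_of_custom.singleton_methods   # => []       (the object itself holds no methods)
instance_of_custom.im                  # => "instance method" (step 3: invoke it)
```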

Class method call

Q: So if the class holds the instance methods where are the class methods?

Good question. Let’s look in the superclass of our Custom class. Nope, not there … just more instance methods. So what if we followed the same procedure as with the instance method call? We (and by we we mean our Custom class) get a request to run the class method cm:

  1. We follow our klass pointer to something
  2. then search that something’s repository of methods and are surprised that it actually has one
  3. invoke the method

What is that something that lets Ruby use such an elegantly similar (i.e. the same) procedure for calling instance and class methods? It behaves something like a class but is not quite a class: it’s a metaclass (also known as a singleton class). Metaclasses form something like a parallel class hierarchy (see the above picture). They are virtual (notice the flag V; you cannot make instances of them), are created by the interpreter on demand, and are not visible in the class hierarchy because you cannot reach them through the superclass reference. The metaclass concept can be hard to understand, so try reading why’s “Seeing Metaclasses Clearly” for more insight.
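A quick way to check where a class method lives is to ask the metaclass for its methods. This sketch assumes Ruby 1.9.2 or later, where singleton_class returns the metaclass directly; Widget and cm are illustrative names.

```ruby
class Widget
  def self.cm
    'class method'
  end
end

Widget.instance_methods(false)                      # => []     (no instance methods here)
Widget.singleton_class.instance_methods(false)      # => [:cm]  (cm lives in the metaclass)
Widget.method(:cm).owner == Widget.singleton_class  # => true
Widget.superclass                                   # => Object (the metaclass is not on this chain)

# Metaclasses are virtual: trying to instantiate one raises a TypeError.
begin
  Widget.singleton_class.new
rescue TypeError => e
  puts e.message
end
```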

OK, now that we know where the individual methods reside, let’s see how we can actually put them there ourselves programmatically.

How to programmatically define a method

Now that we know where methods need to be stored to become instance or class methods, we can try to create our own method programmatically. One way to do so is to use the instance_eval and class_eval methods and define the method we want to add inside their block. Let’s focus on class_eval first.

With class_eval you can run code in the context of a class (it’s just like being inside class … end).

# we'll use the above defined Person class
Person.class_eval do
  def name
    @name
  end
  def name=(val)
    @name = val
  end
end

p = Person.new
p.name = 'John'
p.name
# => "John"

This way you can add the instance methods name and name=, along with a class method name, to whatever class you wish. For example:

def add_name(klass)
  klass.class_eval do
    def name; @name; end
    def name=(val); @name = val; end
    def self.name; self.to_s; end
  end
end

add_name(String)
a = 'hello'
a.name = 'John'
a.name
# => "John"
a.class.name
# => "String"

Q: So what is instance_eval for then? And why did we bother learning about metaclasses?

Good questions. Before answering them, let’s make things a bit more confusing; after that the explanation will make more sense. Let’s try the same example but swap class_eval for instance_eval.

def add_name(klass)
  klass.instance_eval do
    def name; @name; end
    def name=(val); @name = val; end
    def self.name; self.to_s; end
  end
end

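# use a class that the class_eval version has not touched
add_name(Array)
a = []
a.name = 'John'
# => NoMethodError: undefined method `name=' for []:Array
#    (exact wording varies by Ruby version)
Array.respond_to?(:name=)
# => true
Array.name
# => "Array"

This time the result is different. Inside instance_eval, def defines methods on the receiver’s metaclass, so name and name= become class methods of Array and its instances never see them. (If you call add_name(String) in the same session as the class_eval example, a.name = 'John' still appears to work, but only because class_eval already added those instance methods to String.) Both methods are available on a class because classes are just instances of the class Class; the difference is where def puts the new methods.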

Q: So what is instance_eval for again?

You can use instance_eval on classes because classes are instances themselves. The really cool thing about instance_eval is that you can call it on any object and execute code in that object’s context. So we can, for example, “cheat” and read an object’s instance variables, which are normally hidden from the outside.

class C
  def initialize
    @a = 1
  end
end
C.new.instance_eval { @a }
# => 1

Or even add singleton (object specific) methods to objects!

p1 = Person.new
p2 = Person.new
p1.instance_eval do
  def say_hello
    p 'hello'
  end
end
p1.say_hello
# => "hello"
p2.say_hello
# => NoMethodError: undefined method `say_hello' for #<Person:0x2b8462c>
#            from (irb):9

How cool is that! Only the polite person (p1) can say_hello and the other one (p2) is just rude. On Ruby 1.9 and later, instance_eval is defined in BasicObject, so every object can use it, but class_eval is defined in Module and can be used only by modules and classes.
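You can ask Ruby where each of the two methods lives. This is a small reflection sketch; on Ruby 1.9 and later the owner reported for instance_eval is BasicObject, the root of the class hierarchy.

```ruby
BasicObject.instance_method(:instance_eval).owner  # => BasicObject
Module.instance_method(:class_eval).owner          # => Module

Object.new.respond_to?(:instance_eval)  # => true   (any object)
Object.new.respond_to?(:class_eval)     # => false  (plain objects are not modules)
String.respond_to?(:class_eval)         # => true   (classes)
Comparable.respond_to?(:class_eval)     # => true   (modules)
```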

Q: I still don’t see how the metaclasses fit in.

The above examples of adding instance and class methods are perfectly valid, but there are things that you cannot accomplish by explicitly defining methods. Sometimes you need to pass stuff from outside into the method definition, like symbols, strings, blocks and so on. But by starting a definition with def you effectively cut yourself off from the surrounding scope: for instance, you cannot access local variables assigned in the outer scope. Understanding variable scope, blocks and Procs takes a longer discussion, which is why I wrote a whole post about this topic.

So assuming you have an idea about variable scope, blocks and Procs, you should be able to see the shortcomings of metaprogramming with explicit method definitions. The way Ruby does metaprogramming without explicit definitions is by providing the define_method method in the Module class (it was private before Ruby 2.5, which is why it is usually called from inside class_eval or a class body). We call it with the method name (a symbol) as the argument and a block that becomes the method’s body, and Ruby defines the method for us. Thanks to this we don’t have to start a new method definition and we keep access to variables in the surrounding scope. Because the block is transformed into a Proc (more on this in the post I mentioned above), the context in which it was defined stays associated with it, so the method body can use variables from that scope even when it runs in a different context (in the context of a class or an instance, etc.). Let’s look at an example.

Person = Class::new
var = 5
Person.class_eval do
  def age
    @age ||= var
  end
end
Person.new.age
# => NameError: undefined local variable or method `var' for #<Person:0x2b78764>
#            from (irb):4:in `age'
#            from (irb):8

Person.class_eval do
  define_method(:age) do
    @age ||= var
  end
end
Person.new.age
# => 5

The first time we tried calling the method age we got an error because def starts a new scope, so the local variable var from the surrounding code is not visible inside the method body. The second time everything works fine, because the block passed to define_method is turned into a Proc that keeps access to var.
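The closure is shared, not copied: methods created with define_method can read and update the same outer variable. A minimal sketch follows; the names Tracker, hit! and hits are made up for the illustration.

```ruby
hits = 0   # a local variable in the surrounding scope

Tracker = Class.new do
  # Both blocks close over the same `hits` variable.
  define_method(:hit!) { hits += 1 }
  define_method(:hits) { hits }
end

a = Tracker.new
b = Tracker.new
a.hit!
b.hit!

a.hits  # => 2
b.hits  # => 2
hits    # => 2  (the outer variable itself was updated)
```

Note that this state belongs to the closure rather than to any instance, which is why a and b report the same count.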

Q: Neat, so we can define instance methods without explicit definition. What about class methods?

define_method adds a new method definition to a class, which has the effect of creating a new instance method (because the class is the repository of instance methods, remember?). Do you see where metaclasses fit in now?

Metaclasses to the rescue! By now it should be clear that calling define_method in the metaclass of a class adds a new method to the metaclass’ method repository, thus effectively making the new method a class method. Yes, it’s exactly the same as with instance methods! There is a little snag though. When this post was written there was no direct way of getting to the metaclass of a class (Ruby 1.9.2 later added Object#singleton_class for exactly this). The way people do it by hand is to open the metaclass’ definition and return a reference to it.

class Person
  def self.metaclass
    class << self
      self
    end
  end
end

# now we can use the metaclass to define a class method
Person.metaclass.class_eval do
  define_method(:max_age) do
    125
  end
end
Person.max_age
# => 125

Piece of cake!
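On Ruby 1.9.2 and later the hand-written helper has built-in counterparts. The sketch below assumes such a Ruby version and uses a made-up class Robot; it checks that singleton_class returns the very same object as the metaclass helper and shows define_singleton_method as a shortcut.

```ruby
class Robot
  # the same helper as above
  def self.metaclass
    class << self
      self
    end
  end
end

# The built-in accessor returns the same object as the helper.
Robot.metaclass.equal?(Robot.singleton_class)  # => true

# Shortcut: define a class method without touching the metaclass explicitly.
Robot.define_singleton_method(:max_age) { 125 }
Robot.max_age  # => 125

# The long way round, using the built-in accessor.
Robot.singleton_class.class_eval do
  define_method(:min_age) { 0 }
end
Robot.min_age  # => 0
```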

Q: Why does the method self.metaclass work?

For two reasons.

First, because Ruby allows us to get to so-called per-object classes. The notation class << self opens the per-object (singleton) class associated with an object. (You can use this notation as another way of adding methods to a single object.) And since classes are objects too, we can access their per-object class the same way; for a class, we call that per-object class its metaclass.
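As a small sketch of the parenthetical remark, here is the class << notation applied to a plain object instead of self. The variable name greeter is made up for the example.

```ruby
greeter = Object.new

# Open the per-object class of this one object and add a method to it.
class << greeter
  def say_hello
    'hello'
  end
end

greeter.say_hello                   # => "hello"
greeter.singleton_methods           # => [:say_hello]
Object.new.respond_to?(:say_hello)  # => false  (other objects are unaffected)
```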

The second reason is that in Ruby a class definition is an expression: it evaluates to the value of the last expression in its body. Therefore you can return values from the definition of a class.

class C
  10
end
# => 10

Q: This is all interesting and all but what can I do with it?

Metaprogramming is best suited for automating repetitive tasks (like creating attribute readers/writers), developing frameworks (a great example is ActiveRecord in Rails and how it gets metadata from the database and then builds up your model classes without you having to do anything) and having lots and lots of fun!!!
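To give a flavour of the framework use case, here is a toy sketch that builds a class from a list of column names by combining Class.new and define_method. The helper build_model, the class Book and the column list are all made up for this illustration; this is not how ActiveRecord is implemented.

```ruby
# Hypothetical helper: builds a class with one reader/writer pair per column.
def build_model(columns)
  Class.new do
    def initialize(attributes = {})
      @attributes = attributes.dup
    end

    # `columns` is visible here because the block is a closure.
    columns.each do |column|
      define_method(column) { @attributes[column] }
      define_method("#{column}=") { |value| @attributes[column] = value }
    end
  end
end

# Made-up metadata standing in for what a framework might read from a database.
Book = build_model([:title, :author])

book = Book.new({ title: 'Dune' })
book.author = 'Frank Herbert'
book.title   # => "Dune"
book.author  # => "Frank Herbert"
```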

2 comments
  1. Maybe I missed something in my perusal of your tutorial.
    But did you mention anything related to higher order functions?
    Thanks a lot.

  2. ytoh said:

    No, not really. Just the basics of how to extend existing classes and objects so I don’t forget :-)
