Instance, class, superclass, metaclass

Q: What does metaprogramming mean exactly?

It means writing programs that modify themselves and/or write other programs, at runtime! In Ruby the most common example of metaprogramming is the shorthand for creating attribute readers/writers/accessors (i.e. getters/setters):

class Person
  attr_accessor :name
end

# is extended to (and therefore is equivalent to)
class Person
  def name=(val)
    @name = val
  end
  def name
    @name
  end
end

attr_accessor is an ordinary class method that accepts symbols (or strings) and, based on them, defines methods like name= and name. In other words, the class modifies itself!
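
One way to convince yourself that attr_accessor really generates ordinary methods is to ask the class what it defines. A minimal sketch, assuming Ruby 1.9 or later (where method lists are symbols); the Book class and its title attribute are made up for illustration:

```ruby
# Book is a hypothetical class used only for this illustration.
class Book
  attr_accessor :title
end

# instance_methods(false) lists only the methods defined in Book itself.
generated = Book.instance_methods(false).sort
p generated # [:title, :title=]

book = Book.new
book.title = 'Dune'
p book.title                          # "Dune"
p book.instance_variable_get(:@title) # "Dune", kept in a plain instance variable
```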

Q: Is this it? Is this all we can do with it?

Not at all. Pretty much anything you can do statically (before compilation) you can also do dynamically (at runtime). This includes declaring new classes, adding class and instance methods to those classes, setting their instance variables and so on.

But if you want to do that you need to understand a few things about the inner workings of Ruby. Specifically:

  1. How to create new classes at runtime
  2. Where methods are stored
  3. How to programmatically define a method
  4. How instance_eval and class_eval work

How to create new classes at runtime

Creating new classes at runtime is actually the easiest of the four. It’s just a call to the class method new of the class Class (that’s quite a mouthful).

Person = Class::new

# is equivalent to
class Person
end

Notice the double colons ( :: ) after Class. They are there to remind us that we are calling a class method; Class.new works just as well. Class::new creates an anonymous class. When we assign it to a constant we effectively give it a name. We can also pass a superclass to Class::new to create a subclass of that superclass.
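
A short sketch of both points, naming an anonymous class and passing a superclass. Animal and Dog are hypothetical names chosen for this example only:

```ruby
# Animal and Dog are hypothetical names used only for illustration.
Animal = Class.new do
  def speak
    '...'
  end
end

anonymous = Class.new(Animal) # a subclass of Animal that has no name yet
name_before = anonymous.name  # nil while the class is anonymous

Dog = anonymous               # assigning to a constant names the class
name_after = anonymous.name   # "Dog"

p name_before, name_after, Dog.superclass
```

The block given to Class.new is evaluated like the body of class … end, so methods defined in it become instance methods.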

Where methods are stored

Q: Creating classes is great and all but we would really like them to do something useful, not just sit around. So how do we define methods?

The real question is: which methods? Instance methods or class methods? This is a crucial point because instance methods actually reside somewhere other than class methods. The picture below (just an approximation) illustrates how method calls are resolved.

Instance method call

Instance method “source code” resides in classes. In order for an instance of a class to run such a method, the instance needs to reach into its class, find out whether the class has the method, and then run it. Imagine we want to call the instance method im on instanceOfCustom (which is an instance of our Custom class, duh):

  1. instanceOfCustom follows the klass pointer to its class
  2. searches for the method im in the repository of methods inside the class
  3. invokes the method
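
The three steps above can be retraced by hand with Ruby’s reflection methods. This is only a rough sketch, assuming Ruby 1.9 or later; Custom and im are the names from the text, while the body of im is invented:

```ruby
class Custom
  def im
    :im_was_called # invented return value, just so we can see the call happen
  end
end

instance_of_custom = Custom.new

# 1. follow the "klass pointer" from the instance to its class
klass = instance_of_custom.class

# 2. look for im among the methods stored in that class
found = klass.instance_methods(false).include?(:im)
owner = instance_of_custom.method(:im).owner

# 3. invoke the method
result = instance_of_custom.im

p klass, found, owner, result
```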

Class method call

Q: So if the class holds the instance methods where are the class methods?

Good question. Let’s look in the superclass of our Custom class. Nope, not there … just more instance methods. So what if we followed the same procedure as with the instance method call? We (and by we we mean our Custom class) get a request to run the class method cm:

  1. We follow our klass pointer to something
  2. then search that something’s repository of methods and are surprised that it actually has one
  3. invoke the method

What is that something that lets Ruby use such an elegantly similar (i.e. the same) procedure for calling instance and class methods? It behaves like a class but is not quite a class: it is a metaclass. Metaclasses form something like a parallel class hierarchy (see the above picture). They are virtual (notice the flag V; you cannot make instances of them), are created by the interpreter on demand, and are not visible in the class hierarchy because you cannot reach them through the superclass reference. The metaclass concept can be hard to understand, so try reading why’s “Seeing Metaclasses Clearly” for more insight.
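
The claims above can be checked from Ruby itself. The sketch below assumes Ruby 1.9.2 or later, where singleton_class returns the metaclass directly; Custom and cm are the names from the text, and the return value of cm is invented:

```ruby
class Custom
  def self.cm
    :cm_was_called # invented return value
  end
end

meta = Custom.singleton_class # the metaclass (assumes Ruby 1.9.2+)

in_class = Custom.instance_methods(false).include?(:cm) # false: not an instance method
in_meta  = meta.instance_methods(false).include?(:cm)   # true: stored in the metaclass
owner    = Custom.method(:cm).owner                      # the metaclass itself

# Metaclasses are "virtual": they cannot be instantiated.
instantiable = begin
  meta.new
  true
rescue TypeError
  false
end

p in_class, in_meta, owner == meta, instantiable
```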

OK, now that we know where the individual methods reside, let’s see how we can actually put them there ourselves, programmatically.

How to programmatically define a method

Now that we know where a method must be stored to become an instance method or a class method, we can try to create our own methods programmatically. One way to do so is to use the instance_eval and class_eval methods and define the method we want to add inside their block. Let’s focus on class_eval first.

With class_eval you can run code in the context of a class (it’s just like being inside class … end).

# we'll use the above defined Person class
Person.class_eval do
  def name
    @name
  end
  def name=(val)
    @name = val
  end
end

p = Person.new
p.name = 'John'
p.name
# => "John"

This way you can add the instance methods name and name= (and class methods too) to whatever class you wish. For example:

def add_name(klass)
  klass.class_eval do
    def name; @name; end
    def name=(val); @name = val; end
    def self.name; self.to_s; end
  end
end

add_name(String)
a = 'hello'
a.name = 'John'
a.name
# => "John"
a.class.name
# => "String"

Q: So what is instance_eval for then? And why did we bother learning about metaclasses?

Good questions. Before answering them, let’s make things a bit more confusing; the explanation will make more sense afterwards. Let’s try the same example but swap class_eval for instance_eval.

def add_name(klass)
  klass.instance_eval do
    def name; @name; end
    def name=(val); @name = val; end
    def self.name; self.to_s; end
  end
end

add_name(String)
a = 'hello'
a.name = 'John'
a.name
# => "John"
a.class.name
# => "String"

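We get the same output! Surprised? Be careful about why, though. Classes are just instances of the class Class, which is why instance_eval can be called on a class at all. But a def inside instance_eval defines singleton methods on the receiver, which for a class means class methods. a.name still works here only because the earlier class_eval call had already added the instance methods name and name= to String. On a class that has not been touched before, the instance_eval version would give you class methods only.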
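
To see exactly where each variant puts a method, it helps to repeat the experiment on classes that have not been touched before. A minimal sketch; WithClassEval and WithInstanceEval are hypothetical classes created just for this comparison:

```ruby
# Two fresh, hypothetical classes so that earlier definitions cannot interfere.
WithClassEval    = Class.new
WithInstanceEval = Class.new

WithClassEval.class_eval do
  def greet
    'hi from an instance'
  end
end

WithInstanceEval.instance_eval do
  def greet
    'hi from the class'
  end
end

p WithClassEval.new.greet                       # def in class_eval: instance method
p WithInstanceEval.greet                        # def in instance_eval: class method
p WithInstanceEval.new.respond_to?(:greet)      # false
p WithClassEval.respond_to?(:greet)             # false
```

In other words, def inside class_eval lands in the class, while def inside instance_eval lands in the receiver’s singleton class.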

Q: So what is instance_eval for again?

That you can use instance_eval on classes is a side effect of the fact that classes are instances themselves. The really cool thing about instance_eval is that you can call it on any object and execute code in that object’s context. So we can, for example, “cheat” and read instance variables that are otherwise hidden from the outside.

class C
  def initialize
    @a = 1
  end
end
C.new.instance_eval { @a }
# => 1

Or even add singleton (object specific) methods to objects!

p1 = Person.new
p2 = Person.new
p1.instance_eval do
  def say_hello
    p 'hello'
  end
end
p1.say_hello
# => "hello"
p2.say_hello
# => NoMethodError: undefined method `say_hello' for #<Person:0x2b8462c>
#            from (irb):9

How cool is that! Only the polite person (p1) can say_hello and the other one (p2) is just rude. The instance_eval method is defined in class Object (BasicObject since Ruby 1.9), so everyone can use it, but the class_eval method is defined in Module and can be used only by modules and classes.

Q: I still don’t see how the metaclasses fit in.

The above examples of adding instance and class methods are perfectly valid, but there are things that you cannot accomplish by explicitly defining methods with def. Sometimes you need to pass stuff from outside into the method definition: symbols, strings, blocks and so on. And by starting a definition you effectively cut yourself off from the surrounding scope. For instance, you cannot access local variables assigned in the outer scope. Understanding variable scope, blocks and Procs takes a longer discussion, which is why I wrote a whole post about this topic.

So assuming you have an idea about variable scope, blocks and Procs, you should be able to see the shortcomings of metaprogramming with explicit method definitions. The way Ruby does metaprogramming without explicit definitions is by providing the define_method method in the Module class (it was private before Ruby 2.5, which is why it is usually called from inside class_eval). We call this method with the method name (a symbol) as its argument and associate a block with it; the block becomes the method’s body and Ruby defines the method for us. Thanks to this we don’t have to start a new method definition, and we keep access to variables in the surrounding scope. As the block of the method body is transformed into a Proc (more on this in the post I mentioned above), the context in which it was defined stays associated with it, so the block has access to variables defined in that scope even when it runs in a different context (in the context of a class or an instance, etc.). Let’s look at an example.

Person = Class::new
var = 5
Person.class_eval do
  def age
    @age ||= var
  end
end
Person.new.age
# => NameError: undefined local variable or method `var' for #<Person:0x2b78764>
#            from (irb):4:in `age'
#            from (irb):8

Person.class_eval do
  define_method(:age) do
    @age ||= var
  end
end
Person.new.age
# => 5

The first time we tried calling the method age we got an error: def starts a new scope, so inside the method body var is not a visible local variable. The second time everything works fine, because the block passed to define_method is turned into a Proc that remembers the scope where var was assigned.
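
The same closure property lets you generate several methods in a loop. Below is a sketch of a home-made attr_accessor built on define_method. MyAccessors, my_attr_accessor and Robot are invented names, and the real attr_accessor is implemented inside the interpreter, not like this:

```ruby
# MyAccessors and my_attr_accessor are hypothetical; this only imitates attr_accessor.
module MyAccessors
  def my_attr_accessor(*names)
    names.each do |name|
      ivar = "@#{name}" # captured by the two blocks below

      # reader: the block still sees ivar when the method is called later
      define_method(name) { instance_variable_get(ivar) }

      # writer: method names may be given as strings too
      define_method("#{name}=") { |val| instance_variable_set(ivar, val) }
    end
  end
end

class Robot
  extend MyAccessors # makes my_attr_accessor a class method of Robot
  my_attr_accessor :model, :serial
end

robot = Robot.new
robot.model = 'R2'
p robot.model  # "R2"
p robot.serial # nil, never assigned
```

With def instead of define_method the loop variable name would not be visible inside the method bodies, which is exactly the limitation described above.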

Q: Neat, so we can define instance methods without explicit definition. What about class methods?

define_method adds a new method definition to a class, which has the effect of creating a new instance method (because the class is the repository of instance methods, remember?). Do you see where the metaclasses come in now?

Metaclasses to the rescue! By now it should be clear that calling define_method in the metaclass of a class adds a new method to the method repository of the metaclass, and thus effectively makes the new method a class method. Yes, it’s exactly the same as with instance methods! There is a little snag though. Older Rubies offer no direct way of getting to the metaclass of a class (Ruby 1.9.2 and later provide singleton_class). The way people do it is to open the metaclass’ definition and return a reference to it.

class Person
  def self.metaclass
    class << self
      self
    end
  end
end

# now we can use the metaclass to define a class method
Person.metaclass.class_eval do
  define_method(:max_age) do
    125
  end
end
Person.max_age
# => 125

Piece of cake!
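
On Ruby 1.9.2 and later the hand-written metaclass helper is not strictly needed. A hedged sketch of two built-in alternatives, singleton_class and define_singleton_method; the Account class, its methods and the limit variable are invented for illustration:

```ruby
# Account, max_age, min_age and limit are hypothetical names.
Account = Class.new
limit = 125

# Same idea as Person.metaclass, using the built-in singleton_class.
Account.singleton_class.class_eval do
  define_method(:max_age) { limit } # the block still sees the outer variable
end

# Shortcut that defines a method on the receiver's singleton class directly.
Account.define_singleton_method(:min_age) { 0 }

p Account.max_age # 125
p Account.min_age # 0
p Account.new.respond_to?(:max_age) # false, these are class methods only
```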

Q: How does the method self.metaclass work?

It works for two reasons.

First, because Ruby allows us to get to so-called per-object classes. The notation class << self opens the per-object (singleton) class associated with an object. (You can use this notation as another way of adding methods to a single object.) And since classes are objects too, we can open their per-object class the same way; what we get back is what we call the class’ metaclass.

The second reason is that in Ruby a class definition is an expression like any other: it evaluates to the value of the last expression in its body. Therefore you can return values from the definition of a class.

class C
  10
end
# => 10
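
Putting the two reasons together: a class << obj … end body that ends in self evaluates to the singleton class. A small sketch, assuming Ruby 1.9.2 or later for the singleton_class comparison; Sample and Widget are invented names:

```ruby
# Sample and Widget are hypothetical classes.
value = class Sample
  10
end

Widget = Class.new
meta = class << Widget
  self # the last expression, so the whole class << ... end evaluates to it
end

same = (meta == Widget.singleton_class)

p value # 10
p same  # true
```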

Q: This is all interesting and all but what can I do with it?

Metaprogramming is best suited for automating repetitive tasks (like creating attribute readers/writers), developing frameworks (a great example is ActiveRecord in Rails and how it gets metadata from the database and then builds up your model classes without you having to do anything) and having lots and lots of fun!!!

Block

Ruby code blocks are chunks of code surrounded by the do and end keywords (or by curly braces for single-line blocks). Blocks can take arguments, which are declared by surrounding the variable names with pipe symbols. Blocks can be associated with method calls and evaluated using yield; arguments are passed to the block by passing them to yield. Any method can be called with a block as an implicit argument. So for example:

# implicit block evaluation
def m1
  yield
end

# passing arguments to implicit block
def m2( param )
  yield param
end

# assigning a name to an implicit block
def m3( param, &block )
  block.call param
end

m1 { puts 'hello' }
# prints: hello

m2( 'hello' ) { |x| puts x }
# prints: hello

m3( 'hello' ) do |x|
  3.times { puts x }
end
# prints: hello (three times, one per line)

In the above example we can see how blocks are associated with method calls and how they are evaluated inside a method. The m3 call shows how a multi-line block is associated with a method call.

Q: Whoa! Where is the yield in m3, hmm? And what is the meaning of the ampersand before the parameter block?

You got me :) The yield is replaced by block.call because we supplied a name for the block being associated (and a very unimaginative one: block), and thanks to that, by the time the block gets to the method body it’s no longer a block. It’s actually a Proc. In m1 and m2 the block is anonymous and we evaluate it by calling yield. If we want to give a name to the block (by putting an ampersand before the name of the method’s last parameter) we get a reference to it wrapped in a Proc object. And to evaluate a Proc you need to call its call method.

The m3 example is interesting in another way also. It shows how blocks handle scope of variables. The block sees the variables in the context (scope) it was declared in. The block { puts x } sees the variable x declared outside of its scope and therefore can print it. And blocks are generous and can provide that kind of scope transcending service to anyone — but only if they go through a self sacrifice and change into a Proc!
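
A tiny sketch of that scope-transcending behaviour: the block below changes a local variable of the scope it was written in, even though it is executed inside another method. run_twice and count are made-up names:

```ruby
# run_twice is a hypothetical helper that simply yields two times.
def run_twice
  yield
  yield
end

count = 0
run_twice { count += 1 } # the block sees and modifies the outer count

p count # 2
```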

Proc

A Proc can be created by associating a block with a call to Proc.new (actually, any method that names its block with an ampersand parameter gets it as a Proc too). A Proc is a block associated with a context. So for example, if we have a local variable, say foo, and we use it in a block, and we send the block to a method which converts the block to a Proc, then the formerly local variable foo can be accessed from the new scope of the method. Pretty cool, huh?
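
Here is a minimal sketch of a Proc carrying its context around. make_counter and total are hypothetical names; the point is that total stays alive and reachable through the Proc after make_counter has returned:

```ruby
# make_counter is a hypothetical method that returns a Proc closing over total.
def make_counter
  total = 0
  Proc.new { |step| total += step }
end

counter = make_counter
first  = counter.call(2) # total is now 2
second = counter.call(3) # same total, now 5

p first, second
```

Each call to make_counter creates a fresh total, so two counters made this way do not share state.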

def bar
  yield( 10 )
  puts var # bam! this throws an error
end

var = 1
bar { |value| var = value }
# the block sets the outer var to 10, then puts var inside bar fails:
# => NameError: undefined local variable or method `var' for main:Object
#            from (irb):3:in `bar'
#            from (irb):6

In the above example we declare a method bar which cannot access the variable var, because var is a local variable of the outer scope and a method definition starts a fresh scope (hence the NameError). The block, however, was created in the outer scope, so it can assign to var even though bar itself cannot see it. Since the start of a method (or class) definition opens a new context, we cannot assign to var inside a method and see it change in the outer scope without the cool goodness of blocks and Procs. So only an “insane” person would try something like this:

var = 1
def bar
  var = 10
end
bar
puts var
# => 1

And expect var to be 10.

Lambda

lambda is a Kernel method (so we should write it with a lowercase l), and a call to it is almost equivalent to Proc.new, except that lambda returns a Proc which checks the number of arguments passed when called. In Ruby 1.8, which produced the output below, a wrong number of arguments for a one-parameter lambda only gives you a warning; since Ruby 1.9 it raises an ArgumentError.

l = lambda {|x| 3.times {puts x}}
l.call "hi","you"
# => (irb):2: warning: multiple values for a block parameter (2 for 1)
# => "hi"
# => "you"
# => "hi"
# => "you"
# => "hi"
# => "you"
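
On Ruby 1.9 and later the difference in argument checking is easier to see by putting a lambda and a plain Proc side by side. A small sketch under that version assumption; strict and loose are invented names:

```ruby
strict = lambda { |x| x }   # checks the argument count
loose  = Proc.new { |x| x } # silently drops extra arguments

loose_result = loose.call('hi', 'you') # "hi"

strict_error = begin
  strict.call('hi', 'you')
  nil
rescue ArgumentError => e
  e.class
end

p loose_result                 # "hi"
p strict_error                 # ArgumentError
p strict.lambda?, loose.lambda?
```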

Lambda vs Proc

From Wikipedia

Both Proc.new and lambda in this example are ways to create a closure, but semantics of the closures thus created are different with respect to the return statement.

def foo
  f = Proc.new { return "return from foo from inside proc" }
  f.call # control leaves foo here
  return "return from foo"
end

def bar
  f = lambda { return "return from lambda" }
  f.call # control does not leave bar here
  return "return from bar"
end

puts foo # prints "return from foo from inside proc"
puts bar # prints "return from bar"
