
Ruby loading and requiring files, constant name resolution

This article started as my own research on a slightly different theme, Rails autoloading, but I couldn’t describe that without saying a word about Ruby itself. Here we’ll talk about how Ruby loads and requires files and how it resolves constant names, and then we’ll switch to Rails autoloading. There’s a fair amount of information about all these topics on the internet, so sometimes I’ll overlap with it and sometimes not. I don’t claim any of this is unique, but I wanted to sum it up in one big article. So let’s get started with Ruby.

Constant definition:

A constant in Ruby is like a variable, except that its value is supposed to
remain constant for the duration of a program. The Ruby interpreter does not
actually enforce the constancy of constants, but it does issue a warning if a
program changes the value of a constant. Lexically, the names of constants look
like the names of local variables, except that they begin with a capital letter.
By convention, most constants are written in all uppercase with underscores to
separate words, LIKE_THIS. Ruby class and module names are also constants, but
they are conventionally written using initial capital letters and camel case,
LikeThis.

The Ruby Programming Language: David Flanagan; Yukihiro Matsumoto.

I think that’s clear, and indeed we see a warning if we try to change a constant:

A = 'a'
A = 'b'
#./a.rb:2: warning: already initialized constant A
#./a.rb:1: warning: previous definition of A was here

The same thing for classes:

class A; end
A = 'b'
#./a.rb:2: warning: already initialized constant A
#./a.rb:1: warning: previous definition of A was here

The constant A is just a reference to the class object (remember, a class is an object in Ruby), so when we try to reassign it a new value we see this warning. Now that we know what a constant is, let’s move on to requiring files.

Loading and requiring files

We cannot place all the code in one single file; it would be too long and hard to read. Usually we put one class per file and use a few different methods to ‘concatenate’ the files. Here they are: require, require_relative, load, autoload. Let’s start with the first one.

Kernel#require(name) loads the given name, returning true if successful and false if the feature is already loaded. If the filename does not resolve to an absolute path, it is searched for in the directories listed in $LOAD_PATH. Any constants or globals defined in the loaded file become available in the calling program’s global namespace, but local variables are not propagated to the loading environment. require can also load native extensions (.so, .dll, or whatever the current platform uses). If you omit the extension, Ruby tries .rb first and then the native extension suffixes. The absolute path of the loaded file is added to $LOADED_FEATURES, and a file is not loaded again if its path already appears there. Kernel#require_relative(name) works almost the same way, but it resolves name relative to the directory of the file that contains the call instead of searching $LOAD_PATH.

Example with require:

# a.rb
# module A
#   C = 'constant'
# end

before = $LOADED_FEATURES.dup
# Assumes the directory containing a.rb is in $LOAD_PATH (for example, run with ruby -I.)
require 'a'
$LOADED_FEATURES - before # => ['/Users/route/Projects/dependencies/a.rb']

A::C # => 'constant'
sleep 15 # Meanwhile, edit a.rb and change the constant's value to 'changed'

require 'a'

A::C # => 'constant'
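
The return values of require and the role of $LOAD_PATH can be checked with a throwaway file. This is only a minimal sketch: the temporary directory, the file name greeting_demo.rb and the module GreetingDemo are made up for the illustration and are not part of the original example.

```ruby
require 'tmpdir'

# Hypothetical file and module names, used only for this sketch.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'greeting_demo.rb')
  File.write(path, "module GreetingDemo\n  TEXT = 'hi'\nend\n")

  $LOAD_PATH.unshift(dir)            # make the bare name resolvable
  first  = require 'greeting_demo'   # found through $LOAD_PATH and executed
  second = require 'greeting_demo'   # already in $LOADED_FEATURES, so skipped
  $LOAD_PATH.delete(dir)

  puts first                         # true
  puts second                        # false
  puts GreetingDemo::TEXT            # hi
  puts $LOADED_FEATURES.any? { |f| f.end_with?('greeting_demo.rb') } # true
end
```

The file is removed together with the temporary directory when the block ends, but the constant it defined stays available.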

Kernel#load(filename, wrap=false) loads and executes the Ruby program in the file filename. If the filename does not resolve to an absolute path, the file is searched for in $: (an alias of $LOAD_PATH). If the optional wrap parameter is true, the loaded script is executed under an anonymous module, protecting the calling program’s global namespace. load can also execute the same file many times, because it doesn’t consult $LOADED_FEATURES. Notice that load needs the filename extension.

Example with load:

# a.rb
# module A
#   C = 'constant'
# end

before = $LOADED_FEATURES.dup
load './a.rb'
$LOADED_FEATURES - before # => []

A::C # => 'constant'
sleep 15 # Meanwhile, edit a.rb and change the constant's value to 'changed'

load './a.rb'

# ./a.rb:2: warning: already initialized constant A::C
# ./a.rb:2: warning: previous definition of C was here
A::C # => 'changed'

There are warnings, but the code was reloaded and we can see the changes we’ve made. Let’s add the optional wrap parameter:

Example with load and wrap:

# a.rb
# module A
#   C = 'constant'
# end
#
# $A = A

load './a.rb', true

A::C # => uninitialized constant A (NameError)
$A::C # => 'constant'

You can see that Ruby hasn’t polluted the global namespace: it wrapped all the constants from the file in an anonymous module, while global variables can still be retrieved.

Kernel#autoload(module, filename) registers filename to be loaded (using Kernel::require) the first time that module (String or a Symbol) is accessed.

Example 1 with autoload:

# a.rb
# module A
#   p 'loading'
# end

autoload :A, './a'

This prints nothing, because we’ve only declared that the constant A can be found in a file; we’ve never used it.

Example 2 with autoload:

# a.rb
# module A
#   p 'loading'
# end

autoload :A, './a'
A # => Gives output 'loading'

In other words, autoload lets us load code lazily, on demand, which reduces boot time. There used to be thread-safety problems with autoload, and there was a rumor that it would be deprecated, but I haven’t found any information on what the Ruby core team finally decided. The bug has been fixed, though, and as far as I can tell it now works properly even with threads:

# a.rb
# module A
#   sleep 5
#   def self.hello
#     'hello'
#   end
# end

autoload :A, './a'

t1 = Thread.new { A.hello }
t2 = Thread.new { A.hello }
t1.join; t2.join

I expected the second thread to raise an error saying that the method hello is undefined: the first thread starts loading module A, the sleep causes a thread switch, and the second thread could then see a half-defined A. But it worked.
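
The lazy behaviour of autoload can also be observed with Module#autoload?, which returns the registered path while the constant is still pending and nil once it has been loaded. A minimal sketch, assuming a temporary file; the names lazy_demo.rb and LazyDemo are hypothetical:

```ruby
require 'tmpdir'

# Hypothetical file and module names, used only for this sketch.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'lazy_demo.rb')
  File.write(path, "module LazyDemo\n  VALUE = 42\nend\n")

  Object.autoload(:LazyDemo, path)              # register only, nothing is loaded yet
  puts Object.autoload?(:LazyDemo) == path      # true: still pending

  puts LazyDemo::VALUE                          # 42, the first access triggers the require

  puts Object.autoload?(:LazyDemo).inspect      # nil: no longer pending
end
```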

Constant resolution

I find this example very thorough, and I won’t describe it much because the code speaks for itself:

module Kernel
  # Constants defined in Kernel
  A = B = C = D = E = F = 'defined in kernel'
end

# Top-level or 'global' constants defined in Object
A = B = C = D = E = 'defined at top-level'

class Super
  # Constants defined in a superclass
  A = B = C = D = 'defined in superclass'
end

module Included
  # Constants defined in an included module
  A = B = C = 'defined in included module'
end

module Enclosing
  # Constants defined in an enclosing module
  A = B = 'defined in enclosing module'

  class Local < Super
    include Included

    # Locally defined constant
    A = 'defined locally'

    # The list of modules searched, in the order searched
    # [Enclosing::Local, Enclosing, Included, Super, Object, Kernel, BasicObject]
    # (Module.nesting + self.ancestors + Object.ancestors).uniq
    puts A  # Prints "defined locally"
    puts B  # Prints "defined in enclosing module"
    puts C  # Prints "defined in included module"
    puts D  # Prints "defined in superclass"
    puts E  # Prints "defined at top-level"
    puts F  # Prints "defined in kernel"
  end
end

So the path Ruby follows to resolve a constant name starts with Module.nesting, which begins with the innermost module or class and continues with each enclosing one in turn. If the constant cannot be found there, Ruby walks the ancestors chain of the currently open class.
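
The same formula can be evaluated for any class to see an approximate search order. This is a sketch rather than a description of the interpreter’s internals, and the names Outer, Mixin, Base, Inner and LOOKUP_ORDER are invented for the illustration:

```ruby
module Outer
  module Mixin; end
  class Base; end

  class Inner < Base
    include Mixin

    # Approximate search order for a constant referenced at this point:
    # lexical scopes first, then the ancestors of the open class.
    LOOKUP_ORDER = (Module.nesting + ancestors + Object.ancestors).uniq
  end
end

# The list starts with Outer::Inner, Outer, Outer::Mixin, Outer::Base
# and continues with Object and its ancestors.
puts Outer::Inner::LOOKUP_ORDER.inspect
```

Note that only the constants defined directly in each lexical scope are consulted during the first step; the ancestors of enclosing modules are not searched.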

Known pitfalls:

1) Nesting:

We can define a new class or module in two different ways:

module A
  module B; end
end

or

module A; end
module A::B; end

Note that Module.nesting differs between these two forms, and as a result constant name resolution differs too:

module A
  C = 'c'
end

module A
  module B
    C # => 'c'
    Module.nesting # => [A::B, A]
  end
end

module A::B
  C # => NameError: uninitialized constant A::B::C
  Module.nesting # => [A::B]
end

2) Inheritance:

Remember that constant lookup uses the class or module that is lexically open at the point where the constant is referenced, as determined by class and module statements.

class Parent
  CONST = 'parent'

  def self.const
    CONST
  end
end

class Child < Parent
  CONST = 'child'
end

Child.const # => 'parent'

In this example the method is defined inside class Parent, so Parent is the innermost open class when CONST is resolved, no matter which class receives the call. To change this you can write self::CONST, which explicitly says: find the constant in self, where self is Child when we call Child.const.
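
The difference between the two forms can be put side by side. A minimal sketch; the method names lexical and dynamic are hypothetical and only label the two lookups:

```ruby
class Parent
  CONST = 'parent'

  def self.lexical
    CONST         # resolved where the method is written: inside Parent
  end

  def self.dynamic
    self::CONST   # resolved against the receiver at call time
  end
end

class Child < Parent
  CONST = 'child'
end

puts Child.lexical   # parent
puts Child.dynamic   # child
puts Parent.dynamic  # parent
```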

3) Object::

Module.nesting == [] at the top level, so constant lookup starts at the currently opened class, which is Object, and continues with its ancestors:

class Object
  module C; end
end
C == Object::C # => true

or

module C; end
Object::C == C # => true

This in turn explains why top-level constants are available throughout your program. Almost all classes in Ruby inherit from Object, so Object is almost always included in the list of ancestors of the currently open class, and thus its constants are almost always available. That said, if you’ve ever used a BasicObject and noticed that top-level constants are missing, you now know why. Because BasicObject does not subclass Object, none of the top-level constants are in the lookup chain:

class Foo < BasicObject
  Kernel
end
# NameError: uninitialized constant Foo::Kernel

For cases like this, and anywhere else you want to be explicit, Ruby allows you to use ::Kernel to access Object::Kernel.
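
A short sketch of both lookups inside a BasicObject subclass; the class name BareProxy and its method names are invented for the illustration:

```ruby
class BareProxy < BasicObject
  def kernel_via_root
    ::Kernel   # the leading :: starts the lookup at Object
  end

  def kernel_bare
    Kernel     # searched only in BareProxy and BasicObject
  end
end

proxy = BareProxy.new
puts proxy.kernel_via_root.inspect   # Kernel

begin
  proxy.kernel_bare
rescue NameError => e
  puts e.message                     # uninitialized constant BareProxy::Kernel
end
```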

4) class_eval, module_eval, instance_eval, define_method:

As mentioned above, constant lookup uses the currently opened class, as determined by class and module statements. Importantly, passing a block to class_eval, module_eval, instance_eval or define_method does not change constant lookup: the block keeps using the lookup that applied at the point where it was defined:

class A
  module B; end
end

class C
  module B; end
  A.class_eval { B } == C::B # => true
end

Confusingly, however, if you pass a String to these methods, then the String is evaluated with the class/module itself (for class_eval or module_eval) or the singleton class of the object (for instance_eval) as the innermost entry of Module.nesting, so the lookup starts there:

module A
  module B; end
end

module C
  module B; end
  A.module_eval("B") == A::B # => true
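end

The module that lexically encloses the receiver is not part of that nesting, so its constants are not visible either:

module A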
  X = 1
  module B; end
end

module C
  module B; end
  A::B.module_eval("X") # => uninitialized constant A::B::X (NameError)
end

5) Singleton class:

If you’re in a singleton class of a class, you don’t get access to constants defined in the class itself:

class A
  module B; end
end
class << A
  B # => uninitialized constant Class::B
end

This is because the ancestors of the singleton class of a class do not include the class itself; apart from singleton classes, they start at the Class class.

class A
  module B; end
end
class << A
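  ancestors # => [Class, Module, Object, Kernel, BasicObject]
  # Ruby 2.1 and later also list the singleton classes first:
  # [#<Class:A>, #<Class:Object>, #<Class:BasicObject>, Class, Module, Object, Kernel, BasicObject]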
end

Lastly, imagine we access a constant that isn’t defined at all. In that case const_missing is invoked on the class or module in which the lookup failed; if that class doesn’t define its own self.const_missing, the default implementation inherited from Module is used (A.class.superclass # => Module). It accepts a single argument, const_name, which is the name of the constant we’re looking for. By default this method simply raises NameError: uninitialized constant #{const_name}. That’s all for Ruby; let’s move on to the more interesting part, Rails autoloading.
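
To see the hook in action, here is a minimal sketch that overrides const_missing to define the missing constant on the fly. The class name Registry and the constant Plugin are hypothetical; this only illustrates the hook and is not how Rails implements autoloading.

```ruby
class Registry
  # Called by Ruby with the missing constant's name as a Symbol.
  def self.const_missing(const_name)
    puts "const_missing called with #{const_name.inspect}"
    const_set(const_name, Module.new)   # define it so the next lookup succeeds
  end
end

first  = Registry::Plugin   # triggers const_missing
second = Registry::Plugin   # already defined, the hook is not called again
puts first.equal?(second)   # true

begin
  Object.const_get(:SurelyUndefinedConstantForDemo)   # default behaviour
rescue NameError => e
  puts e.message
end
```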
