Namespace - Ruby’s in-process separation of Classes and Modules

Namespace is designed to provide separated spaces in a Ruby process, to isolate applications and libraries.

Known issues

TODOs

How to use

Enabling namespace

First, set the environment variable RUBY_NAMESPACE=1 when the ruby process boots. The only value that enables namespace is 1; any other value, or leaving RUBY_NAMESPACE unset, disables it. Setting the variable after the Ruby program has started has no effect.
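The rule above (only the exact value 1 enables namespace) can be sketched as a small check. The helper name `namespace_requested?` is hypothetical and not part of Ruby; it only inspects an environment-like hash and does not enable anything by itself.

```ruby
# Hypothetical helper (an assumption for illustration, not a Ruby API):
# reports whether an environment asks for namespace to be enabled.
# Only the exact string "1" counts; anything else, or no value, disables it.
def namespace_requested?(env = ENV)
  env["RUBY_NAMESPACE"] == "1"
end

namespace_requested?({ "RUBY_NAMESPACE" => "1" })    # enabled
namespace_requested?({ "RUBY_NAMESPACE" => "true" }) # disabled: not exactly "1"
namespace_requested?({})                             # disabled: unset
```

Because the variable is read at bootup, a check like this is mainly useful before spawning a child ruby process, for example to decide whether to pass RUBY_NAMESPACE=1 in its environment.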

Using namespace

The Namespace class is the entry point for namespaces.

    ns = Namespace.new
    ns.require('something') # or require_relative, load

The required file (either .rb or .so/.dll/.bundle) is loaded in the namespace (ns here). Files required or loaded from something are also loaded in the namespace, recursively.

    # something.rb
    X = 1

    class Something
      def self.x = X
      def x = ::X
    end

Classes and modules defined in the namespace, along with their methods and constants, can be accessed via the ns object.

    p ns::Something.x # 1

    X = 2
    p X               # 2
    p ::X             # 2
    p ns::Something.x # 1
    p ns::X           # 1

Instance methods defined in the namespace also run with definitions in the namespace.

    s = ns::Something.new
    p s.x # 1

Specifications

Namespace types

There are two namespace types: the root namespace and user namespaces.

There is exactly one root namespace in a Ruby process. Ruby bootstrap runs in the root namespace, and all builtin classes/modules are defined in the root namespace. (See “Builtin classes and modules”.)

User namespaces run user-written programs and the libraries loaded from them. The user’s main program (specified by the ruby command line argument) is executed in the “main” namespace, which is a user namespace automatically created at the end of Ruby’s bootstrap, copied from the root namespace.

When Namespace.new is called, an “optional” namespace (a user, non-main namespace) is created, copied from the root namespace. All user namespaces are flat, copied from the root namespace.

Namespace class and instances

Namespace is a top-level class and a subclass of Module, so Namespace instances are a kind of Module.
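Being a kind of Module is what lets a namespace object sit on the left side of `::`. The following is only an analogy that runs without namespace enabled: `Box` is a hypothetical stand-in class, not the real Namespace, and it provides no isolation at all.

```ruby
# Box is a hypothetical stand-in for illustration; it is NOT Namespace
# and does not isolate anything. It only shows that an instance of a
# Module subclass can hold constants and be used with the :: operator.
class Box < Module; end

box = Box.new
box.const_set(:A, Class.new) # define a class as a constant of the Box instance

box.is_a?(Module) # a Box instance is a kind of Module
box::A            # resolved by ordinary constant lookup on that Module
```

The real Namespace additionally controls which definitions are visible to code loaded through it, which this stand-in does not attempt.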

Classes and modules defined in namespace

Classes and modules newly defined in a namespace ns are defined under ns. For example, if a class A is defined in ns, it is actually defined as ns::A.

In the namespace ns, ns::A can be referred to as A (and ::A). From outside of ns, it can be referred to as ns::A.

The main namespace is exceptional. Top level classes and modules defined in the main namespace are just top level classes and modules.

Classes and modules reopened in namespace

In namespaces, builtin classes/modules are visible and can be reopened. Those classes/modules can be reopened using class or module clauses, and class/module definitions can be changed.

The changed definitions are visible only in the namespace. In other namespaces, builtin classes/modules and their instances work without the changed definitions.

    # in foo.rb
    class String
      BLANK_PATTERN = /\A\s*\z/

      def blank?
        match?(BLANK_PATTERN) # returns true/false rather than an index or nil
      end
    end

    module Foo
      def self.foo = "foo"

      def self.foo_is_blank?
        foo.blank?
      end
    end

    Foo.foo.blank? #=> false
    "foo".blank?   #=> false

    # in main.rb
    ns = Namespace.new
    ns.require('foo')

    ns::Foo.foo_is_blank? #=> false   (#blank? called in ns)
    ns::Foo.foo.blank?    # NoMethodError
    "foo".blank?          # NoMethodError
    String::BLANK_PATTERN # NameError

The main namespace and ns are different namespaces, so monkey patches in main are also invisible in ns.

Builtin classes and modules

In the namespace context, “builtin” classes and modules are classes and modules:

Hereafter, “builtin classes and modules” will be referred to as just “builtin classes”.

Builtin classes referred via namespace objects

Builtin classes in a namespace ns can be referred to from other namespaces. For example, ns::String is a valid reference, and String and ns::String are identical (String == ns::String, String.object_id == ns::String.object_id).

A reference like ns::String returns just the String of the current namespace, so its definition is that of String in the current namespace, not in ns.

    # foo.rb
    class String
      def self.foo = "foo"
    end

    # main.rb
    ns = Namespace.new
    ns.require('foo')

    ns::String.foo # NoMethodError

Class instance variables, class variables, constants

Builtin classes can have different sets of class instance variables, class variables and constants between namespaces.

    # foo.rb
    class Array
      @v = "foo"
      @@v = "_foo_"
      V = "FOO"
    end

    Array.instance_variable_get(:@v) #=> "foo"
    Array.class_variable_get(:@@v)   #=> "_foo_"
    Array.const_get(:V)              #=> "FOO"

    # main.rb
    ns = Namespace.new
    ns.require('foo')

    Array.instance_variable_get(:@v) #=> nil
    Array.class_variable_get(:@@v)   # NameError
    Array.const_get(:V)              # NameError

Global variables

In namespaces, changes on global variables are also isolated in the namespace. Changes on global variables in a namespace are visible/applied only in the namespace.

    # foo.rb
    $foo = "foo"
    $VERBOSE = nil

    puts "This appears: '#{$foo}'"

    # main.rb
    p $foo     #=> nil
    p $VERBOSE #=> false

    ns = Namespace.new
    ns.require('foo') # "This appears: 'foo'"

    p $foo     #=> nil
    p $VERBOSE #=> false

Top level constants

Usually, top level constants are defined as constants of Object. In namespaces, top level constants are constants of Object in the namespace. And the constants of the namespace object ns are strictly equal to the constants of Object in that namespace.

    # foo.rb
    FOO = 100

    FOO         #=> 100
    Object::FOO #=> 100

    # main.rb
    ns = Namespace.new
    ns.require('foo')

    ns::FOO     #=> 100
    FOO         # NameError
    Object::FOO # NameError
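The starting point of this section, that a top level constant is a constant of Object, is ordinary Ruby behavior and can be checked without namespace enabled. A minimal sketch (the constant name `GREETING_LIMIT` is made up for the example):

```ruby
# Plain Ruby, no namespace involved: a constant assigned at the top level
# of a script becomes a constant of Object.
GREETING_LIMIT = 100

Object.const_defined?(:GREETING_LIMIT, false) # defined directly on Object
Object::GREETING_LIMIT                        # the same value as GREETING_LIMIT
```

With namespace enabled, the text above says the same lookup is answered by the Object of whichever namespace the code runs in, which is why FOO is not visible from main.rb in the preceding example.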

Top level methods

Top level methods are private instance methods of Object, in each namespace.

    # foo.rb
    def yay = "foo"

    class Foo
      def self.say = yay
    end

    Foo.say #=> "foo"
    yay     #=> "foo"

    # main.rb
    ns = Namespace.new
    ns.require('foo')

    ns::Foo.say #=> "foo"
    yay         # NameError

There is no way to expose top level methods in one namespace to another namespace. (See “Expose top level methods as a method of the namespace object” in the “Discussions” section below.)
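That top level methods are private instance methods of Object is regular Ruby behavior, so it can be observed without namespace enabled. A small sketch, assuming the code runs as a script file (interactive shells such as irb may treat top level definitions differently):

```ruby
# Plain Ruby, run as a script: a method defined at the top level becomes
# a private instance method of Object.
def yay = "foo" # endless method definition, Ruby 3.0 or later

Object.private_method_defined?(:yay) # defined on Object, with private visibility
Object.new.respond_to?(:yay)         # private methods are not reported by default
Object.new.send(:yay)                # still callable when visibility is bypassed
```

Since each namespace has its own view of Object's definitions, this is consistent with a top level method defined in one namespace being unreachable from another.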

Namespace scopes

Namespace works in file scope. One .rb file runs in a single namespace.

Once a file is loaded in a namespace ns, all methods/procs defined/created in the file run in ns.

Implementation details

Object Shapes

Once a builtin class is copied and modified in a namespace, its instance variable management falls back from Object Shapes to a traditional iv table (st_table), because RClass stores the shape in its flags, not in rb_classext_t.

Size of RClass and rb_classext_t

Namespace requires moving some fields from RClass to rb_classext_t, so the combined size of RClass and rb_classext_t is now larger than 4 * RVALUE_SIZE. This goes against an expectation of Variable Width Allocation.

For now, the STATIC_ASSERT that checks the size is commented out. (See “Minimize the size of RClass and rb_classext_t” in the “Discussions” section below.)

ISeq inline method/constant cache

As described above in “Namespace scopes”, an “.rb” file runs in a namespace. So method/constant resolution will be done in a namespace consistently.

That means ISeq inline caches work well even with namespaces. Otherwise, it’s a bug.

Method call global cache (gccct)

The rb_funcall() C function refers to the global cc cache table (gccct), and the cache key is calculated with the current namespace.

So, rb_funcall() calls have a performance penalty when namespace is enabled.
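As a rough mental model only, the idea of a cache key that includes the current namespace can be sketched in Ruby. Everything here is an assumption made for illustration: `ToyGlobalCallCache`, its `lookup` signature, and the symbolic namespace ids are invented, and the real gccct is a C-level table inside the VM whose layout and costs this toy does not reproduce.

```ruby
# Toy model (hypothetical, not the VM's data structure): a call cache
# whose key includes a namespace identifier, so entries are not shared
# between namespaces.
class ToyGlobalCallCache
  attr_reader :misses

  def initialize
    @table = {}
    @misses = 0
  end

  # namespace_id stands in for "the current namespace" at the call site.
  def lookup(namespace_id, klass, mid)
    key = [namespace_id, klass, mid]
    @table.fetch(key) do
      @misses += 1
      @table[key] = klass.instance_method(mid) # slow path: resolve and store
    end
  end
end

cache = ToyGlobalCallCache.new
cache.lookup(:main, String, :upcase) # miss: first lookup in :main
cache.lookup(:main, String, :upcase) # hit: same namespace, class, and method
cache.lookup(:ns1, String, :upcase)  # miss: same method, different namespace
cache.misses                         # 2
```

The toy only shows that every lookup has to know the current namespace before it can form the key; it makes no claim about how large the actual penalty is.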

Current namespace and loading namespace

The current namespace is the namespace that the executing code is in. Namespace.current returns the current namespace object.

The loading namespace is an internally managed namespace that determines which namespace newly required/loaded files are loaded into. For example, ns is the loading namespace when ns.require("foo") is called.

Discussions

Namespace#inspect

Currently, Namespace#inspect returns values like "#<Namespace:0x00000001083a5660>". This makes classpaths shown outside the namespace redundant and hard to read.

    # foo.rb
    class C; end

    # main.rb
    ns = Namespace.new
    ns.require('foo')

    p ns::C # "#<Namespace:0x00000001083a5660>::C"

Currently, if a namespace is assigned to a constant NS1, the classpath output is NS1::C. But the namespace object can be passed to another namespace, where the constant NS1 may refer to something different. So a constant-based classpath for namespaces is fundamentally unsafe.

So we should find a better format to show namespaces. Options are:

Namespace#eval

Testing namespace features requires creating files to be loaded in namespaces, which is neither easy nor convenient.

If the Namespace class had an instance method eval that evaluates code in the namespace, it would be helpful.
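Until something like that exists, the file-creation step can be hidden behind a small helper. This is a minimal sketch, not a proposed implementation of Namespace#eval: `load_source_into` is a hypothetical name, and it relies only on the loader responding to `load`, which a namespace object does according to “Using namespace” above. The demonstration passes Kernel as the loader so that it also runs with namespace disabled.

```ruby
require "tempfile"

# Hypothetical helper (not part of Ruby): writes +source+ to a temporary
# .rb file and asks +loader+ to load it. +loader+ is any object that
# responds to #load, for example a Namespace instance, or Kernel when
# namespace is disabled.
def load_source_into(loader, source)
  Tempfile.create(["ns_eval", ".rb"]) do |file|
    file.write(source)
    file.flush # make sure the content is on disk before loading
    loader.load(file.path)
  end
end

# Without namespace, Kernel.load runs the code in the ordinary global context.
load_source_into(Kernel, "$ns_eval_demo = 41 + 1")
$ns_eval_demo # 42
```

With namespace enabled, passing a Namespace instance instead of Kernel would be expected to load the temporary file inside that namespace, though that path is not exercised here.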

More builtin methods written in Ruby

If namespace is enabled by default, more builtin methods can be written in Ruby, because they can’t be overridden by users’ monkey patches. Builtin methods written in Ruby can be JIT-compiled, which could improve performance.

Monkey patching methods called by builtin methods

Builtin methods sometimes call other builtin methods. For example, Hash#map calls Hash#each to retrieve entries to be mapped. Without namespace, Ruby users can overwrite Hash#each and expect the behavior of Hash#map to change as a result.

But with namespaces, Hash#map runs in the root namespace. Ruby users can define Hash#each only in user namespaces, so users cannot change Hash#map’s behavior this way. To achieve it, users must override both Hash#map and Hash#each (or only Hash#map).
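The dependency of map on each can be observed in plain Ruby without namespace. To avoid patching Hash globally, this sketch overrides each in a subclass instead; `CountingHash` is a made-up name for the example.

```ruby
# Plain Ruby, no namespace: map reaches the entries through #each,
# so an overridden #each is picked up by map.
class CountingHash < Hash
  attr_reader :each_calls

  def each(&block)
    @each_calls = (@each_calls || 0) + 1 # record that map went through #each
    super
  end
end

h = CountingHash.new
h[:a] = 1

result = h.map { |key, value| [key, value * 10] }
result       # [[:a, 10]]
h.each_calls # 1
```

This is the kind of behavior change users can rely on today.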

It is a breaking change.

It’s an option to change the behavior of methods in the root namespace to refer to definitions in user namespaces. But if we do so, that means we can’t proceed with “More builtin methods written in Ruby”.

Context of $LOAD_PATH and $LOADED_FEATURES

The global variables $LOAD_PATH and $LOADED_FEATURES control the behavior of the require method. So the namespace whose values of those variables are used is determined by the loading namespace instead of the current namespace.

This could potentially conflict with the user’s expectations. We should find a solution.
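How these two variables steer require can be seen in plain Ruby, without namespace. The sketch below uses a temporary directory and a made-up feature name, `load_path_demo_feature`; it shows $LOAD_PATH being searched and $LOADED_FEATURES preventing a second load.

```ruby
require "tmpdir"

first_result = second_result = recorded = nil

Dir.mktmpdir do |dir|
  # A throwaway feature file; the name is invented for this example.
  File.write(File.join(dir, "load_path_demo_feature.rb"), "LOAD_PATH_DEMO = :loaded\n")

  $LOAD_PATH.unshift(dir)                           # require searches this directory
  first_result  = require "load_path_demo_feature"  # true: found and loaded
  second_result = require "load_path_demo_feature"  # false: already recorded as loaded
  recorded = $LOADED_FEATURES.any? { |f| f.end_with?("load_path_demo_feature.rb") }
  $LOAD_PATH.delete(dir)                            # restore the search path
end
```

With namespace enabled, the question raised in this section is which namespace's copies of these two variables a given require call consults.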

Expose top level methods as a method of the namespace object

Currently, top level methods in a namespace are not accessible from outside of the namespace. But there might be a use case for calling another namespace’s top level methods.

Split root and builtin namespace

NOTE: “builtin” namespace is a different one from the “builtin” namespace in the current implementation

Currently, the single “root” namespace is the source of classext CoW. Also, the “root” namespace can load additional files after main script evaluation has started, by calling methods which contain lines like require "openssl".

That means user namespaces can have different sets of definitions depending on when they are created.

    [root]
     |
     |----[main]
     |
     |(require "openssl" called in root)
     |
     |----[ns1] having OpenSSL
     |
     |(remove_const called for OpenSSL in root)
     |
     |----[ns2] without OpenSSL

This could cause unexpected behavior differences between user namespaces. It should NOT be a problem, because user scripts which refer to OpenSSL should call require "openssl" themselves. But in the worst case, a script (without require "openssl") runs well in ns1 but doesn’t run in ns2. This situation looks like a “random failure” to users.

One possible option to prevent this situation is to have separate “root” and “builtin” namespaces.

This design realizes a consistent source of namespace CoW.

Separate cc_tbl and callable_m_tbl, cvc_tbl for less classext CoW

The fields of rb_classext_t contain several cache(-like) data: cc_tbl (call cache table), callable_m_tbl (table of resolved complemented methods), and cvc_tbl (class variable cache table).

The classext CoW is triggered when the contents of rb_classext_t are changed, including cc_tbl, callable_m_tbl, and cvc_tbl. But those three tables are changed by just calling methods or referring to class variables. So, currently, classext CoW is triggered far more often than originally expected.

If we can move those three tables outside of rb_classext_t, far fewer copies of rb_classext_t will be made than in the current implementation.

Object Shapes per namespace

Now the classext CoW requires RClass and rb_classext_t to fall back from Object Shapes to the traditional st_table for instance variable management. This may have a performance penalty.

If we can apply Object Shapes to rb_classext_t instead of RClass, per-namespace classext can have its own shapes, which may avoid the performance penalty.

Minimize the size of RClass and rb_classext_t

As described in the “Size of RClass and rb_classext_t” section above, the size of RClass and rb_classext_t is currently larger than 4 * RVALUE_SIZE (20 * VALUE_SIZE). Now the size is 23 * VALUE_SIZE + 7 bits.
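The figures above can be worked through numerically. The sketch assumes a 64-bit build where VALUE_SIZE is 8 bytes; that value is an assumption for illustration, while the ratios (4 * RVALUE_SIZE equals 20 * VALUE_SIZE, current size 23 * VALUE_SIZE plus 7 bits) come from the text.

```ruby
# Worked arithmetic under the assumption of a 64-bit build.
VALUE_SIZE  = 8              # bytes; assumption, not stated in the text
RVALUE_SIZE = 5 * VALUE_SIZE # follows from 4 * RVALUE_SIZE == 20 * VALUE_SIZE

limit_bytes   = 4 * RVALUE_SIZE  # 160 bytes allowed by the assertion
current_bytes = 23 * VALUE_SIZE  # 184 bytes, plus 7 additional bits
excess_words  = (current_bytes - limit_bytes) / VALUE_SIZE # 3 VALUE-sized words over
```

Under that assumption, satisfying the assertion means freeing at least three VALUE-sized fields as well as the 7 extra bits.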

The fields that could possibly be removed from rb_classext_t are:

If we can move or remove those fields, the size satisfies the assertion (<= 4 * RVALUE_SIZE).