JRuby Compiler

From JRubyWiki

Jump to: navigation, search

The compiler supports both ahead-of-time (AOT) and just-in-time (JIT) compiling.

Contents

Ahead-of-time (AOT) Compilation

The typical way to run the AOT compiler is to run "jrubyc <script name>". It will output a .class file in the current dir with parent dirs and package matching where the file lives. So...

jrubyc foo/bar/test.rb

will output

foo/bar/test.class

To run, include jruby.jar in CP along with current dir (parent dir for foo/bar/test.class above), and execute like a normal Java class (named foo.bar.test):

java -cp .:/path/to/jruby.jar foo.bar.test

Just-in-time (JIT) compilation

JRuby also supports using the compiler in JIT mode, where it will attempt to compile methods as they're called. After a call threshold is reached, the JIT tries to compile the method body in question (including any blocks it contains). If the compilation succeeds, the compiled version is used from then on. Otherwise, the method is permanently marked "interpret only".

As of JRuby 0.9.9, the JIT is enabled by default with a threshold of 20. You can turn it off using the following flag.

jruby -J-Djruby.jit.enabled=false myscript.rb

Performance

A few microbenchmarks comparing Ruby, JRuby interpreted, and JRuby compiled (server VM numbers show worst and best numbers):

fib(30) Ruby:                1.67s
fib(30) JRuby interp (client VM):    3.93s
fib(30) JRuby interp (server VM):    2.28s to 2.08s
fib(30) JRuby compiled (client VM):    1.89s to 1.79s
fib(30) JRuby compiled (server VM):    1.66s to 0.86s

Tweaking and troubleshooting

There are other interesting properties of use:

jruby.compile.mode=OFF|JIT|FORCE (default JIT)
sets compilation to none, JIT, or AOT
jruby.jit.threshold=## (default 50)
sets the threshold for methods to get jitted (I usually use it to make all methods compile before execution, using threshold=0)
jruby.jit.logging=true|false (default false)
logs each method as it's compiled, so you can see what kind of coverage you're getting or if the methods you want to JIT are getting JITed.
jruby.jit.logging.verbose=true|false (default false)
logs each method that fails to compile, so you can see if compilation problems are keeping methods from JITing.

As of late September 2007, the JRuby compiler is considered complete. Any features missing are to be considered bugs, but as part of JRuby's test run the entire Ruby standard library is compiled to Java bytecode.

There are two known issues with the JRuby compiler at present, which may or may not get fixed:

  • Calling a method with a "while false" loop as its parameter will fail to execute
foo(while false; end) # results in an error
  • The retry keyword is not currently supported outside a rescue block, largely because of performance considerations, general confusion around how it's supposed to work, and due to the fact that nobody appears to use retry outside rescue.

Design

JRuby compiles Ruby code to Java bytecode. Once complete, there's no interpretation done, except for eval calls. evaluated code never gets compiled; however, if the eval defines a method that's called enough, it will also eventually get JIT compiled to bytecode. JRuby is a mixed-mode engine.

Given a single input .rb file, JRuby produces a single output .class file. This was a key design goal I wanted for the compiler; other languages (including Groovy) and other Ruby implementations (including XRuby) produce numerous classes from an input file; in some cases, dozens and dozens of classes if the input file is very large and complex. JRuby produces one .class file.

JRuby compiles from the same AST it interprets from. There is a first pass over the AST before compilation to determine certain runtime characteristics:

  • does a method have closures in it?
  • does a method have calls to eval or other scope and frame-aware methods?
  • does a method have class definitions in it?
  • does a method define other methods?
  • .... and so on

Based on this pass, we determine scoping characteristics of all code in the method, selectively choosing pure heap-based variables or pure stack-based variables. Only methods and leaf closures without eval, closures, etc can use normal stack-based local variables. Performance is significantly faster with stack variables.

The resulting class file from JRuby contains at a minimum methods to start:

  • a normal main() method for running from the command line (grabs a default JRuby runtime and launches itself)
  • a load() instance method that represents a normal top-level loading of the script into a runtime. This performs pre/post script setup and teardown.
  • a run() instance method that represents a bare execution of the script's contents. This is used by the JIT, where setup/teardown is handled outside the JITed code on a method-by-method basis
  • a __file__() method that represents the body of the script. This is where script execution eventually starts.

Then, depending on the contents of the file, additional methods are added:

  • normal method definition bodies become Java methods
  • class/module bodies become Java methods
  • closure bodies become Java methods
  • rescue/ensure bodies become synthetic methods
  • if the normal top-level script method is too long, it's split every 500 top-level syntactic elements and chained (we did run into one large flat file that broke the method size limit). We do not yet perform chaining on normal method bodies, because we have not encountered any that are too large.

Of these, only class bodies, rescue/ensure bodies, and chained top-level script methods get directly invoked during script execution. The others are bound into the MOP at runtime.

Binding occurs in one of two ways:

  • by generating a small stub class that implements DynamicMethod and invokes the target method on the target script directly
  • by doing the same with reflection

In our testing, generating stub "invoker" classes has always been faster than reflection, especially on older JVMs. For the time being, that's the preferred way to bind methods, but I'm going to get reflection-based binding working again for limited/restricted environments like applets. With reflection-based binding and pre-compiled Ruby code with no evals, JIT compilation could be completely turned off and no classes would ever be generated in memory by JRuby.

So then here's a walkthrough of a simple script:

# we enter into the script body in the __file__ method
# require would first look for .rb files, then try to load .class
require 'foo'

# normal code in the method body
puts 'here we go'

# upon encountering a method def, a new method is started in the class
def bar
  # this is a simple method body, and would use stack-based vars
  puts 'hello'
end
# once the method has been compiled, binding code is added to __file__

# class definitions become methods as well, building the class
class MyClass
  # this is code in the body of the class
  puts 'here'

  # a method in the class is compiled like any other method body
  def something(a, b = 2, *c, &block)
    # this method has all four param types:
    # normal, optional, "rest" or varargs, and block argument
    # the compiler generates code to assign these from an incoming
    # IRubyObject[]

    # this method has a closure, so it would use heap-based vars
    # ... but the closure would use stack vars, since it's a simple leaf
    1.times { puts 'in closure' }
  end
  # method is completed, bound into the class we're building
end
# end of class definition; __file__ code invokes the class body directly 

# any begin block or method body with a rescue/ensure attached will
# be compiled as a synthetic method. This also necessarily means that
# method bodies containing rescue/ensure must be heap-based.
begin
  puts 'rescue me'
rescue
  puts 'rescued!'
ensure
  puts 'ensured!'
end

A sample run of the JRuby compiler:

~/NetBeansProjects/jruby $ jruby sample_script.rb
here we go
here
rescue me
ensured!

~/NetBeansProjects/jruby $ jrubyc sample_script.rb
Compiling file "sample_script.rb" as class "sample_script"

~/NetBeansProjects/jruby $ ls -l sample_script.*
-rw-r--r--   1 headius  headius  8396 Oct  4 09:38 sample_script.class
-rw-r--r--   1 headius  headius  1449 Oct  4 09:38 sample_script.rb

~/NetBeansProjects/jruby $ export
CLASSPATH=lib/jruby.jar:lib/asm-3.0.jar:lib/jna.jar:.

~/NetBeansProjects/jruby $ java sample_script
here we go
here
rescue me
ensured!
Personal tools