How the Ruby debugger Byebug works with the TracePoint API
In the series ‘How it works’ I look at awesome code written by others. This week I look at Byebug, a Ruby 2.0 debugger gem.
What is Byebug?
Byebug is an amazing Ruby 2.0 debugger gem. It allows for debugging of Ruby 2.0 code using the
TracePoint API. It is a gem with native extensions, meaning its C code has to be compiled before
it can be used. We are looking at Byebug 1.8.2 in this post.
Why am I reviewing Byebug?
Byebug is the first Ruby 2.0 debugger. It’s elegant, easy to use, and written by an amazing developer, David Rodriguez. It’s the default reference project for anyone who wants to write a native, well-tested gem.
Terminology
- REPL - The debugger, the inspector, the prompt where you type commands after the debugger has hit a breakpoint. REPL stands for read-eval-print loop.
Overview
Byebug consists of a couple of parts.
- A C extension that hooks into the TracePoint API and deals with low-level concepts such as stack frames.
- A breakpoint and catchpoint system.
- A Ruby support library which acts as a bridge to the C API and allows for command processing and a REPL.
- A beautiful test suite.
Personally, I find the code and the library absolutely stunning. It’s both beautiful and elegant.
The gem file structure
The command tree produces the following file structure. I’ve removed a lot of files, but this is
the basic outline.
.
├── Rakefile
├── bin
│   └── byebug
├── byebug.gemspec
├── ext
│   └── byebug
│       ├── Makefile
│       ├── breakpoint.c
│       ├── breakpoint.o
│       ├── byebug.bundle
│       ├── byebug.c
│       ├── byebug.h
│       ├── byebug.o
│       ├── context.c
│       ├── context.o
│       ├── extconf.rb
├── lib
│   ├── byebug
│   │   ├── command.rb
│   │   ├── commands
│   │   │   ├── breakpoints.rb
│   │   │   ├── ....
│   │   ├── context.rb
│   │   ├── remote.rb
│   │   └── ....
│   ├── byebug.bundle
│   └── byebug.rb
├── logo.png
└── test
    ├── examples
    │   ├── breakpoint.rb
    │   └── variables.rb
    ├── stepping_test.rb
    ├── support
    │   ├── breakpoint.rb
    │   ├── context.rb
    │   ├── matchers.rb
    │   ├── test_dsl.rb
    │   └── test_interface.rb
    └── variables_test.rb

10 directories, 131 files
Let’s go over them quickly:
- The Rakefile contains tasks to build and test the gem. rake compile compiles the gem, using the rake-compiler gem to compile native extensions. The output of the C extension goes to ./lib/byebug.bundle, after which require will load it. Check my other article on making a native gem for details on how this works.
- The byebug.gemspec contains the dependencies used and the specification for Byebug. You can see it's a native gem because it has s.extensions defined.
require File.dirname(__FILE__) + '/lib/byebug/version'

Gem::Specification.new do |s|
  s.name = 'byebug'
  s.version = Byebug::VERSION
  s.authors = ['David Rodriguez', 'Kent Sibilev', 'Mark Moseley']
  s.email = '[email protected]'
  s.license = 'BSD'
  s.homepage = 'http://github.com/deivid-rodriguez/byebug'
  s.summary = %q{Ruby 2.0 fast debugger - base + cli}
  s.description = %q{Byebug is a Ruby 2.0 debugger. It's implemented using the
    Ruby 2.0 TracePoint C API for execution control and the Debug Inspector C
    API for call stack navigation. The core component provides support that
    front-ends can build on. It provides breakpoint handling and bindings for
    stack frames among other things and it comes with an easy to use command
    line interface.}

  s.required_ruby_version = '>= 2.0.0'

  s.files = `git ls-files`.split("\n")
  s.test_files = `git ls-files -- test/*`.split("\n")
  s.executables = ['byebug']
  s.extra_rdoc_files = ['README.md']
  s.extensions = ['ext/byebug/extconf.rb']

  s.add_dependency "columnize", "~> 0.3.6"
  s.add_dependency "debugger-linecache", '~> 1.2.0'

  s.add_development_dependency 'rake', '~> 10.1.0'
  s.add_development_dependency 'rake-compiler', '~> 0.9.1'
  s.add_development_dependency 'mocha', '~> 0.14.0'
end
You can see Byebug depends on rake, rake-compiler and mocha as development dependencies, and on debugger-linecache (responsible for caching lines of code for showing context) and columnize (responsible for showing information in columns) as runtime dependencies.
Let’s continue:
- The ./lib/byebug directory contains the debugging support framework written in Ruby. It handles command processing for the REPL and contains the code for managing breakpoints.
- The ./test directory contains the tests for the gem.
The debugging process explained
The debug process uses a built-in Ruby API called TracePoint, which allows you to hook into the Ruby interpreter and its execution process.
You can register a callback (hook) that runs whenever a Ruby line gets executed or when a certain event happens, such as an exception being raised or a method returning. Byebug works by hooking into these events using the TracePoint API.
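As a minimal sketch of such callbacks in plain Ruby (the traced_method name and the events array are only illustrative and not part of Byebug), a single TracePoint can subscribe to several event types at once:

```ruby
events = []

# Subscribe to three event types: every executed line, every method
# return, and every raised exception.
trace = TracePoint.new(:line, :return, :raise) do |tp|
  events << [tp.event, tp.lineno]
end

# Hypothetical method, only here to give the tracer something to observe.
def traced_method
  raise ArgumentError, "boom"
rescue ArgumentError
  :rescued
end

trace.enable
traced_method
trace.disable

events.each { |event, lineno| puts "#{event} at line #{lineno}" }
```

Byebug does the same thing from C and, as shown later, uses separate tracepoints with separate handlers rather than one shared block.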
The basic process (highly oversimplified) works like this:
- You run your regular ruby program.
- You require the Byebug gem.
- Byebug registers a couple of TracePoint events using the TracePoint API from C. (In pseudo code: a ‘call-this-hook-on-every-executed-ruby-line’ event and a ‘call-this-hook-when-an-exception-occurs’ event.)
- It starts the TracePoint API and the Byebug hook code gets called on every line.
- It checks whether there are breakpoints defined for that line and file, and if so it breaks into the debug REPL and gives you a prompt, where it waits for commands.
- Additionally, when an exception happens it breaks into the REPL as well.
Something to understand is that the code executed when a TracePoint is hit is not itself being traced. This is a good thing, or else we would end up in some weird tracepoint’ception.
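A small experiment makes this visible. The sketch below is illustrative only (record and greet are made-up names): the hook calls a helper method, and if hook code were traced we would see line events whose method_id is :record.

```ruby
seen_methods = []

# Hypothetical helper that the hook itself calls. If hook code were traced,
# its lines would be reported with method_id :record.
def record(list, tp)
  list << tp.method_id
  list.size
end

trace = TracePoint.new(:line) { |tp| record(seen_methods, tp) }

# Ordinary traced code.
def greet
  name = "world"
  "hello #{name}"
end

trace.enable
greet
trace.disable

p seen_methods.uniq
```

The recorded list contains :greet (and nil for top-level lines), but never :record, because Ruby suspends tracing while a hook is running.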
I’ve built a small pseudo debugger in Ruby to explain the concept. Save it as tinydebug.rb
and run it via ruby tinydebug.rb
state = :break # :break prompts on every line; :step runs until the saved stack depth is reached again
size = 0       # stack depth saved when stepping over

# Here we hook into the TracePoint API; this block gets executed on every line.
trace = TracePoint.new(:line) do |tp|
  lines = File.read(tp.path).split(/\n/)
  line = lines[tp.lineno - 1]
  puts "#{tp.path}: #{tp.lineno} - #{line}"
  p tp.binding.eval('local_variables')

  # When stepping over, break again once we are back at (or above) the saved depth.
  state = :break if state == :step && caller.size <= size

  if state == :break
    action = $stdin.gets.to_s.strip
    puts "use n,s,bt" unless %w(s bt n).include? action
    puts caller if action == 'bt' # print the backtrace, then continue
    if action == 's'
      state = :step
      size = caller.size
    end
  end
end

# From here on we enable the TracePoint API
trace.enable

puts "Use n to execute next, and s to step over a method"

def myfunc
  a = "im a local val"
  puts "Hey i am in a method"
  puts "I'm in a method"
end

puts "line one"
myfunc
puts "line two"
How the C extension works
Part of Byebug is written in C, primarily for the following reasons:
- Speed. It’s fast. Breakpoint checking and the TracePoint callbacks are done in C.
- Low level access. Some things like binding-as-caller are not accessible from Ruby and therefore done in C.
Let's take a quick glance at some interesting lines and files in the C code.
In byebug.c, the function Init_byebug gets called when the C library gets loaded.
void
Init_byebug()
{
  mByebug = rb_define_module("Byebug");
  rb_define_module_function(mByebug, "setup_tracepoints",
                            Byebug_setup_tracepoints, 0);
  rb_define_module_function(mByebug, "remove_tracepoints",
                            Byebug_remove_tracepoints, 0);
  rb_define_module_function(mByebug, "context", Byebug_context, 0);
  rb_define_module_function(mByebug, "breakpoints", Byebug_breakpoints, 0);
  rb_define_module_function(mByebug, "add_catchpoint",
                            Byebug_add_catchpoint, 1);
  rb_define_module_function(mByebug, "catchpoints", Byebug_catchpoints, 0);
  rb_define_module_function(mByebug, "_start", Byebug_start, 0);
  rb
  ... SNIP ..
}
You can see here that it defines a module named Byebug and registers the module functions that the Ruby side can call.
If we take a look at Byebug_start, you can see that it will set up the tracepoints in case they have not yet
been registered.
static VALUE
Byebug_start(VALUE self)
{
  VALUE result;

  if (BYEBUG_STARTED)
    result = Qfalse;
  else
  {
    Byebug_setup_tracepoints(self);
    result = Qtrue;
  }

  if (rb_block_given_p())
    rb_ensure(rb_yield, self, Byebug_stop, self);

  return result;
}
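For readers who don't read C, here is a rough Ruby analogue of the control flow in Byebug_start. It is a sketch only: the TinyTracer module and its methods are hypothetical stand-ins and not Byebug's API, and the empty :line tracepoint merely takes the place of the real setup.

```ruby
# Hypothetical stand-in for the C-level state; not Byebug's real API.
module TinyTracer
  @trace = nil

  def self.started?
    !@trace.nil?
  end

  # Same shape as Byebug_start: returns false when tracing was already set
  # up and true when this call set it up. With a block, tracing is stopped
  # when the block finishes, even if it raises (the role rb_ensure plays).
  def self.start
    result =
      if started?
        false
      else
        @trace = TracePoint.new(:line) { |tp| } # placeholder hook
        @trace.enable
        true
      end
    return result unless block_given?

    begin
      yield
    ensure
      stop
    end
    result
  end

  def self.stop
    @trace.disable if @trace
    @trace = nil
  end
end

p TinyTracer.start # first call sets up tracing
p TinyTracer.start # second call finds it already running
TinyTracer.stop
```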
You can see here which tracepoints are registered: in short, ‘line execution’, ‘call’ (including block calls and class definitions), ‘C call’, ‘return’, ‘C return’ and ‘exception raised’ events, each with its own handler.
static VALUE
Byebug_setup_tracepoints(VALUE self)
{
  if (catchpoints != Qnil) return Qnil;

  breakpoints = rb_ary_new();
  catchpoints = rb_hash_new();
  context = context_create();

  tpLine = rb_tracepoint_new(Qnil,
                             RUBY_EVENT_LINE,
                             process_line_event, NULL);
  tpCall = rb_tracepoint_new(Qnil,
                             RUBY_EVENT_CALL | RUBY_EVENT_B_CALL | RUBY_EVENT_CLASS,
                             process_call_event, NULL);
  tpCCall = rb_tracepoint_new(Qnil,
                              RUBY_EVENT_C_CALL,
                              process_c_call_event, NULL);
  tpReturn = rb_tracepoint_new(Qnil,
                               RUBY_EVENT_RETURN | RUBY_EVENT_B_RETURN | RUBY_EVENT_END,
                               process_return_event, NULL);
  tpCReturn = rb_tracepoint_new(Qnil,
                                RUBY_EVENT_C_RETURN,
                                process_c_return_event, NULL);
  tpRaise = rb_tracepoint_new(Qnil,
                              RUBY_EVENT_RAISE,
                              process_raise_event, NULL);

  rb_tracepoint_enable(tpLine);
  rb_tracepoint_enable(tpCall);
  rb_tracepoint_enable(tpCCall);
  rb_tracepoint_enable(tpReturn);
  rb_tracepoint_enable(tpCReturn);
  rb_tracepoint_enable(tpRaise);

  return Qnil;
}
The actual processing of the lines happens in the function process_line_event.
static void
process_line_event(VALUE trace_point, void *data)
{
  EVENT_SETUP;
  VALUE breakpoint = Qnil;
  VALUE file = rb_tracearg_path(trace_arg);
  VALUE line = rb_tracearg_lineno(trace_arg);
  VALUE binding = rb_tracearg_binding(trace_arg);
  int moved = 0;

  EVENT_COMMON();

  if (dc->stack_size == 0) dc->stack_size++;

  if (dc->last_line != rb_tracearg_lineno(trace_arg) ||
      dc->last_file != rb_tracearg_path(trace_arg))
  {
    moved = 1;
  }

  if (RTEST(tracing))
    call_at_tracing(context, dc, file, line);

  if (moved || !CTX_FL_TEST(dc, CTX_FL_FORCE_MOVE))
  {
    dc->steps = dc->steps <= 0 ? -1 : dc->steps - 1;
    if (dc->stack_size <= dc->dest_frame)
    {
      dc->lines = dc->lines <= 0 ? -1 : dc->lines - 1;
      dc->dest_frame = dc->stack_size;
    }
  }

  if (dc->steps == 0 || dc->lines == 0 ||
      (CTX_FL_TEST(dc, CTX_FL_ENABLE_BKPT) &&
       (!NIL_P(
         breakpoint = find_breakpoint_by_pos(breakpoints, file, line, binding)))))
  {
    call_at_line_check(context, dc, breakpoint, file, line);
  }

  cleanup(dc);
}
You can see here the call to find_breakpoint_by_pos, which brings us to the following.
How breakpoints are implemented
Breakpoints are not a Ruby concept, but one created by Byebug. What happens under the hood is the following:
- The TracePoint hook gets called on every line being executed and calls find_breakpoint_by_pos.
- Byebug checks this line against its collection of breakpoints. Each breakpoint in this collection has a filename and a line number. It checks whether the current file and line number match. If so, it returns the breakpoint.
- The function call_at_line_check is called with the given breakpoint.
- That will call call_at_breakpoint:
static VALUE
call_at_breakpoint(VALUE context_obj, debug_context_t *dc, VALUE breakpoint)
{
  dc->stop_reason = CTX_STOP_BREAKPOINT;
  return call_at(context_obj, dc, rb_intern("at_breakpoint"), 1, breakpoint, 0);
}
- And that will call the method at_breakpoint on the context object (defined in context.rb).
def at_breakpoint(brkpnt)
  handler.at_breakpoint(self, brkpnt)
end
And the handler will show the REPL.
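The lookup step in the list above can be sketched in a few lines of Ruby. This is a simplification under my own assumptions: the Breakpoint struct and the find_breakpoint helper are hypothetical, and the real find_breakpoint_by_pos also receives a binding, which this sketch ignores.

```ruby
# Hypothetical Breakpoint record: just a file name and a line number.
Breakpoint = Struct.new(:file, :line)

# Ruby sketch of the position lookup: returns the first breakpoint that
# matches the current file and line, or nil when nothing matches.
def find_breakpoint(breakpoints, file, line)
  breakpoints.find { |bp| bp.file == file && bp.line == line }
end

breakpoints = [Breakpoint.new("app.rb", 10), Breakpoint.new("lib/user.rb", 3)]

hit  = find_breakpoint(breakpoints, "app.rb", 10)
miss = find_breakpoint(breakpoints, "app.rb", 11)

puts hit ? "break at #{hit.file}:#{hit.line}" : "keep running"
puts miss ? "break at #{miss.file}:#{miss.line}" : "keep running"
```

Because this check runs on every executed line, it is easy to see why Byebug keeps it in C.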
How stepping works
Stepping is nothing more than leaving the REPL, letting Ruby continue to its next line, and then being called back into the REPL.
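The dc->steps countdown in process_line_event suggests how a multi-line step can be counted. Here is a minimal Ruby sketch of that countdown; the steps and stops variables are illustrative, and a real debugger would open the REPL where this sketch only records the line number.

```ruby
steps = 2  # as if the user asked to step two lines; -1 means "not stepping"
stops = []

trace = TracePoint.new(:line) do |tp|
  # Same countdown as in the C code: decrement while positive, else park at -1.
  steps = steps <= 0 ? -1 : steps - 1
  stops << tp.lineno if steps == 0 # a real debugger would show the REPL here
end

trace.enable
a = 1
b = a + 1
c = b + 1
trace.disable

puts "c is #{c}, stopped at line(s): #{stops.inspect}"
```

The hook fires on every line, but only the line on which the counter reaches zero counts as a stop.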
How stepping over works
Stepping over is a bit more complicated. It works by checking the current depth of the stack and saving it in a variable. Execution then continues away from the REPL, and each time the TracePoint hook is called it checks whether the current stack depth is the same as the one saved. If so, it breaks into the REPL.
This way, when execution goes into a new method the stack depth is higher than the one saved, so every line keeps executing without breaking until the stack depth is back at the saved value.
The REPL (command processor)
The command processor is nothing more than an abstraction layer over the REPL. It basically allows for pluggable commands.
You can find the code for this in lib/byebug/commands. The continue command, for example, looks like this:
module Byebug

  # Implements byebug "continue" command.
  class ContinueCommand < Command
    self.allow_in_post_mortem = true
    self.need_context = false

    def regexp
      /^\s* c(?:ont(?:inue)?)? (?:\s+(\S+))? \s*$/x
    end

    def execute
      if @match[1] && !@state.context.dead?
        filename = File.expand_path(@state.file)
        return unless line_number = get_int(@match[1], "Continue", 0, nil, 0)
        return errmsg "Line #{line_number} is not a stopping point in file " \
                      "\"#{filename}\"\n" unless
          LineCache.trace_line_numbers(filename).member?(line_number)

        Byebug.add_breakpoint filename, line_number
      end
      @state.proceed
    end

    class << self
      def names
        %w(continue)
      end

      def description
        %{c[ont[inue]][ nnn]

          Run until program ends, hits a breakpoint or reaches line nnn}
      end
    end
  end
end
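To show what ‘pluggable’ means here, the following is a minimal sketch of regexp-based command dispatch. All class and method names (TinyCommand, TinyContinue, TinyNext, dispatch) are hypothetical, and Byebug's real processor does considerably more; only the regular expression in TinyContinue is copied from the command above.

```ruby
# Hypothetical mini command framework; names are illustrative only.
class TinyCommand
  @registry = []

  class << self
    attr_reader :registry

    # Each subclass is added to the registry when it is defined.
    def inherited(subclass)
      super
      TinyCommand.registry << subclass
    end
  end

  # Returns the MatchData when this command recognises the input, else nil.
  def match(input)
    @match = regexp.match(input)
  end
end

class TinyContinue < TinyCommand
  # Same pattern as ContinueCommand: c, cont or continue, plus an optional argument.
  def regexp
    /^\s* c(?:ont(?:inue)?)? (?:\s+(\S+))? \s*$/x
  end

  def execute
    @match[1] ? "continue until line #{@match[1]}" : "continue"
  end
end

class TinyNext < TinyCommand
  def regexp
    /^\s* n(?:ext)? \s*$/x
  end

  def execute
    "next"
  end
end

# Try every registered command and run the first one whose regexp matches.
def dispatch(input)
  TinyCommand.registry.each do |klass|
    command = klass.new
    return command.execute if command.match(input)
  end
  "unknown command: #{input}"
end

puts dispatch("c")
puts dispatch("cont 42")
puts dispatch("n")
puts dispatch("frobnicate")
```

Adding a command then only means adding a class; the processor itself does not change.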
The test suite
The test suite is absolutely amaze-balls because it’s very declarative. Reading it therefore takes fewer brain cycles, which you can spend on real problems instead.
it 'must leave on the same line by default' do
  enter 'next'
  debug_file('stepping') { $state.line.must_equal 10 }
end
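A declarative style like this usually rests on a scripted interface that replays queued commands instead of reading from the keyboard. The sketch below shows that idea under my own assumptions: ScriptedInterface, $interface and debug_lines are hypothetical, the toy ‘debugger’ only counts next commands, and none of it is taken from Byebug's test support code.

```ruby
# Hypothetical stand-in for a scripted debugger front-end.
class ScriptedInterface
  attr_reader :input_queue

  def initialize
    @input_queue = []
  end

  # A real interface would block on the keyboard; this one replays the script.
  def read_command
    @input_queue.shift
  end
end

$interface = ScriptedInterface.new

# Queue commands as if a user had typed them at the prompt.
def enter(*commands)
  $interface.input_queue.concat(commands)
end

# Toy "debugger": starts at line 1 and advances one line per 'next' command,
# then yields the line it ended on so the test can assert against it.
def debug_lines
  line = 1
  while (command = $interface.read_command)
    line += 1 if command == 'next'
  end
  yield line
end

enter 'next', 'next'
debug_lines { |line| puts "stopped at line #{line}" }
```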
And these are the main parts of Byebug. Come back to get more content like this.