Domain Specific Languages in Ruby
It is probably every developer’s dream to create a programming language. It is a complex and difficult task, so it will remain just an unfulfilled dream for many.
But there is a chance to create something simpler – a domain specific language (DSL): a language that covers a specific (and limited) domain.
Ruby helps a lot here – creating a DSL in Ruby is simple.
A domain specific language provides a convenient way to express and perform operations in a specific domain such as spreadsheet functions, financial operations, transformations or AI.
Another advantage is that a DSL program can be defined externally, e.g. by a user, so your solution can easily reflect changes in the domain logic.
A financial example:
    transfer_between_accounts(100, 123456789, 987654321) # transfer $100
    close_if_empty(123456789) # close the account if empty
A semaphore example:
    light :red
    wait 30
    light :red, :orange
    wait 5
    light :green
    wait 30
    light :orange
    wait 5
DSLs (implemented in Ruby) are based on Ruby so it is possible to use Ruby language structures such as loops or conditions.
Let’s create a real DSL – a tape domain specific language.
I. Tape
A tape is an unlimited space divided into cells.
A head is a device that can move left or right on the tape and can read and write symbols from/to a cell on the tape.
This tape is a (very) simplified tape of a Turing machine.
II. Basic Tape DSL
We would like to achieve this result on the tape: 3,1,2
There are many ways to achieve this; here is one of them:
    write 1
    right
    write 2
    left
    left
    write 3
III. Common Code for Various DSLs
So, how do we create a Ruby program that understands a DSL?
The main magic is done by a method that executes Ruby code in the context of an object. Its name is instance_eval and it is available on every object (in current Ruby versions it is defined in BasicObject). For more details see its Ruby doc.
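As a minimal sketch of what instance_eval does, consider the hypothetical Greeter class below (it is not part of the tape example and exists only for illustration). The string is evaluated with self set to the receiver, so bare method names resolve to that object's methods and its instance variables are visible.

```ruby
# Hypothetical class used only to illustrate instance_eval.
class Greeter
  def initialize
    @log = []
  end

  def say(text)
    @log << text
  end
end

greeter = Greeter.new

# The string is evaluated with self == greeter, so `say` needs no receiver.
# The optional second argument is a name used in error messages and backtraces.
greeter.instance_eval("say 'hello'\nsay 'world'", 'example.dsl')

# instance_eval also accepts a block; instance variables are reachable too.
log = greeter.instance_eval { @log }
puts log.inspect
```

Passing a file name as the second argument is worth the extra typing: when a DSL program contains an error, the backtrace then points at the DSL file rather than at an anonymous eval.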
- Create a class; a good name could be Tape.
- Add a method to this class that calls the instance_eval method. I named it load. It loads a file and executes its content:
    def self.load(filename)
      new.instance_eval(File.read(filename), filename)
    end
- And the last thing is to call the load method. Place this line at the very end of the file, outside the body of the Tape class. It takes the first parameter from the command line as the tape program file name and passes it to the load method, which constructs a new instance of the Tape class:
    Tape.load(ARGV.shift)
This is the common part that can be reused for any DSL. (Just rename Tape to your favourite DSL name. :)
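To illustrate the reuse, here is a rough sketch of the same loader applied to the semaphore example from the introduction. The Semaphore class, its events list and the decision to record events instead of switching lights or sleeping are all assumptions made for this illustration; the temporary file stands in for a program file passed on the command line.

```ruby
require 'tempfile'

# Hypothetical Semaphore DSL backing the `light` / `wait` example.
# It records events instead of driving real lights or sleeping.
class Semaphore
  attr_reader :events

  def initialize
    @events = []
  end

  # Same loader as in Tape, except that it returns the instance
  # so the recorded events can be inspected afterwards.
  def self.load(filename)
    semaphore = new
    semaphore.instance_eval(File.read(filename), filename)
    semaphore
  end

  def light(*colours)
    @events << [:light, colours]
  end

  def wait(seconds)
    @events << [:wait, seconds]
  end
end

# Write a small program to a temporary file and load it.
semaphore = Tempfile.create(['demo', '.semaphore']) do |file|
  file.write("light :red\nwait 30\nlight :red, :orange\nwait 5\n")
  file.flush
  Semaphore.load(file.path)
end

semaphore.events.each { |event| p event }
```

Only the command methods (light and wait here) differ between DSLs; the loading code stays the same.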
IV. Tape State
We defined a tape as something with a state: tape content and head position.
The tape content will be stored in the array @tape and the head position in the variable @pos as an integer.
It would be nice to initialise these variables; add the class constructor:
    def initialize
      @pos = 0
      @tape = []
    end
V. Tape DSL Commands
Almost done :), just the DSL commands are missing. They will be implemented as normal Ruby methods.
The command right is defined as a Ruby method in the Tape class. The right command increments the head position and, if necessary, extends (adds a cell at the end of) the tape.
    def right
      @tape.push nil if @tape.size == @pos
      @pos += 1
    end
Similarly, the left command is defined as a method. It decrements the head position or, if the head is already on the first cell, extends the tape (adds a cell at the beginning) and leaves the position at 0.
    def left
      if @pos == 0
        @tape.unshift nil
      else
        @pos -= 1
      end
    end
Reading the content of the current cell or writing a value to it is even simpler:
    def write(val)
      @tape[@pos] = val
    end

    def read
      @tape[@pos]
    end
The last missing command displays the state of the tape – the dump method. It shows the cells of the tape separated by commas, with the cell under the head highlighted by two vertical bar characters (or pipes if you like).
    def dump
      t = @tape.map {|c| " #{c} "}
      t[@pos] = "|#{@tape[@pos]}|"
      puts "[#{t.join(',')}]"
    end
We created five commands: left, right, read, write and dump.
(download the tape.rb file)
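For reference, the pieces from sections III to V could be assembled roughly as follows. Two details are assumptions added for this sketch and are not in the original tape.rb: render returns the dump line as a string, and Tape.run evaluates a program given as a string rather than a file, so the example runs without any extra files.

```ruby
# The article's Tape class assembled in one place. Two additions are
# assumptions for this sketch: `render` returns the dump line as a string,
# and `Tape.run` evaluates a program given as a string instead of a file.
class Tape
  def self.run(source)
    tape = new
    tape.instance_eval(source, '(inline)')
    tape
  end

  def initialize
    @pos = 0
    @tape = []
  end

  def right
    @tape.push nil if @tape.size == @pos
    @pos += 1
  end

  def left
    if @pos == 0
      @tape.unshift nil
    else
      @pos -= 1
    end
  end

  def write(val)
    @tape[@pos] = val
  end

  def read
    @tape[@pos]
  end

  def render
    cells = @tape.map { |c| " #{c} " }
    cells[@pos] = "|#{@tape[@pos]}|"
    "[#{cells.join(',')}]"
  end

  def dump
    puts render
  end
end

# The basic program from section II, passed as a string.
tape = Tape.run(<<~PROGRAM)
  write 1
  right
  write 2
  left
  left
  write 3
PROGRAM

tape.dump
```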
VI. Run a Tape Program
To see the steps of the 123 tape program, add the dump command after each write command:
    write 1
    dump
    right
    write 2
    dump
    left
    left
    write 3
    dump
(download the 123.tape file)
To run the 123 tape program (stored in the 123.tape file) type this on the command line:
ruby tape.rb 123.tape
The result should be:
    [|1|]
    [ 1 ,|2|]
    [|3|, 1 , 2 ]
Try changing the program and check the result of your changes with the dump command.
VII. Combination with Ruby
As mentioned before, a DSL implemented in this way is plain Ruby code, so you may use Ruby language structures.
The repeat example writes the numbers from 1 to 5 to the tape (download the repeat.tape file):
    (1..5).each do |i|
      write i
      right
      dump
    end
The result will be (command line: ruby tape.rb repeat.tape):
    [ 1 ,||]
    [ 1 , 2 ,||]
    [ 1 , 2 , 3 ,||]
    [ 1 , 2 , 3 , 4 ,||]
    [ 1 , 2 , 3 , 4 , 5 ,||]
Another, more complicated example uses some Turing machine related techniques: the invert program writes 1, 0, 1 on the tape and then inverts the content to 0, 1, 0. The beginning and the end of the content are marked by the special characters B and E. (download the invert.tape file)
    # prepare tape
    write 'B'
    right
    write 1
    right
    write 0
    right
    write 1
    right
    write 'E'
    dump

    # return to the beginning
    while read != 'B' do
      left
    end
    dump

    # invert
    right
    while read != 'E' do
      if read == 0
        write 1
      else
        write 0
      end
      right
    end
    dump
The output is (command line: ruby tape.rb invert.tape):
    [ B , 1 , 0 , 1 ,|E|]
    [|B|, 1 , 0 , 1 , E ]
    [ B , 0 , 1 , 0 ,|E|]
You could try to change the program to invert the content in just two steps – generating the content and inverting the content while returning to the beginning.
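One possible solution to this exercise is sketched below; it is only one of several ways to do it. To keep the example runnable on its own, it includes a compact copy of the Tape class, with a render method and a Tape.run helper that are assumptions added for this sketch (the original loads a file and prints with dump).

```ruby
# Compact copy of the article's Tape class so the sketch runs on its own.
# `render` and `Tape.run` are assumptions added for this example.
class Tape
  def self.run(source)
    tape = new
    tape.instance_eval(source, '(inline)')
    tape
  end

  def initialize
    @pos = 0
    @tape = []
  end

  def right
    @tape.push nil if @tape.size == @pos
    @pos += 1
  end

  def left
    if @pos == 0
      @tape.unshift nil
    else
      @pos -= 1
    end
  end

  def write(val)
    @tape[@pos] = val
  end

  def read
    @tape[@pos]
  end

  def render
    cells = @tape.map { |c| " #{c} " }
    cells[@pos] = "|#{@tape[@pos]}|"
    "[#{cells.join(',')}]"
  end

  def dump
    puts render
  end
end

tape = Tape.run(<<~PROGRAM)
  # step 1: generate the content
  ['B', 1, 0, 1].each do |symbol|
    write symbol
    right
  end
  write 'E'
  dump

  # step 2: invert while returning to the beginning
  left
  while read != 'B' do
    if read == 0
      write 1
    else
      write 0
    end
    left
  end
  dump
PROGRAM
```

The head starts the second step on E, so a single left moves it onto the last content cell, and the loop stops as soon as it reaches B.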
VIII. Hide Obvious Dependencies or Too Much Ruby in DSL
From the definition of our tape DSL, the only relevant information for conditions and loops is the content of the tape cell the head is pointing to. Let’s define two more tape DSL commands that hide this dependency (so it is not necessary to specify it).
The check command replaces the while read != part:
    while read != 'B' do left end
becomes
    check 'B' do left end
The condition command replaces the if read == part:
    if read == 0
      write 1
    else
      write 0
    end
becomes
    condition 0, lambda {write 1}, lambda {write 0}
lambda {code} is one of the ways to define a block of code in Ruby that can be stored in a variable, passed as an argument and called later.
The implementation of these commands is slightly more complicated. The check command also examines the provided block of code to see whether it requires one parameter (check 'x' do |c|) or none (check 'x' do).
    def check(val, &block)
      return unless block_given?
      while read != val do
        if block.arity == 1
          yield read
        else
          yield
        end
      end
    end

    def condition(val, positive, negative)
      if read == val
        positive.call
      else
        negative.call
      end
    end
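The arity check can be tried in isolation. The following is a small sketch, assuming a hypothetical helper named call_with_optional_arg that is not part of the tape DSL; it passes a value to the block only when the block declares a parameter, and it also shows a lambda being stored and called later, as condition does.

```ruby
# Hypothetical helper: passes the value to the block only if the block
# declares exactly one parameter, mirroring the arity test in `check`.
def call_with_optional_arg(value, &block)
  if block.arity == 1
    block.call(value)
  else
    block.call
  end
end

with_param    = call_with_optional_arg(7) { |v| "got #{v}" }
without_param = call_with_optional_arg(7) { 'got nothing' }

# A lambda is an object: it can be stored now and called later.
invert = lambda { |bit| bit == 0 ? 1 : 0 }

puts with_param
puts without_param
puts invert.call(0)
```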
Additionally, a tiny command comment is added to show the usage of the check command with a block of code that requires one parameter.
    def comment(text)
      puts text
    end
(download the tape.rb file)
The modified invert tape program (download the invert2.tape file):
    # prepare tape
    write 'B'
    right
    write 1
    right
    write 0
    right
    write 1
    right
    write 'E'
    dump

    # return to the beginning
    check 'B' do
      left
    end
    dump

    # invert
    right
    check 'E' do |h|
      comment "Inverting #{h}"
      condition 0, lambda {write 1}, lambda {write 0}
      right
    end
    dump
The result is (command line: ruby tape.rb invert2.tape) (download: tape.rb and invert2.tape files):
    [ B , 1 , 0 , 1 ,|E|]
    [|B|, 1 , 0 , 1 , E ]
    Inverting 1
    Inverting 0
    Inverting 1
    [ B , 0 , 1 , 0 ,|E|]
IX. Further Extending
So far all commands were statically added to the class representing (and implementing) a DSL. There is also a way to catch and process calls to a non-existing method: the method_missing method (defined in BasicObject in current Ruby versions; see the Ruby doc).
It allows you to handle situations such as assignments to variables (with names defined by a user, of course), access to a user's dynamic data (parsers, mappings) or wrapping dynamic functionality (web services).
In the tape DSL it is only used to show an error message to users if they specify a non-existing method name.
    def method_missing(sym, *args, &block)
      # report the unknown command; mention the block only if one was given
      puts "# unknown #{sym} with args #{args} #{block_given? ? 'and code' : ''}"
    end
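As a rough sketch of the "variables with user-defined names" idea mentioned above, the hypothetical Variables class below (not part of the tape DSL) stores a value when an unknown command is called with one argument and returns it when the same name is used without arguments. Anything else is passed to super so that Ruby raises its usual error, and respond_to_missing? is defined alongside, which is the usual companion to method_missing.

```ruby
# Hypothetical DSL object: `name value` stores a value, `name` reads it back.
class Variables
  def initialize
    @vars = {}
  end

  def method_missing(name, *args, &block)
    if args.size == 1
      @vars[name] = args.first      # e.g. `limit 10`
    elsif args.empty? && @vars.key?(name)
      @vars[name]                   # e.g. `limit`
    else
      super                         # keep Ruby's normal error for anything else
    end
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @vars.key?(name) || super
  end
end

vars = Variables.new
result = vars.instance_eval("limit 10\nlimit")
puts result
```

Calling super for unhandled names is a deliberate choice here: silently swallowing every unknown name makes typos in a DSL program hard to find.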
The End
luck(good, your_projects, dsl)
enjoy ':)'