Sunday, April 30, 2006

Rubyfront 0.2.0 released

This new version of Rubyfront fixes a bug found when parsing Ruby on Rails 1.1.2. You can download the source code from here.

In Rubyfront 0.2.0, I changed the way expression substitution is handled. Older versions of Rubyfront ignored the content of an expression substitution. For example, if the input is:

"hello, #{name}"

The old lexer would start scanning the input as a double-quoted string. After seeing "#{", it would look for the first '}' and treat that as the end of the expression substitution. Unfortunately, an expression substitution may itself contain '}'. For example:

"begin #{1.times {"block"}} end"

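The old lexer cannot match the above input as one string: it stops at the first '}', which closes the block, rather than at the one that closes the substitution. An expression substitution can contain any legal Ruby program (compoundStatement), and only the parser understands that structure.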
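The failure is easy to reproduce with a few lines of Ruby. This is only a sketch of the "first '}' wins" strategy described above, not Rubyfront's actual lexer code; the method name naive_substitution is made up for illustration.

```ruby
# Naive strategy: treat the first '}' after '#{' as the end of the
# expression substitution. (Hypothetical helper, for illustration only.)
def naive_substitution(source)
  start = source.index('#{')
  return nil if start.nil?

  stop = source.index('}', start)
  return nil if stop.nil?

  # Everything between '#{' and the first '}'.
  source[(start + 2)...stop]
end

# Single-quoted literals are used so Ruby does not interpolate them here.
puts naive_substitution('hello, #{name}')
puts naive_substitution('begin #{1.times {"block"}} end')
```

The first call yields the whole expression (name), while the second is cut off in the middle of the block, right after the string "block".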

To handle this situation correctly, the new lexer suspends its state when it sees the start of an expression substitution (at that point only 'begin ' has been matched). After the parser finishes processing the expression substitution, it notifies the lexer, which resumes its state and matches the rest of the string (' end').
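The suspend/resume idea can be sketched in Ruby as follows. This is a toy model under loud assumptions, not Rubyfront's implementation: the class ToyStringParser and its methods are hypothetical, and simple brace counting plus recursion into nested strings stands in for the real parser's compoundStatement rule.

```ruby
require 'strscan'

# Toy model of a lexer that pauses at '#{' and resumes after the parser
# has consumed the substitution. All names here are hypothetical.
class ToyStringParser
  def initialize(source)
    @scanner = StringScanner.new(source)
  end

  # Parses a double-quoted string into [:text, ...] and [:code, ...] parts.
  def parse_string
    @scanner.scan(/"/) or raise 'expected opening quote'
    parts = []
    loop do
      text, stop = lex_string_text
      parts << [:text, text] unless text.empty?
      break if stop == :string_end

      # The lexer is suspended here; returning from parse_substitution
      # plays the role of the parser notifying the lexer to resume.
      parts << [:code, parse_substitution]
    end
    parts
  end

  private

  # "Lexer" side: scan literal text up to '#{' or the closing quote.
  def lex_string_text
    text = @scanner.scan(/(?:[^"#]|#(?!\{))*/)
    return [text, :substitution_start] if @scanner.scan(/#\{/)

    @scanner.scan(/"/) or raise 'unterminated string'
    [text, :string_end]
  end

  # "Parser" side: consume code up to the '}' that closes the substitution.
  # Brace counting is an assumption standing in for a real grammar.
  def parse_substitution
    start = @scanner.pos
    depth = 0
    loop do
      raise 'unterminated substitution' if @scanner.eos?

      if @scanner.check(/"/)
        parse_string # nested string literal inside the code
      elsif @scanner.scan(/\{/)
        depth += 1
      elsif @scanner.check(/\}/)
        if depth.zero?
          code = @scanner.string.byteslice(start...@scanner.pos)
          @scanner.scan(/\}/)
          return code
        end
        @scanner.scan(/\}/)
        depth -= 1
      else
        @scanner.getch
      end
    end
  end
end

p ToyStringParser.new('"begin #{1.times {"block"}} end"').parse_string
```

For the problematic input, the result has three parts: the text 'begin ', the code 1.times {"block"}, and the text ' end'.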

By the way, when you run the parser smoke test against Ruby on Rails 1.1.2, you will see the following failures:

C:\Documents and Settings\xxx\workspace\rubyfront>build parsersmoketest
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\activerecord-1.14.2\test\connections\native_sqlite3\in_memory_connection.rb: line 18:76: expecting EOF, found ')'
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\environments\environment.rb: line 8:1: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\controller\templates\controller.rb: line 1:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\controller\templates\functional_test.rb: line 1:41: unexpected token: ..
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\controller\templates\helper.rb: line 1:8: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\integration_test\templates\integration_test.rb: line 3:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\mailer\templates\mailer.rb: line 1:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\mailer\templates\unit_test.rb: line 4:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\migration\templates\migration.rb: line 1:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\model\templates\migration.rb: line 1:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\model\templates\model.rb: line 1:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\model\templates\unit_test.rb: line 1:41: unexpected token: ..
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\plugin\templates\generator.rb: line 1:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\plugin\templates\unit_test.rb: line 3:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\scaffold\templates\controller.rb: line 1:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\scaffold\templates\functional_test.rb: line 5:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\scaffold\templates\helper.rb: line 1:8: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\session_migration\templates\migration.rb: line 1:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\web_service\templates\api_definition.rb: line 1:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\web_service\templates\controller.rb: line 1:7: unexpected token: <
[java] parser exception for C:\ruby\lib\ruby\gems\1.8\gems\rails-1.1.2\lib\rails_generator\generators\components\web_service\templates\functional_test.rb: line 1:41: unexpected token: ..
[java] 2428 ruby programs have been parsed, 21 failed.

These errors are expected, because those files are in .rhtml format: they are not valid Ruby, and Ruby on Rails requires ERB to preprocess them.
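To see why such a file only becomes Ruby after preprocessing, here is a small sketch using Ruby's standard ERB library. The template text and the render_template helper are made up for illustration; they are not copied from the Rails generators.

```ruby
require 'erb'

# Hypothetical template in the style of a generator template. As written it
# is not valid Ruby: a Ruby parser trips over the '<' of '<%='.
TEMPLATE = "class <%= class_name %>Controller < ApplicationController\nend\n"

# Runs the template through ERB; class_name is visible via the binding.
def render_template(template, class_name)
  ERB.new(template).result(binding)
end

puts render_template(TEMPLATE, 'Blog')
```

Only the rendered output is something a Ruby parser can be expected to accept.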

Monday, April 17, 2006

Compile-time type inference for Ruby has its limits

I have been thinking about doing type inference at compile time for Ruby. While many cases can be handled in a straightforward way, type inference won't work in all situations.

One of the problems is that a Ruby method can return values of more than one type:

def f
  if rand(2) == 1
    return 1
  else
    return "hello"
  end
end

In the above code, method 'f' randomly returns either 1 (Fixnum) or "hello" (String). At compile time there is no way for a compiler to know which type will be returned. All it can do is insert metadata so that type checking can be done at runtime.
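A rough picture of what "type checking at runtime" means for a caller of such a method is sketched below. The describe helper is hypothetical, and the check uses Integer, which matches Fixnum values on older Rubies and is the unified integer class on Ruby 2.4 and later.

```ruby
# Same idea as the method above: the return type depends on a random value.
def f
  if rand(2) == 1
    1
  else
    'hello'
  end
end

# Hypothetical caller: the type can only be inspected once f has run.
def describe(value)
  case value
  when Integer then "got an Integer: #{value}"
  when String  then "got a String: #{value}"
  else              "got something else: #{value.inspect}"
  end
end

puts describe(f)
```

Which branch of the case runs changes from call to call, which is exactly the information a compiler does not have ahead of time.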

And multiple return types are not uncommon in real code. For example, Ruby's arrays and hashes can hold objects of different types:

a = [1, 1.5, "hello"]

In the above code, Array 'a' contains three different types of objects: Fixnum, Float and String. Element reference ("[]") is therefore an example of a method that has multiple return types.
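This can be checked directly by asking each element for its class. The helper name element_classes is made up for this sketch; note that on Ruby 2.4 and later the first class prints as Integer rather than Fixnum.

```ruby
# Returns the class of each value produced by the same [] method.
def element_classes(array)
  (0...array.length).map { |index| array[index].class }
end

a = [1, 1.5, 'hello']
p element_classes(a)
```

The same call, a[index], produces a value of a different class for each index.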