1. SQL to S-Expressions


    painting by Hieronymus Bosch

    Toy languages are hell!

    Sooner or later, your fast and pretty custom-made parser breaks because your toy language needs some important enhancement. When adding a few more symbols to the regular expressions is no longer enough, you can either:

    1. adopt another language
    2. build a good parser

    We chose option 1 for zafu’s Ruby expressions (rubyless), but we had to take the second route for the query builder.

    pseudo sql parser

    We now have a Ragel-based parser that generates nice s-expressions from pseudo-SQL code. For example:

    objects where event_at > REF_DATE + custom_a months

    becomes:

    [:query, [:filter, [:relation, "objects"], [:>, [:field, "event_at"], [:+, [:field, "REF_DATE"], [:field, "custom_a"]]]]]
    

    If you indent this mess, you get something a Lisp coder would kill for:

    [:query,
      [:filter,
        [:relation, "objects"],
        [:>,
          [:field, "event_at"],
          [:+,
            [:field, "REF_DATE"],
            [:field, "custom_a"]
          ]
        ]
      ]
    ]
    

    This is quite fun: it reminds me of my old calculator with reverse Polish notation.
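
    The indented form shown above can be produced mechanically, since the s-expression is just nested Ruby arrays. The helper below is a minimal sketch, not part of the actual parser: the name `indent_sexp` and the two-space indent are assumptions made for illustration.

```ruby
# Hypothetical helper (not part of the real parser): pretty-prints an
# s-expression built from nested Ruby arrays.
def indent_sexp(node, depth = 0)
  pad = '  ' * depth
  # Leaves and lists without nested lists stay on a single line.
  return pad + node.inspect unless node.is_a?(Array) && node.any? { |e| e.is_a?(Array) }

  head, *rest = node
  lines = rest.map { |child| indent_sexp(child, depth + 1) }
  "#{pad}[#{head.inspect},\n#{lines.join(",\n")}\n#{pad}]"
end

EXAMPLE_SEXP = [:query, [:filter, [:relation, "objects"],
  [:>, [:field, "event_at"], [:+, [:field, "REF_DATE"], [:field, "custom_a"]]]]].freeze

puts indent_sexp(EXAMPLE_SEXP)
```

    Called on the one-line expression from the example, it prints the same layout as the hand-indented version.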

    processor

    Generating proper SQL from this is now simply a matter of processing this tree.
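
    To give an idea of what "processing this tree" can look like, here is a minimal sketch of a recursive processor. It is not the query builder's actual code: the class name `SexpToSql`, the list of operators and the shape of the generated SQL are all assumptions, and it only handles the node types that appear in the example above.

```ruby
# Hypothetical processor (illustration only): walks the s-expression
# and emits SQL text. Field and relation names are emitted as-is.
class SexpToSql
  BINARY_OPERATORS = %i[> < >= <= = + - * /].freeze

  def process(node)
    type, *args = node
    case type
    when :query    then process(args.first)
    when :filter   then "#{process(args[0])} WHERE #{process(args[1])}"
    when :relation then "SELECT * FROM #{args.first}"
    when :field    then args.first
    when *BINARY_OPERATORS
      # Process both operands first, then join them with the operator.
      left, right = args.map { |arg| process(arg) }
      "(#{left} #{type} #{right})"
    else
      raise ArgumentError, "unknown node type: #{type.inspect}"
    end
  end
end

sexp = [:query, [:filter, [:relation, "objects"],
  [:>, [:field, "event_at"], [:+, [:field, "REF_DATE"], [:field, "custom_a"]]]]]

puts SexpToSql.new.process(sexp)
```

    Each node type maps to one branch, so supporting a new construct means adding a branch rather than touching the parser. A real processor would also have to map field names to actual table columns and quote them safely, which this sketch skips.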

    benchmarks

    I did some testing with the Ragel parser, comparing the Ruby version with the C extension. Both parsers do exactly the same work with the same actions. The Ruby-specific part of both parsers (the actions) should compile to the same Ruby code, so the only difference between the two is the Ragel-generated code.

    In the end, this test boils down to 1300 lines of Ruby vs 1300 lines of C...

    parser speed

    suite      Ruby     C
    basic      1.35  0.03
    errors     0.74  0.02
    filters    1.29  0.03
    total      3.38  0.08

    These results make it clear that, even though managing a C extension complicates deployment, it is well worth it in terms of speed (the C parser is roughly 40 times faster here), and probably in terms of memory usage too.
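
    To put a number on the difference, the speedup per suite can be computed directly from the timings in the table. This is only arithmetic on the published figures; the names `TIMINGS` and `speedups` are made up for this sketch.

```ruby
# Timings copied from the table: suite => [Ruby parser, C parser].
TIMINGS = {
  'basic'   => [1.35, 0.03],
  'errors'  => [0.74, 0.02],
  'filters' => [1.29, 0.03],
  'total'   => [3.38, 0.08]
}.freeze

# Returns suite => how many times faster the C parser is,
# rounded to the nearest integer.
def speedups(timings)
  timings.transform_values { |ruby, c| (ruby / c).round }
end

speedups(TIMINGS).each { |suite, factor| puts "#{suite}: about #{factor}x" }
```

    The C parser comes out between 37 and 45 times faster, depending on the suite.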

    Gaspard Bucher
