A Tour of Morfa

Tour of DSL creation

This chapter will guide you through the basics of DSL (Domain Specific Language) creation in the Morfa programming language.

In the following sections, we present a selection of the Morfa features most commonly used when creating DSLs. We then give two step-by-step examples of creating DSLs using these features.

Overview

Before we look at specific Morfa features, keep in mind that some general design principles of the language help make DSL programming natural:

  • static typing - it proves very useful when coding an elaborate DSL syntax. The main goal is to make any incorrect DSL code fail to compile, and the type system helps us achieve this
  • small amount of built-in syntax - ., :, ;, " and , are the few non-overloadable "operator-like" entities in Morfa. The rest can be customized (almost) completely freely. The general rule is that re-definition of built-in operator behavior is not allowed: for example, the "call" operator () cannot be re-defined on func types, the "index" operator [] cannot be re-defined on arrays, and binary + cannot be re-defined on integers.

On the other hand, if you find that Morfa is sometimes too verbose or keyword-heavy for a DSL-oriented language, note that this is a trade-off: it makes the "non-DSL" parts like new, class, func, return etc. stand out and remain readable.

Language features to use

Operator overloading and user-defined operators

Overloading existing operators for user-defined types is the most basic tool for DSL creation:

public func +(l: SomeType, r: SomeType): SomeType;
public func in(l: SomeInLHS, r: SomeInRHS): bool;
public func $()(c: SomeCallableLikeType, args: SomeArgumentType[]...): SomeReturnType;
public func $[]=(storage: SomeStorageType, newValue: Something, index: Somewhere): void;

and so on.

User-defined operators come in handy when we leave the world of built-in operators, which happens relatively quickly in Morfa:

public operator ^
{
    kind = infix,
    precedence = not
}
public func powInt(l: int, r: int): int;
public alias ^ = powInt;
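
With these declarations in place, the operator could be used as sketched below. This assumes that powInt has been given a body returning l raised to the power r; the variable name is only illustrative.

// assumes powInt(l, r) is implemented to return l raised to the power r
var kilo = 2 ^ 10; // resolves to powInt(2, 10) through the alias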

Another common use of user-defined operators is to provide a new keyword-like identifier, which gives the DSL a natural feel:

public operator take_action_with
{
    kind = prefix,
    precedence = assign
}
public func take_action_with(a: Actionable): void {}

It can then be used like this:

take_action_with new Actionable;

Quoting operators

A more exotic kind of user-defined operator declaration is the quoting operator:

public operator '
{
    kind = prefix,
    precedence = max,
    quoting = right
}
public func '(s: text): SomeResult
{
    return SomeResult();
}

and then:

var result = 'this_will_be_quoted;

The quoting = right property makes the ' operator act as if the identifier to its right were wrapped in double quotes "" - note that the func ' takes a text argument.

The other options for quoting = are left and both.

This quoting operator infrastructure is useful when there is a need to operate on strings in a symbol-like fashion.

Properties and constants

Defining an identifier as a global property (or constant):

property α(): SomeProxyType
{
    return SomeProxyType();
}
// or const α: SomeProxyType;

lets you provide the following constructs in the DSL's syntax:

α[some, arguments];
α{different, meaning};
α;

An important variant of this is to define the property (or constant) within a class to limit its visibility. Perhaps surprisingly, such a property can even be void:

public class NotSpecialYet
{
    public property mark_this_instance_special(): void
    {
        // do something special here
    }
}

usage:

var special = new NotSpecialYet with
{
    mark_this_instance_special;
};

Proxy types and conversions

Much of a DSL's logic and syntax is represented through various proxy structures. The lifespan of such structures is usually limited to the execution of some part of a DSL "command" or "statement". Consider the following abstract example:

class Memory
{
    // a "command" to "memorize" some Object
    public operator memorize
    {
        kind = prefix,
        precedence = assign
    }
    public func memorize(a: Object): void 
    {
        // memorize for some default duration
    }
    public func memorize(a: ForeverProxy): void 
    {
        // memorize forever
    }

    // use to indicate to memorize forever
    public operator forever
    {
        kind = postfix,
        precedence = not
    }

    // binds maximum duration and an Object
    public func $forever(obj: Object): ForeverProxy
    {
        return ForeverProxy(obj);
    }

    // temporarily holds an Object
    struct ForeverProxy
    {
        var obj: Object;
    }
}

used as:

var myMemory = new Memory with
{
    memorize someObj;
    memorize otherObj forever;
};

An important point here is the use of struct for entities such as ForeverProxy. Since structures live on the stack and follow value semantics, they are much more lightweight than class instances, which keeps the performance impact minimal.

Also observe that we did not declare ForeverProxy as public, which makes it unavailable in a direct manner, i.e. the user of Memory cannot instantiate such a structure on their own. This keeps it hidden and only available through the DSL "API", namely the forever and memorize "commands".

Higher-order functions

Using higher-order functions in DSL creation makes two things possible.

The first is letting a DSL perform operations on functions, for instance function composition:

public operator ∘
{
    kind = infix,
    precedence = mul
}
public func ∘(l: func(int): int, r: func(int): int): func(int): int
{
    return func(input: int) { return l(r(input));};
}

usage:

var h = f ∘ g;
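
As a small illustration, suppose f and g had been defined beforehand as the anonymous functions below (their names and bodies are only examples, not part of any library). Calling h then applies g first and f second, because the operator returns l(r(input)):

// hypothetical definitions, placed before the composition above
var f = func(x: int) { return x + 1; };
var g = func(x: int) { return x * 2; };

// with h = f ∘ g as above:
h(5); // f(g(5)) = f(10), gives 11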

The second is the ability to pass a block of code to a DSL "command". Consider:

public operator delayed
{
    kind = prefix,
    precedence = not
}
public func delayed(f: func(): void): void
{
    // wait some time
    f();
}

usage:

delayed func()
{
    println("Hello, after a while!");
};

These examples may seem trivial, but read on to see some more complex cases where DSL programming and higher-order functions become allies.

Templates

Template programming grants a lot of power to the creator of a DSL. Firstly, it is a convenient way to batch-define important functions, such as user conversions, in a DSL, provided these conversions share the same implementation:

template <T>
if (Is<T, SomeSourceType1> or Is<T, SomeSourceType2>)
public func convert(t: T): SomeDestinationType;

Next, templates can bring some abstraction to the DSL. For example, to have a --> operator which binds a function to a fixed argument, one can do this:

public operator -->
{
    kind = infix, precedence = assign
}

template<Arg, Return>
public func -->(arg: Arg, f: func(Arg): Return): func(): Return
{
    return func() { return f(arg);};
}

usage:

var fixed = 5 --> func(i: int) { return i + 0.5;};
fixed(); // gives 5.5

Another template programming technique commonly used in DSL creation is the variadic template function, which allows you to define functions like:

template <TList...>
public func $()(c: Consumer, items: TList)
{
    for (i in items)
        c.consume(i); // type of i matters!
}
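
A call through this overload might look as follows. This is only a sketch: it assumes that Consumer is a class that can be instantiated without arguments and that consume is overloaded for each of the argument types used.

// assumes Consumer has consume overloads for int, text and float arguments
var c = new Consumer;
c(1, "two", 3.5); // calls c.consume(1), c.consume("two") and c.consume(3.5) in turn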

With expressions and member operators

An extremely useful feature of Morfa, when it comes to DSL creation, is the with expression used in conjunction with member operators. Together they provide a way to build a special object using DSL syntax.

A fine example of this could be defining routes of a web server using a hypothetical web framework:

var server = new Server with
{
    get "/" route_to func()
    {
        // process request
    };
    get ("/user/", 'id, "/status") route_to func()
    {
        var id = get('id);
        // process request for status of user with id
    };
};
server.start();

For this to work, one must define the get, route_to and ' operators inside the definition of class Server (post would be defined analogously), along with their implementations, some of which are non-static:

public class Server
{
    public operator get
    {
        kind = prefix, precedence = not
    }
    public operator route_to
    {
        kind = infix, precedence = assign
    }
    public operator '
    {
        kind = prefix, precedence = max, quoting = right
    }

    template <TList...>
    public static func get(args: TList): RouteInformationProxy
    {
        return prepareRouteInformation(args);
    }

    public func route_to(route: RouteInformationProxy, f: func(): void): void
    {
        // add the route and bind its processing to f
    }

    public func start(): void {}
}

However, you don't have to instantiate anything, like the web server in the example above, to use with. Sometimes limiting the validity of a DSL to a class and using it inside a with statement or expression can be beneficial in terms of code readability and quality. In some cases it is worthwhile to prevent the DSL syntax from "leaking out" of a very specific block of code:

public struct UnsafeDSL
{
    // define some unsafe operations with a DSL
}
public var unsafe: UnsafeDSL;

usage with a with statement:

with (unsafe)
{
    // use the unsafe DSL
}
// Whew! We are safe again.

Note on anonymous classes

A slight variant of the with expression usage above can be achieved with anonymous classes, like this:

var server = new class Server { func new()
{
    //... same as in the with block
}};

The differences are:

  • server now has an anonymous type that derives from Server
  • there is some more control over how the constructor is called

Diving deeper into OOP, one might employ:

public class ImplementMe
{
    public abstract func required(): void;
    // implementation of the DSL here
}

usage:

var instance = new class ImplementMe { public override func required(): void 
{
    // use the DSL to implement required
}};

// now, instance can do something intricate using the DSL-driven implementation of required

This gives you even more fine-grained control over when and how the DSL code gets executed.

Unicode identifiers

Last but not least, keep in mind the ability to define pieces of your DSL as Unicode identifiers. This can be done in a strictly opt-in manner with:

public operator ascii_operator
{
    kind = infix, precedence = mul
}
public alias Θ = ascii_operator;

Even though Unicode takes more effort to type, it can make code more readable, concise and natural-looking, provided it is used reasonably. Remember, code is read more often than it is written.

Note that if you plan to use a custom surrounding operator (i.e. one that allows you to write var a = ⁅...⁆;), you are required to use a Unicode bracket pair like:

operator 〔〕{ kind = surrounding, precedence = max }
operator ⁅⁆{ kind = surrounding, precedence = max }

This is because (), [] and {} have special meanings in Morfa when used in such a context.

Still, providing a non-Unicode version of whatever such an operator does is good practice.