Fowler M., Domain-Specific Languages (2011)
Probably primary, essential reading for more experienced developers: first for its practical applications, and second because DSLs are omnipresent in the industry.
Summary
Domain-Specific Languages share common ground with the Ubiquitous Language of Domain-Driven Design (DDD).
Both aim to improve communication and understanding between developers and domain experts. Both are fundamentally linked to a domain model or an underlying framework that captures core domain concepts and insights. Both involve adapting or refining terminology from the domain.
Ubiquitous Language is strategically aimed at forming a subset of notions expressed in human language, used both in human communications and in code to build software for a specific problem domain.
In contrast, Fowler’s book describes DSLs [1]: human-readable but much narrower, technically focused computer languages whose purpose is to conveniently and fluently encode specific software system behaviors. Yet DSLs may, and arguably should, take root in a concrete system's Ubiquitous Language in the parts where they automate problem-domain tasks.
Value
This is probably the primary, essential reading for seeing the practical application of widely used software engineering concepts [2] that are critical for making valuable and reliable software. The book is organized in a very clean and practically digestible way.
Part I conceptually introduces the DSL problem space, a basic example used throughout the book, and general approaches that help developers decide whether and how to use DSLs. It demonstrates how DSLs boost productivity by simplifying complex code and improve communication with domain experts through shared, executable documentation.
Parts II, III, IV, V and VI are practically a reference of DSL techniques. They offer implementation guidance for both external and internal DSLs, covering parsing, parser generators, and language constructs.
The chapters in these parts share a structure that helps in implementing the techniques in practice. Each technique chapter has "How It Works" and "When to Use It" sections, expanded where applicable with specific cases, code examples, and recommended reading.
Part I: Narratives
The starting example introduces a practical problem: the need to easily program different sequences of actions for a family of similar systems. The challenge is tailoring system behavior efficiently.
The solution starts by modeling the system as a State Machine and then introduces the Domain-Specific Language (DSL) as a way to textually encode the model's behavior, along with its benefits and challenges.
The DSL is derived from the model as a readable way to instruct a computer and populate an underlying model, the Semantic Model. The Semantic Model enables a clear separation of concerns between processing the language and the actual semantics. It allows the model and the DSL to be tested and evolved independently.
Visualization tools (e.g. DOT, Graphviz) help to refine the Semantic Model and the DSL.
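As a rough illustration of the idea (not the book's own code), a state machine Semantic Model can be populated through a plain command-query API and exported to DOT for visual inspection. The class names, the `to_dot` method and the state and event names below are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    name: str
    # Maps an event name to the name of the target state.
    transitions: dict[str, str] = field(default_factory=dict)


class StateMachine:
    """Hypothetical Semantic Model: states plus event-driven transitions."""

    def __init__(self, start: str) -> None:
        self.states: dict[str, State] = {start: State(start)}
        self.current = start

    def add_transition(self, source: str, event: str, target: str) -> None:
        # Create both states on demand, then record the transition.
        for name in (source, target):
            self.states.setdefault(name, State(name))
        self.states[source].transitions[event] = target

    def handle(self, event: str) -> str:
        # Unknown events leave the machine in its current state.
        state = self.states[self.current]
        self.current = state.transitions.get(event, self.current)
        return self.current

    def to_dot(self) -> str:
        # Emit one DOT edge per transition so Graphviz can draw the model.
        lines = ["digraph machine {"]
        for state in self.states.values():
            for event, target in state.transitions.items():
                lines.append(f'  "{state.name}" -> "{target}" [label="{event}"];')
        lines.append("}")
        return "\n".join(lines)


# The model is usable on its own, without any DSL in front of it.
machine = StateMachine(start="idle")
machine.add_transition("idle", "doorClosed", "active")
machine.add_transition("active", "doorOpened", "idle")
dot_text = machine.to_dot()
```

A DSL script would do nothing more than issue the same `add_transition` calls, which is what lets the model and the language evolve separately.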
Part II: Common Topics
[UTILITY] The SVG Navigator extension for Chrome lets you view the diagrams comfortably, with zoom and pan, in a separate tab.
Semantic Model
The underlying system model that a DSL populates. It captures domain semantics independently of DSL syntax, typically as an in-memory object model or data structure. The model should be usable without the DSL (e.g., via command-query interface) to enable separate testing of model and parser logic. Handles validation.
Symbol Table
Stores identifiable objects during parsing to resolve references. Can contain either final Semantic Model objects or intermediate builders. Simplifies cross-referencing in both internal and external DSLs, though not always mandatory.
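A minimal sketch of a Symbol Table, assuming a made-up two-keyword script format (`state`, `transition`) and plain dictionaries as stand-ins for model objects. It shows how a forward reference gets a placeholder that a later declaration fills in.

```python
class SymbolTable:
    """Maps identifiers to the objects created for them during parsing."""

    def __init__(self) -> None:
        self._symbols: dict[str, dict] = {}

    def lookup_or_create(self, name: str) -> dict:
        # A forward reference creates a placeholder, so the order of
        # declarations and references in the script does not matter.
        return self._symbols.setdefault(
            name, {"name": name, "declared": False, "targets": {}}
        )

    def undeclared(self) -> list[str]:
        # Names that were referenced but never declared.
        return sorted(n for n, s in self._symbols.items() if not s["declared"])


def parse(script: str) -> SymbolTable:
    table = SymbolTable()
    for line in script.strip().splitlines():
        words = line.split()
        if words[0] == "state":
            table.lookup_or_create(words[1])["declared"] = True
        elif words[0] == "transition":
            source, event, target = words[1:4]
            source_state = table.lookup_or_create(source)
            source_state["targets"][event] = table.lookup_or_create(target)
    return table


script = """
transition idle doorClosed active
state idle
state active
transition active lightOn unlocked
"""
table = parse(script)
```

Here `table.undeclared()` reports `unlocked`, the one state that is referenced but never declared.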
Context Variable
Holds parsing context in a mutable variable, updated as the parser processes different input sections. Used when context affects expression parsing and can't be derived from the parse tree alone. May contain Semantic Model objects or builders.
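A small sketch of a Context Variable, assuming an invented script layout in which the same `name code` line means different things depending on the section it appears in. The variable `current_section` is the context that the parser updates as it goes.

```python
def parse_sections(script: str) -> dict[str, dict[str, str]]:
    result: dict[str, dict[str, str]] = {"events": {}, "commands": {}}
    current_section = None  # the Context Variable
    for raw_line in script.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line in result:
            # A section header changes the context for the lines that follow.
            current_section = line
        elif current_section is None:
            raise ValueError(f"entry outside any section: {line!r}")
        else:
            # The same syntax is stored differently depending on context.
            name, code = line.split()
            result[current_section][name] = code
    return result


SCRIPT = """
events
  doorClosed D1CL
  lightOn    L1ON
commands
  unlockPanel PNUL
"""
parsed = parse_sections(SCRIPT)
```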
Construction Builder
Incrementally assembles data for immutable Semantic Model objects. Provides read-write fields matching the model's read-only fields, deferring object creation until all data is collected during parsing.
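A minimal sketch of a Construction Builder, assuming a hypothetical immutable `Event` model class. The builder mirrors the model's fields as writable attributes and creates the real object only once everything has been collected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Immutable Semantic Model object (hypothetical)."""
    name: str
    code: str


class EventBuilder:
    """Mutable counterpart that a parser fills in step by step."""

    def __init__(self) -> None:
        self.name: Optional[str] = None
        self.code: Optional[str] = None

    def build(self) -> Event:
        # Object creation is deferred until all data is present.
        if self.name is None or self.code is None:
            raise ValueError("event needs both a name and a code")
        return Event(self.name, self.code)


builder = EventBuilder()
builder.name = "doorClosed"   # set when the parser sees the name
builder.code = "D1CL"         # set later, when the parser sees the code
event = builder.build()
```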
Macro
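Transforms input text into different text before language processing, in the manner of Templated Generation. Can be textual or syntactic; availability and form vary by host language. (Code blocks represented as first-class objects that keep their lexical scope are Closures, a Part IV topic.)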
Notification
Collects and reports errors/messages during processing. Designed to simplify caller code by handling message composition internally with all relevant data.
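A minimal sketch of a Notification, assuming the model is a plain dictionary of transitions and that `validate` is a hypothetical checking routine. The notification composes each message itself, so the validation code only passes in the data.

```python
class Notification:
    """Collects messages so the caller can report them all at once."""

    def __init__(self) -> None:
        self.errors: list[str] = []

    def error(self, template: str, *args: object) -> None:
        # The notification composes the message, keeping callers short.
        self.errors.append(template.format(*args))

    @property
    def ok(self) -> bool:
        return not self.errors


def validate(transitions: dict[str, dict[str, str]], note: Notification) -> None:
    # Report every problem instead of stopping at the first one.
    for source, events in transitions.items():
        for event, target in events.items():
            if target not in transitions:
                note.error(
                    "state '{}' on event '{}' targets unknown state '{}'",
                    source, event, target,
                )


MODEL = {"idle": {"doorClosed": "active"}, "active": {"lightOn": "unlocked"}}
note = Notification()
validate(MODEL, note)
```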
Part III: External DSL Topics
An external DSL is a domain-specific language that is represented in a separate language from the main programming language of the application it works with. It usually has a custom syntax, though it can also follow the syntax of another representation like XML. Scripts in an external DSL are typically parsed by code in the host application using text parsing techniques.
This part is a reference of the topics related to implementing external DSLs. It aims to show you the role of a parser, the usefulness of a Parser Generator, and different ways of using a parser to parse an external DSL. Implementing an external DSL involves techniques from parsing programming languages applied to the simpler context of DSLs.
The following diagram presents the content of Part III at a glance.
Part IV: Internal DSL Topics
This part is a reference of techniques for implementing domain-specific languages within a host general-purpose programming language. While external DSLs require learning about grammars and language parsing, internal DSLs allow you to work within your regular language environment, often leveraging existing language features in unconventional ways to achieve a fluent syntax.
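One common way to get such a fluent syntax is method chaining. The sketch below assumes a hypothetical `MachineBuilder` whose method names (`state`, `on`, `go_to`) are invented for illustration; each method returns the builder so that calls read like a sentence.

```python
class MachineBuilder:
    """Hypothetical fluent builder: every method returns self for chaining."""

    def __init__(self) -> None:
        self.transitions: dict[str, dict[str, str]] = {}
        self._state = None   # state that subsequent calls refer to
        self._event = None   # event awaiting its target

    def state(self, name: str) -> "MachineBuilder":
        self._state = name
        self.transitions.setdefault(name, {})
        return self

    def on(self, event: str) -> "MachineBuilder":
        self._event = event
        return self

    def go_to(self, target: str) -> "MachineBuilder":
        self.transitions[self._state][self._event] = target
        return self


# The "script" is ordinary host-language code.
machine = (
    MachineBuilder()
    .state("idle").on("doorClosed").go_to("active")
    .state("active").on("lightOn").go_to("unlocked")
)
```

Note that the builder leans on context it remembers between calls (`_state`, `_event`), which is the Context Variable idea from Part II applied to an internal DSL.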
Part V: Alternative Computational Models
These models are declarative, as opposed to the mostly imperative ones presented in the preceding parts.
You don't strictly need a DSL to use an alternative computational model, as the core behavior comes from a Semantic Model. However, a DSL can significantly improve the ease with which people can manipulate these declarative programs that populate the Semantic Model. Using an Adaptive Model is a good way to provide an alternative computational model, and a DSL helps make it easier to program that model.
Adaptive Model
This is the key to using any alternative computational model. Use it to build a processing engine for an alternative computational model that can then be programmed for specific behavior. Any of the alternative computational models mentioned in the book would usually be implemented with an Adaptive Model. The decision to use one is qualitative, based on whether it seems to fit the way you think about the problem.
Decision Table
This is a very effective way to capture the results of a set of interacting conditions. Decision Tables communicate well to both programmers and domain experts, and their tabular nature allows manipulation using familiar spreadsheet tools. They are well suited to problems involving a composite conditional expression. Decision Tables allow checking for missed or repeated condition permutations and can shift the execution context to runtime.
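A minimal sketch of a Decision Table held as data, with invented conditions and fee values. Because the table is just a mapping from condition permutations to consequences, a few lines are enough to check that no permutation was missed.

```python
from itertools import product

# Hypothetical conditions; the order fixes the meaning of each key position.
CONDITIONS = ("premium_customer", "international_order")

# Each key is one permutation of condition values; the value is the fee.
FEE_TABLE = {
    (True, True): 100,
    (True, False): 50,
    (False, True): 150,
    (False, False): 70,
}


def missing_permutations(table: dict) -> list:
    # Every combination of True/False must appear as a key in the table.
    every_combo = product((True, False), repeat=len(CONDITIONS))
    return [combo for combo in every_combo if combo not in table]


def fee_for(order: dict[str, bool]) -> int:
    # Evaluate the conditions, then look the result up in the table.
    key = tuple(order[name] for name in CONDITIONS)
    return FEE_TABLE[key]


fee = fee_for({"premium_customer": False, "international_order": True})
```

A Python dict cannot hold the same key twice, so this particular representation checks only for missed permutations; detecting repeated rows would need a list of rows instead.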
Dependency Network
This is a good choice when you have computationally expensive tasks with dependencies between them. It underpins tools like Make and Ant by capturing the prerequisites for tasks. It is suitable when representing a graph structure of dependencies.
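A small sketch of a Dependency Network using the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The task names are invented; the `plan` helper is an assumption of this example and returns only the tasks a target actually needs, prerequisites first.

```python
from graphlib import TopologicalSorter

# Hypothetical build tasks: each task maps to the set of its prerequisites.
PREREQUISITES = {
    "generate": set(),
    "compile": {"generate"},
    "test": {"compile"},
    "docs": {"generate"},
}


def plan(target: str, prerequisites: dict[str, set[str]]) -> list[str]:
    # Walk back from the target to collect only the tasks it depends on.
    needed: dict[str, set[str]] = {}
    stack = [target]
    while stack:
        task = stack.pop()
        if task not in needed:
            needed[task] = prerequisites[task]
            stack.extend(prerequisites[task])
    # Order the needed tasks so that prerequisites always run first.
    return list(TopologicalSorter(needed).static_order())


build_order = plan("test", PREREQUISITES)
```

Requesting `test` never touches `docs`, which is the point for expensive tasks: work that is not a prerequisite is skipped.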
Production Rule System
This is a good choice when the logic is best organized as a set of rules, each pairing a condition to be evaluated with an action. It differs from a Decision Table by focusing on the behavior of individual rules rather than the entire table structure.
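A deliberately tiny sketch of a Production Rule System, assuming invented rules and a simplistic engine in which each rule fires at most once. Real rule engines are considerably more elaborate; the point here is only that rules are independent condition/action pairs and that one rule's action can enable another.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]


def run(rules: list, facts: dict) -> list:
    """Fire rules until none applies; return rule names in firing order."""
    fired = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # Simplification: a rule fires at most once per run.
            if rule.name not in fired and rule.condition(facts):
                rule.action(facts)
                fired.append(rule.name)
                changed = True
    return fired


RULES = [
    Rule("free-shipping",
         lambda f: f.get("discount", 0) >= 10,
         lambda f: f.update(shipping=0)),
    Rule("senior-discount",
         lambda f: f["age"] >= 65,
         lambda f: f.update(discount=10)),
]

facts = {"age": 70}
fired = run(RULES, facts)
```

The free-shipping rule is listed first but fires second, because its condition only becomes true after the discount rule has changed the facts.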
State Machine
Use this model when you need to represent the behavior of an object by dividing it into a set of states, with events triggering transitions between these states. It is highlighted as a good fit for an Adaptive Model because it inherently provides an alternative computational model.
Part VI: Code Generation
This part focuses on various approaches for producing output code from a DSL's Semantic Model. Code generation is one of the ways to process the results of parsing a DSL, the alternative being direct interpretation of the Semantic Model. The core idea is to automate the writing of code that would be repetitive or difficult to write by hand, often taking advantage of the DSL's structured representation (the Semantic Model).
Relations to Rest of the Book
The part's topics relate to other parts of the book in several ways.
Semantic Model (Part II): Code Generation is fundamentally driven by the Semantic Model, which serves as the input for the generation process. The Semantic Model, discussed in Part II, is highlighted as being the central element in most DSL efforts. Code Generation is presented as an output mechanism for this model.
Parsing (Part III): Techniques covered in Part III, such as using Parser Generators and Tree Construction, are used to populate the Semantic Model, which is the prerequisite for Code Generation. Concepts like Embedment Helper, discussed in Part VI, are recommended for use in grammar files (Part III) to keep code actions clean during parsing. Code generation can even be used before parsing to generate information needed during the parsing process, often related to the Semantic Model or Symbol Table.
Alternative Computational Models (Part V): Part VI provides techniques for generating code that implements behavior defined using Alternative Computational Models from Part V. The Secret Panel Controller example, which uses a State Machine (a Part V topic), is used extensively to illustrate various code generation techniques like Transformer Generation and Templated Generation. Model-Aware Generation explicitly discusses generating code that replicates a simulacrum of the Semantic Model (often representing an alternative computational model) in the target language. Model Ignorant Generation is presented as an alternative approach when the target language is less expressive.
General Purpose Languages: Many techniques in Part VI assume you are generating code in a general-purpose language (like C# or Java). While Part I and Part IV cover using general-purpose languages to implement internal DSLs or DSL processing, Part VI focuses on generating code in such languages.
Techniques
Transformer Generation
Write code that navigates the input model (the Semantic Model) and produces output code as it goes. It often pairs well with Model-Aware Generation.
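A minimal sketch of the transformer style, assuming the model is a plain dictionary of transitions and that the target code calls a hypothetical `StateMachine.add_transition` API. The generator walks the model and emits one line of output per element.

```python
def generate_setup_code(transitions: dict[str, dict[str, str]]) -> str:
    """Walk the model and emit one line of target code per transition."""
    lines = ["machine = StateMachine()"]
    # Sorting keeps the generated file stable between runs.
    for source in sorted(transitions):
        for event in sorted(transitions[source]):
            target = transitions[source][event]
            lines.append(
                f"machine.add_transition({source!r}, {event!r}, {target!r})"
            )
    return "\n".join(lines) + "\n"


MODEL = {"idle": {"doorClosed": "active"}, "active": {"lightOn": "unlocked"}}
generated = generate_setup_code(MODEL)
```

The emitted text rebuilds the model in the target language, which is why this style sits comfortably next to Model-Aware Generation.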
Templated Generation
Use when the generated output has a lot of static content and the dynamic parts are occasional and simple. It helps visualize the generated output by looking at the template file.
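A minimal sketch using the standard library's `string.Template`. The C header layout, the `EVENT_` prefix and the event codes are invented for illustration; what matters is that the template reads almost like the finished file, with only a few placeholders.

```python
from string import Template

# Mostly static output; only ${guard} and ${defines} vary.
HEADER_TEMPLATE = Template(
    "/* Generated file: do not edit by hand. */\n"
    "#ifndef ${guard}\n"
    "#define ${guard}\n"
    "\n"
    "${defines}\n"
    "\n"
    "#endif /* ${guard} */\n"
)


def render_header(guard: str, events: dict[str, str]) -> str:
    # The dynamic part is prepared outside the template to keep it simple.
    defines = "\n".join(
        f'#define EVENT_{name.upper()} "{code}"'
        for name, code in sorted(events.items())
    )
    return HEADER_TEMPLATE.substitute(guard=guard, defines=defines)


header = render_header("EVENTS_H", {"doorClosed": "D1CL", "lightOn": "L1ON"})
```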
Embedment Helper
Use when embedding general-purpose code (like in a grammar file or template) becomes complex and obscures the host representation. It moves complex code to a helper object, keeping the host representation clearer.
Model-Aware Generation
Use when the generated code needs to explicitly represent the Semantic Model (or a version of it), preserving generic-specific separation in the target code.
Model Ignorant Generation
Use when the target language is not expressive enough to represent the Semantic Model directly, requiring logic to be hardcoded into the generated code.
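To make the contrast with Model-Aware Generation concrete, here is a sketch in which the generated code contains no model at all: every transition is hardcoded as a conditional. The dictionary model and the function names are assumptions of this example, and Python stands in for the target language purely so the output can be executed here.

```python
def generate_next_state_function(transitions: dict[str, dict[str, str]]) -> str:
    """Emit a function whose control flow hardcodes every transition."""
    lines = ["def next_state(state, event):"]
    for source in sorted(transitions):
        for event in sorted(transitions[source]):
            target = transitions[source][event]
            lines.append(f"    if state == {source!r} and event == {event!r}:")
            lines.append(f"        return {target!r}")
    # Unknown combinations leave the state unchanged.
    lines.append("    return state")
    return "\n".join(lines) + "\n"


MODEL = {"idle": {"doorClosed": "active"}, "active": {"lightOn": "unlocked"}}
source_code = generate_next_state_function(MODEL)

# Execute the generated text to obtain the function it defines.
namespace: dict = {}
exec(source_code, namespace)
next_state = namespace["next_state"]
```

The generated function carries the behavior but not the structure: nothing in it can be queried or changed as a model, which is the price paid when the target cannot express the model directly.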
Generation Gap
Use to cleanly separate generated code from handwritten code, typically using inheritance, allowing safe regeneration of code.
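A minimal sketch of the inheritance arrangement, with invented class names. The base class stands in for a file that a generator overwrites on every run; the handwritten subclass lives elsewhere and overrides only what it needs.

```python
class GeneratedEventHandler:
    """Stands in for a regenerated file: never edited by hand."""

    EVENT_CODES = {"doorClosed": "D1CL", "lightOn": "L1ON"}

    def code_for(self, event: str) -> str:
        return self.EVENT_CODES[event]

    def describe(self, event: str) -> str:
        return f"{event} ({self.code_for(event)})"


class EventHandler(GeneratedEventHandler):
    """Handwritten subclass: survives regeneration of the base class."""

    def describe(self, event: str) -> str:
        # Customize behavior while still reusing the generated logic.
        return "event " + super().describe(event)


description = EventHandler().describe("lightOn")
```

Client code refers only to `EventHandler`, so regenerating the base class does not overwrite any manual changes.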
Footnotes
[1] DSLs are omnipresent in software. Examples: IaC and CI/CD automation DSLs, data manipulation (SQL, GraphQL), diagramming (DOT, PlantUML), testing (BDD's Gherkin/Cucumber), etc.
[2] Domain (Semantic) Model, Model-Driven Design, Layered Architecture, Separation of Concerns, TDD and Iterative Development.