Tuesday, July 29, 2025

How we implemented FHIRPath in Go

In the first post on Verily’s Tech Blog, we explained why we use FHIRPath, which provides a smarter way to navigate and collaborate on healthcare data. But getting it to work in code isn’t exactly plug-and-play. At Verily, our first attempt relied on regular expressions and quick fixes. It got messy and labor-intensive. So, we stepped back and did it better by using ANTLR, a tool that turns a language’s grammar into working code.

And now, we’re sharing this journey of product engineering enlightenment, and how in the end, we made FHIRPath parsing cleaner, faster, and much easier to extend.

Leveraging ANTLR

Disclaimer: Reading this might give you scary flashbacks about your compilers class. (Just kidding — we hope!)

FHIRPath is a language that can be represented by a context-free grammar. To implement it, our first challenge was building a compiler that converts FHIRPath syntax into objects that our program knows how to process. The front end of a compiler consists of a lexer and a parser, both of which are programs that read symbols from some alphabet and match them against a grammar.

Lexers recognize a regular grammar: they process symbols (characters) from the input and produce tokens. Parsers, in turn, process these sequences of tokens to match a context-free grammar and produce a parse tree. Here’s one way to think of the difference:

  • Lexers understand letters and produce words.
  • Parsers understand the meaning of the words and produce sentences.
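
To make the "letters to words" half concrete, here is a minimal hand-written lexer sketch in Go. It is only an illustration: ANTLR generates its own lexer from the grammar, and the `Token` type, the three token kinds, and the `Lex` function are hypothetical names invented for this example. It also covers only identifiers, single-quoted strings, and one-character symbols.

```go
package main

import (
	"fmt"
	"unicode"
)

// Token is a hypothetical token type for this sketch.
type Token struct {
	Kind string // "IDENT", "STRING", or "SYMBOL"
	Text string
}

// Lex turns a string of characters ("letters") into tokens ("words").
func Lex(input string) []Token {
	var tokens []Token
	runes := []rune(input)
	for i := 0; i < len(runes); {
		r := runes[i]
		switch {
		case unicode.IsSpace(r):
			i++ // whitespace separates tokens but produces none
		case unicode.IsLetter(r) || r == '_':
			start := i
			for i < len(runes) && (unicode.IsLetter(runes[i]) || unicode.IsDigit(runes[i]) || runes[i] == '_') {
				i++
			}
			tokens = append(tokens, Token{Kind: "IDENT", Text: string(runes[start:i])})
		case r == '\'':
			start := i
			i++ // consume the opening quote
			for i < len(runes) && runes[i] != '\'' {
				i++
			}
			if i < len(runes) {
				i++ // consume the closing quote
			}
			tokens = append(tokens, Token{Kind: "STRING", Text: string(runes[start:i])})
		default:
			tokens = append(tokens, Token{Kind: "SYMBOL", Text: string(r)})
			i++
		}
	}
	return tokens
}

func main() {
	for _, t := range Lex("Patient.telecom.where(use = 'home')") {
		fmt.Printf("%s(%s) ", t.Kind, t.Text)
	}
	fmt.Println()
}
```

A parser would then take this flat token stream and arrange it into a tree according to the grammar rules, which is the part that regular expressions alone cannot do.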

Actually, this is our second attempt at implementing the FHIRPath grammar. First, we tried to use naive regular expressions and string operations to implement a small subset of the supported operations. To add a new type of operation, we had to modify our implementation to include a new regex. Without a proper language compiler, it became difficult to add new types of expressions that we needed, especially given how complex some FHIRPath expressions can be.

To save us from this “regular expression rabbit hole”, we leveraged the open-source ANTLR tool, which is a parser generator. Given the FHIRPath grammar, ANTLR generates a parser object with methods that enable the implementer to walk the parse tree, unlocking the complete grammar for implementation.

This boils things down into a simpler, albeit still difficult, problem. The question remains, how do we walk the parse tree and produce something evaluatable against FHIR resources?

Two examples of parse trees. On the left, an algebraic expression and on the right, a FHIRPath expression.

Language features

The figure below demonstrates three fundamental aspects of the FHIRPath language:

  1. Input and output as a collection: The input to a FHIRPath expression is a collection of FHIR or FHIRPath data types, as is the output. In this example, the input is a singular Patient, where the telecom field is being accessed. The telecom field is a collection of entries — like an array — which then gets reduced to a single result by the where() function.
  2. Homogenous expression interfaces: The above input and output paradigm allows all FHIRPath expressions to be expressed in a homogenous way, essentially enabling infinite chaining of expressions. Neatly, it also enables the implementation of each expression to be abstracted away into a single interface, which we’ll touch on later.
  3. Function support: FHIRPath supports a vast number of defined functions, such as where(). Therefore, we had to find a way to support each function in a homogenous way.

This sample expression demonstrates three fundamental aspects of the FHIRPath language.

We decided to design the library so that FHIRPath evaluation is broken down into two steps:
  1. Compilation
  2. Evaluation

With this design, users can reuse expressions without the computational overhead of re-parsing each time. Additionally, it enables syntax errors to be caught earlier.
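
The two-step split can be sketched as follows. This is a toy under stated assumptions, not the library's API: `Compile`, `CompiledExpression`, and `Evaluate` are hypothetical names, resources are modelled as plain `map[string]any` values, and the only supported syntax is a dotted field path.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// CompiledExpression is a hypothetical handle holding the parsed form.
type CompiledExpression struct {
	fields []string
}

// Compile parses the expression once. Syntax errors surface here,
// before any data is touched.
func Compile(expr string) (*CompiledExpression, error) {
	if expr == "" {
		return nil, errors.New("syntax error: empty expression")
	}
	fields := strings.Split(expr, ".")
	for _, f := range fields {
		if f == "" {
			return nil, fmt.Errorf("syntax error in %q: empty path element", expr)
		}
	}
	return &CompiledExpression{fields: fields}, nil
}

// Evaluate walks the pre-parsed path; no parsing happens here.
func (c *CompiledExpression) Evaluate(input []any) []any {
	current := input
	for _, field := range c.fields {
		var next []any
		for _, item := range current {
			m, ok := item.(map[string]any)
			if !ok {
				continue
			}
			switch v := m[field].(type) {
			case nil:
				// field absent: contributes nothing
			case []any:
				next = append(next, v...)
			default:
				next = append(next, v)
			}
		}
		current = next
	}
	return current
}

func main() {
	if _, err := Compile("name..given"); err != nil {
		fmt.Println("caught at compile time:", err)
	}

	expr, err := Compile("name.given")
	if err != nil {
		panic(err)
	}
	patients := []map[string]any{
		{"name": []any{map[string]any{"given": []any{"Ada"}}}},
		{"name": []any{map[string]any{"given": []any{"Grace"}}}},
	}
	// The same compiled expression is reused for every resource.
	for _, p := range patients {
		fmt.Println(expr.Evaluate([]any{p}))
	}
}
```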

The system.Collection type

FHIRPath can process any of the FHIR-defined native types, as well as additional FHIRPath-defined system types.

The Golang implementation of FHIR doesn’t represent this type hierarchy as defined. Additionally, the FHIRPath spec implicitly converts between FHIR native and FHIRPath native types. To process all of these types through one homogenous interface, we defined a custom type, system.Collection, backed by []any. This gave us the flexibility to work with every type, while also letting us attach useful custom methods to the returned collection.
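
As a rough illustration, a collection type of this kind might look like the sketch below. The method names (`IsEmpty`, `Singleton`, `Strings`) are hypothetical and chosen for the example; they are not taken from the library. One Go detail worth noting: methods can only be declared on a defined type, so the sketch uses `type Collection []any` rather than an alias declaration with `=`.

```go
package main

import "fmt"

// Collection is a defined type over []any so that methods can be attached.
type Collection []any

// IsEmpty reports whether the collection has no elements.
func (c Collection) IsEmpty() bool { return len(c) == 0 }

// Singleton returns the only element, or an error if the collection
// does not contain exactly one element.
func (c Collection) Singleton() (any, error) {
	if len(c) != 1 {
		return nil, fmt.Errorf("expected exactly 1 element, got %d", len(c))
	}
	return c[0], nil
}

// Strings converts every element to a string, failing on the first
// element of another type (a run-time type check).
func (c Collection) Strings() ([]string, error) {
	out := make([]string, 0, len(c))
	for i, item := range c {
		s, ok := item.(string)
		if !ok {
			return nil, fmt.Errorf("element %d is %T, not string", i, item)
		}
		out = append(out, s)
	}
	return out, nil
}

func main() {
	telecom := Collection{"555-0100", "ada@example.org"}
	values, err := telecom.Strings()
	fmt.Println(values, err)

	_, err = telecom.Singleton()
	fmt.Println(err)
}
```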

Steps to homogenous expression interface

We broke down all FHIRPath expressions to follow one homogenous interface that accepts a system.Collection as input, does some processing, and produces a system.Collection as output.

Here’s how we modelled this in Golang:
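
The original listing is not reproduced here, so the following is a minimal sketch of what such an interface could look like. The names `Expression`, `FieldExpression`, and `Sequence` are assumptions made for illustration, and elements are modelled as plain maps rather than FHIR protos.

```go
package main

import "fmt"

// Collection stands in for system.Collection in this sketch.
type Collection []any

// Expression is the single interface every expression type implements:
// a collection goes in, a collection comes out.
type Expression interface {
	Evaluate(input Collection) (Collection, error)
}

// FieldExpression (hypothetical) selects a named field from each element.
type FieldExpression struct{ Name string }

func (e FieldExpression) Evaluate(input Collection) (Collection, error) {
	out := Collection{}
	for _, item := range input {
		m, ok := item.(map[string]any)
		if !ok {
			return nil, fmt.Errorf("cannot read field %q from %T", e.Name, item)
		}
		switch v := m[e.Name].(type) {
		case nil:
			// field absent: contributes nothing
		case []any:
			out = append(out, v...) // flatten repeated fields
		default:
			out = append(out, v)
		}
	}
	return out, nil
}

// Sequence (hypothetical) chains expressions by piping each output
// into the next input. It is itself an Expression.
type Sequence []Expression

func (s Sequence) Evaluate(input Collection) (Collection, error) {
	current := input
	for _, e := range s {
		out, err := e.Evaluate(current)
		if err != nil {
			return nil, err
		}
		current = out
	}
	return current, nil
}

func main() {
	patient := map[string]any{
		"telecom": []any{
			map[string]any{"system": "phone", "value": "555-0100"},
			map[string]any{"system": "email", "value": "ada@example.org"},
		},
	}
	expr := Sequence{FieldExpression{"telecom"}, FieldExpression{"value"}}
	fmt.Println(expr.Evaluate(Collection{patient}))
}
```

Because `Sequence` satisfies the same interface as its parts, chains can nest inside other chains, which is what makes arbitrarily long expressions possible.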

Next, we defined a custom expression type for each type of FHIRPath expression that implements this interface. The result: the compiler parses the expression, and based on the expression that it is currently visiting, produces one of these expression types.

The figure below is an example implementation for the TypeExpression. The first path element to a FHIRPath expression is a filter on the resource name. We implemented this filter with the Google FHIR proto API.
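
The original figure is not reproduced here. As a stand-in, the sketch below shows the shape of such a filter. It deliberately does not use the Google FHIR proto API: the `Resource` interface, its `ResourceType` method, and the `Patient` and `Observation` structs are hypothetical placeholders for whatever the proto API exposes.

```go
package main

import "fmt"

// Collection stands in for system.Collection in this sketch.
type Collection []any

// Resource is a hypothetical stand-in for a FHIR resource message.
type Resource interface {
	ResourceType() string
}

type Patient struct{ ID string }

func (Patient) ResourceType() string { return "Patient" }

type Observation struct{ ID string }

func (Observation) ResourceType() string { return "Observation" }

// TypeExpression keeps only the input elements whose resource type
// matches the first path element, e.g. "Patient" in Patient.telecom.
type TypeExpression struct{ Type string }

func (e TypeExpression) Evaluate(input Collection) (Collection, error) {
	out := Collection{}
	for _, item := range input {
		if r, ok := item.(Resource); ok && r.ResourceType() == e.Type {
			out = append(out, item)
		}
	}
	return out, nil
}

func main() {
	input := Collection{Patient{ID: "p1"}, Observation{ID: "o1"}, Patient{ID: "p2"}}
	out, err := TypeExpression{Type: "Patient"}.Evaluate(input)
	fmt.Println(len(out), err)
}
```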

After putting it all together, we produced these expressions while parsing the FHIRPath expression, and combined expressions by piping the output of one to the input of another. We’ll discuss the details of how this works in the next post of Verily’s Tech Blog.

Supporting functions

FHIRPath defines a set of functions as part of the specification. Similarly to how expressions are defined, functions all fit within one common interface: they take in an input system.Collection and spit out an output system.Collection. However, they also support a set of parameters, from 0 to N. To massage the interface to support all different kinds of functions with varying input parameters, we defined the abstraction as shown in the figure below.

To check function arity at compile time, we defined a minimum and maximum arity for each function. This was essential because catching errors at compile time means fewer failures at run time. However, because function arguments can be of any type, or can even be FHIRPath expressions themselves, we could not implement compile-time type checking. Instead, we verify the types at run time.
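
One way to picture this split between compile-time arity checks and run-time type checks is the sketch below. The name `FHIRPathFunc` comes from this post, but its signature here is an assumption, as are the `Function` struct, the `registry` map, and `CompileCall`. For brevity, arguments are passed as already-evaluated values rather than as expressions.

```go
package main

import "fmt"

// Collection stands in for system.Collection in this sketch.
type Collection []any

// FHIRPathFunc takes the input collection plus 0..N arguments.
type FHIRPathFunc func(input Collection, args ...any) (Collection, error)

// Function pairs an implementation with its allowed arity range.
type Function struct {
	MinArity, MaxArity int
	Impl               FHIRPathFunc
}

var registry = map[string]Function{
	"first": {MinArity: 0, MaxArity: 0, Impl: first},
	"take":  {MinArity: 1, MaxArity: 1, Impl: take},
}

func first(input Collection, _ ...any) (Collection, error) {
	if len(input) == 0 {
		return Collection{}, nil
	}
	return input[:1], nil
}

func take(input Collection, args ...any) (Collection, error) {
	// Argument types can only be verified at run time.
	n, ok := args[0].(int)
	if !ok {
		return nil, fmt.Errorf("take: expected an integer, got %T", args[0])
	}
	if n < 0 {
		n = 0
	}
	if n > len(input) {
		n = len(input)
	}
	return input[:n], nil
}

// CompileCall resolves the function and checks arity before any data
// is evaluated, then returns a closure to run later.
func CompileCall(name string, args ...any) (func(Collection) (Collection, error), error) {
	fn, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("unknown function %q", name)
	}
	if len(args) < fn.MinArity || len(args) > fn.MaxArity {
		return nil, fmt.Errorf("%s: expected between %d and %d arguments, got %d",
			name, fn.MinArity, fn.MaxArity, len(args))
	}
	return func(input Collection) (Collection, error) {
		return fn.Impl(input, args...)
	}, nil
}

func main() {
	_, err := CompileCall("take") // wrong arity: rejected at compile time
	fmt.Println(err)

	call, _ := CompileCall("take", "two") // arity is fine, type is not
	_, err = call(Collection{1, 2, 3})    // rejected at run time
	fmt.Println(err)
}
```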

Another unique implementation feature is that we support the injection of custom Go functions at compile time. With some protoreflect magic, we can now convert regular Go functions into the FHIRPathFunc abstraction, making it easy to use completely bespoke functions that the FHIRPath specification doesn’t support.

We did this because most of our users don’t know protoreflect. Letting developers define their functions in plain Go, in a way that still interoperates seamlessly with the FHIR API, seemed like the most helpful approach.
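
To give a feel for the idea, the sketch below adapts an ordinary Go function into a generic function value. It uses the standard library `reflect` package as a simplified stand-in for the protoreflect-based approach described above. The `Wrap` helper, the assumed `FHIRPathFunc` signature, and the `startsWith` example function are all hypothetical.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Collection stands in for system.Collection in this sketch.
type Collection []any

// FHIRPathFunc is the assumed generic function shape.
type FHIRPathFunc func(input Collection, args ...any) (Collection, error)

// Wrap adapts a function of the form
// func(Collection, T1, ..., Tn) (Collection, error) into a FHIRPathFunc.
func Wrap(fn any) (FHIRPathFunc, error) {
	v := reflect.ValueOf(fn)
	if v.Kind() != reflect.Func {
		return nil, fmt.Errorf("expected a function, got %T", fn)
	}
	t := v.Type()
	collType := reflect.TypeOf(Collection(nil))
	errType := reflect.TypeOf((*error)(nil)).Elem()
	if t.IsVariadic() || t.NumIn() < 1 || t.In(0) != collType ||
		t.NumOut() != 2 || t.Out(0) != collType || t.Out(1) != errType {
		return nil, fmt.Errorf("unsupported signature %s", t)
	}
	return func(input Collection, args ...any) (Collection, error) {
		if len(args) != t.NumIn()-1 {
			return nil, fmt.Errorf("expected %d arguments, got %d", t.NumIn()-1, len(args))
		}
		in := []reflect.Value{reflect.ValueOf(input)}
		for i, arg := range args {
			av := reflect.ValueOf(arg)
			// Argument types are verified at call time.
			if !av.IsValid() || !av.Type().AssignableTo(t.In(i+1)) {
				return nil, fmt.Errorf("argument %d: expected %s, got %T", i, t.In(i+1), arg)
			}
			in = append(in, av)
		}
		out := v.Call(in)
		result, _ := out[0].Interface().(Collection)
		err, _ := out[1].Interface().(error)
		return result, err
	}, nil
}

// startsWith is an ordinary Go function a developer might want to inject.
func startsWith(input Collection, prefix string) (Collection, error) {
	out := Collection{}
	for _, item := range input {
		if s, ok := item.(string); ok && strings.HasPrefix(s, prefix) {
			out = append(out, s)
		}
	}
	return out, nil
}

func main() {
	fn, err := Wrap(startsWith)
	if err != nil {
		panic(err)
	}
	fmt.Println(fn(Collection{"home", "work", "hotel"}, "ho"))
	_, err = fn(Collection{"home"}, 42) // wrong argument type
	fmt.Println(err)
}
```

The appeal of this design is that the author of `startsWith` writes typed Go and never touches reflection; the adapter absorbs that complexity once.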

What we shared, what’s to come

By using the parser generator, ANTLR, we turned our FHIRPath engine from a patchwork of hacks into something much more stable and flexible. Now, we catch errors early, reuse compiled expressions, and even support custom functions — all without battling the system. At Verily, it’s transformed the internal mechanics of how our product engineering team works with healthcare data.

Watch for the next post of Verily’s Tech Blog that explores challenges we encountered during the FHIRPath implementation, and the tricks, including some that could be described as fancy, that we used to overcome issues and create persistent improvements.