Validation Expressions

I am currently plowing through a C code base to port its functionality to an FSharp library. An interesting experience in itself, I often find myself wondering how to translate imperative constructs into concise and idiomatic FSharp code.

One of these patterns in imperative programming languages I personally also use a lot, is the die young approach, where methods/procedures validate input and return early on to save CPU cycles by passing control back to the caller.

int validate(int arg) {
  if(arg < 10)
    return 1;

  if(arg > 5)
    return 1;

  return 0;
}

This is concise and efficient. When compiling this simple procedure to assembler it is possible to see that the return statements are (conditional) jmp instructions to label .L3, preceded by moving the desired return value to %eax.

validate:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	movl	%edi, -4(%rbp)
	cmpl	$9, -4(%rbp)
	jg	.L2
	movl	$1, %eax
	jmp	.L3
.L2:
	cmpl	$5, -4(%rbp)
	jle	.L4
	movl	$1, %eax
	jmp	.L3
.L4:
	movl	$0, %eax
.L3:
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc

However, this notion jumping around locations in the program (for instance by setting the EIP register in the case of x86 assembly code) does not exist in functional languages. Expressions get evaluated in their entirety, as there is no way to define what a half-evaluated expression might conceptually be, type-wise.

This is a genuine difference in approach between the functional and imperative mind-sets and provides for interesting challenges when attempting to model them in terms of the other.

In fact, when emulating this approach in F# one quickly ends up with an utter mess. Conciseness and brevity get lost due to having to supply implementations for all if/then/else branches in order for the program to type-check.

This code does not scale well at all. Think more complex validation logic or adding statements, and even in this short form, it is already much harder to read and understand.

let mytest i = 
  if i < 10
  then true
  else if i > 5
       then true
       else false

A (slightly) better solution

Based on the idea of the Option/Maybe Monad, I came up with the following approach. Given that I want a simple Boolean answer to a set of questions about the validity of some input, my type ended up looking like this:

type Continue<'a> = 
  | Continue of 'a
  | Return of bool

In order to run a computation encoded in this type, we simply match on the constructor to see if we need continue evaluating it, or stop to return the Boolean return value.

let rec runCont = function
  | Continue _ as cont -> runCont cont
  | Return b -> b

But caution! This function diverges if the validation run never produces a Return value. Hence, we need to be careful to always end the chain of validations with a default return value (in our C program that was 0 or false).

In order to validate an assumption about some input, we need a simple combinator which takes a predicate, a Boolean Return value in case the predicate evaluates to true, as well as the input value.

let validate (pred : 'a -> bool) (ret : bool) (value : 'a) = 
  if pred value  
  then Return(ret)
  else Continue(value)

// inline combinator
let (<??>) = validate

Instead of passing a predicate, we could already pass its result and make this function even simpler. However, this leads to much less efficient code, as all predicates would have to get evaluated before passed to validate and regardless of whether they are in fact needed at all.

Next, we need to implement a combinator that chains together multiple validation operations, the bind function, as well as a way to bring an ordinary value (in this case a simple true/false) into the world of Continue<‘a>.

let bind (v : Continue<'a>) (f : 'a -> Continue<'b>) =
  match v with
    | Continue(v') -> f v'
    | Return(b) -> Return(b) // necessary due to 'a -> 'b type conversion
    
let (.>) = bind

bind looks at the value v and, when it finds a Continue constructor, applies the function f to its enclosed value. If not, the calculation stops here with the Return value.

Armed with this minimal set of tools we can already build some validations:

(id <??> false) true 
.> returnC true
|> runCont

let input = 8

input
|> (((<) 10) <??> true)
.> (((>) 5)  <??> true)
.> returnC false
|> runCont

While one might be able to argue that the above is more declarative than plain if/then/else expressions, it is certainly not easy to understand what is going on to a new user of this APO.

However, we can clean up this code some more, by creating a computation expression for our Continue type, which provides some syntactic sugar to make the validation logic more clear.

Validation Expressions

The inspiration for this approach comes from an example off of Scott Wlaschin’s invaluable F# resource. The programming interface we’d like to achieve is along the lines of:

* evaluate predicate and, if true, return supplied result 
* evaluate predicate and, if true, return supplied result 
* ...
* no matches, return default result

The implementation of the ValidationBuilder is pretty short, since we have already defined most of the building blocks before.

type ValidationBuilder() =
  member x.Bind(comp, func) = bind comp func
  member x.Return(value) = Return value
  member x.ReturnFrom(value) = value
  member x.Delay(f) = f               // return the _function_ for lazy evaluation
  member x.Run(f) = f()               // _evaluate_ the delayed function above
  member x.Combine(a,b) =
    match a with
      | Continue _ -> b()             // the predicate failed, so continue to evaluate
      | Return _   -> a               // return first value, as the validation passed

Key are the Combine, Delay and Run methods. Because we would like to short-circuit, i.e. interrupt a validation process once a check passed, we have to make sure that we pass around closures of the next computation (in this case b in Combine) instead of the evaluated result. We would otherwise compute all checks regardless of the end result.

Translated to computation expression syntax, our validator now looks like this:

let lowerTenPrime input = runCont (validation {
  return! validate ((=) 1) true input
  return! validate ((=) 2) true input
  return! validate ((=) 3) true input
  return! validate ((=) 5) true input
  return! validate ((=) 7) true input
  return false
  })

Conclusion

To me this looks like a useful abstraction for the problem at hand. It is possible to emulate imperative-style coding patterns through the use of computation expressions and reach a high level of conciseness and readability, while maintaining efficiency.