Over the last few months, I've started making an attempt (with the help of my patient coworkers) to more intentionally learn the programming language Haskell. I have professional reasons why this is directly important, but it's also been a good way to challenge some long-held assumptions. Being purely functional and lazily evaluated, Haskell lacks features like implicitly sequenced evaluation, forcing you to reconstruct sequencing where it is needed. Sequence is among the first concepts taught in programming, and it can be disconcerting to have to consider it explicitly. Despite my good intentions and years of experience with functional programming (mainly in various Lisps), the Haskell learning curve has presented challenges. It has been a rewarding journey with a number of connections to other parts of software design. I hope to share some of that here, and hopefully draw some connections that will make it easier to approach this content whether you use Haskell or not.
To see what I mean about the challenges of the Haskell learning curve, there is no better place to start than the beginning. First consider the definition of a function to double an integer.
doubleInt :: Int -> Int
doubleInt x = x * 2
The first line declares the type of doubleInt. It is a function from Int to Int. The arrow in the type declaration (->) signifies doubleInt as a function. Types without arrows are value types. With that in mind, consider this complete Haskell program. As you might guess, this is a classic "Hello World" with the message split on two output lines.
main :: IO ()
main = do
  putStrLn "Hello"
  putStrLn "World"
If you read carefully, you may notice that even though main defines the entry point for the program, it is not actually a function. There is no arrow in the type declaration, and so main is declared to be a value. Specifically, main is declared to be a value of type IO (). To give a more direct sense of how profoundly strange that is, consider a similar Python program also written with main as a value:
main = [
    lambda: print('hello'),
    lambda: print('world')
]
To bring this example at least a little closer to the Haskell version, it is also possible to build main using functions to streamline the syntax:
def putStrLn(msg):
    return lambda: print(msg)

main = [
    putStrLn('hello'),
    putStrLn('world')
]
Run as written, neither of these Python programs does anything useful. They both allocate some data structures, bind them to a variable (main), and immediately exit without doing anything useful. The second version is more sophisticated in that it uses Python code to help build those data structures, but it is fundamentally the same. In both versions, there is a gap between the value of main and externally visible action.
It is not necessarily obvious, but the Haskell program has the same problem. The only reason it works in Haskell is that there is magic in the Haskell language runtime. This magic knows how to interpret an IO value and turn it into the actions that might make the program useful.
The gap between main and useful action can be solved in the Python code by bringing our own magic. This is code that takes the array value of main, imposes a meaning, and executes according to that interpretation. A simple example might just execute the array elements in sequence:
def putStrLn(msg):
    return lambda: print(msg)

main = [
    putStrLn('hello'),
    putStrLn('world')
]

def run(blocks):
    for block in blocks:
        block()

run(main)
Run this version of the program, and you get the following output:
$ python main.py
hello
world
This is exactly the same output as the alternative implementation you might be wishing I had begun with:
def main():
    print('hello')
    print('world')

main()
This simpler version relies on the property of Python that statements execute in the order in which they are written. This is built into the language, can be assumed, and does not need to be stated explicitly as it is in the run function. Underneath, however, the Python code is interpreted in very similar fashion. There is an array of operations and a loop that processes the contents of that array according to their semantics. You can inspect the array with the Python bytecode disassembler:
>>> dis.dis(main)
1 0 RESUME 0
2 2 LOAD_GLOBAL 1 (NULL + print)
14 LOAD_CONST 1 ('hello')
16 PRECALL 1
20 CALL 1
30 POP_TOP
3 32 LOAD_GLOBAL 1 (NULL + print)
44 LOAD_CONST 2 ('world')
46 PRECALL 1
50 CALL 1
60 POP_TOP
62 LOAD_CONST 0 (None)
64 RETURN_VALUE
Dropping down a level, the Python interpreter itself works the same way. In this case, the 'array' is machine code, and the 'loop' is the hardware in the CPU (working in conjunction with the operating system) executing the intent of that code. This is as good a time as any to point out that run is about as simple an interpreter as it gets, and it is possible to elaborate that same basic idea extensively.
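To give a sense of that elaboration, here is a small Python sketch of a slightly richer interpreter. The operation names and the tagged-tuple encoding are invented for illustration: instead of bare thunks, run walks a list of tagged operations and gives each tag its own meaning.

```python
# A slightly richer interpreter: the program is a list of tagged
# operations, and the interpreter assigns each tag a meaning.
# The tags ('print', 'set', 'print_var') are invented for this sketch.
def run(program):
    env = {}
    for op, *args in program:
        if op == 'print':
            print(*args)
        elif op == 'set':           # bind a name to a value
            name, value = args
            env[name] = value
        elif op == 'print_var':     # print a previously bound name
            print(env[args[0]])
        else:
            raise ValueError(f'unknown operation: {op}')

main = [
    ('print', 'hello'),
    ('set', 'place', 'world'),
    ('print_var', 'place'),
]

run(main)
```

Nothing about main changed in character: it is still inert data until run imposes a meaning on it, but the meaning imposed can be as rich as you care to make it.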
The point here is that the idea of expressing control flow as data to be interpreted by some external magic is not such a strange concept. In fact, it is essential to the way modern computers work. While all this hints at the power of what is happening with main and run, it leaves open the question of why you would bother. Why would you ever bother with such explicitly managed control flow when implicit control flow is already so widely available?
The justification boils down to control. A program that explicitly manages its execution is a program that has control over its execution. In the case of Haskell, the need for this control is easy to justify. Haskell is functional, lazily evaluated, and does not offer the same sort of implicit ordering provided by languages like Python. Given that I/O operations typically assume at least some level of ordering, Haskell needs a way to reconstruct ordering when needed. This role is served by IO and its associated machinery. If you have ever wondered why Haskell I/O is built in terms of category theory and usually covered in something like Chapter 7 of whatever Haskell book you are reading, this is why.
As ever, there is an alternative. Early versions of Haskell handled sequencing by delegating I/O operations to an external server process. This server process consumed a stream of I/O requests and produced a stream of I/O responses. Haskell code could then be expressed as a mapping across these two streams. This essentially outsourced the ordering concerns to the outside world, leaving Haskell functionally pure.
Here is a rough example of what my Hello World might look like in that style. (Note that unlike before, this main is actually a function.)
data Request = PutStrLn String | GetLine | Exit | ...
data Response = Success | Str String | ...

main :: [Response] -> [Request]
main _ = [
    PutStrLn "hello",
    PutStrLn "world",
    Exit
  ]
It is probably obvious at this point, but I have never written code in this style. Even so, it is easy to imagine how awkwardly it scales to force all I/O through two sequential streams and an external process. I do not know how long it took, but it seems the Haskell community quickly abandoned that approach in favor of the one we have been discussing: monadic I/O.
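To make the stream-based model a little more tangible, here is a loose Python sketch of it. The request names mirror the Haskell sketch above, but the rest is my own invention, and it glosses over the fact that real Haskell consumed the response stream lazily: main maps responses to requests, and an external loop plays the part of the I/O server.

```python
# A loose sketch of the old stream-based I/O model. The 'program' (main)
# is a pure function from responses to requests; the serve loop below
# stands in for the external I/O server process.
def main(responses):
    return [
        ('PutStrLn', 'hello'),
        ('PutStrLn', 'world'),
        ('Exit',),
    ]

def serve(program):
    responses = []
    for request in program(responses):
        if request[0] == 'PutStrLn':
            print(request[1])
            responses.append('Success')
        elif request[0] == 'Exit':
            break

serve(main)
```

Even in this toy form, you can see the awkwardness: every kind of I/O the program might ever want has to be squeezed through one request encoding and one response encoding.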
With monadic I/O, Haskell programs are expressed as values of type IO. These values represent I/O operations interleaved with computation. Internally, they are data structures with an explicit sequence. If the mapping from the code to the data structure is not obvious, it is because Haskell has a well-developed syntax for making monadic I/O code look a little more imperative. The do notation is a good example. do looks sequential, but it is really syntactic sugar for an expression that constructs a value using a specific pattern.
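To make "a value with an explicit sequence" concrete outside Haskell, here is a hypothetical Python sketch. This is not how GHC actually represents IO; it is just an illustration of an action encoded as a data structure that a separate interpreter walks.

```python
# An action encoded as data: Put holds a line to print and the next
# action; Done marks the end. This is an illustration, not GHC's
# actual representation of IO.
class Put:
    def __init__(self, line, next_io):
        self.line = line
        self.next_io = next_io

class Done:
    pass

# Roughly what do { putStrLn "Hello"; putStrLn "World" } describes:
main = Put('Hello', Put('World', Done()))

# The interpreter walks the structure and performs each step in order.
def run(io):
    while isinstance(io, Put):
        print(io.line)
        io = io.next_io

run(main)
```

Building main produces no output at all; the sequencing lives in the nesting of the data structure, and only run turns it into action.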
Once assembled, values of type IO can be passed to an internal component of the Haskell runtime for interpretation. The runtime does this automatically for main. A Haskell program starts up, computes main, and then immediately passes it to a component inside the runtime that knows what to do with an IO.
For users of traditional (non-functional) languages, this may sound limiting, but it turns out not to be. Haskell includes functions among its value types, so the full power of Haskell functions is available within IO values. Haskell code is used to construct IO values that contain Haskell code. The use of IO just provides a specific structure and establishes a boundary between the code that cares about order and the code that does not. This is similar to the partitioning achieved in the older stream-based I/O model, with the exception that it is fully expressed within Haskell, type checked, and does not require reducing all I/O to two streams.
Even without quite that pressing a need to manage sequence, there is still utility in taking explicit control. For languages that already have a built-in notion of sequence, the benefit runs in the opposite direction. Control flow is usually modeled at the language level as a single stream of execution. Taking more control can make it possible to run multiple streams at once, suspend or alter computations in the event of failures, pause and restart, or selectively introduce concurrency. If these sound like operating system facilities, they usually are. The difference is that at the operating system level, they are often provided at a higher cost than what can be done if managed internally to a process. A thread is expensive, but a list of operations is not. There is a reason so much of high-throughput server computing is expressed in terms of asynchronous operations with control flow expressed in data.
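As a sketch of what "multiple streams at once" can look like when control flow is managed inside the process, here is a toy Python scheduler, entirely my own construction, that interleaves generator-based tasks without any OS threads. Each task is a suspended computation; yield hands control back to the scheduler.

```python
from collections import deque

# Each task is a generator: it does a little work, then yields to
# suspend itself and let the scheduler decide who runs next.
def task(name, count):
    for i in range(count):
        print(f'{name} {i}')
        yield

# A round-robin scheduler: run each task until its next yield, then
# move it to the back of the queue; drop it when it finishes.
def run_all(tasks):
    queue = deque(tasks)
    while queue:
        t = queue.popleft()
        try:
            next(t)
            queue.append(t)   # not finished; reschedule
        except StopIteration:
            pass              # task complete

run_all([task('a', 2), task('b', 2)])
```

The two tasks take turns printing, even though there is only one thread of execution: the "concurrency" is just a list of suspended computations and a loop that chooses what runs next.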
Closing out, there is another parallel I would like to call out. If you have worked with Promises in JavaScript or other languages, the notion of modeling control flow as a data structure might feel vaguely familiar. Consider the following:
new Promise((resolve) => resolve())
.then(() => console.log('hello'))
.then(() => console.log('world'));
This code constructs a description of a desired computation in the form of a linked list of promises: a control flow graph. The control flow graph is then implicitly passed into the JavaScript runtime for evaluation.
The previous code explicitly lists out each step in the control flow graph (log hello, then log world), but the full power of JavaScript is available when constructing a sequence of actions. The following uses promises to print the numbers from 0 to 4. It constructs a chain of promises rather than printing the numbers directly.
function putStrLn(msg) {
  return () => console.log(msg);
}

let p = new Promise((resolve) => resolve());
for (let ii = 0; ii < 5; ii++) {
  p = p.then(putStrLn(ii));
}
I hope the parallel here is clear. The Python code used Python to generate an array to be interpreted. The Haskell code used Haskell to generate an IO to be interpreted. This JavaScript uses JavaScript to generate a chain of promises to be interpreted. In all three cases, there are elements both of code generation and interpretation.
If this example seems a bit contrived, printing a list of numbers very much is. The pattern in general is not. A similar scenario came up practically at my day job not too long ago. I was building a data conversion process at the time. The process needed to query for a list of IDs and then issue a series of asynchronous commands to convert batches of data items in sequence. The solution involved standard functional programming techniques to partition the IDs into batches, followed by construction of a chain of promises to issue the API calls. The details were different, but the structure was the same.
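The shape of that job, with all the real details replaced by invented stand-ins, might be sketched like this: partition the IDs into batches, build a chain of deferred operations, and then have a loop execute them in sequence. (The real version chained promises around asynchronous API calls; this sketch uses plain thunks and a synchronous loop.)

```python
# Split a list into fixed-size batches using a list comprehension.
def partition(items, size):
    return [items[i:i + size] for i in range(0, len(items), size)]

# Stand-in for issuing an asynchronous conversion command; returns a
# deferred operation rather than doing the work immediately.
def convert_batch(batch):
    return lambda: print(f'converting {batch}')

ids = list(range(7))
chain = [convert_batch(batch) for batch in partition(ids, 3)]

# The interpreter: execute the deferred operations strictly in sequence.
for step in chain:
    step()
```

Building chain issues no commands at all; as with main and the promise chain, the work is described first as data and performed later by the loop at the bottom.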
There are two key insights I would like to leave you with. The first is that the justification for these techniques comes from the ability to more fluently control your program's execution. Sometimes you need more structure, sometimes you need less structure, and sometimes you just need something different than the default. Being explicit about it can open up new possibilities.
The second insight is that in all of these cases, the code that you write as a developer is not quite the code that directly executes a computation. Rather, you are writing code that computes a description of a computation. The description is then processed elsewhere, according to slightly different rules. This is the origin of the technique's power, but if it feels harder to write code in this style, it probably should. If it feels harder to understand a stack trace from this sort of code, again, it probably should.
For me, I have found it helpful over the years to understand the concepts I use with at least some amount of depth. This makes it easier to understand and predict the design. This also makes it easier to develop instincts for how to use a given technique, which is particularly important if there is no other option. I hope that what I have written here provides a useful viewpoint into what it means to be explicit about control flow, a few places where you will see it, and how to use it effectively.