Simplicity and Global Mutable Values

At the beginning of this series, I said I was going to annoy every programmer in the world in some way or other. So hang on, here comes one of those super hot takes that I was promising:

Global variables… are bad.

Alright, for those of you who were not so offended that you instantly snapped your browser window shut, obviously I am aware that this is not news. The higher-skill end of the programming world has been well aware that global variables are dangerous since before functional programming1 was even conceived of. Therefore, clearly, it is completely possible to figure out, even in a purely-imperative world, that they are a bad idea.

Functional programming provides a good way to view why they are bad. It may not be “the” way, but it is a good way. This view comes to functional programming via its academic heritage, where it was developed for the purposes of proof.

Suppose you have this Go code, which trivially translates to most other imperative languages2:

package main

import "fmt"

var message = "Hello World!"
var greetCount = 0

func Greet() {
    fmt.Println(message)
    greetCount++
}

What are the arguments of the Greet function?

The function signature is supposed to answer that question. In this case, it has no arguments. It also returns nothing.

But you can also view this function from another point of view. For a variety of reasons ranging from “generally liking math” to “wanting to be able to use the tools of math to prove things about code”, academia would like to turn the Greet programming-language function into a true mathematical function.

A true mathematical function depends only on its inputs, and has only outputs. It does not “change” anything; it is simply a mapping of inputs to outputs. If you want the function to be affected by something, it must be in its inputs, and there are no secret side channels to communicate or change anything else; there are only the outputs of the function.

The particular combination of the way we teach math and the way we learn programming sometimes makes this difficult to conceive of. We tend to learn “mathematical functions” as these cute little things that take one or two real numbers and output another one. Some teachers even get confused and teach their students that “a function may only return one value for a given input” means a function can only return a real number or some other similarly atomic thing. But in reality, the input and the output of a function may be arbitrarily large; nothing stops a function from taking a thousand complicated matrices as input and emitting a million new functions as its output. You can make anything into a mathematical function just by enlarging the input and output until they are big enough to handle the function.

So to model the Greet function above in this style, you need something like

func Greet(message string, prevCount int) (newCount int) {
    fmt.Println(message)
    return prevCount + 1
}

But, uh, wait a second here. We’ve still got a change happening in the world with the fmt.Println. That’s not a mathematical function either. So now you need something like

type IOOperation interface {
    // lots of stuff here that is its own lengthy
    // diversion to discuss
}

type IOPrintMessage struct {
    Message string
}

func Greet(message string, prevCount int) (ops []IOOperation, newCount int) {
    return []IOOperation{
        IOPrintMessage{message},
    }, prevCount + 1
}

This is now a mathematical function. You could write a function to examine the count and print only if the count is even or something:

func Greet(message string, prevCount int) (ops []IOOperation, newCount int) {
    if prevCount%2 == 0 {
        return []IOOperation{
            IOPrintMessage{message},
        }, prevCount + 1
    }
    return []IOOperation{}, prevCount + 1
}

and so on.

Now you need some sort of execution engine to run this function and process the newCount so that it will be placed correctly into the arguments of any future Greet calls, as well as an engine to run the IOOperations3, and an engine to allow a programmer to somehow put all these functions together into a useful program since now these pure functions are just sitting there, ambiently existing, rather than the vivacious and proactive “functions” we’re used to having in imperative languages. That’s beyond the scope of this article, but to see what it looks like, see Haskell4.

And even this is simplified just to sketch the point and move on; if your function also accepts user input, this model breaks down… but that’s a separate point.
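
Still, for concreteness, here is a minimal, hypothetical sketch of what such an engine might look like in Go. None of this is from a real library or framework; the names are mine, and it assumes the IOOperation and IOPrintMessage definitions above:

// Engine threads the count through successive calls of the pure Greet,
// then interprets the returned IOOperations itself, so the actual side
// effects happen only here, at the edge of the program.
func Engine(times int) {
    count := 0
    for i := 0; i < times; i++ {
        var ops []IOOperation
        ops, count = Greet("Hello World!", count)
        for _, op := range ops {
            // Interpret each pure description of an effect.
            switch op := op.(type) {
            case IOPrintMessage:
                fmt.Println(op.Message)
            }
        }
    }
}

The Greet functions never print anything themselves; they only return descriptions of printing, and the engine decides what to do with them.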

My point in this particular post is the view that FP can provide on why globals are bad. It is also a well-known fact in the imperative world that functions with dozens of arguments are a bad idea. They are difficult to understand due to the size of their state space.

Functional programming gives us a point of view where all global variables are additional parameters to the function call of any function that can see them. For instance, in the Go examples above, the message was not exported, so it would only affect that module. This is technically not “global”, only module-scoped. But you can also export it so the whole program can indeed see it, and then it is basically appended to every function’s parameter list in the program, and to every set of return values as well. In the presence of a global value, a programmer must examine a function, and the transitive closure of all functions it may call, in order to determine whether a given global variable may be used or modified.
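
For instance, consider this hypothetical pair of functions, reusing the greetCount variable from the first example:

// ResetGreeter's signature admits to no inputs or outputs, yet calling
// it mutates module-level state by way of its callee.
func ResetGreeter() {
    clearStats()
}

func clearStats() {
    greetCount = 0 // hidden in the transitive closure of ResetGreeter
}

Nothing in ResetGreeter’s signature warns you about this; you only find out by reading clearStats as well.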

That means that in a program with hundreds of global variables, the simplest possible function that nominally takes no arguments and returns no values actually takes hundreds of arguments and returns hundreds of values!

No wonder extensive use of globals can trash an entire program.

This means that every component that uses all these values is that much less simple, and that much harder to put together into the component-based architecture I was pitching in the previous post. It is true that it is at least the same set of global variables, so it is not as bad as trying to assemble an architecture out of components that take hundreds of different parameters to each function call, but it still means the code is much more complicated than a casual reading of the source code would indicate.

Mathematically, you could model the functions as only taking the variables they use in the function themselves and only outputting the variables they could conceivably change, which is quite a bit less bad than all functions simply unconditionally taking them all. However, a human being trying to use one of these functions can’t tell which of these globals may be used or changed, since it isn’t in the function signature, so as a practical matter, every individual global value a function can see adds a little to the real-world complexity of the function, enough that it starts to add up over dozens and hundreds of such values.

Obviously, in practice, a global is not literally exactly equivalent to adding an input and output parameter to every function. If it were so, globals would not be “dangerous”, but rather, “unusable”. But they do corrode away the program as they are added, and while at first they tend not to cost too much on the margin, many of us have witnessed programs where the globals passed an event horizon and interacted until the program became insanely complicated.

Call it perhaps the logarithm of the number of global variables added as complexity to every function… but that log value is then multiplied by the number of functions you have, so it still hurts pretty badly. The log of the number of global variables may sort of level off as the program increases in size, but the number of times you pay the penalty does not.

So What Does Functional Programming Have To Say About Globals?

In Haskell, if you have a function of Int -> Int, that function receives one value and returns one. The inside of that function may reference other values, but since all external values are immutable, it is still equivalent to a constant function that is term-rewritten to have that value directly incorporated, e.g.,

someValue :: Int
someValue = 6

increment :: Int -> Int
increment x = x + someValue

increment may superficially seem to resemble an imperative function that incorporates an external value, but because there is no way to change someValue, increment is in every way equivalent to the constant function increment x = x + 6. In general, const values don’t count towards the global variable count; only mutable global values do.

As a result, a real Haskell program that is doing real work is doing so from a whole bunch of components that are individually much simpler than most imperative components.

It is tempting to think that imperative code requires the use of these sorts of tools. Granted, the imperative world has largely gotten over the use of globals, but there are still plenty of languages that provide a variety of ways to make functions “bigger”; for example, as another exercise to the reader, you can consider what a method on an instance of some class with a complicated inheritance hierarchy, changing values on multiple levels, looks like from the mathematical perspective. It’s not necessarily hard to work it out, but it is tedious, and going through the process will give you some intuition into why the academic world eschews viewing the world that way. Mutations may still be contained by structured programming’s scopes and the rules of the inheritance hierarchy, but they can still bloat out of control fairly easily through the transitive closure of all modifiable values accessible through some superficially-simple method call.
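
Go has embedding rather than inheritance, but a small hypothetical sketch (all names invented) shows the shape of the exercise:

// Widget "inherits" from Base via embedding; Render mutates state on
// both levels of the hierarchy at once.
type Base struct {
    renderCount int
}

type Widget struct {
    Base
    label string
}

func (w *Widget) Render() {
    w.renderCount++ // mutates Base-level state
    w.label += "*"  // mutates Widget-level state
}

Viewed mathematically, Render is something like Render(prev Widget) (next Widget): every field the method can reach is both an input and an output, even though the imperative signature admits to none of them.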

The lesson of functional programming is that it provides an existence proof that these are not always necessary. It gives a framework for thinking about how extensive use of these tools can actually get you into trouble. It gives a demonstration of how composing simpler things together, even when you are limited to tools simpler than anything your imperative code provides, can still provide ways to solve problems that you may not have considered if you have not practiced with the simpler tools.

Bear in mind I’m not an absolutist on this question, because the entire thrust of all these essays is to pragmatically adapt our lessons, not try to blindly jam functional programming into the imperative world. You will have to use some of the features the languages you use offer you. However, a purely-imperative understanding of these features often accurately captures their benefits but fails to teach you about their costs. This viewpoint can show the costs of these features, the ways they complicate understanding what a given bit of code does. It can also demonstrate that there are in fact ways to get the benefits without necessarily paying the cost, by putting together simpler pieces in certain idiomatic ways that functional programming can teach you.

Adapting To Mutation

Functional programming, being an absolutist approach, simply eliminates mutation entirely. While nifty, it can be impractical in conventional languages, and comes with some costs.

However, as hinted at earlier, the ability to mutate things is not a ravening monster that instantly destroys your program the moment you introduce a single mutable variable, because, again, if that were the case none of our code would work at all. At first, the costs are gentle, and the benefits can be substantial. It becomes problematic only when too much mutable state gets into one place, mutated in too uncontrolled a fashion, and collapses into a black hole.

So the adaptation to this is to explicitly think about mutation as something that itself flows through your program, and needs to be considered on the architectural level. You make sure no part of the program collapses into a black hole by building fences around the mutation.

This is certainly a view onto a concept very similar to “encapsulation”, but it is not quite the same… it is still possible to build a well-encapsulated program where the encapsulation is not properly leveraged to control mutation. But it is the encapsulation tool set that you will generally turn to to build these fences.

There is also a natural parallel to the Law of Demeter, which I would say is a bit more general in that it worries about any communication, whereas I’m talking here about “things that result in mutation”, which is arguably the worst way to violate the Law of Demeter, but certainly not the only way. It is also bad to simply have a long chain of hard-coded references appear at all, even if it is a read-only operation.
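
As a purely hypothetical illustration (all types invented here), this is the kind of chained reach-through that combines both problems, navigation and mutation:

type Preferences struct{ Currency string }
type Account struct{ Prefs *Preferences }
type Customer struct{ Acct *Account }
type Order struct{ Cust *Customer }

func Reprice(o *Order) {
    // The caller navigates three objects deep and mutates state it
    // arguably should not even know exists.
    o.Cust.Acct.Prefs.Currency = "EUR"
}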

One example of a mutation fence is an “actor”. While actors are generally thought of as providing concurrency services to their callers, they also frequently double as mutation barriers, because the callers don’t get generalized mutation ability over the values being managed by the actors. They can only interact via explicit messages. Hence, an actor forms a barrier against this kind of transitive-closure complexity explosion.
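
Here is a minimal sketch of that idea in Go, using a goroutine and a channel as the actor; the names are mine, purely for illustration:

type incrementMsg struct {
    reply chan int
}

// StartCounter owns the mutable count; callers can only send messages,
// so the mutation cannot leak out of this goroutine.
func StartCounter() chan<- incrementMsg {
    requests := make(chan incrementMsg)
    go func() {
        count := 0 // mutable, but visible only to this goroutine
        for msg := range requests {
            count++
            msg.reply <- count
        }
    }()
    return requests
}

No caller can reach count directly; every change flows through an explicit message, which is exactly the fence.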

In a recent project I was rewriting some very “imperative”, in all the bad ways5, code to handle email processing. I replaced it with a pluggable module system to do the various tasks we do to the email. Using a pluggable module system is a fairly obvious architectural move, but where I applied the “functional programming” lesson is that the communication between these modules is itself tightly controlled and explicitly logged, as its own moral hazard to the code.

And rather than each job getting access to manipulate the email directly, there’s a system where the jobs get handles that only allow them to mutate a view of the email. The original email is immutable, so all jobs get equal access to it without the other jobs having interfered, and all changes made through these handles can be directly logged and inspected. As a result, while there have been a couple of issues with job order, or with two jobs trying to “own” the same header, they were swiftly identified and dealt with, because the jobs were operating through an interface deliberately designed from day one with the idea of constraining mutation.
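
The shape of it, in a deliberately simplified and hypothetical Go sketch (the real system is considerably more involved):

type Email struct {
    Headers map[string]string
}

// EmailHandle gives a job a mutable view of an email; the original is
// never touched, and every change is recorded for later inspection.
type EmailHandle struct {
    original *Email
    changes  map[string]string
    log      []string
}

func NewHandle(e *Email) *EmailHandle {
    return &EmailHandle{original: e, changes: map[string]string{}}
}

func (h *EmailHandle) SetHeader(key, value string) {
    h.changes[key] = value
    h.log = append(h.log, "set header "+key)
}

func (h *EmailHandle) Header(key string) string {
    if v, ok := h.changes[key]; ok {
        return v
    }
    return h.original.Headers[key] // fall through to the immutable original
}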

This is also a good case of what I mean by constraining mutation. The jobs have work to do, headers to change, variables to set for future jobs to save work, etc. It’s not about somehow “preventing” mutation; in general, one way or another the entire point of our programs is to change something in the world. The “constraint” in this case is to push it all through a system that allows for logging and visibility, not to deny it entirely, and to make sure it doesn’t run away unconstrained and untrackable… as it was in the original program.

To my mind, thinking about the flow of value mutation through your program is one of the basics of architecture, from the medium level all the way up to the “how do I design the whole of AWS?” level. Everyone draws diagrams showing “who is talking to whom”, and that’s certainly an important diagram, but “what is being mutated, by whom, when, and why” is equally important and provided much less often.

Which is heavily related to the classic:

Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious. - Fred Brooks

I would draw a distinction between the micro-scale flow of value mutation and the macro-scale. Programming languages differ substantially in their micro-scales, ranging from “immutability”, meaning that all changes flow through function calls, through the burgeoning field of mutation-controlling languages, which Rust leads but is not alone in, to arbitrary mutability in conventional languages.

However, I think that by the time you get to the macro-scale the impact of these differences is substantially muted… perhaps not irrelevant, but muted. It doesn’t matter if your 138 microservices are all written in purely immutable languages if the architecture of the microservices as a whole is sloppy and dependent on wild, unconstrained mutation as messages fly everywhere without a plan; a well-designed monolith in a conventional language may be far easier to understand and change. A well-designed Haskell program and a well-designed C# program may have very similar high-level mutation flows, even though on the micro-level they are very different.

I promised in the intro post to even anger some functional programming advocates by claiming that FP gets things wrong. I think total immutability is generally wrong here. What is much more important is constraining mutability. Rust generally has a better approach. I would love to say Erlang generally has a better approach, except that I think it whiffed here; it needed immutability between actors, not immutability within actors. But a hypothetical Erlang that was mutable within actors but never, ever let any mutability leak across actors would be another better approach6. Mutability isn’t scary in small doses, and therefore there are more solutions around than entirely eliminating it. There is a certain simplicity to it, yes, and as I also mentioned in the intro, if you want to say it’s a better approach than the standard imperative approach I won’t argue, but I do not believe it is the globally optimal approach.


  1. As a reminder or for those new to the series, I am using this term in its modern sense. Lisp predates the general recognition of globals being bad, but Lisp is not a functional programming language by the modern definition, even though it was called “functional programming” back then. ↩︎

  2. The global-ist of globals, such as C had, where there is only one global namespace, are fairly rare nowadays, but the key aspect of the variables in this example is that there is only one instance of them, shared by all users. Namespacing may allow for multiple instances of the same base name, but if they’re trivially available to all code just for the asking, they’re global enough for today’s discussion. This may be called a “static” variable, or “defined on the module”, or a “package variable”, or any number of other things. The key is that any other bit of code has some way of reaching it and potentially mutating it. ↩︎

  3. This all gets horrifyingly worse if you introduce concurrency. Imagining what sort of inputs and outputs are necessary to fully model that is left as an exercise for the reader. And, indeed, we collectively know from experience that global variables become even more dangerous in highly concurrent contexts. ↩︎

  4. The much-ballyhooed IO type is less about “monads” and more about being the implementation of this approach for Haskell, meaning that technically, all functions in Haskell really are completely pure. It is the combination of these functions with the aforementioned engines that makes a Haskell program actually do something. The purpose of the IO type is to abstract away the RealWorld (as Haskell calls it) and the various actions you can take, and also to wrap away concurrency (see previous footnote), giving an interface to interact with other computations without having to directly witness their input/output as explicitly declared inputs and outputs of every function. ↩︎

  5. One thing I hope you understand by the time this series is done is that this is not an apologia for “business as usual” in the imperative world! Every criticism a functional programmer has about imperative code in the general sense, I’ve lived. Something does need to be done about those problems, I just do not necessarily agree that “throw everything away and switch languages” is the only valid choice. A valid choice sometimes, yes, but not the only. ↩︎

  6. As I understand it, Elixir sits on top of BEAM and provides a language that allows for mutability, which in this case amounts to “allows you to rebind the same name to multiple values”. Which just goes to show once again that purity is relative. ↩︎