title: Unspoken Rules Of Go
draft: true
categories:
- Golang
Or perhaps you might call them “idioms” in Go, as in “idiomatic code”: conventions I’ve not seen clearly documented anywhere.
I’m sure this is not a complete list, but by its nature such a list is hard to complete.
if err != nil { return err } Has A Specific Meaning Now
With the introduction of error wrapping via fmt.Errorf’s %w verb, the “default” error handling in Go should now look like:

```go
if err != nil {
	return fmt.Errorf("what I was trying to do failed: %w", err)
}
```

where the error message is descriptive. Try to avoid using the same string multiple times in the same function without very good reason.
Ye Olde `if err != nil { return err }` in my code now has a very specific meaning: the function called to get this error is conceptually still a part of the calling function, and its scope is something we are sure we want to hide from callers. For instance, there is some code that is just irreducibly a long list of “things to do”… update this DB row, update that DB row, construct a notification email, trigger this metric, and so on and so forth. As you refactor the things this big function does into various unexported methods, it may not always be necessary or desirable for higher-level code to end up exposed to what may be very inside-baseball details of the structure of the code that does all of this.
Returning `couldn't notify admin because: SMTP connection failure` is generally better than `couldn't notify admin because: while constructing admin email: in shared SMTP connection code: while connecting to shared SMTP meta-client: SMTP connection failure`. The calling code doesn’t care exactly how you’ve refactored your functions internally, so the internal functions I’ve hypothesized here, like one to look up the correct SMTP server and send to it, are not necessarily something you need to report back. Your errors should reflect what the caller needs to know, not the internal structure of your package.
As a rule of thumb I’d say one package should emit one “level” of error message into any error it is propagating (assuming the packages are themselves well-designed). It doesn’t necessarily matter much where in the package that message is generated, and if the package has internal organization, that may result in some bare `return err`s occurring.
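Here is a minimal sketch of what that looks like in practice, with hypothetical names throughout: the exported method adds the package’s one level of context, while the internal helpers it delegates to return errors bare, because how the work is split between them is invisible to callers.

```go
package notify

import (
	"errors"
	"fmt"
)

// Notifier is a hypothetical type, for illustration only.
type Notifier struct {
	server string
}

// NotifyAdmin is the package boundary: it adds exactly one
// "level" of context to any error it propagates.
func (n *Notifier) NotifyAdmin(msg string) error {
	if err := n.sendAdminEmail(msg); err != nil {
		return fmt.Errorf("couldn't notify admin: %w", err)
	}
	return nil
}

// sendAdminEmail is an internal refactoring detail, so it returns
// errors bare rather than narrating the package's structure.
func (n *Notifier) sendAdminEmail(msg string) error {
	if err := n.connectSMTP(); err != nil {
		return err // bare return: this split is invisible to callers
	}
	// ... construct and send the email ...
	return nil
}

func (n *Notifier) connectSMTP() error {
	if n.server == "" {
		return errors.New("no SMTP server configured")
	}
	return nil
}
```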
Don’t Change Top-Level Variables In External Packages
Go has sharp restrictions on what can be truly constant. Things like
instances of structs that we may want to be globally readable, such as
distinguished errors like io.EOF, or the default transport in
net/http, thus end up as top-level variables.
A package should never change another package’s top-level variables. Packages may assume that no one will reach in and change their top-level variables, and if someone does, the breakage is the responsibility of the package making the change, not the original package for failing to “protect” them. This does not even need to be documented; rather, it should be documented when it is valid to change a package-level variable.
Changing globally-accessible writable values you don’t “own” is already a bad idea in non-threaded languages, for all sorts of reasons. But in a language where the casual expectation is that programs are highly concurrent, variables in external packages can be considered de facto protected by the fact that any change can be presumed to introduce a race condition. Therefore, any variable not documented as being changeable should be presumed not to be changeable.
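As a concrete sketch of the alternative, using the net/http default transport mentioned above: rather than reaching into the package and mutating its variable, construct a value you own and configure that.

```go
package httpsetup

import (
	"net/http"
	"time"
)

// Don't do this: it silently changes behavior for every user of
// net/http in the process, and races with any request in flight.
//
//	http.DefaultTransport.(*http.Transport).MaxIdleConns = 5
//
// Instead, build your own client and leave the shared variable alone.
var Client = &http.Client{
	Timeout: 10 * time.Second,
	Transport: &http.Transport{
		MaxIdleConns: 5,
	},
}
```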
Data Structures With a New Constructor Should Not Be Directly Created
If a struct comes equipped with a New constructor, or
something that starts with New, you should not instantiate the
struct yourself. You can assume that there is some additional
initialization of some sort necessary to populate unexported parts of
the struct that you can’t see. If you subsequently call a method on
this value, and the method panics or misbehaves, you should consider
that a fault of the code that constructed the object in violation of
its contract, not the fault of the code that panics or misbehaves.
A common exception to this in my code is when I construct something in order to immediately unmarshal into it, with a custom Unmarshal* method on the value. The Unmarshal can be seen as an alternate constructor. There may be other ways that a struct officially supports being created; for the user’s clarity, if any of these ways is more “official” than the others, you should have a New* function unless you can’t.
On the provider side, once you have a New function of some sort, it
is necessary for you to ensure there is no other way of constructing
the value that will bypass your validity
constraints,
excluding a direct zero-value instantiation. When that goes wrong, the
fault is with the code that did the direct instantiation, not the
type’s implementation.
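A minimal sketch of why this matters, with hypothetical names: the zero value of the struct below is broken in a way callers outside the package cannot see, let alone fix, and only New repairs it.

```go
package cache

// Cache must be created with New; its zero value contains a nil map
// that callers outside this package cannot initialize themselves.
type Cache struct {
	entries map[string]string
}

// New is the supported constructor.
func New() *Cache {
	return &Cache{entries: make(map[string]string)}
}

// Set panics on a directly-created &Cache{}: writing to a nil map
// panics, and that fault lies with the code that skipped New.
func (c *Cache) Set(key, value string) {
	c.entries[key] = value
}
```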
Avoid Multiple Slices Pointing At The Same Underlying Array
I’m not going to describe slices here, because the Go project has already done that. This is about how to use them.
Slices are very convenient, but have some annoying corner cases. These have actually been well known since before Go started using them, so these helpful rules of thumb also apply in other languages or libraries that have similar constructs.
I’m not going to talk about every last corner case, because that has also been covered elsewhere. I’m just going to discuss the fact that the vast majority of the real problems people have arise from having multiple slices pointed at the same backing array. This is generally going to be OK, by which I mean, it’s hard to get in much trouble with code like this:
```go
mySlice := []string{}
for _, val := range someOtherSlice {
	mySlice = append(mySlice, process(val))
}
```

This code raises the possibility for problems:
```go
func DoSomething(in []string, params []string) []string {
	processing := params
	for _, val := range in {
		processing = append(processing, process(val))
	}
	// ...
}
```

Later references in the function will have to choose between `processing` and `params`, and this is where you can get into trouble, with things like appending to `params` and being surprised that the values in `processing` got changed, because that first line copied the length, capacity, and underlying backing-array pointer, but did not automatically clone the backing array.
This includes across function and method calls. If they need to return something, they should return a new slice value. Recall that a slice value is three words, and copying one, as mentioned above, definitely does not copy the backing array. Returning new slice values is cheap and maximizes flexibility and safety.
Don’t be afraid to use slices.Clone to avoid false sharing. Especially if you know your slice is going to be relatively small, which is quite a lot of slices, a little bit of copying can avoid some bugs, particularly as the code changes in the future.
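For instance, a sketch of the DoSomething function above with the aliasing removed via slices.Clone (the process helper remains hypothetical):

```go
package example

import "slices"

// process is a hypothetical helper, standing in for real work.
func process(s string) string { return s + "-processed" }

// DoSomething clones params up front, so later appends write into a
// fresh backing array rather than the caller's.
func DoSomething(in []string, params []string) []string {
	processing := slices.Clone(params)
	for _, val := range in {
		processing = append(processing, process(val))
	}
	return processing
}
```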
Locking Constructs In Structs Should Be Grouped With What They Lock
This is not unique to me (fortunately!), I’ve seen it in a few other
places, but not the official documentation that I can recall. When
using something like a sync.Mutex in a struct, group it with the
things it locks:
```go
type SomeStruct struct {
	ExportedValue int64

	m       sync.Mutex
	count   int
	subvals map[string]struct{}

	server string
}
```

This indicates to a later reader that the mutex protects `count` and `subvals`, but not `server` (probably set once at construction and never changed).
Also, while I’m here, do not embed sync.Mutex. Earlier in Go’s lifespan, examples were frequently written without the `m` field name, simply embedding the mutex. However, embedding promotes the mutex’s methods into the struct’s method set, allowing users of the struct in other packages to lock or unlock it directly. While I have never actually seen this happen, it’s just generally good practice to keep the mutex unexported.
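A short sketch of the difference, with hypothetical type names:

```go
package main

import "sync"

// Embedded: sync.Mutex's methods are promoted into the struct's
// method set, so any package that can see the struct can take the lock.
type Embedded struct {
	sync.Mutex
	count int
}

// Grouped: the mutex is an unexported field, so only this package
// can lock or unlock it.
type Grouped struct {
	m     sync.Mutex
	count int
}

func main() {
	var e Embedded
	e.Lock() // compiles anywhere Embedded is visible
	e.count++
	e.Unlock()
}
```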
Never Take More Than One Mutex
In the 1990s, threaded code worked on the principle of trying to figure out what data needed to be protected and ensuring that anything accessing it took the necessary locks. Unfortunately, this has terrible scaling characteristics, because taking multiple locks on things is extremely tricky. I generally think of it as exponentially tricky, though that’s an admittedly fuzzy definition I’m running on.
This was so bad that a lot of people came to the conclusion that multithreaded code is simply impossible to write. I disagree with them strongly, yet even so I am deeply sympathetic to how they arrived at that conclusion. What they were trying to do is effectively impossible. It just isn’t the only way to do multithreaded code… in fact I’d almost characterize it as the worst, beaten only by “don’t worry about locks or anything else at all and just let threads run around doing whatever”.
Go does use locks and provide locks, so you still have the tools to get yourself into this trouble. However, there’s a simple rule of thumb you can use to make sure that you never get stuck in that world: never take more than one lock at a time. This includes transitively; it is important not to call any code that might take other locks while you are holding one.
For instance, it is common to have something like a multi-thread
capable map of some sort simply by providing read and store methods
that do the obvious, naive thing with a sync.Mutex. This is fine and
can be a very potent way to get little, yet critical, bits of
concurrency into a system. It is just important to not take any more
locks once you’ve taken one.
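A minimal sketch of that pattern, with names that are mine rather than from any particular library; the key property is that no method ever calls anything else while holding the lock:

```go
package safemap

import "sync"

// SafeMap does the obvious, naive thing: each method takes the one
// lock, does its work, and releases it before returning.
type SafeMap struct {
	m    sync.Mutex
	data map[string]int
}

func New() *SafeMap {
	return &SafeMap{data: make(map[string]int)}
}

func (s *SafeMap) Load(key string) (int, bool) {
	s.m.Lock()
	defer s.m.Unlock()
	v, ok := s.data[key]
	return v, ok
}

func (s *SafeMap) Store(key string, val int) {
	s.m.Lock()
	defer s.m.Unlock()
	s.data[key] = val
}
```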
“Not taking locks” also includes things like sending on a channel, which internally involves taking locks.
If you need to write code that might take multiple locks, you need to switch to some other mechanism for communication, like an actor, or if not a full “actor”, a goroutine that is responsible for collecting the multiple bits of data you need into one place and doing the operation all inside of itself. In a nutshell, find a way to have a goroutine own all the things in question in one place, not shared with anything else, where it can do all its work without constantly coordinating, and then share the results out as necessary.
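A sketch of that shape, with hypothetical types: one goroutine owns the data outright, and everything else communicates with it over channels rather than taking locks.

```go
package actor

// update is a hypothetical request sent to the owning goroutine.
type update struct {
	key   string
	delta int
}

// counterOwner owns the counts map outright: no locks are needed
// because no other goroutine can ever touch it.
func counterOwner(updates <-chan update, result chan<- map[string]int) {
	counts := map[string]int{}
	for u := range updates {
		counts[u.key] += u.delta
	}
	result <- counts // share the final state out when done
}
```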
You can also depend on external sources of concurrency that have solved the problem you are facing, most notably databases.
A consequence of this is that people providing code need to document whether their code takes locks. It is generally enough to say “this struct’s methods are concurrency-safe” to imply that it is taking some sort of lock internally and people should treat it as such, rather than having to annotate each method with that fact. Of course, if a struct has a complicated pattern of what is and is not concurrency-safe, you may need per-method or per-field documentation.
Which leads me smoothly right into…
Read The Documentation For The Functions You Use
Go is a simple language with a simple type system. It is legal and common for functions to document things that are not expressible in the type system, such as the concurrency-safety of various values.
For instance, io.Reader comes with a lot of conditions the type system cannot express, such as that code implementing the interface must not retain the passed-in buffer. Special handling for error values is spelled out. The fact that .Read(...) is one of the rare functions that may return both an error and usable data is documented.
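A sketch of a read loop that honors that documented contract: the returned bytes are consumed before the error is examined, and io.EOF is treated as a sentinel rather than a failure.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

func main() {
	r := strings.NewReader("documented behavior")
	buf := make([]byte, 8)
	for {
		n, err := r.Read(buf)
		if n > 0 {
			fmt.Printf("got %q\n", buf[:n]) // use the data first
		}
		if err == io.EOF {
			break // EOF is a sentinel, not a failure
		}
		if err != nil {
			fmt.Println("read failed:", err)
			return
		}
	}
}
```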
Another thing I sometimes document is something to the effect of “once you pass this slice to this method, you must not modify that slice any more”. I would be very reluctant to put such a thing in a public library on GitHub, but within some applications I’ve written, some particular chunk of data may be too expensive to prophylactically copy, but the caller needs to be aware that they no longer “own” that data. Generally callers can assume they are not permanently turning over parameters to a called function, but documentation can be used to override this.
There are two sides to this: as a consumer, you should expect to need to read the documentation for the code you use, and as a provider, you need to be sure to document any weird thing you do with function parameters. Godoc is pretty good, if not necessarily perfect… use it! Code that violates documented preconditions or postconditions is at fault, not the code that was depending on them.
Every Goroutine Needs A Panic Policy
Every time you start a goroutine, you need to know what it will do when it panics. Like error handling, there is no one-size-fits-all answer, and you need to think carefully about what you want, not just bash out some boilerplate without thinking.
The default behavior for an unhandled panic that escapes a goroutine is to crash the entire process. This is a sensible default for the Go runtime because in the absence of some assurance that it is safe to do something else, it is the only safe action it can take. It is, however, probably only rarely what you want.
Most of the time, the action I want is for one way or another the
program to continue on, so most of my go statements look something
like this:
```go
go func() {
	defer func() {
		r := recover()
		if r != nil {
			// log something, flip a metric, etc.
		}
	}()
	// do the work here
}()
```

You can cancel a context to terminate a larger piece of work, you can
log a nicer message and then terminate the whole program, there’s a
whole bunch of things you may want to do, but every time you type
go you need a conscious answer to the question of
“how do I handle panics?”
Be careful of deciding “my policy is that this goroutine can’t panic so I don’t need to worry about it”. If you are correct that it is impossible for it to panic, it is indeed a correct policy. However, it is vulnerable to future changes to the code that may add a way to panic, and of course, there’s always the chance that you’re wrong. At the very least this calls for a comment about how sure you are it just can’t panic on the creation statement.
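If the answer turns out to be the same across much of a codebase, one way to keep the policy conscious and explicit is a small helper; this is a hypothetical sketch of one such policy (log and continue), not a standard library facility:

```go
package safego

import "log"

// goSafely starts work in a goroutine with one explicit panic
// policy baked in: log the panic and let the program continue.
func goSafely(work func()) {
	go func() {
		defer func() {
			if r := recover(); r != nil {
				log.Printf("goroutine panicked: %v", r)
			}
		}()
		work()
	}()
}
```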
Go Open Source Projects Do Not Need Weekly Commits
Some language communities are in constant major flux. People become accustomed to projects having to make releases every month or two just to keep up with all the flux in all their dependencies, or changes in best practices, or integrations with the newest hot library, or the constant routine beat of security vulnerabilities at every level in the stack.
Go is not immune to any of those, but it is generally affected by all of them quite a bit less. Consequently, it is completely normal and not necessarily concerning to encounter a Go library that may not have had a commit in a few months or even years, that is still perfectly well up-to-date, safe to use, and may even be the best-of-breed solution for Go.
I would say that generally, “last commit date” is not a good metric for analyzing Go libraries. Consider instead:
- The other GitHub metrics: Stars and forks can be bought, but I don’t see this happening for your average open source libraries.
- pkg.go.dev for a package has an “imported by” count. That is generally a strong vote in favor of a given package. Don’t forget to give out metaphorical bonuses to packages whose users won’t necessarily be other open-source packages, e.g., an interface to a time-tracking system isn’t necessarily going to be imported publicly even if it is in wide use privately.
- Total number of commits: Three commits, one of which is the basic GitHub repo creation, one a big pile of code, and one that is a fixup to the README.md, is probably not a well-tested library.
- Time span of the commits: Similarly, if all the commits land within a single week, it probably isn’t well tested. 30 or 40 commits over the course of a few years indicates that it is probably being used for real somewhere.
- Contributor count: In many cases even a single additional commit from a non-primary contributor is a good sign that the library is in real use and getting tested. Remember that most people do not engage with libraries at all, so even a single additional PR or commit indicates it is probably used by a minimum of dozens of people.
- Issues: It may seem counterintuitive that more issues is better, but as you should well know, there are always issues in code. If they’re getting filed, again, that means someone is actually using it. Closed issues are also a good sign that the bugs are being worked out. Of course, if the project has hundreds of open issues and no closed issues, that does indicate some problems, but this is an unusual pattern.
- Signals in the godoc: Namely, is there any? Is it any good?
- Signals in the README: I would consider a few badges and a link to the godoc a bare minimum and not a strong positive signal, but it’s still something. Positive signals include honest assessments in the README.md about the production-readiness of the package, indications of who is using it, any sort of useful indications of when not to use it, and so forth. Admissions against interest are a strongly positive signal in my book.
- For security, something like govulncheck should already be integrated into your workflow somehow, so you’ll be notified of security issues in any package, not just potentially dubious ones. Of course you need to make sure you aren’t using something so obscure that no report will ever be filed, but for that, consult all previous bullet points and you’ll probably be fine.
Well, I didn’t exactly expect that to turn into an impromptu lesson on how to evaluate open source libraries in general, but, here we are I guess.
In light of all that, it should be clear that even in languages with churn, “last commit date” isn’t a very good heuristic either. If all of these other signals flash green but the last commit was a year ago, you’re probably still looking at a solid choice. It doesn’t carry a lot of information on its own.
The Caller Controls Concurrency
The go statement is always available to the user, and it has fairly
clean and normalized semantics. Therefore, it is not advantageous for
libraries to try to “wrap” using go. For instance, an image library
that decodes a JPEG should not offer an “async” decode of the JPEG,
where the caller passes it some data and it will come back with a
“promise” of the data being available later. An image library should
just provide a routine that decodes the JPEG, in the same goroutine
that called it.
This is because in Go, any caller can easily spawn a goroutine to run your code in any desired concurrency framework. Maybe they want an errgroup, or maybe they’ve got some stuff with contexts, or maybe they’re using a structured concurrency library. If you try to “helpfully” spawn a goroutine, you will fail to integrate with those things and work at cross purposes with that code.
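For instance, a caller can wrap a purely synchronous decode in golang.org/x/sync/errgroup entirely on their own; decodeImage here is a hypothetical stand-in for any blocking library call:

```go
package imagework

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// decodeImage is a hypothetical, purely synchronous library function.
func decodeImage(ctx context.Context, path string) error {
	// ... plain, blocking work ...
	return nil
}

// decodeAll shows the caller choosing its own concurrency framework.
func decodeAll(ctx context.Context, paths []string) error {
	g, ctx := errgroup.WithContext(ctx)
	for _, p := range paths {
		p := p // unnecessary as of Go 1.22, but harmless
		g.Go(func() error { return decodeImage(ctx, p) })
	}
	return g.Wait()
}
```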
If there is any reason for a library to accept a context.Context, it should, as that further gives the caller control over cancellation; the caller otherwise has no clear or clean way to impose it from the outside.
For people coming from an async/await-type background, it is
closer to the truth to imagine that everything in Go is already
asynchronous, rather than imagining everything in Go is
synchronous. This isn’t the truth either; the truth is that Go is a
threading-based system and that doesn’t precisely map to either side
of “sync” or “async”. But any Go code (give or take things that are
doing heavy syscalls or other rare things that require specific
interaction with OS threads) can be made “async” by the caller, so
it is not necessary to provide “support” for “async” in the called
library.
In general, libraries should fall into one of two categories:
- The library has no `go` statements at all.
- The entire purpose of the library is to provide some sort of concurrent functionality.
There’s a non-zero, but very small, set of libraries in the middle, and you should think very carefully about whether you really have one of those cases before you implement one.
(An example is a UI binding library that is binding to a UI library that cares deeply about the “main thread”. The binding may automatically spawn a goroutine and pin it to an OS thread and use it to implement the interaction with the UI. But you can get a sense for how exceptional this is by the fact there’s probably a single-digit set of libraries in Go that should do that. Things like HTTP APIs or things that read and write files are not exceptions.)