Many applications rely on default configurations, static resources and other types of data for operation. Usually this is achieved by reading resource content during initialization or on-demand, but both of these approaches suffer from problems with:
availability. If a resource is unavailable because of an incorrect path or an inaccessible storage device, it cannot be read, which can make the application or service unavailable or crash it;
unpredictable latency. Many layers are involved in reading a resource from the file system, especially a network file system, and congestion on any of them can increase latency or cause failures. File systems and the kernel usually try to hide the latency with caches, but application developers have little to no control over them, apart from using madvise, so caching is hard to rely upon;
API. Since file systems can fail, their APIs usually require clients to deal with potential errors, adding to the accidental complexity and boilerplate required to access the resource.
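To make the availability and API points concrete, here is a minimal sketch of the boilerplate a file-system read tends to require. The `loadConfig` helper, the `defaultConfig` value and the file name are hypothetical and exist only for illustration; the fallback-on-missing-file policy is an assumption, not something every application would want.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// defaultConfig is a hypothetical built-in fallback used when the file is missing.
const defaultConfig = "log_level=info\n"

// loadConfig reads the configuration at path. A missing file falls back to
// defaultConfig; any other failure (permissions, I/O error) is returned to
// the caller, who has to handle it.
func loadConfig(path string) (string, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return defaultConfig, nil
	}
	if err != nil {
		return "", fmt.Errorf("reading config %q: %w", path, err)
	}
	return string(data), nil
}

func main() {
	cfg, err := loadConfig("does-not-exist.conf")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(cfg)
}
```

Every caller of such a helper still has to decide what to do with the returned error, which is exactly the accidental complexity described above.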
One of the reasons for Go’s popularity was its single binary philosophy, which drastically simplifies deployment and avoids DLL hell. Unfortunately, until version 1.16 there was one glaring gap in this paradise: resources. Static resources either had to be bundled and dealt with separately or embedded using third-party tools.
Fortunately, Go 1.16 added a new go:embed directive to close this gap. There is a lot to say about the benefits of this feature, but in this article I’d like to focus specifically on its performance implications and how accessing resources using go:embed fares against traditional file system APIs. To do this we’ll use the following benchmark
package main

import (
	"embed"
	"io/ioutil"
	"testing"
)

// The same file is embedded three ways: as a virtual file system,
// as a string, and as a byte slice.

//go:embed citadel_of_the_star_lords.txt
var f embed.FS

//go:embed citadel_of_the_star_lords.txt
var content string

//go:embed citadel_of_the_star_lords.txt
var bytes []byte

// mostCommonWordEmbed reads the file through the embed.FS API.
func mostCommonWordEmbed() int {
	data, _ := f.ReadFile("citadel_of_the_star_lords.txt")
	return findMostFrequentWord(string(data))
}

// mostCommonWordFs reads the file from disk on every call.
// ioutil.ReadFile is deprecated since Go 1.16 in favor of os.ReadFile.
func mostCommonWordFs() int {
	data, _ := ioutil.ReadFile("citadel_of_the_star_lords.txt")
	return findMostFrequentWord(string(data))
}

// mostCommonWordString uses the embedded string directly.
func mostCommonWordString() int {
	return findMostFrequentWord(content)
}

// mostCommonWordBytes converts the embedded byte slice to a string.
func mostCommonWordBytes() int {
	return findMostFrequentWord(string(bytes))
}

// findMostFrequentWord is a stand-in for real processing: it only returns
// the length of the text, so the benchmarks measure access cost alone.
func findMostFrequentWord(text string) int {
	return len(text)
}

func BenchmarkCommonWordEmbedded(b *testing.B) {
	total := 0
	for i := 0; i < b.N; i++ {
		total += mostCommonWordEmbed()
	}
}

func BenchmarkCommonWordFs(b *testing.B) {
	total := 0
	for i := 0; i < b.N; i++ {
		total += mostCommonWordFs()
	}
}

func BenchmarkCommonWordString(b *testing.B) {
	total := 0
	for i := 0; i < b.N; i++ {
		total += mostCommonWordString()
	}
}

func BenchmarkCommonWordBytes(b *testing.B) {
	total := 0
	for i := 0; i < b.N; i++ {
		total += mostCommonWordBytes()
	}
}
that reads the content of Edmond Hamilton’s “Citadel of the Star Lords”.
And the results are
As expected, embedding the resource as a string makes its runtime access effectively free, while converting bytes to a string adds significant overhead, though that overhead is still more than 2X lower than reading the content through embed.FS. Accessing the file’s content through the regular file system API, in turn, is more than 2X slower than through embed.FS.
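A plausible explanation for the overhead of the byte-slice variant is that converting a []byte to a string generally copies the data, because Go strings are immutable while byte slices are not. The sketch below only demonstrates that copy semantic; it does not reproduce the benchmark numbers, and the `copiesOnConversion` helper is a hypothetical name introduced for illustration.

```go
package main

import "fmt"

// copiesOnConversion reports whether a string created from a byte slice
// stays unchanged after the slice is modified, i.e. whether the
// conversion produced an independent copy.
func copiesOnConversion() bool {
	b := []byte("citadel")
	s := string(b) // the string gets its own copy of the bytes
	b[0] = 'C'     // mutating the slice must not affect s
	return s == "citadel"
}

func main() {
	fmt.Println(copiesOnConversion())
}
```

For a resource the size of a whole book, a copy like this on every access is likely where most of the measured cost comes from, whereas the embedded string can be used as-is.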
Conclusion? The convenient API for embedding static resources and the simpler deployment are already great reasons to use go:embed, but a significant performance boost is an extremely welcome bonus.