Slices & Strings in Odin


This is a look at how strings compare to slices in the Odin programming language. I am writing it because this was one of the stumbling blocks in my first steps learning Odin. It seems simple to me now, but maybe it will help a newcomer.

Slices

A slice is a mutable data type that holds a pointer and a length. It can be thought of as a window, or view, into a span of data that you can change. It may or may not own the data it points to, and unlike a dynamic array, it is not resizable. As with an array, the length counts elements rather than bytes, so for a slice of bytes the length happens to equal the number of bytes in the slice.

buf := make([]u8, 4) // Allocate a slice of 4 bytes.
l := len(buf) // 4
sz := len(buf) * size_of(u8) // 4
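
To make the "window into data" idea concrete, here is a minimal sketch of a slice that views part of an array it does not own. The helper name double_all is hypothetical and exists only for illustration.

```odin
package main

import "core:fmt"

// Hypothetical helper: doubles every element visible through the slice.
// It receives only a pointer and a length, never a copy of the data.
double_all :: proc(view: []int) {
    for i in 0..<len(view) {
        view[i] *= 2
    }
}

main :: proc() {
    // The array owns the storage.
    arr := [5]int{1, 2, 3, 4, 5}

    // middle views elements 1, 2 and 3 of arr; nothing is copied.
    middle := arr[1:4]
    double_all(middle)

    fmt.println(len(middle)) // 3
    fmt.println(arr[0])      // 1 (outside the window, untouched)
    fmt.println(arr[1])      // 4 (changed through the slice)
}
```

The slice cannot grow past the three elements it was given, but any write made through it lands in the original array.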

If we have a larger element type, such as rune, then the length will reflect the number of runes and not the number of bytes. A rune in Odin is represented by a signed 32-bit integer, so each one occupies 4 bytes.

buf := make([]rune, 4) // Allocate a slice of 4 runes.
l := len(buf) // 4
sz := len(buf) * size_of(rune) // 16

buf in the above code block has a length of 4, but the data it points to takes up 16 bytes of memory.

Strings

Strings in Odin are fundamentally slices of bytes, with one important difference: they are immutable on the surface, meaning you cannot assign a new value to an index of a string. Normally, changing anything about a string requires creating a new string, unless you have access to the underlying buffer.
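
As a rough sketch of the "underlying buffer" case, the example below builds a string on top of a byte array that we own. It assumes that converting a []u8 to a string with string(...) reuses the same memory rather than copying it.

```odin
package main

import "core:fmt"

main :: proc() {
    // Bytes that we own and are free to modify.
    buf := [5]u8{'h', 'e', 'l', 'l', 'o'}

    // Assumption: this conversion shares buf's memory instead of copying it.
    str := string(buf[:])

    // str[0] = 'j' // rejected by the compiler: a string index cannot be assigned to

    // Changing the buffer changes what the string shows.
    buf[0] = 'j'
    fmt.println(str) // jello
}
```

This only works because the bytes live in a buffer we control. A string literal offers no such buffer, so it should be treated as read-only.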

len can be confusing here, because an Odin string is best thought of as a slice of bytes that has special iteration behavior when used with the for x in y construct.

str := "Hellope!"
l := len(str) // 8

uni := "こにちは"
luni := len(uni) // 12

This shows us that Odin’s len reports the byte count of the string, as opposed to the number of codepoints.
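
If you need the number of codepoints rather than bytes, one option is to count the runes explicitly. This sketch uses rune_count_in_string from the core:unicode/utf8 package.

```odin
package main

import "core:fmt"
import "core:unicode/utf8"

main :: proc() {
    uni := "こにちは"

    // len counts bytes: each of these characters takes 3 bytes in UTF-8.
    fmt.println(len(uni)) // 12

    // Counting runes walks the string and decodes it, so it is O(n).
    fmt.println(utf8.rune_count_in_string(uni)) // 4
}
```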

However, if we iterate over the string with for x in y, each iteration yields a UTF-8 decoded Unicode codepoint as a rune, along with the byte index at which that codepoint starts:

for r, i in uni {
    fmt.println(r, i)
}
// こ 0
// に 3
// ち 6
// は 9
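
The loop above can be approximated by hand, which shows why the index jumps by 3 each time. This is only a sketch of the idea using decode_rune_in_string from core:unicode/utf8, not a description of how the compiler implements the loop.

```odin
package main

import "core:fmt"
import "core:unicode/utf8"

main :: proc() {
    uni := "こにちは"

    i := 0
    for i < len(uni) {
        // Decode one codepoint starting at byte i; width is its size in bytes.
        r, width := utf8.decode_rune_in_string(uni[i:])
        fmt.println(r, i, width)
        i += width // step over the whole codepoint, not a single byte
    }
}
// こ 0 3
// に 3 3
// ち 6 3
// は 9 3
```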

This is not the case if you iterate over a string by index. Indexing a string yields its raw bytes:

for i := 0; i < len(uni); i += 1 {
    fmt.println(uni[i])
}
// 227
// 129
// 147
// 227
// 129
// 171
// 227
// 129
// 161
// 227
// 129
// 175

fmt.println(typeid_of(type_of(uni[0])))
// u8

As you can see, the string data type in Odin behaves like a slice of u8, that is, of unsigned bytes.
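
If you want for x in y to hand you bytes instead of runes, one approach is to reinterpret the string as a []u8 first. A minimal sketch, relying on a string and a byte slice having the same pointer-plus-length layout:

```odin
package main

import "core:fmt"

main :: proc() {
    uni := "こにちは"

    // Same pointer and length, now typed as a byte slice. Nothing is copied.
    // Do not write through this slice: the literal's memory may be read-only.
    bytes := transmute([]u8)uni

    fmt.println(len(bytes)) // 12

    // Iterating a []u8 behaves like any other slice: one element per step.
    for b, i in bytes {
        fmt.println(b, i)
    }
}
```

The first three lines printed by the loop are 227 0, 129 1 and 147 2, matching the index-based loop above.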

It’s important to keep this in mind when working with strings, as this is the only case in Odin where iteration over a slice-like data type behaves differently from the usual slices or arrays.