Programming fundamentals - Part 04: Compound data with tables

2025/08/31

Tags: programming tech programming fundamentals

This is the fourth chapter in a series covering programming fundamentals, aimed at newcomers to programming and at those refreshing their basics.

Today we'll go over Lua's tables and how they can be used to construct compound data such as lists or structure data into fields of keys and values.

Compound data

Just as we learned to form compound expressions for more complicated calculations, we can combine data into structured data to represent concepts more complex than individual numbers or strings could express.

The most obvious case is when we are dealing with an undetermined number of elements. In such a case we cannot simply assign values into pre-determined variables each holding one value: we would inevitably lose track and our code would become ridiculously complex!

Another case is data coupling. Sometimes we want to associate a bunch of data points together under a unified structure. For example, if we are dealing with information related to people, we might want to keep information about a specific person together. This way we don't need to declare multiple variables for that person; we just declare one and use that variable to access different fields, such as the person's name, age or occupation.

In most programming languages there are different data types used for lists and for structured data. However, Lua makes our life easy by only providing one data type for both: the table. Depending on how we use the table, it will allow us to express either lists or structured data. We will first go through these two cases separately and then explain how this versatility comes about.

Making and indexing lists

As our introduction to the syntax surrounding the table type, let's start off easy and make a list of the numbers from 1 to 5. Then let's print it out and also have a look at the data type:

local myList = {1, 2, 3, 4, 5}
print(myList)
print(type(myList))
table: 0x564310236980
table

The data type is pretty clear: tables are simply known as tables. However, printing the value of the table resulted in some pretty odd output. Instead of printing the values contained within our table, print() gives us a hexadecimal address, meaning the location where our table is stored in memory. That is why, if you try out that code, you will almost certainly see a different value.

This might make you think that we need to do something with that hexadecimal value, and you would be partially correct, but in practice we rarely benefit directly from the address of the table at all.
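
Where the address does show up indirectly is in how tables are compared and shared. The following is just a small sketch with made-up variable names: two tables with the same contents are still two separate tables, while two variables can refer to one and the same table.

```lua
local listA = {1, 2, 3}
local listB = {1, 2, 3}   -- same contents, but a separate table
local alias = listA       -- a second name for the very same table

print(listA == listB)     -- false: two different tables in memory
print(listA == alias)     -- true: both names refer to one table

alias[1] = 99             -- changing the table through one name...
print(listA[1])           -- ...is visible through the other: 99
```

So even though we never use the address itself, it helps to remember that a table variable points at a table rather than containing a private copy of it.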

If we want to read into the table and print out the values we stored in there, we have to do what is known as indexing. We have to point to a specific slot inside the table to retrieve the value stored in that slot. It's not much different from pointing at a variable by giving its name; just the syntax is slightly different.

If you already have some programming experience, this might be a point of confusion, since many languages start counting list positions from 0. In Lua, lists start from 1, so if we want to get the first element of our list, we have to point at the index (or slot) 1. Like so:

local myList = {1, 2, 3, 4, 5} 
local firstValue = myList[1]
print(firstValue)
1
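
As a quick aside for anyone used to zero-based languages, here is a small sketch of what happens at the edges of the list. Asking for a slot that holds nothing does not cause an error in Lua, it simply gives us nil:

```lua
local myList = {1, 2, 3, 4, 5}

print(myList[1])  -- 1, the first element
print(myList[5])  -- 5, the last element
print(myList[0])  -- nil, there is nothing stored at index 0
print(myList[6])  -- nil, we are past the end of the list
```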

So, providing the name of our table variable and adding square brackets with our index key inside will allow us to access individual elements of the list. This means that if we want to print all 5 elements, we can do so with a loop:

local myList = {1, 2, 3, 4, 5} 

for i = 1, 5 do
   print(myList[i])
end
1
2
3
4
5

That works if we know how many elements are in the list, but what if we don't? In that case we can first retrieve the length of the table using the length operator # and iterate that many times:

local myList = {1, 2, 3, 4, 5} 

for i = 1, #myList do
   print(myList[i])
end
1
2
3
4
5

However, if we want to go through every element in a list, there is also an alternative for loop syntax we can use:

local myList = {1, 2, 3, 4, 5} 

for index, value in ipairs(myList) do
   print(index, value)
end
1	1
2	2
3	3
4	4
5	5

This is known as the iterator syntax. The function ipairs() returns a special object known as an iterator, which produces a sequence of values, or more specifically in this case a sequence of pairs of values. These are the index and the value stored at that index. We won't go into details on how iterators are made just yet, but the key thing to know is that we can consume them using for loops. The structure in all cases is the same:

for VALUES in ITERATOR do
   BODY
end

Consuming lists with iterators can make your life a bit easier, since you do not need to separately index into the list and can instead just pull the value out directly for processing. However, if you only need to deal with a part of the list, creating for loops with indices may be more suitable and avoid unnecessary loops.
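
To illustrate the last point, here is a minimal sketch of handling only a part of a list with an index-based loop. The list contents and the sum variable are made up for this example:

```lua
local myList = {10, 20, 30, 40, 50}
local sum = 0

-- Only visit the indices 2 through 4, skipping the first and last elements
for i = 2, 4 do
   sum = sum + myList[i]
   print(i, myList[i])
end

print(sum)  -- 90
```

With ipairs() we would have gone through all five elements and needed an extra check inside the loop to skip the ones we don't care about.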

Managing lists

Typically, when we have data in a list, we want to perform various tasks on its elements. Take a to-do list for example: we probably won't be able to declare all of the tasks ahead of time, and we might not be interested in keeping completed tasks around. We might also want to reorder tasks based on priority or move tasks into separate lists to categorize them.

Let's start by creating an example of a to-do list and see how we can start managing that list.

local myTodoList = {"Write new chapter on compound data", "Check the grammar", "Eat a snack"}

Let's say we want to add a new task to remind ourselves to publish the chapter when we are done with everything else. For that purpose there is a function for inserting elements into lists:

local myTodoList = {"Write new chapter on compound data", "Check the grammar", "Eat a snack"} 

table.insert(myTodoList, "Publish the new chapter")

print(myTodoList[#myTodoList]) -- Let's print the last element to see it was added to our to-do
Publish the new chapter

By default table.insert() will add the specified element to the end of the list, but if we already know where it should be placed, we can also insert things positionally by supplying three arguments instead of two:

local myTodoList = {"Write new chapter on compound data", "Check the grammar", "Eat a snack"} 

table.insert(myTodoList, 3, "Publish the new chapter")

print(myTodoList[3])
print(myTodoList[4])
Publish the new chapter
Eat a snack

As you can see, this caused the task to be added into the list before our important snack time.

However, let's suppose we are getting a bit hungry and we decide to eat a snack right away. This means we don't need to keep that task around anymore and can remove it from our to-do list. For that we have the inverse operation table.remove() which will either remove the last element of the list if called with no position or the element at the designated position otherwise. It will also return the element in case we wish to look at the element we removed. We could for example add it to a list of completed tasks.

local myTodoList = {"Write new chapter on compound data", "Check the grammar", "Publish the new chapter", "Eat a snack"} 
local completedTasks = {}

local removedTask = table.remove(myTodoList)

table.insert(completedTasks, removedTask)

print("Task completed: " .. completedTasks[1])
Task completed: Eat a snack
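
The example above removes the last element. As a small sketch of the positional form, here is what happens if we instead remove the task at position 2. The remaining elements shift down to fill the gap:

```lua
local myTodoList = {"Write new chapter on compound data", "Check the grammar", "Eat a snack"}

local removedTask = table.remove(myTodoList, 2)

print(removedTask)    -- Check the grammar
print(#myTodoList)    -- 2
print(myTodoList[2])  -- Eat a snack, which moved from slot 3 to slot 2
```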

You can also overwrite an element in a list using a normal assignment. Let's say we want to make sure we check not only grammar but also spelling:

local myTodoList = {"Write new chapter on compound data", "Check the grammar", "Publish the new chapter", "Eat a snack"} 

myTodoList[2] = "Check the grammar and spelling"

print(myTodoList[2])
Check the grammar and spelling

Now, let's try a slightly different reordering. Let's assume that instead of having our tasks in priority order, we want to put them in alphabetical order. Maybe we want to do that to help us find a specific task or maybe we just prefer to make our own lives difficult.

Either way, Lua has us covered with the table.sort() function.

local myTodoList = {"Write new chapter on compound data", "Check the grammar", "Publish the new chapter", "Eat a snack"} 
table.sort(myTodoList)

-- Note the use of "_" here: we can use it to ignore values we don't care about, like the index!
for _, task in ipairs(myTodoList) do
   print(task)
end
Check the grammar
Eat a snack
Publish the new chapter
Write new chapter on compound data

Excellent, now we'd have to check the grammar of a non-existent chapter and publish it before we even write it. Quite the causal conundrum.

The table.sort() function is actually quite powerful. By default it compares elements with the < operator and puts them in ascending order. In the case of strings, this will be alphabetical order (although with the default settings uppercase letters typically sort before lowercase ones). However, if we want to implement alternative sorting strategies, we can pass it a comparison function that implements the sorting logic.
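
As a warm-up, here is a minimal sketch of a comparison function. The name descending is made up for this example; the function should return true when its first argument belongs before the second, so flipping the comparison gives us a descending sort:

```lua
-- Returns true when a should be placed before b
local function descending(a, b)
   return a > b
end

local numbers = {3, 1, 5, 2, 4}
table.sort(numbers, descending)

for _, number in ipairs(numbers) do
   print(number)  -- prints 5, 4, 3, 2, 1 on separate lines
end
```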

Let's try to do something a bit complex. Let's assume that our tasks have gotten jumbled, but we prepared for this by giving each of the tasks a priority value to sort by. This way we can reconstruct the order of the tasks via table.sort(). For now we will represent the pairing of task and priority by storing each of the tasks as a list of two elements.

local myTodoList = {{2, "Check the grammar"}, {4, "Eat a snack"}, {3, "Publish the new chapter"}, {1, "Write new chapter"}}

This way we have a list of lists. This sort of nesting can be done to any depth we desire, although many layers of nesting may become difficult to grasp if we aren't careful. This might also not be the best way to represent associative data, but we will get to that soon. For now this will work for our purposes.

In order to sort the tasks into the right order, we need to sort by the first value and not by the second value. So, we need a function that takes two lists and returns the comparison of the first elements from both. Like so:

function byPriority(listA, listB)
   return listA[1] < listB[1]
end

print(byPriority({1, "Higher priority"}, {2, "Lower priority"}))
true

We can then pass that function as a parameter to table.sort() to get our list back into priority order:

function byPriority(listA, listB)
   return listA[1] < listB[1]
end

local myTodoList = {{2, "Check the grammar"}, {4, "Eat a snack"}, {3, "Publish the new chapter"}, {1, "Write new chapter"}} 

table.sort(myTodoList, byPriority)

for _, task in ipairs(myTodoList) do
   print(task[2]) -- Since we want to just print the task text
end
Write new chapter
Check the grammar
Publish the new chapter
Eat a snack

Functions taking functions as arguments are known as "higher-order functions" and they can be quite versatile. We will cover them further when we delve deeper into functions and a programming paradigm known as functional programming in a later chapter.

Associative key-value data

Technically we already touched on associative data by creating something called an "association list". These are just lists containing lists, where the first element is the key and the second element is the value. They theoretically allow us to represent any associative data we want, but they come at the cost of speed.

In an association list, finding a specific element by its key would require us going through potentially the entire list and we would also need to deal with the boilerplate of indexing into the list pairs to separate the keys and the values, likely by creating some helper functions.

However, Lua provides a mechanism that lets us do this more quickly and with less code: while tables allow us to store sequential lists, they also allow us to store data as arbitrary key-value pairs.

We can declare a key-associative table and index into it like so:

local myKeyValueMap = {someKey = "Some value", foo = "bar", answer = 42} 

print(myKeyValueMap["someKey"])
print(myKeyValueMap["foo"])
print(myKeyValueMap["answer"])

-- Alternatively we have the following syntax
print(myKeyValueMap.someKey)
print(myKeyValueMap.foo)
print(myKeyValueMap.answer)
Some value
bar
42
Some value
bar
42

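The alternative syntax is functionally equivalent and works as an easy shorthand when the keys are known ahead of time and are valid names (letters, digits and underscores, not starting with a digit). If you don't know the keys ahead of time, for example because the key is stored in a variable, you can use the normal square bracket indexing.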
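
Here is a small sketch of the cases where only the square brackets will do. The key names are made up for this example:

```lua
local myKeyValueMap = {someKey = "Some value", ["favourite food"] = "snacks"}

-- The key is stored in a variable, so we index with square brackets
local keyToLookUp = "someKey"
print(myKeyValueMap[keyToLookUp])       -- Some value

-- The dot syntax looks for a key literally named "keyToLookUp"
print(myKeyValueMap.keyToLookUp)        -- nil

-- A key containing a space cannot be written with the dot syntax at all
print(myKeyValueMap["favourite food"])  -- snacks
```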

This looks quite similar to lists, and that's because lists and key-value tables are essentially one and the same (discounting some optimizations under the hood). If we omit the keys when we create a table, Lua will just substitute them with an incrementing index starting from 1. This is why general functions can work on both key-value data and lists, while list-specific functions such as ipairs(), table.remove() and table.insert() simply assume the table can be indexed by sequential numbers.
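
To see this for ourselves, here is a small sketch comparing a list written the usual way with one where we spell out the numeric keys by hand. The table names are made up for illustration:

```lua
local implicit = {"a", "b", "c"}
local explicit = {[1] = "a", [2] = "b", [3] = "c"}

for i = 1, 3 do
   print(implicit[i], explicit[i])  -- the same value from both tables
end
print(#implicit, #explicit)         -- 3	3

-- List-style and key-value entries can even live in the same table
local mixed = {"first", "second", name = "mixed table"}
print(mixed[1], mixed[2], mixed.name)
```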

However, since the aforementioned functions are meant for use with sequential tables, we need to do certain things a bit differently. If we want to add a new element into a key-value table, or change an existing one, we use assignment:

local myKeyValueMap = {answer = 42} 

myKeyValueMap.answer = "Thanks for the fish"

print(myKeyValueMap.answer)
Thanks for the fish

Deletion of a key is simply done by assigning nil to it:

local myKeyValueMap = {answer = 42} 

myKeyValueMap.answer = nil
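
As a short sketch putting these together, we can add a brand new key (the key question is made up for this example) and check that a deleted key reads back as nil, just like a key that never existed:

```lua
local myKeyValueMap = {answer = 42}

myKeyValueMap.question = "Unknown"  -- a key that did not exist before
print(myKeyValueMap.question)       -- Unknown

myKeyValueMap.answer = nil          -- delete the "answer" key
print(myKeyValueMap.answer)         -- nil
print(myKeyValueMap.neverExisted)   -- nil
```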

If you want to iterate through all of the key-value pairs in a table, the pairs() function will work nicely for you. It works very similarly to ipairs() with the following differences:

  • All key-value pairs will be returned. ipairs() only returns the integer keys 1, 2, 3 and so on, stopping at the first missing one.
  • Key-value pairs are returned in an unspecified order. Even running the same code again may return the pairs in a different order.

So, in theory you can get the index-value pairs out of a sequential list with pairs(), but if you were to do so, you might end up getting the pairs in the wrong order. And in the reverse case, you wouldn't be able to get arbitrary key-value pairs out of a table with ipairs().

Otherwise there are no surprises to the function (the order of your output may differ):

local myKeyValueMap = {someKey = "Some value", foo = "bar", answer = 42} 

for key, value in pairs(myKeyValueMap) do
   print(key, value)
end
someKey	Some value
foo	bar
answer	42
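
To make the differences between the two iterators concrete, here is a small sketch that runs both over the same table. The table is made up for this example and deliberately mixes list-style entries, a named key and a numeric key far away from the rest:

```lua
local mixed = {"first", "second", name = "mixed table", [10] = "far away"}

local ipairsCount = 0
for index, value in ipairs(mixed) do
   ipairsCount = ipairsCount + 1
   print("ipairs:", index, value)  -- only indices 1 and 2
end

local pairsCount = 0
for key, value in pairs(mixed) do
   pairsCount = pairsCount + 1
   print("pairs:", key, value)     -- all four entries, in no particular order
end

print(ipairsCount, pairsCount)     -- 2	4
```

ipairs() stops as soon as it finds nothing at index 3, so it never reaches index 10, and it never sees the name key at all.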

Tangent: why are key-value tables faster than association lists?

Since I asserted that association lists are a slower way to represent associative data, you might be wondering why that is. We'll cover data structures more in a future chapter, but as a small taster we can have a quick look into how tables work on a high level.

A search operation in a general association list would probably need to be done linearly, meaning each key-value pair is checked in sequence to find the matching key. The following function shows how this might be implemented:

function assocLookup(list, key)
   for _, elementPair in ipairs(list) do
      if elementPair[1] == key then
         return elementPair[2]
      end
   end
end

local myAssocList = {{"foo", "bar"}, {"answer", 42}}

print(assocLookup(myAssocList, "answer"))
42

In a key-value table, we can use hashing as a method to make lookups run essentially in constant time. This works by implementing a hashing function that can produce a moderately unique numeric value for a given key. We can then use that numeric hash value to index into a list directly without having to inspect the other elements. Because of the use of hashing, such data structures are often referred to as "hash maps" or "hash tables", since they map a key to a value using the hash of the key. Let's assume the existence of a hash() function that generates hashes and implement something like a table.

function hashLookup(list, key)
   -- The modulo gives a value from 0 to #list - 1, so we add 1 for Lua's 1-based lists
   local index = hash(key) % #list + 1

   return list[index]
end

In this case we assume that the list has been created with a pre-determined length and we use the modulo operation to convert the hash into an index into this list. Because we don't have to concern ourselves with the other elements of the list, this function will be significantly faster on larger tables than the association list lookup.

There are a few caveats. The hashing function needs to produce values that are sufficiently distributed, meaning that as few keys as possible should map to the same hash. And even with an evenly distributed hash function, it's likely that there will be hash collisions. A properly implemented hash table takes this into account in order to resolve collisions and to manage the table as it grows to reduce the likelihood of collisions.
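
To give a rough idea of what resolving collisions can look like, here is a toy sketch. It is not how Lua implements tables internally. Everything in it is made up for illustration: the hash() function (which just adds up character codes), the fixed number of buckets and the helper names. Each slot holds a small association list, so keys that land in the same slot can still be told apart:

```lua
-- A deliberately simple hash function: the sum of the key's character codes
local function hash(key)
   local sum = 0
   for i = 1, #key do
      sum = sum + string.byte(key, i)
   end
   return sum
end

local BUCKET_COUNT = 8  -- assumed fixed size, a real table would grow

local function newToyTable()
   local buckets = {}
   for i = 1, BUCKET_COUNT do
      buckets[i] = {}  -- each bucket is a small association list
   end
   return buckets
end

-- Pick the bucket for a key, adding 1 because Lua lists start from 1
local function bucketFor(buckets, key)
   return buckets[hash(key) % #buckets + 1]
end

local function toySet(buckets, key, value)
   local bucket = bucketFor(buckets, key)
   for _, pair in ipairs(bucket) do
      if pair[1] == key then
         pair[2] = value  -- the key already exists, overwrite its value
         return
      end
   end
   table.insert(bucket, {key, value})
end

local function toyGet(buckets, key)
   for _, pair in ipairs(bucketFor(buckets, key)) do
      if pair[1] == key then
         return pair[2]
      end
   end
   return nil
end

local toy = newToyTable()
toySet(toy, "ab", "first")
toySet(toy, "ba", "second")  -- same characters, same hash: a collision
toySet(toy, "answer", 42)

print(toyGet(toy, "ab"))       -- first
print(toyGet(toy, "ba"))       -- second
print(toyGet(toy, "answer"))   -- 42
print(toyGet(toy, "missing"))  -- nil
```

The linear search is still there, but it only runs over the handful of pairs that share a bucket rather than over the entire table.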

Trees of hierarchical data

We already know that tables can be nested inside other tables. Tables nested this way form a structure with a specific name: a tree. In some cases we don't necessarily think of them as such, but in practice that's what nested lists or tables are.

Trees are just tables organized into hierarchical form and they allow us to represent very complex data indeed. For example, an entire game's state could be represented as a multi-level tree structure based on tables:

local gameState = {
   gameOver = false,
   tiles = {1,1,1,1,1,
            1,0,0,0,1,
            1,1,1,0,1,
            1,0,0,0,1,
            1,1,1,1,1},
   player = {name = "Samsai", hp = 10, class = "rogue", position = {x = 2, y = 2}},
   enemies = {
      {enemyType = "orc", hp = 3, position = {x = 2, y = 4}},
      {enemyType = "skeleton", hp = 2, position = {x = 4, y = 4}}
   }
}

This gives you a lot of freedom in how to group your data based on what it's related to, how to name it in a predictable way and how to process only the data that is relevant to a specific situation.
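
Reading from and writing to such a tree is just a matter of chaining the indexing we already know. The sketch below uses a trimmed copy of the game state. The tileAt() helper and MAP_WIDTH are hypothetical additions, and they assume the tiles are stored row by row on a map 5 tiles wide with positions counted from 1:

```lua
local gameState = {
   tiles = {1,1,1,1,1,
            1,0,0,0,1,
            1,1,1,0,1,
            1,0,0,0,1,
            1,1,1,1,1},
   player = {name = "Samsai", hp = 10, position = {x = 2, y = 2}},
   enemies = {
      {enemyType = "orc", hp = 3, position = {x = 2, y = 4}},
      {enemyType = "skeleton", hp = 2, position = {x = 4, y = 4}}
   }
}

-- Assumption: the map is 5 tiles wide and stored row by row
local MAP_WIDTH = 5

-- Hypothetical helper: look up the tile at a 1-based (x, y) position
local function tileAt(state, position)
   return state.tiles[(position.y - 1) * MAP_WIDTH + position.x]
end

print(gameState.player.name)                         -- Samsai
print(gameState.player.position.x)                   -- 2
print(tileAt(gameState, gameState.player.position))  -- 0

-- Nested values can be updated with normal assignment
gameState.enemies[1].hp = gameState.enemies[1].hp - 1

for _, enemy in ipairs(gameState.enemies) do
   print(enemy.enemyType, enemy.hp, tileAt(gameState, enemy.position))
end
```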

With the powers of compound expressions and compound data, creation of basically any software we'd like is within our grasp.

What's next?

Since we now know how to structure our data, next time we are going to look at how to move data in and out of our programs with I/O, input and output, and how to store and retrieve it with files.

Stay tuned!
