Beyond knowing the basics, writing more featureful programs requires the programmer to understand structure in order to manage the growing complexity of the code. A long program spanning thousands of lines of code can easily become a quagmire if it's all in one long file, with functions and variables placed haphazardly throughout without rhyme or reason. Over time the pace of development, the so-called velocity, will drop and the program will become fragile and difficult to reason about.
By splitting our program up into components we can reason about its behavior not at the level of individual lines of code, but at the component or module level, where hundreds of lines of code interact with each other. Assuming our components are working as we expect them to, we can focus on a specific area of the problem without needing to account for the totality of our code at any given time.
What is a module?
In programming, a module (or submodule) is typically a file of source code that provides functionality for use in another file. The modules, when put together, form the actual program. This means that files and programs aren't the same thing: a program can consist of a single file, but not every file is a program by itself.
In a well-structured program a module typically contains functions (and maybe some values or variables) that deal with one clearly defined area of the program. For example, in a game you might have a module that handles loading and saving save files or a module that contains the logic for enemy behaviors. If you cannot clearly and easily explain the one thing a module is responsible for, then the module likely could use further splitting or reorganizing. Sometimes a module might be put together to contain a number of miscellaneous utility functions which don't have a better place to be, and I consider this fine as long as the number and complexity of those utilities is low. These smorgasbord utility modules are probably the first places to be restructured, though.
Sections of the program pull in modules via import expressions, which tell the computer to load and then expose features inside that module to that section of the program. It's worth noting that in most programming languages importing a module does not imply that the module is available in other sections of the program besides the ones that imported the module directly, so the same module may need to be imported multiple times across the program's files if the module is used in many places. This is fine though and most programming languages avoid wasting time reading the same modules again.
Declaring modules and requiring them
Let's create a simple Lua module for handling basic 2D vector math. We'll
create a new file called vector.lua and fill in some basic functions for
creating a vector, vector addition and scalar multiplication:
function makeVector(x, y)
  return {x=x, y=y}
end
function vectorAdd(v1, v2)
  local newX = v1.x + v2.x
  local newY = v1.y + v2.y
  return makeVector(newX, newY)
end
function scalarMult(v1, mult)
  local newX = v1.x * mult
  local newY = v1.y * mult
  return makeVector(newX, newY)
end
This file can now be used as a module in our program. To import the
module, let's create a new file test.lua in the same directory as
the vector.lua file and then use the require expression to bring
the module into the scope of our program. Our program looks like this:
require "vector"
local myVec1 = makeVector(1, 1)
local myVec2 = makeVector(2, 0)
local newVec = scalarMult(vectorAdd(myVec1, myVec2), 10)
print(newVec.x, newVec.y) --> 30    10
This way of declaring modules simply takes all of the functions and
values that are not declared as local and brings them into the global
scope of the program when require is used on them. This is very simple,
but since all of the module's contents are just inserted into the global
scope, it can get tricky to figure out which module a particular value
came from and it gets easy to accidentally introduce collisions.
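To make the collision risk concrete, here is a minimal sketch. The module names flat_vector and flat_money are hypothetical, and the modules are registered in memory through package.preload only so that the example runs without extra files; in a real project each would be its own .lua file.

```lua
-- Two hypothetical flat modules, registered in memory through package.preload
-- so the example runs without extra files. Both define a global named `add`.
package.preload["flat_vector"] = function()
  function add(a, b)            -- global: adds two 2D vectors
    return {x = a.x + b.x, y = a.y + b.y}
  end
end

package.preload["flat_money"] = function()
  function add(a, b)            -- global: adds two plain numbers
    return a + b
  end
end

require "flat_vector"
require "flat_money"            -- silently replaces the vector version of add

-- The call now reaches the money version, which fails on table arguments.
local ok, err = pcall(add, {x = 1, y = 1}, {x = 2, y = 0})
print(ok)                       --> false
```

Neither require call reports a problem; the mistake only shows up later, when the wrong add is called.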
Another way is to manually collect the exports of a module inside a table
and return that table from the module. When a table-style module is imported,
the global scope remains untouched and the module's functionality must be used
via the table. Let's create a variant of our module inside a vectors.lua
file that is structured in the table style:
local vectors = {}
function vectors.makeVector(x, y)
  return {x=x, y=y}
end
function vectors.vectorAdd(v1, v2)
  local newX = v1.x + v2.x
  local newY = v1.y + v2.y
  return vectors.makeVector(newX, newY)
end
function vectors.scalarMult(v1, mult)
  local newX = v1.x * mult
  local newY = v1.y * mult
  return vectors.makeVector(newX, newY)
end
return vectors
We use this module by assigning the return value of our require statement
into a local variable, which will contain the table with all of its vector
functions:
local vectors = require "vectors"
local myVec1 = vectors.makeVector(1, 1)
local myVec2 = vectors.makeVector(2, 0)
local newVec = vectors.scalarMult(vectors.vectorAdd(myVec1, myVec2), 10)
print(newVec.x, newVec.y) --> 30    10
This approach might look a bit clumsy, since it involves look-ups into the vectors table to access the module's functions, but these sorts of hygienic modules have the benefit of clarity with regard to what functionality we are exporting and where the functionality is coming from. If you prefer the flat modules built around global values, you can still use them where strict import/export hygiene is not required.
Note that modules may also require other modules, so if you want
to reuse code from another module, you just need to require it the
same way as you would elsewhere in the code.
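As a rough sketch of one module requiring another, the example below defines a hypothetical movement module that builds on the table-style vectors module. Both are registered through package.preload purely to keep the example in a single runnable file. The loadCount counter is also only for illustration: it shows that Lua runs a module's body once and hands back the cached table on later require calls.

```lua
-- Hypothetical modules registered in memory via package.preload so the sketch
-- is self-contained; in a real project each would live in its own .lua file.
local loadCount = 0

package.preload["vectors"] = function()
  loadCount = loadCount + 1     -- counts how often the module body runs
  local vectors = {}
  function vectors.makeVector(x, y)
    return {x = x, y = y}
  end
  function vectors.vectorAdd(v1, v2)
    return vectors.makeVector(v1.x + v2.x, v1.y + v2.y)
  end
  return vectors
end

package.preload["movement"] = function()
  local vectors = require "vectors"   -- a module requiring another module
  local movement = {}
  function movement.step(position, velocity)
    return vectors.vectorAdd(position, velocity)
  end
  return movement
end

local movement = require "movement"
local vectors = require "vectors"     -- already loaded; the cached table is reused

local pos = movement.step(vectors.makeVector(1, 1), vectors.makeVector(2, 0))
print(pos.x, pos.y)                   --> 3    1
print(loadCount)                      --> 1
```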
Another thing to keep in mind is how require finds modules. Lua has
a set of default locations in which globally installed modules can be
found, but in addition Lua will try to look at files relative to where
your program is executed. This is how we got the imports to work with
our little test program by putting the two files
side-by-side. Sometimes you might not want to put all of the program
files in the same directory, in which case you need to indicate
which folders you are loading files from. To look inside subfolders,
you can simply separate the directories in the module name with dots,
like this: require "subfolder.mymodule". With some string
manipulation, you can also require modules relative to the current
file's location, if you want: https://stackoverflow.com/a/9146653.
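The list of locations that require searches is stored in package.path, so one way to add a folder is to prepend a template to it. The following is a small sketch; the folder name src is only an assumption for illustration.

```lua
-- package.path is a semicolon-separated list of templates; require replaces
-- each "?" with the module name (dots become directory separators).
-- "src" is a hypothetical folder name used only for illustration.
package.path = "./src/?.lua;" .. package.path

-- With this in place, require "vectors" would also try ./src/vectors.lua, and
-- require "math2d.vectors" would also try ./src/math2d/vectors.lua.
print(package.path)
```

The change only affects require calls made after the assignment, so it usually belongs at the very top of the program's entry file.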
When you are starting out, a relatively flat hierarchy will likely be easier to deal with, but feel free to experiment with more complex module hierarchies to get the hang of handling imports in Lua.
Libraries and third-party dependencies
Oftentimes in programming we like to use libraries when solving problems. Libraries are modules like any other, but with the distinction that instead of only being built as a way of structuring a program, typically libraries are built with the idea of deduplication of effort in mind. So, if you wanted a rule to distinguish between a module and a library, I would argue that a library is a module that has been built with the intention of solving a problem across a number of programs.
Once a library has been created, it usually can be slotted right into another program either by manually copying the files into the program's file structure or by using some kind of a library management tool to install the library files into an accessible location, which might be shared by multiple programs.
Because libraries are typically fairly self-contained and allow a solution to a problem to be reused many times, one important scaling factor of software development is the vast ecosystem of third-party libraries, which are often available entirely free of charge for developers. Without third-party dependencies, the same problems would need to be solved again and again by different teams of programmers, which would make software development significantly less economical.
For Lua there exists a centralized repository of third-party libraries in the form of https://luarocks.org/. Luarocks, alongside its package manager, provides a simple way to install software packages (which in the Luarocks ecosystem are called "rocks") that others have written for use in your programs. The package manager tool needs to be installed first, and you can refer to the official getting started guide for that. On most Linux operating systems, though, it's probably easier to install the "luarocks" package (or equivalent), and the same applies to Macs with Homebrew installed.
Once the Luarocks package manager is installed, it's very easy to install packages. For demonstration purposes, let's install the "xyz_math" library, which provides functions for vector math.
We will simply invoke the luarocks command in terminal and specify the
library we want to install:
λ luarocks install xyz_math
Note that you may need to use administrative privileges (e.g. sudo) to install dependencies into the global Lua directories. There are ways to install them locally, but they require altering Lua's default library paths, so that Lua can locate the locally installed libraries.
Once the package is installed, you can require it like any other
module.
local xyz = require "xyz_math"
local v1 = XVec2(2, 0)
local v2 = XVec2(-1, 3)
local v3 = v1 + v2
print(v3.x, v3.y) --> 1    3
Many, or probably most, of the libraries on Luarocks can be installed
like that without extra steps, but some libraries may have special
requirements, such as having some specific system libraries or
developer tools installed, since often Lua libraries are built on some
other more foundational libraries that are not written in Lua. In those
cases luarocks might throw an error and require you to investigate the
issue separately.
Even though third-party libraries simplify a lot of things quite significantly, you do need to keep a few things in mind when using them.
Firstly, when you are using other people's code, you should be aware that such code can, either on purpose or accidentally, introduce security issues. If a package is not well-known or popular, it's a good idea to attempt to vet the code to make sure it's not doing anything suspicious. And if the library is dealing with things like networking, it's also worth keeping it updated regularly to get potential security patches.
Secondly, it's worth checking out the license the library is offered under. Fully understanding licensing would require a degree in intellectual property law, but a general understanding is both achievable and necessary if you plan to use third-party code. Libraries should come with a license declaration, which should be visible on the library's home page or alongside the source code. Most libraries come under an open source license of some sort. The terms of these licenses can usually be found at OSI: https://opensource.org/licenses, and you can find a comparison table at Wikipedia. Most licenses, even the permissive ones, carry a minimum requirement to at least credit the authors of the code you are using, but some licenses may impose additional restrictions, such as requiring the code using the library to use the same or a compatible license.
In practice, license compliance is only something you need to worry about if you are planning to distribute your programs, but taking a few steps early on to understand what licenses you are subject to makes it easier to deal with potential distribution later.
Structuring a program - general guidelines
To wrap things up, let's go over some general suggestions for planning the structure of your programs. Software architecture is a whole field of its own and therefore we cannot be fully comprehensive here, but I'll offer what I believe are good guidelines that will hopefully steer you towards programs that are easy to understand.
Firstly, layer your programs in such a way that you separate out intricate details from the higher-level flow of the program. This way at the highest level your program is almost like a story that describes the kinds of actions the program is taking, while leaving the specific details of how those actions work in the lower-level sections. You could also consider it like writing a scientific article: you write text that references the details of other studies instead of describing the whole study where you are planning to use its results. This way the reader of the code can at each point forget about unnecessary detail and only concern themselves with what is happening at the current layer.
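As a small illustration of this kind of layering, consider the sketch below. The score format and all function names are made up for the example. The helpers at the top hold the fiddly details, while announceWinner at the bottom reads like a short description of what the program does.

```lua
-- Low-level details: each helper does one small, specific job.
local function parseScore(line)
  -- Expects a line such as "bob=30"; returns nil for anything else.
  local name, score = line:match("^(%w+)=(%d+)$")
  return name, tonumber(score)
end

local function parseScores(text)
  local scores = {}
  for line in text:gmatch("[^\n]+") do
    local name, score = parseScore(line)
    if name then scores[#scores + 1] = {name = name, score = score} end
  end
  return scores
end

local function bestScore(scores)
  -- Assumes the list contains at least one entry.
  local best = scores[1]
  for _, entry in ipairs(scores) do
    if entry.score > best.score then best = entry end
  end
  return best
end

-- High-level flow: reads like a short story of what the program does.
local function announceWinner(text)
  local scores = parseScores(text)
  local winner = bestScore(scores)
  return winner.name .. " wins with " .. winner.score .. " points"
end

print(announceWinner("ann=12\nbob=30\ncat=7"))  --> bob wins with 30 points
```

A reader who only wants to know what happens can stop at announceWinner; the parsing details are there for whoever needs them.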
Secondly, split your program into chunks that are easy to digest. If you have a very long section of code, split it up into functions. If you have a very long file, split it up into modules. However, do so at points that feel natural and not at some arbitrary point. When splitting code, consider the kind of interface you will have to create between those two chunks. If the interface seems complex, you are probably splitting at the wrong place.
To help with properly splitting code, try to make each section of the code do one thing. If you can't really assign one specific role to a section of the code, it's probably trying to do too many things at once. This will also become quite obvious when you are naming things in the code: if it becomes hard to name a module, a function or a variable, that is most likely a sign that that thing doesn't have a clear enough role in the program.
Also, when you are writing your software, it's good to consider testability of the code as an attribute of the structure. It's easy to write code that is really hard to test in individual chunks. Writing code that is easily testable, on the other hand, often results in a structure where the modules and functions are cleanly separated from each other, which helps maintain good layering and separation of concerns in the program.
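For instance, the table-style vector module from earlier is easy to test because each function only depends on its arguments. The sketch below uses nothing but Lua's built-in assert; the assertVector helper is a hypothetical convenience, and the module is inlined so that the example runs on its own.

```lua
-- The table-style vector module from earlier, inlined so the sketch runs on
-- its own; in a real project this would be: local vectors = require "vectors"
local vectors = {}
function vectors.makeVector(x, y) return {x = x, y = y} end
function vectors.vectorAdd(v1, v2)
  return vectors.makeVector(v1.x + v2.x, v1.y + v2.y)
end
function vectors.scalarMult(v1, mult)
  return vectors.makeVector(v1.x * mult, v1.y * mult)
end

-- Hypothetical helper: compares a vector against expected components.
local function assertVector(v, x, y)
  assert(v.x == x and v.y == y,
    ("expected (%s, %s), got (%s, %s)"):format(x, y, v.x, v.y))
end

-- Each check exercises one function in isolation.
assertVector(vectors.makeVector(1, 2), 1, 2)
assertVector(vectors.vectorAdd(vectors.makeVector(1, 1), vectors.makeVector(2, 0)), 3, 1)
assertVector(vectors.scalarMult(vectors.makeVector(3, 1), 10), 30, 10)
print("all vector tests passed")
```

If a check fails, assert raises an error with the formatted message, which is usually enough to see which function misbehaved.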
Finally, if you solve a problem that you expect to need to solve again later, split that functionality into a library you can easily copy into other programs and try to keep the interface of that library generic and unconcerned with program-specific details. If you think that the library might also be useful for others, consider releasing it as an open source library for the whole community to benefit from. You might also gain some helpful friends who will help you improve and maintain the library in the future.
That's it for structuring programs. Next time we'll delve a bit into algorithms and data structures to figure out how to analyze the efficiency of our programs and to make them go fast. See you then!