Jump to content

Lua tables as sets (instead of arrays)


Recommended Posts

I don't know why I even have to point this out since this is something I figured out automatically in my early days of MTA scripting. But it seems necessary because I see people using Lua tables but not taking advantage of their flexibility.

Very brief overview of tables as arrays

Anyway, Lua tables are often used as arrays - that is, data structures that store values under consecutive integer keys starting at 1 (0 in most other languages, but that's another story), making a sequence. Various functions that operate on tables, including those in table namespace, treat them as arrays. Take this example:

local fruits = {}
table.insert(fruits, "apple")
table.insert(fruits, "banana")
table.insert(fruits, "lemon")
table.insert(fruits, "orange")

It creates an empty table, then uses table.insert to insert values. Because table.insert treats the table as an array and because it inserts at the end of the sequence if position argument is omitted, we end up with values "apple", "banana", "lemon" and "orange" values under keys 1, 2, 3 and 4 respectively.

Using index operator [], you can retrieve the associated value from the key:

-- assuming standalone Lua, using print
-- replace with outputChatBox or another output function if running in MTA
print(fruits[1]) -- outputs: apple
print(fruits[2]) -- outputs: banana
print(fruits[3]) -- outputs: lemon
print(fruits[4]) -- outputs: orange

Tables as sets

However, table keys don't have to be integers. You can use any value except nil and nan. Strings, other tables, MTA elements. Therefore, if the order of values is irrelevant and the values are unique, instead of inserting the value under some key, you can use that value itself as the key. And set the associated value to true:

local fruits = {}
fruits["apple"] = true
fruits["banana"] = true
fruits["lemon"] = true
fruits["orange"] = true

Now we have a table with keys "apple", "banana", "lemon" and "orange", while the associated value true is only used to indicate the presence of key and nothing else. That effectively makes the table a set, a collection of unique values with no particular order.

Usage and comparison

Insertion

So we have two different ways to insert values:

-- array
table.insert(fruits, "apple")

-- set
fruits["apple"] = true

This alone doesn't say much. Both operations are simple and take a roughly constant amount of time to execute. However, it makes other operations very different.

Removal

Because arrays require you to know the key to remove the value, you have to loop through the array to find it. In contrast, with sets you can just assign nil (because nil is the value for unassigned fields):

-- array
function removeFromArray(array, value)
    for i, v in ipairs(array) do
        if v == value then
            table.remove(array, i)
        end
    end
end

removeFromArray(fruits, "banana")

-- set
fruits["banana"] = nil

Arrays are very inefficient for this if there is a large number of values, because the more values there are in total, the longer the removal of a single value will take - whereas removing from the set will take more or less the same.

Checking if exists

Checking for presence of a value is a lot like removal, you need to know where the value is, so arrays require looping. With sets, you just retrieve the associated value. If the checked value exists, the associated value will be true, otherwise it will be nil.

-- array
function existsInArray(array, value)
    for i, v in ipairs(array) do
        if v == value then
            return true
        end
    end

    return false
end

if existsInArray(fruits, "lemon") then
    print("found")
else
    print("not found")
end

-- set
if fruits["lemon"] then
    print("found")
else
    print("not found")
end

Arrays are again inefficient in the same way.

Looping

Looping is roughly the same, but as far as I know, Lua tables are optimized to be used as arrays, so I guess looping may be a little faster for arrays than sets. I have never checked this myself.

-- array
for key, value in ipairs(fruits) do
    print("found value: "..value)
end

-- set
for value in pairs(fruits) do
    print("found value: "..value)
end

Notice that ipairs is used for the array and pairs is used for the set. pairs will work for the array as well, but the order will be unspecified. ipairs is meant to be used on arrays, it starts at index 1 and increments until no value is found.

Size retrieval

Lua has length operator #, which returns the array length when used on tables. But there is no built-in operator or function to retrieve the total number of fields in a table, so it takes more to get the size of a set.

-- array
local count = #fruits
print("fruit count: "..count)

-- set
function getSetSize(set)
    local count = 0

    for value in pairs(set) do
        count = count+1
    end

    return count
end

local count = getSetSize(fruits)
print("fruit count: "..count)

This makes size retrieval much more efficient for arrays than sets because the more values the set has, the longer it takes to loop through them.

That's speaking of sets in their simplest form though. You can have a structure like this:

local fruits =
{
    count = 0,
    values = {}
}

And modify the count every time a value is inserted or removed. Then you can retrieve the size simply by reading fruits.count, which is efficient. But then other operations have to be altered too so we're not getting into this here.

Checking if empty

Checking if a collection is empty means checking if its size is 0. For arrays, nothing changes, but for sets it can be done efficiently.

-- array
if #fruits == 0 then
    print("array is empty")
else
    print("array is not empty")
end

-- set
if next(fruits) == nil then
    print("set is empty")
else
    print("set is not empty")
end

next is a Lua function for traversing the table. It's used by pairs too, called once for each pair. But here we call next directly. Its first returned value is one of the keys in the table, or nil if no key is found.

Conclusion

For some of the most common operations, sets are both more efficient and simpler to operate on than arrays. In its simplest form, it loses against array in size retrieval, but you can keep track of size separately. When you want to store unique values and their order is unimportant, using tables as sets is probably the right choice.

Syntax summary:

tbl = {}

-- insertion
tbl[value] = true

-- removal
tbl[value] = nil

-- checking if exists
if tbl[value] then

end

-- looping
for value in pairs(tbl) do

end

-- checking if empty
if next(tbl) == nil then

end

 

Edited by CrystalMV
  • Like 6
Link to post
  • IIYAMA pinned this topic
  • Recently Browsing   0 members

    No registered users viewing this page.

×
×
  • Create New...