Lynn Kwong


Learn the Basics of Lua for Web Scraping as a Python Developer

Get started with the essentials of Lua in 10 minutes

Image by Clker-Free-Vector-Images (Quill Pen Write) in Pixabay

Lua is a lightweight, high-level programming language that is typically employed for scripting. It was designed to be embedded in other applications and allows developers to extend the functionality of their software with custom scripts.

As a Python developer, you may not normally have the chance to work with Lua. However, if you need to scrape JavaScript web pages in your work, there is a good chance you will use it because of Splash, a lightweight and scriptable browser engine developed by Zyte (previously Scrapinghub), the same company that develops Scrapy.

Lua is used in Splash as a scripting language to provide more advanced control over the web scraping process. With Lua scripts, you can interact with web pages, manipulate the DOM, execute advanced JavaScript, etc. In this post, we will introduce the basics of Lua that are essential for web scraping using Splash. You will then be able to understand Lua scripts in Splash and can start to write scripts by yourself.

Install Lua

Actually, for this post, we don’t need to install Lua. You can just try the commands in Lua Live Demo. However, if you want to run Lua locally on your own computer, you can simply download the source code and build it.

Basic syntax

This part is not meant to be comprehensive and will only cover the essentials that will very likely be needed in Splash scripting. For a more comprehensive introduction, the book “Programming in Lua” and the official reference manual are recommended.

  1. The comments in Lua start with two hyphens (--) as in SQL.
  2. Lua is case-sensitive.
  3. Variables do not need to be declared before they are used. They are global by default and become local only when declared with the local keyword.
  4. Lua is dynamically typed, meaning types belong to values rather than variables, as in Python. We don’t need to (and can’t) specify types when we declare variables.
  5. A string can be created with single quotes, double quotes, or double square brackets ([[ ]]). Double square brackets are used to write multi-line strings, which are very common in Splash because JavaScript code is normally written as a multi-line string.
  6. nil is similar to None in Python. However, it does more than serve as an empty or undefined value: assigning nil to a global variable or a table field effectively deletes it, and the value it held can then be garbage collected if nothing else references it.
  7. Only false and nil are falsy in Lua, and any other value is truthy, including 0 and empty strings.
  8. Functions are first-class values in Lua, similar to that in Python, meaning functions can be used in the same way as other types of data. They can be stored in variables, passed to another function, or returned from another function.
  9. Table is the only data structure in Lua. There are no other data structures like list/array, dictionary/object, etc, which are commonly found in other languages. However, all other data structures can be constructed based on tables as we will see later.
  10. When a function is called with a single string literal or a single table constructor as its only argument, the parentheses can be omitted. This can be confusing for beginners.
  11. Tables can be treated as objects and can have methods. The methods can be called with either a dot (obj.method()) or a colon (obj:method()). The latter is syntactic sugar for obj.method(obj). This is very commonly used in Splash and will be introduced in more detail later.
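
A few of the points above (multi-line strings, truthiness, and calls without parentheses) can be sketched as follows. The names js and isTruthy are made up for this illustration and are not part of Lua or Splash:

```lua
-- A multi-line string in double square brackets (handy for embedding JavaScript).
local js = [[
function getTitle() {
    return document.title;
}
]]

-- Only false and nil are falsy; 0 and "" are truthy.
local function isTruthy(value)
    if value then
        return true
    end
    return false
end

print(isTruthy(0))     -- true
print(isTruthy(""))    -- true
print(isTruthy(nil))   -- false
print(isTruthy(false)) -- false

-- Parentheses can be omitted when the only argument is a string literal
-- or a table constructor.
print "hello"          -- hello
local kind = type {}   -- same as type({})
print(kind)            -- table
```

Note that the newline right after the opening [[ is skipped, so the string starts directly with the word function.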

Next, we will illustrate the parts that need further explanation with some simple code.

Variables

It should be emphasized that only global variables can be used without a declaration. Local variables must still be declared with the local keyword. Otherwise, they will be global, even when created inside a function:

function testVariables()
    var1 = 100
    local var2 = 200
    print(var1, var2)
end

testVariables() -- 100    200

print(var1, var2) -- 100    nil

As you see, var1 is a global variable even though it’s created in a function, while var2 is not visible outside the function. As a best practice, always declare variables as local unless they must be used globally.

Functions

As mentioned above, functions are first-class values in Lua, meaning they can be stored in variables, passed to another function, or returned from another function.

A function can be created with the function keyword directly:

function echo(var)
    print(var)
end

It can also be created anonymously and then assigned to a variable:

echo = function (var)
    print(var)
end

Actually, the former can be seen as syntactic sugar for the latter. This is more apparent when creating functions for a table:

obj = {}

function obj.echo(var)
    print(var)
end

-- Above is the same as:
obj.echo = function (var)
    print(var)
end

-- We can call both with the same syntax:
obj.echo(100) -- 100
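
Since functions are ordinary values, they can also be passed as arguments. The following is a minimal sketch; applyTwice and addTen are hypothetical names used only for illustration:

```lua
-- Takes a function and a value, and applies the function to the value twice.
local function applyTwice(fn, value)
    return fn(fn(value))
end

local function addTen(x)
    return x + 10
end

-- Pass a named function.
print(applyTwice(addTen, 1)) -- 21

-- Pass an anonymous function created on the spot.
print(applyTwice(function (x) return x * 3 end, 2)) -- 18
```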

Closures

A very commonly encountered concept in Lua is the closure, which is a function that captures local variables from the function that created it, typically because it was returned by that function. An important feature of a closure is that it can remember and update those captured variables between calls.

Let’s see it in a simple example:

function createAdder(initVal)
    local value = initVal or 0  -- This is the way to set default value in Lua.

    return function (num)
        value = value + num
        print(value)
    end
end

adder = createAdder()
adder(1) -- 1
adder(2) -- 3

adder100 = createAdder(100)
adder100(1) -- 101
adder100(2) -- 103

Closures also demonstrate that functions in Lua are first-class values and can be returned from another function.
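
As a further sketch, each call to the outer function creates a fresh set of captured variables, so two closures never share state. createCounter is a hypothetical name for this illustration, and here the closure returns the value instead of printing it:

```lua
local function createCounter()
    local count = 0  -- Captured by the function returned below.

    return function ()
        count = count + 1
        return count
    end
end

local counterA = createCounter()
local counterB = createCounter()

print(counterA()) -- 1
print(counterA()) -- 2
print(counterB()) -- 1 (counterB has its own count)
```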

Arrays

As mentioned above, the only data structure in Lua is the table, and there is no dedicated array or list type. However, tables can be used as arrays natively in Lua. You just need to put the values in curly brackets, much like creating sets in Python:

arr = {"red", "green", "blue"}

And then we can access the values by index. However, note that unlike in most other programming languages, the index starts at 1 for Lua!

print(arr[1]) -- red
print(arr[3]) -- blue

Under the hood, arrays in Lua are still associative arrays, which are collections of key-value pairs where each key is linked to a specific value. The above array is the same as:

arr = {[1]="red", [2]="green", [3]="blue"}

Note that the indices must be put in square brackets if they are specified explicitly.

We can use the for loop to iterate over the items of an array. Let’s define a function that does this and prints the array in a nice way:

function printArr(arr)
    if not arr then
        print(arr)  -- nil
        return
    end

    local repr = '['

    for _, v in ipairs(arr) do
        repr = repr .. tostring(v) .. ', '
    end

    repr = string.gsub(repr, ",%s*$", "") .. ']'
    print(repr)
end

arr = {"red", "green", "blue"}
printArr(arr) -- [red, green, blue]

This simple example covers several commonly used features of Lua:

  • Note the syntax of the if condition and for loop in Lua. We need to use then … end or do … end explicitly to denote a code block in Lua.
  • ipairs() returns the index/value pairs of an array, stopping at the first nil value. Since we are not using the index here, it’s assigned to a dummy variable (_), which is the same convention as in Python.
  • .. is used to concatenate strings in Lua. Numbers are converted to strings automatically, but other non-string values such as booleans, nil, and tables must be converted with the tostring() function before concatenation.
  • The string.gsub() function searches for a pattern in a string and replaces it with a replacement string. Lua patterns look similar to regular expressions and work the same way in many simple cases, but they are a smaller language of their own. For example, % is used instead of \ for character classes and escaping.
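
The points about concatenation, string.gsub(), and ipairs() can be checked with a short sketch. The sample strings and variable names are made up for illustration:

```lua
-- Numbers are converted automatically by the .. operator.
print("count: " .. 3)            -- count: 3

-- Booleans are not; without tostring() this line would raise an error.
print("ok: " .. tostring(true))  -- ok: true

-- string.gsub() returns two values: the new string and the number of replacements.
local result, count = string.gsub("a, b, c", ",%s*", ";")
print(result)                    -- a;b;c
print(count)                     -- 2

-- Lua patterns use % (not \) for character classes: %d matches a digit.
print(string.match("price: 42 USD", "%d+"))  -- 42

-- ipairs() stops at the first nil, so "blue" is never visited here.
local sparse = {"red", "green", nil, "blue"}
local visited = 0
for _ in ipairs(sparse) do
    visited = visited + 1
end
print(visited)                   -- 2
```

In printArr above, only the first return value of string.gsub() is used because the call appears inside a concatenation expression.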

As a side note, we can get the length of an array with the length operator (#) and can thus loop through it using the numeric for loop:

arr = {"red", "green", "blue"}

-- Note that the range includes both ends.
for i = 1, #arr do
    print(arr[i])
end

-- red
-- green
-- blue

We can use table.insert() and table.remove() to insert or remove an item in an array:

table.insert(arr, "black") -- Insert at the end.
table.insert(arr, 2, "pink") -- Insert at a specific position.
printArr(arr)  -- [red, pink, green, blue, black]

table.remove(arr)  -- Remove the last item.
table.remove(arr, 3)  -- Remove the item at a specific position.
printArr(arr) -- [red, pink, blue]
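
Two related details of the standard table library are worth a short sketch: table.remove() returns the item it removed, and table.concat() joins the items of an array, which offers a compact alternative to the loop in printArr when all items are strings or numbers. The colors variable is just sample data:

```lua
local colors = {"red", "green", "blue"}

-- table.remove() returns the removed item, so a table can act as a stack.
local last = table.remove(colors)
print(last)     -- blue
print(#colors)  -- 2

-- table.concat() joins the items of an array with a separator.
-- It only accepts strings and numbers as items.
local joined = "[" .. table.concat(colors, ", ") .. "]"
print(joined)   -- [red, green]
```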

Associative arrays

An associative array in Lua is a collection of key-value pairs, where each key is linked to a specific value. It is similar to the dictionaries in Python. However, from a technical point of view, it’s more similar to the objects in vanilla JavaScript.

Firstly, if the keys are strings that are also valid identifier names in Lua, we can write them directly as keys, with no need for quotes or square brackets. This is also the most common use case:

myTable = {value=100}

Technically, it’s the same as:

myTable = {['value']=100}

When the keys are variables, numbers, reserved keywords like if and for, or strings not valid as identifier names in Lua, they must be put in square brackets:

myTable = {
    value=100,
    [1]='color',
    ['if']=true,
    ['1stName']='John',
    ['last name']='Doe'
}

When the key is a string that is also a valid identifier name, we can access its value either using a dot or a pair of square brackets:

print(myTable.value) -- 100
print(myTable["value"])  -- 100

print(myTable[value]) -- nil

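Note that the third one prints nil. This is because the value of the variable value, rather than the string "value", is used as the key, and that variable is undefined and therefore nil. Looking up a nil key simply returns nil, whereas storing a value under a nil key is not allowed and raises an error. However, if there is a variable called value in your code, its content will be used as the key and you may get unexpected results.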
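
The difference between the three forms becomes visible once a variable called value actually exists. This is a small sketch with made-up data:

```lua
local myTable = {value = 100, color = "red"}

local value = "color"    -- A variable that happens to be called value.

print(myTable.value)     -- 100 (the key is the string "value")
print(myTable["value"])  -- 100
print(myTable[value])    -- red (the key is the content of the variable, "color")

-- Reading with a nil key returns nil, but writing with one raises an error.
local missing = nil
print(myTable[missing])  -- nil

local ok = pcall(function ()
    myTable[missing] = 1
end)
print(ok)                -- false
```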

For other types of keys, you must always use square brackets to access the value:

print(myTable[1])  -- color
print(myTable['if'])  -- true
print(myTable['1stName']) -- John
print(myTable['last name']) -- Doe

We can loop through the key/value pairs of a table using the pairs() function. Note that the keys are not ordered, so the iteration order may differ from the order in which they were created:

for k, v in pairs(myTable) do
    print(k .. ' -> ' .. tostring(v))
end

-- value -> 100
-- last name -> Doe
-- 1 -> color
-- if -> true
-- 1stName -> John
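
If a stable order is needed, one common approach is to collect the keys into an array and sort them before iterating. The following is a sketch of that idea, not the only way to do it. Because the keys here are of mixed types (a number and strings), they are compared by their string form:

```lua
local myTable = {
    value = 100,
    [1] = 'color',
    ['if'] = true,
    ['1stName'] = 'John',
    ['last name'] = 'Doe'
}

-- Collect the keys into an array first.
local keys = {}
for k in pairs(myTable) do
    keys[#keys + 1] = k
end

-- A number and a string cannot be compared with <, so compare string forms.
table.sort(keys, function (a, b)
    return tostring(a) < tostring(b)
end)

for _, k in ipairs(keys) do
    print(tostring(k) .. ' -> ' .. tostring(myTable[k]))
end

-- 1 -> color
-- 1stName -> John
-- if -> true
-- last name -> Doe
-- value -> 100
```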

Classes and objects

Lua is not a native object-oriented programming (OOP) language and thus has no built-in concepts of classes or objects. Everything (classes or objects) is just a table in Lua. However, OOP can be realized easily with tables.

Firstly, a table can already be seen as an object, and we can add functions to it as we saw previously. The obj.echo function demonstrated above is like a static method that does not require an instance. We can also create classical instance methods that do require an instance. This can be done with the “magical” colon in Lua:

person = {firstName = "John", lastName = "Doe"}

function person:getFullName()
    return self.firstName .. ' ' .. self.lastName
end

print(person:getFullName())  -- John Doe

In this example, self refers to the object itself calling the function, similar to the self in Python.

The function declaration using a colon is just syntactic sugar for the following declaration (yes, there are many sugars 🍬 in Lua):

person = {firstName = "John", lastName = "Doe"}

function person.getFullName(self)
    return self.firstName .. ' ' .. self.lastName
end

print(person.getFullName(person))
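
A short sketch of what this sugar implies in practice: the colon only decides what gets passed as the first argument. The table other is a made-up example:

```lua
local person = {firstName = "John", lastName = "Doe"}

function person:getFullName()
    return self.firstName .. ' ' .. self.lastName
end

-- Both calls pass person as self.
print(person:getFullName())        -- John Doe
print(person.getFullName(person))  -- John Doe

-- Calling with a dot and no argument leaves self as nil, which raises an error.
local ok = pcall(person.getFullName)
print(ok)                          -- false

-- Because self is just the first argument, any table with the right fields works.
local other = {firstName = "Jane", lastName = "Roe"}
print(person.getFullName(other))   -- Jane Roe
```

Mixing up the dot and the colon is a common source of "attempt to index a nil value" errors, so it is worth checking the call syntax first when such an error appears.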

Understanding this syntactic sugar is essential for understanding classes, instantiation, and inheritance in Lua. Let’s demonstrate it with a simple example:

-- Create a class, which is just a table in Lua.
Animal = {}

-- Create a constructor function for the class:
function Animal:new()
    local newAnimal = {}

    -- Create a metatable, which can be associated with another table to customize its behavior.
    local metatable = {}
    metatable.__index = self
    setmetatable(newAnimal, metatable)
    return newAnimal
end

-- Create an instance method.
function Animal:breathe()
    print("I'm breathing...")
end

-- Create an instance of the Animal class.
animal = Animal:new()
-- Call an instance method.
animal:breathe() -- I'm breathing...

When using tables as classes in Lua, there are two very important concepts, namely, metatables and metamethods.

In Lua, a metatable is a special table (well, it’s just a regular table but used for special purposes) that can be associated with another table to customize its behavior or provide fallbacks for non-existent keys. Metamethods are the specially named entries of a metatable, usually functions, which can be used to overload operators or implement inheritance.

The most important metamethod is __index, which is consulted when a key is not found in the table itself. It can be a function that receives the table as the first parameter and the key being accessed as the second one. Therefore, a verbose version of the constructor can be written as:
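
As a minimal sketch of operator overloading, the metamethods __add and __tostring (both standard Lua metamethods) can be used as follows. The names vectorMeta and newVector are hypothetical and chosen only for this illustration:

```lua
-- A metatable shared by all vectors.
local vectorMeta = {}

local function newVector(x, y)
    return setmetatable({x = x, y = y}, vectorMeta)
end

-- __add is called when two vectors are combined with the + operator.
vectorMeta.__add = function (a, b)
    return newVector(a.x + b.x, a.y + b.y)
end

-- __tostring is used by tostring() and print().
vectorMeta.__tostring = function (v)
    return "(" .. v.x .. ", " .. v.y .. ")"
end

local sum = newVector(1, 2) + newVector(3, 4)
print(sum)  -- (4, 6)
```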

-- Create a constructor function for the class:
function Animal:new()
    local newAnimal = {}

    local metatable = {}
    metatable.__index = function (_, key)
        return self[key]
    end

    setmetatable(newAnimal, metatable)
    return newAnimal
end

The table passed in (here newAnimal) is not used and thus can be replaced with the dummy variable _.

For class instantiation and inheritance, the use of the __index metamethod is so common that Lua provides a shortcut. Even though __index is called a metamethod, its value can also be a table, as in the first version of the constructor, in which case the missing key is looked up in that table. This is a shortcut for the verbose version above.
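
The two forms of __index can be compared side by side in a small sketch outside of any class. The tables defaults, config, and config2 are made-up examples:

```lua
local defaults = {timeout = 30, retries = 3}

-- Function form: called only when the key is missing from config itself.
local config = setmetatable({timeout = 5}, {
    __index = function (_, key)
        return defaults[key]
    end
})

print(config.timeout)   -- 5 (found in config, __index is not consulted)
print(config.retries)   -- 3 (falls back to defaults)
print(config.unknown)   -- nil (not found anywhere)

-- Table form: the same fallback with less code.
local config2 = setmetatable({timeout = 5}, {__index = defaults})
print(config2.retries)  -- 3
```

The function form is still useful when the fallback value has to be computed rather than simply looked up.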

Actually, since the setmetatable() function returns the table back, we can simplify the constructor as follows, which is very commonly used in practice:

-- Create a constructor function for the class:
function Animal:new()
    local newAnimal = {}

    return setmetatable(newAnimal, {__index = self})
end

Here the metatable is created on the fly with a table constructor, and its __index entry is set to self.

With the knowledge above, class inheritance is easier to understand:

-- Well, in Lua, an instance of a class can be treated as another class, and
-- it's still just a table...
-- It inherits all the properties and methods of its parent.
Bird = Animal:new()

function Bird:fly()
    print("I can fly!")
end

bird = Bird:new()  -- new() is inherited from Animal.
bird:breathe()  -- Also inherited from Animal.
bird:fly()  -- New in the Bird class.

A property/method is looked up in the current instance, then the class, the parent class, the grandparent class, etc., and the first one that has the given property/method wins. If none of them has it, nil is returned.
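
The lookup chain can be made visible with the standard rawget() function, which reads a key from a table without consulting its metatable. The sketch below redefines Animal and Bird locally so that it is self-contained, and the methods return strings instead of printing them:

```lua
local Animal = {}

function Animal:new()
    return setmetatable({}, {__index = self})
end

function Animal:breathe()
    return "I'm breathing..."
end

local Bird = Animal:new()

function Bird:fly()
    return "I can fly!"
end

local bird = Bird:new()

-- rawget() shows where each method is actually stored.
print(rawget(bird, "fly"))               -- nil (not stored in the instance)
print(rawget(Bird, "fly") ~= nil)        -- true (stored in Bird)
print(rawget(Bird, "breathe"))           -- nil (not stored in Bird)
print(rawget(Animal, "breathe") ~= nil)  -- true (stored in Animal)

-- Normal access walks the chain: bird, then Bird, then Animal.
print(bird:fly())      -- I can fly!
print(bird:breathe())  -- I'm breathing...
print(bird.swim)       -- nil (not found anywhere in the chain)
```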

Splash scripting example

Finally, let’s check a simple example of Splash from the official document (https://splash.readthedocs.io/en/stable/scripting-tutorial.html), which should be fairly simple to understand now:

function main(splash, args)
  splash:go("http://example.com")
  splash:wait(0.5)
  local title = splash:evaljs("document.title")
  return {title=title}
end

As you can see, the colon is used very heavily in Splash. It can be mysterious if you don’t know Lua. However, with the knowledge from this post, you should be comfortable working with it now.

Further posts will cover in more detail how to use Lua scripting in Splash for scraping JavaScript web pages.
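
To see the colon calls at work without a running Splash instance, the script can be exercised against a stand-in object. The fakeSplash table below is purely hypothetical: it only mimics the three method names used in the script, its return values are hard-coded placeholders, and it says nothing about how the real Splash object is implemented:

```lua
-- A stand-in for the real splash object, for illustration only.
local fakeSplash = {visited = {}}

function fakeSplash:go(url)
    -- Just record the URL instead of loading a page.
    table.insert(self.visited, url)
end

function fakeSplash:wait(seconds)
    -- Does nothing in this mock.
end

function fakeSplash:evaljs(code)
    -- Hard-coded placeholder instead of evaluating JavaScript.
    return "Example Domain"
end

-- The script from the example above, unchanged apart from being local.
local function main(splash, args)
    splash:go("http://example.com")
    splash:wait(0.5)
    local title = splash:evaljs("document.title")
    return {title = title}
end

local result = main(fakeSplash, {})
print(result.title)           -- Example Domain
print(fakeSplash.visited[1])  -- http://example.com
```

Each splash:method(...) call is simply splash.method(splash, ...), which is why the mock methods can reach self.visited.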

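Related articles

  • How to build a scraping project with Scrapy and MongoDB (https://levelup.gitconnected.com/how-to-build-a-scraping-project-with-scrapy-and-mongodb-46e78b6549e3)
  • How to scrape JavaScript webpages using Selenium in Python (https://lynn-kwong.medium.com/how-to-scrape-javascript-webpages-using-selenium-in-python-21d56731bb1f)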