Data structures example: linked lists

What’s a linked list ?

A linked list is very similar to an array in a way that both are structures capable of storing elements sequentially. However, taking a closer look reveals that they work totally different internally. Like all arrays the build-in array class in ActionScript stores multiple elements in a single array instance accessed by indexing, while a linked list uses a separate object instance for each data element. This object is called a node, and besides storing the data it also needs to hold a reference to the next node in the list. The result is the most simple form of a linked list, the singly linked list:

Figure 1: A singly linked list

If a node also stores a reference to the previous node it can move in both directions, forth and back; this is obviously called a doubly linked list, which is shown below.

Figure 1: A doubly linked list

The SLinkedList and DLinkedList classes in the data structure package act as containers for storing the nodes and provide methods for manipulating (appending/deleting etc.) the list, taking care of everything. But a linked list can also be solely defined by a bunch of node objects, without using a manager class. For example, in a singly linked list, a node object could be defined like this:

class Node
{
    var data:*;
    var next:Node;
}

Then a basic functional list could be created like this:

var a:Node = new Node();
var b:Node = new Node();
var c:Node = new Node();

a.next = b;
b.next = c;
c.next = null;

Pretty simple, right? Note that you don’t have to store node ‘b’ and ‘c’ explicitly – as long as node ‘a’ is stored properly, the garbage collector has no way of removing them, so they stay alive.

A doubly linked list is more flexible and some operations like unlinking nodes from the list are faster. So why didn’t I omit the singly linked version? Because if you know in advance that you’ll only move in one direction and seldom modify the list, the singly linked version is a good choice because it needs less memory and you get a slight performance gain, as the node management is a little bit simpler behind the scenes.

Arrays vs. Linked Lists

A weakness from which an array suffers is that although it gives direct access to any array cell, it’s clumsy when it comes to modifying the array, like removing or inserting elements after it has been initialized. This is demonstrated below, where you can click a cell and drag it to remove it from the array, which is essentially a splice(nodeIndex, 1) operation.

Figure 3: An array. Drag cells to the screen edge to remove them from the array.

Now the resulting ‘gap’ is filled, and all elements right of the gap are shifted one position to the left. This takes time and gets worse the bigger the array gets. On very large arrays this operation can lead to a hefty slow down (see also the queue post) and should be avoided for real time applications. The worst-case happens when you remove the first element (shift()), while only the last element can be removed instantly (pop()). The same of course applies for inserting elements. So in the end an array is great for a stack, but bad for a queue if implemented in a naive way.

Now this is where a linked list shines, because data can be removed or inserted almost instantly, making it perfect for situations where you need a structure that constantly changes.

Figure 4: A linked list, drag nodes to the screen edge to remove them, or press I to insert a node.

As with the array demo, you can drag a node to the screen edge to unlink it from the list. In addition, you can also insert a node by pressing the ‘I’ key. Watch what happens: no large chunks of data have to be moved around in memory, just the references, visualized by the arrows, need to be updated, which is fast and simple. If you hold a node near an edge of the screen, you see how the links would change. Grey arrows represent obsolete links that will be removed together with the dragged node. The transparent blue and purple links are rerouted to ‘bridge’ the current node.
Looking at the code, each node has a small helper function to unlink it from the list that looks like this:

function unlink():void
{
    //new purple link
    if (prev != null) prev.next = next;
    
    //new blue link
    if (next != null) next.prev = prev;
    
    //obsolete grey links
    next = prev = null;
}

If the first (head) or last (tail) node is removed, this gets even simpler:

if (node == head)
{
    head = head.next;
}
else if (node == tail)
{
    tail = tail.prev;
}

If you want to remove a node from the list using the SLinkedList or DLinkedList class, you first create an iterator object pointing to the node, and then remove it later on at any time calling the remove() method on the iterator itself or the linked list:

var itr:DListIterator = new DListIterator(myList, myNode);

itr.remove();

//or
myList.remove(itr);

This is even possible if the list was modified before. With an array, you have to keep an eye on the array index of the stored item whenever you modify the array.

Another advantage of linked lists is that they are more human-readable. Modifying indices by hand is error-prone and I think the most common mistakes come from the zero-based nature of arrays (the last element is n-1 and not n). To show you the difference, the common brute force method for checking a number of objects against each other (e.g. for collision queries), implemented with arrays and linked lists are shown below.

//arrayed version
var k:int = myArr.length;
for (var i:int = 0; i < k - 1; i++)
{
    oi = myArr[i];
    for (var j:int = i + 1; j < k; j++)
    {
        oj = myArr[j];
        collide(oi, oj);
    }
}
//linked version
var node:SListNode = head;
while (node)
{
    //check node against all following nodes
    var walker:SListNode = node.next;
    while (walker)
    {
        collision(node, walker);
        walker = walker.next;
    }
    //repeat with next node
    node = node.next;
}

Iterating over a linked list

Here are the basic ways to iterate over a list, starting with the ‘cleanest’ way using the iterator pattern. It decouples the iteration from the implementation resulting in a common interface all data structures in the package support.

//get iterator instance
var itr:DListIterator = new DListIterator(list);

//or simply
itr = list.getListIterator();
    
//start iteration
for (itr.start(); itr.valid(); itr.forth())
{
    trace(itr.node, itr.data);
}

This can also be written using a while loop:

//restart
itr.start();

while (itr.valid())
{
    tr(itr.node, itr.data);
    itr.forth();
}

Both iteration styles are fine in most cases, but when performance is critical you should definitely avoid using the iterator interface. Although it offers a unified way of accessing data, the iterator pattern needs many function calls. Therefore we modify the second iteration method and change it slightly to arrive at a very lightweight iteration method:

//take the first node
var walker:DListNode = list.head;
while (walker)
{
    trace(walker, walker.data);
    walker = walker.next;
}

The code above just uses the node’s pointer to the next node to successively jump from node to node until a dead end is reached, and that happens when the tail node is referenced, which obviously has no valid next node.

Performance & optimization

Comparing the latter while loop with a standard for loop yields that it’s slightly slower, but there isn’t really a big difference between the loop bodies. Keep in mind that it’s more important what you are actually doing inside the loop body than purely looking at the loop code itself, so the numbers won’t have a measurable impact on performance. Anyway here are some numbers:

size for (i = 0; i < k; i++) while(i) {i = i.next}
1e+7 30 ms 164 ms
1e+6 5 ms 17 ms
1e+5 - 1 ms

But let’s move from empty loop bodies and try to access the data, which is much more interesting:

var val:*
//arrayed version, [] access
for (var i:int = 0; i < 1e+7; i++)
{
    val = myArray[i];
}
//linked version, '.' access
while (walker)
{
    val = walker.data;
    walker = walker.next;
}

Quick note: leaving val untyped is surprisingly a lot faster than strongly typed.

size array access [] object access .
1e+7 131 ms 131 ms
1e+6 13 ms 13 ms
1e+5 1 ms 1 ms

So, no difference here, meaning that linked lists can be a perfect replacement for arrays in terms of data access speed, at the same time offering superior performance when modifying big data sets.

Instead of using a list manager like the DLinkedList class, you can implement a custom ‘low-level’ linked list if you want to get the most out of the AVM. All you need is a basic node class, capable of storing data and and a reference to other nodes. But be aware that if you start playing around with references, you should assure that every reference is removed properly if unused, otherwise you quickly run into hairy problems resulting in strange behavior and memory waste/leak…

That should be all you need to know about linked lists.
As usual you can download the examples for this article here.

See also:
The graph class
The queue class
The tree class

34 Comments

  1. You leave me no choice with your excellent demonstrations and straight facts to use linked lists more often in Flash. I have been using Arrays many times to keep it simple for the others working on the same projects but I agree on the speed, the issues associated with arrays and the slowness of large arrays (had to use them in a large map project for this reason as items flew in and out of an array it crawled).

    If the ones I am working with need help I will point them here and say go, learn how to be an AS3 Jedi at polygonal labs.

  2. Excellent article Michael, I was wondering what do you need to do if you want a partially linked list, i.e.
    1 3 > 4 6 > 7 > 8 9
    What class would you use? SLinkedList and DLinkedList classes are too specific ;)
    Cheers!

  3. It ate my left braces <, so let’s try once again:

    1 < 3 <> 4 > 5 < 6 <> 7 <> 8 < 9

  4. I never thought about that but if you cut the list into chunks you have to store each chunk separately, otherwise it will be lost. Maybe you can just store the first node of each chunk and put it into a doubly linked list. Then for iteration you add a second layer: first iterate over the ‘container’ list, then for each container node go inside it and iterate over the partially list. arrays would be also fine. But I can image that the whole thing will be quite complicated to manage.

  5. Another great article, keep’em coming!, thanks Micheal

    Og2t, have you seen Michael’s graph class?
    http://lab.polygonal.de/2007/06/13/data-structures-example-the-graph-class/

  6. Thanks Bob, that was exactly what I was looking for :)

  7. Great stuff. To the point, and very helpful. Glad you’re putting this stuff out for us.
    I ran into a very small bug but thought I’d point it out cause it confused me for a short moment: In SLinkedList.as, the remove function will ‘_count–’ twice when removing the head node through the ‘removeHead()’ call.
    Just trying to help,
    Thanks again

  8. @bryce thanks, yes it’s a bug :-) fixed in SVN.

  9. Hi Michael,

    I can not really confirm your results. Earlier I did some tests using simple linked lists vs. arrays and it was definitly faster.

    http://blog.je2050.de/2007/06/06/lists-faster-than-array/

    But that article is already old. Usually I build my lists with N+1 elements now and go through

    a = list.next;
    while (a) { … a = a.next }

    It makes sense to me, since array access is always slower than direct object access. For instance if you have sinTable[i] and cosTable[i] and you always access both it makes more sense to do table[i] and have an element in it so that you do element.sin and element.cos. You have one array access + 2 object values but its still a lot faster than accessing two arrays.

  10. synthetic benchmarks are always problematic because performance depends on the overall application design with affects memory management or garbage collector efficiency.

    so i think someone cannot say ‘one object access equals two array access operations’. for example take a node object in a linked list and add more data to it (more members, large strings, or nested objects / arrays) and the linked list iterations becomes slower, where the array access behaves different. the main performance gain comes from the fact that a linked list can be modified in constant time. also the loop head itself is meaningless – the bottleneck is always the loop body.

  11. I am not saying that you can count the access like that. But basically your tests do not give an idea of what you can achieve with the linked lists. For instance the fact that you have to cast the object you get from an array while you just use an untyped value. With the list you do not need to do that.

    In fact I used (custom) lists in various situations now — always including strong typing — and they ware a lot faster than using the plain array (if you just iterate over every single element). Although I don’t know what would happen if each list node is a ‘big class’ as you say.

  12. I can’t seem to move nodes about properly. I created an iterator

    var itr:DListIterator = layerContents[i].getListIterator();

    The iterator points to the start of my Linked List. I want to move the first node to the end (the head to the tail). I have tried this in the code below using the insertAfter method

    itr.node.insertAfter(itr.list.tail);

    I can’t see any problems with it, and it returns no error messages, but it just doesn’t work, the list remains unchanged, I must be doing something wrong.

    Help.

  13. nice,
    i use your classes in wow-engine
    it’s simply amazing

  14. how to sort n split a linked list??????m a beginer……….

  15. directly sorting linked lists is currently not supported with my lib, but I’m working on it. for now, you have to convert it to an array, use the array.sort() function, then write it back.

  16. hi Michael, greeting for your work!

    my question is : i’m used to make Dictionary instead Array, for easying to find items…

    Dictionary v.s. Linked List?

    sorry for my bad english

  17. Mark A.

    thanks alot michael for building that set of data structures for the community.

    But one issue against linked lists and iterators and a plus for using for loops with arrays is the additional variable i you get. in a lot of cases you do not only want to get the data out of a structure but you also want to arrange something like data.x = i * gridWidth or something similar.

    While using iterators you have to come up with an additional vairable declared outside the for scope to get a counter…

  18. a linked list differs conceptually from an array – it’s not a replacement for an array.

    in your example using a linked list would just complicate things because an element in a linked list has no index associated with it.

  19. Thanks alot.
    I have used polygonal.de library. It is very usefull for me to create Graph, Tree.

  20. winxalex

    I like the Linked list from the C++ but this one like more like Java or C#. Actually don’t see the reason of iterator SList or DList. First I think only one is enough. Actually I could use Node with null “data” and use node “next” as pointer to head, to tile to or any element also in iteration could be used… Just idea from C++ experiences…
    Thx

  21. winxalex

    Actually I’m saving the pointer to all nodes to hashTable (Dictionary) and when I want to remove the node from the list I need to create iterator to point to list to point to pointer in the Dictionary. Which it will be easly to send emptyNode as pointer to remove(emptyNode) which itr.node do.

  22. I understand – removal of nodes is a bit painful. I’ve refined linked lists in the hx3ds library, which is a port of as3ds to HaXe. You can get it from here: http://lib.haxe.org/p/hx3ds. Removal then works like this:

    var obj = new Obj();
    var node:DLLNode = linkedList.append(obj);

    node.remove(); //unlink from ll

    but you still have to keep track of the nodes somewhere… I usually store them inside the data that was added.

  23. winxalex

    I hope that base version is not obsolite and ds would got refinements too.
    Thx

    1. as3ds is not updated anymore, but the hx3ds package contains a swc file for targeting AS3 – auto-completion is a bit fishy but this will change in the future. easiest way to get it: run the haxe installer, then in the command prompt type: haxelib install hx3ds This will download everything to haxedir/lib/hx3ds

  24. winxalex

    var node:Node=new Node(null);
    for(node=list.head;node.next!=null;node=node.next)
    {
    }

  25. It’s really a nice example of linked list…

  26. Is the DLLNode declared as final?
    For performance reasons.

    1. it doesn’t matter.

  27. var myDll:DLL=new DLL();
    myDll.append(100);
    myDll.append(101);
    myDll.append(102);
    myDll.append(103);

    myDll.close();
    var node:DLLNode=myDll.lastNodeOf(“108″,myDll.getNodeAt(2));//trace info is {DLLNode 101} ???????????

    1. thanks for reporting. this is bug and has been fixed in SVN.

  28. Hi Michael – I wanted to construct a Doubly linked list of just 6 items which could be dragged and reordered according to the user. The library seems to enable removal and insertion of new nodes but I can’t see a method to reorder existing nodes. Would I just use the insertBefore of insertAfter for that?

Trackbacks/Pingbacks

  1. [...] been using the great opensource package from Michael Baczynksy at polygonal.de for the use of lists, queues, stacks and trees . While it works great, it has some explicit design decisions that we do [...]

  2. [...] The Array is apparently the fastest possibility of storing elements sequentially. However, adding to (or removing from) an array requires the array to internally shift the position of all succeeding items by the number of elements added. This shifting is the more expensive the more the insertion occurred in front of the array and the longer the array is. Michael Baczynski (polygonal.de) has visulalised this behaviour in its linked list example. [...]

Get Adobe Flash player