Data structures example: linked lists
August 13, 2007 on 1:11 am | In Actionscript, data structures |What’s a linked list ?
A linked list is very similar to an array in a way that both are structures capable of storing elements sequentially. However, taking a closer look reveals that they work totally different internally. Like all arrays the build-in array class in ActionScript stores multiple elements in a single array instance accessed by indexing, while a linked list uses a separate object instance for each data element. This object is called a node, and besides storing the data it also needs to hold a reference to the next node in the list. The result is the most simple form of a linked list, the singly linked list:
If a node also stores a reference to the previous node it can move in both directions, forth and back; this is obviously called a doubly linked list, which is shown below.
The SLinkedList and DLinkedList classes in the data structure package act as containers for storing the nodes and provide methods for manipulating (appending/deleting etc.) the list, taking care of everything. But a linked list can also be solely defined by a bunch of node objects, without using a manager class. For example, in a singly linked list, a node object could be defined like this:
class Node
{
var data:*;
var next:Node;
}
Then a basic functional list could be created like this:
var a:Node = new Node(); var b:Node = new Node(); var c:Node = new Node(); a.next = b; b.next = c; c.next = null;
Pretty simple, right? Note that you don’t have to store node ‘b’ and ‘c’ explicitly - as long as node ‘a’ is stored properly, the garbage collector has no way of removing them, so they stay alive.
A doubly linked list is more flexible and some operations like unlinking nodes from the list are faster. So why didn’t I omit the singly linked version? Because if you know in advance that you’ll only move in one direction and seldom modify the list, the singly linked version is a good choice because it needs less memory and you get a slight performance gain, as the node management is a little bit simpler behind the scenes.
Arrays vs. Linked Lists
A weakness from which an array suffers is that although it gives direct access to any array cell, it’s clumsy when it comes to modifying the array, like removing or inserting elements after it has been initialized. This is demonstrated below, where you can click a cell and drag it to remove it from the array, which is essentially a splice(nodeIndex, 1) operation.
Now the resulting ‘gap’ is filled, and all elements right of the gap are shifted one position to the left. This takes time and gets worse the bigger the array gets. On very large arrays this operation can lead to a hefty slow down (see also the queue post) and should be avoided for real time applications. The worst-case happens when you remove the first element (shift()), while only the last element can be removed instantly (pop()). The same of course applies for inserting elements. So in the end an array is great for a stack, but bad for a queue if implemented in a naive way.
Now this is where a linked list shines, because data can be removed or inserted almost instantly, making it perfect for situations where you need a structure that constantly changes.
As with the array demo, you can drag a node to the screen edge to unlink it from the list. In addition, you can also insert a node by pressing the ‘I’ key. Watch what happens: no large chunks of data have to be moved around in memory, just the references, visualized by the arrows, need to be updated, which is fast and simple. If you hold a node near an edge of the screen, you see how the links would change. Grey arrows represent obsolete links that will be removed together with the dragged node. The transparent blue and purple links are rerouted to ‘bridge’ the current node.
Looking at the code, each node has a small helper function to unlink it from the list that looks like this:
function unlink():void
{
//new purple link
if (prev != null) prev.next = next;
//new blue link
if (next != null) next.prev = prev;
//obsolete grey links
next = prev = null;
}
If the first (head) or last (tail) node is removed, this gets even simpler:
if (node == head)
{
head = head.next;
}
else if (node == tail)
{
tail = tail.prev;
}
If you want to remove a node from the list using the SLinkedList or DLinkedList class, you first create an iterator object pointing to the node, and then remove it later on at any time calling the remove() method on the iterator itself or the linked list:
var itr:DListIterator = new DListIterator(myList, myNode); itr.remove(); //or myList.remove(itr);
This is even possible if the list was modified before. With an array, you have to keep an eye on the array index of the stored item whenever you modify the array.
Another advantage of linked lists is that they are more human-readable. Modifying indices by hand is error-prone and I think the most common mistakes come from the zero-based nature of arrays (the last element is n-1 and not n). To show you the difference, the common brute force method for checking a number of objects against each other (e.g. for collision queries), implemented with arrays and linked lists are shown below.
//arrayed version
var k:int = myArr.length;
for (var i:int = 0; i < k - 1; i++)
{
oi = myArr[i];
for (var j:int = i + 1; j < k; j++)
{
oj = myArr[j];
collide(oi, oj);
}
}
//linked version
var node:SListNode = head;
while (node)
{
//check node against all following nodes
var walker:SListNode = node.next;
while (walker)
{
collision(node, walker);
walker = walker.next;
}
//repeat with next node
node = node.next;
}
Iterating over a linked list
Here are the basic ways to iterate over a list, starting with the ‘cleanest’ way using the iterator pattern. It decouples the iteration from the implementation resulting in a common interface all data structures in the package support.
//get iterator instance
var itr:DListIterator = new DListIterator(list);
//or simply
itr = list.getListIterator();
//start iteration
for (itr.start(); itr.valid(); itr.forth())
{
trace(itr.node, itr.data);
}
This can also be written using a while loop:
//restart
itr.start();
while (itr.valid())
{
tr(itr.node, itr.data);
itr.forth();
}
Both iteration styles are fine in most cases, but when performance is critical you should definitely avoid using the iterator interface. Although it offers a unified way of accessing data, the iterator pattern needs many function calls. Therefore we modify the second iteration method and change it slightly to arrive at a very lightweight iteration method:
//take the first node
var walker:DListNode = list.head;
while (walker)
{
trace(walker, walker.data);
walker = walker.next;
}
The code above just uses the node’s pointer to the next node to successively jump from node to node until a dead end is reached, and that happens when the tail node is referenced, which obviously has no valid next node.
Performance & optimization
Comparing the latter while loop with a standard for loop yields that it’s slightly slower, but there isn’t really a big difference between the loop bodies. Keep in mind that it’s more important what you are actually doing inside the loop body than purely looking at the loop code itself, so the numbers won’t have a measurable impact on performance. Anyway here are some numbers:
| size | for (i = 0; i < k; i++) | while(i) {i = i.next} |
| 1e+7 | 30 ms | 164 ms |
| 1e+6 | 5 ms | 17 ms |
| 1e+5 | - | 1 ms |
But let’s move from empty loop bodies and try to access the data, which is much more interesting:
var val:*
//arrayed version, [] access
for (var i:int = 0; i < 1e+7; i++)
{
val = myArray[i];
}
//linked version, '.' access
while (walker)
{
val = walker.data;
walker = walker.next;
}
Quick note: leaving val untyped is surprisingly a lot faster than strongly typed.
| size | array access [] | object access . |
| 1e+7 | 131 ms | 131 ms |
| 1e+6 | 13 ms | 13 ms |
| 1e+5 | 1 ms | 1 ms |
So, no difference here, meaning that linked lists can be a perfect replacement for arrays in terms of data access speed, at the same time offering superior performance when modifying big data sets.
Instead of using a list manager like the DLinkedList class, you can implement a custom ‘low-level’ linked list if you want to get the most out of the AVM. All you need is a basic node class, capable of storing data and and a reference to other nodes. But be aware that if you start playing around with references, you should assure that every reference is removed properly if unused, otherwise you quickly run into hairy problems resulting in strange behavior and memory waste/leak…
That should be all you need to know about linked lists. As usual you can download the examples for this article here.
See also:
The graph class
The queue class
The tree class
You leave me no choice with your excellent demonstrations and straight facts to use linked lists more often in Flash. I have been using Arrays many times to keep it simple for the others working on the same projects but I agree on the speed, the issues associated with arrays and the slowness of large arrays (had to use them in a large map project for this reason as items flew in and out of an array it crawled).
If the ones I am working with need help I will point them here and say go, learn how to be an AS3 Jedi at polygonal labs.
Comment by Ryan [draw.logic] — August, 13 2007 #
Excellent article Michael, I was wondering what do you need to do if you want a partially linked list, i.e.
1 3 > 4 6 > 7 > 8 9
What class would you use? SLinkedList and DLinkedList classes are too specific ;)
Cheers!
Comment by Og2t — August, 15 2007 #
It ate my left braces <, so let’s try once again:
1 < 3 <> 4 > 5 < 6 <> 7 <> 8 < 9
Comment by Og2t — August, 15 2007 #
I never thought about that but if you cut the list into chunks you have to store each chunk separately, otherwise it will be lost. Maybe you can just store the first node of each chunk and put it into a doubly linked list. Then for iteration you add a second layer: first iterate over the ‘container’ list, then for each container node go inside it and iterate over the partially list. arrays would be also fine. But I can image that the whole thing will be quite complicated to manage.
Comment by Michael — August, 15 2007 #
Another great article, keep’em coming!, thanks Micheal
Og2t, have you seen Michael’s graph class?
http://lab.polygonal.de/2007/06/13/data-structures-example-the-graph-class/
Comment by Bob — August, 15 2007 #
Thanks Bob, that was exactly what I was looking for :)
Comment by Og2t — August, 17 2007 #
Great stuff. To the point, and very helpful. Glad you’re putting this stuff out for us.
I ran into a very small bug but thought I’d point it out cause it confused me for a short moment: In SLinkedList.as, the remove function will ‘_count–’ twice when removing the head node through the ‘removeHead()’ call.
Just trying to help,
Thanks again
Comment by Bryce — August, 17 2007 #
@bryce thanks, yes it’s a bug :-) fixed in SVN.
Comment by Michael — August, 17 2007 #
Hi Michael,
I can not really confirm your results. Earlier I did some tests using simple linked lists vs. arrays and it was definitly faster.
http://blog.je2050.de/2007/06/06/lists-faster-than-array/
But that article is already old. Usually I build my lists with N+1 elements now and go through
a = list.next;
while (a) { … a = a.next }
It makes sense to me, since array access is always slower than direct object access. For instance if you have sinTable[i] and cosTable[i] and you always access both it makes more sense to do table[i] and have an element in it so that you do element.sin and element.cos. You have one array access + 2 object values but its still a lot faster than accessing two arrays.
Comment by Joa Ebert — August, 28 2007 #
synthetic benchmarks are always problematic because performance depends on the overall application design with affects memory management or garbage collector efficiency.
so i think someone cannot say ‘one object access equals two array access operations’. for example take a node object in a linked list and add more data to it (more members, large strings, or nested objects / arrays) and the linked list iterations becomes slower, where the array access behaves different. the main performance gain comes from the fact that a linked list can be modified in constant time. also the loop head itself is meaningless - the bottleneck is always the loop body.
Comment by Michael — August, 28 2007 #
I am not saying that you can count the access like that. But basically your tests do not give an idea of what you can achieve with the linked lists. For instance the fact that you have to cast the object you get from an array while you just use an untyped value. With the list you do not need to do that.
In fact I used (custom) lists in various situations now — always including strong typing — and they ware a lot faster than using the plain array (if you just iterate over every single element). Although I don’t know what would happen if each list node is a ‘big class’ as you say.
Comment by Joa Ebert — August, 28 2007 #
I can’t seem to move nodes about properly. I created an iterator
var itr:DListIterator = layerContents[i].getListIterator();
The iterator points to the start of my Linked List. I want to move the first node to the end (the head to the tail). I have tried this in the code below using the insertAfter method
itr.node.insertAfter(itr.list.tail);
I can’t see any problems with it, and it returns no error messages, but it just doesn’t work, the list remains unchanged, I must be doing something wrong.
Help.
Comment by I'm having problems — September, 12 2007 #
see here: http://www.gotoandplay.it/_forums/viewtopic.php?t=3309
Comment by Michael — September, 12 2007 #
nice,
i use your classes in wow-engine
it’s simply amazing
Comment by Seraf — October, 7 2007 #
how to sort n split a linked list??????m a beginer……….
Comment by ah — November, 11 2007 #
directly sorting linked lists is currently not supported with my lib, but I’m working on it. for now, you have to convert it to an array, use the array.sort() function, then write it back.
Comment by Michael — November, 11 2007 #
hi Michael, greeting for your work!
my question is : i’m used to make Dictionary instead Array, for easying to find items…
Dictionary v.s. Linked List?
sorry for my bad english
Comment by nk — January, 4 2008 #
[...] been using the great opensource package from Michael Baczynksy at polygonal.de for the use of lists, queues, stacks and trees . While it works great, it has some explicit design decisions that we do [...]
Pingback by The collections package at dpdk Open Source — October, 9 2008 #
thanks alot michael for building that set of data structures for the community.
But one issue against linked lists and iterators and a plus for using for loops with arrays is the additional variable i you get. in a lot of cases you do not only want to get the data out of a structure but you also want to arrange something like data.x = i * gridWidth or something similar.
While using iterators you have to come up with an additional vairable declared outside the for scope to get a counter…
Comment by Mark A. — November, 8 2008 #
a linked list differs conceptually from an array - it’s not a replacement for an array.
in your example using a linked list would just complicate things because an element in a linked list has no index associated with it.
Comment by Michael — November, 8 2008 #
Thanks alot.
I have used polygonal.de library. It is very usefull for me to create Graph, Tree.
Comment by Satish — January, 14 2009 #
I like the Linked list from the C++ but this one like more like Java or C#. Actually don’t see the reason of iterator SList or DList. First I think only one is enough. Actually I could use Node with null “data” and use node “next” as pointer to head, to tile to or any element also in iteration could be used… Just idea from C++ experiences…
Thx
Comment by winxalex — October, 13 2009 #
Actually I’m saving the pointer to all nodes to hashTable (Dictionary) and when I want to remove the node from the list I need to create iterator to point to list to point to pointer in the Dictionary. Which it will be easly to send emptyNode as pointer to remove(emptyNode) which itr.node do.
Comment by winxalex — October, 13 2009 #
I understand - removal of nodes is a bit painful. I’ve refined linked lists in the hx3ds library, which is a port of as3ds to HaXe. You can get it from here: http://lib.haxe.org/p/hx3ds. Removal then works like this:
var obj = new Obj(); = linkedList.append(obj);
var node:DLLNode
…
node.remove(); //unlink from ll
but you still have to keep track of the nodes somewhere… I usually store them inside the data that was added.
Comment by Michael — October, 13 2009 #
I hope that base version is not obsolite and ds would got refinements too.
Thx
Comment by winxalex — October, 13 2009 #
var node:Node=new Node(null);
for(node=list.head;node.next!=null;node=node.next)
{
}
Comment by winxalex — October, 13 2009 #
as3ds is not updated anymore, but the hx3ds package contains a swc file for targeting AS3 - auto-completion is a bit fishy but this will change in the future. easiest way to get it: run the haxe installer, then in the command prompt type: haxelib install hx3ds This will download everything to haxedir/lib/hx3ds
Comment by Michael — October, 13 2009 #
It’s really a nice example of linked list…
Comment by Ajeet — January, 23 2010 #