What are Linear Data Structures?

We say a data structure is 'linear' if the items inside it are stored in a sequence.

Arrays, linked lists, and stacks are all linear data structures.

Why should I care?

We use these data structures every day in programming.

Even if you're already familiar with them, it's helpful to recap them occasionally.

In 5 minutes or less:

As we said in the introduction, a data structure is 'linear' if the elements form a sequence.

That means that the data structure has a first and last element, and each element is connected to its previous and next element.

An 'array' is a linear data structure; the items are stores sequentially.
A 'graph' is not a linear data structure; any node can be linked to any other node in the graph - there is no fixed 'sequence'.

(if you're not familiar with graphs, don't worry - there's a newsletter coming up that explores them in detail).

linear data structures represented as a series of boxes. Non-linear structures look like a 'graph'

Let's take a look at some common linear data structures...

Arrays

If you've done any programming, you're almost certainly familiar with the concept of arrays.

An array is like a bookcase; the items are stored next to each other, but we can jump to any item we like to read it.

The items in the array have an 'index' that allows us to reference them directly.

image on an array with some elements in

The ability to jump to any item we like to read its value is called 'random access', and is a huge advantage of an array.

We take this for granted, but this is not a property that many other 'linear data structures' have, as you'll see below.

When we allocate an array, we have to determine up-front how much space we need.

If we fill our array, we have to stop and allocate some more space. That means that while normal inserts into the array are very fast, occasionally we have to pause for a short time to make the array bigger - which takes some time.

Linked Lists

A linked list is a data structure where each item points to the next. We can't jump to any element directly like we can with an array. Instead, we have to access them in turn:

an image of a linked list

Linked lists are useful because, unlike an array, we don't need to decide up-front how much space we need. If we need to add a new item, we simply add it to the end.

That means that adding the 200th item has the same cost as adding the 2nd item. This predictable performance is an advantage of a linked list.

It's also easy to add or remove an item from the middle of the linked list, by simply changing some 'next' pointers.

Here's how we'd remove an item from a linked list:

removing an item from a linked list

This would be more difficult to do in an array, as we'd have to shift all of the remaining items to account for the new or removed item.

A variant of the linked list is the 'doubly-linked list', where each element not only points to the next element, but also the previous one.

This means we can traverse the data structure in either order, but still have the advantages of a linked list.

an image of a doubly-linked list

Queues

The queue is a 'First In First Out' (FIFO) data structure. That means that items are read in the same order that they are inserted.

This is just like the queue at the store, the first person to join the queue is the first person to be served.

an image of a queue

A printer queue is a good example of this data structure in use. The printer will print items in the order in which they were queued. If you send your document to the printer last, it will be the last thing to be printed.

Queues are also useful as 'buffers'.

Suppose we have two separate systems, one that reads messages and one that processes them. We don't want the system that reads messages to have to wait for each message to be processed before listening for another.

Putting a queue between them allows us to 'de-couple' those systems. The read process can keep adding items to the queue, safe in the knowledge the the processing side will pull the items off of the queue and process them eventually - no waiting required.

Stacks

The stack is a 'Last In First Out' data structure; the last item to be added is the first item to be read.

You can think of this like a stack of plates, the last plate to be added to the pile is the first plate we'd take off again:

an image of a stack

You might use a stack to implement 'undo' functionality, for example. The last task the user performed is the first one to be reversed when they click the 'undo' button.

The 'call stack' we see every day when running code is also a great example of a stack, but that's a topic for a future newsletter!

Want to know more?

Check out these links: