What is a linked list, and how is it different from an array? How efficient or inefficient are its operations? What are its strengths and weaknesses? How can I construct and interact with a linked list? By the end of this objective, you will be able to answer all of these questions confidently.
A linked list is a simple, linear data structure used to store a collection of elements. Unlike an array, each element in a linked list does not have to be stored contiguously in memory.
For example, in an array, each element of the list [43, 32, 63] is stored in memory like so:

43 is the first item in the collection and is therefore stored in the first slot. 32 is the second item and is stored immediately next to 43 in memory. This pattern continues on and on.
In a linked list, each element of the list could be stored like so:
You can see here that the elements can be spaced out in memory. Because the elements are not stored contiguously, each element in memory must contain information about the next element in the list. The first item stores the data 43 and the location in memory (*3) of the next item in the list. This example is simplified; the second item in the list, 32, could be located anywhere in memory. It could even come before the first item in memory.
You might also be wondering what types of data can be stored in a linked list. Pretty much any type of data can be stored in a linked list. Strings, numbers, booleans, and other data structures can be stored. You should not feel limited using a linked list based on what type of data you are trying to store.
Are the elements in a linked list sorted or unsorted? They can be either; nothing about the data structure forces the elements to be sorted or unsorted. You cannot assume a linked list's elements are sorted just because they are stored in a linked list.
What about duplicates? Can a linked list contain them? Linked lists can contain duplicates. There is nothing about the linked list data structure that would prevent duplicates from being stored. When you encounter a linked list, you should know that it can contain duplicates.
Are there different types of linked lists? If so, what are they? There are three types of linked lists: singly linked list (SLL), doubly linked list (DLL), and circular linked list. All linked lists are made up of nodes where each node stores the data and also information about other nodes in the linked list.
Each singly linked list node stores the data and a pointer to where the next node in the list is located. Because of this, you can only navigate in the forward direction in a singly linked list. To traverse an SLL, you need a reference to the first node, called the head. From the head of the list, you can visit all the other nodes using the next pointers.
The difference between an SLL and a doubly linked list (DLL) is that each node in a DLL also stores a reference to the previous item. Because of this, you can navigate forward and backward in the list. A DLL also usually stores a pointer to the last item in the list (called the tail).
A circular linked list links the last node back to the first node in the list. This linkage causes a circular traversal; when you get to the end of the list, the next item will be back at the beginning of the list. Each type of linked list is similar but has small distinctions. When working with linked lists, it's essential to know what type of linked list you are dealing with.
Lookup
To look up an item by index in a linked list is linear time (O(n)). To traverse a linked list, you have to start with the head reference and then follow each subsequent pointer to the next item in the chain. Because the items in a linked list are not stored contiguously in memory, you cannot access a specific index of the list using simple math. The distance in memory between one item and the next is varied and unknown.
Append
Adding an item to the end of a linked list is constant time (O(1)). We always keep a reference to the tail of the linked list, so we can easily insert an item after the tail.
Insert
In the worst case, inserting an item in a linked list is linear time (O(n)). To insert an item at a specific index, we have to traverse, starting at the head, until we reach the desired index.
Delete
In the worst case, deleting an item in a linked list is linear time (O(n)). Just like insertion, deleting an item at a specific index means traversing the list starting at the head.
Space
The space complexity of a linked list is linear (O(n)). Each item in the linked list takes up space in memory.
The primary strength of a linked list is that operations on its ends are fast. The linked list always keeps a reference to the head (the first node) and the tail (the last node) of the list, so doing anything at the ends is a constant time operation (O(1)) no matter how many items are stored in the linked list. Additionally, just like a dynamic array, you don't have to set a capacity for a linked list when you instantiate it. If you don't know the size of the data you are storing, or if the amount of data is likely to fluctuate, linked lists can work well. One benefit over a dynamic array is that you avoid the occasional costly "doubling" append. Because the items don't have to be stored contiguously, adding an item only requires finding an open spot in memory for the new node.
The main weakness of a linked list is that it cannot efficiently access an "index" in the middle of the list. The only way to get to the seventh item in a linked list is to go to the head node and then traverse one node at a time until you arrive at the seventh node. You can't do simple math and jump from the first item to the seventh.
Remember that linked lists have efficient operations on the ends (head and tail). Two common structures operate only on the ends: queues and stacks. So, most queue and stack implementations use a linked list as their underlying data structure.
We can see the difference between how a linked list and an array are stored in memory, but why is this important? Once you see the problem with the way arrays are stored in memory, the benefits of a linked list become clearer.
The primary problem with arrays is that they hold data contiguously in memory. Remember that having the data stored contiguously is the feature that gives them quick lookups. If I know where the first item is stored, I can use simple math to figure out where the fifth item is stored. The reason that this is a problem is that it means that when you create an array, you either have to know how much space in memory you need to set aside, or you have to set aside a bunch of extra memory that you might not need, just in case you do need it. In other words, you can be space-efficient by only setting aside the memory you need at the moment. But, in doing that, you are setting yourself up for low time efficiency if you run out of room and need to copy all of your elements to a newer, bigger array.
With a linked list, the elements are not stored side-by-side in memory. Each element can be stored anywhere in memory. In addition to storing the data for that element, each element also stores a pointer to the memory location of the next element in the list. The elements in a linked list do not have an index. To get to a specific element in a linked list, you have to start at the head of the linked list and work your way through the list, one element at a time, to reach the specific element you are searching for. Now you can see how a linked list solves some of the problems that the array data structure has.
Let’s look at how we can represent a singly linked list graphically and in Python code. Seeing a singly linked list represented graphically and in code can help you understand it better.
How do you represent a singly linked list graphically? Let’s say you wanted to store the numbers 1, 2, and 3. You would need to create three nodes. Then, each of these nodes would be linked together using the pointers.
Notice that the last element or node in the linked list does not have a pointer to any other node. This fact is how you know you are at the end of the linked list.
What does a singly linked list implementation look like in Python? Let's start by writing a LinkedListNode class for each element in the linked list.
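A minimal version might look like this; the attribute names data and next are common choices, though any names would work:

```python
class LinkedListNode:
    """One element of a singly linked list."""

    def __init__(self, data):
        self.data = data  # the value this node stores
        self.next = None  # reference to the next node; None means end of list
```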
Now, we need to build out the class for the LinkedList itself:
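A bare-bones sketch with just an initialization method, tracking both ends of the list:

```python
class LinkedList:
    """A singly linked list with references to its first and last nodes."""

    def __init__(self):
        self.head = None  # first node in the list
        self.tail = None  # last node in the list
```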
Our class is super simple so far and only includes an initialization method. Let's add an append method so that we can add nodes to the end of our list:
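One possible append, shown with the node class repeated so the sketch runs on its own. Because we track the tail, appending is O(1):

```python
class LinkedListNode:
    def __init__(self, data):
        self.data = data
        self.next = None


class LinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, data):
        """Add a node holding data to the end of the list in O(1) time."""
        new_node = LinkedListNode(data)
        if self.tail is None:
            # Empty list: the new node is both the head and the tail.
            self.head = new_node
            self.tail = new_node
        else:
            # Link the new node after the current tail, then move the tail.
            self.tail.next = new_node
            self.tail = new_node
```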
Now, let's use our simple class definitions for LinkedListNode and LinkedList to create a linked list of the elements 1, 2, and 3.
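Putting it all together (the class definitions are repeated so this example is self-contained), we can build the list and then walk it from the head to confirm the order:

```python
class LinkedListNode:
    def __init__(self, data):
        self.data = data
        self.next = None


class LinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, data):
        new_node = LinkedListNode(data)
        if self.tail is None:
            self.head = new_node
            self.tail = new_node
        else:
            self.tail.next = new_node
            self.tail = new_node


linked_list = LinkedList()
linked_list.append(1)
linked_list.append(2)
linked_list.append(3)

# Traverse from the head, following next pointers, to collect the values.
values = []
current = linked_list.head
while current is not None:
    values.append(current.data)
    current = current.next

print(values)  # [1, 2, 3]
```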
You must be able to understand and interact with linked lists. You now know the basic properties and types of linked lists, what makes a linked list different from an array, what problem it solves, and how to represent them both graphically and in code. You now know enough about linked lists that you should be able to solve algorithmic code challenges that require a basic understanding of linked lists.
Draw out a model of a singly-linked list that stores the following integers in order: 3, 2, 6, 5, 7, 9.
Draw out a model of a doubly-linked list that stores the following integers in order: 5, 2, 6, 4, 7, 8.
During this sprint, we will introduce you to some very common data structures: linked lists, queues, stacks, and binary trees. Additionally, we will teach you about searching through these data structures.
A basic understanding of and the ability to work with these data structures is crucial. These are probably the most common data structures you work with, and an excellent working understanding of them is essential for you to pass a technical interview.
In this module, you will learn all about linked lists. They are a crucial data structure because they form the basis for many other data structures.
This module will teach you about queues, stacks, and different implementation options for both. The queue and stack data structures come up frequently during technical interviews and form the basis for essential traversal techniques that we will look at later.
In this module, you will learn about binary tree properties and binary search trees. These data structures commonly come up during technical interviews, so you need to be comfortable working with them.
In this module, you will learn about the different tree traversal methods and practice using them in algorithmic code challenges.
There are two ways to sort a linked list using bubble sort:
Exchanging data between nodes
Modifying the links between nodes
In this section, we will see how both these approaches work. We will use the bubble sort algorithm to first sort the linked list by changing the data, and then we will see how we can use bubble sort to change the links in order to sort the linked list.
Sorting Linked List by Exchanging Data
To sort a linked list by exchanging data, we need to declare three variables: p, q, and end.

The variable p will be initialized with the start node, while end will be set to None.
It is important to remember that to sort a list with n elements using bubble sort, you need n-1 iterations.
To implement bubble sort, we need two while loops. The outer while loop executes until the value of the variable end is equal to self.start_node.

The inner while loop executes until p becomes equal to the end variable. Inside the outer while loop, the value of p will be set to self.start_node, which is the first node. Inside the inner while loop, the value of q will be set to p.ref, which is the node next to p. Then the values of p and q will be compared. If p is greater than q, the values of both variables will be swapped, and then p will point to p.ref, which is the next node. Finally, end will be assigned the value of p. This process continues until the linked list is sorted.
Let's understand this process with the help of an example. Suppose we have the following list:
Let's implement our algorithm to sort the list and see what happens during each iteration. The idea of bubble sort is that during each iteration, the largest remaining value is pushed to the end; hence, after all the iterations, the list will be sorted.
Before the loop executes, the value of end is set to None.
In the first iteration, p will be set to 8, and q will be set to 7. Since p is greater than q, the values will be swapped, and p will become p.ref. At this point, the linked list will look like this:
Since p is not yet equal to end, the loop continues; now p will become 8 and q will become 1. Since p is again greater than q, the values will be swapped again, and p will again become p.ref. The list will look like this:
Here again, p is not equal to end, so the loop continues; now p will become 8 and q will become 6. Since p is again greater than q, the values will be swapped again, and p will again become p.ref. The list will look like this:
Again, p is not equal to end, so the loop continues; now p will become 8 and q will become 9. Here, since p is not greater than q, the values will not be swapped, and p will become p.ref. At this point, the reference of p points to None, and end also points to None. Hence the inner while loop will break, and end will be set to p.
In the next set of iterations, the loop will execute until 8, since 9 is already at the end. The process continues until the list is completely sorted.
The Python code for sorting the linked list using bubble sort by exchanging the data is as follows:
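A version consistent with the walkthrough above might look like this. The node attribute names item and ref follow the prose; the Node class and the insert_at_end() and to_list() helpers are our additions so the sketch runs on its own (the article itself uses make_new_list() and traverse_list() for setup and printing):

```python
class Node:
    def __init__(self, item):
        self.item = item  # the node's data
        self.ref = None   # reference to the next node


class LinkedList:
    def __init__(self):
        self.start_node = None

    def insert_at_end(self, item):
        new_node = Node(item)
        if self.start_node is None:
            self.start_node = new_node
            return
        n = self.start_node
        while n.ref is not None:
            n = n.ref
        n.ref = new_node

    def to_list(self):
        """Helper for checking results: return the items as a Python list."""
        items, n = [], self.start_node
        while n is not None:
            items.append(n.item)
            n = n.ref
        return items

    def bub_sort_dataexchange(self):
        """Bubble sort by swapping the data stored in the nodes."""
        end = None
        while end != self.start_node:
            p = self.start_node
            while p.ref != end:
                q = p.ref
                if p.item > q.item:
                    p.item, q.item = q.item, p.item  # swap the data only
                p = p.ref
            end = p  # the largest remaining item has bubbled to p
```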
Add the bub_sort_dataexchange() method to the LinkedList class that you created in the last article.

Once you add the method to the linked list, create any set of nodes using the make_new_list() function and then use bub_sort_dataexchange() to sort the list. You should see the sorted list when you execute the traverse_list() function.
Sorting Linked List by Modifying Links
Bubble sort can also be used to sort a linked list by modifying the links instead of changing data. The process remains quite similar to sorting the list by exchanging data; however, in this case, we have an additional variable r that will always correspond to the node previous to the p node.
Let's take a simple example of how we will swap two nodes by modifying links. Suppose we have a linked list with the following items:
And we want to swap 65 and 35. At this point, p corresponds to node 65, and q corresponds to node 35. The variable r will correspond to node 45 (the node previous to p). Now, if node p is greater than node q, which is the case here, p.ref will be set to q.ref, and q.ref will be set to p. Similarly, r.ref will be set to q. This will swap nodes 65 and 35.
The following method implements the bubble sorting for the linked list by modifying links:
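One way to implement it, again with a small Node class and helpers added as assumptions so the sketch is self-contained. The variable r tracks the node previous to p so that its link can be redirected after a swap:

```python
class Node:
    def __init__(self, item):
        self.item = item
        self.ref = None


class LinkedList:
    def __init__(self):
        self.start_node = None

    def insert_at_end(self, item):
        new_node = Node(item)
        if self.start_node is None:
            self.start_node = new_node
            return
        n = self.start_node
        while n.ref is not None:
            n = n.ref
        n.ref = new_node

    def to_list(self):
        items, n = [], self.start_node
        while n is not None:
            items.append(n.item)
            n = n.ref
        return items

    def bub_sort_linkchange(self):
        """Bubble sort by re-linking nodes instead of swapping their data."""
        end = None
        while end != self.start_node:
            r = p = self.start_node
            while p.ref != end:
                q = p.ref
                if p.item > q.item:
                    # Re-link p and q in the opposite order.
                    p.ref = q.ref
                    q.ref = p
                    if p != self.start_node:
                        r.ref = q          # previous node now points at q
                    else:
                        self.start_node = q
                    p, q = q, p            # keep p naming the earlier node
                r = p
                p = p.ref
            end = p
```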
Add the bub_sort_linkchange() method to the LinkedList class that you created in the last article.

Once you add the method to the linked list, create any set of nodes using the make_new_list() function and then use bub_sort_linkchange() to sort the list. You should see the sorted list when you execute the traverse_list() function.
In this section, we will see how we can merge two sorted linked lists so that the resulting linked list is also sorted. There are two approaches to achieve this: we can create a new linked list that contains the elements of both lists in sorted order, or we can simply change the links of the two linked lists to join them in sorted order. In the second case, we do not have to create a new linked list.
Let's first see how we can merge two linked lists by creating a new list.
Merging Sorted Linked Lists by Creating a New List
Let's first dry run the algorithm to see how we can merge two sorted linked lists with the help of a new list.
Suppose we have the following two sorted linked lists:
list1:
list2:
These are the two lists we want to merge. The algorithm is straightforward. All we will need are three variables, p, q, and em, and an empty list, newlist.
At the beginning of the algorithm, p will point to the first element of list1, whereas q will point to the first element of list2. The variable em will be empty. At the start of the algorithm, we will have the following values:
Next, we will compare the first element of list1 with the first element of list2; in other words, we will compare the values of p and q. The smaller value will be stored in the variable em, which will become the first node of the new list. The value of em will be added to the end of newlist.
After the first comparison we will have the following values:
Since q was less than p, we stored the value of q in em and moved q one node to the right. In the second pass, we will have the following values:
Here, since p was smaller, we add the value of p to newlist, set em to p, and then move p one node to the right. In the next iteration we have:
Similarly, in the next iteration:
And in the next iteration, p will again be smaller than q, hence:
Finally,
When one of the lists becomes None, all the elements of the second list are added at the end of the new list. Therefore, the final list will be:
The Python script for merging two sorted lists is as follows:
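A sketch of the pair of methods described below; the Node class and the insert_at_end() and to_list() helpers are our additions so it runs on its own. merge_by_newlist() copies the smaller of the two current values into a brand-new node at each step, so the original lists are left untouched:

```python
class Node:
    def __init__(self, item):
        self.item = item
        self.ref = None


class LinkedList:
    def __init__(self):
        self.start_node = None

    def insert_at_end(self, item):
        new_node = Node(item)
        if self.start_node is None:
            self.start_node = new_node
            return
        n = self.start_node
        while n.ref is not None:
            n = n.ref
        n.ref = new_node

    def to_list(self):
        items, n = [], self.start_node
        while n is not None:
            items.append(n.item)
            n = n.ref
        return items

    def merge_helper(self, list2):
        merged = LinkedList()
        merged.start_node = self.merge_by_newlist(self.start_node,
                                                  list2.start_node)
        return merged

    def merge_by_newlist(self, p, q):
        """Merge two sorted chains into a newly allocated sorted chain."""
        # The smaller first value becomes the start node of the new list.
        if p.item <= q.item:
            start_node = Node(p.item)
            p = p.ref
        else:
            start_node = Node(q.item)
            q = q.ref
        em = start_node  # em always points at the last node of the new list
        while p is not None and q is not None:
            if p.item <= q.item:
                em.ref = Node(p.item)
                p = p.ref
            else:
                em.ref = Node(q.item)
                q = q.ref
            em = em.ref
        # One list is exhausted; copy whatever remains of the other.
        while p is not None:
            em.ref = Node(p.item)
            p = p.ref
            em = em.ref
        while q is not None:
            em.ref = Node(q.item)
            q = q.ref
            em = em.ref
        return start_node
```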
In the script above, we have two methods: merge_helper() and merge_by_newlist(). The first method, merge_helper(), takes a linked list as a parameter and then passes self (which is a linked list itself), along with the linked list passed as a parameter, to the merge_by_newlist() method.
The merge_by_newlist() method merges the two linked lists by creating a new linked list and returns the start node of the new linked list. Add these two methods to the LinkedList class. Create two new linked lists, sort them using the bub_sort_dataexchange() or bub_sort_linkchange() methods that you created in the last section, and then use merge_by_newlist() to see if you can merge two sorted linked lists.
Merging Sorted Linked Lists by Rearranging Links
In this approach, a new linked list is not used to store the merger of two sorted linked lists. Rather, the links of the two linked lists are modified in such a way that two linked lists are merged in a sorted manner.
Let's see a simple example of how we can do this. Suppose we have the same two lists, list1 and list2:
list1:
list2:
We want to merge them in a sorted manner by rearranging the links. To do so, we need the variables p, q, and em. Initially, they will have the following values:
Next, we will compare the first element of list1 with the first element of list2; in other words, we will compare the values of p and q. The node with the smaller value will be stored in the variable em and will become the first node of the merged list.
After the first comparison we will have the following values:
After the first iteration, since q is less than p, the start node will point towards q, and q will become q.ref. The variable em will be equal to start; em will always refer to the most recently inserted node in the merged list.
Here, since p was smaller than q, the variable em now points towards the original value of p, and p becomes p.ref.
Here, since q was smaller than p, em points towards q, and q becomes q.ref.
Similarly, em here points towards q.
And here, em points towards p.
When one of the lists becomes None, the elements from the second list are simply added at the end.
The script that contains functions for merging two lists without creating a new list is as follows:
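A sketch of the link-rearranging version, again with a small Node class and helpers added as assumptions so it runs on its own. No new nodes are created; existing nodes are re-linked into one sorted chain:

```python
class Node:
    def __init__(self, item):
        self.item = item
        self.ref = None


class LinkedList:
    def __init__(self):
        self.start_node = None

    def insert_at_end(self, item):
        new_node = Node(item)
        if self.start_node is None:
            self.start_node = new_node
            return
        n = self.start_node
        while n.ref is not None:
            n = n.ref
        n.ref = new_node

    def to_list(self):
        items, n = [], self.start_node
        while n is not None:
            items.append(n.item)
            n = n.ref
        return items

    def merge_helper2(self, list2):
        merged = LinkedList()
        merged.start_node = self.merge_by_linkChange(self.start_node,
                                                     list2.start_node)
        return merged

    def merge_by_linkChange(self, p, q):
        """Merge two sorted chains by redirecting links; no new nodes."""
        # The node with the smaller first value starts the merged chain.
        if p.item <= q.item:
            start_node = p
            p = p.ref
        else:
            start_node = q
            q = q.ref
        em = start_node  # em is always the last node linked into the merge
        while p is not None and q is not None:
            if p.item <= q.item:
                em.ref = p
                em = em.ref
                p = p.ref
            else:
                em.ref = q
                em = em.ref
                q = q.ref
        # Attach whatever remains of the non-exhausted list.
        em.ref = p if q is None else q
        return start_node
```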
In the script above, we have two methods: merge_helper2() and merge_by_linkChange(). The first method, merge_helper2(), takes a linked list as a parameter and then passes self (which is a linked list itself), along with the linked list passed as a parameter, to merge_by_linkChange(), which merges the two linked lists by modifying the links and returns the start node of the merged list. Add these two methods to the LinkedList class. Create two new linked lists, sort them using the bub_sort_dataexchange() or bub_sort_linkchange() methods that you created in the last section, and then use merge_by_linkChange() to see if you can merge two sorted linked lists. Let's see this process in action.
Create a new linked list using the following script:
The script will ask you for the number of nodes to enter. Enter as many nodes as you like and then add values for each node as shown below:
Next, create another linked list repeating the above process:
Next, add a few dummy nodes with the help of the following script:
The next step is to sort both the lists. Execute the following script:
Finally, the following script merges the two linked lists:
To see if the lists have actually been merged, execute the following script:
The output looks like this:
In this article, we continued from where we left off in the previous article. We saw how we can sort linked lists by changing data and then by modifying links. Finally, we also studied different ways of merging two sorted linked lists.
Python does not have a static array data type. However, lists are built on dynamic arrays. As you will see, dynamic arrays rely on an underlying static array to work. So while you won't be creating and using this data structure directly, it is still essential to understand.
A data structure is a structure that is designed for holding information in a particular way. A static array is a data structure designed for storing information sequentially (in order). For example, if you were to store the English alphabet in a static array, you would expect the "B" character to be right next to both the "A" character and the "C" character. Additionally, every position within the static array is labeled with an index. So, if you wanted to access the first item in the static array, you would expect that item to have an index of 0. The second item would have an index of 1. The third item would have an index of 2. This pattern continues for the entire capacity of the static array.
Time and Space Complexity
Lookup
To look up an item by index in an array is constant time (O(1)). If you have the specific index of an item in an array, the computations to find that item in memory are all constant time as well.
Append
Adding an item to the end of an array is constant time (O(1)). We always have a reference to the last position in a static array, so we can insert an item after the current end.
Insert
Unless you are inserting an item at the end of the list, items must be shifted over to make room for the new information you add to the static array. It's like if you had a chain of people stretched out, holding hands, in a line. The first person in the line is butted right up against a wall, and there is no room on one side of him. If someone wanted to join the end of the line, the people already in the line wouldn't have to do anything (O(1)). However, if you wanted to join the beginning of the line, every single person would have to move over, away from the wall (O(n)), so that you would have room to join. If you wanted to join the line somewhere in the middle, only the people on one side of you would have to shift to make room for you. In the computer, this shifting is moving information from one address in memory to another. Each move takes time.
Delete
Just like insertions, deletions are only efficient (O(1)) when they are done at the end of the static array. If something is deleted from any other position in the array, the items have to be moved over so there isn't any empty space left. Remember, static arrays can be a good data structure because retrieving information from a specific index is fast. It is fast because we can ensure that information is consistently stored in sequence, right next to each other. That way, we can always be confident that whatever information is at index 5 is the sixth item in the array. If we left empty spaces in the middle of our static array, we could no longer ensure that this was true.
Space
The space complexity of an array is linear (O(n)). Each item in the array will take up space in memory.
Strengths
Static arrays are great to use when you need a data structure to retrieve information from a specific index efficiently. This is because, as we explained earlier, accessing any specific index in a static array involves a simple mathematical computation (starting index + (size of each item * index)). This computation is done in O(1) time and is not affected by the static array's size at all. If you need a data structure where you are likely only to append items (add them to the end of the list), a static array also works great. When you add a new item to the end of the list, nothing has to be shifted over or moved in memory, so that operation is very efficient (O(1)).
Weaknesses
There are situations where static arrays are not the best data structure to use for storing your information. What if you don't know how much information you need to store? Or what if the amount of information is likely to fluctuate or change frequently? In these cases, a static array is not a good choice. The reason is that when you create a static array, you have to know and declare its size. That way, your computer can set aside a chunk of memory that is exactly the right size for storing it. If you run out of room in the static array, you can't simply make it bigger; you have to create a brand new, bigger static array and copy each item from the first static array into the new one.
Another reason that static arrays are not always the best choice to use for storing information is that they are inefficient unless you are performing operations at the end of the static array. They are inefficient because if you want to insert or delete something at the beginning (or the middle of the list), all the items to the right of that index must be moved over. If you delete something, everything has to be shifted over, so there isn't an empty index in the middle of your data. If you insert something, all the items have to shift over to make room for the new item before inserting it.
What about array slicing?
You often encounter a scenario where you want to use a subset of items from an existing array. Array slicing is when you take a subset from an existing array and allocate a new array with just the items from the slice.
In Python, the syntax looks like this:
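For example (the list contents here are just for illustration):

```python
letters = ["a", "b", "c", "d", "e"]

# my_list[start_index:end_index] copies the items from start_index up to,
# but not including, end_index into a brand-new list.
middle = letters[1:4]
first_two = letters[:2]   # start index defaults to 0
last_three = letters[2:]  # omitting the end index captures through the end

print(middle)      # ['b', 'c', 'd']
print(first_two)   # ['a', 'b']
print(last_three)  # ['c', 'd', 'e']
print(letters)     # ['a', 'b', 'c', 'd', 'e'] -- the original is unchanged
```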
The default start index is 0, and if you leave off the end_index, the slice will capture through the end of the list.
You might be wondering, what is the time and space complexity of slicing an array? To understand the complexity, you need to know what is happening behind the scenes when you take a slice of an array. First, you are actually allocating a new list. Second, you copy all of the items in your slice from the original array into the newly allocated list. This means that you have an O(n) time cost (for the copying) and an O(n) space cost for the newly allocated list.
You must keep these facts in mind and account for them when using a slice in your code. It's not a free operation.
Draw out what happens to a static array when you insert an item at the beginning of the array.
Draw out what happens to a static array when you delete an item from the array's beginning.
In-Place
An in-place function modifies or destroys the state of the input data when it is run. For instance, if you write a function that squares every integer in an input list, an in-place version of this function would change the data in the list that was passed in. It would not create a new list and return the new list. In-place functions are more space-efficient because they don't create new variables directly tied to the input size. However, to get that space-efficiency, you have to risk that the function's user may end up changing state to the input accidentally.
Imagine a scenario where you have an antique map that you are using to navigate on a hike. You end up needing directions, and when you come across another hiker, you ask them for help. The person helping you has two options. They can take your antique map, use a pen, and mark it up with their notations that will help you navigate. However, you most likely didn't want those annotations to be on your map forever. The other option would be to find another piece of paper and have the person helping you write out their annotations on that. This way, your original antique map doesn't have to be modified. However, now you have two maps that you have to carry around on your hike.
Out-of-Place
In contrast to in-place functions, out-of-place functions don't modify or destroy the input state when they are run. Any changes done to the input are done to a copy of the input, not the original that was passed in. This is why they are less space-efficient. If you have a list of 1,000,000 items that you want to square, you first have to make a copy of that list. Now, you have two lists of 1,000,000 items. However, you avoid any side-effects that might be unintended.
Pass By Reference or Value
In Python, some function arguments are passed in by their actual value, and some are passed in as a reference to the object in memory. Primitive values like integers, floats, and strings are passed in by their actual value. So, if you call a function and pass in the integer 2, when you reference that value by the named parameter of the function, you can't change 2 in memory. However, non-primitive objects like lists or dictionaries are passed in as references to the object in memory. So, if you call a function and pass in the dictionary {"name": "Matt"}, when you reference that dictionary using the named parameter, you are changing the original object that was passed in. Objects that are passed in by reference must be copied to a new variable before they are modified if you want to avoid side effects.
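A small demonstration of both behaviors; the function and variable names are ours, chosen for illustration:

```python
def modify(number, person):
    # Rebinding the integer parameter only changes the local name;
    # the caller's value is unaffected.
    number = number + 1
    # Mutating the dictionary changes the very object the caller passed in.
    person["name"] = "Changed"


num = 2
matt = {"name": "Matt"}
modify(num, matt)

print(num)   # 2 -- the integer behaves as if passed by value
print(matt)  # {'name': 'Changed'} -- the dict was passed by reference
```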
When should I use an in-place function or algorithm?
It would be best if you always defaulted to using an out-of-place function. This is a safer default to avoid unintended side-effects in your program. However, there are scenarios that you might encounter where you need to be extremely space-efficient. In that case, you might have to use an in-place function to work within the particular space-constraints you've been given.
Here is an example of a function that triples each number in an input list. This function does this in-place:
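A sketch of such a function; the name triple_in_place is ours:

```python
def triple_in_place(numbers):
    """Triple every number in the list, modifying the input list itself."""
    for i in range(len(numbers)):
        numbers[i] *= 3
    # No return statement: the change happened to the caller's list.
```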
Now, since this is an in-place function, watch what happens when we use it:
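For example, with an in-place tripling function such as the following (the names are illustrative):

```python
def triple_in_place(numbers):
    # Modify the caller's list directly; return nothing.
    for i in range(len(numbers)):
        numbers[i] *= 3


my_list = [1, 2, 5]
returned = triple_in_place(my_list)

print(my_list)   # [3, 6, 15] -- the input list itself was modified
print(returned)  # None -- in-place functions typically return nothing
```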
Notice that my_list was modified when we called the function, and the function only returned the default return value of None.
Let's now write a similar function, but this time we will do it out-of-place:
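An out-of-place version might look like this; again, the function name is ours:

```python
def triple_out_of_place(numbers):
    """Return a new list with every number tripled; the input is untouched."""
    tripled = []
    for number in numbers:
        tripled.append(number * 3)
    return tripled
```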
Look what happens when we use this function:
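With an out-of-place tripling function such as the following (names illustrative), the caller's list survives untouched:

```python
def triple_out_of_place(numbers):
    # Build and return a brand-new list; the input is never modified.
    return [number * 3 for number in numbers]


my_list = [1, 2, 5]
new_list = triple_out_of_place(my_list)  # store the result in a new variable

print(new_list)  # [3, 6, 15]
print(my_list)   # [1, 2, 5] -- the original list is unchanged
```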
Notice how we had to store the returned list in a new variable. Also, notice that it didn't modify the list that we passed in when we called the function.
In your own words, describe the difference between an in-place algorithm and an out-of-place algorithm.
In your own words, explain when it is an excellent choice to use an in-place algorithm.
Remember how we said you had to know how much information you were going to store when you created a static array? Well, with a dynamic array, you don't have to know. You don't have to declare a size when you instantiate a dynamic array. That makes it better in scenarios where the amount of information you need to store is unknown or is likely to fluctuate.
Time and Space Complexity
Lookup
To look up an item by index in an array is constant time (O(1)). If you have the specific index of an item in an array, the computations to find that item in memory are all constant time as well.
Append
Adding an item to an array is constant time (O(1)) in the average case. However, in the worst case, the cost is O(n) (this will be explained in more detail below).
Insert
In the worst case, inserting an item is linear time (O(n)). When you insert into an array, all the items — starting at the index we are inserting into — have to be shifted one index. These items have to be "moved over" to make room for the new item being inserted. The worst-case scenario is inserting at the 0th index, where every item in the array has to shift over.
Delete
In the worst case, deleting an item is linear time (O(n)). For any item you delete (unless it is the last item), all of the items after that index have to be shifted over to fill the now blank spot in the array. Remember, arrays store data in sequential order, so if we delete an item, we cannot just leave that space blank. If we left the space blank, it would ruin the quick lookup time. To have a fast lookup time, we need to be able to rely on the distance from the start of the array to whatever index we are trying to access.
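The shifting work behind insert and delete can be sketched over a plain Python list, using explicit loops (rather than the built-in insert/pop) to make the O(n) cost visible; the function names are illustrative:

```python
def insert_at(items, index, value):
    # Shift everything from `index` onward one slot to the right (O(n)).
    items.append(None)  # make room at the end
    for i in range(len(items) - 1, index, -1):
        items[i] = items[i - 1]
    items[index] = value

def delete_at(items, index):
    # Shift everything after `index` one slot to the left (O(n)).
    for i in range(index, len(items) - 1):
        items[i] = items[i + 1]
    items.pop()  # drop the now-duplicated last slot

nums = [43, 32, 63]
insert_at(nums, 0, 99)  # [99, 43, 32, 63] -- every item shifted right
delete_at(nums, 1)      # [99, 32, 63]     -- items after index 1 shifted left
```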
Space
The space complexity of an array is linear (O(n)). Each item in the array will take up space in memory.
Strengths
Again, probably the dynamic array's biggest strength is not having to know or worry about the size of the data structure. It can grow to accommodate your data as needed. And, you don't have to manage this growth; the data structure itself grows when necessary. Dynamic arrays also have some of the same strengths as a static array. They also have efficient lookups (O(1)) when you have a specific index that you want to retrieve from.
Weaknesses
The main weakness of the dynamic array is related to its strength. To not have to worry about or manage the array's size, when the array runs out of room, it has to grow to accommodate more items. So, let's say your dynamic array is currently set up to store ten items. If it's full and you try to add an 11th item, the data structure can't just assume that there is a spot available right after the 10th item. It actually creates a new, bigger array and then copies all of the first ten items into the new array, and finally, it adds the 11th item. We will talk a bit more about how this works below. Additionally, dynamic arrays have the same weaknesses as static arrays: slow insertions and deletions (O(n)).
Doubling Appends
Underneath the hood of a dynamic array is a static array. When you create a dynamic array, it is a static array that keeps track of the starting index, the index of the last item that it stores, and the index for the last slot in its capacity. This brings up an important point. An array has a size and a capacity. An array's size is how many items it is storing at the moment. Its capacity is how many items it could store before it runs out of room.
So, let's say that your dynamic array instantiates with an underlying static array with a capacity of 10 and a size of 0 when you create it. Then, you add ten items to the array. Now, it has a capacity of 10 and a size of 10. If you now go to append an 11th item to the array, you've run out of capacity. Here is where the "dynamic" of the dynamic array comes into play. The data structure will create a new underlying static array with a capacity twice the size of the original underlying static array. It will then copy the ten original items into the new array and finally add the 11th item. The cost of copying the original items into the new array is O(n). So, when we say that, in the worst case, an append on a dynamic array has a time complexity of O(n), this is why. However, all the other appends still have a time complexity of O(1). So, in the average case, an append is still efficient. Also, consider that as the array's capacity keeps doubling, the doublings will occur less and less frequently.
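A minimal sketch of the doubling behavior (Python's built-in list already does this internally, so this class is purely illustrative):

```python
class DynamicArray:
    def __init__(self, capacity=10):
        self.capacity = capacity          # how many items it could store
        self.size = 0                     # how many items it is storing
        self.storage = [None] * capacity  # the underlying static array

    def append(self, item):
        if self.size == self.capacity:
            # Out of room: double the capacity and copy everything (O(n)).
            self.capacity *= 2
            new_storage = [None] * self.capacity
            for i in range(self.size):
                new_storage[i] = self.storage[i]
            self.storage = new_storage
        # The common case: drop the item into the next open slot (O(1)).
        self.storage[self.size] = item
        self.size += 1
```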
What type in Python is a dynamic array?
In your own words, explain why the worst-case time cost of appending to a dynamic array is O(n).
What is the difference between the size of a dynamic array and the capacity of a dynamic array?
This module project requires you to answer some multiple-choice questions related to the module's objectives. Additionally, you must continue developing your problem-solving skills by completing coding challenges related to its content.
There are lots of different types of tree data structures. A binary tree is a specific type of tree. It is called a binary tree because each node in the tree can only have a maximum of two child nodes. It is common for a node's children to be called either left or right.
Here is an example of what a class for a binary tree node might look like:
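The class definition itself is missing from the page; a minimal sketch (the class and attribute names are assumptions consistent with the text):

```python
class BinaryTreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None   # left child: another BinaryTreeNode, or None
        self.right = None  # right child: another BinaryTreeNode, or None
```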
With this simple class, we can now build up a structure that could be visualized like so:
A "perfect" tree has all of its levels full. This means that there are not any missing nodes in each level.
"Perfect" trees have specific properties. First, the quantity of each level's nodes doubles as you go down.
Second, the quantity of the last level's nodes is the same as the quantity of all the other nodes plus one.
These properties are useful for understanding how to calculate the height of a tree. The height of a tree is the number of levels that it contains. Based on the properties outlined above, we can deduce that we can calculate the tree's height with the following formula: h = log2(n + 1). In the formula above, n is the total number of nodes. If you know the tree's height and want to calculate the total number of nodes, you can do so with the following formula: n = 2^h - 1.
We can represent the relationship between a perfect binary tree's total number of nodes and its height because of the properties outlined above.
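These formulas can be checked with a quick sketch in code (the function names are illustrative, and a perfect binary tree is assumed):

```python
import math

def height_from_nodes(n):
    # h = log2(n + 1) for a perfect binary tree with n total nodes
    return int(math.log2(n + 1))

def nodes_from_height(h):
    # n = 2^h - 1 for a perfect binary tree with h levels
    return 2 ** h - 1

print(height_from_nodes(15))  # 4 -- a 15-node perfect tree has 4 levels
print(nodes_from_height(3))   # 7 -- a 3-level perfect tree has 7 nodes
```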
Calculate how many levels a perfect binary tree has given that the total number of nodes is 127.
Calculate the total number of nodes on a perfect binary tree, given that the tree's height is 8.
Just like a binary tree is a specific type of tree, a binary search tree (BST) is a specific type of binary tree. A binary search tree is just like a binary tree, except it follows specific rules about how it orders the nodes contained within it. For each node in the BST, all the nodes to the left are smaller, and all the nodes to the right of it are larger.
We can call a binary search tree balanced if the heights of its left and right subtrees differ by at most one, and both of the subtrees are also balanced.
Lookup
If a binary search tree is balanced, then a lookup operation's time complexity is logarithmic (O(log n)). If the tree is unbalanced, the time complexity can be linear (O(n)) in the worst possible case (an entirely unbalanced tree is virtually a linear chain of nodes, with all of the nodes on one side of the tree).
Insert
If a binary search tree is balanced, then an insertion operation's time complexity is logarithmic (O(log n)). If the tree is entirely unbalanced, then the time complexity is linear (O(n)) in the worst case.
Delete
If a binary search tree is balanced, then a deletion operation's time complexity is logarithmic (O(log n)). If the tree is entirely unbalanced, then the time complexity is linear (O(n)) in the worst case.
Space
The space complexity of a binary search tree is linear (O(n)). Each node in the binary search tree will take up space in memory.
One of the main strengths of a BST is that it is sorted by default. You can pull out the data in order by using an in-order traversal. BSTs also have efficient searches (O(log n)). They have the same efficiency for their searches as a sorted array; however, BSTs are faster with insertions and deletions. In the average case, dictionaries have more efficient operations than BSTs, but a BST has more efficient operations in the worst case.
The primary weakness of a BST is that it only has efficient operations if it is balanced. The more unbalanced it is, the worse the efficiency of its operations gets. Another weakness is that BSTs don't have stellar efficiency in any one operation; instead, they have good efficiency across a lot of different operations. So, they are more of a general-purpose data structure.
If you want to learn more about trees that automatically rearrange their nodes to remain balanced, look into AVL trees or Red-Black trees.
In your own words, explain why an unbalanced binary search tree's performance becomes degraded.
To create a binary search tree, we need to define two different classes: one for the nodes that will make up the binary search tree and another for the tree itself.
Let's start by creating a BSTNode class. An instance of BSTNode should have a value, a right node, and a left node.
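The class definition is missing from the page; a minimal sketch following the description above:

```python
class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None   # subtree of smaller values
        self.right = None  # subtree of larger (or equal) values
```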
Now that we have our basic BSTNode class defined with an initialization method, let's define our BST class. This class will have an initialization method and an insert method.
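The code block is missing here; a minimal sketch of such a BST class (the BSTNode class is repeated so the snippet runs on its own, and note that the delegated insert is not defined yet, exactly as the next paragraph points out):

```python
class BSTNode:
    # Repeated from the sketch above so this snippet is self-contained.
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

class BST:
    def __init__(self, value):
        self.root = BSTNode(value)

    def insert(self, value):
        # Delegates to a BSTNode.insert method, which has not been
        # defined yet at this point in the lesson.
        self.root.insert(value)
```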
Notice that our BST class expects each BSTNode to have an insert method available on an instance object. But, we haven't yet added an insert method on the BSTNode class. Let's do that now.
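The code block is missing here; a minimal sketch of what a BSTNode.insert method might look like (smaller values go left, larger or equal values go right):

```python
class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

    def insert(self, value):
        # Smaller values belong in the left subtree,
        # larger (or equal) values in the right subtree.
        if value < self.value:
            if self.left is None:
                self.left = BSTNode(value)
            else:
                self.left.insert(value)
        else:
            if self.right is None:
                self.right = BSTNode(value)
            else:
                self.right.insert(value)
```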
Now that we can insert nodes into our binary search tree, let's define a search method that can look up values in our binary search tree.
Our BST class expects there to be a search method available on the BSTNode instance stored at the root. Let's go ahead and define that now.
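The code blocks for both search methods are missing from the page; a minimal sketch of how they might fit together (the BST method simply delegates to the node-level search):

```python
class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

    def search(self, target):
        # Return the node holding `target`, or None if it isn't in the tree.
        if target == self.value:
            return self
        if target < self.value:
            return self.left.search(target) if self.left else None
        return self.right.search(target) if self.right else None

class BST:
    def __init__(self, value):
        self.root = BSTNode(value)

    def search(self, target):
        # Delegate to the node-level search, starting at the root.
        return self.root.search(target)
```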
To implement a delete operation on our BST and BSTNode classes, we must consider three cases:
If the BSTNode to be deleted is a leaf (has no children), we can remove that node from the tree.
If the BSTNode to be deleted has only one child, we copy the child node into the node to be deleted and then delete the child node.
If the BSTNode to be deleted has two children, we have to find the "in-order successor": the next highest value, i.e., the node that has the minimum value in the right subtree.
Given the above information, can you write pseudocode for a method that can find the minimum value of all the nodes within a tree or subtree?
A module is a collection of code that is written to meet specific needs. For example, you could split up different parts of a game you were building into modules. Each module would be a separate Python file that you could manage separately.
Any Python file that ends with the .py extension is considered a module. The name of the module is the name of the file.
To import from other modules, we can use the import command.
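The original example is missing here; a minimal sketch using the built-in math module:

```python
import math

# Access the module's functions and data with dot notation.
print(math.sqrt(16))  # 4.0
print(math.pi)        # 3.141592653589793
```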
So, by importing the built-in math module, we have access to all of the functions and data defined in that module. We access those functions and data using dot notation, just like we do with objects.
If you only need a specific function from a module, you can import that specific function like so:
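The example is missing here; a minimal sketch importing a single name:

```python
from math import sqrt

# sqrt is now available directly, without the math. prefix.
print(sqrt(25))  # 5.0
```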
You can also import all the names from a module with this syntax to avoid using dot notation throughout your file.
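The example for this syntax is missing; a minimal sketch:

```python
from math import *

# Every public name in math is now in the current namespace,
# so no dot notation is needed.
print(floor(2.9))  # 2
print(ceil(2.1))   # 3
```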
You can also bind the module to a name of your choice by using as.
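The example is missing here; a minimal sketch (the alias m is an arbitrary choice):

```python
import math as m

# The module is now bound to the shorter name m.
print(m.sqrt(9))  # 3.0
```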
To find out which names a module defines when imported, you can use the built-in dir() function. It returns an alphabetically sorted list of strings for all of the names defined in the module.
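For example:

```python
import math

names = dir(math)
print(names[:5])  # the first few of the module's defined names
```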
What is a linked list, and how is it different from an array? How efficient or inefficient are its operations? What are its strengths and weaknesses? How can I construct and interact with a linked list? By the end of this objective, you will be able to answer all of these questions confidently.
Basic Properties of a Linked List
A linked list is a simple, linear data structure used to store a collection of elements. Unlike an array, each element in a linked list does not have to be stored contiguously in memory.
For example, in an array, each element of the list [43, 32, 63] is stored in memory like so:
43 is the first item in the collection and is therefore stored in the first slot. 32 is the second item and is stored immediately next to 43 in memory. This pattern continues on and on.
In a linked list, each element of the list could be stored like so:
You can see here that the elements can be spaced out in memory. Because the elements are not stored contiguously, each element in memory must contain information about the next element in the list. The first item stores the data 43 and the location in memory (*3) for the next item in the list. This example is simplified; the second item in the list, 32, could be located anywhere in memory. It could even come before the first item in memory.
You might also be wondering what types of data can be stored in a linked list. Pretty much any type of data can be stored in a linked list. Strings, numbers, booleans, and other data structures can be stored. You should not feel limited using a linked list based on what type of data you are trying to store.
Are the elements in a linked list sorted or unsorted? The elements in a linked list can be either sorted or unsorted. There is nothing about the data structure that forces the elements to be sorted or unsorted. You cannot assume that a linked list's elements are sorted just because they are stored in a linked list.
What about duplicates? Can a linked list contain them? Linked lists can contain duplicates. There is nothing about the linked list data structure that would prevent duplicates from being stored. When you encounter a linked list, you should know that it can contain duplicates.
Are there different types of linked lists? If so, what are they? There are three types of linked lists: singly linked list (SLL), doubly linked list (DLL), and circular linked list. All linked lists are made up of nodes where each node stores the data and also information about other nodes in the linked list.
Each singly linked list node stores the data and a pointer where the next node in the list is located. Because of this, you can only navigate in the forward direction in a singly linked list. To traverse an SLL, you need a reference to the first node called the head. From the head of the list, you can visit all the other nodes using the next pointers.
The difference between an SLL and a doubly linked list (DLL) is that each node in a DLL also stores a reference to the previous item. Because of this, you can navigate forward and backward in the list. A DLL also usually stores a pointer to the last item in the list (called the tail).
A circular linked list links the last node back to the first node in the list. This linkage causes a circular traversal; when you get to the end of the list, the next item will be back at the beginning of the list. Each type of linked list is similar but has small distinctions. When working with linked lists, it's essential to know which type of linked list you are working with.
Time and Space Complexity
Lookup
To look up an item by index in a linked list is linear time (O(n)). To traverse through a linked list, you have to start with the head reference to the node and then follow each subsequent pointer to the next item in the chain. Because each item in the linked list is not stored contiguously in memory, you cannot access a specific index of the list using simple math. The distance in memory between one item and the next is varied and unknown.
Append
Adding an item to a linked list is constant time (O(1)). We always have a reference to the tail of the linked list, so we can easily insert an item after the tail.
Insert
In the worst case, inserting an item in a linked list is linear time (O(n)). To insert an item at a specific index, we have to traverse — starting at the head — until we reach the desired index.
Delete
In the worst case, deleting an item in a linked list is linear time (O(n)). Just like insertion, deleting an item at a specific index means traversing the list starting at the head.
Space
The space complexity of a linked list is linear (O(n)). Each item in the linked list will take up space in memory.
Strengths of a Linked List
The primary strength of a linked list is that operations on the linked list's ends are fast. This is because the linked list always has a reference to the head (the first node) and the tail (the last node) of the list. Because it has these references, doing anything on the ends is a constant time operation (O(1)) no matter how many items are stored in the linked list. Additionally, just like a dynamic array, you don't have to set a capacity for a linked list when you instantiate it. If you don't know the size of the data you are storing, or if the amount of data is likely to fluctuate, linked lists can work well. One benefit over a dynamic array is that you don't have doubling appends. Because each item doesn't have to be stored contiguously, whenever you add an item, you only need to find one open spot in memory for the new node.
Weaknesses of a Linked List
The main weakness of a linked list is that it cannot efficiently access an "index" in the middle of the list. The only way the linked list can get to the seventh item is by going to the head node and then traversing one node at a time until it arrives at the seventh node. You can't do simple math and jump from the first item to the seventh.
What data structures are built on linked lists?
Remember that linked lists have efficient operations on the ends (head and tail). There are two structures that only operate on the ends: queues and stacks. So, most queue or stack implementations use a linked list as their underlying data structure.
Why is a linked list different than an array? What problem does it solve?
We can see the difference between how a linked list and an array are stored in memory, but why is this important? Once you see the problem with the way arrays are stored in memory, the benefits of a linked list become clearer.
The primary problem with arrays is that they hold data contiguously in memory. Remember that having the data stored contiguously is the feature that gives them quick lookups. If I know where the first item is stored, I can use simple math to figure out where the fifth item is stored. The reason that this is a problem is that it means that when you create an array, you either have to know how much space in memory you need to set aside, or you have to set aside a bunch of extra memory that you might not need, just in case you do need it. In other words, you can be space-efficient by only setting aside the memory you need at the moment. But, in doing that, you are setting yourself up for low time efficiency if you run out of room and need to copy all of your elements to a newer, bigger array.
With a linked list, the elements are not stored side-by-side in memory. Each element can be stored anywhere in memory. In addition to storing the data for that element, each element also stores a pointer to the memory location of the next element in the list. The elements in a linked list do not have an index. To get to a specific element in a linked list, you have to start at the head of the linked list and work your way through the list, one element at a time, to reach the specific element you are searching for. Now you can see how a linked list solves some of the problems that the array data structure has.
How do you represent a linked list graphically and in Python code?
Let’s look at how we can represent a singly linked list graphically and in Python code. Seeing a singly linked list represented graphically and in code can help you understand it better.
How do you represent a singly linked list graphically? Let’s say you wanted to store the numbers 1, 2, and 3. You would need to create three nodes. Then, each of these nodes would be linked together using the pointers.
Notice that the last element or node in the linked list does not have a pointer to any other node. This fact is how you know you are at the end of the linked list.
What does a singly linked list implementation look like in Python? Let's start by writing a LinkedListNode class for each element in the linked list.
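The class definition is missing from the page; a minimal sketch (attribute names are assumptions consistent with the text):

```python
class LinkedListNode:
    def __init__(self, data):
        self.data = data  # the value this node stores
        self.next = None  # pointer to the next node in the list
```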
Now, we need to build out the class for the LinkedList itself:
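The code block is missing here; a minimal sketch with only an initialization method, matching the description that follows:

```python
class LinkedList:
    def __init__(self):
        self.head = None  # reference to the first node in the list
        self.tail = None  # reference to the last node in the list
```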
Our class is super simple so far and only includes an initialization method. Let's add an append method so that we can add nodes to the end of our list:
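The code block is missing here; a minimal sketch of an append method (the class definitions are repeated so the snippet runs on its own):

```python
class LinkedListNode:
    # Repeated from the sketch above so this snippet is self-contained.
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, data):
        # Wrap the data in a node and attach it after the current tail.
        node = LinkedListNode(data)
        if self.head is None:
            # Empty list: the new node is both head and tail.
            self.head = node
            self.tail = node
        else:
            self.tail.next = node
            self.tail = node
```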
Now, let's use our simple class definitions for LinkedListNode and LinkedList to create a linked list of the elements 1, 2, and 3.
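The usage example is missing from the page; a minimal sketch (class definitions repeated so the snippet runs on its own):

```python
class LinkedListNode:
    def __init__(self, data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, data):
        node = LinkedListNode(data)
        if self.head is None:
            self.head = node
            self.tail = node
        else:
            self.tail.next = node
            self.tail = node

linked_list = LinkedList()
linked_list.append(1)
linked_list.append(2)
linked_list.append(3)

# Traverse from the head, following each next pointer.
current = linked_list.head
while current is not None:
    print(current.data)  # prints 1, then 2, then 3
    current = current.next
```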
You must be able to understand and interact with linked lists. You now know the basic properties and types of linked lists, what makes a linked list different from an array, what problem it solves, and how to represent them both graphically and in code. You now know enough about linked lists that you should be able to solve algorithmic code challenges that require a basic understanding of linked lists.
Draw out a model of a singly-linked list that stores the following integers in order: 3, 2, 6, 5, 7, 9.
Draw out a model of a doubly-linked list that stores the following integers in order: 5, 2, 6, 4, 7, 8.
L1 = Node(34)
L1.next = Node(45)
L1.next.next = Node(90)

Result: L1 = [34] -> [45] -> [90] -> None
A binary search tree (BST) is a binary tree where every node in the left subtree is less than the root, and every node in the right subtree is of a value greater than the root. The properties of a binary search tree are recursive: if we consider any node as a “root,” these properties will remain true.
Due to the way nodes in a binary search tree are ordered, an in-order traversal (left node, then root node, then right node) will always produce a sequence of values in increasing numerical order.
Binary search trees are called “search trees” because they make searching for a certain value more efficient than in an unordered tree. In an ideal binary search tree, we do not have to visit every node when searching for a particular value.
Here is how we search in a binary search tree:
Begin at the tree's root node
If the value is smaller than the current node, move left
If the value is larger than the current node, move right
Repeat this process until the value is found or there are no more nodes to visit
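The steps above can be sketched as a short iterative function (the Node class and function name are illustrative):

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def bst_search(root, target):
    # Walk down from the root, moving left or right at each node,
    # until the value is found or we fall off the tree.
    current = root
    while current is not None:
        if target == current.value:
            return current
        elif target < current.value:
            current = current.left
        else:
            current = current.right
    return None

root = Node(8)
root.left = Node(3)
root.right = Node(10)
```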
New nodes in a binary search tree are always added at a leaf position. Performing a search can easily find the position for a new node.
When removing from a binary search tree, we are concerned with keeping the rest of the tree in the correct order. This means removing is different depending on whether the node we are removing has children. There are three cases:
If the node being removed is a leaf, it can simply be deleted.
If the node has a single child (left or right), we must move the child into the position of the node when deleting it.
If the node has two children, we must first find the In-Order Predecessor (IOP): the largest node in our node's left subtree. The IOP never has a right child and can be found by starting at the left subtree's root and moving right as far as possible. We can then swap the node being removed with its IOP and delete it from its new position.
Depending on the values contained in a binary search tree, and the order in which they are added, the performance of a BST’s operations can vary. This performance depends on the shape of the tree and the number of nodes it contains.
The worst case of a binary search tree is one that has its values added in numerical order. This structure then doesn't resemble a tree - it looks like a linked list! As potentially every node has to be visited when searching, the worst case BST has a run time of O(n) for all operations utilizing find.
Unlike linear data structures (arrays, linked lists, queues, stacks, etc.), which have only one logical way to traverse them, trees can be traversed in different ways. The following are the generally used ways of traversing trees.
In an ideal case, a binary search tree has a similar number of nodes in its right and left subtrees. Since you have to visit fewer nodes when searching in an ideal BST, this case has a run time of O(lg(n)) for all operations that utilize find, including search, insert, and remove.
Depth First Traversals (for the example tree with root 1, whose children are 2 and 3, and where node 2 has children 4 and 5):
(a) Inorder (Left, Root, Right): 4 2 5 1 3
(b) Preorder (Root, Left, Right): 1 2 4 5 3
(c) Postorder (Left, Right, Root): 4 5 2 3 1
Breadth First or Level Order Traversal: 1 2 3 4 5

Uses of Inorder: In the case of binary search trees (BST), inorder traversal gives the nodes in non-decreasing order. To get the nodes of a BST in non-increasing order, a variation of inorder traversal in which the traversal is reversed can be used. Example: the inorder traversal for the tree above is 4 2 5 1 3.

Uses of Preorder: Preorder traversal is used to create a copy of the tree. Preorder traversal is also used to get the prefix expression of an expression tree. Example: the preorder traversal for the tree above is 1 2 4 5 3.

Uses of Postorder: Postorder traversal is used to delete the tree. Postorder traversal is also useful for getting the postfix expression of an expression tree. Example: the postorder traversal for the tree above is 4 5 2 3 1.
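The three depth-first traversals can be sketched recursively; this builds the five-node example tree by hand (root 1, children 2 and 3, with 4 and 5 as children of 2) and prints each traversal order:

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def inorder(node, out):
    # Left, Root, Right
    if node:
        inorder(node.left, out)
        out.append(node.value)
        inorder(node.right, out)
    return out

def preorder(node, out):
    # Root, Left, Right
    if node:
        out.append(node.value)
        preorder(node.left, out)
        preorder(node.right, out)
    return out

def postorder(node, out):
    # Left, Right, Root
    if node:
        postorder(node.left, out)
        postorder(node.right, out)
        out.append(node.value)
    return out

# Build the example tree: 1 with children 2 and 3; 2 with children 4 and 5.
root = Node(1)
root.left = Node(2)
root.right = Node(3)
root.left.left = Node(4)
root.left.right = Node(5)

print(inorder(root, []))    # [4, 2, 5, 1, 3]
print(preorder(root, []))   # [1, 2, 4, 5, 3]
print(postorder(root, []))  # [4, 5, 2, 3, 1]
```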