Algorithms & Data Structures Day #1

Any of you who ever wish to succeed a job interview for any tech position (mainly developer and security researcher positions though) MUST master the art of algorithms and data structures. Actually, it’s a very important skill to have if you want to get serious with understanding computer programs for any reason and not just for the sake of finding a job.

So I’ve decided to share with you a quick 10 days walk-through into the world of algorithms and data structures – what do they mean, when to use what and how to implement them in the C programming language (I might add some x86 later on).

First we’ll start with two important definitions with the generosity of Wikipedia:

Data Structure-

a particular way of organizing data in a computer so that it can be used efficiently.

Algorithm-

a self-contained sequence of actions to be performed. Algorithms perform calculation, data processing, and/or automated reasoning tasks.

Now that we know what algorithms and data structures basically are, we can go on and start exploring the various common algorithms out there.

Today we will start with the most common and basic data structure called a “Linked List“.

In computer science, a linked list is a linear collection of data elements, called nodes, each pointing to the next node by means of a pointer. It is a data structure consisting of a group of nodes which together represent a sequence.

In order to create and explain a linked list we first need to define a node – the most basic building block for many common data structures.  The node has two main functions;

  1. Provides a mechanism to hold a piece of data.
  2. Provides a mean to connect itself to other nodes and thus form a “chain”.

The node allows connecting itself to other nodes using an object reference pointer (the “next” pointer) that points at the address of the next node.

Let’s say we have a first node holding the integer value of 5 and a second node holding the integer value of 6. The first node’s object reference pointer will point on the second node and will form a chain:

linkedlist1

We can keep adding more and more nodes to the chain using this process till the last member of the list that will contain an integer (in this case) and instead of an object reference pointer that points on the next node it will contain “null”.

Now we know how does a link list basically looks like but how can we implement it in code?

First we’re gonna need to define a struct in a recursive manner:

typedefnode

Now we can use the nodes! Hooray! Let’s create a variable to point at the first member of our linked list and call it samson (just because I can!):

node1

It is good practice to check whether malloc returned NULL or not, do it every time you use it! Adding items to the list is super easy and can be done by allocating memory for the second node and just keep advancing to the next pointer!

screenshot-from-2017-02-26-140236

Isn’t linking nodes fun? NOT REALLY! But it’s a must 🙂

As we’ve seen in the example, linked lists are sets of dynamically allocated nodes and as such they have very nice advantages over arrays for example. Items can be added or removed from the middle of the list and there is no need to define an initial size. However there are some drawbacks  – dynamic allocation for instance is a good place for bugs since pointers are used and the code gets much more complicated which might increase the risk of memory leaks and segmentation faults. Another drawback regarding dynamic memory allocation is that this action is less efficient in memory and the nodes are larger than array members since they hold both the value and another pointer. Oh and searching through a linked list is hell since there is no random access. For instance, if you want to find the N member of the list, you must iterate through the entire list till you find it, and with a simple “printList” function I’ll demonstrate how the iteration is done:

screenshot-from-2017-02-26-141751

As you can see, I used a “curr” pointer to keep track of the node currently being printed. After printing out the value of the current node we set curr to point to the next member of the list till we reach the last one which is being recognized with “NULL”.

ADD/REMOVE:

Adding items to the end of the list is very easy. we use the same logic we used in the previous example to iterate through the list but when we get to the end of the list using the NULL identifier, we then set it to point to another item!

screenshot-from-2017-02-26-142632

Removing the last item of the list is quite tricky, since there are more things we need to do, like checking if the list contains only one member or more first:

screenshot-from-2017-02-26-143323

“Wait a minute!!! You told us all about adding and removing items from the end of the list! what is I want to add or remove from the beginning of the list or from a specific location in the list???”

And you’re right! Unfortunately I am not going to explain how it is done just yet, we will talk about it tomorrow though, when I’ll explain all about the stack – how items are pushed to and popped from it and how this LIFO (Last In First Out) thing can be implemented in linked lists.

I am in a cheerful kind of mood though so I will tell you briefly how one can remove a specific item from the list.

I’m pretty sure you get the idea of dealing with linked lists, so you shall understand perfectly fine the following steps:

  1. Iterate through the list till you reach the node before the one you wish to remove.
  2. Save the soon-to-be-deleted node in a temporary pointer.
  3. Set the previous node’s “next” pointer to point to the node after the node-to-be-deleted (in that way we’re skipping the deleted node).
  4. Delete the node using the temporary pointer.
  5. Free the memory for the temporary pointer.

screenshot-from-2017-02-26-145156

(I know, I didn’t check for the case n=0, it requires the use of a pop() function which will be explained tomorrow)

And now a little bit of extra!:

What we’ve seen today is called a “singly linked list” since each node is linked directly to only one other node which works great when you only need forward access to the node. Sometimes you might want to be able to enumerate both forward and backwards and in these cases you will use a little something called a “doubly linked list“.

In a doubly linked list, each member points not only to the next member of the list but also the previews member, and as you’ve guessed this is being done by adding another pointer to the node_t struct.

screenshot-from-2017-02-26-150409

Now that you get the logic, you know how to implement it using C code. 🙂

Good luck!

Ar3p0

Advertisements

One thought on “Algorithms & Data Structures Day #1

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s