Algorithms & Data Structures Day #2

Sorry for taking the time, I know I said I’ll post a day-after-day guide to algorithms and data structures but you’ll have to excuse me, I’ve been ill.

So on day #1 we talked about linked lists. If you don’t remember what linked lists are you might want to go back to previous post and go over it before diving into today’s guide since we’re going to talk about the stack and how to implement it using linked lists.

First things first:

In computer science, a stack is an abstract data type that serves as a collection of elements, with two principal operations: push, which adds an element to the collection, and pop, which removes the most recently added element that was not yet removed.

But what is so special about the stack? what differentiates the stack from other data types?  The stack has a special way to store and organize data. It follows a simple “rule” called LIFO  (Last IFirst Out). This means that the item most recently added to the stack will be the first handed back from the collection.

I find that the easiest way to understand what is the stack and how it works is by visualizing a pile of pancakes on a plate one on top of the other:

original_-buttermilk_pancakes

Let’s just say, for argue’s sake that I eat pancakes. And let’s just say that if I was served a plate like this I would eat the pancakes one by one. I were to eat them as I said I would have to start with the top one, otherwise the entire pile will ruin and all of these beautiful diabetes-causing treats would become a diabetes-causing mess! What I mean is, that the last pancake put on top of the pile is the first one I’d have to eat and the same goes with stacks (no I did not start a data-structure diet where I only eat ints, I need my beef and veggies too!) the last item on the stack, aka the top of the stack will be the first to go.

stack

The action of adding another item to the stack is called “push” and the action of taking out the top item from the stack is called “pop”. So adding more pancakes to the pile can be done with “push”ing pancakes to the plate, and eating the top one is done by “pop”ping the top one. pretty simple isn’t it?

On the previous post we’ve learned how a linked list is a series of nodes linked together and do a chain – we can use this type of structure to help build our stack. The stack would contain a linked list who’s head node pointed to the most recently added (top) item.

Let’s begin with creating a linked list:

typedefnode

Pushing to the list is basically just adding a new node to the beginning of the list. In order to do that we first need to create a new item, link the item to point to the head of our list and then set the head of the list to be our new item. This way, we will create a new head to the list with a new value and keep the rest of the list linked to it!

I am going to use a “pop” function and since we’re using C here, we’re gonna have to pass a pointer to the pointer variable (double pointer yayyy) so we could modify the pointer itself:

screenshot-from-2017-03-01-124530

Using this function we can always add a new item to the stack. Now how about some popping???

dreamstime_xl_27975470-custom

Funny isn’t it?

In order to remove the last item and pop it from the stack we need to do exactly the opposite! I’ll explain; First we need to take the next item that the head is pointing to and save it, then “free” the head item and then all that’s left is to set the head to be the next item we’ve stored on the side.

screenshot-from-2017-03-01-125305

That’s pretty much it.

But why was I using a linked list and not an array for instance?

The main advantage a linked list has over an array is that it has no size limit and on modern days computers we can add billions and trillions of nodes without having to think about size and memory consumption (ALWAYS THINK ABOUT MEMORY CONSUMPTION NO MATTER WHAT YOU DO!). Also, it’s a very straight forward way to implement a stack – an array for example would require bounds checking to make sure that a “push” didn’t cause the item to exceed the array’s bounds. However, there are some downsides to this approach, and many of them has to do with performance and scalability; like whenever a new item is being pushed to the stack, a memory allocation occurs every time. This can end up causing undesirable performance characteristics in high performance systems.

 

Hope you found this article to be educating 😉

 

Ar3p0

Algorithms & Data Structures Day #1

Any of you who ever wish to succeed a job interview for any tech position (mainly developer and security researcher positions though) MUST master the art of algorithms and data structures. Actually, it’s a very important skill to have if you want to get serious with understanding computer programs for any reason and not just for the sake of finding a job.

So I’ve decided to share with you a quick 10 days walk-through into the world of algorithms and data structures – what do they mean, when to use what and how to implement them in the C programming language (I might add some x86 later on).

First we’ll start with two important definitions with the generosity of Wikipedia:

Data Structure-

a particular way of organizing data in a computer so that it can be used efficiently.

Algorithm-

a self-contained sequence of actions to be performed. Algorithms perform calculation, data processing, and/or automated reasoning tasks.

Now that we know what algorithms and data structures basically are, we can go on and start exploring the various common algorithms out there.

Today we will start with the most common and basic data structure called a “Linked List“.

In computer science, a linked list is a linear collection of data elements, called nodes, each pointing to the next node by means of a pointer. It is a data structure consisting of a group of nodes which together represent a sequence.

In order to create and explain a linked list we first need to define a node – the most basic building block for many common data structures.  The node has two main functions;

  1. Provides a mechanism to hold a piece of data.
  2. Provides a mean to connect itself to other nodes and thus form a “chain”.

The node allows connecting itself to other nodes using an object reference pointer (the “next” pointer) that points at the address of the next node.

Let’s say we have a first node holding the integer value of 5 and a second node holding the integer value of 6. The first node’s object reference pointer will point on the second node and will form a chain:

linkedlist1

We can keep adding more and more nodes to the chain using this process till the last member of the list that will contain an integer (in this case) and instead of an object reference pointer that points on the next node it will contain “null”.

Now we know how does a link list basically looks like but how can we implement it in code?

First we’re gonna need to define a struct in a recursive manner:

typedefnode

Now we can use the nodes! Hooray! Let’s create a variable to point at the first member of our linked list and call it samson (just because I can!):

node1

It is good practice to check whether malloc returned NULL or not, do it every time you use it! Adding items to the list is super easy and can be done by allocating memory for the second node and just keep advancing to the next pointer!

screenshot-from-2017-02-26-140236

Isn’t linking nodes fun? NOT REALLY! But it’s a must 🙂

As we’ve seen in the example, linked lists are sets of dynamically allocated nodes and as such they have very nice advantages over arrays for example. Items can be added or removed from the middle of the list and there is no need to define an initial size. However there are some drawbacks  – dynamic allocation for instance is a good place for bugs since pointers are used and the code gets much more complicated which might increase the risk of memory leaks and segmentation faults. Another drawback regarding dynamic memory allocation is that this action is less efficient in memory and the nodes are larger than array members since they hold both the value and another pointer. Oh and searching through a linked list is hell since there is no random access. For instance, if you want to find the N member of the list, you must iterate through the entire list till you find it, and with a simple “printList” function I’ll demonstrate how the iteration is done:

screenshot-from-2017-02-26-141751

As you can see, I used a “curr” pointer to keep track of the node currently being printed. After printing out the value of the current node we set curr to point to the next member of the list till we reach the last one which is being recognized with “NULL”.

ADD/REMOVE:

Adding items to the end of the list is very easy. we use the same logic we used in the previous example to iterate through the list but when we get to the end of the list using the NULL identifier, we then set it to point to another item!

screenshot-from-2017-02-26-142632

Removing the last item of the list is quite tricky, since there are more things we need to do, like checking if the list contains only one member or more first:

screenshot-from-2017-02-26-143323

“Wait a minute!!! You told us all about adding and removing items from the end of the list! what is I want to add or remove from the beginning of the list or from a specific location in the list???”

And you’re right! Unfortunately I am not going to explain how it is done just yet, we will talk about it tomorrow though, when I’ll explain all about the stack – how items are pushed to and popped from it and how this LIFO (Last In First Out) thing can be implemented in linked lists.

I am in a cheerful kind of mood though so I will tell you briefly how one can remove a specific item from the list.

I’m pretty sure you get the idea of dealing with linked lists, so you shall understand perfectly fine the following steps:

  1. Iterate through the list till you reach the node before the one you wish to remove.
  2. Save the soon-to-be-deleted node in a temporary pointer.
  3. Set the previous node’s “next” pointer to point to the node after the node-to-be-deleted (in that way we’re skipping the deleted node).
  4. Delete the node using the temporary pointer.
  5. Free the memory for the temporary pointer.

screenshot-from-2017-02-26-145156

(I know, I didn’t check for the case n=0, it requires the use of a pop() function which will be explained tomorrow)

And now a little bit of extra!:

What we’ve seen today is called a “singly linked list” since each node is linked directly to only one other node which works great when you only need forward access to the node. Sometimes you might want to be able to enumerate both forward and backwards and in these cases you will use a little something called a “doubly linked list“.

In a doubly linked list, each member points not only to the next member of the list but also the previews member, and as you’ve guessed this is being done by adding another pointer to the node_t struct.

screenshot-from-2017-02-26-150409

Now that you get the logic, you know how to implement it using C code. 🙂

Good luck!

Ar3p0