Which Data structure to choose ? – Data Structures selection and performance comparison

Data structures are important and it provides a easy and efficient way to organize the data in any function/program.

Choosing the correct data structure is important and the performance of the program is directly proportional to the data structure used.

Proper analysis needs to be done to pick the correct data structure.

Example : One data structure might be good in accessing the data in a order but might not be good in accessing a random data.

Important key factor/question one need to ask in deciding the deciding the data structure is “How the data is going to be used ?”

Big O is a notation which gives a nice representation of how many steps it will take to complete a task of fetching the data or the algorithm used to fetch the data.

If there is a function say

int GetData(int x) { return 2+x;}

this function has one step to complete and it is represented in Big O as O(1).

Now if the function is like this :

int GetData(int x)

{

int result = 1;

for(int i=0,i

{

result =result*i;

}

return result;

}

In the above function the result is determined by the line result = result*i and it is inside a loop.

Since this loop will be executed n number of times in Big O this will represented as O(n) and this is called Linear time.

Following table shows the speed comparison of each data structure represented in Big O Notation.

 Data Structure

Adding data to end

Removing data  
from end

Inserting data in the middle

Removing
data from the middle

Random Data Access

In-order  Data
Access

Searching for specific data element

Comments

Array

O(n)

O(n)

O(n)

O(n)

O(1)

O(1)

O(n)

Most efficient use of memory; use in cases where data size is fixed.

List<T>

best case O(1);

worst case O(n)

O(1)

O(n)

O(n)

O(1)

O(1)

O(n)

Implementation is optimized
for speed. In many cases, List
will be the best choice.

Collection<T>

best case O(1);

worst case O(n)

O(1)

O(n)

O(n)

O(1)

O(1)

O(n)

List is a better choice, unless
publicly exposed as API.

LinkedList

O(1)

O(1)

O(1)

O(1)

O(n)

O(1)

O(n)

Many operations are fast,
but watch out for cache
coherency.

Stack<T>

best case O(1);

worst case O(n)

O(1)

N/A

N/A

N/A

N/A

N/A

Shouldn’t be selected for
performance reasons, but
algorithmic ones.

Queue<T>

best case O(1);

worst case O(n)

O(1)

N/A

N/A

N/A

N/A

N/A

Shouldn’t be selected for
performance reasons, but
algorithmic ones.

Dictionary<K,T>

best case O(1);

worst case O(n)

O(1)

best case O(1); worst case O(n)

O(1)

O(1)*

O(1)*

O(1)

Although in-order access time
is constant time, it is usually slower than other structures due to the over-head of looking up the key.

Source : MSDN

Selecting a Collection Class – An MSDN article with rules of thumb for data structure selection

Advertisements