Collections and Data Structures

Data Structures scaled e1693816535598

IT Tips & Insights: Learn all about data structures, and when to use different classes of collections.

By Fernando Neto, Software Engineer

When working with data, we can use several different structures so that the data can be treated more efficiently and manipulated as a collection.

One of the most basic types of structures we can use is the System.Array type or the classes from the System.Collections and System.Collections.Generic namespaces to add, remove, and modify individual elements or a range of elements in a collection.

All collections using .Net provide methods to add, remove or locate items in the collection. However, there are collections that work like a dictionary, so we have a unique key to access the data, unlike other collections such as lists, queues and stacks.

In general, you should use generic collections when handling large amounts of data. The following table describes some common collection scenarios and the collection classes you can use for these scenarios.

Need Collection
Store items as key/value pairs for quick lookup by key Dictionary<TKey,TValue>
Access Items by Index List<T>
First in, First Out (FIFO) Queue<T>
Last in, First Out (LIFO) Stack<T>
Access items in sequence LinkedList<T>
Receive notifications when items are removed from or added to the collection. ObservableCollection<T>
Classified collection SortedList<TKey,TValue>
A set of math functions HashSet<T>

SortedSet<T>

It is important to understand how each data structure works for each need. 

Collections and Data Structure Complexity

When choosing a collection type, we need to consider potential performance tradeoffs. Below is an analysis of the complexity of using each data structure:

Action Complexity Worst Case
Stack<T>.Push O(1) O(n)
Queue<T>.Enqueue O(1) O(n)
List<T>.Add O(1) O(n)
List<T>.Item[Int32] O(1) O(1)
List<T>.Enumerator O(n) O(n)
Search on List<T> O(1) O(n)
SortedSet<T>.Add O(log n) O(n)
Dictionary<T>.Add O(1) O(n)
Search on Dictionary<T> O(1) O(1)
SortedDictionary<T>.Add O(log n) O(n log n)

As you can see from the list above, we have different complexities for different actions. Some collections are specific, like stacks and queues, or linked lists. However, when we talk about collections as a list of data, we can treat this data as a List or Dictionaries. 

When dealing with a list of users, we can create a List or a Dictionary where the Dictionary Key is the user’s ID. This will greatly reduce the complexity of returning a user, rather than using a List.

When working with data, we need to be aware of the complexity of accessing information. We often use framework functions that can perform very long queries on lists, causing the application to not perform as expected.

Benchmark of Collections

Six tests were created with the following types of collections: List, HashSet and Dictionary. The tests and their results are described below:

Test 1 – Add 1.000.000 value types.

Results:

————————————————

HashSet.Add : 48 Elapsed Milliseconds

List.Add : 14 Elapsed Milliseconds

Dictionary.Add : 48 Elapsed Milliseconds

Dictionary[n] = n : 42 Elapsed Milliseconds

Test 2 – Add 1.000.000 reference types.

Results:

————————————————

HashSet.Add : 268 Elapsed Milliseconds

List.Add : 23 Elapsed Milliseconds

Dictionary.Add : 208 Elapsed Milliseconds

Dictionary[n] = n : 150 Elapsed Milliseconds

Test 3 – Run Contains() on half of 10000 value types.

Results:

————————————————

HashSet.Contains : 0 Elapsed Milliseconds

List.Contains : 8 Elapsed Milliseconds

Dictionary.ContainsKey : 0 Elapsed Milliseconds

Dictionary.ContainsValue : 48 Elapsed Milliseconds

Test 4 – Run Contains() on half of 10000 reference types.

Results:

————————————————

HashSet.Contains : 0 Elapsed Milliseconds

List.Contains : 126 Elapsed Milliseconds

Dictionary.ContainsKey : 0 Elapsed Milliseconds

Dictionary.ContainsValue : 207 Elapsed Milliseconds

Test 5 – Remove half of 10000 value types.

Results:

————————————————

HashSet.Remove : 0 Elapsed Milliseconds

List.Remove : 5 Elapsed Milliseconds

Dictionary.Remove : 0 Elapsed Milliseconds

Test 6 – Remove half of 10000 reference types.

Results:

————————————————

HashSet.Remove : 0 Elapsed Milliseconds

List.Remove : 68 Elapsed Milliseconds

Dictionary.Remove : 0 Elapsed Milliseconds

These tests were created using .Net 6.

The HashSet and Dictionary types are, in general, better performing than List, in which the faster speed adding new items is greatly offset by deficits in other common operations. However, it’s important to remember that based on your use case, the type of collection you should use normally picks itself. Use a list if you just need a List to keep track of items. Use a Dictionary if you require hash lookup against some value. Use a hash set if you need to perform set operations.

References

Collection and data structure

https://learn.microsoft.com/en-us/dotnet/standard/collections/

Benchmark unit test

https://theburningmonk.com/2011/03/hashset-vs-list-vs-dictionary/

*The tests are disponible on the link above, I created them using .Net 6.

About

 

Hello, my name is Fernando. I’m from Sao Paulo, Brazil. I have been working with .Net since 2006, with all versions and technologies, since version 2.0 of Framework. Currently I work as a Software Engineer at Softensity, always working with best practices and the latest technologies.