Arrays in C#

In the previous articles of this series, primitive data types were used in different C# programs to store and manage data inside the computer memory (RAM).

This was done by declaring variables—specific memory areas whose size in bits depends on the associated type—that are available and can be accessed during the execution of the program through the corresponding variable identifier. Each of those variables can store a single value.

An array is a data structure that allows the allocation of a sequence of elements of the same data type. When an array is created, a set of contiguous memory locations is allocated inside the computer's memory. Each location stores an array element that can be accessed via a corresponding integer index. An array in C# can be declared and allocated using the syntax shown below.

int n = 10; // number of elements of the array (10 is an example value)
double[] values = new double[n];

The above instructions allocate an array called values, made of n elements of type double, in the computer's memory. When a new array is created, its elements are initialised to the default value for their type, for instance, 0.0d for double. Once allocated, the size of the array is fixed and cannot be modified. A possible graphical representation of an array of n elements is shown below. Please note that, to simplify the figure, 0 is used as the default value of the elements (instead of 0.0d, as it should be for a double type).

[Figure arrays.png: an array of n elements, indexed from 0 to n-1, each initialised to 0]

As the figure suggests, all the elements of the array are allocated in a contiguous area of memory. The first element is associated with index 0 and the last element with index n-1.
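The default initialisation described above can be checked with a short sketch. The size 4 and the variable names first and last are chosen only for illustration.

```csharp
using System;

// Top-level statements (C# 9 or later); in older versions, place this inside Main.
int n = 4; // example size, chosen only for illustration
double[] values = new double[n];

// No value has been assigned yet, so every element holds the default for double.
double first = values[0];     // element at index 0
double last = values[n - 1];  // element at index n-1

Console.WriteLine(first); // prints 0
Console.WriteLine(last);  // prints 0
```

Console.WriteLine shows the double value 0.0 as 0, which matches the simplification used in the figure.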

Array Initialisation

Like variables of primitive data types, arrays can also be initialised when declared inside a program, as shown in the code snippet below.

double[] values = {0.1, 3.76, -1.23, 0.45};

The values assigned to the array elements are specified between curly braces, separated by commas. When the above approach is used to declare and initialise an array, it is not required to define the size of the array explicitly. The compiler infers it from the number of elements provided between the curly braces.

Note that once an array has been declared inside a program, the above notation cannot be used to assign new values to all the array's elements in a single instruction. However, as described in the next section, a value can be assigned to any individual array element.
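The restriction can be illustrated with the sketch below. It also shows a related form, new double[] { ... }, which is accepted in an assignment. Note that this form does not overwrite the elements of the existing array: it creates a new array and makes the identifier refer to it. The replacement values are arbitrary.

```csharp
using System;

// Top-level statements (C# 9 or later); in older versions, place this inside Main.
double[] values = { 0.1, 3.76, -1.23, 0.45 };

// values = { 1.0, 2.0 };            // would not compile: the shorthand is only valid in a declaration
values = new double[] { 1.0, 2.0 };  // allowed: values now refers to a new array of 2 elements

Console.WriteLine(values[0]); // prints 1
Console.WriteLine(values[1]); // prints 2
```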

Assigning Values to the Elements of an Array

By convention, given an array of n elements, its first element is accessed via the index 0, whereas the last one is accessed via the index n-1. The array identifier and the index are used together to assign a value to, or read a value from, the element located in a specific position.

// declare an array of 5 elements: first index 0, last index 4
double[] values = new double[5]; 

values[2] = 3.4; // value stored in position 2 (third)
values[0] = -1.1; // value stored in position 0 (first)
values[4] = -8.7; // value stored in position 4 (last)

When accessing the elements of an array of size n, it is essential to specify an index within the allocated range, i.e., from 0 to n-1. Using an index outside this range is an error condition referred to as out-of-bounds array access. Suppose the following instruction were added to the previous code snippet:

values[5] = 2.3;

Even though the code could be compiled without errors, its execution would generate an error condition at runtime, called an exception. Exception handling is a technique in which a programming construct is used to intercept and handle, in a consistent way, an error that occurs during the execution of an application. The .NET runtime is designed to use exception handling based on exception objects and protected blocks of code. This is one of the features offered by application virtual machines that run managed code.

In a C program (native code), when an out-of-bounds array access occurs, it can lead to undefined behaviour, causing the program to continue running with unpredictable outcomes, crash unexpectedly, produce incorrect results without immediate detection, or reveal issues at some undetermined point in the future.

However, in a C# program (managed code) running within the .NET managed environment, the scenario is different. Here, the runtime performs checks to prevent out-of-bounds array access, and any attempt to access an element beyond the array's bounds triggers a runtime exception known as IndexOutOfRangeException. The advantage lies in the ability to catch and handle these exceptions within the code, preventing program crashes and enabling graceful error recovery.
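As a minimal sketch of the behaviour described above, the out-of-bounds assignment can be placed inside a try block and the resulting IndexOutOfRangeException handled in a catch block. The caught flag and the printed message are illustrative choices, not part of the original example.

```csharp
using System;

// Top-level statements (C# 9 or later); in older versions, place this inside Main.
double[] values = new double[5]; // valid indexes: 0 to 4
bool caught = false;

try
{
    values[5] = 2.3; // index 5 is outside the allocated range
}
catch (IndexOutOfRangeException ex)
{
    // The runtime detected the invalid access; the program can recover here.
    caught = true;
    Console.WriteLine("Invalid index: " + ex.Message);
}

Console.WriteLine(caught); // prints True
```

Without the try and catch blocks, the same exception would be left unhandled and the program would terminate.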

It should be noted that in C#, the number of elements of an array can also be specified at runtime, as demonstrated by the following code.

int size = Convert.ToInt32(Console.ReadLine());
// content of size variable is used instead of a constant value
double[] values = new double[size];

In the above example, the compiler cannot know the value of size at compile time. Hence, as discussed above, in C# any checks on the size of an array and on the indexes used to access it are left to the .NET runtime.
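Because the size arrives at runtime, it can be worth validating it before the array is allocated; Convert.ToInt32, for example, throws a FormatException when the text is not a number. The sketch below uses int.TryParse instead. CreateArray is a hypothetical helper, and falling back to an empty array on invalid input is an assumption made for this example. String literals stand in for the text that would normally come from Console.ReadLine().

```csharp
using System;

// Top-level statements (C# 9 or later); in older versions, place this inside a class.

// Hypothetical helper: builds an array whose size is given as text.
static double[] CreateArray(string text)
{
    int size;
    if (!int.TryParse(text, out size) || size < 0)
    {
        size = 0; // assumption of this sketch: invalid input yields an empty array
    }
    return new double[size];
}

double[] fromValidInput = CreateArray("3");
double[] fromInvalidInput = CreateArray("abc");

Console.WriteLine(fromValidInput.Length);   // prints 3
Console.WriteLine(fromInvalidInput.Length); // prints 0
```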

Accessing the Elements of an Array

We have already seen how a specific element of an array can be accessed by writing the array identifier followed by an index enclosed in square brackets ([]). Assignment expressions can be written accordingly, where an array element can appear on the left-hand side of the assignment (lvalue), on the right-hand side (rvalue), or on both. This is shown in the following code snippet.

int[] values = {1, 7, 3, 5, 4};

values[3] = 18; // {1, 7, 3, 18, 4}
int x = values[1] + 3; // {1, 7, 3, 18, 4}, x = 10
values[2] = values[1] + x; // {1, 7, 17, 18, 4}

Using Loops with Arrays

Often, it is required to perform operations on different elements of an array. Whilst the above instructions provide access to specific elements, loops are the most widely used approach to iterate over the whole array.

It should be noted that an array, once created, provides a property—Length—that contains the number of elements of the array. This is accessible using the dot notation shown in the following code snippet.

int[] values = new int[12];
int size = values.Length; // size contains 12 

int[] values2 = { 1, 2, 3, 4, 5 };
int size2 = values2.Length; // size2 contains 5

The following code snippet shows how a for loop can be used to print the array's content on the screen by accessing each of its elements via the associated index.

int[] values = {3, 2, 10, 5, 1};

for (int i = 0; i < values.Length; i++)
{
    Console.WriteLine(values[i]);
}

The i counter is used inside the loop as the index of the values array. It is initialised to start from 0, i.e., the index of the first array element. i is then incremented by 1 at each iteration to refer to and access the remaining elements of the array. Moreover, the array property values.Length is utilised to define a relational expression that controls the loop's termination. When i contains the value 4, the last element of the array is accessed. The subsequent increment of i from 4 to 5 will make the loop condition false and cause the loop's termination.
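The same loop structure can be used to compute a result from the elements rather than print them. As an illustrative sketch, the code below accumulates the sum of the elements and tracks the largest one; the variable names sum and max are chosen for this example.

```csharp
using System;

// Top-level statements (C# 9 or later); in older versions, place this inside Main.
int[] values = { 3, 2, 10, 5, 1 };

int sum = 0;
int max = values[0]; // start from the first element

for (int i = 0; i < values.Length; i++)
{
    sum += values[i];        // add the current element to the running total
    if (values[i] > max)
    {
        max = values[i];     // remember the largest element seen so far
    }
}

Console.WriteLine(sum); // prints 21
Console.WriteLine(max); // prints 10
```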

When all the elements of an array need to be accessed, a foreach loop can be used to iterate over them. Unlike a for loop, a foreach loop does not maintain an explicit counter. It essentially says "do this to every element of this array" rather than "do this x times". The code snippet below shows an example of a foreach loop.

int[] values = {3, 2, 10, 5, 1};

foreach (int v in values)
{
    Console.WriteLine(v);
}

After the foreach keyword, a variable is declared in parentheses. Then, the in keyword is used before the identifier of the array (or, in general, of the collection) on which the loop will iterate. In the above example, because the elements of the array values are of type int, the variable v used inside the foreach is also an int.

At each iteration, v will contain one of the elements of the array values, starting from the first one, i.e., the value 3 stored in position with index 0. The loop terminates when the last element, i.e., the value 1, stored in position with index 4, is assigned to v. A programmer does not have to deal with indexes as the underlying access to the array's elements is abstracted and managed by the C# compiler.
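One consequence of this abstraction is that the iteration variable of a foreach loop is read-only, so the loop suits reading elements but not replacing them. The sketch below reads the elements with foreach and then uses an indexed for loop to double each one; the doubling is only an illustrative operation.

```csharp
using System;

// Top-level statements (C# 9 or later); in older versions, place this inside Main.
int[] values = { 3, 2, 10, 5, 1 };

int sum = 0;
foreach (int v in values)
{
    sum += v;      // reading through v is fine
    // v = v * 2;  // would not compile: the iteration variable cannot be assigned
}

// To change the elements themselves, an index is needed.
for (int i = 0; i < values.Length; i++)
{
    values[i] = values[i] * 2;
}

Console.WriteLine(sum);                       // prints 21
Console.WriteLine(string.Join(", ", values)); // prints 6, 4, 20, 10, 2
```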