In computing, an uninitialized variable is a variable that is declared but is not set to a definite known value before it is used. It will have some value, but not a predictable one. As such, it is a programming error and a common source of bugs in software.
A common assumption made by novice programmers is that all variables are set to a known value, such as zero, when they are declared. While this is true for many languages, it is not true for all of them, and so the potential for error is there. Languages such as C use stack space for variables, and the collection of variables allocated for a subroutine is known as a stack frame. While the computer will set aside the appropriate amount of space for the stack frame, it usually does so simply by adjusting the value of the stack pointer, and does not set the memory itself to any new state (typically out of efficiency concerns). Therefore, whatever that memory contains at the time appears as the initial values of the variables that occupy those addresses.
Here's a simple example in C:
    #include <stdio.h>

    void count(void)
    {
        int k;  /* declared but never given an initial value */
        for (int i = 0; i < 10; i++) {
            k = k + 1;
        }
        printf("%d", k);
    }
The final value of k is undefined. The answer that it must be 10 assumes that it started at zero, which may or may not be true. Note that in the example, the variable i is initialized to zero by the first clause of the for statement.
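For comparison, a minimal sketch of a corrected variant gives k a definite starting value. The name count_fixed and the choice to return the result instead of printing it are assumptions made for illustration, not part of the original example.

```c
#include <stdio.h>

/* Hypothetical corrected variant of count(): k starts at a known value. */
int count_fixed(void)
{
    int k = 0;  /* explicit initialization removes the indeterminate start */
    for (int i = 0; i < 10; i++) {
        k = k + 1;
    }
    return k;
}
```

With the initializer in place, the loop adds 1 ten times to a known zero, so the result no longer depends on whatever happened to be on the stack.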
Another example arises when dealing with structs. In the code snippet below, struct student contains some variables describing a student. The function register_student leaks memory contents because it fails to fully initialize the members of struct student new_student. On closer inspection, age, semester and student_number are initialized, but the initialization of the first_name and last_name members is incomplete. If the strings copied into the first_name and last_name character arrays are shorter than 16 bytes (including the terminating null byte), then during the strcpy,[1] the entire 16 bytes of memory reserved for each of these members is not fully written. Hence, after memcpy()'ing the resulting struct to output,[2] some stack memory is leaked to the caller.
    #include <stdio.h>
    #include <string.h>

    struct student {
        unsigned int age;
        unsigned int semester;
        char first_name[16];
        char last_name[16];
        unsigned int student_number;
    };

    /* Assumed to be defined elsewhere; the original example does not show it. */
    unsigned int get_new_student_number(void);

    int register_student(struct student *output, int age, char *first_name, char *last_name)
    {
        // If any of these pointers are NULL, we fail.
        if (!output || !first_name || !last_name) {
            printf("Error!\n");
            return -1;
        }

        // We make sure the length of the strings is less than 16 bytes
        // (including the null byte) in order to avoid overflows.
        if (strlen(first_name) > 15 || strlen(last_name) > 15) {
            printf("first_name and last_name cannot be longer than 15 characters!\n");
            return -1;
        }

        // Initializing the members (the tails of the name arrays stay uninitialized)
        struct student new_student;
        new_student.age = age;
        new_student.semester = 1;
        new_student.student_number = get_new_student_number();

        strcpy(new_student.first_name, first_name);
        strcpy(new_student.last_name, last_name);

        // Copying the result to output
        memcpy(output, &new_student, sizeof(struct student));
        return 0;
    }
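One possible way to close this leak is to clear the whole local struct before filling it in. The following is a sketch under stated assumptions rather than a definitive fix: the name register_student_zeroed, the const-qualified and unsigned parameters, and the stand-in body of get_new_student_number are all hypothetical and are not taken from the original example.

```c
#include <stdio.h>
#include <string.h>

struct student {
    unsigned int age;
    unsigned int semester;
    char first_name[16];
    char last_name[16];
    unsigned int student_number;
};

/* Hypothetical stand-in for the helper used in the example above. */
static unsigned int get_new_student_number(void)
{
    static unsigned int next = 1000;
    return next++;
}

/* Hypothetical variant of register_student() that zeroes the struct first. */
int register_student_zeroed(struct student *output, unsigned int age,
                            const char *first_name, const char *last_name)
{
    if (!output || !first_name || !last_name) {
        return -1;
    }
    /* Each name must fit in 16 bytes including the terminating null byte. */
    if (strlen(first_name) > 15 || strlen(last_name) > 15) {
        return -1;
    }

    struct student new_student;
    /* Zero every byte so the unused tail of each name array holds zeros. */
    memset(&new_student, 0, sizeof new_student);

    new_student.age = age;
    new_student.semester = 1;
    new_student.student_number = get_new_student_number();
    strcpy(new_student.first_name, first_name);
    strcpy(new_student.last_name, last_name);

    memcpy(output, &new_student, sizeof new_student);
    return 0;
}
```

Because the struct is cleared before any member is written, the bytes that strcpy does not touch contain zeros instead of leftover stack contents when the struct is copied to the caller.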
In any case, even when a variable is implicitly initialized to a default value like 0, this is typically not the correct value. Initialized does not mean correct if the value is a default one. (However, default initialization to 0 is good practice for pointers and arrays of pointers, since it makes them invalid before they are actually initialized to their correct value.) In C, variables with static storage duration that are not initialized explicitly are initialized to zero (or null, for pointers).[3]
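The rule for static storage duration in C can be illustrated with a short sketch. All names here (call_count, totals, last_message, next_call_number, statics_are_zero) are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Objects with static storage duration and no explicit initializer. */
static int call_count;            /* starts at 0 */
static double totals[4];          /* every element starts at 0.0 */
static const char *last_message;  /* starts as a null pointer */

/* Returns 1 if the never-modified statics still hold their zero values. */
int statics_are_zero(void)
{
    return totals[0] == 0.0 && totals[3] == 0.0 && last_message == NULL;
}

/* Counts calls, relying on call_count having started at zero. */
int next_call_number(void)
{
    call_count = call_count + 1;
    return call_count;
}
```

The same declarations written as local variables without the static keyword would have automatic storage duration and would start with indeterminate values.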
Not only are uninitialized variables a frequent cause of bugs, but this kind of bug is particularly serious because it may not be reproducible: for instance, a variable may remain uninitialized only in some branch of the program. In some cases, programs with uninitialized variables may even pass software tests.
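The branch-dependent case can be sketched as follows. The function shipping_cost, its region codes, and the sentinel value -1 are hypothetical and serve only to illustrate the pattern.

```c
/* Hypothetical example: returns a shipping cost in cents for a region code. */
int shipping_cost(char region)
{
    /* Without this initializer, any region other than 'A' or 'B' would
       leave cost indeterminate, yet tests that only exercise 'A' and 'B'
       would still pass. */
    int cost = -1;  /* -1 marks an unknown region */

    if (region == 'A') {
        cost = 500;
    } else if (region == 'B') {
        cost = 900;
    }
    return cost;
}
```

Giving the variable a value at its declaration means every path through the function returns something defined, including the paths the tests never reach.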
Uninitialized variables are powerful bugs since they can be exploited to leak arbitrary memory, to achieve arbitrary memory overwrite, or to gain code execution, depending on the case. When exploiting software that uses address space layout randomization (ASLR), it is often necessary to know the base address of the software in memory. Exploiting an uninitialized variable to force the software to leak a pointer from its address space can be used to bypass ASLR.
Uninitialized variables are a particular problem in languages such as assembly language, C, and C++, which were designed for systems programming. The development of these languages involved a design philosophy in which conflicts between performance and safety were generally resolved in favor of performance. The programmer was given the burden of being aware of dangerous issues such as uninitialized variables.
In other languages, variables are often initialized to known values when created. Examples include:
- Python initializes local variables to NULL (distinct from None) and raises an UnboundLocalError when such a variable is accessed before being (re)initialized to a valid value.

Even in languages where uninitialized variables are allowed, many compilers will attempt to identify the use of uninitialized variables and report them as compile-time errors. Some languages assist this task by offering constructs to handle the initializedness of variables; for example, C# has a special flavour of call-by-reference parameters to subroutines (specified as out instead of the usual ref), asserting that the variable is allowed to be uninitialized on entry but will be initialized afterwards.