1 An introduction to C – Computational Technologies

Victor S. Borisov and Petr N. Vabishchevich

1 An introduction to C

Abstract: In this chapter we present a brief description of the basics of the C programming language, which is very popular in scientific and engineering computations. It is assumed that the reader is familiar with programming fundamentals.

1.1 Syntax

The basic syntactic constructions of a C program are described below to help the reader understand the code presented in other chapters of the book.

1.1.1 Characters

The following characters are used in C:

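  • – Upper and lower case Latin letters: A-Z and a-z;
  • – Decimal digits: 0-9;
  • – Punctuation marks and special characters, such as , . ; : ? ! ' " ( ) [ ] { } < > + - * / % & | ^ ~ = # _ \;
  • – Whitespace: space, tab, newline, and form feed.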

Special character combinations called escape sequences are used in C to represent characters that are difficult or impossible to type (see Table 1.1). An escape sequence always begins with the backslash \, which is followed by a letter, a punctuation character, or digits.

In addition, any character can be represented by its numeric code: \ddd, where ddd stands for one to three octal digits, or \xhh, where hh stands for hexadecimal digits.

Table 1.1. Escape sequences.

Sequence  Description
\a        Alert
\b        Backspace
\f        Form feed
\n        Newline (line feed)
\r        Carriage return
\t        Horizontal tab
\v        Vertical tab
\'        Single quote
\"        Double quote
\?        Question mark
\\        Backslash
\0        Null character (empty)

In standard C, non-Latin alphabets are not supported. For specific compilers, such an extension is possible. In particular, it is appropriate to employ non-Latin letters in comments.

1.1.2 Comments

Comments make program code easier to understand and are ignored by the compiler. A comment can be used anywhere a space or newline is allowed. Block comments are delimited by /* and */ and cannot be nested. A single-line comment (introduced in the C99 standard) begins with the characters // and extends to the end of the line.

Listing 1.1.

1.1.3 Identifiers

Identifiers are employed as names of variables, functions, and data types. They are created by a declaration of a variable, function, structure etc.

Upper and lower case letters, digits, and the underscore character _ can be used in identifiers. The first character cannot be a digit. There is no restriction on the length of identifiers, but only the first 31 characters are significant for most compilers.

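Keywords are reserved identifiers that have a predefined meaning in C. Keywords cannot be used as names. The C keywords are listed below (this is the C89 set; C99 adds inline, restrict, _Bool, _Complex, and _Imaginary):

auto, break, case, char, const, continue, default, do, double, else, enum, extern, float, for, goto, if, int, long, register, return, short, signed, sizeof, static, struct, switch, typedef, union, unsigned, void, volatile, while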

The use of an underscore as the first character of an identifier requires some caution. Such identifiers may conflict with the names of system functions and/or variables, and programs with them may have problems with portability.

1.2 Constants and variables

Here we introduce the basic data types that C uses. The syntax of constants and the amount of memory occupied by various types of constants are described below.

1.2.1 Data types

Programs work with data stored in computer memory. The storage and processing of various data types, such as integer and floating-point numbers, are implemented in various ways.

Data can have a prescribed value before a program runs and remain unchanged throughout execution. Such data are referred to as constants and are represented by literals in the code. A variable has a name, and its value can be changed. Thus, a variable is a named memory location that can be initialized with some value and modified in the course of calculations.

A compiler recognizes the type of a constant by the form in which it is written in a program. To use a variable, we have to declare it, specifying its name and its type, which determines how much memory is allocated. The basic data types in C are integers (keywords int, short, long, unsigned), characters (char), and floating-point numbers (float, double). Any variable can be declared as unchangeable by adding the const keyword; a new value cannot be assigned to such a variable.

1.2.2 Integer constants

The types short, int, and long are designed to represent integers. To declare integers, we must write only a type that is followed by a list of variable names.

Listing 1.2.

A comma is used to delimit identifiers.

Decimal integers are represented using the digits 0-9, but 0 cannot be the first digit. Octal integers start with 0 and are represented by the digits 0-7. Hexadecimal integers start with 0x or 0X and use the digits 0-9 and the letters a-f or A-F to represent the values 10-15.

If a value is too large for the int type, the long type can be used. A long integer constant is explicitly defined by the letter l or L after the constant. The short type is employed to save memory. Using unsigned int, unsigned long, and unsigned short, we can handle only non-negative integers. Constants, including octal and hexadecimal ones, can also be marked as unsigned; for this, the letter u or U is written after the constant.

The C language does not fix the exact range of values for the integer types; the standard prescribes only minimum ranges. Usually a short integer occupies half of a machine word, int occupies one word, and long corresponds to one or two words. For 32-bit machines, the word size is equal to 4 bytes (1 byte is equal to 8 bits), and int and long usually have the same size (see Table 1.2).

Table 1.2. Integer data types.

Type            Size (bytes)  Range of values
int             2             -32,768 to 32,767
                4             -2,147,483,648 to 2,147,483,647
unsigned int    2             0 to 65,535
                4             0 to 4,294,967,295
short           2             -32,768 to 32,767
unsigned short  2             0 to 65,535
long            4             -2,147,483,648 to 2,147,483,647
unsigned long   4             0 to 4,294,967,295

Table 1.3. Floating-point numbers.

The operator sizeof is used to determine the size of a type or variable in bytes.

Listing 1.3.

1.2.3 Floating-point numbers

The basic floating-point type in C is float. For double-precision computations, the double type is used, which typically employs twice as many bits to represent a number. If double is not enough, then we can apply long double.

Real constants are written with an integer part, a decimal point, a fractional part, the letter e or E, and an integer exponent with an optional sign. The integer and fractional parts are sequences of digits. Either the integer part or the fractional part may be absent, but not both. Likewise, either the decimal point or the exponent part (e or E with the exponent) may be omitted, but not both.

Floating-point numbers can be stored with only limited precision, which is defined by the binary format for the presentation of real numbers. The precision (see Table 1.3) is expressed by the number of significant digits (symbols). In this case, the position of a decimal point does not matter.

Complex numbers are supported starting with the standard C99.

1.2.4 Using characters

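A character variable is declared with the keyword char. The char type occupies 1 byte and stores a small integer, namely the code of a character. Whether plain char is signed (typically -128 to 127) or unsigned (0 to 255) depends on the implementation; unsigned char always covers at least the range from 0 to 255. A character constant consists of one character, or one escape sequence, between single quotes.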

Listing 1.4.

1.3 Expressions and operators

Operators define operations on data (operands) to be performed by a computer. The essentials of C operators are given below.

1.3.1 Arithmetic operators

Table 1.4 shows the fundamental arithmetic operators. A value-assigning operator is the simplest operator.

Table 1.4. Arithmetic operators.

Operator Operation
= Assignment
+ Addition
- Subtraction
* Multiplication
/ Division
% Modulo (integer remainder)
+= Addition with assignment
-= Subtraction with assignment
*= Multiplication with assignment
/= Division with assignment
++ Increment
-- Decrement

Listing 1.5.

Compound assignment operators are often employed to update variables; they shorten the code and make it easier to read. For instance, the statement n = n + 5 can be written as n += 5. Such a combination is also possible for the other arithmetic operators.

Listing 1.6.

In C, any fractional part resulting from integer division is discarded. The expression n % m yields the integer remainder of dividing n by m.

Listing 1.7.

The ++ operator increases the value of its operand by 1, whereas -- decreases its operand by 1. Thus, instead of n = n + 1, we can write ++n. The increment and decrement operators can precede an operand (prefix form) or follow it (postfix form). In the prefix form, the operand is changed before its value is used in the expression. In the postfix form, the original value of the operand is used in the expression, and the operand is changed afterwards.

1.3.2 Relational and logical operators

Relational operators are applied when the values of two operands are compared with each other. Logical operators provide formal logic operations. The results of relational operators often serve as operands of logical operators.

In the C language, various operators are employed to compare operands of any type (see Table 1.5). Logical operators are listed in Table 1.6.

The values true and false are used as operands and results of relational and logical operators. The true value is represented by any number other than 0, whereas false is equal to 0. The result of relational or logical operators is true (1) or false (0).

The operators >=, >, <=, and < have the same precedence; the equality and inequality operators == and != follow immediately behind them in the precedence order. The precedence of the operator && is higher than that of ||, and both are lower than the relational and equality operators. Parentheses are used to change the order in which relational and logical operations are evaluated.

Table 1.5. Relational operators.

Operator Operation
> Greater than
>= Greater than or equal to
< Less than
<= Less than or equal to
== Equal to
!= Not equal to

Table 1.6. Logical operators.

Operator Operation
&& AND
|| OR
! NOT

Listing 1.8.

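The arithmetic operators + and - have the same precedence, which is lower than the precedence of the operators *, /, and %. All these binary arithmetic operators have higher precedence than the relational operators and the logical operators && and ||; only the unary logical operator ! has higher precedence than the arithmetic ones.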

1.3.3 Type conversions

Statements and expressions should normally employ variables and constants of just one type. If operands of different types appear in an expression, type conversions occur. In an assignment, the value on the right-hand side is converted to the type of the variable on the left-hand side if this is possible. In mixed expressions, the operand of the narrower type is converted to the wider one: char to int, int to float, and float to double. Such a widening conversion does not increase the precision of the value; only its representation is changed. Conversely, the inverse conversion from double to float may lose precision.

Listing 1.9.

One value can be assigned to many variables in an assignment operator.

Listing 1.10.

Also, the explicit conversion of variable types is possible.

Listing 1.11.

In the above example, the function sin expects an argument of the double type. When the header math.h is included, an int argument is converted to double automatically, but an explicit cast makes the conversion visible in the code.

1.4 Statements

A statement is considered as a part of a program, which can be executed separately. In C, we have control-flow statements that specify the order in which computations are performed.

1.4.1 Conditional statements

In the C language, if and switch are conditional statements for making a decision.

There are two forms of if:

  if (expression) statement

and

  if (expression) statement else statement

In this construction, a statement may be a single statement, a block of statements, or an empty statement. A compound statement (block) consists of one or more statements enclosed in braces { }. An empty statement consists of only a semicolon ;.

The expression is evaluated, and if it is true (nonzero), then the statement following if is executed. Otherwise, the statement following else is executed (in the second form of the if statement).

An if statement can be nested inside other if or else statements. An example is the following nested if sequence, called an if-else-if ladder:

  if (expression1) statement1
  else if (expression2) statement2
  else statement3

In this case, the expressions are evaluated from the top down. If some expression is satisfied, the statement associated with this expression is executed, and the rest of a ladder is skipped.

Instead of if-else, the ternary operator ?: is often used. The following constructions are equivalent:

  if (expression) x = a; else x = b;

and

  x = expression ? a : b;

The switch statement is employed to select a branch of a computational process based on a value of a control expression. The control expression of switch should be expressed as an integer (int or char).

The value of the control expression is compared with the constants in the case labels. If the value of the switch expression matches one of the constants, control is transferred to the corresponding case label, and the statements up to break are executed. The break statement provides an immediate exit from switch.

1.4.2 Loop statements

Loop statements are intended for repeated execution of a sequence of operations as long as some condition is satisfied. The number of repetitions can be fixed before the loop starts (typically the for loop) or can depend on values computed in the loop body (the while and do-while loops).

The basic form of the for loop is shown here.

  for (initialization; condition; increment) statement

In initialization, an initial value is assigned to a variable (the loop parameter). The condition determines whether or not to execute the statement (the body of the loop) again. The loop parameter is updated in increment after each iteration. The body of a for loop is executed as long as the condition is satisfied; otherwise, the loop is terminated.

Listing 1.12.

In the C99 standard, it is possible to declare a variable in the initialization section of a for loop. This variable is local, and its scope is limited to the loop itself.

Listing 1.13.

Any section of a for loop can be omitted and an infinite loop can be obtained if all sections are empty.

In the while loop, which has the form

  while (expression) statement

any valid expression can serve as the control expression. If the condition is true (the value of the expression is nonzero), then the statement of this loop is executed. Otherwise, the loop is terminated.

In the for and while loops, the condition is checked before each iteration. In the do-while loop, the test is performed after each iteration:

  do statement while (expression);

This loop is executed as long as the condition is true, and its body is executed at least once. Braces are not required if the body is a single statement rather than a block.

1.4.3 Jump statements

In C, the following jump statements are defined: break, continue, return, and goto. The statements break and continue can be used in any loop; moreover, break can be employed in switch. A return statement can appear anywhere within a function, and goto can transfer control to any labeled statement within the same function.

Applications of break in switch are discussed above during consideration of conditional statements. In loops, a break statement leads to unconditional termination of loop execution with the transition to a statement after this loop.

Listing 1.14.

A break statement terminates a loop, whereas continue interrupts only the current iteration and jumps to the next one. Control is transferred to the next iteration of the innermost enclosing while, do, or for loop.

A goto jump statement transfers control to a statement with a label.

A label is an identifier followed by a colon. It can be placed anywhere within the function in which goto is used (before or after the goto statement).

1.5 Arrays

Arrays are the fundamental objects used in scientific and engineering computations. An array is a series of values of the same type stored sequentially. The whole array bears a single name. The access to an array element is performed using an integer index.

1.5.1 One-dimensional arrays

The index of the first element of an array is equal to zero. Any array occupies a separate contiguous block of memory. The size of an array is defined by a constant (C99 additionally allows variable-length arrays). The first element of an array is located at the lowest address, and the last element at the highest address.

The initialization of an array is conducted using a list of values, which is a list of constants separated by commas.

Listing 1.15.

1.5.2 Character arrays

An example of a one-dimensional array is a string, that is, a character array ending with the special null character ('\0'). A string constant is a sequence of characters between double quotes.

To declare a character array intended for storing a string, it is necessary to provide a place for the null character, i.e. the size of the array should be 1 more than the number of characters in the string.

If the size of an array is not specified in a declaration with initialization, then the size is determined automatically from the number of characters, including the terminating null character.

Listing 1.16.

1.5.3 Two- and multidimensional arrays

For multidimensional arrays, as many pairs of brackets are used as the array has dimensions. The number in each pair of brackets is the size of the array in the corresponding dimension. A two-dimensional array a[n][m] can be treated as a matrix with n rows and m columns, where n and m are constants.

When handling multidimensional arrays, the address of each element has to be calculated from several indexes. Because of this, access to elements of a multidimensional array may be slower than access to elements of a one-dimensional array, although optimizing compilers often reduce this overhead.

Multidimensional arrays are initialized in the same way as one-dimensional ones. In a multidimensional array, the size of the leftmost dimension can be omitted. The size of other dimensions must be explicitly defined.

Listing 1.17.

1.6 Pointers

A pointer is a variable whose value is the address of an object in computer memory.

1.6.1 Pointer declaration

A pointer declaration consists of a type, the asterisk character * and a variable name.

A pointer of any type can refer to any place in memory, but the operations performed with a pointer depend significantly on its type. The special constant NULL denotes an empty pointer, i.e. one that refers to no object.

Listing 1.18.

1.6.2 Pointer operations

There are two basic pointer operations, * and &. The & operation applied to some variable returns the address of this variable, but not its current value. The * operation returns the value of the variable located at the specified address.

Listing 1.19.

1.6.3 Pointers and arrays

In C, arrays and pointers are closely interrelated. Operations with indexes of arrays can be replaced by operations with pointers.

Listing 1.20.

We can consider the name of an array as the pointer that refers to the first element of the array (x = *pA).

The calculation of y in the above example illustrates arithmetic operations with pointers, i.e. 2 is added to the pointer to the first element. The result is the address of the first element plus the memory size of two elements of the array, i.e. the pointer to A[2].

1.7 Functions

A function is an independent unit of a program created to resolve specific tasks. In C, functions play the same role as subroutines or procedures in other algorithmic languages. They are the minimum executable entities of a program. Functions are used in order to break large tasks into smaller subtasks which can be executed many times during runtime. A function must be declared before its use.

1.7.1 Function definition

The basic form for defining a function returning a value is

  type name(list of parameters)
  {
      body of the function
  }

In the function header, type defines the type of the returned value (int, double, char, etc.) or is void if the function returns nothing. In C89, if the type is not indicated, it is assumed that the function returns int; this implicit rule was removed in C99, so the type should always be written explicitly.

A function name should be unique and must not coincide with keywords or with the names of other functions of the program.

A list of parameters is either empty or it includes parameter declarations separated by commas and has the form

  type1 name1, type2 name2, ...

The body of the function is enclosed in braces and forms a compound statement (block), which may contain statements as well as definitions of other objects, e.g. variables or arrays. In C, a function cannot be defined within the body of another function (function definitions cannot be nested).

In the body of a function, the return statement provides immediate exit from the function and a jump back to the caller. Two forms are used, i.e.

  return;

and

  return expression;

The first form is employed for functions whose return type is void. In the second form, the expression has the type declared in the function definition.

For a function whose return type is void, the return statement can be omitted; the function then returns when the end of its body is reached.

An example of a function definition is as follows:

Listing 1.21.

1.7.2 Function prototype

To call a function before it is defined in the current file or if it is defined in a different file, it is necessary to provide a function declaration (prototype). This gives the compiler an opportunity to check the types of the arguments passed to the function against the types of the parameters in the function definition.

A function prototype has the same form as a function definition but without a function body; it ends with a semicolon.

A simplified form of a function prototype does not include the names of parameters; only the parameter types are specified, whereby the prototypes of the same function are equivalent.

Listing 1.22.

An example of a function call:

Listing 1.23.

1.7.3 Pointers as function formal parameters

In C, function arguments are passed by value, and thus a called function cannot change the value of a variable in the caller by executing statements in its body. This restriction can be circumvented if the addresses of variables are passed to the function as actual arguments.

An important case is when a name of an array is used as a function formal parameter. In this case, only the address of the beginning of the array is passed to a function, whereas the array elements are not copied. Thus, in a body of a function, we can change the elements of arrays. There are three forms of passing arrays to functions:

  • – the parameter is defined as an array with an indication of dimension;
  • – the parameter is defined as array without indication of dimension;
  • – the parameter is defined as a pointer.

The following is an example of equivalent specifications of an array:

Listing 1.24.

1.8 Structures

A structure is a collection (set) of one or more variables, possibly of different types, which are grouped together under the same name. Structures allow grouping of connected data and their operation as a unit rather than as separate entities.

1.8.1 Structure definition

When creating a structure, we introduce our own data type. A structure must have a name, and each member of a structure must also have a name and a type.

In the following example, the structure point defines the position of a point on a plane.

Listing 1.25.

1.8.2 Structure variables and their initialization

A structure definition creates a new data type, which can be used to declare variables. Structure variables are declared in the same way as variables of other types.

Similarly to arrays, structures can be initialized by using an initialization list. The following is an example of the definition and initialization of the point structure variables:

Listing 1.26.

1.8.3 Gaining access to structure members

To access a structure member, the variable name and the member name are joined by a period using the following form:

  variable_name.member_name

If a pointer to a structure is used to access structure members, the -> operator is employed. The following is an example of the access to a member of the point structure:

Listing 1.27.

1.9 Input/output

Input and output operations in C are implemented by the standard input/output library (the header file is stdio.h) using the functions scanf, printf, fscanf, and fprintf. The functions fopen and fclose open and close files.

1.9.1 Library functions

The C language is accompanied by a set of standard library functions which perform various tasks. In particular, the operations of input and output are realized via library functions. Prototypes of library functions are contained in special header files supplied with libraries as a part of programming systems. To use library input-output functions, we apply the directive

  #include <stdio.h>

1.9.2 Streams

The information input from peripheral devices (such as a keyboard) is stored in random access memory (RAM), and output information is directed from RAM to peripheral devices (the display, printer, etc.). A hard disk drive can be employed both for input and output. The process of information exchange between RAM and the peripherals is provided by streams. A stream is a byte sequence transmitted during an input/output process.

There are standard streams and streams that are connected with files on a disk. Standard streams are created and opened by the system automatically. The main streams are

  • – stdin – the standard input stream (connected with a keyboard);
  • – stdout – the standard output stream (connected with a video display).

In C, streams are of two types: text and binary. A text stream is a sequence of characters. When transmitting characters from a stream to a display, some of them are not shown (e.g., carriage return, newline, etc.). A binary stream is a sequence of bytes that corresponds exactly to what is stored on an external device.

1.9.3 Work with files

To work with a file, we begin with a declaration of a pointer to a stream using the form

  FILE *pointer_name;

Opening a stream is executed by the standard function fopen declared as

  FILE *fopen(const char *filename, const char *mode);

If an error occurs while opening the stream, then the fopen function returns the value NULL.

How to open a file is shown as follows:

Listing 1.28.

Here the first parameter, test.dat, is the name of a file in the current directory; the file is associated with the stream referred to by the pF pointer. The second parameter, w, means that the file is opened in write mode.

Table 1.7 presents the possible opening modes.

When working with binary files, the b letter is added to the opening mode (e.g. rb, wb, r+b, ab+).

After finishing with a file, it should be closed. Closing a file is performed by the library function fclose, which has the following prototype:

  int fclose(FILE *stream);

If closing is successful, then fclose returns zero; otherwise, it returns EOF.

Table 1.7. Opening modes.

Parameter  Mode
r          Open for reading
w          Open for writing (the previous contents are discarded)
a          Open for appending
r+         Open for both reading and writing
w+         Create for both reading and writing
a+         Open to append or to create for both reading and writing

1.9.4 Formatted data output

Using the fprintf function of formatted output, we can generate a text file with computational results in character form. The printf function is designed for output to a display. This function is a particular version of fprintf, which works with the standard stream stdout.

The functions that write a format string to the display and to a stream are declared as

  int printf(const char *format, ...);
  int fprintf(FILE *stream, const char *format, ...);

For example, we can print a string as follows:

Listing 1.29.

A string can contain formatting specifications, which display a subsequent argument from a list. Specifications begin with the character %. The following is an example of printing numbers with comments.

Listing 1.30.

The result is

The format %d (also %i) is used for integers written in decimal form. To work with floating-point variables, the %f format is applied, which by default outputs 6 digits after the decimal point. In the format %15.10f, 15 positions are reserved for the number, and 10 of them are for digits after the decimal point.

The formats %e and %E are employed to output floating numbers in exponential notation. Using the command

Listing 1.31.

we get

The %s format is applied to output a string.

1.9.5 Formatted data input

The scanf (fscanf) function, which provides input, is an analogue of printf (fprintf), but it performs conversion in the opposite direction. The function prototypes are

  int scanf(const char *format, ...);
  int fscanf(FILE *stream, const char *format, ...);

The scanf function reads characters from the standard input stream, interprets them according to the format string, and stores the results in the objects pointed to by its other arguments. The scanf function stops when the format string is exhausted or when an input value does not match the corresponding specification. The scanf function returns the number of successfully entered data elements as the result. If the end of the file is reached before any value is read, the result is EOF.

An example of finding the sum of two numbers entered from a keyboard:

Listing 1.32.

1.10 Program structure

A C program is a set of functions and data declarations contained in one or more files, which are called source files. One of these functions is the main function. Files are compiled independently of each other into object code and are then linked with routines from libraries to form the executable program. Owing to such separate compilation, a modification of one file does not require recompiling the entire program.

1.10.1 Preprocessing facilities

A preprocessor is included in compilers as a mandatory component. Its purpose is to process the source files of a program before compilation. The preprocessor modifies a source file, and only after that the file is compiled. Preprocessing includes, in particular, the replacement of identifiers by prescribed character sequences, the inclusion of file texts, the exclusion of individual parts of the code, conditional compilation, etc.

Preprocessor directives are written on a new line, and the first character must be #. Preceding spaces and tabs are ignored. The end of a directive is the end of the line; if the directive does not fit on one line, the character \ is placed at the end of the line and the directive continues on the next line. The directives #include and #define are examples of preprocessing facilities.

The #include directive inserts a copy of the file mentioned in it into our program. There are two forms of using #include. In the first case, we write

  #include <filename>

and searching for the file is conducted in the system folders where library header files are located. If we use the directive

  #include "filename"

then searching is carried out first in the current folder, where the source files are located.

The #define directive serves to replace often used constants, keywords, operators, or expressions by some identifiers. The directive syntax is

  #define identifier replacement

Before compilation, in all places where the identifier is used, it is replaced by the proposed value.

An example is as follows:

Listing 1.33.

1.10.2 File organization of a program

In the C language, source files are of two types: headers (with the .h extension) and source code files (with the .c extension). Header files serve to transfer information between modules and contain only declarations, i.e. descriptions of objects that are defined elsewhere in the program. Mainly, these are function prototypes, which describe function names, return types, and types of arguments. Headers also declare the names and types of external variables, constants, structures, etc. For instance, the declaration

  extern int m;

means that the variable m of the integer type is defined somewhere in some source code file of a project. In this case, the declaration provides information about the external variable, but does not define the variable itself.

Source code files are separate modules developed and compiled independently and combined during creation of an executable program. Such files can include declarations contained in header files. In turn, header files can also use other header files. The #include directive is used to include header files.

1.10.3 Program structure

A C program consists of one or more functions. It is necessary to define the main function, where a program begins execution. The main function usually contains statements (mainly function calls) that reflect the essence of the problem to be solved.

The structure of a typical C program may be written as follows:

To illustrate the general construction above, let us consider a program for the square root calculation by Heron’s iterative method.

Listing 1.34.

After executing this program, we obtain the following output:

In this program, the %lf format is employed in sscanf to input numbers with double precision.