
10.1 Files, I/O Units, and Records 

In Fortran the term file is used for anything that can be handled with a READ or WRITE statement: the term covers not just data files stored on disc or tape but also peripheral devices such as printers or terminals. Strictly speaking, these should all be called external files, to distinguish them from internal files.

An internal file is nothing more than a character variable or array which is used as a temporary file while the program is running. Internal files can be used with READ and WRITE statements in order to process character information under the control of a format specification. They cannot be used by other I/O statements.
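
As a minimal sketch of the idea above, the following program uses a character variable as an internal file, first to convert two numbers to text and then to convert the text back. The names INTFIL, BUFFER, N and X are chosen purely for illustration:
      PROGRAM INTFIL
      CHARACTER*12 BUFFER
      INTEGER N
      REAL X
C     Internal WRITE: convert the numbers to text held in BUFFER.
      WRITE(BUFFER, '(I4, F8.3)') 42, 3.5
C     Internal READ: convert the same text back to numbers.
      READ(BUFFER, '(I4, F8.3)') N, X
      PRINT *, N, X
      END
BUFFER is declared with 12 characters because the format describes a record of 4 + 8 characters.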

Before an external file can be used it must be connected to an I/O unit. I/O units are integers which may be chosen freely from zero up to a system-dependent limit (usually at least 99). Except in OPEN and INQUIRE statements, files can only be referred to via their unit numbers.

The OPEN statement connects a named file to a numbered unit. It usually specifies whether the file already exists or whether a new one is to be created, for example:
       OPEN(UNIT=1, FILE='B:INPUT.DAT', STATUS='OLD')
       OPEN(UNIT=9, FILE='PRINTOUT', STATUS='NEW')
For simplicity most of the examples in this section show an actual integer as the unit identifier, but it helps to make software more modular and adaptable if a named constant or a variable is used instead.
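
One possible way to follow this advice is to define the unit number once as a named constant. In the sketch below the name LUNOUT and the text written are illustrative assumptions; note that STATUS='NEW' causes an error if a file called PRINTOUT already exists:
      PROGRAM UNITS
      INTEGER LUNOUT
C     LUNOUT is a hypothetical name; change the value in one place.
      PARAMETER (LUNOUT = 9)
      OPEN(UNIT=LUNOUT, FILE='PRINTOUT', STATUS='NEW')
      WRITE(LUNOUT, '(A)') 'First line of output'
      CLOSE(UNIT=LUNOUT)
      END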

I/O units are a global resource. A file can be opened in any program unit; once it is open I/O operations can be performed on it in any program unit provided that the same unit number is used. The unit number can be held in an integer variable and passed to the procedure as an argument.
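
The sketch below illustrates this: the main program opens the file and a subroutine writes to it, knowing only the unit number it receives as an argument. The file name RESULTS.TXT and the routine name REPORT are hypothetical:
      PROGRAM GLOBAL
      INTEGER LUN
      LUN = 9
C     RESULTS.TXT is a hypothetical file name.
      OPEN(UNIT=LUN, FILE='RESULTS.TXT', STATUS='UNKNOWN')
      CALL REPORT(LUN, 3.14159)
      CLOSE(UNIT=LUN)
      END

      SUBROUTINE REPORT(LUN, VALUE)
      INTEGER LUN
      REAL VALUE
C     The file was opened elsewhere; only the unit number is needed.
      WRITE(LUN, '(A, F10.5)') 'Value = ', VALUE
      END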

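The connection between a file and a unit, once established, persists until the program terminates, a CLOSE statement is executed on that unit, or another OPEN statement connects a different file to the same unit.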

Although all files are closed when the program exits, it is good practice to close them explicitly as soon as I/O operations on them are completed. If the program terminates abnormally, for example because an error occurs or it is aborted by the user, any files which are open, especially output files, may be left with incomplete or corrupted records.

The INQUIRE statement can be used to obtain information about the current properties of external files and I/O units. INQUIRE is particularly useful when writing library procedures which may have to run in a variety of different program environments. You can find out, for example, which unit numbers are free for use or whether a particular file exists and if so what its characteristics are.
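
Both uses mentioned above can be sketched as follows. The file name INPUT.DAT and the search range of 10 to 99 are assumptions made for this example; the upper limit on unit numbers is system-dependent:
      PROGRAM INQ
      LOGICAL THERE, INUSE
      INTEGER LUN
C     Does the named file exist?  (INPUT.DAT is hypothetical.)
      INQUIRE(FILE='INPUT.DAT', EXIST=THERE)
      PRINT *, 'INPUT.DAT exists: ', THERE
C     Search for a unit number not currently connected to a file.
      DO 10 LUN = 10, 99
         INQUIRE(UNIT=LUN, OPENED=INUSE)
         IF (.NOT. INUSE) GO TO 20
   10 CONTINUE
      STOP 'No free unit found'
   20 PRINT *, 'First free unit: ', LUN
      END
A library procedure could use a search of this kind to avoid clashing with units already opened by the calling program.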

Records 

A file consists of a sequence of records. In a text file a record corresponds to a line of text. In other cases a record has no physical basis; it is just a convenient collection of values chosen to suit the application. There is no need for a record to correspond to a disc sector or a tape block. READ and WRITE statements always start work at the beginning of a record and always transfer a whole number of records.

The rules of Fortran set no upper limit to the length of a record but, in practice, each operating system may do so. The limit may differ for different forms of record.

Formatted and Unformatted Records 

External files come in two varieties according to whether their records are formatted or unformatted. Formatted records store data in character-coded form, i.e. as lines of text. This makes them suitable for a wide range of applications since, depending on their contents, they may be legible to humans as well as computers. The main complication for the programmer is that each WRITE or READ statement must specify how each value is to be converted from internal to external form or vice-versa. This is usually done with a format specification.
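
As a small illustration of such a conversion, the sketch below writes an integer and a real number as a single formatted record. The variable names and edit descriptors are chosen only for this example:
      PROGRAM FMTIO
      INTEGER N
      REAL X
      N = 7
      X = 2.5
C     The format converts N and X to a record of 1 + 5 + 10
C     characters: one blank, then I5 and F10.3 fields.
      WRITE(*, 100) N, X
  100 FORMAT(1X, I5, F10.3)
      END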

Unformatted records store data in the internal code of the computer so that no format conversions are involved. This has several advantages for files of numbers, especially floating-point numbers. Unformatted data transfers are simpler to program, faster in execution, and free from rounding errors. Furthermore the resulting data files, sometimes called binary files, are usually much smaller. A real number would, for example, have to be turned into a string of 10 or even 15 characters to preserve its precision on a formatted record, but on an unformatted record a real number typically occupies only 4 bytes, i.e. the same as 4 characters. The drawback is that unformatted files are highly system-specific. They are usually illegible to humans and to other brands of computer and sometimes incompatible with files produced by other programming languages on the same machine. Unformatted files should only be used for information to be written and read by Fortran programs running on the same type of computer.
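
A minimal sketch of an unformatted transfer is shown below: a whole array is written as one record and read back without any format specification. The file name ARRAY.BIN, the unit number and the array size are assumptions made for illustration:
      PROGRAM UNFORM
      INTEGER LUN, I
      PARAMETER (LUN = 11)
      REAL A(100), B(100)
      DO 10 I = 1, 100
         A(I) = SQRT(REAL(I))
   10 CONTINUE
C     ARRAY.BIN is a hypothetical file name.
      OPEN(UNIT=LUN, FILE='ARRAY.BIN', FORM='UNFORMATTED',
     $     STATUS='UNKNOWN')
C     One WRITE produces one unformatted record holding all of A.
      WRITE(LUN) A
      REWIND(UNIT=LUN)
C     Read the record back; no format conversion takes place.
      READ(LUN) B
      CLOSE(UNIT=LUN)
      PRINT *, 'Last value read back: ', B(100)
      END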

Sequential and Direct Access 

All peripheral devices allow files to be processed sequentially: you start at the beginning of the file and work through each record in turn. One important advantage of sequential files is that different records can have different lengths; the minimum record length is zero but the maximum is system-dependent.

Sequential files behave as if there were a pointer attached to the file which always indicates the next record to be transferred. On devices such as terminals and printers you can only read or write in strict sequential order, but when a file is stored on disc or tape it is possible to use the REWIND statement to reset this pointer to the start of the file, allowing it to be read in again or re-written. On suitable files the BACKSPACE statement can be used to move the pointer back by one record so that the last record can be read again or over-written.
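
The behaviour of the record pointer can be sketched as follows, assuming a disc file on which REWIND and BACKSPACE are permitted. The file name SEQ.TXT and the unit number are hypothetical:
      PROGRAM REPOS
      INTEGER LUN, I, K
      PARAMETER (LUN = 12)
C     SEQ.TXT is a hypothetical file name.
      OPEN(UNIT=LUN, FILE='SEQ.TXT', STATUS='UNKNOWN')
      DO 10 I = 1, 3
         WRITE(LUN, '(I5)') I
   10 CONTINUE
C     Return to the start and read the first two records.
      REWIND(UNIT=LUN)
      READ(LUN, '(I5)') K
      READ(LUN, '(I5)') K
C     Step back one record so the second record is read again.
      BACKSPACE(UNIT=LUN)
      READ(LUN, '(I5)') K
      PRINT *, 'Record read twice holds ', K
      CLOSE(UNIT=LUN)
      END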

One unfortunate omission from the Fortran Standard is that the position of the record pointer is not defined when an existing sequential file is opened. Most Fortran systems behave sensibly and make sure that they start at the beginning of the file, but there are a few rogue systems around which make it advisable, in portable software, to use REWIND after the OPEN statement. Another problem is how to append new records to an existing sequential file. Some systems provide (as an extension) an "append" option in the OPEN statement, but the best method using Standard Fortran is to open the file and read records one at a time until the end-of-file condition is encountered; then use BACKSPACE to move the pointer back and clear the end-of-file condition. New records can then be added in the usual way.
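
The append procedure just described might be sketched as follows. It assumes that a text file with the hypothetical name LOG.TXT already exists and that no record is needed beyond its first 80 characters while skipping:
      PROGRAM APPEND
      INTEGER LUN, IOS
      PARAMETER (LUN = 13)
      CHARACTER*80 LINE
C     LOG.TXT is a hypothetical file which must already exist.
      OPEN(UNIT=LUN, FILE='LOG.TXT', STATUS='OLD')
C     REWIND guards against systems that open elsewhere.
      REWIND(UNIT=LUN)
C     Read records until the end-of-file condition is raised.
   10 READ(LUN, '(A)', IOSTAT=IOS) LINE
      IF (IOS .EQ. 0) GO TO 10
      IF (IOS .GT. 0) STOP 'Read error'
C     IOS is negative at end-of-file: step back before adding.
      BACKSPACE(UNIT=LUN)
      WRITE(LUN, '(A)') 'New record appended'
      CLOSE(UNIT=LUN)
      END
The IOSTAT value is zero for a successful read, positive for an error, and negative at end-of-file, which is what allows the loop to distinguish the three cases.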

The alternative access method is direct access, which allows records to be read and written in any order. Most systems only permit this for files stored on random-access devices such as discs; it is sometimes also permitted on tapes. All records in a direct-access file must be the same length so that the system can compute the location of a record from its record number. The record length has to be chosen when the file is created and (on most systems) is then fixed for the life of the file. In Fortran, direct-access records are numbered from one upwards; each READ or WRITE statement specifies the record number at which the transfer starts.

Records may be written to a direct-access file in any order. Any record can be read provided that it exists, i.e. it has been written at some time since the file was created. Once a record has been written there is no way of deleting it, but its contents can be updated, i.e. replaced, at any time.
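
A minimal sketch of direct access is given below. It uses formatted records so that the record length is simply a count of characters; for unformatted direct-access files the units of RECL are system-dependent. The file name NAMES.DAT, the unit number and the record length of 20 are assumptions made for this example:
      PROGRAM DIRECT
      INTEGER LUN
      PARAMETER (LUN = 14)
      CHARACTER*20 NAME
C     NAMES.DAT is a hypothetical file name.
      OPEN(UNIT=LUN, FILE='NAMES.DAT', ACCESS='DIRECT',
     $     FORM='FORMATTED', RECL=20, STATUS='UNKNOWN')
C     Records may be written in any order.
      WRITE(LUN, '(A20)', REC=3) 'Third'
      WRITE(LUN, '(A20)', REC=1) 'First'
C     Update record 3 by writing it again.
      WRITE(LUN, '(A20)', REC=3) 'Third, revised'
      READ(LUN, '(A20)', REC=3) NAME
      PRINT *, NAME
      CLOSE(UNIT=LUN)
      END
Record 2 is never written in this sketch, so the program takes care not to read it.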

A few primitive operating systems require the maximum length of a direct-access file to be specified when the file is created; this is not necessary in systems which comply fully with the Fortran Standard.
