5.13.12 Problems with Binary and Text Modes
Under the MS-DOS operating system, the end-of-line character
sequence is <CR><LF>, while in C, the end-of-line sequence is only
<LF>.  This difference can lead to some very strange effects.

When an MS-DOS file is opened for read in text mode by BRexx, all
<CR><LF> character sequences in file data are translated to <LF>
when transferred into the C program. Further, BRexx, which is a C
program, interprets <LF> as an end-of-line character sequence.
However, if the file is opened in binary mode, then the
translation from <CR><LF> in the file to <LF> in the C program
is not performed. Consequently, if a file that really is a text
file is opened as a binary file and read line-wise, all lines
would appear to have a trailing <CR> character.
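
When a text file has to be read in binary mode anyway, the stray
<CR> can be removed after each read. The following is only a
minimal sketch, assuming the same open, read and close calls that
are used in the example in this section; the file name is just an
illustration.

     file = open('testfile.dat', 'rb')      /* binary mode: <CR> is kept */
     line = read(file)                      /* line may end in '0D'x */
     if right(line, 1) == '0D'x then        /* strict comparison */
        line = left(line, length(line) - 1) /* drop the trailing <CR> */
     say c2x(line)
     call close file

The strict comparison operator (==) is used so that no padding or
stripping of blanks takes place when testing for the <CR>.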

Similarly, <LF> written by the C program is translated to <CR><LF>
in the file.  This is always done when the file is opened in text
mode. When the file is opened in binary mode, all data is
transferred without any alterations. Thus, when writing lines to a
file which is opened for write in binary mode, the lines appear to
have only <LF>, not <CR><LF>. If later opened as a text file, this
is not recognized as an end-of-line sequence.
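
If a file must be written in binary mode but should still look
like an MS-DOS text file, the <CR><LF> pair can be written
explicitly. This is a minimal sketch under the assumption that
write adds no end-of-line of its own when called with only two
arguments (the example in this section passes a third argument to
request one); the file name is hypothetical.

     file = open('lines.dat', 'wb')         /* binary mode: no translation */
     crlf = '0D0A'x                         /* MS-DOS end-of-line sequence */
     call write file, 'first line' || crlf  /* end-of-line added by hand */
     call write file, 'second line' || crlf
     call close file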

Example: Differing end-of-lines

Here is an example of how an incorrect choice of file mode can
corrupt data. Assume BRexx is running under MS-DOS, which uses
<CR><LF> as an end-of-line sequence in text files, and that the
file I/O interface translates this sequence to <LF> in text mode.
Consider the following code.

     file = open('testfile.dat', 'wt')      /* text mode */
     call write file, '45464748'x, 'dummy'  /* i.e. 'EFGH' */
     call write file, '65666768'x, 'dummy'  /* i.e. 'efgh' */
     call close file
     file = open('testfile.dat', 'rb')      /* binary mode */
     say c2x(read(file))                    /* says '454647480D' */
     say c2x(read(file))                    /* says '656667680D' */
     call close file

Here, two lines of four characters each are written to the file,
but two lines of five characters each are read back. The reason is
simply that the file was written in text mode, so each end-of-line
was stored as <CR><LF>, while it was read in binary mode, where
only <LF> is taken as the end-of-line character sequence. Thus,
the <CR> preceding each <LF> is taken to be part of the line
during the read.

To avoid this, be very careful about using the correct mode when
opening files. Failure to do so will almost certainly give strange