5.13.12 Problems with Binary and Text Modes

Under the MS-DOS operating system, the end-of-line character sequence is <CR><LF>, while in C, the end-of-line sequence is only <LF>. This can lead to some very strange effects.

When an MS-DOS file is opened for reading in text mode by BRexx, all <CR><LF> character sequences in the file data are translated to <LF> as they are transferred into the C program. BRexx, which is itself a C program, then interprets <LF> as the end-of-line character sequence. However, if the file is opened in binary mode, the translation from <CR><LF> in the file to <LF> in the C program is not performed. Consequently, if a file that really is a text file is opened as a binary file and read line-wise, every line appears to have a trailing <CR> character.

Similarly, an <LF> written by the C program is translated to <CR><LF> in the file. This is always done when the file is opened in text mode. When the file is opened in binary mode, all data is transferred without any alteration. Thus, when lines are written to a file that is opened for writing in binary mode, the lines end with only <LF>, not <CR><LF>. If the file is later opened as a text file, the lone <LF> is not recognized as an end-of-line sequence.

Example: Differing end-of-lines

Here is an example of how an incorrect choice of file mode can corrupt data. Assume BRexx is running under MS-DOS, which uses <CR><LF> as the end-of-line sequence in text files, while the system calls translate this to <LF> in the file I/O interface. Consider the following code.

    file = open('testfile.dat', 'wt')       /* text mode */
    call write file, '45464748'x, 'dummy'   /* i.e. 'EFGH' */
    call write file, '65666768'x, 'dummy'   /* i.e. 'efgh' */
    call close file

    file = open('testfile.dat', 'rb')       /* binary mode */
    say c2x(read(file))                     /* says '454647480D' */
    say c2x(read(file))                     /* says '656667680D' */
    call close file

Here, two lines of four characters each are written to the file, but when the file is read back, two lines of five characters each are read.
The reason is simply that the file was written in text mode, so the end-of-line character sequence stored in the file was <CR><LF>, while it was read in binary mode, where the end-of-line character sequence is just <LF>. Thus, the <CR> preceding the <LF> is taken to be part of the line during the read. To avoid this, take care to open each file in the mode that matches its contents, and to read a file in the same mode in which it was written. Failure to do so will almost certainly give strange effects.
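
The following is a minimal sketch of the two ways around the problem, assuming the same BRexx open, write, read and close calls used in the example above. The first part reads the file back in the mode it was written in. The second part shows a fallback for cases where a text file has to be read in binary mode. The stripcr routine is a hypothetical helper introduced here for illustration; it is not part of BRexx.

```rexx
/* Part 1: write and read in the same (text) mode */
file = open('testfile.dat', 'wt')       /* text mode */
call write file, '45464748'x, 'dummy'   /* 'EFGH' plus end-of-line */
call write file, '65666768'x, 'dummy'   /* 'efgh' plus end-of-line */
call close file

file = open('testfile.dat', 'rt')       /* text mode again */
say c2x(read(file))                     /* four characters expected: '45464748' */
say c2x(read(file))                     /* four characters expected: '65666768' */
call close file

/* Part 2: binary read, removing the stray <CR> by hand */
file = open('testfile.dat', 'rb')       /* binary mode */
say c2x(stripcr(read(file)))            /* trailing '0D' removed */
say c2x(stripcr(read(file)))
call close file
exit

/* Hypothetical helper: drop one trailing <CR> ('0D'x), if present */
stripcr: procedure
  parse arg line
  if right(line, 1) == '0D'x then
    line = left(line, length(line) - 1)
  return line
```

The strict comparison operator == is used in stripcr so that Rexx's normal blank-padding rules for = cannot affect the test. Note that the helper is only safe for genuine text data: in a true binary file, a '0D'x byte in front of an <LF> may be legitimate data and must not be removed.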