5.8. Character-wise and Line-wise I/O

The built-in REXX library offers two strategies for reading and
writing streams: line-wise and character-wise. For line-wise
reading to work, the underlying storage method of the stream must
contain information which describes where each line starts and
ends.

Some file systems store this information as one or more special
characters, while others structure the file as a number of
records, each containing a single line. This introduces a
slightly subtle point: even though a stream foo returns the same
data when read by LINEIN() on two different machines, the data
read from foo may differ between the same two machines when the
stream is read by CHARIN(), and vice versa. This is because the
end-of-line markers can vary between the two operating systems.

Example: Character-wise handling of EOL

Suppose a text file contains the following three lines (ASCII
character set is assumed):

     first
     second
     third

and you read it first line-wise and then character-wise. Assume
the following program for the character-wise read:

     file = 'DATAFILE'
     foo = ''
     do i=1 while chars(file)>0
          foo = foo || c2x(charin(file))' '
     end
     say foo
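
The line-wise read is not shown above. A minimal sketch of it,
assuming the same file name and that LINES() returns a non-zero
value while unread lines remain, could look like this:

```rexx
/* Line-wise read of the same (assumed) file */
file = 'DATAFILE'
do while lines(file) > 0
   say linein(file)      /* end-of-line markers are not returned */
end
```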

When the file is read line-wise, the output is identical on all
machines, namely the three lines shown above. The character-wise
reading, however, depends on your operating system and its file
system, so the output might be any of the following:

     66 69 72 73 74 73 65 63 6F 6E 64 74 68 69 72 64

     66 69 72 73 74 0A
     73 65 63 6F 6E 64 0A
     74 68 69 72 64 0A

     66 69 72 73 74 0D 0A
     73 65 63 6F 6E 64 0D 0A
     74 68 69 72 64 0D 0A

If the machine uses records to store the lines, the first output
may be the result; here, only the data in the lines of the file is
returned. The second and third outputs are wrapped for
readability so that each row holds the bytes read for one line of
the file: the leading bytes are generated by the actual line
contents, and the trailing 0A or 0D 0A bytes are generated by the
end-of-line character sequences.

The second output is typical for Unix machines. They use the
newline ASCII character as line separator, and that character is
read immediately after each line. The third output is typical for
MS-DOS, where the line separator character sequence is a carriage
return followed by a newline (ASCII '0D'x and '0A'x).
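
Which of these two conventions a given file uses can be checked
from the raw data itself. The following is a sketch only: the
routine name eolstyle and the file name are hypothetical, and it
assumes that CHARS() returns a non-zero value while unread
characters remain. It cannot detect record-based storage, where no
end-of-line characters are read at all.

```rexx
/* Sketch: report which end-of-line convention the raw data uses */
file = 'DATAFILE'                  /* assumed file name */
data = ''
do while chars(file) > 0
   data = data || charin(file)     /* one character per call */
end
say 'End-of-line style:' eolstyle(data)
exit

/* eolstyle is a hypothetical helper, not a built-in function */
eolstyle: procedure
   parse arg data
   select
      when pos('0D0A'x, data) > 0 then return 'CRLF'
      when pos('0A'x, data) > 0 then return 'LF'
      otherwise return 'none found'
   end
```

The test for '0D0A'x comes first because data in the MS-DOS style
also contains '0A'x.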

For maximum portability, the line-wise built-in functions
(LINEIN(), LINEOUT() and LINES()) should only be used for
line-wise streams, and the character-wise built-in functions
(CHARIN(), CHAROUT() and CHARS()) should only be used for
character-wise data. In general, be very careful when mixing
character-wise and line-wise data in a single stream; it does
work, but may easily lead to portability problems.
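
As an illustration of keeping the two families of functions apart,
data that is not organized in lines can be copied with the
character-wise functions only. This is a sketch: both file names
are assumptions, and CHARS() is assumed to return at least 1 while
data remains.

```rexx
/* Character-wise copy: the data is passed through as read */
infile  = 'DATAFILE'        /* assumed input name  */
outfile = 'DATAFILE.BAK'    /* assumed output name */
do while chars(infile) > 0
   /* never ask for more characters than CHARS() reports */
   chunk = charin(infile, , min(1024, chars(infile)))
   call charout outfile, chunk
end
```

If the output file already exists, the data is normally appended
to it. A text file, by contrast, would be copied with LINEIN() and
LINEOUT(), which leaves the end-of-line convention to the
interpreter on each system.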

The difference between character-wise and line-wise streams is
roughly equivalent to the difference between binary and text
streams, but the two concepts are not totally equivalent. In a
binary file, the data read is the actual data stored in the file,
while in a text file, the character sequences used for denoting
end-of-line and end-of-file markers may be translated to actions
or other characters during reading.

The end-of-file marker may be implemented differently on different
systems. On some systems, this marker is only implicitly present
at the end of the file, whose position is calculated from the file
size (e.g. Unix). Other systems may put a character signifying
end-of-file at the end (or even in the middle) of the file (e.g.
<Ctrl-Z> for MS-DOS). These concepts vary between operating
systems, and interpreters should handle each concept according to
the customs of the operating system. Check the
implementation-specific documentation for further information. In
any case, if the interpreter treats a particular character as
end-of-file, then it only gives special treatment to this
character during line-wise operations. During character-wise
operations, no characters have special meanings.


