5.4. Naming Streams

Unlike most programming languages, REXX does not use file handles;
the name of the stream is also in general the handle (although
some implementations add an extra level of indirection). You must
supply the name to all I/O functions operating on a stream.
However, internally, the REXX interpreter is likely to use the
native file pointers of the operating system, in order to improve
speed.  The name specified can generally be the name of an
operating system file, a device name, or a special stream name
supported by your implementation.

The format of the stream name is very dependent upon your
operating system. For portability, you should try not to specify
it as a literal string in each I/O call, but set a variable to the
stream name, and use that variable when calling I/O functions.
This reduces the number of places you need to make changes if you
need to port the program to another system. Unfortunately, this
approach increases the need for PROCEDURE EXPOSE, since the
variable containing the file's name must be available to all
routines using file I/O for that particular file, and all their
non-common ancestors.

Example: Specifying file names

The following code illustrates a portability problem related to
the naming of streams. The variable filename is set to the name of
the stream operated on in the function call.

     filename = '/tmp/MyFile.Txt'
     say ' first line is' linein( filename )
     say 'second line is' linein( filename )
     say ' third line is' linein( filename )

Suppose this script, which looks like it is written for Unix, is
moved to a VMS machine. Then, the stream name might be something
like SYS$TEMP:MYFILE.TXT, but you only need to change the script
in one place, the assignment to the variable filename, as opposed
to three places if the stream name is hard-coded in each of the
three calls to LINEIN().
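
The cost mentioned above, the extra need for PROCEDURE EXPOSE,
can be illustrated with a minimal sketch. The routine name REPORT
and the variable name logname are hypothetical and chosen only for
illustration; the path is the Unix-style one from the example.

```rexx
/* Sketch: the stream name lives in one variable, shared via EXPOSE. */
logname = '/tmp/MyFile.Txt'     /* the only system-specific line */
call report 'first message'
call report 'second message'
exit 0

/* REPORT is a hypothetical helper. Without EXPOSE, logname would be
   uninitialized inside the procedure and evaluate to 'LOGNAME'.    */
report: procedure expose logname
   parse arg text
   call lineout logname, text
   return
```

Every routine between the main program and REPORT that uses
PROCEDURE would have to expose logname as well, which is the
drawback noted above.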

If the stream name is omitted from the built-in I/O functions, a
default stream is used: input functions use the default input
stream, while output functions use the default output stream.
These are implicit references to the default input and output
streams; unfortunately, there is no standard way to refer to these
two streams explicitly. Consequently, there is no standard way to
refer to the default input or output stream in the built-in
function STREAM().

However, most implementations allow you to access the default
streams explicitly through a name, perhaps the null string or
something like stdin and stdout. You must refer to the
implementation-specific documentation for information about this.
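
The difference between implicit and explicit references can be
sketched as follows. The first part uses only the portable form,
in which the stream name is omitted. The name stdout in the second
part is an assumption, taken from the suggestion above; it is not
standard REXX, and your interpreter may use a different name.

```rexx
/* Implicit references: omitting the stream name is portable. */
call lineout , 'Type a line:'
line = linein()
call lineout , 'You typed:' line

/* Explicit reference: the name below is an assumption. If the
   interpreter does not treat it specially, the line is written
   to an ordinary file called stdout in the current directory. */
outname = 'stdout'
call lineout outname, 'Written through an explicit name'
```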

Also note that standard REXX does not support the concept of a
default error stream. On operating systems supporting this, it can
probably be accessed through a special name; see system-specific
information. The same applies for other special streams.

The default input stream is sometimes also called the “standard
input stream,” the “default input device,” “standard input,” or
just “stdin.”

The use of stream names instead of stream descriptors or handles
is deeply rooted in the REXX philosophy: Data structures are text
strings carrying information, rather than opaque data blocks in
internal, binary format. This opens up some intriguing
possibilities.  Under some operating systems, a file can be
referred to by many names.  For instance, under Unix, a file can
be referred to as foobar, ./foobar and ././foobar, all of which
name the same file, although a REXX interpreter is likely to
interpret them as three different streams, because the names
themselves differ.  On the other hand, nothing prevents an
interpreter from discovering that these are names for the same
stream, and treating them as equivalent (except concerns for
processing time). Under Unix, the problem is not confined to the
use of ./ in file names; hard links and soft links can produce
similar effects, too.
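
Which of the two behaviours a given interpreter shows can be
probed with a small sketch like the one below. It assumes a
Unix-style system, that the scratch file foobar does not already
exist in the current directory, and that calling LINEOUT() with
only a stream name closes or at least flushes the stream, which is
common but not guaranteed.

```rexx
/* Probe: are foobar and ./foobar one stream or two? */
name = 'foobar'                  /* hypothetical scratch file   */
call lineout name, 'line one'
call lineout name, 'line two'
call lineout name                /* close (implementation-dependent) */

a = linein( name )               /* first line, read via foobar */
b = linein( './' || name )       /* next read, via ./foobar     */

if a == b then
   say 'Separate streams: each name has its own read position'
else
   say 'One stream: both names share a read position'
```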

Example: Internal file handles

Suppose you start reading from a stream, which is connected to a
file called foo. You read the first line of foo, then you issue a
command to rename foo to bar.  Then, you try to read the next line
from foo. The REXX program for doing this under Unix looks
something like:

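     signal on notready
     line1 = linein( 'foo' )       /* implicitly opens foo        */
     'mv foo bar'                  /* rename it through the shell */
     line2 = linein( 'foo' )       /* read by the old name again  */
     exit

     notready:                     /* target of SIGNAL ON NOTREADY */
     say 'NOTREADY raised in line' sigl
     exit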

Theoretically, the file foo does not exist during the second call,
so the second read should raise the NOTREADY condition.  However,
a REXX interpreter is likely to have opened the stream already, so
it is performing the reading on the file descriptor of the open
file. It is probably not going to check whether the file exists
before each I/O operation (that would require a lot of extra
checking). Under most operating systems, renaming a file will not
invalidate existing file descriptors.  Consequently, the
interpreter is likely to continue to read from the original foo
file, even though its name has changed.

Example: Unix temporary files

On some systems, you can delete a file, and still read from and
write to the stream connected to that file. This technique is
shown in the following Unix-specific code:

     tmpfile = '/tmp/myfile'
     call lineout tmpfile, ''      /* create the file            */
     call lineout tmpfile,, 1      /* move back to line 1        */
     'rm' tmpfile                  /* unlink; stream stays open  */
     call lineout tmpfile, 'This is the first line'

Under Unix, this technique is often used to create temporary
files; you are guaranteed that the file will be deleted on
closing, no matter how your program terminates. Unix deletes a
file when there are no more references to it. Whether the
reference is from the file system or from an open descriptor in a
user process is irrelevant.  After the rm command, the only
reference to the file is from the REXX interpreter. When the
interpreter terminates, the file is deleted, since there are no
more references to it.

Example: Files in different directories

Here is yet another example of how using the filename directly in
the stream I/O functions may give strange effects. Suppose you are
using a system that has hierarchical directories, and you have a
function CHDIR() which sets a current directory; then consider the
following code:

     call chdir '../dir1'
     call lineout 'foobar', 'written to foobar while in dir1'
     call chdir '../dir2'
     call lineout 'foobar', 'written to foobar while in dir2'

Since the file is implicitly opened while you are in the directory
dir1, the name foobar refers to a file located there.  After
changing the directory to dir2, it may seem logical that the
second call to LINEOUT() operates on a file in dir2, but that may
not be the case: the interpreter may still regard foobar as the
stream it already has open. Considering that these clauses may
come a great number of lines apart, that REXX has no standard way
of closing files, and that REXX has only one file table (i.e. open
files are not local to subroutines), this can cause considerable
astonishment in complex REXX scripts.
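
One way to sidestep the ambiguity, sketched here under stated
assumptions, is to build a complete path name for each file once
and use those names throughout, so that the stream name no longer
depends on the current directory. The base directory below is
hypothetical, the paths are Unix-style, and both directories are
assumed to exist.

```rexx
/* Sketch: stream names that do not depend on the current directory. */
base  = '/home/user/project'          /* hypothetical base directory */
file1 = base || '/dir1/foobar'
file2 = base || '/dir2/foobar'

call lineout file1, 'written to foobar in dir1'
call lineout file2, 'written to foobar in dir2'
```

Because the two names differ as strings, an interpreter that
identifies streams by name keeps them apart, whatever the current
directory happens to be at the time of each call.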

Whether an implementation treats ././foo and ./foo as different
streams is system-dependent; that applies to the effects of
renaming or deleting the file while reading or writing, too. See
your interpreter’s system-specific documentation.

Most of the effects shown in the examples above are due to
insufficient isolation between the filename of the operating
system and the file handle in the REXX program. Whenever a file
can be explicitly opened and bound to a file handle, you should do
that in order to decrease the possibilities for strange side
effects.
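
Even where no handle-based interface exists, some interpreters let
you open and close a stream explicitly through command strings
passed to STREAM(). The following is a sketch only: the command
strings OPEN READ and CLOSE and the result READY: are assumptions
based on what several common interpreters accept, and the standard
leaves such commands to the implementation. The stream is still
identified by its name here, so this does not give the full
isolation of a handle.

```rexx
/* Sketch: explicit open and close through STREAM() commands.
   The command strings are implementation-specific assumptions. */
name   = '/tmp/MyFile.Txt'
status = stream( name, 'C', 'OPEN READ' )
if status \== 'READY:' then do
   say 'Could not open' name '(' || status || ')'
   exit 1
end

say 'first line is' linein( name )
call stream name, 'C', 'CLOSE'
```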

Interpreters that allow this method generally have an OPEN()
function that takes the name of the file to open as a parameter,
and returns a string that uniquely identifies that open file
within the current context; e.g. an index into a table of open
files. Later, this index can be used instead of the filename.

Some implementations allow only this indirect naming scheme, while
others may allow a mix of direct and indirect naming. The latter
is likely to create some problems, since some strings are likely
to be valid both as direct and as indirect file ids.


