
B. Installing gawk

This appendix provides instructions for installing gawk on the various platforms that are supported by the developers. The primary developer supports GNU/Linux (and Unix), whereas the other ports are contributed. See section Reporting Problems and Bugs, for the electronic mail addresses of the people who did the respective ports.



B.1 The gawk Distribution

This section describes how to get the gawk distribution, how to extract it, and then what is in the various files and subdirectories.



B.1.1 Getting the gawk Distribution

There are three ways to get GNU software:

The GNU software archive is mirrored around the world. The up-to-date list of mirror sites is available from the main FSF web site. Try to use one of the mirrors; they will be less busy, and you can usually find one closer to your site.



B.1.2 Extracting the Distribution

gawk is distributed as a tar file compressed with the GNU Zip program, gzip.

Once you have the distribution (for example, `gawk-3.1.6.tar.gz'), use gzip to expand the file and then use tar to extract it. You can use the following pipeline to produce the gawk distribution:

 
# Under System V, add 'o' to the tar options
gzip -d -c gawk-3.1.6.tar.gz | tar -xvpf -

This creates a directory named `gawk-3.1.6' in the current directory.

The distribution file name is of the form `gawk-V.R.P.tar.gz'. The V represents the major version of gawk, the R represents the current release of version V, and the P represents a patch level, meaning that minor bugs have been fixed in the release. The current patch level is 6, but when retrieving distributions, you should get the version with the highest version, release, and patch level. (Note, however, that patch levels greater than or equal to 80 denote "beta" or nonproduction software; you might not want to retrieve such a version unless you don't mind experimenting.)

If you are not on a Unix system, you need to make other arrangements for getting and extracting the gawk distribution. You should consult a local expert.
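As an illustration of the naming scheme, the following sketch splits a distribution file name into its V, R, and P parts using only standard shell parameter expansion. The variable names are chosen for this example and are not used anywhere in the gawk distribution; the threshold of 80 is the one given in the text above.

```shell
# Split a distribution file name of the form gawk-V.R.P.tar.gz into its parts.
file=gawk-3.1.6.tar.gz          # example name taken from the text above

base=${file#gawk-}              # strip the leading "gawk-"  -> 3.1.6.tar.gz
base=${base%.tar.gz}            # strip the trailing suffix  -> 3.1.6

version=${base%%.*}             # V: everything before the first dot
rest=${base#*.}                 # R.P
release=${rest%%.*}             # R
patch=${rest#*.}                # P

# Patch levels of 80 or more denote beta (nonproduction) software.
if [ "$patch" -ge 80 ]; then
    kind="beta"
else
    kind="production"
fi

echo "version=$version release=$release patch=$patch ($kind)"
# prints: version=3 release=1 patch=6 (production)
```

The sketch assumes a purely numeric patch level; a file name that does not follow the `gawk-V.R.P.tar.gz' form would need extra checking.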



B.1.3 Contents of the gawk Distribution

The gawk distribution has a number of C source files, documentation files, subdirectories, and files related to the configuration process (see section Compiling and Installing gawk on Unix), as well as several subdirectories related to different non-Unix operating systems:

Various `.c', `.y', and `.h' files

The actual gawk source code.

`README'
`README_d/README.*'

Descriptive files: `README' for gawk under Unix and the rest for the various hardware and software combinations.

`INSTALL'

A file providing an overview of the configuration and installation process.

`ChangeLog'

A detailed list of source code changes as bugs are fixed or improvements made.

`NEWS'

A list of changes to gawk since the last release or patch.

`COPYING'

The GNU General Public License.

`FUTURES'

A brief list of features and changes being contemplated for future releases, with some indication of the time frame for the feature, based on its difficulty.

`LIMITATIONS'

A list of those factors that limit gawk's performance. Most of these depend on the hardware or operating system software and are not limits in gawk itself.

`POSIX.STD'

A description of one area in which the POSIX standard for awk is incorrect as well as how gawk handles the problem.

`doc/awkforai.txt'

A short article describing why gawk is a good language for AI (Artificial Intelligence) programming.

`doc/README.card'
`doc/ad.block'
`doc/awkcard.in'
`doc/cardfonts'
`doc/colors'
`doc/macros'
`doc/no.colors'
`doc/setter.outline'

The troff source for a five-color awk reference card. A modern version of troff such as GNU troff (groff) is needed to produce the color version. See the file `README.card' for instructions if you have an older troff.

`doc/gawk.1'

The troff source for a manual page describing gawk. This is distributed for the convenience of Unix users.

`doc/gawk.texi'

The Texinfo source file for this Web page. It should be processed with TeX to produce a printed document, and with makeinfo to produce an Info or HTML file.

`doc/gawk.info'

The generated Info file for this Web page.

`doc/gawkinet.texi'

The Texinfo source file for TCP/IP Internetworking with gawk. It should be processed with TeX to produce a printed document and with makeinfo to produce an Info or HTML file.

`doc/gawkinet.info'

The generated Info file for TCP/IP Internetworking with gawk.

`doc/igawk.1'

The troff source for a manual page describing the igawk program presented in An Easy Way to Use Library Functions.

`doc/Makefile.in'

The input file used during the configuration process to generate the actual `Makefile' for creating the documentation.

`Makefile.am'
`*/Makefile.am'

Files used by the GNU automake software for generating the `Makefile.in' files used by autoconf and configure.

`Makefile.in'
`acconfig.h'
`acinclude.m4'
`aclocal.m4'
`configh.in'
`configure.in'
`configure'
`custom.h'
`missing_d/*'
`m4/*'

These files and subdirectories are used when configuring gawk for various Unix systems. They are explained in Compiling and Installing gawk on Unix.

`po/*'

The `po' directory contains message translations.

`awklib/extract.awk'
`awklib/Makefile.am'
`awklib/Makefile.in'
`awklib/eg/*'

The `awklib' directory contains a copy of `extract.awk' (see section Extracting Programs from Texinfo Source Files), which can be used to extract the sample programs from the Texinfo source file for this Web page. It also contains a `Makefile.in' file, which configure uses to generate a `Makefile'. `Makefile.am' is used by GNU Automake to create `Makefile.in'. The library functions from A Library of awk Functions, and the igawk program from An Easy Way to Use Library Functions, are included as ready-to-use files in the gawk distribution. They are installed as part of the installation process. The rest of the programs in this Web page are available in appropriate subdirectories of `awklib/eg'.

`unsupported/atari/*'

Files needed for building gawk on an Atari ST (see section Installing gawk on the Atari ST, for details).

`unsupported/tandem/*'

Files needed for building gawk on a Tandem (see section Installing gawk on a Tandem, for details).

`posix/*'

Files needed for building gawk on POSIX-compliant systems.

`pc/*'

Files needed for building gawk under MS-DOS, MS Windows and OS/2 (see section Installation on PC Operating Systems, for details).

`vms/*'

Files needed for building gawk under VMS (see section How to Compile and Install gawk on VMS, for details).

`test/*'

A test suite for gawk. You can use `make check' from the top-level gawk directory to run your version of gawk against the test suite. If gawk successfully passes `make check', then you can be confident of a successful port.



B.2 Compiling and Installing gawk on Unix

Usually, you can compile and install gawk by typing only two commands. However, if you use an unusual system, you may need to configure gawk for your system yourself.



B.2.1 Compiling gawk for Unix

After you have extracted the gawk distribution, cd to `gawk-3.1.6'. Like most GNU software, gawk is configured automatically for your Unix system by running the configure program. This program is a Bourne shell script that is generated automatically using GNU autoconf. (The autoconf software is described fully in Autoconf--Generating Automatic Configuration Scripts, which is available from the Free Software Foundation.)

To configure gawk, simply run configure:

 
sh ./configure

This produces a `Makefile' and `config.h' tailored to your system. The `config.h' file describes various facts about your system. You might want to edit the `Makefile' to change the CFLAGS variable, which controls the command-line options that are passed to the C compiler (such as optimization levels or compiling for debugging).

Alternatively, you can add your own values for most make variables on the command line, such as CC and CFLAGS, when running configure:

 
CC=cc CFLAGS=-g sh ./configure

See the file `INSTALL' in the gawk distribution for all the details.

After you have run configure and possibly edited the `Makefile', type:

 
make

Shortly thereafter, you should have an executable version of gawk. That's all there is to it! To verify that gawk is working properly, run `make check'. All of the tests should succeed. If these steps do not work, or if any of the tests fail, check the files in the `README_d' directory to see if you've found a known problem. If the failure is not described there, please send in a bug report (see section Reporting Problems and Bugs).



B.2.2 Additional Configuration Options

There are several additional options you may use on the configure command line when compiling gawk from scratch, including:

--enable-portals

Treat pathnames that begin with `/p' as BSD portal files when doing two-way I/O with the `|&' operator (see section Using gawk with BSD Portals).

--enable-switch

Enable the recognition and execution of C-style switch statements in awk programs (see section The switch Statement).

--disable-lint

This option disables all lint checking within gawk. The `--lint' and `--lint-old' options (see section Command-Line Options) are accepted, but silently do nothing. Similarly, setting the LINT variable (see section Built-in Variables That Control awk) has no effect on the running awk program.

When used with GCC's automatic dead-code-elimination, this option cuts almost 200K bytes off the size of the gawk executable on GNU/Linux x86 systems. Results on other systems and with other compilers are likely to vary. Using this option may bring you some slight performance improvement.

Using this option will cause some of the tests in the test suite to fail. This option may be removed at a later date.

--disable-nls

Disable all message-translation facilities. This is usually not desirable, but it may bring you some slight performance improvement.

--disable-directories-fatal

Causes gawk to silently skip directories named on the command line.

As of version 3.1.5, the `--with-included-gettext' configuration option is no longer available, since gawk expects the GNU gettext library to be installed as an external library.
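The options above are passed on the configure command line in the usual way. The following is a minimal sketch of a wrapper that does so; `configure_gawk' is a hypothetical helper name invented for this example, and the only options shown are ones listed in this section.

```shell
# configure_gawk is a hypothetical helper, not part of the gawk distribution.
# Usage: configure_gawk SRCDIR [configure options...]
configure_gawk() {
    srcdir=$1
    shift
    if [ ! -f "$srcdir/configure" ]; then
        echo "configure_gawk: no configure script in $srcdir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$srcdir" && sh ./configure "$@" )
}

# Example: enable switch statements and turn off message translation.
# configure_gawk gawk-3.1.6 --enable-switch --disable-nls
```

Typing `sh ./configure --enable-switch --disable-nls' directly in the source directory has the same effect; the wrapper only adds a check that the script is present.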



B.2.3 The Configuration Process

This section is of interest only if you know something about using the C language and the Unix operating system.

The source code for gawk generally attempts to adhere to formal standards wherever possible. This means that gawk uses library routines that are specified by the ISO C standard and by the POSIX operating system interface standard. When using an ISO C compiler, function prototypes are used to help improve the compile-time checking.

Many Unix systems do not support all of either the ISO or the POSIX standards. The `missing_d' subdirectory in the gawk distribution contains replacement versions of those functions that are most likely to be missing.

The `config.h' file that configure creates contains definitions that describe features of the particular operating system where you are attempting to compile gawk. The three things described by this file are: what header files are available, so that they can be correctly included, what (supposedly) standard functions are actually available in your C libraries, and various miscellaneous facts about your variant of Unix. For example, there may not be an st_blksize element in the stat structure. In this case, `HAVE_ST_BLKSIZE' is undefined.
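To see what configure decided about a particular feature, you can search the generated `config.h'. The sketch below assumes the usual autoconf output format, in which a feature that was found appears as `#define NAME 1' and one that was not found appears inside a comment as `/* #undef NAME */'. The function name `have_feature' is made up for this example.

```shell
# have_feature is a hypothetical helper, not part of gawk or autoconf.
# Usage: have_feature MACRO [FILE]
# Succeeds if FILE (default: config.h) contains an active #define of MACRO.
have_feature() {
    macro=$1
    file=${2:-config.h}
    # Match "#define MACRO" followed by whitespace or end of line, so that
    # HAVE_FOO does not also match HAVE_FOO_BAR. A commented-out
    # "/* #undef MACRO */" line does not match.
    grep -Eq "^#[[:space:]]*define[[:space:]]+$macro([[:space:]]|\$)" "$file"
}
```

For instance, `have_feature HAVE_ST_BLKSIZE config.h && echo yes' prints `yes' only on systems where configure found the st_blksize element.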

It is possible for your C compiler to lie to configure. It may do so by not exiting with an error when a library function is not available. To get around this, edit the file `custom.h'. Use an `#ifdef' that is appropriate for your system, and either #define any constants that configure should have defined but didn't, or #undef any constants that configure defined and should not have. `custom.h' is automatically included by `config.h'.

It is also possible that the configure program generated by autoconf will not work on your system in some other fashion. If you do have a problem, the file `configure.in' is the input for autoconf. You may be able to change this file and generate a new version of configure that works on your system (see section Reporting Problems and Bugs, for information on how to report problems in configuring gawk). The same mechanism may be used to send in updates to `configure.in' and/or `custom.h'.



B.3 Installation on Other Operating Systems

This section describes how to install gawk on various non-Unix systems.



B.3.1 Installing gawk on an Amiga

You can install gawk on an Amiga system using a Unix emulation environment, available via anonymous ftp from ftp.ninemoons.com in the directory `pub/ade/current'. This includes a shell based on pdksh. The primary component of this environment is a Unix emulation library, `ixemul.lib'.

A more complete distribution for the Amiga is available on the Geek Gadgets CD-ROM, available from:

 
CRONUS
1840 E. Warner Road #105-265
Tempe, AZ 85284  USA
US Toll Free: (800) 804-0833
Phone: +1-602-491-0442
FAX: +1-602-491-0048
Email: info@ninemoons.com
WWW: http://www.ninemoons.com
Anonymous ftp site: ftp.ninemoons.com

Once you have the distribution, you can configure gawk simply by running configure:

 
configure -v m68k-amigaos

Then run make and you should be all set! If these steps do not work, please send in a bug report (see section Reporting Problems and Bugs).



B.3.2 Installing gawk on BeOS

Since BeOS DR9, all the tools that you should need to build gawk are included with BeOS. The process is basically identical to the Unix process of running configure and then make. Full instructions are given below.

You can compile gawk under BeOS by extracting the standard sources and running configure. You must specify the location prefix for the installation directory. For BeOS DR9 and beyond, the best directory to use is `/boot/home/config', so the configure command is:

 
configure --prefix=/boot/home/config

This installs the compiled application into `/boot/home/config/bin', which is already specified in the standard PATH.

Once the configuration process is completed, you can run make, and then `make install':

 
$ make
…
$ make install

BeOS uses bash as its shell; thus, you use gawk the same way you would under Unix. If these steps do not work, please send in a bug report (see section Reporting Problems and Bugs).



B.3.3 Installation on PC Operating Systems

This section covers installation and usage of gawk on x86 machines running DOS, any version of Windows, or OS/2. In this section, the term "Windows32" refers to any of Windows-95/98/ME/NT/2000.

The limitations of DOS (and DOS shells under Windows or OS/2) have meant that various "DOS extenders" are often used with programs such as gawk. The varying capabilities of Microsoft Windows 3.1 and Windows32 can add to the confusion. For an overview of the considerations, please refer to `README_d/README.pc' in the distribution.



B.3.3.1 Installing a Prepared Distribution for PC Systems

If you have received a binary distribution prepared by the DOS maintainers, then gawk and the necessary support files appear under the `gnu' directory, with executables in `gnu/bin', libraries in `gnu/lib/awk', and manual pages under `gnu/man'. This is designed for easy installation to a `/gnu' directory on your drive--however, the files can be installed anywhere provided AWKPATH is set properly. Regardless of the installation directory, the first line of `igawk.cmd' and `igawk.bat' (in `gnu/bin') may need to be edited.

The binary distribution contains a separate file describing the contents. In particular, it may include more than one version of the gawk executable.

OS/2 (32 bit, EMX) binary distributions are prepared for the `/usr' directory of your preferred drive. Set UNIXROOT to your installation drive (e.g., `e:') if you want to install gawk onto a drive other than the hardcoded default `c:'. Executables appear in `/usr/bin', libraries under `/usr/share/awk', manual pages under `/usr/man', Texinfo documentation under `/usr/info', and NLS files under `/usr/share/locale'. If you already have a file `/usr/info/dir' from another package, do not overwrite it! Instead, enter the following commands at your prompt (replace `x:' with your installation drive):

 
install-info --info-dir=x:/usr/info x:/usr/info/gawk.info
install-info --info-dir=x:/usr/info x:/usr/info/gawkinet.info

However, the files can be installed anywhere provided AWKPATH is set properly.

The binary distribution may contain a separate file containing additional or more detailed installation instructions.



B.3.3.2 Compiling gawk for PC Operating Systems

gawk can be compiled for MS-DOS, Windows32, and OS/2 using the GNU development tools from DJ Delorie (DJGPP; MS-DOS only) or Eberhard Mattes (EMX; MS-DOS, Windows32 and OS/2). Microsoft Visual C/C++ can be used to build a Windows32 version, and Microsoft C/C++ can be used to build 16-bit versions for MS-DOS and OS/2. (As of gawk 3.1.2, the MSC version doesn't work. However, the maintainer is working on fixing it.) The file `README_d/README.pc' in the gawk distribution contains additional notes, and `pc/Makefile' contains important information on compilation options.

To build gawk for MS-DOS, Windows32, and OS/2 (16 bit only; for 32 bit (EMX) you can use the configure script and skip the following paragraphs; for details see below), copy the files in the `pc' directory (except for `ChangeLog') to the directory with the rest of the gawk sources. The `Makefile' contains a configuration section with comments and may need to be edited in order to work with your make utility.

The `Makefile' contains a number of targets for building various MS-DOS, Windows32, and OS/2 versions. A list of targets is printed if the make command is given without a target. As an example, to build gawk using the DJGPP tools, enter `make djgpp'. (The DJGPP tools may be found at ftp://ftp.delorie.com/pub/djgpp/current/v2gnu/.)

Using make to run the standard tests and to install gawk requires additional Unix-like tools, including sh, sed, and cp. In order to run the tests, the `test/*.ok' files may need to be converted so that they have the usual DOS-style end-of-line markers. Most of the tests work properly with Stewartson's shell along with the companion utilities or appropriate GNU utilities. However, some editing of `test/Makefile' is required. It is recommended that you copy the file `pc/Makefile.tst' over the file `test/Makefile' as a replacement. Details can be found in `README_d/README.pc' and in the file `pc/Makefile.tst'.
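One possible way to give the `test/*.ok' files DOS-style end-of-line markers is sketched below. It assumes a Unix-like shell and an awk whose output is not itself subject to end-of-line translation; the function name `to_dos_eol' is invented for this example. `README_d/README.pc' remains the authoritative description of what the tests need.

```shell
# to_dos_eol is a hypothetical helper name. It reads standard input and
# writes the same lines to standard output with "\r\n" line endings.
to_dos_eol() {
    # Drop any carriage return already present, so that running the
    # filter twice does not double it, and then emit "\r\n".
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}

# Example use (left commented out, since it rewrites files in place):
# for f in test/*.ok; do
#     to_dos_eol < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
# done
```

Keep a copy of the original files before converting them, since some expected-output files may contain data that should not be altered.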

The 32 bit EMX version of gawk works "out of the box" under OS/2. In principle, it is possible to compile gawk the following way:

 
$ ./configure
$ make

This is not recommended, though. To get an OMF executable you should use the following commands at your sh prompt:

 
$ CPPFLAGS="-D__ST_MT_ERRNO__"
$ export CPPFLAGS
$ CFLAGS="-O2 -Zomf -Zmt"
$ export CFLAGS
$ LDFLAGS="-s -Zcrtdll -Zlinker /exepack:2 -Zlinker /pm:vio -Zstack 0x6000"
$ export LDFLAGS
$ RANLIB="echo"
$ export RANLIB
$ ./configure --prefix=c:/usr --without-included-gettext
$ make AR=emxomfar

These are just suggestions. You may use any other set of (self-consistent) environment variables and compiler flags.

To get an FHS-compliant file hierarchy it is recommended to use the additional configure options `--infodir=c:/usr/share/info', `--mandir=c:/usr/share/man' and `--libexecdir=c:/usr/lib'.
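Put together, the recommended directory options all derive from the chosen prefix. The following sketch builds the option list from a single variable and prints the resulting command instead of running it; the variable names `prefix' and `fhs_opts' are used only in this example.

```shell
# Build the FHS-style directory options from one prefix.
prefix=c:/usr
fhs_opts="--infodir=$prefix/share/info --mandir=$prefix/share/man --libexecdir=$prefix/lib"

# Print the command for inspection; remove "echo" to actually run it
# from the top of the gawk source directory.
echo ./configure --prefix="$prefix" $fhs_opts
```

Any compiler and linker settings from the commands above still need to be exported before configure is run.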

If you use GCC 2.95, it is recommended that you also set:

 
$ LIBS="-lgcc"
$ export LIBS

You can also get an a.out executable if you prefer:

 
$ CPPFLAGS="-D__ST_MT_ERRNO__"
$ export CPPFLAGS
$ CFLAGS="-O2 -Zmt"
$ export CFLAGS
$ LDFLAGS="-s -Zstack 0x6000"
$ export LDFLAGS
$ LIBS="-lgcc"
$ export LIBS
$ unset RANLIB
$ ./configure --prefix=c:/usr
$ make

NOTE: Versions later than GCC 2.95, i.e., GCC 3.x using the Innotek libc, were not tested.

NOTE: Even if the compiled gawk.exe (a.out) executable contains a DOS header, it does not work under DOS. To compile an executable that runs under DOS, "-DPIPES_SIMULATED" must be added to CPPFLAGS. But then some nonstandard extensions of gawk (e.g., `|&') do not work!

After compilation the internal tests can be performed. Enter `make check CMP="diff -a"' at your command prompt. All tests except for the pid test are expected to work properly. The pid test fails because child processes are not started by fork().

`make install' works as expected.

NOTE: Most OS/2 ports of GNU make are not able to handle the Makefiles of this package. If you encounter any problems with make, try GNU Make 3.79.1 or a later version. You should find the latest version on http://www.unixos2.org/sw/pub/binary/make/ or on ftp://hobbes.nmsu.edu/pub/os2/.



B.3.3.3 Compiling gawk For Dynamic Libraries

To compile gawk with dynamic extension support, uncomment the definitions of DYN_FLAGS, DYN_EXP, DYN_OBJ, and DYN_MAKEXP in the configuration section of the `Makefile'. There are two definitions for DYN_MAKEXP: pick the one that matches your target.

To build some of the example extension libraries, cd to the extension directory and copy `Makefile.pc' to `Makefile'. You can then build using the same two targets. To run the example awk scripts, you'll need to either change the call to the extension function to match the name of the library (for instance, change "./ordchr.so" to "ordchr.dll" or simply "ordchr"), or rename the library to match the call (for instance, rename `ordchr.dll' to `ordchr.so').

If you build gawk.exe with one compiler but want to build an extension library with the other, you need to copy the import library. Visual C uses a library called `gawk.lib', while MinGW uses a library called `libgawk.a'. These files are equivalent and will interoperate if you give them the correct name. The resulting shared libraries are also interoperable.

To create your own extension library, you can use the examples as models, but you're essentially on your own. Post to comp.lang.awk or send electronic mail to ptjm@interlog.com if you have problems getting started. If you need to access functions or variables which are not exported by gawk.exe, add them to `gawkw32.def' and rebuild. You should also add ATTRIBUTE_EXPORTED to the declaration in `awk.h' of any variables you add to `gawkw32.def'.

Note that extension libraries have the name of the awk executable embedded in them at link time, so they will work only with gawk.exe. In particular, they won't work if you rename gawk.exe to awk.exe or if you try to use pgawk.exe. As a workaround, you can perform profiling by temporarily renaming pgawk.exe to gawk.exe. Alternatively, you can resolve this problem by changing the program name in the definition of DYN_MAKEXP for your compiler.

On Windows32, libraries are sought first in the current directory, then in the directory containing gawk.exe, and finally through the PATH environment variable.



B.3.3.4 Using gawk on PC Operating Systems

With the exception of the Cygwin environment, the `|&' operator and TCP/IP networking (see section Using gawk for Network Programming) are not supported for MS-DOS or MS-Windows. EMX (OS/2 only) does support at least the `|&' operator.

The OS/2 and MS-DOS versions of gawk search for program files as described in The AWKPATH Environment Variable. However, semicolons (rather than colons) separate elements in the AWKPATH variable. If AWKPATH is not set or is empty, then the default search path for OS/2 (16 bit) and MS-DOS versions is ".;c:/lib/awk;c:/gnu/lib/awk".

The search path for OS/2 (32 bit, EMX) is determined by the prefix directory (most likely `/usr' or `c:/usr') that was specified as an option to the configure script, as is the case for the Unix versions. If `c:/usr' is the prefix directory, then the default search path contains `.' and `c:/usr/share/awk'. Additionally, to support binary distributions of gawk for OS/2 systems whose drive `c:' might not support long file names or might not exist at all, there is a special environment variable. If UNIXROOT specifies a drive, then that drive is also searched for program files. For example, if UNIXROOT is set to `e:', the complete default search path is ".;c:/usr/share/awk;e:/usr/share/awk".
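The rule for the OS/2 (32 bit, EMX) default path can be modeled in a few lines of shell. This is only an illustration of the description above and not gawk's actual code; `default_awkpath' is a hypothetical name, and the sketch makes no attempt to handle a UNIXROOT that names the same drive as the prefix.

```shell
# default_awkpath is a hypothetical model of the rule described above.
# Usage: default_awkpath PREFIX [UNIXROOT]
default_awkpath() {
    prefix=$1
    unixroot=${2-}
    # The current directory and PREFIX/share/awk are always searched.
    path=".;$prefix/share/awk"
    if [ -n "$unixroot" ]; then
        # Replace the drive letter of the prefix (c:/usr -> /usr) with the
        # UNIXROOT drive and append the result, separated by a semicolon.
        path="$path;$unixroot${prefix#?:}/share/awk"
    fi
    echo "$path"
}

default_awkpath c:/usr e:
# prints: .;c:/usr/share/awk;e:/usr/share/awk
```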

An sh-like shell (as opposed to command.com under MS-DOS or cmd.exe under OS/2) may be useful for awk programming. Ian Stewartson has written an excellent shell for MS-DOS and OS/2, Daisuke Aoyama has ported GNU bash to MS-DOS using the DJGPP tools, and several shells are available for OS/2, including ksh. The file `README_d/README.pc' in the gawk distribution contains information on these shells. Users of Stewartson's shell on DOS should examine its documentation for handling command lines; in particular, the setting for gawk in the shell configuration may need to be changed and the ignoretype option may also be of interest.

Under OS/2 and DOS, gawk (and many other text programs) silently translates end-of-line "\r\n" to "\n" on input and "\n" to "\r\n" on output. A special BINMODE variable allows control over these translations and is interpreted as follows:

The modes for standard input and standard output are set one time only (after the command line is read, but before processing any of the awk program). Setting BINMODE for standard input or standard output is accomplished by using an appropriate `-v BINMODE=N' option on the command line. BINMODE is set at the time a file or pipe is opened and cannot be changed mid-stream.

The name BINMODE was chosen to match mawk (see section Other Freely Available awk Implementations). Both mawk and gawk handle BINMODE similarly; however, mawk adds a `-W BINMODE=N' option and an environment variable that can set BINMODE, RS, and ORS. The files `binmode[1-3].awk' (under `gnu/lib/awk' in some of the prepared distributions) have been chosen to match mawk's `-W BINMODE=N' option. These can be changed or discarded; in particular, the setting of RS giving the fewest "surprises" is open to debate. mawk uses `RS = "\r\n"' if binary mode is set on read, which is appropriate for files with the DOS-style end-of-line.

To illustrate, the following examples set binary mode on writes for standard output and other files, and set ORS as the "usual" DOS-style end-of-line:

 
gawk -v BINMODE=2 -v ORS="\r\n" …

or:

 
gawk -v BINMODE=w -f binmode2.awk …

These give the same result as the `-W BINMODE=2' option in mawk. The following changes the record separator to "\r\n" and sets binary mode on reads, but does not affect the mode on standard input:

 
gawk -v RS="\r\n" --source "BEGIN { BINMODE = 1 }" …

or:

 
gawk -f binmode1.awk …

With proper quoting, in the first example the setting of RS can be moved into the BEGIN rule.



B.3.3.5 Using gawk In The Cygwin Environment

gawk can be used "out of the box" under Windows if you are using the Cygwin environment.(70) This environment provides an excellent simulation of Unix, using the GNU tools, such as bash, the GNU Compiler Collection (GCC), GNU Make, and other GNU tools. Compilation and installation for Cygwin is the same as for a Unix system:

 
tar -xvpzf gawk-3.1.6.tar.gz
cd gawk-3.1.6
./configure
make

When compared to GNU/Linux on the same system, the `configure' step on Cygwin takes considerably longer. However, it does finish, and then the `make' proceeds as usual.

NOTE: The `|&' operator and TCP/IP networking (see section Using gawk for Network Programming) are fully supported in the Cygwin environment. This is not true for any other environment for MS-DOS or MS-Windows.



B.3.4 How to Compile and Install gawk on VMS

This subsection describes how to compile and install gawk under VMS.



B.3.4.1 Compiling gawk on VMS

To compile gawk under VMS, there is a DCL command procedure that issues all the necessary CC and LINK commands. There is also a `Makefile' for use with the MMS utility. From the source directory, use either:

 
$ @[.VMS]VMSBUILD.COM

or:

 
$ MMS/DESCRIPTION=[.VMS]DESCRIP.MMS GAWK

Depending upon which C compiler you are using, follow one of the sets of instructions in this table:

VAX C V3.x

Use either `vmsbuild.com' or `descrip.mms' as is. These use CC/OPTIMIZE=NOLINE, which is essential for Version 3.0.

VAX C V2.x

You must have Version 2.3 or 2.4; older ones won't work. Edit either `vmsbuild.com' or `descrip.mms' according to the comments in them. For `vmsbuild.com', this just entails removing two `!' delimiters. Also edit `config.h' (which is a copy of file `[.config]vms-conf.h') and comment out or delete the two lines `#define __STDC__ 0' and `#define VAXC_BUILTINS' near the end.

GNU C

Edit `vmsbuild.com' or `descrip.mms'; the changes are different from those for VAX C V2.x but equally straightforward. No changes to `config.h' are needed.

DEC C

Edit `vmsbuild.com' or `descrip.mms' according to their comments. No changes to `config.h' are needed.

gawk has been tested under VAX/VMS 5.5-1 using VAX C V3.2, and GNU C 1.40 and 2.3. It should work without modifications for VMS V4.6 and up.



B.3.4.2 Installing gawk on VMS

To install gawk, all you need is a "foreign" command, which is a DCL symbol whose value begins with a dollar sign. For example:

 
$ GAWK :== $disk1:[gnubin]GAWK

Substitute the actual location of gawk.exe for `$disk1:[gnubin]'. The symbol should be placed in the `login.com' of any user who wants to run gawk, so that it is defined every time the user logs on. Alternatively, the symbol may be placed in the system-wide `sylogin.com' procedure, which allows all users to run gawk.

Optionally, the help entry can be loaded into a VMS help library:

 
$ LIBRARY/HELP SYS$HELP:HELPLIB [.VMS]GAWK.HLP

(You may want to substitute a site-specific help library rather than the standard VMS library `HELPLIB'.) After loading the help text, the command:

 
$ HELP GAWK

provides information about both the gawk implementation and the awk programming language.

The logical name `AWK_LIBRARY' can designate a default location for awk program files. For the `-f' option, if the specified file name has no device or directory path information in it, gawk looks in the current directory first, then in the directory specified by the translation of `AWK_LIBRARY' if the file is not found. If, after searching in both directories, the file still is not found, gawk appends the suffix `.awk' to the filename and retries the file search. If `AWK_LIBRARY' is not defined, that portion of the file search fails benignly.
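The lookup order for `-f' files can be easier to follow when written out as a procedure. The sketch below models it in Unix shell rather than DCL, with an ordinary directory standing in for the translation of `AWK_LIBRARY'. It is an illustration of the description above and not the code gawk uses; `find_awk_program' is a hypothetical name.

```shell
# find_awk_program is a hypothetical model of the lookup order.
# Usage: find_awk_program NAME [LIBDIR]   (LIBDIR stands in for AWK_LIBRARY)
# Prints the first match and succeeds, or fails if nothing is found.
find_awk_program() {
    name=$1
    libdir=${2-}
    # First pass uses the name as given; second pass retries with ".awk".
    for candidate in "$name" "$name.awk"; do
        # The current directory is searched first.
        if [ -f "./$candidate" ]; then
            echo "./$candidate"
            return 0
        fi
        # Then the library directory; if none is defined, this step is
        # skipped without an error.
        if [ -n "$libdir" ] && [ -f "$libdir/$candidate" ]; then
            echo "$libdir/$candidate"
            return 0
        fi
    done
    return 1
}
```

With a file `foo.awk' present only in the library directory, `find_awk_program foo LIBDIR' fails on the first pass and finds `LIBDIR/foo.awk' on the second.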



B.3.4.3 Running gawk on VMS

Command-line parsing and quoting conventions are significantly different on VMS, so examples in this Web page or from other sources often need minor changes. The changes are minor, though, and all awk programs should run correctly.

Here are a couple of trivial tests:

 
$ gawk -- "BEGIN {print ""Hello, World!""}"
$ gawk -"W" version
! could also be -"W version" or "-W version"

Note that uppercase and mixed-case text must be quoted.

The VMS port of gawk includes a DCL-style interface in addition to the original shell-style interface (see the help entry for details). One side effect of dual command-line parsing is that if there is only a single parameter (as in the quoted string program above), the command becomes ambiguous. To work around this, the normally optional `--' flag is required to force Unix style rather than DCL parsing. If any other dash-type options (or multiple parameters such as data files to process) are present, there is no ambiguity and `--' can be omitted.

The default search path, when looking for awk program files specified by the `-f' option, is "SYS$DISK:[],AWK_LIBRARY:". The logical name `AWKPATH' can be used to override this default. The format of `AWKPATH' is a comma-separated list of directory specifications. When defining it, the value should be quoted so that it retains a single translation and not a multitranslation RMS searchlist.



B.3.4.4 Building and Using gawk on VMS POSIX

Ignore the instructions above, although `vms/gawk.hlp' should still be made available in a help library. The source tree should be unpacked into a container file subsystem rather than into the ordinary VMS filesystem. Make sure that the two scripts, `configure' and `vms/posix-cc.sh', are executable; use `chmod +x' on them if necessary. Then execute the following two commands:

 
psx> CC=vms/posix-cc.sh configure
psx> make CC=c89 gawk

The first command constructs files `config.h' and `Makefile' out of templates, using a script to make the C compiler fit configure's expectations. The second command compiles and links gawk using the C compiler directly; ignore any warnings from make about being unable to redefine CC. configure takes a very long time to execute, but at least it provides incremental feedback as it runs.

This has been tested with VAX/VMS V6.2, VMS POSIX V2.0, and DEC C V5.2.

Once built, gawk works like any other shell utility. Unlike the normal VMS port of gawk, no special command-line manipulation is needed in the VMS POSIX environment.



B.3.4.5 Some VMS Systems Have An Old Version of gawk

Some versions of VMS have an old version of gawk. To access it, define a symbol, as follows:

 
$ gawk :== $ sys$common:[syshlp.examples.tcpip.snmp]gawk.exe

This is apparently version 2.15.6, which is quite old. We recommend compiling and using the current version.



B.4 Unsupported Operating System Ports

This section describes systems for which the gawk port is no longer supported.



B.4.1 Installing gawk on the Atari ST

The Atari port is no longer supported. It is included for those who might want to use it, but it is no longer actively maintained.

There are no substantial differences when installing gawk on various Atari models. Compiled gawk executables do not require a large amount of memory with most awk programs, and should run on all Motorola processor-based models (referred to below as the ST, even if that is not exactly right).

In order to use gawk, you need to have a shell, either text or graphics, that does not map all the characters of a command line to uppercase. Maintaining case distinction in option flags is very important (see section Command-Line Options). These days this is the default and it may only be a problem for some very old machines. If your system does not preserve the case of option flags, you need to upgrade your tools. Support for I/O redirection is necessary to make it easy to import awk programs from other environments. Pipes are nice to have but not vital.



B.4.1.1 Compiling gawk on the Atari ST

A proper compilation of gawk sources when sizeof(int) differs from sizeof(void *) requires an ISO C compiler. An initial port was done with gcc. You may actually prefer executables where ints are four bytes wide but the other variant works as well.

You may need quite a bit of memory when trying to recompile the gawk sources, as some source files (`regex.c' in particular) are quite big. If you run out of memory compiling such a file, try reducing the optimization level for this particular file, which may help.

With a reasonable shell (bash will do), you have a pretty good chance that the configure utility will succeed, particularly if you run GNU/Linux, MiNT, or a similar operating system. Otherwise, sample versions of `config.h' and `Makefile.st' are given in the `atari' subdirectory and can be edited and copied to the corresponding files in the main source directory. Even if configure produces something, it might be advisable to compare its results with the sample versions and possibly make adjustments.

Some gawk source code fragments depend on a preprocessor define `atarist'. This basically assumes the TOS environment with gcc. Modify these sections as appropriate if they are not right for your environment. Also see the remarks about AWKPATH and envsep in Running gawk on the Atari ST.

As shipped, the sample `config.h' claims that the system function is missing from the libraries, which is not true, and an alternative implementation of this function is provided in `unsupported/atari/system.c'. Depending upon your particular combination of shell and operating system, you might want to change the file to indicate that system is available.



B.4.1.2 Running gawk on the Atari ST

An executable version of gawk should be placed, as usual, anywhere in your PATH where your shell can find it.

While executing, the Atari version of gawk creates a number of temporary files. When using gcc libraries for TOS, gawk looks for either of the environment variables, TEMP or TMPDIR, in that order. If either one is found, its value is assumed to be a directory for temporary files. This directory must exist, and if you can spare the memory, it is a good idea to put it on a RAM drive. If neither TEMP nor TMPDIR is found, then gawk uses the current directory for its temporary files.
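The order in which the temporary directory is chosen can be written out as a short POSIX shell function. This is a sketch of the rule as stated in the text, not the code in the gcc libraries for TOS. The function name `pick_temp_dir' is made up for the example, and "found" is assumed here to mean that the variable is set at all, even to an empty value.

```shell
# Hypothetical helper: choose a directory for temporary files the way the
# text describes: TEMP first, then TMPDIR, otherwise the current directory.
pick_temp_dir() {
    if [ -n "${TEMP+set}" ]; then        # TEMP is set, so it wins
        printf '%s\n' "$TEMP"
    elif [ -n "${TMPDIR+set}" ]; then    # TEMP unset, fall back to TMPDIR
        printf '%s\n' "$TMPDIR"
    else                                 # neither set: current directory
        printf '%s\n' "."
    fi
}
```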

The ST version of gawk searches for its program files, as described in The AWKPATH Environment Variable. The default value for the AWKPATH variable is taken from DEFPATH defined in `Makefile'. The sample gcc/TOS `Makefile' for the ST in the distribution sets DEFPATH to ".,c:\lib\awk,c:\gnu\lib\awk". The search path can be modified by explicitly setting AWKPATH to whatever you want. Note that colons cannot be used on the ST to separate elements in the AWKPATH variable, since they have another reserved meaning. Instead, you must use a comma to separate elements in the path. When recompiling, the separating character can be modified by initializing the envsep variable in `unsupported/atari/gawkmisc.atr' to another value.
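To see why a comma is the practical separator here, it can help to split the sample DEFPATH value into its elements. The sketch below does this in a POSIX shell with `tr'. The helper name `split_awkpath' is an assumption made for the example and is not part of gawk.

```shell
# Hypothetical helper: print each element of a comma-separated search path
# on its own line.  Colons stay inside the elements, where they mark the
# drive letter.
split_awkpath() {
    printf '%s\n' "$1" | tr ',' '\n'
}

# The sample value from the text; single quotes keep the backslashes intact.
split_awkpath '.,c:\lib\awk,c:\gnu\lib\awk'
```

Each element keeps its `c:' drive prefix, which is exactly what would be broken apart if the colon were used as the separator.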

Although awk allows great flexibility in doing I/O redirections from within a program, this facility should be used with care on the ST running under TOS. In some circumstances, the OS routines for file-handle pool processing lose track of certain events, causing the computer to crash and requiring a reboot. Often a warm reboot is sufficient. Fortunately, this happens infrequently and in rather esoteric situations. In particular, avoid having one part of an awk program using print statements explicitly redirected to `/dev/stdout', while other print statements use the default standard output, and a calling shell has redirected standard output to a file.

When gawk is compiled with the ST version of gcc and its usual libraries, it accepts both `/' and `\' as path separators. While this is convenient, it should be remembered that this removes one technically valid character (`/') from your file name. It may also create problems for external programs called via the system function, which may not support this convention. Whenever it is possible that a file created by gawk will be used by some other program, use only backslashes. Also remember that in awk, backslashes in strings have to be doubled in order to get literal backslashes (see section Escape Sequences).
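The doubling rule for backslashes is easy to check with a one-line program. The example is written for a POSIX shell, where single quotes pass the program text to awk unchanged; quoting on an ST shell may differ. The file name is made up for illustration.

```shell
# Each "\\" inside the awk string literal yields one literal backslash,
# so this prints a path with single backslashes between its components.
awk 'BEGIN { print "c:\\lib\\awk\\out.txt" }'
```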



B.4.2 Installing gawk on a Tandem

The Tandem port is only minimally supported. The port's contributor no longer has access to a Tandem system.

The Tandem port was done on a Cyclone machine running D20. The port is pretty clean and all facilities seem to work except for the I/O piping facilities (see section Using getline from a Pipe, Using getline into a Variable from a Pipe, and Redirecting Output of print and printf), which is just too foreign a concept for Tandem.

To build a Tandem executable from source, download all of the files so that the file names on the Tandem box conform to the restrictions of D20. For example, `array.c' becomes `ARRAYC', and `awk.h' becomes `AWKH'. The totally Tandem-specific files are in the `tandem' "subvolume" (`unsupported/tandem' in the gawk distribution) and should be copied to the main source directory before building gawk.

The file `compit' can then be used to compile and bind an executable. Alas, there is no configure or make.

Usage is the same as for Unix, except that D20 requires all `{' and `}' characters to be escaped with `~' on the command line (but not in script files). Also, the standard Tandem syntax for `/in filename,out filename/' must be used instead of the usual Unix `<' and `>' for file redirection. (Redirection options on getline, print, etc., are supported.)

The `-mr val' option (see section Command-Line Options) has been "stolen" to enable Tandem users to process fixed-length records with no "end-of-line" character. That is, `-mr 74' tells gawk to read the input file as fixed 74-byte records.
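For readers without a Tandem system, the idea of fixed-length records can be illustrated in portable awk by cutting the input into equal-sized pieces with substr. This is only a sketch of the concept, not how the Tandem port implements `-mr'. The helper name `fixed_records', the 4-byte record length, and the sample data are assumptions made for the example, and the sketch assumes the data contains no newline characters and fits in memory.

```shell
# Hypothetical helper: read standard input as fixed-length records of
# $1 bytes each and print one record per line.
fixed_records() {
    awk -v len="$1" '
        { data = data $0 }    # collect the input (no newlines expected)
        END {
            # Walk the collected data in steps of len characters.
            for (i = 1; i <= length(data); i += len)
                print substr(data, i, len)
        }'
}

# Three 4-byte records with no end-of-line character between them.
printf 'AAAABBBBCCCC' | fixed_records 4
```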



B.5 Reporting Problems and Bugs

There is nothing more dangerous than a bored archeologist.
The Hitchhiker's Guide to the Galaxy

If you have problems with gawk or think that you have found a bug, please report it to the developers; we cannot promise to do anything but we might well want to fix it.

Before reporting a bug, make sure you have actually found a real bug. Carefully reread the documentation and see if it really says you can do what you're trying to do. If it's not clear whether you should be able to do something or not, report that too; it's a bug in the documentation!

Before reporting a bug or trying to fix it yourself, try to isolate it to the smallest possible awk program and input data file that reproduces the problem. Then send us the program and data file, some idea of what kind of Unix system you're using, the compiler you used to compile gawk, and the exact results gawk gave you. Also say what you expected to occur; this helps us decide whether the problem is really in the documentation.

Once you have a precise problem, send email to bug-gawk@gnu.org.

Please include the version number of gawk you are using. You can get this information with the command `gawk --version'. Using this address automatically sends a carbon copy of your mail to me. If necessary, I can be reached directly at arnold@skeeve.com. The bug reporting address is preferred since the email list is archived at the GNU Project. All email should be in English, since that is my native language.
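One way to gather the items listed above is a short shell script that writes them into a single file to paste into the email. This is only a sketch: the file names, the tiny program and input, and the `AWK' variable are placeholders, and the compiler used to build gawk still has to be added by hand.

```shell
# Hypothetical script: collect the pieces of a bug report in one file.
# AWK names the interpreter under test; it defaults to gawk.
AWK=${AWK:-gawk}

printf '%s\n' 'a b c' > bug-input.txt          # smallest input that shows it
printf '%s\n' '{ print NF }' > bug-prog.awk    # smallest program that shows it

{
    echo "== version =="
    "$AWK" --version 2>&1 | head -n 1
    echo "== system =="
    uname -a
    echo "== actual output =="
    "$AWK" -f bug-prog.awk bug-input.txt 2>&1
    echo "== expected output =="
    echo "3"
} > bug-report.txt
```

Replace the placeholder program, input, and expected output with the real ones before sending.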

Caution: Do not try to report bugs in gawk by posting to the Usenet/Internet newsgroup comp.lang.awk. While the gawk developers do occasionally read this newsgroup, there is no guarantee that we will see your posting. The steps described above are the officially recognized ways for reporting bugs.

Non-bug suggestions are always welcome as well. If you have questions about things that are unclear in the documentation or are just obscure features, ask me; I will try to help you out, although I may not have the time to fix the problem. You can send me electronic mail at the Internet address noted previously.

If you find bugs in one of the non-Unix ports of gawk, please send an electronic mail message to the person who maintains that port. They are named in the following list, as well as in the `README' file in the gawk distribution. Information in the `README' file should be considered authoritative if it conflicts with this Web page.

The people maintaining the non-Unix ports of gawk are as follows:

Amiga

Fred Fish, fnf@ninemoons.com.

BeOS

Martin Brown, mc@whoever.com.

MS-DOS

Scott Deifik, scottd.mail@sbcglobal.net and Darrel Hankerson, hankedr@mail.auburn.edu.

MS-Windows

Juan Grigera, juan@biophnet.unlp.edu.ar.

OS/2

The Unix for OS/2 team, gawk-maintainer@unixos2.org.

Tandem

Stephen Davies, scldad@sdc.com.au.

VMS

Pat Rankin, rankin@pactechdata.com.

If your bug is also reproducible under Unix, please send a copy of your report to the bug-gawk@gnu.org email list as well.



B.6 Other Freely Available awk Implementations

It's kind of fun to put comments like this in your awk code.
      // Do C++ comments work? answer: yes! of course
Michael Brennan

There are a number of other freely available awk implementations. This section briefly describes where to get them:

Unix awk

Brian Kernighan has made his implementation of awk freely available. You can retrieve this version via the World Wide Web from his home page.(71) It is available in several archive formats:

Shell archive

http://cm.bell-labs.com/who/bwk/awk.shar

Compressed tar file

http://cm.bell-labs.com/who/bwk/awk.tar.gz

Zip file

http://cm.bell-labs.com/who/bwk/awk.zip

This version requires an ISO C (1990 standard) compiler; the C compiler from GCC (the GNU Compiler Collection) works quite nicely.

See section Extensions in the Bell Laboratories awk, for a list of extensions in this awk that are not in POSIX awk.

mawk

Michael Brennan has written an independent implementation of awk, called mawk. It is available under the GPL (see section GNU General Public License), just as gawk is.

You can get it via anonymous ftp to the host ftp.whidbey.net. Change directory to `/pub/brennan'. Use "binary" or "image" mode, and retrieve `mawk1.3.3.tar.gz' (or the latest version that is there).

gunzip may be used to decompress this file. Installation is similar to gawk's (see section Compiling and Installing gawk on Unix).

mawk has the following extensions that are not in POSIX awk:

The next version of mawk will support nextfile.

awka

Written by Andrew Sumner, awka translates awk programs into C, compiles them, and links them with a library of functions that provides the core awk functionality. It also has a number of extensions.

The awk translator is released under the GPL, and the library is under the LGPL.

To get awka, go to http://awka.sourceforge.net. You can reach Andrew Sumner at andrew@zbcom.net.

pawk

Nelson H.F. Beebe at the University of Utah has modified the Bell Labs awk to provide timing and profiling information. It is different from pgawk (see section Profiling Your awk Programs), in that it uses CPU-based profiling, not line-count profiling. You may find it at either ftp://ftp.math.utah.edu/pub/pawk/pawk-20020210.tar.gz or http://www.math.utah.edu/pub/pawk/pawk-20020210.tar.gz.

The OpenSolaris POSIX awk

The version of awk in `/usr/xpg4/bin' on Solaris is POSIX compliant. It is based on the awk from Mortice Kern Systems for PCs. The source code can be downloaded from the OpenSolaris web site.(72) This author was able to make it compile and work under GNU/Linux with 1-2 hours of work. Making it more generally portable (using GNU Autoconf and/or Automake) would take more work, and this has not been done, at least to our knowledge.

jawk

This is an interpreter for awk written in Java. It claims to be a full interpreter, although because it uses Java facilities for I/O and for regexp matching, the language it supports is different from POSIX awk. More information is available on the project's home page.(73)



This document was generated on December 30, 2007 using texi2html 1.76.