User Guide
Chapters
Table of Contents
QTAwk History
Built In Variable List

Differences From Awk

Differences Between QTAwk and Awk:

Regular Expressions
are a separate type of their own, on an equal footing with strings, integers and floating point numbers. Thus, regular expressions may be assigned to variables, and the variables used wherever regular expressions would be used. This also changes the accepted Awk behavior of a regular expression constant always matching the current input record. That behavior is retained only for a regular expression constant in a pattern. Elsewhere, the match must be explicitly coded. Thus, the Awk behavior for the line:

if ( /return/ )

which implicitly performs a match against $0, must be coded under QTAwk as:

if ( $0 ~~ /return/ )

Under Awk, assigning a regular expression to a variable is not possible. The line:

Are = /return/;

assigns a value of 0 or 1 to the variable 'Are', depending on whether /return/ matches $0. Under QTAwk, 'Are' is assigned the regular expression itself. Awk has no concept of a regular expression as a separate data type.
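
As a rough analogy only, written in Python rather than QTAwk, the following sketch contrasts the two readings of the assignment above. The helper names 'awk_style_assign' and 'qtawk_style_assign' are hypothetical and exist only for this illustration.

```python
import re

def awk_style_assign(record):
    # Awk reading: the regex constant is matched against the record ($0)
    # immediately, so the variable receives 1 or 0.
    return 1 if re.search(r"return", record) else 0

def qtawk_style_assign():
    # QTAwk reading: the variable holds the regular expression itself.
    return re.compile(r"return")

record = "    return x;"
are_awk = awk_style_assign(record)
are_qtawk = qtawk_style_assign()

# With a regex value, the match against the record must be explicit,
# comparable to writing `$0 ~~ Are` in QTAwk.
matched = are_qtawk.search(record) is not None
```

The point of the sketch is only that a regular expression held in a variable is a value that can be passed around and matched later, not a match result.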

Expanded Regular Expressions.
All of the Awk regular expression operators are allowed plus the following:
  1. Complemented character lists using the Awk notation, '[^...]', as well as the Awk/QTAwk and C logical negation operator, '[!...]'.
  2. Matched character lists, '[#...]'. These lists are used in pairs. The position of the character matched in the first list of the pair, determines the character which must match in the position occupied by the second list of the pair.
  3. Look-ahead Operator. r@t regular expression r is matched only when followed by regular expression t.
  4. Repetition Operator. r{n1,n2} at least n1 and up to n2 repetitions of regular expression r.
  5. Named Expressions. {named_expr} is replaced by the string value of the corresponding variable.
  6. Tagged Expressions. Enclosing a portion of a regular expression in parentheses, '()', makes the matching string available for use with the Tag Operator, '[< >]'.
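
Three of the operators listed above have close counterparts in Python's 're' module, which can serve as a stand-in to show what they do. This is a sketch in Python syntax, not QTAwk syntax: QTAwk writes look-ahead as r@t, whereas Python writes a comparable construct as r(?=t), and the named-expression substitution is imitated here with ordinary string formatting.

```python
import re

# r{n1,n2}: at least n1 and up to n2 repetitions of r.
rep = re.compile(r"(?:ab){2,3}")

# Look-ahead: "foo" matches only when followed by "bar";
# "bar" itself is not part of the matched text.
look = re.compile(r"foo(?=bar)")

# Named expression: the string value of a variable is substituted
# into the pattern before it is compiled.
digits = "[0-9]+"
named = re.compile(rf"{digits}\.{digits}")
```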
Consistent statement termination syntax.
The QTAwk Utility Creation Tool utilizes the semi-colon, ';', to terminate all statements. The practice in Awk of using new lines to "sometimes" terminate statements is no longer allowed.
Expanded Operator Set.
The Awk set of operators has been changed to make them more consistent and to more closely match those of C. The Awk match operator, ~, has been changed to ~~ so that the similarity between the match operators, '~~' and '!~', and the equality operators, '==' and '!=', is complete. The single tilde symbol, ~, reverts to the C one's complement operator, an addition to the operator set over Awk. An explicit string concatenation operator, '><', has also been introduced. The operators new to QTAwk are:
Operation             Operator
tag                   [< >]
one's complement      ~
concatenation         ><
shift left/right      << >>
matching              ~~ !~
bit-wise AND          &
bit-wise XOR          @
bit-wise OR           |
sequence              ,

The caret, ^, remains as the exponentiation operator. The symbol @ is used for the exclusive OR operator. For string operands, the shift operators, << and >>, shift the strings with wrap-around instead of performing a bit shift as for numeric operands.
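
A shift with wrap-around on a string is a rotation of its characters. The Python sketch below models that idea; the function names are hypothetical, and the treatment of empty strings and of shift counts larger than the string length is an assumption, since the text above does not specify it.

```python
def shift_left(s, n):
    """Rotate s left by n characters, wrapping around the end."""
    if not s:
        return s
    n %= len(s)  # assumption: counts beyond the length wrap again
    return s[n:] + s[:n]

def shift_right(s, n):
    """Rotate s right by n characters; the mirror image of shift_left."""
    return shift_left(s, -n)
```

Under these assumptions, shifting "abcde" left by 2 gives "cdeab" and shifting it right by 2 gives "deabc".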

Expanded set of recognized constants:
  1. decimal integers,
  2. octal integers,
  3. hexadecimal integers,
  4. character constants, and
  5. floating point constants.

These constants are recognized in utilities, input fields and strings.

Expanded predefined patterns
giving more control:
INITIAL
similar to BEGIN. Actions executed after opening each input file and before reading first record.
FINAL
similar to END. Actions executed after reading last record of each input file and before closing file.
NOMATCH
actions executed for each input record for which no pattern was matched.
GROUP
used to group multiple regular expressions for search optimization. Can speed search by a factor of six.
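
The idea behind GROUP, scanning once for several regular expressions and learning which one matched, can be sketched in Python by joining the expressions into a single alternation. This is only a model of the concept: the helper names are hypothetical, the returned number plays the role the NG built-in variable is described as having, and nothing here reproduces QTAwk's actual optimization or its speed-up.

```python
import re

def compile_group(patterns):
    # Give each pattern its own named group (g1, g2, ...) so that a
    # single scan can report which alternative matched.
    parts = [f"(?P<g{i}>{p})" for i, p in enumerate(patterns, start=1)]
    return re.compile("|".join(parts))

def group_match(group_re, record):
    # Return the 1-based number of the pattern that matched, or 0.
    m = group_re.search(record)
    if m is None:
        return 0
    return int(m.lastgroup[1:])

group = compile_group([r"error", r"warning", r"\bnote\b"])
```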
True multidimensional Arrays.
The use of the comma in index expressions to simulate multiple array indices is no longer supported. True multiple indices are supported. Indexing is in the C manner, 'a[i1][i2]'. The use of the SUBSEP built-in variable of Awk has been redefined.
Integer array indices as well as string indices.
Array indices have been expanded to include integers as well as the string indices of Awk. Indices are not automatically converted to strings as in Awk. Thus, for true integer indices, the index ordering follows the numeric sequence with an integer index value of '10' following an integer value of '2' instead of preceding it.
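
The ordering difference is easy to see outside of QTAwk. In this Python sketch the same four index values are sorted once as integers and once as strings:

```python
int_indices = [10, 2, 33, 4]
str_indices = [str(i) for i in int_indices]

numeric_order = sorted(int_indices)  # compared as numbers
string_order = sorted(str_indices)   # compared character by character
```

As integers the order is 2, 4, 10, 33. As strings it is "10", "2", "33", "4", because "1" sorts before "2".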
Arrays integrated into QTAwk.
QTAwk integrates arrays with arithmetic operators so that the operations are carried out on the entire array. QTAwk also integrates arrays into user-defined functions so that they can be passed to and returned from such functions in a natural and intuitive manner. Awk does not allow returning arrays from user-defined functions or allow arithmetic operators to operate on whole arrays.

In addition, with Version 6.00 for PC/MS-DOS and Version 1.00 for OS/2, arrays have been fully integrated into all aspects of QTAwk, including the match operators, '~~' and '!~', their implied use in patterns, and the built-in functions 'sub', 'gsub', and 'match'. The MATCH_INDEX built-in variable has been added to return the matching array element index when an array has been used for pattern matching. The string value of the SUBSEP built-in variable is used as the index separator in MATCH_INDEX for multidimensional arrays.

Arrays used as regular expressions with the match operators, both explicit and implied, retain their internal regular expression form between uses. In addition, when the array as a whole is assigned to another variable, the internal regular expression form is also assigned. The internal regular expression form is discarded only when the array is changed. This gives the user a more balanced control over dynamic regular expressions, between that of true regular expressions, which retain the internal form until execution is halted, and strings used as regular expressions, which discard the internal regular expression form after each use.
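
The retention rule described above amounts to a cache that is invalidated on change. The Python sketch below models it with a hypothetical 'PatternArray' class: the compiled form is built on first use, kept between uses, and discarded only when an element is assigned. The 'compile_count' attribute exists only to make the caching visible, and 'match_index' stands in for the value MATCH_INDEX is described as holding.

```python
import re

class PatternArray:
    def __init__(self, patterns):
        self._patterns = dict(patterns)
        self._compiled = None      # cached internal form
        self.compile_count = 0     # instrumentation for this sketch only

    def __setitem__(self, index, pattern):
        self._patterns[index] = pattern
        self._compiled = None      # any change discards the cached form

    def match_index(self, text):
        # Build the internal form only if it is not already cached.
        if self._compiled is None:
            self._compiled = {i: re.compile(p)
                              for i, p in self._patterns.items()}
            self.compile_count += 1
        # Return the index of the first element that matches, else None.
        for index, rx in self._compiled.items():
            if rx.search(text):
                return index
        return None
```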

NEW Keywords:
cycle
similar to 'next', except that it may use the current record in restarting the outer pattern matching loop.
deletea
similar to 'delete', except that ALL array values are deleted.
switch
case
default
similar to C syntax with the allowed 'switch' values and 'case' labels expanded to include any legal QTAwk expression, evaluated at run-time. The expressions may evaluate to any value including any numeric value, string or regular expression.
local
new keyword to allow the declaration and use of local variables within compound statements, including user-defined functions. Its use in user defined functions instead of the Awk practice of defining excess formal parameters, leads to easier to read and maintain functions. The C 'practice' of allowing initialization in the 'local' statement is followed.
endfile
similar to 'exit'. Simulates end of current input file only, any remaining input files are still processed.
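
The expanded 'switch' described above, where 'case' labels are evaluated at run-time and may be numbers, strings or regular expressions, can be modeled in Python as a small dispatch function. This is a sketch of the matching rule only: the function name is hypothetical, labels are tried in order, and C-style fall-through between cases is not modeled.

```python
import re

def switch(value, cases, default=None):
    # cases is a list of (label, result) pairs tried in order.
    for label, result in cases:
        if isinstance(label, re.Pattern):
            # A regular expression label selects the case when it
            # matches a string switch value.
            if isinstance(value, str) and label.search(value):
                return result
        elif label == value:
            # Numeric and string labels are compared for equality.
            return result
    return default

cases = [
    (1, "one"),
    ("stop", "halt"),
    (re.compile(r"^#"), "comment"),
]
```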
New Arithmetic Functions.
QTAwk includes 18 built-in arithmetic functions: all of the functions supported by Awk plus the following:
acos(x)
arc-cosine of x
asin(x)
arc-sine of x
cosh(x)
hyperbolic cosine of x
fract(x)
fractional portion of x
log10(x)
logarithm base 10
pi or
pi()
pi
sinh(x)
hyperbolic sine of x
New String Functions.
QTAwk includes 33 built-in string functions: all of the functions supported by Awk plus the following:
center(s,w) or
center(s,w,c)
center string
copies(s,n)
copies of string
deletec(s,p,n)
delete characters from a string
gensub(re,rs,how,target)
generalized substitution function
insert(s1,s2,p)
insert one string into another string
justify(a,n,w) or
justify(a,n,w,c)
justify string
overlay(s1,s2,p)
overlay one string on another
remove(s,c)
remove characters from a string
replace(s)
replace all variables in a string
srange(c1,c2)
return string formed of all characters from c1 to c2
srev(s)
reverse characters of string
stran(s) or
stran(s,st) or
stran(s,st,sf)
translate characters
strim(s) or
strim(s,c) or
strim(s,c,d)
trim leading and/or trailing characters
strlwr(s)
translate to lower case
strupr(s)
translate to upper case
New Date and Time functions
_time()
Local time (seconds since midnight)
_ftime(format_str,sjdn,time)
Format date/time
jdn or
jdn() or
jdn(y,m,d) or
Julian Day Number of date specified
jdn(fdate)
Calendar date of Julian Day Number specified
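
For readers unfamiliar with Julian Day Numbers, the Python sketch below computes one from a calendar date using a widely used integer formula. It is not taken from QTAwk: the function name 'julian_day_number' is hypothetical, and whether QTAwk's 'jdn' uses the same day-numbering convention is an assumption. The 'gregorian' flag mirrors the role the Gregorian built-in variable is described as playing.

```python
def julian_day_number(year, month, day, gregorian=True):
    # Shift the year so that it starts in March; this puts the leap
    # day at the end of the shifted year.
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    days = day + (153 * m + 2) // 5 + 365 * y + y // 4
    if gregorian:
        # Gregorian calendar: century years are leap only if
        # divisible by 400.
        return days - y // 100 + y // 400 - 32045
    # Julian calendar: every fourth year is a leap year.
    return days - 32083
```

With this formula, 1 January 2000 (Gregorian) is day 2451545, and 4 October 1582 in the Julian calendar is the day immediately before 15 October 1582 in the Gregorian calendar.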
New Miscellaneous Functions.

rotate(a)
rotate the elements of the array a.
execute(s) or
execute(s,se) or
execute(s,se,rf)
execute string s
execute(a) or
execute(a,se) or
execute(a,se,rf)
execute array a
findfile(var,pattern,attributes)
find files with specified names and attributes
pd_sym
access pre-defined variables
ud_sym
access user defined variables
resetre
return QTAwk utility to start-up condition for all regular expressions, including patterns and GROUP patterns. Only the internal regular expression forms for arrays are not re-initialized. The internal regular expression forms for arrays are re-initialized whenever the array is changed in any manner.
setlocale
set the locale under which QTAwk is operating
New I/O Functions.
I/O function syntax has been made consistent with syntax of other functions. The redirection operators, '<', '>' and '>>', and pipeline operator, '|', have been deleted as excessively error prone in expressions. The functional syntax of the 'getline' function has been made identical to that of the other built-in functions. The new functions 'fgetline', 'fprint' and 'fprintf' have been introduced for reading and writing to files other than the current input file and to replace the redirection operators.
  1. Single character input/output functions have been added:
    getc()
    return next character from current input file,
    fgetc(F)
    return next character from named file, F
    putc(c)
    output character c to standard output file
    fputc(c,F)
    output character c to file F
  2. The dropped file re-direction operator, '>>', has been replaced by the 'append' function:
  3. append(F) -- Opens the file F for output to the end of the file. All subsequent output to the file is appended to the end of the file. This function must be called before the first output to the file to append. Any output to the file prior to calling this function will open the file and discard any existing contents, i.e., truncate to zero length.
  4. Two functions to search files for one or more regular expressions:
    srchrecord( sp ) or
    srchrecord( sp , rs ) or
    srchrecord( sp , rs , var )
    search current input file for next record containing match to 'sp', using 'rs' as record separator (RS if 'rs' not specified), returning record found in 'var', $0 if 'var' not specified. Update NR and FNR. Also reparse $0 if 'var' not specified and update NF.

    Returns:

    1. n ==> Record Present And Read, n == Number Of Characters In Record plus EOR length plus 1.
    2. 0 ==> End-Of-File, EOF, Encountered
    3. -1 ==> Read Error Occurred (Including Failure To Open File)
    fsrchrecord( fn , sp ) or
    fsrchrecord( fn , sp , rs ) or
    fsrchrecord( fn , sp , rs , var )
    search file 'fn' for next record containing match to 'sp', using 'rs' as record separator (RS if 'rs' not specified), returning record found in 'var', $0 if 'var' not specified. Reparse $0 if 'var' not specified and update NF.

    Returns:

    1. n ==> Record Present And Read, n == Number Of Characters In Record plus EOR length plus 1.
    2. 0 ==> End-Of-File, EOF, Encountered
    3. -1 ==> Read Error Occurred (Including Failure To Open File)
  5. The function 'get_FNR(F)' has been introduced. This function returns the current record number of the input file 'F'. This function is necessary to obtain the current input record number for input files used with the 'fgetline' and 'fsrchrecord' functions.
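
The return convention of 'srchrecord' and 'fsrchrecord' described above can be modeled in a few lines of Python. This sketch is an assumption-laden illustration, not QTAwk's implementation: records come from an in-memory string, the helper names are hypothetical, the record separator is a plain string rather than a regular expression, and the -1 read-error case is not modeled.

```python
import re

def records(text, rs="\n"):
    # Split text into records; a trailing separator does not produce
    # an extra empty record.
    parts = text.split(rs)
    if parts and parts[-1] == "":
        parts.pop()
    return iter(parts)

def srchrecord(rec_iter, sp, rs="\n"):
    # Advance to the next record containing a match to sp.
    # Returns (n, record), where n is the record length plus the
    # separator length plus 1, or (0, None) at end of input.
    pattern = re.compile(sp)
    for rec in rec_iter:
        if pattern.search(rec):
            return len(rec) + len(rs) + 1, rec
    return 0, None
```

One consequence of adding 1 is that even an empty matching record yields a non-zero value, so it cannot be confused with the end-of-file value 0.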
Expanded capability of formatted Output.
The limited output formatting available with the Awk 'printf' function has been expanded by adopting the complete output format specification of the ANSI C standard.
'local' keyword.
The 'local' keyword has been introduced to allow for variables local to user-defined functions (and any compound statement). This expansion makes the Awk practice of defining 'extra' formal parameters no longer necessary.
Expanded user-defined functions.
With the 'local' keyword, QTAwk allows the user to define functions that may accept a variable number of arguments. Functions, such as finding the minimum/maximum of a variable number of variables, are possible with one function rather than defining separate functions for each possible combination of arguments.
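
The minimum example mentioned above looks like this when written in Python, which is used here only to illustrate the shape of a variable-argument function. The parameter and local names 'vargv' and 'vargc' are borrowed from the QTAwk built-in variables of the same names; the QTAwk declaration syntax itself is not shown.

```python
def minimum(*vargv):
    # vargv holds the arguments actually passed; vargc is their count.
    vargc = len(vargv)
    if vargc == 0:
        raise ValueError("minimum() needs at least one argument")
    smallest = vargv[0]
    for value in vargv[1:]:
        if value < smallest:
            smallest = value
    return smallest
```

A single definition serves calls such as minimum(3, 1, 2) and minimum(7) alike.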
User controlled trace capability.
A user controlled statement trace capability has been added. This gives the user a simple to use mechanism to trace utility execution. Rather than adding 'print' statements, merely re-defining the value of a built-in variable will give utility execution trace information, including utility line number.
Expanded built-in variable list.
With 57 built-in variables, QTAwk includes all of the built-in variables of Awk plus the following:
_arg_chk
used to determine whether to check number of arguments passed to user-defined functions.
ARGI
index value in ARGV of next command line argument. Gives more control of command line argument processing.
CONVFMT
used for converting floating point numbers to strings. OFMT used only for output floating point numbers.
CLENGTH
similar to 'RLENGTH' of Awk. Set whenever a 'case' value evaluates to a regular expression.
CSTART
similar to 'RSTART' of Awk. Set whenever a 'case' value evaluates to a regular expression.
CYCLE_COUNT
count number of outer loop cycles with current input record.
DEGREES
if TRUE, trigonometric functions assume degree values, radians if FALSE.
ENVIRON
one dimensional array with elements equal to the environment strings passed to QTAwk
ECHO_INPUT
controls echo of standard input file to standard output file.
FALSE
predefined with constant value, 0.
FIELDFILL
string value used for filling fixed length fields when fields changed.
FIELDWIDTHS
can be assigned a value for fixed width fields, over-riding the use of FS for splitting current record into fields.
FILEATTR
file attributes of current input file.
FILEDATE
date as a Julian Day Number, JDN, of current input file.
FILETIME
time in seconds since midnight of current input file.
FILEDATE_CREATE
creation date as a JDN of current input file.
FILETIME_CREATE
creation time in seconds since midnight of current input file.
FILEDATE_LACCESS
last access date as a JDN of current input file.
FILETIME_LACCESS
last access time in seconds since midnight of current input file.
FILESIZE
size in bytes of current input file.
FILE_SORT
string value to define sort order of array returned by "findfile" function.
FILE_SEARCH
TRUE/FALSE value to search current input file for record(s) containing match to regular expression(s) in FILE_SEARCH_PAT. Default value FALSE.
FILE_SEARCH_PAT
contains one or more patterns for searching current input file.
FS
FS allowed to be an array. If FS is an array, multiple patterns may be set for field separators.
Gregorian
TRUE/FALSE value to distinguish using Gregorian or Julian calendar in computing Julian Day Number or converting back to calendar date.
IGNORECASE
if assigned a true value, QTAwk ignores case in all string and regular expression match operations.
LOCALE
single dimensioned array containing the string values for locale dependent values.
LONGEST_EXP
used to control whether the longest or the first string matching a regular expression is found.
MATCH_INDEX
assigned the string value of the matching array element when an array used for regular expression match.
MAX_CYCLE
maximum number of outer loop cycles permitted with current input record.
MLENGTH
similar to 'RLENGTH' of Awk. Set whenever a stand-alone regular expression is encountered in evaluating a pattern.
MSTART
similar to 'RSTART' of Awk. Set whenever a stand-alone regular expression is encountered in evaluating a pattern.
NF
if value changed, current input record changed to reflect new value.
NG
equal to the number of the regular expression in a GROUP matching a string in the current input record.
OFMT
string value used only as format for output of floating point numbers.
RECLEN
if assigned a non-zero numeric value, integral value used for length of fixed length records. RS not used unless RECLEN has a zero numeric value.
RETAIN_FS
if TRUE the original characters separating the fields of the current input record are retained whenever a field is changed, causing the input record to be re-constructed. If FALSE the output field separator, OFS, is used to separate fields in the current input record during reconstruction. The latter practice is the only method available in Awk.
RS
RS allowed to be an array. If RS is an array, and RECLEN has a zero numeric value, multiple patterns may be set for record separators.
RT
automatically assigned string value of record terminator for current input record.
SUBSEP
string value used as the array element index separator in MATCH_INDEX.
SPAN_RECORDS
TRUE/FALSE, default value FALSE. If TRUE, allows matches to FILE_SEARCH_PAT to span multiple input records and return multiple records in $0. If FALSE, matches are confined to a single record. Also controls matches spanning records in the 'srchrecord' and 'fsrchrecord' functions.
TRACE
value used to determine utility tracing.
TRANS_FROM/TRANS_TO
strings used by 'stran' function if second and/or third arguments not specified.
TRUE
predefined with constant value, 1
QTAwk_Path
initialized from 'QTAWK' environment variable. Sets paths searched for input files.
vargc
used only in user-defined functions defined with a variable number of arguments. At run-time, set equal to the actual number of variable arguments passed.
vargv
used only in user-defined functions defined with a variable number of arguments. At run-time, a single dimensioned array with each element set to the argument actually passed.
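
Among the variables above, FIELDWIDTHS replaces separator-based splitting with splitting by position. The Python sketch below shows what fixed-width splitting means. How the widths are written in a QTAwk FIELDWIDTHS value is not stated here, so the list of integers and the function name 'split_fixed' are assumptions made for the illustration.

```python
def split_fixed(record, widths):
    # Cut the record into consecutive fields of the given widths.
    # A record shorter than the total width yields short or empty
    # trailing fields.
    fields = []
    pos = 0
    for w in widths:
        fields.append(record[pos:pos + w])
        pos += w
    return fields
```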
New command line options available:

-ffilename
multiple utility files may be specified. In addition, the file directive:

#include "filename"

may be used to include other files.

-vvar=value
sets 'var' to value before any "BEGIN" actions executed
-Wd
delays parsing of input record until any fields or the NF variable referenced.
-Wm
forces QTAwk to allow multiple includes of the same file, issuing an error message and skipping multiple includes.
Definition of built-in variable, RS, expanded.
When value assigned to RS, it is converted to regular expression form. Strings matching regular expression act as record separator. Similar in behavior to field separator, FS. If an array, multiple record separator patterns may be specified.
FILENAME
In QTAwk, setting the built-in variable 'FILENAME' to another value will change the current input file. Setting the variable in Awk has no effect on the current input file.
NF
In QTAwk, setting built-in variable, NF to another value will change the current contents of $0. If the new value is greater than the current value, the current input line is lengthened with new empty fields separated by the output field separator strings, OFS. If the new value is less than the current value, then $0 is shortened by truncating at the end of the field corresponding to the new NF value.
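
The effect of assigning NF can be modeled as a function from the old record and the new field count to the new record. The Python sketch below follows the two rules stated above, under the assumption that fields are runs of non-blank characters (the default field splitting); the function name 'set_nf' is hypothetical.

```python
import re

def set_nf(record, new_nf, ofs=" "):
    # Locate the fields as runs of non-whitespace characters.
    fields = list(re.finditer(r"\S+", record))
    nf = len(fields)
    if new_nf > nf:
        # Lengthen: each new empty field is preceded by one OFS.
        return record + ofs * (new_nf - nf)
    if new_nf < nf:
        # Shorten: truncate at the end of field number new_nf.
        if new_nf <= 0:
            return ""
        return record[:fields[new_nf - 1].end()]
    return record
```

Note that when shortening, the original spacing between the remaining fields is left untouched, because the record is cut rather than rebuilt.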
The Tag Operator, '[< >]'
The Tag operator may be used to obtain or to set a particular part of the string matching the regular expression pattern.
getline
The return value of the 'getline' function has been changed when a valid record has been read. The return value is the length of the record plus the length of the End-Of-Record plus 1.
Awk Problems
Corrected admitted problems with Awk. The problems mentioned on page 182 of "The Awk Programming Language" have been corrected. Specifically:
  1. true multidimensional arrays have been implemented,
  2. the 'getline' syntax has been made to match that of other functions,
  3. declaring local variables in user-defined functions has been corrected,
  4. intervening blanks are allowed between the function call name and the opening parenthesis (in fact, under QTAwk it is permissible to have no opening parenthesis or argument list for user-defined functions that have been defined with no formal arguments).
