QTAwk Update History
Version 6.22/1.22
==> QTAwk Version 6.22 for DOS and version 1.22 for OS/2, dated 11/14/96. This version contains two changes from the previous versions:
have been replaced with the date/time functions
==> QTAwk Version 6.21 for DOS and version 1.21 for OS/2, dated 05/01/96. This version contains several changes and additions from the previous versions:
The changed tag operator has the same level of operator precedence as before.
for (;;)
works.
how == 3
then only the third match found is replaced. In addition, no further searches for matches are made.
how == 1
and only the first match is replaced and no search for matches beyond the first is made.
Using "gensub", the count of substitutions can still be obtained quite easily under QTAwk. QTAwk does not evaluate the replacement string expression when the "gensub" function is called; it delays evaluation until each replacement is done. The following construction therefore counts the number of substitutions made.
gcnt = 0;
gensub(re, (gcnt++, rs), how[, target]);
The comma operator is used in the second argument to increment the count variable, gcnt, every time a replacement is made. Note that if how is numeric, the final value of gcnt will equal the number of matches found or the value specified for how, whichever is less. Note also that the second argument, (gcnt++, rs), MUST be enclosed in parentheses.
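The counting idiom above relies on the replacement being evaluated at replacement time. As a rough analogy only (this is Python, not QTAwk), the same effect can be sketched with a replacement callable passed to re.sub. The names count_substitutions and limit are hypothetical, and Python's count argument replaces the first N matches, which is not the same as a numeric how in "gensub".

```python
import re

def count_substitutions(pattern, replacement, text, limit=0):
    """Replace matches of pattern in text and count the replacements made.

    limit=0 means "replace every match" (Python's re.sub convention).
    """
    count = 0

    def lazy_replacement(match):
        # Runs once per replacement actually performed, which plays the
        # role of the (gcnt++, rs) expression in the QTAwk construction.
        nonlocal count
        count += 1
        return replacement

    result = re.sub(pattern, lazy_replacement, text, count=limit)
    return result, count

new_text, gcnt = count_substitutions(r"o", "0", "foo boo zoo")
print(new_text, gcnt)
```

Because the callable runs only when a replacement happens, the counter never exceeds the number of substitutions performed.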
==> QTAwk Version 6.10 for DOS and version 1.10 for OS/2. 05/01/96.
This version contained bug fixes only and was quickly superseded by versions 6.20 (PC/MS-DOS) and 1.20 (OS/2).
Version 6.00/1.00
==> QTAwk Version 6.00 for DOS and version 1.00 for OS/2, dated 05/01/94. This version contains several changes and additions from the previous versions:
If a match is found, the new variable MATCH_INDEX is set to the string value of the index in the array of the matching regular expression. If a multidimensional array is used, the array indices are separated by the value of the built-in variable SUBSEP (which has been re-introduced from Awk with a slightly different use).
In addition, the use of arrays for the built-in variables RS and FS enables the user to specify multiple regular expressions for use as record separators and/or field separators. The use of arrays for RS and/or FS does not affect the value of MATCH_INDEX.
Arrays used for regular expression matching retain their internal regular expression form until the whole array or an array element is changed. Thus arrays can be used as dynamic regular expressions for which the user controls when the internal form is changed.
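The behavior described above can be modeled loosely in Python. This is a sketch under stated assumptions, not QTAwk's implementation: the class name RegexArray is hypothetical, SUBSEP is given an arbitrary value of ",", and the choice of the first matching element (in insertion order) is an assumption, since the text does not say which element wins when several match.

```python
import re

SUBSEP = ","  # assumed separator value, chosen only for this sketch

class RegexArray:
    """Array of patterns whose compiled forms are kept until an element changes."""

    def __init__(self, patterns):
        # Compile once; the compiled forms are reused on every match call.
        self._compiled = {key: re.compile(p) for key, p in patterns.items()}

    def __setitem__(self, key, pattern):
        # Changing an element rebuilds the internal form for that element only.
        self._compiled[key] = re.compile(pattern)

    def match(self, record):
        """Return the index of the first matching element as a string, or None."""
        for key, regex in self._compiled.items():
            if regex.search(record):
                if isinstance(key, tuple):
                    # Multidimensional index: join the parts with SUBSEP.
                    return SUBSEP.join(str(part) for part in key)
                return str(key)
        return None

levels = RegexArray({"err": r"ERROR", ("warn", 2): r"WARN"})
match_index = levels.match("WARN: disk almost full")
print(match_index)
```

The returned string plays the role of MATCH_INDEX; a caller could use it to select the action for the pattern that matched.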
For QTAwk utilities in which all patterns contain a regular expression match or for those files for which actions are executed only for those records matching a set of one or more regular expressions, the above process for each record can be time consuming. It would be much faster to scan the input file for matches to the desired regular expression(s) and then execute each pattern expression once such a record has been found. This process by-passes the time consuming process of reading individual records and parsing each into fields. Only the desired records need to be read and parsed with the new method, thus saving much time in the execution of the QTAwk utility.
QTAwk Version 6.00 for DOS and version 1.00 for OS/2 implement the new search method. Two new variables, FILE_SEARCH and FILE_SEARCH_PAT, have been introduced for this purpose. When FILE_SEARCH is TRUE, the next record read will be the next record matching a regular expression from FILE_SEARCH_PAT. If FILE_SEARCH is FALSE, the normal file input process described above is followed. The new file search process may thus be turned on and off as necessary within a single input file.
FILE_SEARCH_PAT is set by the user utility to one or more regular expressions against which records from the current input file are matched. FILE_SEARCH_PAT may be set to a single regular expression as a simple variable, e.g.,
or a singly dimensioned array, e.g.,
When FILE_SEARCH is TRUE, the current input file is scanned for a match to FILE_SEARCH_PAT. When a record is found matching a regular expression in FILE_SEARCH_PAT, the record is read, parsed into fields according to FS or FIELDWIDTHS and each pattern expression executed. The associated actions for TRUE pattern expressions are executed. Note that the variables RS or RECLEN still determine the parsing of the input file into records.
Under some circumstances, the above process can return in '$0' multiple records from the current input file. In searching the input file for a match to FILE_SEARCH_PAT, a match may span more than one record if the new variable, SPAN_RECORDS, is TRUE. In this case, '$0' is set to the full set of records spanning the match to FILE_SEARCH_PAT. If SPAN_RECORDS is FALSE, any matches to FILE_SEARCH_PAT are not allowed to span input records and '$0' will contain only a single record.
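The search-first strategy described above can be illustrated with a small Python sketch. It is an analogy under assumptions, not QTAwk's algorithm: the function name search_records is hypothetical, records are separated by a single newline, and fields are split on white space. The input is scanned for the pattern first; only the record or records containing a match are then extracted and split into fields.

```python
import re

def search_records(text, search_pat, record_sep="\n"):
    """Yield (record, fields) for each record, or run of records, containing a match."""
    pattern = re.compile(search_pat)
    pos = 0
    while pos <= len(text):
        match = pattern.search(text, pos)
        if match is None:
            return
        # Expand the match outward to the enclosing record boundaries.
        before = text.rfind(record_sep, 0, match.start())
        start = 0 if before == -1 else before + len(record_sep)
        end = text.find(record_sep, match.end())
        if end == -1:
            end = len(text)
        record = text[start:end]
        # Only matching records pay the cost of field splitting.
        yield record, record.split()
        pos = end + len(record_sep)

data = "alpha 1\nbeta 2\ngamma 3\nbeta 4\n"
for record, fields in search_records(data, r"beta"):
    print(record, fields)
```

If the pattern itself matches across a separator, the yielded record contains every record the match touches, which loosely mirrors the behavior described for SPAN_RECORDS being TRUE.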
srchrecord is similar to the function getline. srchrecord will search the current input file for the next record or records matching the search pattern, 'sp'. If the record separator parameter, 'rs', is not specified, records are determined by the variable RS or RECLEN. If 'rs' is specified, record boundaries are determined by the strings matching 'rs'. 'rs' may be a simple constant, a variable, or an array. The record or records matching the search pattern are returned in '$0' if 'var' is not specified. If 'var' is specified, the matching record or records are returned in 'var'. The built-in variables FNR and NR are updated to reflect the current position and record number after the search. The built-in variables NF and '$i', 0 <= i <= NF, are set when 'var' is not specified.
fsrchrecord is similar to the function fgetline. fsrchrecord will search the file specified for the next record or records matching the search pattern, 'sp'. If the record separator parameter, 'rs', is not specified, records are determined by the variable RS or RECLEN. If 'rs' is specified, record boundaries are determined by the strings matching 'rs'. 'rs' may be a simple constant, a variable, or an array. The record or records matching the search pattern are returned in '$0' if 'var' is not specified. If 'var' is specified, the matching record or records are returned in 'var'. The built-in variables NF and '$i', 0 <= i <= NF, are set when 'var' is not specified.
Both functions have identical returns to the getline and fgetline functions, i.e.,
|         | getline()    | getline(v)    | fgetline(F)    | fgetline(F,v)    |
|         | srchrecord() | srchrecord(v) | fsrchrecord(F) | fsrchrecord(F,v) |
| $0      | updated      | not updated   | updated        | not updated      |
| $i, i>0 | updated      | not updated   | updated        | not updated      |
| NF      | updated      | not updated   | updated        | not updated      |
| NR      | updated      | updated       | not updated    | not updated      |
| FNR     | updated      | updated       | not updated    | not updated      |
Note
This form returns the current record number of the current input file. The value returned is equal to the built-in variable FNR.
This form returns the current record number of the input file specified. If filename == FILENAME, this form is equivalent to the first form. If the filename specified is not open or is not open for input, a value of zero, 0, is returned.
This function has been added because of the input functions fgetline and fsrchrecord. For the current input file, the built-in variable FNR is always updated automatically to contain the record number of the last record input (the current record). However, when reading from a file other than the current input file, previously there was no means of obtaining the current record number of the input file. With fgetline, the user utility could maintain an independent count of records read. However, if the fsrchrecord function is used, there is no other means of obtaining the record number of the last record read.
Note that the use of arrays for match patterns falls between the use of strings, for which the internal regular expression form is rebuilt for each use, and regular expressions for which the internal form is built for the first use and then remains static. When arrays are used for matching, the internal regular expression form is built when first used and retained until the array is changed. For arrays the internal regular expression form is assigned when the array as a whole is assigned to another variable. Thus the internal regular expression form can be retained and reused.
n1 n2 n3 ... nn
the splitting of input records into fields is governed by the numbers in FIELDWIDTHS rather than FS. Each number in FIELDWIDTHS specifies the width of a field, including any columns between fields. If you want to ignore the columns between fields, you can specify their width as a separate field that is subsequently ignored. When the value of FIELDWIDTHS does not match this form, field splitting is done using FS in the usual manner.
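Fixed-width splitting as described above can be sketched in Python as follows. This is an illustration only: the function name split_fixed_width is hypothetical, and returning None stands in for "fall back to splitting with FS".

```python
import re

def split_fixed_width(record, fieldwidths):
    """Split record by the widths in fieldwidths ("n1 n2 ... nn").

    Returns None when fieldwidths is not a blank-separated list of
    numbers, so the caller can fall back to separator-based splitting.
    """
    if not re.fullmatch(r"\s*\d+(\s+\d+)*\s*", fieldwidths):
        return None
    fields = []
    pos = 0
    for width in map(int, fieldwidths.split()):
        # Each field takes exactly `width` columns, gaps included.
        fields.append(record[pos:pos + width])
        pos += width
    return fields

print(split_fixed_width("John  Smith 042", "6 6 3"))
# A one-column gap is declared as its own field and then ignored:
name, _gap, number = split_fixed_width("ABCDE|12345", "5 1 5")
print(name, number)
```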
==> QTAwk Version 5.11, dated 03/30/92. This version contains two additions from the previous versions:
==> QTAwk Version 5.10, dated 10/01/91. This version contains several changes and additions from the previous versions:
variable = value
has been fixed.
where:
This function returns the number of files found that match the pattern specified. The file names, sizes, and last-modified dates and times are returned via the variable specified. The file date and time are returned in "DOS format", which is good for sorting purposes but unreadable. The "stime" and "sdate" functions have been expanded to accept the file time and date and format them as desired.
==> QTAwk Version 5.00, dated 02/01/91. This version contains several additions from the previous versions:
The value of QTAwk_Path may be reset at any time by the user's utility to change the paths to be searched for desired files to be opened for reading.
Thus $$5.02 is the string matching the regular expression contained within the parentheses at the fifth level and second count.
==> QTAwk Version 4.20, dated 10/11/90. This version contains three additions from the previous versions:
These functions allow the user to naturally obtain single characters from any file including the standard input file (which would be the keyboard if not redirected or piped).
==> QTAwk Version 4.10. This version contains one addition from the previous versions:
GROUP /regular expression constant/ { actions }
GROUP "string constant" { actions }
GROUP Variable_name { actions }
GROUP patterns are still converted into an internal form for regular expressions only once, when the pattern is first used to scan an input line. Any variables in a GROUP pattern will be evaluated, converted to string form and interpreted as a regular expression.
==> QTAwk Version 4.02. This version contains two additions from the previous versions:
name = string
where the blanks on either side of the equal sign, '=', are optional and depend on the particular form used in the "SET" command. The QTAwk utility may scan the elements of ENVIRON for a particular name or string as desired.
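Scanning entries of the form "name = string" can be sketched in Python as below. This is an illustration, not QTAwk code: the function names are hypothetical, and stripping the blanks around the equal sign (including leading blanks in the value) is an assumption made for this sketch.

```python
def parse_environ_entry(entry):
    """Split a "name = string" entry into (name, value), or return None."""
    name, sep, value = entry.partition("=")
    if not sep:
        return None  # no equal sign, so not a "name = string" entry
    # Blanks on either side of '=' are optional, so strip them (assumption).
    return name.strip(), value.strip()

def find_in_environ(entries, wanted):
    """Return the value of the first entry named `wanted`, or None."""
    for entry in entries:
        parsed = parse_environ_entry(entry)
        if parsed is not None and parsed[0] == wanted:
            return parsed[1]
    return None

environ = ["PATH = C:\\DOS;C:\\UTIL", "TEMP=C:\\TEMP"]
print(find_in_environ(environ, "TEMP"))
```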
| POSIX["alnum"]  | all alphabetic and numeric characters |
| POSIX["alpha"]  | all alphabetic characters |
| POSIX["blank"]  | the blank and tab characters |
| POSIX["cntrl"]  | all control characters |
| POSIX["digit"]  | all numeric digits |
| POSIX["graph"]  | all visible and printable characters. A tab character is "printable", but not visible. 'a' is both printable and visible |
| POSIX["lower"]  | all lower-case alphabetic characters |
| POSIX["print"]  | all printable characters |
| POSIX["punct"]  | all punctuation characters |
| POSIX["space"]  | all white space characters |
| POSIX["upper"]  | all upper-case alphabetic characters |
| POSIX["xdigit"] | all hexadecimal numeric digits |