QTAwk
Distributed under the GPL
See below for information on downloading executables and source
QTAwk Manual
- Latest version date: Feb 13, 2008, Version 2.31 (Linux) - bug fix
- Latest version date: Feb 12, 2006, Version 2.20 (Linux) - Changes:
- Code cleanup,
notably moving macro constants to constants of type 'enum'. Added three
functions for interfacing to the QTAwk
file system (excluding pipes) from user defined functions in loadable
modules:
- QTAwk_set_file - function to set/open a file for use in a
QTAwk script.
- QTAwk_close_file - Call this function to close an open
file.
- QTAwk_get_file - Call this function to get a file known to
the QTAwk script.
The loadable module 'tmpfile.c' now uses these functions.
- Added a new element to the 'PROCINFO' array with index
"cmdline" and value equal to the full command line invoking QTAwk.
- Latest version date: Jan 22, 2006, Version 2.10 (Linux) - Changes:
- Fixed some miscellaneous bugs, one in the "rotate" function and 2 in
the new module interface functions.
- Replaced the "QTAwk_start_array_seq" module interface
function with the "QTAwk_first_array_element" function. This replaces the
2 calls:
- QTAwk_start_array_seq, and
- QTAwk_next_array_element
with the single call:
- QTAwk_first_array_element
This makes the use in a "for" statement more natural and easier.
- Added the module interface function: "QTAwk_array_finish".
This call is necessary to close the array sequence structures used by
the "QTAwk_first_array_element" and "QTAwk_next_array_element"
functions.
- Added the following modules:
- secure_hash_module - computes the following secure hashes
of strings and files:
- Secure Hash Algorithm 160 bit
- Secure Hash Algorithm 224 bit
- Secure Hash Algorithm 256 bit
- Secure Hash Algorithm 384 bit
- Secure Hash Algorithm 512 bit
- RMD Hash160 bit
- Whirlpool 512 bit Hash
- Tiger 192 bit Hash
There are separate calls for initializing string hashes, updating
string hashes, finalizing string hashes, computing file hashes,
computing multiple string hashes and computing multiple file
hashes.
- cpu.time_module - computes elapsed CPU execution time.
Three function calls:
- start CPU execution timer - can start up to 20 timers.
- return CPU elapsed time for a timer started with the first call.
- return clock tics for the QTAwk process, system clock
tics used to support QTAwk, process clock tics of child processes, and
system clock tics used to support child processes.
- format_module - this module contains the user defined
function replacement for the previous "justify" builtin function.
- system.info_module - this module contains the user
defined function for returning various information on the system being
used, OS version, machine, host name, etc.
- strtoint - contains the user defined function, strtoint,
which provides the functionality of the C standard library function
of the same name.
- Latest version date: Jan 02, 2006, Version 2.00 (Linux) - Bug fix
- Latest version date: Dec 31 2005, Version 2.00 (Linux) - Bug fix,
Added ability to add user defined functions in compiled C
'modules' which can be loaded and unloaded dynamically at run-time.
Added the following builtin variables:
- MODULES - a singly dimensioned array. The index of each
element equals the name of a loaded module
- Module_Path - corresponds to QTAwk_Path. List of semi-colon
separated paths to search for modules.
- USER_FUNCTIONS - a singly dimensioned array. The index of
each element corresponds to the name of a user defined function and the
element value equals the name of the file in which the function was
defined.
Added the following builtin functions:
- load_module - this function is called with the filename of
the module to load containing dynamically loaded user defined
functions.
- unload_module - this builtin function unloads a module loaded
with the load_module function.
Modules containing user defined functions may be loaded and unloaded as
needed.
- Latest version date: Dec 3, 2005, Version 1.90 (Linux) - Added
"RS" array element to PROCINFO variable. Similar to "FS" element.
- Latest version date: Dec 3, 2005, Version 1.90 (Linux) - The
following has changed:
- Various bugs have been fixed.
- Code cleanup in creating ENVIRON, LOCALE and POSIX arrays.
- Added builtin variable array PROCINFO following gawk use.
Following array elements available:
- PROCINFO["FS"]
- PROCINFO["egid"]
- PROCINFO["euid"]
- PROCINFO["gid"]
- PROCINFO["pgrpid"]
- PROCINFO["pid"]
- PROCINFO["ppid"]
- PROCINFO["uid"]
- Added bi-directional pipe operator, '|&', following use
in gawk. The same conditions and restrictions in use prevail as in
gawk.
- A second optional argument has been added to the 'close'
function for independently closing either part of a bi-directional pipe.
- Latest version date: Nov 28, 2005, Version 1.83 (Linux) - The
action upon assigning a new value to the field separator variable, FS,
has changed. In previous versions, the new value of FS only affected
subsequent records. Now the current record is
immediately re-parsed when the assignment is made. This is
useful when a field variable value is changed and the new value contains
field separator values. Assigning FS to itself:
FS = FS;
is a convenient method of immediately re-parsing the record to re-arrange
the field variable values.
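The re-parse behavior can be pictured with a small model. This is a Python sketch of the idea, not QTAwk code: the `Record` class, its `set_field` method and the use of FS (rather than OFS) to rebuild the record text are all assumptions made for illustration.

```python
class Record:
    """Toy model of an input record that is re-split whenever FS is assigned."""

    def __init__(self, text, fs):
        self.text = text
        self._fs = fs
        self.fields = text.split(fs)

    @property
    def fs(self):
        return self._fs

    @fs.setter
    def fs(self, value):
        # Assigning FS, even to its current value, re-parses the record at once.
        self._fs = value
        self.fields = self.text.split(value)

    def set_field(self, n, value):
        # Changing a field rebuilds the record text but does not re-split it.
        # Assumption: the record is rebuilt with FS; a real Awk would use OFS.
        self.fields[n - 1] = value
        self.text = self._fs.join(self.fields)


rec = Record("a:b:c", ":")
rec.set_field(2, "x:y")   # field 2 now contains a separator character
print(rec.fields)         # ['a', 'x:y', 'c']
rec.fs = rec.fs           # the model's counterpart of FS = FS;
print(rec.fields)         # ['a', 'x', 'y', 'c']
```

In the model, the field count only changes once FS is assigned again, which mirrors the described use of `FS = FS;`.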
- Latest version date: Nov 11, 2005, Version 1.82 (Linux) - Minor
change to Error reporting when opening a file. Under Linux, report if
file permission lacking for reading or writing the file.
- Latest version date: July 31, 2005, Version 1.81 (Linux) - Fixed
minor error message when file specified with the 'f' option not found.
Updated documentation for 'execute' function. 'execute' function may
also dynamically define additional pattern/action pairs and
user-defined functions.
- Latest version date: June 03, 2005, Version 1.80 (Linux) -
- deprecated 'deletea' function. Use 'delete' with no
subscripts on statement object.
Thus, instead of
deletea array;
use
delete array;
delete array[1] will still delete the single array element indexed.
deletea statement will disappear in future versions.
- The two pipe operators, '>|' and '|>', are combined into a single
pipe operator, '|>'.
Use of the pipe operator '>|' is no longer supported. Thus, use the single
operator in the fashion:
while ( print(line) |> "cmd string" ) ...
and
while ( "cmd string" |> getline(line) ) ...
In a future version this will enable the use of the "bi-directional pipe":
while ( print(out_line) |> "cmd string" |> getline(in_line) ) ...
in a more "normal" fashion.
- Latest version date: May 31, 2005, Version 1.70 (Linux) - Fixed
errors in array indexing logic. Array indexing now works as it should.
- Latest version date: April 29, 2005, Version 1.63 (Linux) -
cleaned up "LOCALE" setting code, decreased executable size by about 12
KB. Linux executable compiled with gcc 4.0.0. Downloadable Linux
executable now compressed with bzip2.
- Latest version date: April 26, 2005, Version 1.62 (Linux) -
corrected [sf]print[f] functions to properly interpret "%%" in format
strings and to ignore 'm', 'n' and 'p' format specifiers. Updated
source for header files created by QTGrep
utility.
- Latest version date: April 11, 2005, Version 1.61 (Linux) - Added
a second, optional, parameter to the "rotate" function to specify the
direction of rotation.
- Latest version date: Feb. 03, 2005, Version 1.60 (Linux) -
Discontinued use of the "QTAWK" environment variable. It is no longer used in
1.60.
It is replaced with more powerful and useful resource configuration file(s), .qtawkrc. QTAwk uses 2 such files: a
global, or system-wide, file, "/usr/local/etc/.qtawkrc", and a local,
or user specific, file, "~/.qtawkrc".
- Latest version date: Jan. 03, 2005, Version 1.50 (Linux) -
Added hash files to test for validity - the QTCrypt
package is needed to test the hashes.
- Latest version date: Dec. 08, 2004, Version 1.50 (Linux) - Minor
changes - mostly in source configuration and bringing the source in
compliance with the upgrade of QTGrep,
version 2.30
- Latest version date: Nov. 29, 2004, Version 1.40 (Linux) - Minor
changes to header files and email address update - changed makefile for
gcc 3.4.3 to use C99 standard
- Latest version date: Nov. 21, 2004, Version 1.40 (Linux) -
compiled using gcc 3.4.3 and fixes a problem with the replacement string in the
'gsub/gensub' functions. The replacement string is now evaluated
correctly at function execution
- Latest version date: Oct. 23, 2004, Version 1.40 (Linux)
- Latest version - Oct. 23, 2004 - corrects the regular expression
scanning routine from QTGrep when the
tagged string is only 1 character long.
- Latest version - Oct. 31, 2004
Refer to "The AWK Programming Language", by Alfred V. Aho, Brian W.
Kernighan and Peter J. Weinberger or the GNU gawk reference material. A
tarred, bzip2-compressed version of the QTAwk
reference document is available, with a signature file. The Linux
executable is available, with a signature file. The
tarred, bzip2-compressed source is available
under the GPL, with a signature file. Sample QTAwk utility files are
available, with a signature file.
The file also contains some QTGrep pattern files.
Downloading the QTCrypt package/executables will allow you to check
the source file hashes included with the source. Also, you will be able
to check the signatures. See the Home
page for more information on the signatures.
Differences Between QTAwk and Awk:
- Dynamically Loaded/Unloaded modules
- modules written in C, compiled and linked may be dynamically
loaded into QTAwk to add user
defined functions which run natively and extend the capabilities of QTAwk.
- Regular Expressions
- are a separate type of their own on an equal footing with
strings, integers and floating point numbers. Thus, regular expressions
may be assigned to variables and the variables used wherever regular
expressions would be used. This also changes the accepted Awk
behavior of a regular expression constant always matching the current
input record. This behavior is only retained for a regular expression
constant in a pattern. Elsewhere, the match must be explicitly coded. Thus, the Awk
behavior for the line:
if ( /return/ )
which implicitly performs a match against $0, must be coded
under QTAwk as:
if ( $0 ~~ /return/ )
Under Awk, assigning a regular expression to a
variable is not possible. The line:
Are = /return/;
assigns a value of 0 or 1 to the variable 'Are', depending on
whether /return/ matches $0. Under QTAwk, Are is assigned the regular
expression. Awk does not contain the concept of a regular expression as
a separate data type.
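For readers more familiar with other languages, the distinction resembles holding a compiled pattern in a variable and matching it explicitly. The Python sketch below is only an analogy: the names `are`, `awk_style_value` and `matches` are hypothetical and nothing here is QTAwk syntax.

```python
import re

# QTAwk style: the variable holds the regular expression itself.
are = re.compile(r"return")


def awk_style_value(line, regex):
    """What classic Awk would store for `Are = /return/;`: 1 or 0."""
    return 1 if regex.search(line) else 0


def matches(line, regex):
    """Explicit match, comparable in spirit to `$0 ~~ /return/`."""
    return regex.search(line) is not None


line = "return 0;"
print(awk_style_value(line, are))  # 1
print(matches(line, are))          # True
print(are.pattern)                 # return
```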
- Expanded Regular Expressions.
- All of the Awk regular expression operators are allowed plus the
following:
- Complemented character lists using the Awk notation, '[^...]', as well as the
Awk/QTAwk and C logical negation operator, '[!...]'. This is more
consistent, since the same operator symbol is used for negation.
- Matched character lists, '[#...]'. These lists are used in pairs. The position of the
character matched in the first list of the pair determines the
character which must match in the position occupied by the second list
of the pair.
- Look-ahead Operator. r@t: regular expression r is matched only when followed by
regular expression t.
- Interval Operator. r{n1,n2}: at least n1 and up to n2 repetitions of regular expression r.
Also called the Repetition Operator.
- Named Expressions. {named_expr} is replaced by the string value of the corresponding
variable named 'named_expr'. If no such variable exists, the operator
is not replaced.
- Tagged Expressions. Enclosing a portion of a regular expression in
parentheses, "()", makes the matching string available for use with the Tag
Operator, '[< >]'.
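Three of these operators have close counterparts in Python's `re` module, which gives a quick way to see what they do. This is a sketch by analogy only: QTAwk's `r@t` is written `r(?=t)` in Python, and the `expand_named` helper is a hypothetical stand-in for named-expression replacement, not part of QTAwk.

```python
import re

# Look-ahead: QTAwk writes r@t; the Python spelling is r(?=t).
lookahead = re.compile(r"foo(?=bar)")

# Interval: r{n1,n2} is spelled the same way in Python.
interval = re.compile(r"^a{2,3}$")


def expand_named(pattern, variables):
    """Replace {name} with the variable's string value; leave unknown names alone."""
    return re.sub(
        r"\{([A-Za-z_]\w*)\}",
        lambda m: variables.get(m.group(1), m.group(0)),
        pattern,
    )


print(lookahead.search("foobar").group(0))   # foo  (the "bar" is not consumed)
print(lookahead.search("foobaz"))            # None
print(bool(interval.match("aaa")))           # True
print(expand_named("{digit}{2,3}{unknown}", {"digit": "[0-9]"}))
# [0-9]{2,3}{unknown}
```

The last call shows the rule stated above: a name with no matching variable is left in place rather than replaced.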
- Consistent statement termination syntax.
- The QTAwk Utility Creation Tool utilizes the semi-colon, ';', to terminate all
statements. The practice in Awk of using new lines to "sometimes"
terminate statements is no longer allowed.
- Expanded Operator Set.
- The Awk set of operators has been changed to make them more consistent and to more
closely match those of C. The Awk match operator, ~, has been changed
to ~~ so that the similarity between the match operators, '~~' and '!~',
and the equality operators, '==' and '!=', is complete. The single
tilde symbol, ~, reverts to the C one's complement operator, an
addition to the operator set over Awk. An explicit
string concatenation operator, '><', has also been introduced. The "new"
operators in QTAwk are:
| Operation | Operator |
| tag | [< >] |
| one's complement | ~ |
| concatenation | >< |
| shift left/right | << >> |
| matching | ~~ !~ |
| bit-wise AND | & |
| bit-wise XOR | @ |
| bit-wise OR | | |
| sequence | , |
The caret, ^, remains as the exponentiation operator. The symbol
@ is used for the exclusive OR operator. For string operands, the shift
operators, << and >>, shift the strings with wrap-around
instead of a bit shift as for numeric operands. The expression sequence
operator has been introduced to match that in C.
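The wrap-around shift for string operands amounts to a rotation of the characters. A minimal Python model follows; the function names are hypothetical, and the assumption that a left shift moves leading characters to the end (and a right shift the reverse) is mine, not something the text above states.

```python
def shift_left(s, n):
    """Rotate s left by n characters, wrapping the leading characters to the end."""
    if not s:
        return s
    n %= len(s)
    return s[n:] + s[:n]


def shift_right(s, n):
    """Rotate s right by n characters; the mirror image of shift_left."""
    return shift_left(s, -n)


print(shift_left("abcde", 2))   # cdeab
print(shift_right("abcde", 2))  # deabc
```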
- Expanded set of recognized constants:
- decimal integers,
- octal integers,
- hexadecimal integers,
- character constants, and
- floating point constants.
These constants are recognized in utilities, input fields and
strings.
- Expanded predefined patterns giving more control:
- INITIAL
- similar to BEGIN. Actions executed after opening each input
file and before reading first record.
- FINAL
- similar to END. Actions executed after reading last record of
each input file and before closing file.
- NOMATCH
- actions executed for each input record for which no pattern
was matched.
- GROUP
- used to group multiple regular expressions for search
optimization. Can speed search by a factor of six.
- True multidimensional Arrays.
- The use of the comma in index expressions to simulate multiple
array indices is no longer supported. True multiple indices are
supported. Indexing is in the C manner, 'a[i1][i2]'. The use of the
SUBSEP built-in variable of Awk has been redefined.
- Integer array indices as well as string indices.
- Array indices have been expanded
to include integers as well as the string indices of Awk. Indices are
not automatically converted to strings as in Awk. Thus, for true
integer indices, the index ordering follows the numeric sequence with
an integer index value of '10' following an integer value of '2'
instead of preceding it.
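The ordering difference is the usual one between comparing strings and comparing numbers. A short Python illustration, using plain lists of hypothetical index values rather than QTAwk arrays:

```python
# Indices held as strings sort character by character, so "10" precedes "2".
string_indices = sorted(["2", "10", "1"])

# Indices held as integers sort numerically, so 10 follows 2.
integer_indices = sorted([2, 10, 1])

print(string_indices)   # ['1', '10', '2']
print(integer_indices)  # [1, 2, 10]
```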
- Arrays integrated into QTAwk.
- QTAwk integrates arrays with
arithmetic operators so that the operations are carried out on the
entire array. QTAwk also integrates arrays into user-defined functions so that
they can be passed to and returned from such functions in a natural and
intuitive manner. Awk does not allow returning arrays from user-defined
functions or allow arithmetic operators to operate on whole arrays.
In addition, for all
Linux versions, arrays have been fully integrated into all aspects of QTAwk including the match
operators, '~~' and '!~', and their implied
use in patterns and the built-in functions, 'sub', 'gsub', and 'match'.
The MATCH_INDEX built-in variable has been added to return the matching
array element index when an array has been used for pattern matching.
The string value of the SUBSEP built-in variable is used as the index
separator in MATCH_INDEX for multidimensional arrays.
Arrays used as regular expressions
with the match operators, both explicit and implied, retain their
internal regular expression form between uses. In addition, when the array as a whole is assigned to
another variable, the internal regular expression form is also assigned. The
internal regular expression form is discarded only when the array is changed.
This gives the user a more balanced control over dynamic regular expressions,
between that of true regular expressions, which retain the internal
form until execution is halted, and strings used as regular expressions,
which discard the internal regular expression form after each use.
- New Keywords:
| cycle | similar to 'next' except that it may use the
current input record or the next input record in restarting the outer pattern
matching loop. The current values of CYCLE_COUNT and MAX_CYCLE are used to
determine which input record to use. If CYCLE_COUNT <= MAX_CYCLE,
the current input record is used, else the next input record is read. |
| switch, case, default | similar to C syntax with the allowed
'switch' values and 'case' labels expanded to include any legal QTAwk
expression, evaluated at run-time. The expressions may evaluate to
any value including any numeric value, string or regular expression. |
| local | new keyword to allow the declaration and use
of local variables within compound statements, including user-defined
functions. Its use in user defined functions instead of the Awk
practice of defining excess formal parameters leads to functions that are easier to read
and maintain. The C 'practice' of allowing initialization in
the 'local' statement is followed. |
| endfile | similar to 'exit'. Simulates end of the current
input file only; any remaining input files are still processed. |
- New Arithmetic Functions.
- QTAwk includes 18 built-in arithmetic functions: all of the functions supported by Awk
plus the following:
| acos(x) | arc-cosine of x |
| asin(x) | arc-sine of x |
| cosh(x) | hyperbolic cosine of x |
| fract(x) | fractional portion of x |
| log10(x) | logarithm base 10 |
| pi or pi() | pi |
| sinh(x) | hyperbolic sine of x |
- New String Functions.
- QTAwk includes 33 built-in string functions: all of the functions supported by Awk plus
the following:
| center(s,w) or center(s,w,c) | center string |
| copies(s,n) | copies of string |
| deletec(s,p,n) | delete characters from a string |
| gensub(re,rs,how,target) | generalized substitution function |
| insert(s1,s2,p) | insert one string into another string |
| justify(a,n,w) or justify(a,n,w,c) | justify string |
| overlay(s1,s2,p) | overlay one string on another |
| remove(s,c) | remove characters from a string |
| replace(s) | replace all variables in a string |
| srange(c1,c2) | return string formed of all characters from c1 to c2 |
| srev(s) | reverse characters of string |
| stran(s) or stran(s,st) or stran(s,st,sf) | translate characters |
| strim(s) or strim(s,c) or strim(s,c,d) | trim leading and/or trailing characters |
| strlwr(s) | translate to lower case |
| strupr(s) | translate to upper case |
- New Date and Time functions
| _time() | Local time (seconds since midnight) |
| _ftime(format_str,sjdn,time) | Format date/time |
| jdn or jdn() or jdn(y,m,d) | Julian Day Number of today or date specified |
| jdn(fdate) | Calendar date of Julian Day Number specified |
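For reference, a Julian Day Number can be computed from a calendar date with a widely used integer formula. The Python sketch below is not taken from the QTAwk source: the function name, the `gregorian` flag (loosely mirroring the Gregorian built-in variable) and the noon-based day-number convention are assumptions, and QTAwk's own `jdn` may differ in detail.

```python
def jdn(year, month, day, gregorian=True):
    """Julian Day Number for a calendar date, using integer arithmetic only."""
    a = (14 - month) // 12        # 1 for January and February, else 0
    y = year + 4800 - a           # years counted from -4800, March-based
    m = month + 12 * a - 3        # March = 0 ... February = 11
    base = day + (153 * m + 2) // 5 + 365 * y + y // 4
    if gregorian:
        # Gregorian leap-year correction.
        return base - y // 100 + y // 400 - 32045
    # Julian calendar: every fourth year is a leap year.
    return base - 32083


print(jdn(2000, 1, 1))                     # 2451545
print(jdn(1582, 10, 15))                   # 2299161
print(jdn(1582, 10, 4, gregorian=False))   # 2299160
```

The last two values are consecutive, matching the historical switch in which Julian 4 October 1582 was followed by Gregorian 15 October 1582.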
- New Miscellaneous Functions.
| rotate(a) | rotate the elements of the array a. |
| execute(s) or execute(s,se) or execute(s,se,rf) | execute string s |
| execute(a) or execute(a,se) or execute(a,se,rf) | execute array a |
| findfile(var,pattern,attributes) | find files with specified names and attributes |
| pd_sym | access pre-defined variables |
| ud_sym | access user defined variables |
| resetre | return QTAwk utility to
start-up condition for all regular expressions,
including patterns and GROUP patterns. Only the internal regular
expression forms for arrays are not re-initialized. The internal
regular expression forms for arrays are re-initialized whenever the
array is changed in any manner. |
| setlocale | set the locale under which QTAwk is operating |
- New I/O Functions.
- I/O function syntax has been made consistent with the syntax of other functions. The redirection
operators, '<', '>' and '>>', and the pipeline operator, '|', have been deleted
as excessively error prone in expressions because of confusion with the
value testing and shifting operators. The pipeline operator has been
replaced by the new pipeline operator, '|>'. The functional syntax of the 'getline'
function has been made identical to that of the other built-in
functions. The new functions 'fgetline', 'fprint' and 'fprintf' have
been introduced for reading and writing to files other than the current input file and
to replace the redirection operators.
- Single character input/output functions have been added:
| getc() | return next character from current input file |
| fgetc(F) | return next character from named file, F |
| putc(c) | output character c to standard output file |
| fputc(c,F) | output character c to file F |
- The dropped file re-direction operator, '>>',
has been replaced by the 'append' function:
- append(F) --
Opens the file F for output to the end of the file. All subsequent
output to the file is appended to the end of the file. This function
must be called before the first output to the file to append. Any
output to the file prior to calling this function will open the file
and discard any existing contents, i.e., truncate to zero length.
- Two functions to search files for one or more regular
expressions:
| srchrecord(sp) or srchrecord(sp,rs) or srchrecord(sp,rs,var) |
search the current input file for the next
record containing a match to 'sp', using 'rs' as record separator (RS
if 'rs' is not specified), returning the record found in 'var', or $0 if 'var' is
not specified. Updates NR and FNR. Also reparses $0 if 'var' is not
specified and updates NF.
Returns:
- n ==> Record Present And Read, n == Number Of
Characters In Record plus EOR length plus 1.
- 0 ==> End-Of-File, EOF, Encountered
- -1 ==> Read Error Occurred (Including Failure To
Open File)
|
| fsrchrecord(fn,sp) or fsrchrecord(fn,sp,rs) or fsrchrecord(fn,sp,rs,var) |
search file 'fn' for the next record
containing a match to 'sp', using 'rs' as record separator (RS if 'rs' is
not specified), returning the record found in 'var', or $0 if 'var' is not
specified. Reparses $0 if 'var' is not specified and updates NF.
Returns:
- n ==> Record Present And Read, n == Number Of
Characters In Record plus EOR length plus 1.
- 0 ==> End-Of-File, EOF, Encountered
- -1 ==> Read Error Occurred (Including Failure To
Open File)
|
- The function 'get_FNR(F)' has been introduced. This function
returns the current record number of the input file 'F'. This function
is necessary to obtain the current input record number for input files
used with the 'fgetline' and 'fsrchrecord' functions.
- Expanded capability of formatted output.
- The limited output formatting
available with the Awk 'printf' function has been expanded by adopting
the complete output format specification of the ANSI C standard.
- 'local' keyword.
- The 'local' keyword has been
introduced to allow for variables local to user-defined functions (and
any compound statement). This expansion makes the Awk practice of
defining 'extra' formal parameters no longer necessary.
- Expanded user-defined functions.
- With the 'local' keyword, QTAwk
allows the user to define functions that may accept a variable number
of arguments. Functions, such as finding the minimum/maximum of a
variable number of variables, are possible with one function rather
than defining separate functions for each possible combination of
arguments.
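The idea of one function serving any number of arguments can be sketched in Python, where `*args` plays a role comparable to the vargc/vargv pair listed among the built-in variables. The function name `minimum` is hypothetical and the code is an analogy, not QTAwk syntax.

```python
def minimum(*vargv):
    """Return the smallest of however many arguments were passed."""
    vargc = len(vargv)            # count of variable arguments actually passed
    if vargc == 0:
        raise ValueError("minimum() needs at least one argument")
    smallest = vargv[0]
    for value in vargv[1:]:
        if value < smallest:
            smallest = value
    return smallest


print(minimum(3, 1, 2))        # 1
print(minimum(7))              # 7
print(minimum(4, 9, -2, 5))    # -2
```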
- User controlled trace capability.
- A user controlled statement trace capability
has been added. This gives the user a simple to use mechanism to trace
utility execution. Rather than adding 'print' statements, merely re-defining
the value of a built-in variable will give utility execution trace
information, including utility line number.
- Expanded built-in variable list.
- With 61 built-in variables, QTAwk includes
all of the built-in variables of Awk plus the following:
- _arg_chk
- used to determine whether to
check number of arguments passed to user-defined functions.
- ARGI
- index value in ARGV of next command line argument. Gives more
control of command line argument processing.
- CONVFMT
- used for converting floating point numbers to strings. OFMT
used only for output floating point numbers.
- CLENGTH
- similar to 'RLENGTH' of Awk. Set
whenever a 'case' value evaluates to a regular expression.
- CSTART
- similar to 'RSTART' of Awk. Set
whenever a 'case' value evaluates to a regular expression.
- CYCLE_COUNT
- count number of outer loop cycles
with current input record.
- DEGREES
- if TRUE, trigonometric functions
assume degree values, radians if FALSE.
- DELAY_INPUT_PARSE
- If TRUE parsing of input record
into fields is delayed until the value of NF or one of the input fields
is needed in an expression. Useful when the values of NF or any input
field are only rarely used. Record parsing is done only when needed.
- ENVIRON
- one dimensional array with
elements equal to the environment strings passed to QTAwk
- ECHO_INPUT
- controls echo of standard input
file to standard output file.
- FALSE
- predefined with constant value,
0.
- FIELDFILL
- string value used for filling
fixed length fields when fields changed.
- FIELDWIDTHS
- can be assigned a value for fixed
width fields, over-riding the use of FS for splitting current record
into fields. Similar to the same variable in gawk.
- FILEATTR
- file attributes of current input
file.
- FILEDATE
- date as a Julian Day Number, JDN,
of current input file.
- FILETIME
- time in seconds since midnight of
current input file.
- FILEDATE_CREATE
- creation date as a JDN of current
input file.
- FILETIME_CREATE
- creation time in seconds since
midnight of current input file.
- FILEDATE_LACCESS
- last access date as a JDN of
current input file.
- FILETIME_LACCESS
- last access time in seconds since
midnight of current input file.
- FILESIZE
- size in bytes of current input
file.
- FILE_SORT
- string value to define sort order
of array returned by "findfile" function.
- FILE_SEARCH
- TRUE/FALSE value to search
current input file for record(s) containing match to regular
expression(s) in FILE_SEARCH_PAT. Default value FALSE.
- FILE_SEARCH_PAT
- contains one or more patterns for
searching the current input file. Useful when the next record wanted matches
known regular expression(s) and may not be the next input record. Speeds
reading of the file in such cases.
- FS
- FS allowed to be an array. If FS
is an array, multiple patterns may be set for field separators.
- Gregorian
- TRUE/FALSE value to distinguish
using Gregorian or Julian calendar in computing Julian Day Number or
converting back to calendar date.
- IGNORECASE
- if assigned a true value, QTAwk
ignores case in all string and regular expression match operations.
- LOCALE
- single dimensioned array
containing the string values for locale dependent values.
- LONGEST_EXP
- used to control whether the
longest or the first string matching a regular expression is found.
- MATCH_INDEX
- assigned the string value of the
matching array element when an array used for regular expression match.
- MAX_CYCLE
- maximum number of outer loop
cycles permitted with current input record.
- MLENGTH
- similar to 'RLENGTH' of Awk. Set
whenever a stand-alone regular expression is encountered in evaluating
a pattern.
- MSTART
- similar to 'RSTART' of Awk. Set
whenever a stand-alone regular expression is encountered in evaluating
a pattern.
- NF
- if value changed, current input
record changed to reflect new value.
- NG
- equal to the number of the
regular expression in a GROUP matching a string in the current input
record.
- OFMT
- string value used only as format
for output of floating point numbers.
- RECLEN
- if assigned a non-zero numeric
value, integral value used for length of fixed length records. RS not
used unless RECLEN has a zero numeric value.
- RETAIN_FS
- if TRUE the original characters
separating the fields of the current input record are retained whenever
a field is changed, causing the input record to be re-constructed. If
FALSE the output field separator, OFS, is used to separate fields in
the current input record during reconstruction. The latter practice is
the only method available in Awk.
- RS
- RS allowed to be an array. If RS
is an array, and RECLEN has a zero numeric value, multiple patterns may
be set for record separators.
- RT
- automatically assigned string
value of record terminator for current input record.
- SUBSEP
- string value used as the array
element index separator in MATCH_INDEX.
- SPAN_RECORDS
- TRUE/FALSE, default value FALSE.
If TRUE, allows matches to FILE_SEARCH_PAT to span multiple input
records and return multiple records in $0. If FALSE, matches are confined
to a single record. Also controls matches spanning records in the
'srchrecord' and 'fsrchrecord' functions.
- TRACE
- value used to determine utility
tracing.
- TRANS_FROM/TRANS_TO
- strings used by 'stran' function
if second and/or third arguments not specified.
- TRUE
- predefined with constant value,
1
- QTAwk_Path
- initialized from resource
configuration file(s). Sets paths searched for input files.
- vargc
- used only in user-defined
functions defined with a variable number of arguments. At run-time, set
equal to the actual number of variable arguments passed.
- vargv
- used only in user-defined
functions defined with a variable number of arguments. At run-time, a
single dimensioned array with each element set to the argument actually
passed.
- Module_Path
- initialized from resource
configuration file(s). Sets paths searched for loadable module files.
- USER_FUNCTIONS
- singly dimensioned array with indices equal to the names of user
defined functions and element values equal to the names of the files in
which the functions were defined.
- MODULES
- singly dimensioned array with indices equal to the file names
of currently loaded modules and element values equal to the module
count.
- New command line options available:
- -ffilename
- multiple utility files may be
specified. In addition, the file directive:
#include "filename"
or
#include <filename>
may be used to include other
files. The path for finding files follows the C pract
- -vvar=value
- sets 'var' to value before any "BEGIN" actions are executed
- -Wd
- delays parsing of the input record
until any fields or the NF variable are referenced.
- -Wm
- forces QTAwk to allow multiple
includes of the same file, issuing an error message and skipping
multiple includes. Without this option specified, QTAwk exits with an error
message upon finding multiple includes of the same file.
- Definition of built-in variable, RS, expanded.
- When a value is assigned to RS, it is
converted to regular expression form. Strings matching the regular
expression act as record separators. Similar in behavior to the field
separator, FS. If RS is an array, multiple record separator patterns may be
specified.
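Splitting input on a regular-expression record separator, with several alternative patterns, can be modeled in a few lines of Python. This is an illustrative sketch: the function name and the choice to return each record paired with the text that terminated it (in the spirit of the RT variable) are assumptions, not QTAwk internals.

```python
import re


def split_records(data, rs_patterns):
    """Split data into (record, terminator) pairs using any of several RS patterns."""
    rs = re.compile("|".join(f"(?:{p})" for p in rs_patterns))
    records = []
    pos = 0
    for m in rs.finditer(data):
        if m.start() == m.end():
            continue                      # ignore empty separator matches
        records.append((data[pos:m.start()], m.group(0)))
        pos = m.end()
    if pos < len(data):
        records.append((data[pos:], ""))  # last record has no terminator
    return records


# Two separator patterns: a semi-colon, or a run of newlines.
print(split_records("a;b\n\nc", [";", r"\n+"]))
# [('a', ';'), ('b', '\n\n'), ('c', '')]
```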
- FILENAME
- In QTAwk, setting the built-in
variable, "FILENAME", to another value will change the current input
file. Setting the variable in Awk has no effect on the current input file.
- NF
- In QTAwk, setting the built-in
variable, NF, to another value will change the current contents of $0.
If the new value is greater than the current value, the current input
line is lengthened with new empty fields separated by the output field
separator string, OFS. If the new value is less than the current
value, then $0 is shortened by truncating at the end of the field
corresponding to the new NF value.
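The effect of assigning NF can be pictured with a small Python model. The function name `set_nf` is hypothetical, and the model simplifies by rebuilding the whole record with OFS, whereas the truncation case described above keeps the original text up to the end of the last retained field.

```python
def set_nf(fields, new_nf, ofs=" "):
    """Return (fields, record) after NF is set to new_nf."""
    if new_nf > len(fields):
        # Lengthen: pad with empty fields, joined by OFS.
        fields = fields + [""] * (new_nf - len(fields))
    else:
        # Shorten: drop every field past the new NF.
        fields = fields[:new_nf]
    return fields, ofs.join(fields)


print(set_nf(["a", "b", "c"], 5, ","))  # (['a', 'b', 'c', '', ''], 'a,b,c,,')
print(set_nf(["a", "b", "c"], 2, ","))  # (['a', 'b'], 'a,b')
```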
- The Tag Operator, '[< >]'
- The Tag operator may be used to
obtain or to set a particular part of the string matching the regular
expression pattern.
- getline
- The return value of the 'getline' function
has been changed when a valid record has been read. The return value
is the length of the record plus the length of the End-Of-Record plus
1.
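One plausible reading of the "plus 1" is that even an empty record with an empty terminator yields a non-zero result, so a successful read can never be confused with end-of-file. The Python sketch below works under that reading. It assumes that 'getline' shares the 0 = EOF convention documented for 'srchrecord', and the function name, the string-based input and the fixed newline terminator are all hypothetical.

```python
EOF = 0  # assumed to match the End-Of-File return documented for srchrecord


def read_record(data, pos, eor="\n"):
    """Return (status, record, new_pos) for the record starting at pos."""
    if pos >= len(data):
        return EOF, "", pos
    end = data.find(eor, pos)
    if end == -1:
        record, used_eor = data[pos:], ""      # final record, no terminator
    else:
        record, used_eor = data[pos:end], eor
    # Record length plus End-Of-Record length plus 1, as described above.
    status = len(record) + len(used_eor) + 1
    return status, record, pos + len(record) + len(used_eor)


data = "ab\n\ncd"
pos = 0
while True:
    status, record, pos = read_record(data, pos)
    if status == EOF:
        break
    print(status, repr(record))
# 4 'ab'
# 2 ''
# 3 'cd'
```

The empty second record still returns 2, which is why a loop testing the return value keeps going until a true end-of-file.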
- Loadable modules
- The ability to dynamically
load and unload modules defining user defined functions.
- Awk Problems
- Corrected admitted problems with
Awk. The problems mentioned on page 182 of "The AWK Programming
Language" have been corrected. Specifically:
- true multidimensional arrays have been implemented,
- the 'getline' syntax has been made to match that of other functions,
- declaring local variables in user-defined functions has been corrected,
- intervening blanks are allowed between the function call name and the opening parenthesis
(in fact, under QTAwk it is permissible to have no opening parenthesis
or argument list for user-defined functions that have been defined with
no formal arguments).
- Resource Configuration File(s)
- With Linux version 1.60, QTAwk
uses
up to two resource configuration files. Both are named ".qtawkrc". One
is global and used by all
users and is located in "/usr/local/etc". It must be placed there by
the sys admin, root user or super user (whatever name you use). The
second or "local" resource configuration file is
located in the user's Home directory, which is named in the "HOME"
environment variable by the Bash shell.
- QTAwk first tries to open
the global resource configuration file.
If it exists, it is opened and executed. QTAwk then tries to open the
user's local resource configuration file. If it exists in the user's
Home directory, it is opened and executed.
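The lookup order just described can be sketched as follows. This is a Python model, not QTAwk's own code: the function name `resource_config_files` and its parameters are hypothetical, while the file name ".qtawkrc", the directory "/usr/local/etc" and the "HOME" variable come from the text above.

```python
import os

def resource_config_files(home=None, global_dir="/usr/local/etc",
                          name=".qtawkrc"):
    """Return the existing resource configuration files in read order."""
    if home is None:
        home = os.environ.get("HOME", "")
    # The global file is tried first, then the user's local file.
    candidates = [os.path.join(global_dir, name)]
    if home:
        candidates.append(os.path.join(home, name))
    # A file that does not exist is simply skipped.
    return [path for path in candidates if os.path.isfile(path)]

for path in resource_config_files():
    print("would read:", path)
```

Since the local file is read second, anything it defines is processed after the global definitions.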
- The use of resource configuration file(s) by QTAwk offers more
ways for the user to customize QTAwk
for the user's own use. The QTAWK
environment variable was limited to setting the search path for QTAwk
utility files. The resource configuration files are now used for that
plus much more.
- The
following commands are available for use in resource configuration
files:
- pattern_action:
This command allows the user to insert a pattern/action pair into the
resource configuration file. Any pattern/action pair may be inserted
including those for pre-defined patterns (BEGIN, INITIAL, NOMATCH,
GROUP, FINAL and END) and user defined functions. This allows the use
of pre-defined pattern actions and user defined functions for all QTAwk
utilities without having to "include" the file into all utility files.
The pattern/action pair or user defined function may span as many lines
as needed; simply use the backslash character, '\', as the last
character on a line to indicate continuation on the next line.
- Statement:
This command allows any valid "statement" to be inserted into the
resource configuration file. The statement will be executed
immediately before any "BEGIN" pre-defined pattern action. The
statement may span as many lines as
needed; simply use the backslash character, '\', as the last character
on a line to indicate continuation on the next line.
- Expression:
This command allows any valid "expression" to be inserted into the
resource configuration file. The expression will be executed after any
statements defined by the above command and
immediately before any "BEGIN" pre-defined pattern action. The
expression may span as many lines as
needed; simply use the backslash character, '\', as the last character
on a line to indicate continuation on the next line.
- Immediate Expression:
This command allows any valid "expression" to be inserted into the
resource configuration file. The expression is executed
immediately. This allows the user to specify the values of pre-defined
variables. It is usually used to set the value of the pre-defined variable
"QTAwk_Path" - the search path(s) for utility files.
- Delay Input Parse = on/off.
This command allows the user to turn on or off the Delay Input Parse
mode of QTAwk without having
to invoke the mode on every command line.
- Replace Pattern_Actions
This command instructs QTAwk
to first delete any pattern/action pairs
read previously from this configuration file or
from a resource configuration file read before the present one. This
command is only useful in the user's local resource configuration
file, when the user wants to exclude any pattern/action pairs and user
defined functions from the global resource configuration file.
Normally, the pattern/action pairs in the user's local resource
configuration file would simply be used in addition to any from the
global file.
- Replace Statements
This command is the same as the previous command, except it works with
any statements.
- Replace Expressions
This command is the same as the previous two commands, except it works
with any expressions.
Note that the resource configuration file is scanned for the keyword
commands listed above. Any line not matching the keys is ignored. A
sample configuration file, ".qtawkrc", is included and should be
customized for the user's or system administrator's use.
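The scanning rule above (join backslash-continued lines, act on lines that start with a keyword, ignore everything else) can be sketched in Python. This is only an illustration under stated assumptions, not QTAwk's parser: the function names are hypothetical, the keyword spellings are taken from the list above, and matching them case-insensitively at the start of a line is an assumption. The command payloads in the sample are made up and treated as opaque text.

```python
# Keyword spellings copied from the command list; exact syntax is assumed.
KEYWORDS = [
    "pattern_action:", "statement:", "expression:",
    "immediate expression:", "delay input parse",
    "replace pattern_actions", "replace statements", "replace expressions",
]

def join_continuations(lines):
    """Merge lines ending in a backslash with the line that follows."""
    logical, pending = [], ""
    for line in lines:
        line = line.rstrip("\n")
        if line.endswith("\\"):
            pending += line[:-1]
            continue
        logical.append(pending + line)
        pending = ""
    if pending:
        logical.append(pending)
    return logical

def scan_config(lines):
    """Return (keyword, payload) pairs; non-matching lines are ignored."""
    commands = []
    for line in join_continuations(lines):
        stripped = line.strip()
        for key in KEYWORDS:
            if stripped.lower().startswith(key):
                commands.append((key, stripped[len(key):].strip()))
                break
    return commands

sample = [
    "this line matches no keyword and is ignored\n",
    "Statement: x = 1; \\\n",
    "y = 2\n",
    "Delay Input Parse = on\n",
    "Replace Statements\n",
]
for command in scan_config(sample):
    print(command)
```

Keeping continuation handling separate from keyword matching means a multi-line pattern/action pair reaches the matcher as a single logical line.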
© Terry D. Boldt 1997-2006
All Rights Reserved
Last Updated: Feb. 13, 2006