Built-In Functions

QTAwk offers a rich set of built-in arithmetic, string, I/O, array and system functions. The set of built-in functions has been extended beyond that available in Awk. The I/O functions have been changed to match the functional syntax of all other built-in and user defined functions.

  1. Arithmetic Functions
  2. String Functions
  3. File Functions
  4. Date and Time Functions
  5. Miscellaneous Functions

Arithmetic Functions

QTAwk offers the following built-in arithmetic functions. Those marked with an asterisk, '*', are new to QTAwk:

  1. acos(x)
  2. asin(x)
  3. atan2(y,x)
  4. cos(x)
  5. *cosh(x)
  6. exp(x)
  7. *fract(x)
  8. int(x)
  9. log(x)
  10. *log10(x)
  11. *pi()
  12. rand()
  13. sin(x)
  14. sinh(x)
  15. sqrt(x)
  16. srand
acos(x) Return arc-cosine of x. The return value has degree units if the built-in variable DEGREES is true, radian units otherwise.
asin(x) Return arc-sine of x. The return value has degree units if the built-in variable DEGREES is true, radian units otherwise.
atan2(y,x) Return arc-tangent of y/x, in the range -π to π. The return value has degree units if the built-in variable DEGREES is true, radian units otherwise.
cos(x) Return cosine of x. Assumes x is in degrees if built-in variable DEGREES is true, otherwise assumes x is in radians.
cosh(x) Return hyperbolic cosine of x.
exp(x) Return e^x.
fract(x) Return fractional portion of x. Refer to int for the function returning the integer portion.
int(x) Return integer portion of x. Refer to fract for the function returning the fractional portion.
log(x) Return natural (base e) logarithm of x.
log10(x) Return base 10 logarithm of x.
pi() Return pi.
rand() Return random number r, 0 <= r < 1.
sin(x) Return sine of x. Assumes x is in degrees if built-in variable DEGREES is true, otherwise assumes x is in radians.
sinh(x) Return hyperbolic sine of x.
sqrt(x) Return square root of x. If x < 0, returns 0.
srand This function has two forms:
srand(x)
Set x as new seed for rand().
srand()
Use current system time as new seed for rand().
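
The effect of DEGREES and the relationship between int and fract can be modelled outside QTAwk. The following is a minimal Python sketch, not QTAwk code. The names qt_sin, qt_acos, qt_int and qt_fract and the degrees parameter (standing in for the DEGREES variable) are hypothetical, and it assumes that int truncates toward zero so that int(x) + fract(x) == x.

```python
import math

def qt_sin(x, degrees=False):
    # Argument is interpreted as degrees when the DEGREES stand-in is true.
    return math.sin(math.radians(x) if degrees else x)

def qt_acos(x, degrees=False):
    # Result is reported in degrees when the DEGREES stand-in is true.
    result = math.acos(x)
    return math.degrees(result) if degrees else result

def qt_int(x):
    # Assumption: the integer portion is obtained by truncating toward zero.
    return math.trunc(x)

def qt_fract(x):
    # Assumption: the fractional portion keeps the sign of x.
    return x - math.trunc(x)
```

Under these assumptions, qt_int(x) + qt_fract(x) reproduces x for positive and negative values alike.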

String Functions

QTAwk offers the following built-in string handling functions. Those marked with an asterisk, '*', are new to QTAwk.

Some string functions require or return a string position. String positions start with the first character as position 1.

  1. *center
  2. *copies(s,n)
  3. *deletec(s,p,n)
  4. *gensub
  5. gsub
  6. index(s1,s2)
  7. *insert(s1,s2,p)
  8. length
  9. match(s,r)
  10. *overlay(s1,s2,p)
  11. *remove(s,c)
  12. *replace(s)
  13. split
  14. *srange(c1,c2)
  15. *srev(s)
  16. *stran
  17. *strim
  18. *strlwr(s)
  19. *strupr(s)
  20. sub
  21. substr
center This function has two forms:
center(s,w)
return string s centered in w blank characters.
center(s,w,c)
return string s centered in w 'c' characters.

In both cases, if w is less than length(s), s is truncated to w characters.

copies(s,n) Return n copies of string s.
deletec(s,p,n) Return string s with n characters deleted starting at position p.
gensub This function searches for expressions or strings in one string and substitutes specified strings for those found. The changed string is returned.

This function has two forms:

gensub(re,rs,how)
substitute for strings matched by the regular expression, re, globally in $0, using replacement string, rs, according to how. Return the modified string. $0 is unchanged. The evaluation of the replacement string expression, rs, is the same as for gsub.
gensub(re,rs,how,target)
substitute for strings matched by the regular expression, re, globally in target, using the replacement string, rs, according to how. Return the modified string. target is unchanged. The evaluation of the replacement string expression, rs, is the same as for gsub.

The how parameter determines the number of replacements made and which matching string to replace, using the following rules:

  1. If how is a string, including a regular expression or a single character, which starts with the character 'g' or 'G', then all matches are replaced.
  2. If how is a string, including a regular expression or a single character, not starting with 'g' or 'G', then the first match only is replaced and no search for matches beyond the first is made.
  3. If how is a numeric greater than zero, then only the match whose ordinal number equals how is replaced. For example, if
    how == 3

    then only the third match found is replaced. In addition, no further searches for matches are made.

  4. If how is a numeric equal to zero, then no search for matches is done and no replacements are made.
  5. If how is a negative numeric, then it is treated as though
    how == 1

    and only the first match is replaced and no search for matches beyond the first is made.
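
The rules for how can be summarized with a small model. The following is a Python sketch of the selection logic only, not QTAwk code. gensub_model is a hypothetical name, Python's re syntax stands in for QTAwk regular expressions, and the replacement is treated as a plain literal string. Unlike QTAwk, the model keeps scanning after the selected match, which does not change the result.

```python
import re

def gensub_model(pattern, repl, how, target):
    # Decide which match numbers are replaced, following the five rules.
    if isinstance(how, str):
        # Rules 1 and 2: 'g' or 'G' means all matches, anything else the first.
        which = None if how[:1] in ("g", "G") else 1
    elif how == 0:
        # Rule 4: no search and no replacement.
        return target
    else:
        # Rule 3: replace only match number how. Rule 5: negative acts as 1.
        which = 1 if how < 0 else int(how)

    seen = 0

    def substitute(m):
        nonlocal seen
        seen += 1
        if which is None or seen == which:
            return repl
        return m.group(0)  # leave every other match untouched

    return re.sub(pattern, substitute, target)
```

For example, gensub_model("a", "X", 2, "banana") replaces only the second "a", while a how of "g" replaces all three.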

Using gensub, the count of substitutions can be obtained quite easily. QTAwk does not evaluate the replacement string expression, rs, at the time the gensub function is called, but delays evaluation until a match has been found and the replacement is to be made. The following construction therefore counts the number of matches found:

gcnt = 0;
gensub(re, (gcnt++, rs), how[, target]);

The comma operator has been used in the second argument to increment the count variable, gcnt, every time a match has been found. Note that if how is a numeric, then the final value of gcnt will be equal to the number of matches found or to the value specified for how, whichever is less. Note also that the second argument, (gcnt++, rs), MUST be enclosed in parentheses.

The description of the built-in function gsub contains an expanded explanation of the arguments re and rs.

gsub This function has two forms:
gsub(re,rs)
substitute for strings matched by regular expression, re, globally in $0, return number of substitutions made.
gsub(re,rs,target)
substitute for strings matched by regular expression, re, globally in string target, return number of substitutions made.

The substitution strings are determined by the replacement string, rs, the string value of the second argument. The replacement string expression, rs, is not evaluated at the time the function is called, but when the replacement string is used for replacement.

Since the first argument, re, may be an array, the value of the built-in variable, MATCH_INDEX, can be affected as matches are made. By delaying the evaluation of the replacement string expression until the replacement is made, the change in MATCH_INDEX can be used to affect the value of the replacement string for each replacement.

Replacing a list of strings in another string could be accomplished by two methods. In both methods the strings to replace, the pattern strings, are contained in the singly dimensioned array 'str_pat' and the replacement strings are contained in the singly dimensioned array 'str_rep'. The first method uses a loop to replace each string in str_pat.

for ( i in str_pat ) gsub(str_pat[i],str_rep[i],rep_var);

The second method uses the array capabilities of QTAwk to automatically scan for all pattern strings in 'str_pat'.

gsub(str_pat,str_rep[MATCH_INDEX + 0],rep_var);

As each pattern string is found, MATCH_INDEX is set to the string value of the corresponding index of the pattern string. The replacement string expression is then evaluated. Adding 0 to MATCH_INDEX is necessary to convert its string value to a numeric for indexing into the str_rep array.

An example of such use is found in the "more.exp" utility included with QTAwk. Using the ANSI display option, strings may be highlighted on output. Finding and highlighting the desired strings is accomplished with the single statement:

# Put In High Light ANSI Sequences For High Lighted Text
if ( highlight )
gsub(High_pat,High_Text[MATCH_INDEX + 0] >< "[<0>]" >< Normal,dl);

where 'highlight' is a variable with a TRUE/FALSE value used to flag if strings are to be highlighted. 'High_pat' and 'High_Text' are arrays with the string patterns to highlight and the ANSI character sequences necessary for changing the display colors. [<0>] is the tag operator, replaced by the string matching the desired pattern in 'High_pat'. 'Normal' contains the ANSI character sequence necessary to return the display to the normal display colors. 'dl' is the variable containing the display line.

The replacement string expression:

High_Text[MATCH_INDEX + 0] >< "[<0>]" >< Normal

is evaluated each time gsub finds a string match in 'dl' for one of the patterns in 'High_pat'. The value of MATCH_INDEX reflects the index of the element in 'High_pat' matched (0 is added to convert MATCH_INDEX from a string value to an integer value for indexing 'High_Text').

QTAwk guarantees that the replacement string expression will be evaluated as replacements are made in the text string from left to right. For constant expressions used for the replacement string, evaluating the replacement string expression when the function is called or when the replacement is made are equivalent.

Without the use of arrays as search patterns, the above could only be accomplished as a much slower loop:

for ( i in High_pat )
gsub(High_pat[i],High_Text[i] >< "[<0>]" >< Normal,dl);

The above example uses the string replacement token, '[<0>]'. The replacement string token has the same form as the tag operator:

[<i,j>], 0 <= i <= 7, 0 <= j <= 31.
The token is replaced with the tagged string at the row, column specified with i and j. If both i and j are zero, the token is replaced by the entire matching string.
[<j>], 0 <= j <= 31.
This is a special case of the above. If j == 0, then the token is replaced by the entire match string. If j is greater than 0, then it is assumed to be the column count at row 1.

To conform to prior practice in Awk, the replacement string token & is also supported and is equivalent to [<0>].

Refer to Tag Operator for a full description of the tag operator and Tagged Strings in Regular Expressions for a full description of tagged strings.

The replacement string tokens can be turned off by escaping the token, i.e., preceding the token with a backslash, e.g., \[<0>] or \&.

One difference between the gensub function and the sub and gsub functions is how escaped characters are treated in the replacement string. In gensub, every escaped character is translated into the character following the escape character, '\', ASCII 92. In the sub and gsub functions, only the escape character itself and the replacement string tokens, &, [<k>] and [<i,j>], are translated.

The following table illustrates how the three functions treat escaped characters in the replacement string. The first column in the table shows the string as contained in the QTAwk utility. The second column contains the string as passed to the substitution function after the initial translation by QTAwk when reading the utility. The third column shows the string generated by the substitution functions sub and gsub. The fourth column shows the string generated by the gensub function.
Utility input   Passed to function   'sub'/'gsub' output                       'gensub' output
[<0>]           [<0>]                the matched text                          the matched text
&               &                    the matched text                          the matched text
\\\\\\&         \\\&                 a literal \&                              a literal \&
\\\\&           \\&                  a literal \ followed by the matched text  a literal \ followed by the matched text
\\&             \&                   a literal &                               a literal &
\\q             \q                   a literal \q                              a literal q

Note the last line in the above table and the different treatment by 'gensub' and the 'gsub'/'sub' functions. 'gensub' translates '\q' into q, while 'gsub' and 'sub' both pass the whole sequence through.
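
The escape handling in the table can be modelled as follows. This is a minimal Python sketch, not QTAwk code. expand is a hypothetical name, its rs argument is the string as passed to the function (the second table column), and the sketch deliberately handles only the & token, not the [<k>] and [<i,j>] tokens.

```python
def expand(rs, matched, gensub=False):
    # Build the replacement text for one match from the string rs.
    out = []
    i = 0
    while i < len(rs):
        ch = rs[i]
        if ch == "\\" and i + 1 < len(rs):
            nxt = rs[i + 1]
            if gensub or nxt in "\\&":
                # gensub: every escaped character becomes the character itself.
                # sub/gsub: only an escaped backslash or an escaped & is translated.
                out.append(nxt)
            else:
                # sub/gsub pass any other escape sequence through unchanged.
                out.append(ch + nxt)
            i += 2
        elif ch == "&":
            out.append(matched)  # unescaped & stands for the matched text
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With this model, expand(r"\q", "M") yields the two characters \q, while expand(r"\q", "M", gensub=True) yields q, matching the last row of the table.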

As another example of the use of a substitution function and the replacement string token, consider the need to replace one set of words with another set. Whenever multiple substitutions must be made in a single string, the gsub function is optimized to make the substitutions and will be easier than attempting to accomplish the same with a user-defined function. Suppose all occurrences of the words contained in the array "word_list" are to be replaced with the corresponding words contained in the array "replace_list".

The following words are to be replaced with those indicated:
Search Word    Replacement Word
imminent       impending
expurgate      cleanse
antipodal      opposite
laconic        brief
fidgety        nervous
furvor         enthusiasm
aficionado     devotee
didactic       moral
merchandise    advertise
fastidious     particular

Define the two arrays:
Search words: Replacement words:
word_list[0] = "imminent"; replace_list[0] = "impending";
word_list[1] = "expurgate"; replace_list[1] = "cleanse";
word_list[2] = "antipodal"; replace_list[2] = "opposite";
word_list[3] = "laconic"; replace_list[3] = "brief";
word_list[4] = "fidgety"; replace_list[4] = "nervous";
word_list[5] = "furvor"; replace_list[5] = "enthusiasm";
word_list[6] = "aficionado"; replace_list[6] = "devotee";
word_list[7] = "didactic"; replace_list[7] = "moral";
word_list[8] = "merchandise"; replace_list[8] = "advertise";
word_list[9] = "fastidious"; replace_list[9] = "particular";

The following gsub call will accomplish all replacements in the string value of the variable line with a single call:

r_cnt = gsub(word_list,replace_list[MATCH_INDEX + 0],line);

The value of the variable r_cnt will be the number of replacements made. If the value of line must be unchanged, then the gensub function may be used:

new_line = gensub(word_list,replace_list[MATCH_INDEX + 0],"g",line);

and to still obtain the number of replacements made, use:

r_cnt = 0;
new_line = gensub(word_list,(r_cnt++ , replace_list[MATCH_INDEX + 0]),"g",line);
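
For readers more familiar with other languages, the array search with MATCH_INDEX behaves roughly like an alternation of patterns with a callback that looks up the replacement for whichever alternative matched. The following Python sketch illustrates that idea under stated assumptions: replace_words is a hypothetical name, only three of the words are included, and Python named groups stand in for MATCH_INDEX. It says nothing about how QTAwk implements the search internally.

```python
import re

word_list = ["imminent", "laconic", "fidgety"]
replace_list = ["impending", "brief", "nervous"]

# One named group per search word; the group name records the array index.
pattern = re.compile(
    "|".join(f"(?P<w{i}>{re.escape(w)})" for i, w in enumerate(word_list))
)

def replace_words(line):
    def pick(m):
        # Equivalent of MATCH_INDEX + 0: numeric index of the pattern matched.
        match_index = int(m.lastgroup[1:])
        return replace_list[match_index]

    # subn returns the new string and the number of replacements made,
    # leaving the original line unchanged (as gensub does).
    return pattern.subn(pick, line)
```

The callback runs once per match, from left to right, which mirrors the delayed evaluation of the replacement string expression.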

Note that if only words delimited by white space, commas, periods, parentheses or similar separators are to be replaced, then the values of word_list must be changed as follows.

Define regular expressions that match the beginning or end of the line, white space, or another separator, on either side of the word:

ldg = /(^|{_w}|[\.,\(\)<>{}\[\]])/;
tlg = /($|{_w}|[\.,\(\)<>{}\[\]])/;

then add these to the desired words

for ( i in word_list ) word_list[i] = ldg >< word_list[i] >< tlg;

Then, to effect the replacement

sr_cnt = gsub(word_list,"[<1>]" >< replace_list[MATCH_INDEX + 0] >< "[<2>]",line);

The replacement string tokens [<1>] and [<2>] are needed here to place the characters surrounding the original word before and after the replacement word. Note that if the desired word starts or ends a line, then [<1>] or [<2>], as appropriate, is the null string.

index(s1,s2) Return position of string s2 in string s1.

Return zero, 0, if string s1 does not contain string s2 as a substring.

insert(s1,s2,p) Return string formed by inserting string s2 into string s1 starting at position p.
length This function has three forms:
length
return number of characters in $0.
length()
return number of characters in $0.
length(s)
return number of characters in string s.
match(s,r) Return true, 1, if string s contains a substring matched by regular expression r.

If string s contains no match to regular expression r, then return false, 0.

Set RLENGTH to length of substring matched (or zero) and RSTART to start position of substring matched (or zero).
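
The return value and the two variables set by match can be modelled as a single result. The following Python sketch is an illustration only: the function returns a tuple (found, RSTART, RLENGTH) instead of setting built-in variables, and Python's re syntax is assumed in place of QTAwk regular expressions.

```python
import re

def match(s, r):
    # Returns (found, RSTART, RLENGTH) using 1-based string positions.
    m = re.search(r, s)
    if m is None:
        return 0, 0, 0
    return 1, m.start() + 1, m.end() - m.start()
```

For example, match("foobar", "o+") reports a match starting at position 2 with length 2.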

overlay(s1,s2,p) Return string formed by overlaying string s2 on string s1 starting at position p. May extend length of s1. If p > length(s1), s1 is padded with blanks to the appropriate length.
remove(s,c) Return string formed by removing all 'c' characters from string s.
replace(s) Return string formed by replacing all repeated expressions, {n1,n2}, and named expressions, {name}, in string s.

This is the same operation performed on strings used as regular expressions and when converting regular expressions into internal form.

split This function has two forms:
split(s,a)
Split string s into array a on field separator FS. Return number of fields. The same rules applied to FS for splitting the current input record apply to the use of FS in splitting s into a.
split(s,a,fs)
Split string s into array a on field separator fs. Return number of fields. The same rules applied to FS for splitting the current input record apply to the use of fs in splitting s into a.

Note: Whenever an element of the return array, 'a', contains a single character, the value returned is a character type rather than a string type. If some of the return elements may be single characters, it may be necessary to test array elements for a single character return and convert those array elements to string type by appending the null string to the element value:

    for ( i in a ) if ( e_type(a[i]) == 5 ) a[i] ><= "";

Note: if the third parameter, fs, is the null string, the string value of s is split into individual characters. This is the same behavior as for the input record when FS is the null string. The use of the null string as the 'fs' argument is a handy method for splitting a string into a character array so that each character may be addressed individually. For example:

cn = split("abcdefg",ca,"");

yields:

cn == 7;
ca[1] == 'a'
ca[2] == 'b'
ca[3] == 'c'
ca[4] == 'd'
ca[5] == 'e'
ca[6] == 'f'
ca[7] == 'g'
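
The same result can be pictured in another language as a mapping from 1-based positions to characters. A minimal Python sketch, with split_chars as a hypothetical name:

```python
def split_chars(s):
    # Mirror split(s, ca, ""): one element per character, indexed from 1.
    ca = {pos: ch for pos, ch in enumerate(s, start=1)}
    return len(ca), ca
```

Calling split_chars("abcdefg") gives a count of 7 and the same index-to-character pairs listed above.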

srange(c1,c2) Return string formed by concatenating characters from c1 to c2 inclusive. If c2 < c1 the null string is returned. Thus,

srange('a','k') == "abcdefghijk".

srev(s) Return string formed by reversing string s.

srev(srange('a','k')) == "kjihgfedcba".

stran Returns a string formed by translating characters in string s matching characters in string sf to corresponding characters in string st. If there is no corresponding character in st, then the matched character is replaced with a blank character. The corresponding character is that character at the same position in st as the position of the matching character in sf.

This function has three forms:

stran(s)
is equivalent to:
stran(s,TRANS_TO,TRANS_FROM)
sf assumes the value of the built-in variable TRANS_FROM. st assumes the value of the built-in variable TRANS_TO. If the string values of TRANS_TO and TRANS_FROM have not been changed, this in effect translates the string s to lowercase and is equivalent to: strlwr(s)

stran(s,st)
sf assumes the value of the built-in variable TRANS_FROM.

stran(s,st,sf)
translates all characters in s which are contained in sf to the character in the same position in st. If the default values of TRANS_TO and TRANS_FROM have not been changed, then the statement:
stran(s,TRANS_FROM,TRANS_TO);

returns string s translated to uppercase and is equivalent to: strupr(s)
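
The translation rule can be modelled as a character table. The following Python sketch is not QTAwk code. It assumes that the default TRANS_FROM holds the uppercase letters and the default TRANS_TO the lowercase letters (inferred from the equivalence with strlwr above), and that the first occurrence wins if a character appears more than once in sf.

```python
import string

# Assumed defaults, inferred from stran(s) being equivalent to strlwr(s).
TRANS_FROM = string.ascii_uppercase
TRANS_TO = string.ascii_lowercase

def stran(s, st=TRANS_TO, sf=TRANS_FROM):
    table = {}
    for pos, ch in enumerate(sf):
        # Character at the same position in st, or a blank if st is too short.
        table.setdefault(ord(ch), st[pos] if pos < len(st) else " ")
    return s.translate(table)
```

Swapping the two defaults, as in stran(s, TRANS_FROM, TRANS_TO), maps lowercase letters to uppercase in this model.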

strim This function has three forms. Each form returns a string formed by trimming leading and/or trailing characters from string s.
strim(s)
return string formed by trimming leading and trailing white space from string s. Leading white space matches the regular expression /^{_w}+/. Trailing white space matches the regular expression /{_w}+$/.

strim(s) is equivalent to strim(s,1,1)

strim(s,le)
return string formed by trimming string matching le.

Differing actions are taken depending on the type of le:

le type              action
regular expression   delete first string matching regular expression
string               convert to regular expression and delete first matching string
single character     delete all leading characters equal to 'le'
nonzero numeric      delete leading white space matching /^{_w}+/
zero numeric         ignore

strim(s,le) is equivalent to the form strim(s,le,0)

The following all delete the leading dashes from the given string:

strim("------ remove leading -------",/^-+/);
strim("------ remove leading -------",/-+/);
strim("------ remove leading -------",'-');

==> "remove leading -------"
strim(s,le,te)
return string formed by trimming string matching le and string matching te from s. le and te may be a regular expression, a string, a single character or a numeric. Differing actions are taken depending on the type of le and te:

le/te type           action
regular expression   delete first string matching regular expression
string               convert to regular expression and delete first matching string
single character     delete all leading/trailing characters equal to 'le'/'te' respectively
nonzero numeric      delete leading/trailing white space matching /^{_w}+/ or /{_w}+$/ respectively
zero numeric         ignore

strim("======remove leading and tailing-------",'=','-')
or
strim("======remove leading and tailing-------",/^=+/,'-')
or
strim("======remove leading and tailing-------",'=',/-+$/)
or
strim("======remove leading and tailing-------",/^=+/,/-+$/)
or
strim("======remove leading and tailing-------",/=+/,/-+/)
or
strim("======remove leading and tailing-------",/-+/,/=+/)

==> "remove leading and tailing"
strim("======remove leading       ",'=',FALSE)
==> "remove leading "

strim(" remove tailing-------",FALSE,'-')
==> " remove tailing"
strlwr(s) Return string s translated to lower-case.
strupr(s) Return string s translated to upper-case.
sub This function has two forms:
sub(re,rs)
substitute for leftmost string matched by regular expression, re, in $0, return number of substitutions made (0/1). Refer to description of built-in function gsub for when the replacement string expression, rs, is evaluated.
sub(re,rs,target)
substitute for leftmost string matched by regular expression, re, in target, return number of substitutions made (0/1). Refer to description of built-in function gsub for when the replacement string expression, rs, is evaluated.
substr This function has two forms:
substr(s,p)
return string formed from suffix of string s starting at position p.
substr(s,p,n)
return string formed from n characters of string s starting at position p. If n == 1, a character constant is returned.

Note: Whenever n == 1, the return value is a character type rather than a string type. If a string type is essential for the returned value then it will be necessary to convert the value to string type by appending the null string to the value:

    ret = substr(s,p,1) >< "";

File Functions

QTAwk offers the following built-in file functions.

For input files, if a drive and/or path is specified with the filename, then only that drive and path are searched for the desired file. If no drive or path is specified, then the current directory is searched first. If the file is not found there, then the paths (optionally with a drive specifier) specified by the string value of the built-in variable QTAwk_Path are searched. Multiple paths may be specified in QTAwk_Path by separating them with semicolons.

The file functions are:

Input:

  1. getline
  2. fgetline
  3. srchrecord
  4. fsrchrecord
  5. getc
  6. fgetc

Output:

  1. print
  2. fprint
  3. printf
  4. fprintf
  5. putc
  6. fputc
  7. sprintf

Miscellaneous:

  1. append
  2. close
  3. get_FNR
  4. findfile
  5. fflush
  6. load_module
  7. unload_module
getline This function has two forms:
getline
getline()
reads next record from current input file into $0. Sets fields, NF, NR and FNR.
getline v
getline(v)
reads next record from current input file into variable v. Sets NR and FNR.

The effect of each form is summarized in the input function table.

Returns:

  1. the number of characters read plus the length of the End-Of-Record plus 1,
  2. 0 if End-Of-File was encountered, or
  3. -1 if an error occurred.

Note: The built-in variables FILE_SEARCH and FILE_SEARCH_PAT have no effect on the next record input with this function. The next physical record in the file is read regardless of the value of FILE_SEARCH.


fgetline This function has two forms:
fgetline(F)
reads next record from file F into $0. Sets fields and NF.
fgetline(F,v)
reads next record from file F into variable v.

The effect of the fgetline function is summarized in the input function table

Returns:

  1. the number of characters read plus the length of the End-Of-Record plus 1,
  2. 0 if end-of-file was encountered or
  3. -1 if an error occurred.

Note: The built-in variables FILE_SEARCH and FILE_SEARCH_PAT have no effect on the next record input with this function. The next physical record in the file is read regardless of the value of FILE_SEARCH.


srchrecord This function has three forms:
  1. srchrecord(sp)
  2. srchrecord(sp,rs)
  3. srchrecord(sp,rs,var)

This function searches the current input file for the next record or records matching the search pattern, sp. If the record separator parameter, rs, is not specified, records are determined by the variable RS. If rs is specified, record boundaries are determined by the strings matching rs. rs may be a simple constant, a variable or an array. As for the built-in variable RS, specifying rs as the null string, "", sets a blank line as the record separator. In that case QTAwk will silently use the regular expression /\n\n+/.

The record or records matching the search pattern are returned in $0 if var is not specified. If var is specified, the matching record or records are returned in var. The built-in variables FNR and NR are updated to reflect the current position and record number after the search. The built-in variables NF and $i, 0 <= i <= NF, are set when var is not specified. The effect of the srchrecord function is summarized in the input function table.

Returns

  1. the number of characters read plus the length of the End-Of-Record plus 1,
  2. 0 if end-of-file was encountered or
  3. -1 if an error occurred.
fsrchrecord This function has three forms:
  1. fsrchrecord(fn,sp)
  2. fsrchrecord(fn,sp,rs)
  3. fsrchrecord(fn,sp,rs,var)

This function searches the file specified in fn for the next record or records matching the search pattern, sp. If the record separator parameter, rs, is not specified, records are determined by the variable RS. If rs is specified, record boundaries are determined by the strings matching rs. rs may be a simple constant, a variable or an array. As for the built-in variable RS, specifying rs as the null string, "", sets a blank line as the record separator. In that case QTAwk will silently use the regular expression /\n\n+/. The record or records matching the search pattern are returned in $0 if var is not specified. If var is specified, the matching record or records are returned in var. The built-in variables NF and $i, 0 <= i <= NF, are set when var is not specified.

Returns

  1. the number of characters read plus the length of the End-Of-Record plus 1,
  2. 0 if end-of-file was encountered or
  3. -1 if an error occurred.

The effect of the fsrchrecord function is summarized in the input function table

getc() Reads next character from current input file and returns the character read, 0 if End-Of-File or -1 on file read error.

The ECHO_INPUT built-in variable controls echo of characters read from the standard input file or keyboard file to the standard output file.

fgetc(F) Reads next character from file F and returns the character read, 0 if End-Of-File or -1 if a file error occurs.

When reading from the standard keyboard file, there are no End-Of-File or error returns. The fgetc function will return 0 when a function key, cursor key or other key not corresponding to an ASCII character is pressed when reading from the keyboard file. A second call must then be made to read the keyboard scan code and so recognize the key pressed.

The ECHO_INPUT built-in variable controls echo of characters read from the standard input file or keyboard file to the standard output file.

print This function has four forms:
  1. print()
  2. print
  3. print(...)
  4. print ...;

Forms 1 and 2 print $0 to the standard output file followed by ORS. Returns the number of characters printed.

Forms 3 and 4 print the expressions in the expr_list, ..., to the standard output file, each separated by OFS. The last expression is followed by ORS. Returns the number of characters printed.

fprint This function has two forms:
fprint(F)
prints $0 to file F followed by ORS. Returns the number of characters printed.
fprint(F,...);
prints expressions in the expr_list, '...', to the file F, each separated by OFS. The last expression is followed by ORS. Returns the number of characters printed.
printf(fmt,...) Print expr_list, ..., to the standard output file according to the format string, fmt. See the Format Specification for a complete description of the format string.

Returns the number of characters printed.

fprintf(F,fmt,...) Print expr_list, ..., to file F according to the format string fmt. See the Format Specification for a complete description of the format string.

Returns the number of characters printed.

putc(c) writes the character c to the standard output file and returns the character c if no error occurred on the write. Returns -1 if an error occurred.
fputc(c,F) writes the character c to the file F and returns the character c if no error occurred on the write. Returns -1 if an error occurred.
sprintf(fmt,...) return string formed by formatting expr_list, ... , according to the format string, fmt. See the Format Specification for a complete description of the format string.
append(F) This function causes all subsequent output to file F to be written, printed, at the end of the file, i.e., appended to the end. Closing the file, F with the close function will cancel the effect of append for any subsequent output to that file.

If any output is printed to the file before executing this function, the prior write operation opened the file, if it existed, and truncated it to zero length prior to printing the first character. Thus, printing to a file prior to invoking the append function will delete any file contents. Using the append function prior to printing will assure that the current contents are preserved.

close(F) Close file or pipe F. If the same file is open for both reading and writing, the close function will close the file opened for writing first. The same file may be opened for both reading and writing with the append function. Specifying the current scan file for the append function is possible and will direct any output to the end of the file without altering the current read position.

For bi-directional pipes, a second argument can be used:

close(pipe,"to")

or

close(pipe,"from")

The first form will close the part of the bi-directional pipe "to" the command process and the second form will close the part of the bi-directional pipe "from" the command process.

Both parts of the bi-directional pipe can be closed with a single call by omitting the second argument. Additionally, if one part has been closed previously by the use of the second argument, then the second argument is not necessary to close the remaining part.
get_FNR This function has two forms:
  1. get_FNR()
  2. get_FNR(F)

This function returns the current record number of the input file specified. The first form returns a value equal to the built-in variable FNR and is equivalent to:

get_FNR(FILENAME)

If the filename specified is not open or is not open for input, a value of zero, 0, is returned.

This function has been added because of the input functions fgetline and fsrchrecord. For the current input file, the built-in variable FNR is always updated automatically to contain the record number of the last record input (the current record). However, when reading from a file other than the current input file, there is no other means of obtaining the current record number of the file. With fgetline, the user utility could maintain an independent count of records read. However, if the fsrchrecord function is used, it is not possible for the utility to maintain such a count since not all records are actually 'read'. Many may be skipped when searching for a string matching the search pattern.

findfile QTAwk offers two forms of the findfile function. Both forms search for files matching a specified "\path\filename" pattern. Since operating system function calls are utilized for the search operations, only operating system style wild cards are allowed in the pattern. The two forms of the file search function are:
  1. findfile(variable)
  2. findfile(variable,pattern)

Both forms of the function return the number of files found which match the pattern specified. The first argument specifies the variable in which the array of files found is to be returned. The second argument specifies the pattern to match against. If the file pattern is not passed or is passed as the null string, then the pattern "*" is utilized on both OS/2 and Linux.

The variable passed in the first argument will be used to return the array of files found. The zeroth, 0, element of the array will contain the path specified in the pattern passed or the current working directory if no path is specified. Starting with element one, 1, the file name, file size, file write date, file write time, file creation date, file creation time, file last access date, file last access time and file attributes will be returned. Thus, if the first argument passes the variable "files", this variable will contain elements with the following information (with 1 <= n <= N, N == the number of files found == the return value of the function):
array element value
files[0] path specified or current working directory
files[n]["name"] name of n'th file found
files[n]["size"] size of n'th file found
files[n]["wdate"] write date of n'th file found expressed as a Julian Day Number, JDN. Refer to the jdn function for a complete explanation of the Julian Day Number.
files[n]["wtime"] write time of n'th file found expressed as seconds since midnight.
files[n]["cdate"] creation date of n'th file found expressed as a Julian Day Number, JDN.
files[n]["ctime"] creation time of n'th file found expressed as seconds since midnight.
files[n]["adate"] last access date of n'th file found expressed as a Julian Day Number, JDN.
files[n]["atime"] last access time of n'th file found expressed as seconds since midnight.
files[n]["attr"] attributes of n'th file found. Return value dependent on operating system. For OS/2 and Linux, an integer value is returned.

For OS/2, the bits are set according to the Archive, Read-Only, System, Hidden or directory attribute of the file. The following attribute values correspond to the following file attributes:
attribute value OS/2 File Attribute
00 all attributes cleared
1 read only
2 hidden
4 system
16 directory
32 archived
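The OS/2 attribute value is a sum of the bit values in the table above, so a single integer can carry several attributes at once. The following Python sketch only illustrates the bit arithmetic; the function name decode_os2_attr is hypothetical and is not part of QTAwk.

```python
# Bit values copied from the OS/2 attribute table above.
OS2_ATTRIBUTES = {
    1: "read only",
    2: "hidden",
    4: "system",
    16: "directory",
    32: "archived",
}

def decode_os2_attr(attr):
    """Return the names of the attribute bits set in 'attr'."""
    return [name for bit, name in OS2_ATTRIBUTES.items() if attr & bit]

# 33 == 1 + 32, i.e. a read-only file with the archive bit set
print(decode_os2_attr(33))
```

A value of 0 yields an empty list, which corresponds to the "all attributes cleared" row of the table.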

For Linux, the bits are set with the following values:
attribute value Linux File Attribute
0170000 These bits determine file type
0040000 Directory
0020000 Character device
0060000 Block device
0100000 Regular file
0010000 FIFO
0120000 Symbolic link
0140000 Socket
0007777 These bits determine file protections
0000777 These bits determine File permissions
0004000 Set user ID on execution
0002000 Set group ID on execution
0001000 Save swapped text after use (sticky)
0000400 Read by owner
0000200 Write by owner
0000100 Execute by owner
0000040 Read by group
0000020 Write by group
0000010 Execute by group
0000004 Read by Others
0000002 Write by Others
0000001 Execute by Others

The QTAwk utility attr_string will convert the Linux numeric attribute to an attribute string like that displayed by the 'ls' utility.
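As an illustration of how the Linux bit values in the table combine, the following Python sketch builds an 'ls -l' style string from a numeric attribute. It is not the attr_string utility itself, whose source is not shown here; the function name mode_string and the exact treatment of the set-ID and sticky bits are assumptions modelled on the usual 'ls' output.

```python
# File-type codes (attr & 0o170000) from the table above, mapped to the
# type character that 'ls -l' prints first.
FILE_TYPES = {
    0o040000: "d",  # directory
    0o020000: "c",  # character device
    0o060000: "b",  # block device
    0o100000: "-",  # regular file
    0o010000: "p",  # FIFO
    0o120000: "l",  # symbolic link
    0o140000: "s",  # socket
}

def mode_string(attr):
    """Build a string such as '-rw-r--r--' from the numeric attribute."""
    chars = [FILE_TYPES.get(attr & 0o170000, "?")]
    # Walk the nine permission bits from 'read by owner' (0o400)
    # down to 'execute by others' (0o001).
    for shift, letter in zip(range(8, -1, -1), "rwxrwxrwx"):
        chars.append(letter if attr & (1 << shift) else "-")
    # Set-user-ID, set-group-ID and sticky replace the matching execute
    # column: lower case if execute is also set, upper case otherwise.
    for bit, pos, flag in ((0o4000, 3, "s"), (0o2000, 6, "s"), (0o1000, 9, "t")):
        if attr & bit:
            chars[pos] = flag if chars[pos] == "x" else flag.upper()
    return "".join(chars)

print(mode_string(0o100644))
print(mode_string(0o040755))
```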

Normally, the list of files returned in the file array is unsorted, i.e., in the order found by the operating system. If the files returned in the file array have to be ordered by filename, file extension, file date, file time or file size, then the FILE_SORT built-in variable must be used to specify the sort order desired.

The _ftime built-in function may be used to format the file date and time. Also, the file date may be used in the jdn function to obtain the calendar date directly.

For example, the following loop will re-set the appropriate array elements with the formatted file date and time:

date_fmt_str = "%m/%d/%Y"; # format date as month/day/year
time_fmt_str = "%H:%M:%S"; # format time as hour:minute:second
for ( elem in file ) {
    if ( elem == 0 ) continue; # skip path element
    file[elem]["wdate"] = _ftime(date_fmt_str,file[elem]["wdate"],0);
    file[elem]["wtime"] = _ftime(time_fmt_str,0,file[elem]["wtime"]);
}

Since system function calls are used to match against wildcards, normal QTAwk regular expression operators cannot be used for matching against filenames. Instead, the system functions use a form introduced in 'ksh'. The following patterns are recognized by the matcher (where PATTERN-LIST is a `|' separated list of patterns):
?(PATTERN-LIST)
The pattern matches if zero or one occurrences of any of the patterns in the PATTERN-LIST allow matching the input string.
*(PATTERN-LIST)
The pattern matches if zero or more occurrences of any of the patterns in the PATTERN-LIST allow matching the input string.
+(PATTERN-LIST)
The pattern matches if one or more occurrences of any of the patterns in the PATTERN-LIST allow matching the input string.
@(PATTERN-LIST)
The pattern matches if exactly one occurrence of any of the patterns in the PATTERN-LIST allows matching the input string.
!(PATTERN-LIST)
The pattern matches if the input string cannot be matched with any of the patterns in the PATTERN-LIST.

Thus, to match all backup files with a trailing tilde on the filename, the QTAwk operator would be:
~?
But for the system utility filename match, the operator will be:
?(~)
Thus, to find all filenames with an "awk" extension (and all such backup files), the pattern to use would be:
*.awk?(~)
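
To make the difference between the two notations concrete, the following Python sketch hand-translates the pattern above into a regular expression and tries it on a few file names. The translation, the function name matches and the file names are all illustrative assumptions; the sketch does not show how QTAwk or the operating system performs the match, and it ignores details such as the special treatment of leading dots.

```python
import re

# Hand translation of the ksh-style pattern  *.awk?(~)  :
#   *     -> .*     (any run of characters)
#   .     -> \.     (a literal dot)
#   ?(~)  -> (~)?   (zero or one trailing tilde)
AWK_FILES = re.compile(r".*\.awk(~)?")

def matches(filename):
    """True if the whole file name matches the translated pattern."""
    return AWK_FILES.fullmatch(filename) is not None

for name in ["sort.awk", "sort.awk~", "sort.awk~~", "sort.c"]:
    print(name, matches(name))
```

Only the first two names match: the "?(~)" group allows at most one tilde, so a name ending in two tildes is rejected.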

fflush QTAwk offers three forms of the fflush function. All three forms flush output file buffers, i.e., any characters in the output stream which have not been written are immediately written to the desired media. The forms of the fflush function are:
  1. fflush(filename)
    Immediately flush the output buffer for the named file
  2. fflush("")
    Immediately flush the output buffers for all output files
  3. fflush()
    Immediately flush the output buffer for the standard output file, stdout.

All forms of fflush return zero if the buffer was successfully flushed, and nonzero otherwise.

load_module
This function has the following form: 
  1. load_module(module_file_name)
The module named in the first parameter is loaded and linked.
The load_module builtin function performs the following actions:
  1. It loads the file named in the parameter. In searching for the named file, QTAwk follows the same rules as when searching for QTAwk utility program file, namely:
    1. If the filename specification contains a path, either absolute or relative, the file is searched for at the specified location only.
    2. If no path is specified in the filename, then QTAwk searches in the following order:
      1. the current working directory,
      2. any paths specified in the 'Module_Path' builtin variable. This variable is similar to the QTAwk_Path variable and may be set in the configuration file(s) or directly by the QTAwk utility.
    If the file is found in the current working directory or in one of the directories specified, then that file is specified to the system program to load and link.

    If the file is not found in the above search, then the filename is passed to the system program that loads and links, which performs its own search in system specified directories. If it finds a file with the proper name, then that file is loaded and linked.
  2. It searches for and links the module initialization function contained in the loaded module. This initialization function is named: 'QTAwk_init_load_module'
  3. It searches for and links the module exit function contained in the loaded module if the exit function has the standard name: 'QTAwk_exit_load_module'
  4. It calls the module initialization function which initializes the module, calls a QTAwk function to inform QTAwk of the number of user defined functions which will be registered and calls a QTAwk function to register each user defined function in the module. The initialization function may also register a module exit function with a non-standard name and a module shut-down function. The module exit function is executed when the 'unload_module' builtin function is executed. The module shutdown function is executed upon normal termination of QTAwk.
QTAwk has the following standard names for the module initialization function and the module exit function:

QTAwk_init_load_module
The QTAwk standard name for the module initialization function
QTAwk_exit_load_module
The QTAwk standard name for the module exit function

After the 'load_module' function has completed, any user defined functions registered by the module are available to call as user defined functions.
unload_module
This function has the following form:

unload_module(module_file_name)

Note that the module_file_name must be identical to the desired module named in a 'load_module' function call.

This builtin function has the following actions:
  1. It calls the module exit function contained in the module if such a function exists and was found and linked by the 'load_module' builtin function.
  2. It unlinks and unloads the module from memory.
After the 'unload_module' function has completed, any user defined function(s) registered by the module are no longer available to call as user defined functions.

Redirection & Pipeline

The use of the redirection operators, '<', '>' and '>>', has been discontinued as error prone. The pipeline operator, '|', has been changed to the operator, '|>'.

The use of the redirection syntax:

{ print $1, $2 > $3 }

has been replaced by the fprint function:

{ fprint($3,$1,$2); }

or

{ fprint $3,$1,$2; }

The use of the pipeline syntax with the 'print' function:

{ print $1, $2 | "pipeline command string" }

has been replaced by:

{ print $1, $2 |> "pipeline command string" }

The use of the pipeline syntax with the 'getline' function:

while ( "pipeline command string" | getline(var) ) statement;

has been replaced by:

while ( "pipeline command string" |> getline(var) ) statement;

Following GNU awk, gawk, QTAwk has incorporated the bi-directional pipeline operator, '|&'. This operator can be used to direct output to the standard input of another process and read the standard output of that process.

Thus,

for ( i = 0 ; i <= limit ; i++ )
    print(variable list) |& "pipeline command string" |& getline(var);

As with gawk, there are some problems with this operator. Notably if the "pipeline command string" process does not close its output until it receives all of the input, then getline will "hang" waiting for input.

To remedy this situation, QTAwk has done as gawk and enabled breaking the pipeline into two statements and closing the two parts of the pipeline independently with the "close" function:

close("pipeline command string","to")

and

close("pipeline command string","from")

Thus, the above could also be done as:

for ( i = 0 ; i <= limit ; i++ )
    print(variable list) |& "pipeline command string";

close("pipeline command string","to");

while ( "pipeline command string" |& getline(var) ) statement;

close("pipeline command string","from");

NOTE that:

close("pipeline command string");

could also be used as the second closing statement.

If both parts of the bi-directional pipe are to be closed at the same time or if only one part remains open, the second argument of the 'close' function can be omitted.

Two examples of the bi-directional pipeline follow:

  1. This example uses the 'tee' utility to echo back the lines written into the pipeline:
    BEGIN {
        command = "ls -al";
        while ( (command |> getline(line)) > 0 ) {
            sline[i++] = line;
        }
        close(command);

        #command = "tee /tmp/qtawk.pipe.tee";
        command = "tee";
        print("Starting bi-pipe");
        for ( i in sline ) {
            print("=S" >< sline[i]) |& command;
            command |& getline(gline);
            print("R" >< gline);
        }
        close(command);
    }

    Note: the 'tee' utility as used will simply echo the input to the output. Uncomment the first 'tee' line and comment out the second to have the input also echoed to a file.

  2. This example uses the 'sort' utility. Since the 'sort' utility doesn't output anything until all of the input has been read, trying to read from the pipeline before then would simply hang the program. Thus, the reading and writing must be split, and the 'to' pipe closed before attempting to read the 'from' pipe:
    BEGIN {
        command = "LC_ALL=C sort";

        cnt = split(srev(POSIX["upper"] >< POSIX["lower"]),alpha,"");

        for ( i in alpha ) {
            print(alpha[i]) |& command;
        } ## endfor
        close(command,"to");

        while ( (command |& getline(line)) > 0 ) {
            print("got",line);
        } ## endwhile

        close(command);
    }
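
The write, close, then read ordering used with 'sort' is not specific to QTAwk. The following Python sketch reproduces the same sequence with the standard subprocess module. The child process here is a small Python one-liner standing in for 'sort' (an assumption made so that the sketch does not depend on an external utility), and the function name sort_through_pipe is hypothetical.

```python
import subprocess
import sys

# Stand-in for 'sort': a child process that reads all of its standard
# input before writing anything to its standard output.
SORTER = [sys.executable, "-c",
          "import sys; sys.stdout.writelines(sorted(sys.stdin))"]

def sort_through_pipe(lines):
    child = subprocess.Popen(SORTER, stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE, text=True)
    for line in lines:
        child.stdin.write(line + "\n")   # like: print(...) |& command
    child.stdin.close()                  # like: close(command,"to")
    # Only now can the child see end-of-file and start producing output.
    result = [line.rstrip("\n") for line in child.stdout]  # like: command |& getline(line)
    child.stdout.close()                 # like: close(command)
    child.wait()
    return result

print(sort_through_pipe(["pear", "apple", "fig"]))
```

If the "to" side were left open, the loop reading the child's output would wait forever, which is the hang described above.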

Standard Files

QTAwk has four standard files:

  1. keyboard
  2. standard Error File, stderr
  3. standard Input File, stdin
  4. standard Output File, stdout

These files are always open and available for input or output.
keyboard This file is designed to read input from the keyboard.

The fgetc function will return 0 when a function key, cursor key or other key not corresponding to an ASCII character is pressed when reading from the keyboard file. A second call must be made to read the keyboard scan code to recognize the key pressed.

The ECHO_INPUT built-in variable controls echo of characters read from the keyboard file to the standard output file. If characters read are not to be displayed, then ECHO_INPUT should be set to a false value. The default value of ECHO_INPUT is 0.

stderr Standard Error File, the console display.
stdin Standard Input File, normally the console keyboard, but may be redirected to a disk file or piped from the output of another application program.

If stdin has been redirected to a disk file by the operating system redirection or piping facility, then the End-Of-File will not be recognized. Any utility reading redirected input from this file must supply a means within the utility for termination. This can be done by testing for a special input value and then using the exit or endfile statement.

This file is the utility input file if no input file has been specified on the command line or an input file of "-" has been specified.

If not redirected or piped, input will be read from the keyboard. This file is buffered and the input is not available to the QTAwk utility until the carriage return, or enter, key is pressed. If single key characters are needed by the utility as the keys are pressed, then the keyboard file should be used for input using the fgetc input function. Also, cursor keys, function keys or other "special" character keys cannot be input from this file. The keyboard file and the fgetc function must be used for such input.

The ECHO_INPUT built-in variable controls echo of characters read from the redirected or piped standard input file to the standard output file. If the standard input file has not been redirected or piped, then input read from the standard input file will always be displayed regardless of the value of ECHO_INPUT. If characters read from the redirected or piped standard input file are not to be displayed, then ECHO_INPUT should be set to a false value. The default value of ECHO_INPUT is 0. Normally, the default value is desired, otherwise redirected or piped input will be sent to the standard output file as it is read.

The following table shows the effect of ECHO_INPUT on input from the standard input file and the keyboard file:

action ECHO_INPUT true ECHO_INPUT false
echo keyboard to display yes no
echo stdin to display yes yes
echo stdin to stdout yes no
echo redirected stdin to display no no
echo redirected stdin to stdout yes no

stdout Standard Output File, normally the console screen, but may be redirected to a disk file or piped to the input of another application program. Note that the print, printf and putc functions all output only to this file.

Date and Time Functions

QTAwk offers the following date and time functions. All of these functions are new to QTAwk:

  1. _time
  2. _ftime
  3. jdn
_time This function returns the local time as seconds elapsed since midnight. This function takes no parameters and the function parentheses are optional:
  1. _time(), or
  2. _time - returns the current system time as seconds since midnight.
The function to_UCT will convert local date/time to Universal Coordinated Time, UCT. In order for the function to work, the environment variable 'TZ' must be set.
_ftime This function has two forms:
_ftime(fmt_str)
return the string representing the current system date/time formatted according to the format specification in 'fmt_str'.
_ftime(fmt_str,sjdn,stime)
return the string representing the date/time represented by the Julian Day Number, 'sjdn', and seconds since midnight, 'stime', formatted according to the format specification in 'fmt_str'. The Julian Day Number, 'sjdn', passed must represent a date on or after January 1, 1900 (2,415,021). If sjdn is less than 2,415,021, it is set to that value. To obtain the appropriate calendar date for a Julian Day Number less than 2,415,021, use the jdn function.

The format strings used in both forms of _ftime use the date and time format specifications listed in the Date/Time Format Table. The format string is similar to that used in the print function, except that the format specifications are those listed in the Date/Time Format Table.

Formatted strings for the current system date and time, or any date and time since midnight, January 1, 1900, can be obtained using the _time, _ftime and jdn functions.

  1. _ftime(fs,jdn,_time) - returns string representing current system time formatted according to 'fs'. This returns the same string as: _ftime(fs)

  2. at = to_UCT(jdn,_time);
    _ftime(fs,at[0],at[1])
    returns string representation for current system UCT date/time formatted according to 'fs'

  3. _ftime(fs,fjdn,fsecs) - returns string representation for file date and time specified in 'fjdn' and 'fsecs'. 'fjdn' and 'fsecs' are the File date/time returned by the findfile function.
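
The three-argument form can be pictured with the following Python sketch, which formats a Julian Day Number and a seconds-since-midnight value with the standard strftime facility. It is an approximation under stated assumptions: the function name format_jdn_time is hypothetical, Python's strftime codes are assumed to agree with the Date/Time Format Table only for the codes used in this guide (%m, %d, %Y, %H, %M, %S), and the offset 1,721,425 is the difference between a JDN and Python's own day count.

```python
from datetime import date, datetime, timedelta

JDN_OFFSET = 1721425   # JDN minus Python's proleptic Gregorian ordinal
MIN_JDN = 2415021      # January 1, 1900, the lower limit stated above

def format_jdn_time(fmt, sjdn, stime):
    """Format a (Julian Day Number, seconds since midnight) pair."""
    sjdn = max(sjdn, MIN_JDN)            # smaller values are raised to the limit
    day = date.fromordinal(sjdn - JDN_OFFSET)
    moment = datetime(day.year, day.month, day.day) + timedelta(seconds=stime)
    return moment.strftime(fmt)

# 2,448,531 is October 1, 1991; 3661 seconds is 01:01:01
print(format_jdn_time("%m/%d/%Y %H:%M:%S", 2448531, 3661))
```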
jdn The Julian Day Number, JDN, is the number of whole days that have elapsed since a certain reference time in the past. The JDN is widely used in astronomy and elsewhere in calculations involving the counting of days and the computation of the day of the week. The reference time from which all JDN's are measured has been chosen by astronomers to be January 1, 4713 B.C. (Julian Calendar) at noon. As an example of a JDN in modern times, the JDN for the day running from noon October 1, 1991 to noon October 2, 1991 is 2,448,531. Julian Day Numbers were originated by Joseph Scaliger in 1582 and named after his father Julius, not after Julius Caesar. They are not related to the Julian calendar.

Pope Gregory XIII decreed that the Julian calendar would end on Oct 4, 1582 AD and that the next day would be Oct 15, 1582 in the Gregorian Calendar, making Oct 5, 1582 in the Julian calendar the same day as Oct 15, 1582 in the Gregorian calendar. The only other change is that centesimal years (years ending in 00) would no longer be leap years unless divisible by 400. Britain and its possessions and colonies continued to use the Julian calendar up until Sep 2, 1752, when the next day became Sep 14, 1752 in the Gregorian Calendar.

The use of the Gregorian or Julian Calendar by QTAwk in the computation of the JDN is controlled by the built-in variable Gregorian. If Gregorian == TRUE, QTAwk uses the Gregorian Calendar, otherwise QTAwk uses the Julian Calendar. The Julian Day Number computed by QTAwk is, by default, based on the Gregorian Calendar. The function auto_jdn will set the Gregorian variable automatically before calling the jdn built-in function.

Years BC are given (and returned) as negative numbers. Note that there is no year 0 BC; the day before Jan 1, 1 AD is Dec 31, 1 BC. Note also that 1 BC, 5 BC, etc. are leap years.

As mentioned, the JDN of a given date is handy in performing date computations. For example, given two dates, their respective JDN's may be found with this function and the number of days between the two dates computed simply by subtracting one JDN from the other. A day a specified number of days in the future (past) is easily computed by adding (subtracting) the desired number of days to the JDN and using the jdn function to obtain the new date. Also, the day of the week of a given date may be computed quite simply with the formula:

dow = (JDN + 1) % 7

where dow is the "day of week". The value computed is between 0 and 6 inclusive with:
value day of week
0 Sunday
1 Monday
2 Tuesday
3 Wednesday
4 Thursday
5 Friday
6 Saturday
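
The formula and the date arithmetic described above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name day_of_week is hypothetical, and the JDN used is the October 1, 1991 value quoted earlier in this section.

```python
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]

def day_of_week(jdn):
    """Apply dow = (JDN + 1) % 7 and return the index 0..6."""
    return (jdn + 1) % 7

start = 2448531                # October 1, 1991
later = start + 90             # the day 90 days in the future

print(DAY_NAMES[day_of_week(start)])   # Tuesday
print(later - start)                   # number of days between the two dates
print(DAY_NAMES[day_of_week(later)])
```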

In addition, the two functions month_day_date and last_dow_of_month illustrate using the JDN for more complex date calculations that become surprisingly easy when using the JDN.

The jdn function has three forms:

jdn(), or
jdn
return the Julian Day Number, JDN, of the current system date. Note that for this form, the function parentheses are optional.
jdn(sjdn)
return the calendar date of the Julian Day Number passed in sjdn. The calendar date is returned in an array with the array elements
  • array[0] = year
  • array[1] = month, 1 to 12
  • array[2] = day, 1 to 31

To obtain the array for the current system date, the 'jdn' function may be invoked as

current_sys_date = jdn(jdn());

or

current_sys_date = jdn(jdn);

jdn(year,month,day)
return the Julian Day Number, JDN, of the date specified.
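
For readers who want to see the arithmetic behind the two conversions, the following Python sketch uses well-known integer formulas for the Gregorian calendar. It is not QTAwk's implementation, which is not shown in this guide. The function names are hypothetical, only years AD (year >= 1) are handled, and the Julian Calendar case selected by the Gregorian variable is not covered.

```python
def date_to_jdn(year, month, day):
    """Gregorian calendar date (year >= 1) to Julian Day Number."""
    a = (14 - month) // 12          # 1 for January and February, else 0
    y = year + 4800 - a             # years counted from a March-based epoch
    m = month + 12 * a - 3          # March = 0 ... February = 11
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)

def jdn_to_date(jdn):
    """Julian Day Number to a Gregorian (year, month, day) tuple."""
    f = jdn + 1401 + (((4 * jdn + 274277) // 146097) * 3) // 4 - 38
    e = 4 * f + 3
    h = 5 * ((e % 1461) // 4) + 2
    day = (h % 153) // 5 + 1
    month = (h // 153 + 2) % 12 + 1
    year = e // 1461 - 4716 + (14 - month) // 12
    return year, month, day

print(date_to_jdn(1991, 10, 1))    # 2448531
print(jdn_to_date(2448531))        # (1991, 10, 1)
# Days between two dates by simple subtraction:
print(date_to_jdn(2000, 1, 1) - date_to_jdn(1991, 10, 1))
```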

Miscellaneous Functions

QTAwk offers the following miscellaneous functions:
  1. e_type(expr)
  2. execute
  3. rotate
  4. system
  5. pd_sym
  6. ud_sym
  7. resetre
  8. setlocale
e_type(expr) This function returns the type of 'expr'. The function evaluates the expression 'expr' and returns the type of the final result. The return is an integer defining the type:
Return Type
0 Uninitialized (returned when 'expr' is a variable which has not had a value assigned to it. Also if the variable has not had a value assigned to it since acted on by "deletea" statement)
1 Regular Expression Value
2 String Value
3 Integer as String Value
4 Floating Point as String Value
5 Single Character Value
6 Integral Value
7 Floating Point Value
Note: types 3 and 4 are set only in the following circumstances:
  1. For all fields of input records which match either integral (decimal, octal, or hexadecimal) or floating point values.
  2. For all tagged strings which match either integral (decimal, octal, or hexadecimal) or floating point values.
  3. For an array element returned by the built-in split function which match either integral (decimal, octal, or hexadecimal) or floating point values.

The following example demonstrates the return values of this function:

local lvar;
local ary;

split("123 045 0x45 45.6",ary);
e_type(lvar); # output ==> 0
e_type(/string test/); # output ==> 1
e_type("string test"); # output ==> 2
e_type('a'); # output ==> 5
e_type(45); # output ==> 6
e_type(45.6); # output ==> 7
e_type(45.6 >< ""); # output ==> 2
e_type("45.6" + 0.0); # output ==> 7
e_type("45.6" + 0); # output ==> 7
e_type("45" + 0); # output ==> 6
e_type("45" + 0.0); # output ==> 7
e_type(ary[1]); # output ==> 3 , decimal integer
e_type(ary[2]); # output ==> 3 , octal integer
e_type(ary[3]); # output ==> 3 , hexadecimal integer
e_type(ary[4]); # output ==> 4

execute QTAwk offers two forms of a function to dynamically execute QTAwk expressions, statements or pattern/actions/user-defined functions. The first form will execute strings. The second will execute array elements.

execute(s[,se[,rf]])

Execute string s as a QTAwk statement or expression depending on the value of the parameter se.

  1. If se == 0, then string/array s is executed as a statement and the constant value of one, 1, is returned.
  2. If se == 1, then string/array s is executed as an expression and the resultant value is returned by the execute function.
  3. If se == 2, then string/array s is compiled into the current QTAwk utility. The pattern/action pair or user-defined function is not executed. In this manner, a utility may dynamically add pattern/action pairs or user-defined functions. The constant value of one, 1, is returned.

The se parameter is optional and defaults to FALSE. Any built-in or user-defined function may be executed in the execute function except the execute function itself. New variables may be defined as well as new constant strings and regular expressions.

The optional rf parameter is the error recovery flag.

  1. If rf == FALSE (the default value), an error encountered in parsing or executing the string s will cause QTAwk to issue the appropriate error message and halt execution.
  2. If rf == TRUE, an error encountered in parsing or executing the string s will cause QTAwk to issue the appropriate error message, discontinue parsing or execution of the string and continue executing the current QTAwk utility.

Attempting to execute the execute function from within the execute function is a fatal error and will always cause QTAwk to halt execution.

The following string can be executed as either an expression or statement:

nvar = "power2 = 2 ^ 31;";

If executed as an expression:

print execute(nvar,1);

the output will be: 2147483648

If executed as a statement:

print execute(nvar,0);

or

print execute(nvar);

the output will be: 1

Multiple statements/expressions may be executed with a compound statement of the form:

pvar = "{ pow8 = 2 ^ 8; pow16 = 2 ^ 16; pow31 = 2 ^ 31; }";

Then

execute(pvar,0);

or

execute(pvar);

will set the three variables:

  1. pow8
  2. pow16
  3. pow31

even if the variables were not previously defined. If the variables were not previously defined, they will be added to the list of the utility global variables.

Note that attempting to execute pvar as an expression:

execute(pvar,1);

will result in the error message "Undefined Symbol", since the braces '{}' are only recognized in statements. All three expressions may be executed, as an expression, by the use of the sequence operator in the following manner:

pvar = "pow8 = 2 ^ 8 , pow16 = 2 ^ 16 , pow31 = 2 ^ 31;";

The function call:

*execute(a[,se[,rf]])

will execute the elements of array 'a' as a QTAwk statement or expression. The se and rf parameters have the same function and default values as above. For example, the compound statement contained in 'pvar' above may be split among the elements of an array:

avar[1] = "{";
avar[2] = "pow8 = 2 ^ 8;";
avar[3] = "pow16 = 2 ^ 16;";
avar[4] = "pow31 = 2 ^ 31;";
avar[5] = "}";

and executed as:

execute(avar);

or

execute(avar,0);

rotate QTAwk offers the following built-in array function to rotate the elements of the array. The function has the form:

rotate(a[,direction_flag])

The values of the array are rotated in the direction indicated by the optional "direction_flag" parameter.

If the "direction_flag" parameter is not specified or is specified as TRUE, then the value of the first element goes to the last element, the second to the first, the third to the second, etc. If the array has the following elements:

  • a[1] = 1
  • a[2] = 2
  • a[3] = 3
  • a[4] = 4

then "rotate(a)" or "rotate(a,1)" will have the result:

  • a[1] = 2
  • a[2] = 3
  • a[3] = 4
  • a[4] = 1

If the optional "direction_flag" parameter is specified with a FALSE value, the function will rotate in the opposite direction. Thus, applying "rotate(a,0)" after the above function call will have the result of putting the values back in the starting position:

  • a[1] = 1
  • a[2] = 2
  • a[3] = 3
  • a[4] = 4

Calling "rotate(a,0)" once more will have the result:

  • a[1] = 4
  • a[2] = 1
  • a[3] = 2
  • a[4] = 3

The array passed need not be a one-dimensional array. If:

  • a[1][1] = 1
  • a[1][2] = 2
  • a[1][3] = 3
  • a[1][4] = 4

then "rotate(a[1])" or "rotate(a[1],1)" will produce the result:

  • a[1][1] = 2
  • a[1][2] = 3
  • a[1][3] = 4
  • a[1][4] = 1

and applying "rotate(a[1],0)" to the original array will produce the result:

  • a[1][1] = 4
  • a[1][2] = 1
  • a[1][3] = 2
  • a[1][4] = 3
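
The rotation rule can be modelled in a few lines of Python, using a dictionary in place of a QTAwk array. This is a sketch of the behaviour described above, not QTAwk code. It assumes that the elements are visited in the order 1, 2, 3, 4 shown in the examples, and the name rotate is reused purely for readability.

```python
def rotate(a, direction_flag=True):
    """Rotate the values of dict 'a' in place, leaving its keys untouched."""
    keys = list(a)                    # keys in their existing order
    values = [a[k] for k in keys]
    if not values:
        return
    if direction_flag:
        values = values[1:] + values[:1]      # first value moves to the end
    else:
        values = values[-1:] + values[:-1]    # last value moves to the front
    for key, value in zip(keys, values):
        a[key] = value

a = {1: 1, 2: 2, 3: 3, 4: 4}
rotate(a)           # a is now {1: 2, 2: 3, 3: 4, 4: 1}
rotate(a, False)    # back to   {1: 1, 2: 2, 3: 3, 4: 4}
rotate(a, False)    # a is now {1: 4, 2: 1, 3: 2, 4: 3}
print(a)

b = {1: {1: 1, 2: 2, 3: 3, 4: 4}}
rotate(b[1])        # only the sub-array b[1] is rotated
print(b)
```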

    system system(e)

    executes the system command specified by the string value of the expression 'e'.

    [pu]d_sym There are two built-in functions available for access to variables. The first, pd_sym, accesses predefined variables and the second, ud_sym, accesses user-defined variables. Each has two forms:
  • pd_sym(name_str)
  • ud_sym(name_str)

    or

  • pd_sym(name_num,name_str)
  • ud_sym(name_num,name_str)

    To access predefined variables, the function pd_sym may be used. This function has been supplied to provide access to predefined variables similar to the access the function ud_sym provides to user-defined variables. The forms and returns are similar.

    To access user-defined variables where the variable name may not be known in advance, the function ud_sym has been supplied. The first form:

    ud_sym(name_expr)

    is useful in situations where the variable name is not known until the statement is to be executed. In these cases, name_expr may be any expression or variable with a string value equal to the name of the unknown variable. In this form, the string value of name_expr is used to access the variable. ud_sym returns the variable in question, if one exists, whose name is equal to the string value passed.

    The functional return value may be used in any expression just as the variable itself would. This includes operating on the return value with the array index operators, "[]".


    Note: This form may be used to access both local and global variables. If both a local and global variable have been defined with the desired name and the local variable is within scope, then the local variable is returned.


    The second form:

    ud_sym(name_expr,name_str)

    is useful in those situations where it may be impractical to use string values to access the variables, e.g., in a for, while, or do, loop, but a numeric value can be used to access the variables.

    The user variables are accessed in the order defined in the user utility starting with one, 1. If the integer value of name_expr exceeds the number of user-defined variables, then a constant is returned. The second parameter must be a variable. Upon return, this variable will have a string value equal to the name of the variable found or the null string if name_expr exceeds the number of user-defined variables. The return value of this variable may be tested to assure that a variable was found.

    The second form of the ud_sym function may be used to test for whether a variable with a specified name has been defined. The function:

    function name_defined(name)
    {
    local i, j, nn;
    local stat = FALSE; # assume variable with name 'name' does not exist

    for ( i = 1 , j = ud_sym(i,nn) ; nn ; j = ud_sym(++i,nn) ) {
    if ( name == nn) {
    stat = TRUE;
    break;
    } # endif
    } # endfor

    return stat;
    } # name_defined

    is used to search for a variable with a specified name. If the variable has been defined, the function returns a TRUE value. Otherwise a FALSE value is returned.

    The functional return value may be used in any expression just as the variable itself would. This includes operating on the return value with the array index operators, "[]".

    The section Built-In Variable List lists the pre-defined variables with the integer value used in the first argument to access the variable.


    Note: Local variables cannot be accessed with this form of the function.


    The following short function will return the number of user-defined global variables:

    # function to return the current number of
    # GLOBAL variables defined in utility
    function var_number(display) {
    local cnt, j, jj;

    for ( cnt = 1, j = ud_sym(cnt,jj) ; jj ; j = ud_sym(cnt,jj) ) {
    if ( display ) print cnt >< " : " >< jj >< " ==>" >< j >< "<==";
    cnt++;
    }
    return cnt - 1;
    }

    The following function may be called with the name of the variable desired. The value of the variable will be returned. Note that the appropriate variables have been defined in the BEGIN action.

    BEGIN {
        # define the conversion variables
        # each _X_to_Y_ factor multiplies a value in X to give a value in Y
        _kilometers_to_statute_miles_ = 1/1.609344;
        _km_to_sm_ = 1/1.609344; # statute miles per kilometer

        _statute_miles_to_kilometers_ = 1.609344;
        _sm_to_km_ = 1.609344; # kilometers per statute mile (exact)

        _inches_to_centimeters_ =
        _in_to_cm_ = 2.54;

        _centimeters_to_inches_ =
        _cm_to_in_ = 1/2.54;

        _radians_to_degrees_ =
        _rad_to_deg_ = 180/pi();

        _degrees_to_radians_ =
        _deg_to_rad_ = pi()/180;
    }

    # function to return the appropriate conversion
    function conversion_factor(to_n,from_n) {
        local c_name = '_' >< from_n >< "_to_" >< to_n >< '_';

        return ud_sym(c_name);
    }

    Calling conversion_factor as:

    conversion_factor("degrees","radians")

    returns a value of: 57.2957795130823 (== 180/pi)

    The following BEGIN action lists all pre-defined variables with their current values and the integers used to access them:

    BEGIN {
        for ( i = 1 , jj = pd_sym(i,j) ; j ; jj = pd_sym(++i,j) ) {
            gsub(/\n/,"\\n",jj);
            gsub(/\s/,"\\s",jj);
            gsub(/\t/,"\\t",jj);
            print i,j,jj;
        }
    }

    resetre Once a regular expression has been used in QTAwk, its internal form is set and cannot normally be changed. If the values of named expressions used in regular expressions change infrequently, strings may be used where regular expressions would normally be used, and the value of a string may be changed to change the search pattern. However, using strings as search patterns is slow. In addition, the values of regular expressions used in GROUP expressions cannot normally be changed. For these reasons QTAwk includes the built-in function:

    resetre()

    This function resets ALL regular expressions, including all GROUP expressions, with the exception of array regular expressions. The internal forms of the regular expressions are deleted. On the next use of a regular expression for matching, its internal form is rederived from the current values of any named expressions. The next time a GROUP is matched against an input record, the GROUP expressions are evaluated and the internal regular expression form for the GROUP is rederived.

    Refer to the Arrays section for the use of arrays as regular expressions. An array may be used anywhere in an expression that a regular expression may be used. The internal form of the regular expression is retained after the expression has been executed and is deleted only when the array is changed. Using arrays as regular expressions therefore gives the user more control over when the internal form of a regular expression changes.
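
    The following sketch shows one way resetre might be used. It assumes that a named expression is written inside a regular expression as the variable name enclosed in braces; the variable key and the "switch" control record are hypothetical names introduced only for this illustration:

    BEGIN {
        key = "error";    # hypothetical named expression used in the pattern below
    }

    # a record of the form "switch word" selects a new search word
    $1 == "switch" {
        key = $2;
        resetre();        # delete the internal forms so the pattern is rederived
        next;
    }

    # rederived from the current value of key on the first match after resetre()
    /{key}/ {
        print $0;
    }

    Without the call to resetre, the pattern would be expected to keep matching against the value key had when the pattern was first used.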

    setlocale This function sets and returns the locale under which QTAwk is working. When QTAwk is invoked, the locale is initially the "C" locale, the locale specified for the C language. The locale is then automatically switched to the locale named in the "LANG" environment variable, if one is specified. This function is called as:

    setlocale(category_string,locale_string)

    The parameters are:

    1. category_string == a string value which must be one of the following values:
      1. LC_COLLATE - sets the manner in which strings are collated.
      2. LC_CTYPE - sets the manner in which characters are classified and converted.
      3. LC_MONETARY - sets the manner in which monetary values are formatted.
      4. LC_NUMERIC - sets the manner in which non-monetary numeric values are formatted.
      5. LC_TIME - sets the manner in which dates and times are formatted.
      6. LC_MESSAGES - selects the language into which user interface messages are translated. Not currently applicable to QTAwk.
      7. LC_ALL - sets all of the above categories to the locale specified.
    2. locale_string == a valid locale name or an empty string, "". If the empty string is used, the current locale is changed to the locale specified in the "LANG" environment variable, if one is specified. If no "LANG" variable is specified, the locale is unchanged. If an invalid locale name is specified, no action is taken. If "locale_string" is not a string value, then the current locale string is returned.

    If the locale is successfully changed, the new locale name is returned; otherwise no value is returned.
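
    A minimal sketch of both uses follows. It assumes that the category is passed as a quoted string such as "LC_ALL"; the variable names before and after are hypothetical:

    BEGIN {
        # a non-string second argument returns the current locale string
        before = setlocale("LC_ALL",0);
        print "locale at start: " >< before;

        # an empty string selects the locale named in the LANG environment variable
        after = setlocale("LC_ALL","");
        print "locale after change: " >< after;
    }

    If no "LANG" variable is set, the second call leaves the locale unchanged and, per the description above, returns no value, so the second line printed may show no locale name.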

