The names of user defined variables start with an upper or lower case letter or an underscore, optionally followed by one or more upper or lower case letters, digits or underscores. Most QTAwk built-in variables are named with upper case letters and underscores (only five are defined with lower case characters).
Variables are defined by using them in expressions. Variables have regular expression, string, character, integer or floating point numeric values, or several of these at once, depending upon the context in which they are used in expressions or function calls.
Except for variables defined with the local keyword or used as function parameters, all variables are global in scope. That is, they are accessible and can be changed anywhere within a QTAwk utility. Local variables are discussed under the local keyword. Function parameters for user defined functions are discussed in the section on User Defined Functions.
All variables are initialized with a zero, 0, integer numeric value and the null string value when created by reference. The value of the variable is changed with the assignment operator, '=' or 'op='.
var1 = 45.87;
var2 = "string value";
var3 = /[\s\t]+[A-Za-z_][A-Za-z0-9_]+/;
var1 has a numeric value of 45.87 from the assignment statement. It has a string value of "45.87" and a value as a regular expression of /45\.87/. The string and regular expression values of var1 may be changed by changing the value of the built-in variable CONVFMT. The string value of CONVFMT is used to convert floating point numeric values to string and regular expression values. CONVFMT is initialized with a value of "%.6g" and can be changed with an assignment statement. Such changes would then affect the string and regular expression values of floating point numeric quantities. For example, if CONVFMT is assigned a value of "%u", then the string and regular expression values of var1 would become "45" and /45/ respectively.
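The effect of CONVFMT can be sketched by analogy in Python, whose % operator accepts printf-style formats. This is only an illustration, not QTAwk code: the helper names string_value and regex_value are hypothetical, and re.escape merely stands in for whatever quoting QTAwk applies when it builds the regular expression value.

```python
import re

def string_value(number, convfmt="%.6g"):
    """Format a numeric value with a CONVFMT-style printf format."""
    return convfmt % number

def regex_value(number, convfmt="%.6g"):
    """Escape the string value so that it matches itself literally."""
    return re.escape(string_value(number, convfmt))

var1 = 45.87
print(string_value(var1))         # 45.87
print(regex_value(var1))          # 45\.87
print(string_value(var1, "%u"))   # 45
print(regex_value(var1, "%u"))    # 45
```

Changing the format changes both derived values together, which mirrors the way a new CONVFMT value affects the string and regular expression values of var1.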
The numeric value of both var2 and var3 is zero (0).
The string value of var3 is "[\s\t]+[A-Za-z_][A-Za-z0-9_]+". Note that the tab escape sequence, '\t', is not expanded in converting the regular expression to a string. The reverse is not true: when a string is converted to a regular expression, its escape sequences are translated. One difference between strings and regular expressions is the time at which escape sequences such as '\t' are translated to the corresponding ASCII characters. For strings, the translation is done when the strings are read from the QTAwk utility file. For regular expressions, the escape sequences are translated when the regular expression is converted to internal form. For this reason, strings used in the place of regular expressions undergo a double translation, first when read from the QTAwk utility file and second when converted into the internal regular expression form. The second translation of strings used for regular expressions is the reason backslash characters, '\', must be doubled for strings used in this manner. The section Strings and Regular Expressions has a more complete discussion of strings and regular expressions.
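The double translation can be seen by analogy in Python, where string literals and the re module each translate escape sequences in their own pass. This is an illustration in Python rather than QTAwk. The '\b' escape is chosen only because the two passes give it different meanings in Python (backspace in a string literal, word boundary in a pattern); the QTAwk escape set may differ, and the helper name found is hypothetical.

```python
import re

# One backslash: the string literal pass turns "\b" into a backspace
# character before the pattern compiler ever sees it.
single = "\bcat"

# Doubled backslash: the literal pass leaves a backslash followed by 'b',
# which the pattern compiler then translates as a word boundary.
doubled = "\\bcat"

def found(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

print(found(single, "a cat"))     # False: looks for a literal backspace
print(found(doubled, "a cat"))    # True: word boundary before "cat"
print(found(single, "\x08cat"))   # True: the text contains a backspace
```

Only the doubled form survives the first translation with its backslash intact, so only it reaches the second translation as an escape sequence.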
Variable Value Data Types
A variable can have 7 different value types, sometimes several at the same time. The various value types a variable may assume are:
All variables are created initially with a value type of No Value. In numeric expressions this evaluates to integer zero, 0, and in string expressions this evaluates to a null string.
String type values and the integer and floating point numeric types are familiar to everyone. The remaining value types are discussed in full below.
A variable is given a Regular Expression value by an assignment expression such as:
RE_Val = /some r.e. pattern/;
QTAwk is unique in the ability to assign a regular expression to a variable. QTAwk differs considerably from Awk in this respect. For those familiar with Awk, a regular expression in Awk is really a shorthand for matching with the current input record. Thus, the above assignment if attempted in Awk would actually expand to:
RE_Val = $0 ~ /some r.e. pattern/;
and RE_Val would be assigned the value of zero or one accordingly. This is NOT true in QTAwk. In QTAwk, the above shorthand prevails only in patterns. In QTAwk, the above statement would have to be written in full as:
RE_Val = $0 ~~ /some r.e. pattern/;
to achieve the same effect as Awk.
In QTAwk, regular expressions have been raised to the level of a separate data type in their own right. This gives the user of QTAwk the ability to dynamically change the regular expressions used for pattern matching in ways that cannot be accomplished in Awk. For example, the short utility:
BEGIN {
    match_pattern = /some pattern to match/;
}
match_pattern {
    if ( some_condition ) match_pattern = /another match pattern/;
}
allows the user to change the records selected for processing. In Awk, only regular expression constants are allowed in patterns. The above simple utility, if executed in Awk, would deselect all records, i.e., no records would be processed. The variable match_pattern would be assigned a value of zero in the BEGIN section, since $0 is null and cannot match the regular expression specified. Since match_pattern has been assigned the constant value, 0, the pattern is always false, and no records are processed.
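The idea of a pattern held in a variable and changed while records are being processed can be sketched in Python. This is an analogy, not QTAwk code: select_records, the sample patterns, and the "switch" test are hypothetical stand-ins for match_pattern and some_condition in the utility above.

```python
import re

def select_records(records):
    """Select records with a pattern that may change between records."""
    match_pattern = re.compile(r"start")      # set once, as in BEGIN
    selected = []
    for record in records:
        if match_pattern.search(record):      # the pattern guards the action
            selected.append(record)
            if "switch" in record:            # stand-in for some_condition
                match_pattern = re.compile(r"end")
    return selected

records = ["start a", "end b", "start switch", "start c", "end d"]
print(select_records(records))
```

Once the record containing "switch" has been selected, later records are tested against the new pattern, so "start c" is skipped and "end d" is selected.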
Dynamic regular expressions can be used not only in patterns, but also anywhere a regular expression is needed. Refer to Expressions for a full description of expressions. Variables assigned regular expressions can also be used where strings would be used. In such situations, the string value of the regular expression is used.
When a variable with a regular expression data type is used in expressions, either the regular expression value or the equivalent string value is used depending upon the operation performed. If the operation is a regular expression matching operation, the regular expression value is used. For string operations, the equivalent string value is used.
Numeric String Types
QTAwk has two numeric string types:
Integer as String Value: a string value which can be fully interpreted as an integer. Examples are:
If converted to integer values, the corresponding values would be:
Note that hexadecimal and octal values are recognized as well as decimal values.
Floating Point Numeric as String Value: a string value which can be fully interpreted as a floating point number. Examples are:
If converted to floating point values, the corresponding values would be:
Some variables are automatically assigned these value types. The current input record, $0, and the fields of the current input record, $i, i > 0, are scanned and assigned either a string value type, Integer as String Value type or Floating Point Numeric as String Value type depending upon their value. The returned array elements of the split function are also scanned and given one of the three types depending upon their value. Tagged strings returned with the tag operator, [< >], are also scanned and assigned one of the three types according to their value. Other variables may assume the Integer as String Value or Floating Point Numeric as String Value type only by being assigned the value of one of the above variables.
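The scanning described above can be sketched in Python as a classifier that assigns one of the three types to a field. This is a rough model under stated assumptions, not QTAwk's actual rules: it assumes C-style notation (a 0x prefix for hexadecimal, a leading 0 for octal), and the names classify, INT_RE and FLOAT_RE are hypothetical.

```python
import re

# Assumed C-style integer forms: hexadecimal, octal, or decimal.
INT_RE = re.compile(r"[+-]?(?:0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*)")
# Assumed floating point forms: digits with an optional fraction and exponent.
FLOAT_RE = re.compile(
    r"[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)(?:[eE][+-]?[0-9]+)?"
)

def classify(field):
    """Return the value type a scanned field would be given."""
    # fullmatch requires the whole field to be numeric, not just a prefix.
    if INT_RE.fullmatch(field):
        return "Integer as String Value"
    if FLOAT_RE.fullmatch(field):
        return "Floating Point Numeric as String Value"
    return "String"

for field in ["42", "0x1F", "017", "3.14", "1e3", "12abc"]:
    print(field, "->", classify(field))
```

The integer test runs first because every decimal integer would also satisfy the floating point pattern. A field such as "12abc" stays a plain string because it cannot be fully interpreted as a number.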
Variables with the Integer as String Value type or the Floating Point Numeric as String Value type are treated specially in expressions involving other variables with a numeric type. Refer to the description of the Comparison operators for how these two special string types are treated.
Single Character Type
A variable can assume this type as the result of an assignment of a single character constant to a variable:
SC_Var = 'a';
A variable can also assume this type by being assigned the return value of a built-in function which returns a single character value. The substr and split built-in functions can return a single character value as well.