TclTips

Sektor van Skijlen edited this page Apr 25, 2017 · 1 revision

Look and feel it

Yes, in order to write Silvercat files, you have to know some basic rules of the Tcl language - because "Silvercat files" are simply Tcl scripts.

If you don't know this language - and haven't even used tools like expect or dejagnu - this is really not a big problem, especially if you are familiar with C++ and shell interpreters, like Bash.

There was one important reason, why I found Tcl best for that task - which makes it different to tools like cmake, qmake, jam, even make:

  • because I don't need no stinkun' parser!

Yeah, but you can still say that there are tools with more widely known syntax, like scons that is based on Python, ant that uses XML syntax, or gn that uses the Javascript syntax? Well, this is actually the biggest problem with these tools - you have to focus on syntax tricks and corner cases when writing anything for them. Even a Makefile has far fewer syntax tricks and its syntax is quite clean, in contrast to tools based on Javascript, Python, Perl, or, even worse, XML syntax.

In Silvercat you can simply focus on what you define rather than on syntax tricks. How's that? Because Tcl has two smart features:

  • Everything enclosed in {...} is taken as is without any processing (at least any processing before passing it to the command handler). Much like the text in '...' in typical shell command line interpreters. And this text can even span into multiple lines.
  • Every Tcl script consists of commands - such as commands in shell scripts.

Like this:

command argument1 argument2 argument3
command1 arg1; command2 arg2
command arg1 arg2 \
 arg3 arg4
command arg1 {
    series of
    multi-line data
} arg3 { and here arg4
    also in multiple lines
}

So how is a script like this interpreted?

if { $a > 0 } {
     puts $a
     break
}

Simply: it is broken down as follows:

if CONDITION CODEBLOCK

That is, if is a command, and it gets two arguments wrapped in {...}. The CONDITION is expected to be an expression, and if it evaluates to any form of standard Tcl boolean true value, the CODEBLOCK is executed as Tcl script. If not, it does nothing - no matter what CODEBLOCK contains.

But how does the Tcl interpreter know that the first one is expression and the second one is code block? It doesn't! It just tries to interpret them according to what they should be, and if this fails, it results in execution error.
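
Because both arguments are plain text, nothing stops you from keeping them in variables first. A minimal sketch of that idea (the variable names cond, body and sign are made up for this illustration):

```tcl
set a 5
set cond {$a > 0}             ;# just text; nothing is evaluated yet
set body {set sign positive}  ;# just text as well

# "if" receives two strings and decides by itself how to interpret them.
if $cond $body
puts $sign

# A first argument that is not a valid expression fails only at run time.
set failed [catch {if {not an expression} {puts never}} message]
puts "failed=$failed"
```

The second call shows the "execution error" case: the text is accepted as an argument and rejected only when if tries to evaluate it as an expression.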

The Tcl command interpreter simply executes commands one after another; commands may also have return values and be used as nested calls (in [ ]). For special situations Tcl has the mechanism known from other languages as "exceptions" - with throwing, catching and propagation. It has no proper name in the language definition, so I'll try to explain it using the C++ nomenclature. This mechanism is used, among other things, to handle errors such as the one mentioned above. But not only to handle errors - it also handles, for example, the break command.

So, the break command throws an exception (of type "break"). If the propagation leads it to a looping command (for, foreach, while), it will be interpreted as a request to stop iterating. Otherwise it will reach the procedure boundary (in which case it will "translate" this exception to "error" type) or the interpreter's origin, in which case it will do what exceptions normally do - it will crash your program. In all other languages it means using loop control command outside the loop, so this reaction is "close enough". This is how break, continue and return commands are implemented. Yes, it also means that you can define your own looping command or your own command that will behave, at least partially, as "break".
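
As a sketch of "define your own looping command": the standard catch command reports how a script ended as a numeric code (0 ok, 1 error, 2 return, 3 break, 4 continue). The command name repeat below is hypothetical, invented only for this example:

```tcl
# Hypothetical command "repeat": run BODY COUNT times in the caller's scope,
# honoring break and continue the way the built-in loops do.
proc repeat {count body} {
    for {set i 0} {$i < $count} {incr i} {
        # catch returns the completion code of the script:
        # 0 = ok, 1 = error, 2 = return, 3 = break, 4 = continue
        set code [catch {uplevel 1 $body} result]
        if {$code == 3} {
            # "break" reached us: stop iterating
            return
        } elseif {$code == 1 || $code == 2} {
            # pass errors and "return" on to our caller
            return -code $code $result
        }
        # codes 0 and 4 simply go on with the next iteration
    }
}

set seen {}
repeat 5 {
    lappend seen x
    if {[llength $seen] == 3} break
}
puts $seen
```

The loop is asked to run five times, but the break inside the body stops it after the third pass.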

It's completely up to the command's implementation what it will do with the given text - as a passed argument it's always just a text, nothing more. In Tcl it's nothing unusual to see something like:

test-language c {
   #include <stdio.h>
   int main()
   {
       printf( "It works with %d\n", 20 );
       return 0;
   }
}

test-language perl {
      use strict;

      my $var = 20;
      print "It works with " . $var . "\n";
}

in one script.

And lists work more or less the same way: elements are separated by spaces, nested lists are in {...}:

% set l {This is a {nested list} in a list}
This is a {nested list} in a list
% puts [lindex $l 1]
is
% puts [lindex $l 3]
nested list

The same with dictionaries - these are just lists where odd elements are keys and even ones - values.

% set d {color black shape {triangular, but not sure}}
color black shape {triangular, but not sure}
% puts [dict get $d shape]
triangular, but not sure

How then does the interpreter distinguish between a text value with spaces being a list element, and a nested list, or a list from a dictionary, or even generally a list from just a text string? As usual, it doesn't! You just need to know what kind of data you expect at a given place and interpret it accordingly. If any special hints on how to interpret particular data are needed, they have to be added explicitly.

Can this work? Yes, as long as you free yourself from thinking within the frames of Python or Javascript, and stop blindly believing that a scripting language needs syntax-supported data types.

So? The only "limitation" you can imagine in case of Tcl is that the syntax you have to use must be based on braces. But, well, so is the syntax of most of the languages used today (with Python, Ruby, and Lua being notable exceptions). The rest of the syntax in Silvercat relies on the following rule: I define commands, you use them, the Tcl interpreter takes care of the rest. Voila!
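
To illustrate, here is one and the same value handed to three standard commands, each of which interprets it its own way (the variable name v is arbitrary):

```tcl
set v {color black shape round}

puts [string length $v]    ;# as a plain string: 23 characters
puts [llength $v]          ;# as a list: 4 elements
puts [dict get $v shape]   ;# as a dictionary: the value under "shape" is round
```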

There are two important conventions in case of Tcl scripts you'd better be aware of:

  1. Dash-prefixed -options. They are not like the usual shell options, though, where you can merge multiple one-letter options into -xvp2g and additionally have a --long-option. It's rather like -style good -options single. Silvercat also follows this convention.
  2. Many commands are also available in ensembles. You can imagine it as various "command with subcommand" things - like git or svn, or net on Windows. In Tcl the examples are string or file command ensembles. Silvercat rarely uses this convention.
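
Both conventions can be seen in a few standard Tcl commands; the sample values below are arbitrary:

```tcl
# Dash-prefixed options: each option is a separate word.
puts [lsort -integer -decreasing {3 10 2}]   ;# 10 3 2

# Ensembles: "string" and "file" take a subcommand as their first argument.
puts [string length hello]        ;# 5
puts [string toupper silvercat]   ;# SILVERCAT
puts [file extension main.cpp]    ;# .cpp
```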

Important list processing rules

Tcl is a command language, which makes it similar to shell scripting languages - all toplevel statements in this language are commands to execute. Such a command is also a list, where the first element is the command name. However, such lists don't work as in other languages (Bash, Perl, CMake, also Make). Let's take an example: prepare these three files in the Bash shell:

$ touch one two three

And now in Bash you have:

$ WORDS="one two three"
$ ls $WORDS
one  three  two

Your command ls $WORDS was expanded to ls one two three. It doesn't work this way in Tcl:

$ tclsh
% set WORDS "one two three"
% exec ls $WORDS
/usr/bin/ls: cannot access one two three: No such file or directory
child process exited abnormally

It's because the command has been expanded to exec ls {one two three}. It is the same as if you had this in Bash:

$ ls "$WORDS"
ls: cannot access one two three: No such file or directory

However it will be completely correct if you do in Tcl:

% exec ls {*}$WORDS
one
three
two

The {*} syntax placed in front of any kind of expression that may result in a list (be it an expression in {braces} or "quotes" or [command result] or $variable_value) causes expanding the list in place (it's new in Tcl 8.5).

In other words:

  • In Bash, using $V unfolds the V variable - you need "$V" to prevent it
  • In Tcl, using $V keeps V variable folded - you need {*}$V to enforce unfolding

Of course, remember: this concerns only the cases where a variable or nested command is substituted as a word of its own, so that its value becomes a single list element. When it's inside "...", it expands normally in place. That is, this:

set a {value or object}
puts "a: with $a type"
puts [list b: with $a type]

Results in:

a: with value or object type
b: with {value or object} type

(The list command does nothing more than return its arguments as a list. You can define it as proc list {args} {return $args}, with the note, of course, that args has a special meaning as an argument name. It is, however, the only method to have variables and commands expanded in the resulting list, which wouldn't happen in { ... }, while preserving single list elements, which would be smashed in " ... ".)

Lists in Tcl are just words separated by spaces - however they can contain single elements that also contain spaces, in which case we have nested lists. Nested lists always occur in {...}:

% set ll one
one
% lappend ll "two three four" 
one {two three four}
% lappend ll five six seven ;# VARIABLE number of arguments
one {two three four} five six seven
% lappend ll {*}"eight nine ten"
one {two three four} five six seven eight nine ten

Extracting an element that is a nested list simply takes out one nesting level. This can be continued until you reach the bottommost nesting, in which case the extraction simply returns its argument. Here is an example of the lindex command, which extracts the element of the list at the given index (it accepts multiple index values that represent nesting levels):

% set ll {{{one two three}}}
{{one two three}}
% lindex $ll 0
{one two three}
% lindex $ll 0 0
one two three
% lindex $ll 0 0 0
one
% lindex $ll 0 0 0 0
one

One important consequence of that is that if you take the usual statement from a Makefile:

CCLINKSO := gcc -rdynamic

and try to do the same here:

set CCLINKSO gcc -rdynamic

your script will end up with a Tcl exception - the "set" command should get two, not three arguments. The correct call would be:

set CCLINKSO {gcc -rdynamic}  ;# or "gcc -rdynamic"

So, as these list processing rules sometimes work to your disadvantage, especially when you'd like to "feel like in Makefile", there are several additional commands provided for your convenience.

Other important syntax tricks you better be aware of

There are two things in Tcl you have to be careful about:

  1. MIND THE BRACES. All braces in the Tcl script text must be 100% balanced.

    • Even in comments or a text in quotes!
    • Put \ before the brace character to mark it not balanced.
    • Whether this backslash will be resolved, it depends on what function will be interpreting the text in { ... }. Nothing is resolved by default. For example, {\}} resolves to \}, but "\}" and just \} resolve to }.
    • If a backslashed brace looks ugly to you in comments, then simply remember to balance them. Balance, that is, don't leave any single open or closed brace between two non-backslashed braces. The "content" of the braces used in comments is not meaningful at all - they just must be balanced. For example: here the instruction in the second line will still be executed - being "inside braces" that are in comments doesn't mean anything. But they still have to be balanced.
       # Critical Section {
       set value [AtomicGet value]
       # } end.
      
  2. HASH IS NOT MAGIC 'comment starts here' THING. Comments start with #, but:

    • Only where a Tcl command is expected.
    • If { ... } contains just some data, you must cut off comments by yourself. Well, they are not even comments - Tcl doesn't care what you have inside { ... }, if it doesn't interpret it.
    • Don't worry. Decent APIs that require longer multi-line text in {...} anticipate that comments may be needed. Silvercat also provides a convenience function puncomment to cut off all lines that start with #, and a plist command that also takes care of comments. For example, if you pass all options with values to the ag command inside braces, you can use lines starting with # as comments inside - the ag command handler will take care of them. Just don't blindly put #comments in every possible place in the text.

Look at this code (interactive tclsh session - return values are printed on the console) to see the problem:

% # this is a { comment }
% puts here! ; set a {
# new value
2
}  ;# command ends here
here!

# new value
2
%
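
When a command of your own takes such braced data, cutting off the comments is your job. A minimal sketch of how that could be done, assuming a made-up helper named strip_comments (this is not Silvercat's puncomment, only an illustration of the idea):

```tcl
# Hypothetical helper: drop every line whose first non-blank character is '#'.
proc strip_comments {text} {
    set out {}
    foreach line [split $text \n] {
        if {[string index [string trim $line] 0] eq "#"} continue
        lappend out $line
    }
    return [join $out \n]
}

set data {
    # not a comment here
    alpha beta
    gamma
}

puts [llength $data]                    ;# 8: the '#' line is ordinary data
puts [llength [strip_comments $data]]   ;# 3: only alpha, beta and gamma remain
```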

Similarities to Bash

To make you more familiar with it, let me show you an example Bash script:

function make_name()
{
      local ipstr=${1//./-}
      echo MACH_${ipstr}  # print the result; return only sets the exit status
}

if [ "$NAME" == "" ]; then
    NAME=$(make_name 10.0.0.2)
fi

export NAME

In Tcl it would be:

proc make_name {ip} {
      set ipstr [string map {. -} $ip]  ;# OR: [join [split $ip .] -]
      return MACH_${ipstr}
}

if { ![info exists NAME] || $NAME eq "" } {
      set NAME [make_name 10.0.0.2]
}

set env(NAME) $NAME

You can see here that the common factor is the way variables are used and the one-line-one-command layout - although Bash uses lots of special syntax cases where Tcl uses only commands.

One important difference is that variables in Tcl are not mixed with environment variables as they are in Bash. In Tcl, environment variables are available as keys in the env array - e.g. env(PATH). So the NAME variable cannot be set through the environment in any way; the script must explicitly read $env(NAME) and possibly assign it to NAME (arrays will be explained soon).

Another important difference is that executable files found through the PATH environment variable are not mapped to Tcl commands by default (it's done in interactive shell mode, though). To run an executable file you should use the exec command. By default its return value is the text printed on the standard output, but if there's anything printed on the standard error, it's considered a failure. The same happens if the exit status of the program isn't 0. The failure results in throwing an exception. What happens with both outputs can be changed using redirection specifications, see the manpage for details.
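
The rules for exec can be tried out without relying on any particular system tool by running a second Tcl interpreter as the child process. This is only a sketch; it assumes that info nameofexecutable points to a tclsh that reads its script from standard input, and the variable names are arbitrary:

```tcl
set tclsh [info nameofexecutable]

# Normal case: the child's standard output becomes the return value.
set out [exec $tclsh << {puts hello}]

# Non-zero exit status: exec raises an error, which catch turns into a code.
set failed [catch {exec $tclsh << {exit 3}} msg opts]
set errcode [dict get $opts -errorcode]   ;# CHILDSTATUS, the pid, the status

# Anything on standard error also counts as a failure by default...
set noisy [catch {exec $tclsh << {puts stderr oops}} msg2]

# ...unless a redirection says otherwise: 2>@1 merges it into the result.
set merged [exec $tclsh << {puts stderr oops} 2>@1]

puts [list $out $failed [lindex $errcode 0] [lindex $errcode 2] $noisy $merged]
```

The << and 2>@1 words are redirection specifications described in the exec manpage.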

Similarities to C++

Most of the statements are rather closer to what we have in C++. For example, this C++ code:

bool IsEvenSince(int n, int base = 0)
{
      if ( n <= base )
          return false;

      return !(n%2);
}

// Our example data
vector<int> a { 2, 6, 2, 8, 9, 4, 5, 192, 42 };

for (size_t i = 0; i < a.size(); ++i)
{
       if ( IsEvenSince(a[i], 9) )
       {
               break;
       }
       printf("%zu\n", 2*i+3);
}

Would look this way in Tcl:

proc IsEvenSince {n {base 0}} {
      if { $n <= $base } {
          return false
      }

      return [expr {!($n%2)}]
}

# Our example data
set a { 2 6 2 8 9 4 5 192 42 }

for {set i 0} {$i < [llength $a]} {incr i} {
      if { [IsEvenSince [lindex $a $i] 9] } {
             break
      }

      puts [expr {2*$i+3}]
}

In this code in Tcl please note that:

  • "if" or "for" must be followed by the space. That's because they are commands and rest of the things are their arguments.
  • Leaving the open brace at the end of the line is the best way to pass a multi-line argument to a command: the argument has to start on the same line, before EOL terminates the argument list!
  • You don't need expr for argument #1 of if or argument #2 of for - think of it as being passed to expr already.
  • The true and false in the arguments to return aren't anything special! They are just text strings. This will be explained later.

Important common things with C++:

  • Procedures use strict argument list (like Python, unlike Perl or Javascript)
  • Default arguments and variadic arguments are supported
  • Variables used in procedures are local by default
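
A short sketch of these three points together; the procedure greet and its parameters are invented for this example:

```tcl
# "greeting" has a default value; "args" collects any remaining arguments.
proc greet {name {greeting Hello} args} {
    set msg "$greeting, $name"     ;# msg is local to the procedure
    if {[llength $args] > 0} {
        append msg " (" [join $args ", "] ")"
    }
    return $msg
}

puts [greet World]             ;# Hello, World
puts [greet World Hi a b]      ;# Hi, World (a, b)
puts [catch {greet}]           ;# 1: the required argument is missing
puts [info exists msg]         ;# 0: msg never leaked outside the procedure
```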

Slight differences:

  • No type system that would matter for the syntax. All values are of "string" type, they may only be interpreted specific way by some commands.
  • Global variables are not accessible by default (in C++ a name that is not declared locally is looked up in the surrounding namespaces - in Tcl it is not, and using an undeclared variable results in an error). You should:
    • Either declare them first using variable command
    • Or use the name with its namespace path (::var_name)
  • No overloading (without type system it doesn't make much sense anyway)
  • No goto, even in Java flavor (impossible to implement "labels")
  • No do-while loop (although you can write it yourself :) )
  • No fallthru possible in switch (you better think of switch command being rather a function that gets argument of type map<string,function> - it has little to do with the switch statement in C++)
  • An instruction can contain variables and subcommands, not expressions with infix operators - for that you should use expr command.
  • An expression passed to expr command can contain only such operators that do not write to variables (i.e., no assignment, compound-assignment, increment and decrement operators)
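
As for writing the do-while loop yourself, here is a deliberately simple sketch. The command name do and its calling convention are made up here. It also has a known limitation: break or continue used during the very first pass are not handled, because that pass runs outside of the while command.

```tcl
# Hypothetical "do BODY while CONDITION": run BODY once unconditionally,
# then keep repeating it for as long as CONDITION holds.
proc do {body keyword condition} {
    if {$keyword ne "while"} {
        error "usage: do body while condition"
    }
    uplevel 1 $body
    uplevel 1 [list while $condition $body]
}

set n 10
do { incr n } while { $n < 5 }
puts $n    ;# 11: the body ran once although the condition was false

set m 0
do { incr m } while { $m < 5 }
puts $m    ;# 5
```

Both the body and the condition are evaluated in the caller's scope through uplevel, which is why n and m can be used directly.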

The Tcl expressions support additionally the following operators:

  • eq and ne: the same as == and !=, with the difference that arguments are always compared as strings
    • { 0x10 eq 16 } evaluates to false
  • in and ni: the "element in list" operator (ni means "not in")
    • { "-ldl" ni $libraries } : same as { [lsearch -exact $libraries -ldl] == -1 }
  • **: exponentiation
    • 2**16 -> evaluates to 65536
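
These operators can be tried directly; the contents of libraries are sample data only:

```tcl
set libraries {-lm -lpthread}

puts [expr {0x10 == 16}]             ;# 1: compared as numbers
puts [expr {"0x10" eq "16"}]         ;# 0: compared as strings
puts [expr {"-ldl" ni $libraries}]   ;# 1: -ldl is not in the list
puts [expr {"-lm" in $libraries}]    ;# 1
puts [expr {2**16}]                  ;# 65536
```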

Syntax elements

All syntax elements in Tcl are based on the following symbols:

  • $... (or ${...}) substitute to variable's value
  • ...( ... ) access key in an "array" (alternative for dictionary)
  • [ ... ] substitute to result of command call
  • \... encode special characters (like \n), or escape any otherwise interpretable characters
  • " ... " - treat contents as single word, including whitespaces (but still substitute used [command-calls], $variables and \backslash codes)
  • { ... } - treat contents as single word, including whitespaces (don't substitute anything inside)
    • Both "..." and {...} may contain end-of-line character. If they enclose an argument for a command, they can span to multiple lines, just the opening character must appear in the same line
  • :: separate namespace path elements
  • {*} expand a list in place

As you can see, syntax elements in Tcl concern only and exclusively operating with data (which is always text anyway). Nothing about control flow or structural constructs - these are done exclusively by commands. You can even get help for these commands in manpages (usually "n" or "3tcl" section). The most important commands are:

  • set: assign value to a variable:

    set filename /etc/hosts

  • if: conditional execution

    if { $a < 0 } { puts "done" }

  • while: conditional looping

    while { $n > 0 } { DoGrab n }

    • you can use break and continue commands inside
  • for, foreach: iterational looping

    foreach i [info vars] { puts $i }
    for {set i 0} {$i < 10} {incr i} { puts [lindex $list $i] }

  • proc: define a named and parametrized executable script

    proc putsn {text} { puts -nonewline $text; flush stdout }

    • use return to set return value and immediately exit
  • expr: interpret given text as arithmetical/conditional expression

    set a [expr {$b*2+3}]

    • better put the expression in {...} to prevent variable resolution before passing to the expr command (highly unwanted in most cases)

(No) data types

Just like in all other "shell script languages", in Tcl the only "data type" that matters for the syntax is "text string" (which can even potentially contain binary data). There happen to be some standard cases of what particular string value may mean, for example:

  • Block of code is a string that contains a Tcl script
  • Lists are just words separated by spaces
  • Dictionaries are lists with even number of elements. Odd elements are keys, even elements are values.
  • Numbers are strings formatted as numbers, be it floating-point or integer, formatted as hex, oct etc.

Generally there are no limits what the string can be. Every command may have its own crazy idea how to understand it. There are, however, some standards - see string is subcommand. Anyway, usually commands put limitations on what the string can contain, for example:

  • the incr command (similar to ++ operator) requires an integer number in the given variable
  • the lindex command (extract list element at given position) requires a text that can be parsed as a list in the first argument, second argument should be the index (usually an integer, but additionally it accepts special value end and two possible values glued with + or -)
  • the dict command group requires a list with even number of elements
  • conditional commands (if, while) require that the expressions in the first argument evaluate to a boolean value (see below).

So, for example, if the "0x12" text can be interpreted as integer, then it is integer (and a string simultaneously). A value that does not qualify as some specific "type" can only make some commands fail (throw an exception).

The only datatype-dependent thing in Tcl is comparison in expressions: the == operator compares two values as numbers if both can be treated as numbers, otherwise it compares them as strings. So, this evaluates to true:
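
The string is subcommand lets you ask up front whether a value qualifies. A few sample checks with arbitrary values:

```tcl
puts [string is integer -strict 0x12]     ;# 1
puts [string is integer -strict twelve]   ;# 0
puts [string is double -strict 1e3]       ;# 1
puts [string is list "a {b c} d"]         ;# 1
puts [string is list "a \{b c"]           ;# 0: unbalanced brace inside

# incr accepts anything that parses as an integer and rejects the rest.
set v 0x12
incr v                                    ;# v is now 19
set w twelve
puts [catch {incr w} message]             ;# 1: an exception was thrown
```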

expr { "07" == "7" }

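while these evaluate to false (the second one because in Tcl 8.x a leading 0 marks an octal number and 08 is not valid octal, so the values are compared as strings):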

expr { "07" eq "7" }
expr { "08" == "8" }

AFAIR this was new in Tcl 8.4 - previously using == to compare non-integers resulted in an exception.

The expr command is another interesting exception: only values that can be understood as numeric (in C++ nomenclature, of int, double and bool types) can be written as bare immediate values. An immediate value that is just a text must be "in quotes". This is especially important in comparisons and with the ?: operator.

If a variable or command call is used, the expansion is done on already parsed expression:

% set a 2+3
% puts [expr {$a*6}]
can't use non-numeric string as operand of "*"
% puts [expr $a*6]  ;# passes "2+3*6" to expr command
20

Of course, it's never an error to put also the numeric constant in quotes! And, of course, you have rather no influence on what is substituted from $variable or [command], so they also don't need quotes around.

As a boolean value, usually the 0/1 pair is used (this is the return value of an expression using logical and comparison operators). The alternative pairs are false/true, no/yes and off/on.
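
A quick check that all these spellings are accepted wherever a condition is expected (the list variables are named only for this example):

```tcl
set truthy {}
set falsy {}
foreach word {1 true yes on 0 false no off} {
    # "if" interprets each word as a boolean value
    if {$word} {
        lappend truthy $word
    } else {
        lappend falsy $word
    }
}
puts $truthy             ;# 1 true yes on
puts $falsy              ;# 0 false no off
puts [expr {3 > 2}]      ;# 1: comparison results use the 0/1 pair
```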

Variables and values

In contrast to many scripting languages, Tcl doesn't operate with any special kind of objects in the language - the only kind of object is text. This means there are no such things in this language as "function", "reference" or anything like that. So, for example, in cases when you need a "reference to a variable", as for example in the set command, you simply use the name of the variable.

The name of a variable can also be stored in some other variable, which then holds a reference to the first one. Tcl doesn't feature double-dereference (as found in "make"), however you can use a trick with a one-argument set command call:
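
For your own procedures, passing a variable "by reference" likewise means passing its name. One common way to use that name inside the procedure is the standard upvar command, which is not covered elsewhere on this page; the procedure double_it is a made-up example:

```tcl
# Hypothetical procedure: doubles the value of the variable whose NAME is given.
proc double_it {varname} {
    # link the local name "v" to the caller's variable called $varname
    upvar 1 $varname v
    set v [expr {$v * 2}]
}

set n 21
double_it n      ;# note: n, not $n - we pass the name, not the value
puts $n          ;# 42
```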

% set a XXL
% set b a
% puts $b
a
% puts [set $b]
XXL

Using this technique you can even create a set of variables with "glueable" names and get simple key-value mappings this way. Just do

set a:$key val

But Tcl already has a tool for that: arrays. An array is a little bit different from a variable - you can't extract its value by $variable. However, arrays are still local to a function and global ones can be declared by the variable command. To access a particular key, you do:

set ar(2) val
puts $ar(2)

This is better than "gluing names" because this solution gets some more support from the interpreter, that is:

  • expression in (...) is evaluated separately
  • expression in (...) may contain spaces (although take care when passing such a name as an argument, to make sure that the space stays a part of the name)
  • there's an array command to support some extra operations on the array itself - for example array names that gets all used keys
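
A small sketch of these points, with invented array and key names:

```tcl
set color(apple) red
set color(sky) blue

# The key is evaluated separately, so it can come from a variable
# and may even contain a space.
set key "blood orange"
set color($key) red

puts $color(sky)                   ;# blue
puts $color($key)                  ;# red
puts [array size color]            ;# 3
puts [lsort [array names color]]   ;# apple {blood orange} sky
```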

Arrays implement the functionality of "dictionaries". However, this is done with a slightly clumsy syntax, and only one level of nesting is available. Although arrays are useful in many cases, if you really want to have the full flexibility of dictionaries, see the dict command.

The reason why there are two different solutions for dictionaries is that the dict command is new in Tcl 8.5, so until then arrays were the only implementation of a dictionary. These two solutions may be treated as equivalent:

set a(x) 10
set a(y) 20
set k y
puts $a($k)

and

dict set a x 10
dict set a y 20
set k y
puts [dict get $a $k]

The dict set command doesn't do anything magic - both of these calls to dict set can be simply expressed as set a {x 10 y 20}. The important differences between arrays and dictionaries are that:

  • The dictionary in dict may be multiple levels - an array is only one level
  • The dict-based dictionary is simply a value, so it can be normally passed by value to other functions. Arrays are strongly bound to variables, so they can be at best passed by reference (by name).

However, there's a simple method of translation between array and dictionary: the array get <array-variable> command translates array to dictionary and array set <array-variable> <dictionary> the other way around.
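
A short round trip between the two forms, plus one nested dictionary; all names and values here are sample data:

```tcl
set cfg(cc) gcc
set cfg(opt) -O2

# array -> dictionary (the order of the keys is not specified)
set d [array get cfg]
puts [dict get $d cc]                    ;# gcc

# dictionary -> array
array set copy $d
puts $copy(opt)                          ;# -O2

# Only dictionaries can be nested: here "flags" lives inside "compiler".
dict set tree compiler flags {-O2 -g}
puts [dict get $tree compiler flags]     ;# -O2 -g
```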

Namespaces

Namespaces in Tcl are much like in C++. There's one thing you have to be aware of, though.

As you saw, everything in this language is only a text string - including names of variables and procedures. It means that if you call a procedure by name, it gets called by exactly that name. A simple name (without a namespace prefix) is looked up first in the current namespace and then in the global namespace, which is why standard Tcl commands, such as set or while, are available directly under these names inside every namespace (you can guess that such a mechanism must have been added or these namespaces would be unusable). A command from any other namespace must be called with its namespace-based name: a global one can be spelled ::name and one in a namespace x::y::name.

The problem appears when a procedure defined in a namespace hands over the simple name of another procedure from the same namespace (or a script that uses it) to be executed later in the global context or in some other namespace, for example as a callback. The interpreter then tries to find the name in the "current" namespace, that is - in this case - the namespace where the call actually occurs, not the namespace where that procedure was originally defined!

There are two tools to prevent these problems:

  • namespace current: returns the name of the current namespace. Use that for prefixing the function name.
  • namespace code: returns a "code block" that, when executed, will execute the given script within the scope of the current namespace
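
Both tools in a minimal sketch; the namespace ::app and its procedures are invented for this illustration:

```tcl
namespace eval ::app {
    proc helper {x} {
        return "helper got $x"
    }

    # Returns a script that runs "helper" inside ::app from any context.
    proc make_callback {} {
        return [namespace code helper]
    }

    # Returns the fully qualified name ::app::helper.
    proc qualified_name {} {
        return [namespace current]::helper
    }
}

# In the global namespace the simple name "helper" is unknown...
puts [catch {helper 1}]                  ;# 1

# ...but both forms produced inside ::app can be used from here.
set cb [::app::make_callback]
puts [{*}$cb 42]                         ;# helper got 42
puts [[::app::qualified_name] 42]        ;# helper got 42
```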