set whole_page "some stuff for the top of the page\n\n"
append whole_page "some stuff for the middle of the page\n\n"
append whole_page "some stuff for the bottom of the page\n\n"
# done composing the page, let's write it back to the user
ns_return 200 text/html $whole_page
If you're processing data from the user, typically entered into an
HTML form, you'll be using a rich variety of built-in string-handling
procedures. Suppose that a user is registering at your site with the
form variables first_names, last_name, email, and password.
Here's how we might build up a list of exceptions (using the Tcl
lappend command, described in the chapter on lists):
# compare the first_names value to the empty string
if { [string compare $first_names ""] == 0 } {
    lappend exception_list "You forgot to type your first name"
}
# see if their email address has the form
# something at-sign something
if { ![regexp {.+@.+} $email] } {
    lappend exception_list "Your email address doesn't look valid."
}
if { [string length $password] > 20 } {
    lappend exception_list "The password you selected is too long."
}
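The checks above can be bundled into one procedure that returns the list of complaints. This is only a sketch: the procedure name registration_exceptions is our own invention for illustration, and it starts from an empty list so that the caller can always test the result with llength.

```tcl
# Hypothetical helper (not part of any toolkit): collect complaints
# about a registration form and return them as a Tcl list.
proc registration_exceptions {first_names email password} {
    set exception_list [list]
    if { [string compare $first_names ""] == 0 } {
        lappend exception_list "You forgot to type your first name"
    }
    if { ![regexp {.+@.+} $email] } {
        lappend exception_list "Your email address doesn't look valid."
    }
    if { [string length $password] > 20 } {
        lappend exception_list "The password you selected is too long."
    }
    return $exception_list
}

# empty first name and an address with no at-sign: two complaints
set problems [registration_exceptions "" "not-an-address" "short"]
puts "[llength $problems] problem(s) found"
```

An empty return value means the input passed all three checks and is ready for the next step.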
If there aren't any exceptions, we have to get these data ready for
insertion into the database:
# remove whitespace from ends of input (if any)
set last_name_trimmed [string trim $last_name]
# escape any single quotes with an extra one (since the SQL
# string literal quoting system uses single quotes)
regsub -all ' $last_name_trimmed '' last_name_final
set sql_insert "insert into users (..., last_name, ...)
values
(..., '$last_name_final', ...)"
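To see what the trim-and-escape steps do to a real name, here is a small sketch. The procedure name double_apostrophes is hypothetical, chosen only for this example; it simply wraps the two commands shown above.

```tcl
# Hypothetical wrapper around the two steps above: trim whitespace,
# then double every single quote so the value is safe inside a
# SQL string literal.
proc double_apostrophes {value} {
    regsub -all ' [string trim $value] '' escaped
    return $escaped
}

puts [double_apostrophes "  O'Grady "]
# prints: O''Grady
```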
The simplest way to look for a substring within a string is the
string first command. Some users of photo.net complained
that they didn't like seeing classified ads that were simply pointers
to the eBay auction site. Here's a simplified snippet from the code
that inserts ads into the database:
if { [string first "ebay" [string tolower $full_ad]] != -1 } {
    # return an exception
    ...
}
An alternative formulation would be
if { [regexp -nocase {ebay} $full_ad] } {
    # return an exception
    ...
}
Both implementations will catch any capitalization variant of "eBAY".
Both implementations will miss "e-bay" but it doesn't matter because
if the poster of the ad includes a link with a URL, the hyperlink will
contain "ebay". What about false positives? If you visit www.m-w.com and search for "*ebay*"
you'll find that both implementations might bite someone selling
rhododendrons or a water-powered mill. That's why the toolkit code
checks a "DisalloweBay" parameter, set by the publisher, before
declaring this an exception.
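To make the false-positive problem concrete, here is a sketch that runs both tests over three made-up ads. The ad texts are our own examples; "rosebay" is a kind of rhododendron and a "forebay" is part of a water mill, and both contain the letters "ebay".

```tcl
# Made-up sample ads: one real eBay pointer and two innocent ones.
set sample_ads [list \
    "Canon lens, see my eBay listing" \
    "Rosebay rhododendrons, freshly potted" \
    "Old mill with original forebay and wheel"]

set flagged [list]
foreach full_ad $sample_ads {
    set by_first  [expr {[string first "ebay" [string tolower $full_ad]] != -1}]
    set by_regexp [regexp -nocase {ebay} $full_ad]
    # both tests agree on every ad, including the false positives
    if { $by_first && $by_regexp } {
        lappend flagged $full_ad
    }
}
puts "[llength $flagged] of [llength $sample_ads] ads flagged"
```

All three ads are flagged, which is why a substring test alone is a blunt instrument.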
If you're just trying to find a substring, you can use either
string first or regexp. If you're trying to
do something more subtle, you'll need regexp (described more fully in
the chapter "Pattern Matching"):
if { ![regexp {[a-z]} $full_ad] } {
    # no lowercase letters in the ad!
    append exception_text "
Your ad appears to be all uppercase.
ON THE INTERNET THIS IS CONSIDERED SHOUTING. IT IS ALSO MUCH
HARDER TO READ THAN MIXED CASE TEXT. So we don't allow it,
out of decorum and consideration for people who may
be visually impaired."
    incr exception_count
}
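One wrinkle with the test above: an ad containing no letters at all (a bare phone number, say) also has no lowercase letters. A possible refinement, sketched here with a hypothetical procedure name, is to require at least one uppercase letter as well.

```tcl
# Hypothetical refinement: "shouting" means there is at least one
# uppercase letter and not a single lowercase one.
proc is_all_uppercase {text} {
    return [expr {[regexp {[A-Z]} $text] && ![regexp {[a-z]} $text]}]
}

puts [is_all_uppercase "SAMOYED PUPPIES FOR SALE"]  ;# 1
puts [is_all_uppercase "Samoyed puppies for sale"]  ;# 0
puts [is_all_uppercase "555-1212"]                  ;# 0
```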
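To use only part of a string, for example to show just the beginning
of a long message, use the string range command:
if { [string length $message] > 1000 } {
    # indices are zero-based and inclusive, so 0 through 999
    # keeps the first 1000 characters
    set complete_message "[string range $message 0 999]... "
} else {
    set complete_message $message
}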
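The same idea can be wrapped in a procedure with the limit as a parameter. This is a sketch; the name truncate_message and its default of 1000 are assumptions made for illustration. Remember that string range takes zero-based, inclusive indices, so keeping N characters means asking for 0 through N-1.

```tcl
# Hypothetical helper: keep at most max_length characters of message,
# marking any truncation with a trailing ellipsis.
proc truncate_message {message {max_length 1000}} {
    if { [string length $message] > $max_length } {
        set last_index [expr {$max_length - 1}]
        return "[string range $message 0 $last_index]... "
    }
    return $message
}

puts [truncate_message "abcdefghij" 4]
puts [truncate_message "short" 4000]
```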
format and scan resemble C's printf and scanf commands. That's
pretty much all that any Tcl manual will tell you about these
commands, which means that you're kind of S.O.L. if you don't know C.
The basic idea of these commands comes from Fortran, a computer
language developed by John Backus at IBM in 1954. The FORMAT command
in Fortran would let you control the printed display of a number,
including such aspects as spaces of padding to the left and digits of
precision after the decimal point.
With Tcl format, the first argument is a pattern for how
you'd like the final output to look. Inside the pattern are
placeholders for values. The second through Nth arguments to
format are the values themselves:
format pattern value1 value2 value3 ... valueN
We can never figure out how to use format without either copying an
earlier fragment of pattern or referring to the man page
(http://www.tcl.tk/man/tcl8.4/TclCmd/format.htm). However, here are some
examples for you to copy:
% # format prices with two digits after the point
% format "Price: %0.2f" 17
Price: 17.00
% # pad some stuff out to fill 20 spaces
% format "%20s" "a long thing"
        a long thing
% format "%20s" "23"
                  23
% # notice that the 20 spaces is a MINIMUM; use string range
% # if you might need to truncate
% format "%20s" "something way longer than 20 spaces"
something way longer than 20 spaces
% # turn a number into an ASCII character
% format "%c" 65
A
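A few more patterns that come up in practice, offered as a sketch rather than a complete tour: a minus sign left-justifies, a leading zero pads numbers with zeros, and a precision on %s sets a maximum width, which is another way to truncate. The variable names are ours.

```tcl
# left-justify "tcl" in a 10-character field, then a bar to show the edge
set left    [format "%-10s|" "tcl"]
# zero-pad an integer to five digits
set padded  [format "%05d" 42]
# a precision on %s is a MAXIMUM: only the first 5 characters survive
set clipped [format "%.5s" "something"]
# several placeholders are filled from the arguments in order
set line    [format "%s costs \$%0.2f" "harness" 17.5]

puts $left
puts $padded
puts $clipped
puts $line
```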
The Tcl command scan performs the reverse operation,
i.e., parses an input string according to a pattern and stuffs values
as it finds them into variables:
% # turn an ASCII character into a number
% scan "A" "%c" the_ascii_value
1
% set the_ascii_value
65
%
Notice that the number returned by scan is a count of how
many conversions it was able to perform successfully. If you really
want to use scan, you'll need to visit
the man page: http://www.tcl.tk/man/tcl8.4/TclCmd/scan.htm. For an idea of
how useful this is for Web development, consider that the
entire 250,000-line ArsDigita Community System does not contain a
single use of the scan command.
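If you do find yourself reaching for scan, the returned count is the thing to check. Here is a minimal sketch that pulls three integers out of a date-like string; the variable names and the sample date are made up for the example.

```tcl
set date_string "1999-07-04"
# %d reads a decimal integer, so "07" becomes 7
set n_converted [scan $date_string "%d-%d-%d" year month day]
if { $n_converted == 3 } {
    puts "year=$year month=$month day=$day"
} else {
    puts "could not parse \"$date_string\""
}
```

Comparing the count against the number of placeholders is how you tell a full parse from a partial one.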
string
append variable_name value1 value2 value3 ... valueN
regexp ?switches? expression string ?matchVar? ?subMatchVar subMatchVar ...?
Returns 1 if expression matches string; 0
otherwise. If successful, regexp sets the match
variables to the parts of string that match the
corresponding parts of expression.
% set fraction "5/6"
5/6
% regexp {(.*)/(.*)} $fraction match num denom
1
% set match
5/6
% set num
5
% set denom
6
(more: the pattern matching chapter and http://www.tcl.tk/man/tcl8.4/TclCmd/regexp.htm)
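regsub ?switches? expression string substitution_spec result_variable_name
Replaces the parts of string that match expression according to
substitution_spec, returns a count of the replacements made, and
stores the resulting string in result_variable_name.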
Here's an example where we ask a user to type in keywords, separated
by commas. We then expect to feed this list to a full-text search
indexer that will throw an error if handed two commas in a row. We
use regsub to clean up what the user typed:
# here we destructively modify the variable $query_string,
# replacing every occurrence of one or more commas with a single
# comma
% set query_string "samoyed,, sledding, harness"
samoyed,, sledding, harness
% regsub -all {,+} $query_string "," query_string
2
% set query_string
samoyed, sledding, harness
(more: the pattern matching chapter and http://www.tcl.tk/man/tcl8.4/TclCmd/regexp.htm)
regexp and regsub were dramatically improved
with the Tcl 8.1 release. For a Web developer the most important
feature is the inclusion of non-greedy regular expressions. This makes it
easy to match the contents of HTML tags. See http://www.scriptics.com/services/support/howto/regexp81.html
for a full discussion of the differences.
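Here is a small sketch of the difference, using a made-up HTML fragment. The greedy pattern (.*) grabs as much as it can and runs to the last closing tag; the non-greedy (.*?) stops at the first one.

```tcl
set html "<b>one</b> and <b>two</b>"

# greedy: .* swallows everything up to the LAST </b>
regexp {<b>(.*)</b>} $html match greedy_contents
# non-greedy: .*? stops at the FIRST </b>
regexp {<b>(.*?)</b>} $html match lazy_contents

puts "greedy:     $greedy_contents"
puts "non-greedy: $lazy_contents"
```

The first puts shows "one</b> and <b>two"; the second shows just "one".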
string
string compare apple applesauce ==> -1
string compare apple Apple ==> 1
string first tcl catclaw ==> 2
string last abra abracadabra ==> 7
string range catclaw 2 4 ==> tcl
string compare weBmaster Webmaster ==> 1
string compare [string tolower weBmaster] \
[string tolower Webmaster] ==> 0
set password "ferrari"
string compare "FERRARI" [string toupper $password] ==> 0
set password [string trim $form_password] ; # see above example
set password [string trimleft $form_password]
set password [string trimright $form_password]
string wordend "tcl is the greatest" 0 ==> 3
string wordstart "tcl is the greatest" 5 ==> 4
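To see the three trim commands side by side, and to combine trimming with a case-insensitive comparison, here is a short sketch. The form_password value is invented for the example; the single quotes in the output are there only to make the remaining whitespace visible.

```tcl
set form_password "  Ferrari  "

set both  [string trim $form_password]
set left  [string trimleft $form_password]
set right [string trimright $form_password]
puts "trim:      '$both'"
puts "trimleft:  '$left'"
puts "trimright: '$right'"

# compare without regard to case by lowercasing one side first
set same [expr {[string compare [string tolower $both] "ferrari"] == 0}]
puts "matches ferrari, ignoring case: $same"
```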
Continue on to list operations.