Problem:
I wanted to produce a listing of TITLE lines for all
HTML files in my current directory.
Solution:
A Perl script called htmrpt does the job:
#!/usr/local/bin/perl
#
# give a report of the TITLE lines from all *.htm files
foreach $file (<*.htm*>) {
    printf "%-20s ", $file;
    $notitle = 1;                  # reset once per file, before reading it
    if (!open(HTFILE, '<', $file)) {
        print "*** Cannot open file: $! ***\n";
        next;
    }
    while (<HTFILE>) {
        chomp;                     # remove the trailing newline only
        if (/^\<TITLE\>/) {
            s/<TITLE>//;
            s/<\/TITLE>//;
            print;
            print "\n";
            $notitle = 0;
            last;                  # break out of this loop (file)
        }
        if (/^\<title\>/) {
            s/<title>//;           # There must be an easier way to
            s/<\/title>//;         # solve this almost duplicate block of code!
            print;                 # I am just a perl newbie
            print "\n";
            $notitle = 0;
            last;                  # break out of this loop (file)
        }
    }
    if ($notitle == 1) { print "*** No title found ***\n" }
    close(HTFILE);
}
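
On the "easier way" question raised in the comments: one option is to
make the match case-insensitive with the /i modifier, so a single block
handles <TITLE>, <title>, and mixed case. What follows is only a sketch,
not the original author's code. The sub name title_from_line is made up
for illustration, and the sketch still assumes that the whole title sits
on one line that begins with the tag.

```perl
#!/usr/local/bin/perl
# Sketch only: title_from_line is a hypothetical helper, not part of htmrpt.
use strict;
use warnings;

# Return the title text if the line starts with a TITLE tag (any case),
# otherwise return undef.
sub title_from_line {
    my ($line) = @_;
    chomp $line;
    return unless $line =~ /^<title>/i;
    $line =~ s/<\/?title>//gi;    # strip the opening and closing tags
    return $line;
}

foreach my $file (<*.htm*>) {
    printf "%-20s ", $file;
    my $title;
    # A file that cannot be opened is reported as having no title.
    if (open(my $fh, '<', $file)) {
        while (my $line = <$fh>) {
            $title = title_from_line($line);
            last if defined $title;   # stop at the first title found
        }
        close($fh);
    }
    $title = "*** No title found ***" unless defined $title;
    print "$title\n";
}
```

Keeping the match in its own sub means the file loop no longer needs the
$notitle flag: an undefined result stands for "no title on this line".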
The following script, htmrpt2.pl, prints the title first and the
file name after it, so its output can be piped into sort
(for example, htmrpt2.pl | sort) for an alphabetical listing by title.
#!/usr/local/bin/perl
foreach $file (<*.htm*>) {
    $notitle = 1;                  # reset once per file, before reading it
    if (!open(HTFILE, '<', $file)) {
        print "*** Cannot open file: $! *** ($file)\n";
        next;
    }
    while (<HTFILE>) {
        chomp;                     # remove the trailing newline only
        if (/^\<TITLE\>/) {
            s/<TITLE>//;
            s/<\/TITLE>//;
            print;
            $notitle = 0;
            last;                  # break out of this loop (file)
        }
        if (/^\<title\>/) {
            s/<title>//;
            s/<\/title>//;
            print;
            $notitle = 0;
            last;                  # break out of this loop (file)
        }
    }
    if ($notitle == 1) { print "*** No title found *** " }
    close(HTFILE);
    print " ($file)\n";
}
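
As an alternative to piping the output into sort, the sorting could also
be done inside Perl. The sketch below is one possible way to do that, not
the author's script. The sub names first_title and sorted_report are
hypothetical, the comparison ignores case (an assumption about what
"alphabetical" should mean here), and the title is again assumed to sit
on one line that begins with the tag.

```perl
#!/usr/local/bin/perl
# Sketch only: first_title and sorted_report are hypothetical names.
use strict;
use warnings;

# Return the first one-line title found in the file, or a placeholder
# when the file cannot be opened or has no such line.
sub first_title {
    my ($file) = @_;
    open(my $fh, '<', $file) or return "*** No title found ***";
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^<title>/i) {
            $line =~ s/<\/?title>//gi;   # strip the opening and closing tags
            close($fh);
            return $line;
        }
    }
    close($fh);
    return "*** No title found ***";
}

# Build one "Title (file)" line per file and sort the lines without
# regard to case.
sub sorted_report {
    my @files = @_;
    my @lines;
    foreach my $file (@files) {
        push @lines, first_title($file) . " ($file)";
    }
    my @sorted = sort { lc($a) cmp lc($b) } @lines;
    return @sorted;
}

print "$_\n" for sorted_report(<*.htm*>);
```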
---------------------------------------------------------------------------
Charles Cave ~ .-_|\ Phone +61 2 416 6877
Customer Services Manager / \ ~ Fax +61 2 416 2086
Unidata Australasia \.--._*<--- Level 2, 280 Pacific Hwy
[email protected] ~ v Lindfield, NSW, 2070 AUSTRALIA
---------------------------------------------------------------------------