Re: Client-side highlighting; tag proposal

Gary Adams - Sun Microsystems Labs BOS ([email protected])
Fri, 10 Mar 1995 20:40:51 +0500


> Date: Thu, 9 Mar 1995 23:12:27 +0500
> From: [email protected] (Wayne Allen)
>
>
> Nick says...
>
> A couple of times lately, I've brought up the notion that clients should
> handle highlights (the terms that match a search query) better. It's
> rather inefficient to force the search server to proxy documents just so
> that it can add highlights. Worse, it takes the decision about *how* to
> highlight (bold? underline? surround with asterisks?) out of the user's
> hands (barring some sort of ugly protocol for telling the server).

A quick survey of current search engines that are integrated with the
web shows overloaded use of <STRONG> or <EM> tags for highlighting
purposes. In many cases where HTML pages were indexed and retrieved,
existing markup has not been handled correctly,
e.g. <A HREF="/image/<STRONG>foo.gif</STRONG>">
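
One way a search server could avoid corrupting attributes like the HREF
above is to highlight only inside text nodes. The following is a minimal
Python sketch, assuming the standard-library html.parser module; the
Highlighter class and highlight() function are hypothetical names, and
the sketch drops comments and declarations for brevity.

```python
import re
from html.parser import HTMLParser


class Highlighter(HTMLParser):
    """Re-emit HTML, wrapping matches in <STRONG> only inside text nodes."""

    def __init__(self, term):
        # Keep entity references as-is instead of decoding them into text.
        super().__init__(convert_charrefs=False)
        self.pattern = re.compile(re.escape(term), re.IGNORECASE)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as it appeared in
        # the input, so attribute values are never touched.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # The parser lowercases tag names, so end tags come out lowercase.
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Only character data between tags is searched for the term.
        self.out.append(
            self.pattern.sub(lambda m: "<STRONG>%s</STRONG>" % m.group(0), data)
        )

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def highlight(html, term):
    parser = Highlighter(term)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


result = highlight('<A HREF="/image/foo.gif">foo picture</A>', "foo")
```

Here the HREF value passes through unchanged, while the "foo" in the
link text is wrapped in <STRONG>.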

The only problem I see with using a new tag for the express purpose
of search engine highlighting is that it may not be general enough
as HTML markup, and that it could creep into manually generated
documents. Is an authored <STRONG> different from a script-generated
<EM>?

I believe we will see several active content document schemes deployed
in calendar year 1995 that will make it possible to have much more
dynamic client side behavior for searching and filtering operations.
e.g. Whenever I see a hint of "terrorism", render the paragraph in
red and sound a siren.

>
> We'd like to suggest a very simple approach -- a highlight tag. This way,
> our server could add the highlight tag in the appropriate places, but it
> would be up to the browser (under the user's control, presumably) to decide
> how to identify highlights in the text (turn them red, underline,
> whatever). An appropriate UI enhancement would be the addition of a "next
> highlight" button or menu item and optionally a "previous highlight"
> button.

The forward and backward navigation could be left to PATHs at a
finer granularity than proposed earlier. Each location would need to
be named, and the browser navigation would logically look something
like:
<A HREF="#prev1">^</A>
<A NAME="hit2"><STRONG>terrorism</STRONG></A>
<A HREF="#next3">V</A>

This could also lead to result list navigation along the
same lines as subdocument paths.

<A HREF="nextdoc2#nexthit4"> ...
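
A server-side script could generate this kind of named-anchor navigation
mechanically. The following is a rough Python sketch under stated
assumptions: it works on plain text rather than existing HTML, and the
uniform "hitN" naming and the link_hits() function are hypothetical
choices for illustration, not part of any proposal.

```python
import re
from html import escape


def link_hits(text, term):
    """Wrap each match in a named anchor with links to neighbouring hits."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    matches = list(pattern.finditer(text))
    total = len(matches)
    parts = []
    last = 0
    for n, m in enumerate(matches, start=1):
        # Text between hits is escaped so it stays literal in the output.
        parts.append(escape(text[last:m.start()]))
        if n > 1:
            parts.append('<A HREF="#hit%d">^</A>' % (n - 1))
        parts.append(
            '<A NAME="hit%d"><STRONG>%s</STRONG></A>' % (n, escape(m.group(0)))
        )
        if n < total:
            parts.append('<A HREF="#hit%d">V</A>' % (n + 1))
        last = m.end()
    parts.append(escape(text[last:]))
    return "".join(parts)


linked = link_hits("a foo b foo c", "foo")
```

The first hit gets only a forward link and the last hit only a backward
link, so the chain has no dangling references.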

The problem with adding this capability to the browsers is that
there are many times when I would like to navigate over other
types of markup as well as search hits. Could my "PageDown"
key be mapped to "Advance to the next H1 tag"? Perhaps the browser
could include support for client-side searching, which would let me
see a list of <H*> tags and select from the local result list.
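
The local list of <H*> tags could be built along these lines. This is
only a sketch in Python, assuming the standard-library html.parser
module; HeadingLister and list_headings() are hypothetical names, and a
real browser would work on its own parsed document instead.

```python
import re
from html.parser import HTMLParser


class HeadingLister(HTMLParser):
    """Collect (level, text) pairs for every H1..H6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._buf = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased from the parser.
        if re.fullmatch(r"h[1-6]", tag):
            self._level = int(tag[1])
            self._buf = []

    def handle_data(self, data):
        if self._level is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == "h%d" % self._level:
            # Collapse runs of whitespace in the collected heading text.
            text = " ".join("".join(self._buf).split())
            self.headings.append((self._level, text))
            self._level = None


def list_headings(html):
    parser = HeadingLister()
    parser.feed(html)
    parser.close()
    return parser.headings


headings = list_headings("<H1>Intro</H1><P>x</P><H2>Sub <EM>topic</EM></H2>")
```

The resulting list could drive either a "next heading" key binding or a
selectable outline of the page.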

>
> I agree with Nick that clients should be able to highlight search
> results, but I suggest a simpler approach - when a server sends a
> response to a search (and only the server *really* knows when this
> is), it should send a keyword attribute in the HTML header, containing
> the terms it thinks are relevant to the results (which may not be all
> the terms specified.) Then the browser can (or not) implement a simple
> way to find and highlight the terms. Most (all?) browsers can already
> search on text, so this is a very simple extension for them. It
> relieves the server of having to muck with the actual HTML text it
> sends, and uses existing mechanisms.

One of the problems with KeyWords In Context (KWIC) systems is the
limited expressiveness of results and the narrow field of recall.
I prefer the approach of the search engine specifying the region
for highlighting over a local mechanism the browser uses for
highlighting terms blindly. I want to see whole sentences selected
by the search engine that actually fulfill my information need.
e.g. "How do I print a file?" returns "<EM>A document can be sent to
the laserwriter using the save dialog</EM>".

I have high hopes for the kinds of applications that will be enabled
with the advent of active client side content. A TABLE will be
an ideal means of expressing search result lists. An active table
could easily sort the results by selecting one of the column headings.
Sort by name or date for fancy directory listings. These are all examples
of safe interactions that are limited to user interactions and manipulation
of the current document contents.
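
The column-heading sort amounts to reordering rows already held by the
client. A minimal Python sketch follows; the sample rows, the column
names and the sort_results() function are made up for illustration.

```python
from datetime import date

# Hypothetical result list: one dict per table row.
results = [
    {"name": "printing.html", "date": date(1995, 3, 1)},
    {"name": "faq.html", "date": date(1995, 2, 14)},
    {"name": "setup.html", "date": date(1995, 3, 9)},
]


def sort_results(rows, column, descending=False):
    """Return a new list of rows ordered by the selected column."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)


by_name = [row["name"] for row in sort_results(results, "name")]
newest_first = [
    row["name"] for row in sort_results(results, "date", descending=True)
]
```

Because sorted() returns a new list, the original order is still
available if the user wants to return to the server's ranking.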
>
______________________________________________________________________
Gary R. Adams Email: [email protected]
Sun Microsystems Laboratories Tel: (508) 442-0416
Two Elizabeth Drive Fax: (508) 250-5067
Chelmsford MA 01824-4195 USA (Sun mail stop: UCHL03-207)
SWAN URL: http://labboot.East/~gra/
______________________________________________________________________