Re: Session tracking

Paul Burchard ([email protected])
Tue, 18 Apr 1995 02:33:31 +0500


Brian Behlendorf <[email protected]> writes:
> "Clickstreams" are the paths people take when they
> traverse your site - many content providers would find it
> useful to be able to detect common patterns or the
> effectiveness of various user interfaces.
>
> So, I'd like to propose for discussion a new HTTP header
> (hi Roy!) called "Session-ID". This would be optional,
> of course, and it would change any time the browser is
> restarted (or when the user wished).

This is an excellent idea. With Referer logging, you can already
produce a "Markov model" for your Web site, giving transition
probabilities between pages. But it would be interesting to find
out just how independent link choices really are; i.e., once a user
gets to a page, how much does it matter where they came from? To
the extent that it matters, the Markov model is inaccurate.
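
As a rough illustration of the Referer-based Markov model, here is a minimal
Python sketch. It assumes the log has already been reduced to (referer, page)
pairs; the function name and the sample hits are made up for the example.

```python
from collections import Counter, defaultdict

def transition_probabilities(hits):
    """Estimate first-order transition probabilities P(page | referer).

    hits: iterable of (referer, page) pairs taken from a Referer log.
    """
    counts = defaultdict(Counter)
    for referer, page in hits:
        counts[referer][page] += 1

    probs = {}
    for referer, nexts in counts.items():
        total = sum(nexts.values())
        probs[referer] = {page: n / total for page, n in nexts.items()}
    return probs

# Hypothetical sample data, for illustration only.
hits = [
    ("/", "/a.html"),
    ("/", "/a.html"),
    ("/", "/b.html"),
    ("/a.html", "/b.html"),
]
model = transition_probabilities(hits)
```

Each row of `model` depends only on the current page. Checking whether the
previous page also matters would need longer paths, which is where something
like a Session-ID would come in.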

> Given that more than one person can use a hostname (proxy
> servers, etc), there's no reliable way to exactly identify
> a unique person without implementing access control

Yes, and the statistics of access intervals don't help. Intervals
between requests from the same host seem to follow a combination of
two very distinct exponential distributions whose decay rates differ
by over an order of magnitude; presumably the long-term
exponential represents intervals between user sessions through the
same gateway host. But the problem with exponential distributions
is that the maximum probability occurs at zero, no matter how long-
or short-term they might be...
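
To make the last point concrete, here is a small Python sketch of a
two-exponential mixture. The weight and the two decay rates are invented
for illustration (they are not fitted to any real log), and the function
names are hypothetical.

```python
import math

# Assumed, illustrative parameters (not measured values).
WEIGHT_FAST = 0.8        # fraction of intervals from the short-term component
RATE_FAST = 1.0 / 30.0   # decay rate per second, within a session
RATE_SLOW = 1.0 / 600.0  # decay rate per second, between sessions

def mixture_density(t):
    """Density of the two-exponential mixture at interval t (seconds)."""
    fast = WEIGHT_FAST * RATE_FAST * math.exp(-RATE_FAST * t)
    slow = (1.0 - WEIGHT_FAST) * RATE_SLOW * math.exp(-RATE_SLOW * t)
    return fast + slow

def prob_between_sessions(t):
    """Probability that an interval of length t came from the slow component."""
    slow = (1.0 - WEIGHT_FAST) * RATE_SLOW * math.exp(-RATE_SLOW * t)
    return slow / mixture_density(t)

intervals = [0, 10, 60, 300, 1800]
densities = [mixture_density(t) for t in intervals]
posteriors = [prob_between_sessions(t) for t in intervals]
```

Under these assumptions the density only falls as t grows, and even at
t = 0 the between-sessions probability is small but not zero, so any
cutoff on the interval will misclassify some requests.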

--------------------------------------------------------------------
Paul Burchard <[email protected]>
``I'm still learning how to count backwards from infinity...''
--------------------------------------------------------------------