I have recently come across two online articles on Web-usage analysis that
cast serious doubt on the validity of attempting to identify user sessions
from the type of data currently recorded in Web server logs. User-session
identification is made difficult by a number of factors, including caching,
load balancing (which can assign multiple IP addresses to the same user
during a single session), and requests from spiders. One of these
critical articles is by Stephen Turner (Cambridge University) [1], the
other is from Susan Haigh and Janette Megarity (National Library of Canada)
[2].
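To make the load-balancing problem concrete, here is a minimal sketch (my
own illustration, not taken from either article) of the common IP-plus-
inactivity-timeout heuristic that log analysers use, and how a single proxy
IP reassignment mid-visit splits one real session into two. The IP
addresses, timestamps, and 30-minute timeout are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical log records: (ip, timestamp) pairs for ONE real user whose
# proxy rotates the apparent IP address mid-visit (as AOL's proxies do).
records = [
    ("10.0.0.1", datetime(2001, 7, 7, 12, 0, 0)),
    ("10.0.0.1", datetime(2001, 7, 7, 12, 1, 30)),
    ("10.0.0.2", datetime(2001, 7, 7, 12, 2, 10)),  # IP reassigned mid-session
    ("10.0.0.2", datetime(2001, 7, 7, 12, 3, 0)),
]

def count_sessions(records, timeout=timedelta(minutes=30)):
    """Common heuristic: a session is a run of requests from one IP with
    no gap longer than `timeout` between consecutive requests."""
    last_seen = {}  # ip -> timestamp of that IP's most recent request
    sessions = 0
    for ip, ts in sorted(records, key=lambda r: r[1]):
        if ip not in last_seen or ts - last_seen[ip] > timeout:
            sessions += 1
        last_seen[ip] = ts
    return sessions

print(count_sessions(records))  # prints 2, although only one visit occurred
```

The heuristic over-counts: one visit is reported as two sessions, which is
exactly the kind of error the articles describe.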
Haigh and Megarity describe user-session figures as "at best, gross
estimates". It seems to me that what is needed is a systematic validation
of the efficacy of the various Web-analysis algorithms currently available.
This could be done by simulating log-file data from known transactions and
measuring how well an algorithm recovers those transactions from the data.
The exercise should be repeated across a wide range of hypothetical
scenarios, such as very frequent load balancing (as occurs in practice with
AOL users).
Does anyone know if such a validation has been done?
Richard
References
---------------
[1] S. Turner. "Analog 5.03: How the Web Works".
http://www.analog.cx/docs/webworks.html [7 July 2001]
[2] S. Haigh, J. Megarity. "Measuring Web Site Usage: Log File Analysis".
http://www.nlc-bnc.ca/9/1/p1-256-e.html [4 August 1998]
-------------------------------
Richard Dybowski, 143 Village Way, Pinner, Middlesex HA5 5AA, UK
Tel (mobile): 079 76 25 00 92
This archive was generated by hypermail 2b29 : Tue Oct 02 2001 - 14:26:58 PDT