Markovian analysis for automatic new topic identification in search engine transaction logs


Ozmutlu H. C.

APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, vol.25, no.6, pp.737-768, 2009 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 25 Issue: 6
  • Publication Date: 2009
  • Doi Number: 10.1002/asmb.758
  • Title of Journal: APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY
  • Page Numbers: pp.737-768

Abstract

Topic analysis of search engine user queries is an important task, since successful exploitation of the topic of queries can result in the design of new information retrieval algorithms for more efficient search engines. Identification of topic changes within a user search session is a key issue in the analysis of search engine user queries. This study presents an application of Markov chains in the area of search engine research to automatically identify topic changes in a user session by using statistical characteristics of queries, such as time intervals, query reformulation patterns and the continuation/shift status of the previous query. The findings show that Markov chains provide fairly successful results for automatic new topic identification with a high level of estimation for topic continuations and shifts. Copyright (C) 2009 John Wiley & Sons, Ltd.
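
As a rough illustration of the general idea (not the model from the article), the sketch below estimates first-order Markov transition probabilities between two hypothetical states, "C" (topic continuation) and "S" (topic shift), from labelled sessions, and then predicts the most likely status of the next query given the previous one. The state labels, the function names `estimate_transitions` and `predict_next`, and the toy sessions are all assumptions made for this example; the time-interval and query-reformulation features mentioned in the abstract are left out for brevity.

```python
from collections import Counter, defaultdict


def estimate_transitions(sessions):
    """Estimate P(next status | previous status) from labelled sessions.

    Each session is a list of hypothetical labels: "C" for a topic
    continuation and "S" for a topic shift (assumed encoding).
    """
    counts = defaultdict(Counter)
    for labels in sessions:
        # Count each consecutive (previous, next) pair within a session.
        for prev, nxt in zip(labels, labels[1:]):
            counts[prev][nxt] += 1

    probs = {}
    for prev, row in counts.items():
        total = sum(row.values())
        probs[prev] = {nxt: n / total for nxt, n in row.items()}
    return probs


def predict_next(probs, prev):
    """Return the most probable next status given the previous status."""
    row = probs[prev]
    return max(row, key=row.get)


# Toy, made-up sessions used only to exercise the functions.
sessions = [
    ["C", "C", "C", "S", "C"],
    ["C", "S", "S", "C", "C"],
]
probs = estimate_transitions(sessions)
print(probs)
print(predict_next(probs, "S"))
```

A richer state, for example a tuple of (previous status, time-interval bucket, reformulation pattern), could be substituted for the single label without changing the counting logic, at the cost of needing more labelled data per state.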