Shengbo Guo, Monday 29 March 2010

Probabilistic Latent Maximal Marginal Relevance

Diversity has been heavily motivated as an objective criterion for result sets in the information retrieval literature, and various ad-hoc heuristics have been proposed to explicitly optimize for it. In this talk, we will start from first principles and show that optimizing a simple criterion of set-based relevance in a latent variable graphical model, a framework we refer to as probabilistic latent accumulated relevance (PLAR), leads to diversity as a naturally emergent property of the solution. PLAR derives variants of latent semantic indexing (LSI) kernels for relevance and diversity and does not require ad-hoc tuning parameters to balance them. PLAR also directly motivates the general form of many other ad-hoc diversity heuristics in the literature, albeit with important modifications that we show can lead to improved performance on a diversity testbed from the TREC 6-8 Interactive Track.

I received a Bachelor's degree in computer science in 2004 and a Master's degree in pattern recognition and intelligent systems in 2007. I moved to Australia in July 2007 to begin a PhD program in computer science. I am currently a PhD candidate at the Australian National University and a Graduate Researcher at National ICT Australia.