Probabilistic Latent Maximal Marginal Relevance
Diversity has been widely advocated as an objective for result sets in the information retrieval literature, and various ad-hoc heuristics have been proposed to optimize for it explicitly. In this talk, we start from first principles and show that optimizing a simple criterion of set-based relevance in a latent variable graphical model, a framework we refer to as probabilistic latent accumulated relevance (PLAR), leads to diversity as a naturally emergent property of the solution. PLAR derives variants of latent semantic indexing (LSI) kernels for relevance and diversity and does not require ad-hoc tuning parameters to balance them. PLAR also directly motivates the general form of many diversity heuristics in the literature, albeit with important modifications that we show can improve performance on a diversity testbed from the TREC 6-8 Interactive Track.
I received a Bachelor's degree in computer science in 2004 and a Master's degree in pattern recognition and intelligent systems in 2007. I moved to Australia in July 2007 to begin a PhD program in computer science. I am currently a PhD candidate at the Australian National University and a Graduate Researcher at National ICT Australia.