Fernando Diaz
Associate Member, Microsoft

My primary research interest is information retrieval, the formal study of searching large collections of data for small pieces of information. The most familiar instance of information retrieval is web search, where users search a collection of webpages for one or a few relevant pages. Information retrieval, however, goes beyond web search and includes topics such as cross-lingual retrieval, personalization, desktop search, and interactive retrieval. My research experience includes distributed information retrieval approaches to web search, interactive and faceted retrieval, mining of temporal patterns from news and query logs, cross-lingual information retrieval, graph-based retrieval methods, and exploiting information from multiple corpora. In my dissertation work, I used methods from machine learning and statistics to study the relationship between document clustering and document scoring for retrieval. Building on that work, I developed an algorithm for system self-assessment and self-tuning that significantly improves the performance of retrieval algorithms across a variety of corpora.