Sunday, February 11, 2007

two important papers

In their discussion of the vector space model, Grossman and Frieder mention Salton's November 1975 CACM paper, "A Vector Space Model for Automatic Indexing" (Salton, Wong, and Yang). I recommend that you read it. It's available through the ACM Digital Library and through Google Scholar.

They also mention "Pivoted Document Length Normalization," which appeared at the 1996 SIGIR conference. The lead author is Amit Singhal. That paper is probably still the best explanation of PDLN, which was widely accepted in IR within a year or two of its introduction.
