Wednesday, January 31, 2007

The project used a few years ago.

Fall 2002 project

Notes from class 1/29

As usual, setting up the computer support took some time. I should plan time for getting my laptop booted up and connected. Now that I know how to use Skype better, talking to remote students should be easier.

I asked people to look at the project that was used a few years ago. I also asked people to try to post a comment to the blog, but I haven't seen anything yet.

On Wednesday 1/31, I'll take some Polaroids! If I can get Google Talk to work on my laptop, that will be good.

If this course were going to be on-line or hybrid, access to the lecture slides would obviously be critical. A set time for conference calls with students might be a good idea, although awkward for students in distant time zones. Maybe Skype could be used to record the lecture - but breaking it up slide by slide would help even more.

So Wednesday evening I plan to take pictures and finish the slides started last time. We'll also discuss the project.

Tuesday, January 9, 2007


Information Retrieval

MW 5:30-6:45pm
Room to be announced
Dr. Charles Nicholas
nicholas@umbc.edu

Dr. Ian Soboroff taught this course a few years back, and I like his introduction:

This course is an introduction to the theory and implementation of software systems designed to search through large collections of text. Ever wonder how World-Wide Web search engines work? Ever wondered why they don't? You'll learn about it here. Information retrieval (IR) is one of the oldest branches of computer science, and has influenced nearly every aspect of computer usage: "search and replace" in a word processor, querying a card catalog, grep'ing through your source code, filtering the spam out of your email, searching the Web.

This course will have two main thrusts. The first is to cover the fundamentals of IR: retrieval models, search algorithms, and IR evaluation. The second is to give a taste of the implementation issues by having you write (a good chunk of) your own text search engine and test it out on a sample text collection. This will be a semester-long project, details TBA.

You will need to have taken the equivalent of CMSC 341 (Data Structures), and an algorithms course (441 or 641) is recommended. Linear algebra (MATH 221) and Statistics (STAT 355) are recommended but not required; they give background which will be helpful in understanding many IR concepts.

Text
The text will be Grossman and Frieder, available at the UMBC bookstore (at least it's been ordered) as well as Amazon. We will follow this book fairly closely. Details about which chapters will be covered, and when, will follow. Other readings will be assigned, and made available.

Grading
There will be a multi-phase programming project, details to be announced, worth about 50% of the grade. Homework will be another 25%. There will also be a writing project, worth 25%. Presentations on the programming project will take the place of the final exam.

Academic Integrity
"By enrolling in this course, each student assumes the responsibilities of an active participant in UMBC's scholarly community in which everyone's academic work and behavior are held to the highest standards of honesty. Cheating, fabrication, plagiarism, and helping others to commit these acts are all forms of academic dishonesty, and they are wrong. Academic misconduct could result in disciplinary action that may include, but is not limited to, suspension or dismissal. To read the full Student Academic Conduct Policy, consult the UMBC Student Handbook, the Faculty Handbook, or the UMBC Policies section of the UMBC Directory [or for graduate courses, the Graduate School website]."

Welcome

Welcome to CS 676, a course on information retrieval!

It's not enough for a course to have a web site. Nowadays you must have a blog as well! So this blog will be used for much of the official and unofficial communication related to the course.

Only students in the class and I will be able to post to the blog. Anyone may read it.