Homework 2: Retrieval with TF-IDF / BM25 / LSA, Word2Vec (CBOW), and MAP Evaluation
Colab notebook: https://colab.research.google.com/drive/1PcQiQZ0YrkuloSokpMq2WSpQ7LXPqc-l?usp=sharing
This homework has three tasks:
(1) Ranked retrieval with TF-IDF, BM25, and LSA (Truncated SVD) on a sampled subset of 20 Newsgroups (6 pts)
(2) Word2Vec (CBOW) with PyTorch, trained on the same sampled subset (7 pts)
(3) Mean Average Precision (MAP) implementation, plus an evaluation of LSA vs. CBOW on the test split (7 pts)
BONUS: Solve any task with an LLM (2 pts)
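As a reminder of what task (3) measures, here is a minimal sketch of MAP under stated assumptions. It assumes binary relevance labels, listed in ranked order for each query, and it normalizes each query's average precision by the number of relevant documents that appear in that ranking. The function names `average_precision` and `mean_average_precision` are illustrative only and are not taken from the notebook, which may define relevance and normalization differently; follow the notebook's definition where the two disagree.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """AP for one query; relevance[i] is 1 if the document at rank i+1 is relevant."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # Assumption: normalize by the relevant documents found in the ranking.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[int]]) -> float:
    """Mean of the per-query AP values; 0.0 for an empty set of queries."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

For example, a ranking with relevant documents at ranks 1 and 3 has precision 1/1 at rank 1 and 2/3 at rank 3, so its average precision is (1 + 2/3) / 2.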
Please complete all tasks in the provided Colab notebook.
Work through each section carefully, following the instructions and filling in the required code or explanations where prompted.
Once you finish:
- Make sure your notebook is fully executed before you submit it. Even if you started from a copy of the starter notebook, run every cell (Shift+Enter runs the current cell; Runtime > Run all runs the whole notebook) so that all outputs are visible.
- Download your completed notebook as a .ipynb file.
- Submit the file via the course submission portal.
- Double-check that all your results and written answers are saved before uploading.
For step-by-step submission guidance, please refer to the Homework Submission Steps.