Exploration of Recommendation Systems
BU's EC 503 Class Project Submission.
 Dataset Used
 Tools Used
 Hyperparameters Used
 Data Preprocessing
 Experiments
 Results
 What we learned?
 References :
Dataset Used

Books Recommendation Dataset

Artificial Dataset
Tools Used

Python programming language to perform the experiments.

scikit-learn (Sklearn) for modelling SVD.

Grid search for hyperparameter tuning.

Matplotlib to build plots that analyze and showcase the results of the experiments.

Pandas and NumPy for data preprocessing.
Hyperparameters Used

Training and Testing Split: 80%, 20%

num_of_components (top 'r' singular values) = 250

iterations=20
Data Preprocessing
 Dataset is 97% sparse.
 Ratings: (1-10)
 Final shape: (500, 551) -> (Users, Books) rating matrix

Kept only users who have rated more than 300 books.

Kept only books that have at least 50 ratings from users.

Fill empty/NaN (Not a Number) values with zeros.

Remove duplicates (ensures unique users and books).
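
The preprocessing steps above could be sketched with Pandas roughly as follows. This is a minimal illustration, not the project's actual code: the column names `user_id`, `book_id`, and `rating`, the function name `build_rating_matrix`, and the order in which the filters run are all assumptions. The toy thresholds are tiny so the effect of each filter is visible.

```python
import pandas as pd


def build_rating_matrix(ratings, min_user_ratings=300, min_book_ratings=50):
    """Filter a long-format ratings table and pivot it to a users x books matrix.

    The column names `user_id`, `book_id`, `rating` are hypothetical.
    """
    # Remove duplicate (user, book) pairs so each pair carries one rating.
    ratings = ratings.drop_duplicates(subset=["user_id", "book_id"])

    # Keep users who rated more than `min_user_ratings` books.
    user_counts = ratings["user_id"].value_counts()
    active_users = user_counts[user_counts > min_user_ratings].index
    ratings = ratings[ratings["user_id"].isin(active_users)]

    # Keep books that received at least `min_book_ratings` ratings.
    book_counts = ratings["book_id"].value_counts()
    popular_books = book_counts[book_counts >= min_book_ratings].index
    ratings = ratings[ratings["book_id"].isin(popular_books)]

    # Pivot to a matrix and fill unrated (NaN) cells with zeros.
    return ratings.pivot(index="user_id", columns="book_id", values="rating").fillna(0)


# Toy data: u1 has a duplicated rating, u4 rated only one book.
toy = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2", "u2", "u2", "u3", "u3", "u4"],
    "book_id": ["b1", "b1", "b2", "b1", "b2", "b3", "b1", "b3", "b1"],
    "rating":  [5, 5, 7, 8, 6, 9, 4, 10, 3],
})
matrix = build_rating_matrix(toy, min_user_ratings=1, min_book_ratings=2)

# Fraction of cells that are still unknown (zero) after filtering.
sparsity = float((matrix.to_numpy() == 0).mean())
print(matrix.shape, round(sparsity, 3))
```

Note that filtering books after users can push some users back under the user threshold; whether to repeat the filters until both conditions hold is a design choice this sketch does not make.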
Experiments
Experiment 1: Finding the Best Filling Method for Dealing with Missing Values in the Sparse Matrix
1. Step 1: Create the User-Book sparse matrix. Fill unknowns with:

zeroes

column mean per user

row mean per book

column median per user

row median per book
2. Step 2: Singular Value Decomposition (SVD). Perform SVD on the matrix obtained in Step 1 and build a low-rank approximation, using a fraction of the components (250, chosen through cross-validation) to reconstruct the User-Book matrix.
3. Step 3: Evaluate results at every iteration. Compare the reconstructed (predicted) matrix with the original matrix using RMSE, MSE, MAE, Pearson coefficient, and cosine similarity.
4. Step 4: Replace with original values. Substitute the known nonzero values of the original sparse matrix into the output matrix and leave the rest untouched.
5. Go to Step 2 and repeat for 5 iterations.
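
The loop in Steps 1 to 5 can be sketched as below. This is a rough illustration under several assumptions: it uses NumPy's `np.linalg.svd` instead of Sklearn to stay dependency-light, treats zeros as "unknown", tracks only RMSE on the known entries, and uses hypothetical names (`fill_unknowns`, `low_rank`, `iterative_svd`) and method labels such as `"column_mean"`. The toy matrix and rank are far smaller than the 250 components used in the project.

```python
import warnings

import numpy as np


def fill_unknowns(R, method="zero"):
    """Step 1: fill unknown (zero) entries with zero or a column/row mean/median."""
    R = np.asarray(R, dtype=float)
    known = R != 0
    if method == "zero":
        return R.copy()
    axis_name, kind = method.split("_")          # e.g. "column_mean", "row_median"
    axis = 0 if axis_name == "column" else 1
    masked = np.where(known, R, np.nan)          # hide unknowns from the statistic
    stat_fn = np.nanmean if kind == "mean" else np.nanmedian
    with warnings.catch_warnings():
        # A column/row with no known entries yields NaN plus a warning.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        stat = stat_fn(masked, axis=axis, keepdims=True)
    stat = np.nan_to_num(stat, nan=0.0)
    return np.where(known, R, stat)


def low_rank(M, r):
    """Step 2: rank-r reconstruction from the top r singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]


def iterative_svd(R, r, n_iter=5, method="zero"):
    R = np.asarray(R, dtype=float)
    known = R != 0
    current = fill_unknowns(R, method)
    history = []
    for _ in range(n_iter):
        approx = low_rank(current, r)                                    # Step 2
        history.append(np.sqrt(np.mean((approx[known] - R[known]) ** 2)))  # Step 3
        current = np.where(known, R, approx)                             # Step 4
    return current, history


# Toy 20 x 15 matrix with ratings 1-10 and roughly half the entries unknown.
rng = np.random.default_rng(0)
R = rng.integers(1, 11, size=(20, 15)).astype(float)
R[rng.random(R.shape) < 0.5] = 0

completed, history = iterative_svd(R, r=3, n_iter=5, method="column_mean")
print([round(h, 3) for h in history])
```

Swapping `method` between `"zero"`, `"column_mean"`, `"row_mean"`, `"column_median"`, and `"row_median"` gives one error curve per filling strategy, which is the comparison this experiment describes.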
Experiment 2: Increasing Sparsity
1. Step 1: Increase sparsity. Set 300 random elements in the User-Book sparse matrix to zero.
2. Step 2: Predict. Perform SVD on the sparse matrix from Step 1 and use a fraction of the components (250, chosen through cross-validation) to get the reconstructed matrix.
3. Step 3: Replace with original values. Substitute the known elements of the original sparse matrix into the output matrix, but leave the rest untouched.
4. Step 4: Evaluate results. Compare the prediction matrix with the original matrix using RMSE, MSE, MAE, Pearson coefficient, and cosine similarity.
5. Decreasing sparsity test. Repeat steps 2, 3, 4, 5, but first fill 300 unknown values with the column-wise mean.
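
One round of the increasing-sparsity test, together with the five metrics, might look like the sketch below. Several details are assumptions rather than the project's code: the 300 elements are drawn only from known (nonzero) entries, the metrics are computed on those held-out entries, NumPy's SVD stands in for Sklearn, and the names `increase_sparsity`, `low_rank`, and `evaluate` are hypothetical.

```python
import numpy as np


def increase_sparsity(R, n_drop, rng):
    """Step 1: zero out n_drop randomly chosen known entries; also return their indices."""
    rows, cols = np.nonzero(R)
    pick = rng.choice(len(rows), size=n_drop, replace=False)
    sparser = R.copy()
    sparser[rows[pick], cols[pick]] = 0
    return sparser, (rows[pick], cols[pick])


def low_rank(M, r):
    """Step 2: rank-r reconstruction from the top r singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]


def evaluate(pred, truth):
    """Step 4: RMSE, MSE, MAE, Pearson coefficient, cosine similarity of two arrays."""
    pred = np.ravel(pred).astype(float)
    truth = np.ravel(truth).astype(float)
    err = pred - truth
    mse = float(np.mean(err ** 2))
    return {
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "mae": float(np.mean(np.abs(err))),
        "pearson": float(np.corrcoef(pred, truth)[0, 1]),
        "cosine": float(pred @ truth / (np.linalg.norm(pred) * np.linalg.norm(truth))),
    }


# Toy 50 x 40 matrix with ratings 1-10 and roughly half the entries unknown.
rng = np.random.default_rng(0)
R = rng.integers(1, 11, size=(50, 40)).astype(float)
R[rng.random(R.shape) < 0.5] = 0

sparser, held_out = increase_sparsity(R, 300, rng)     # Step 1
recon = low_rank(sparser, r=10)                        # Step 2
recon = np.where(sparser != 0, sparser, recon)         # Step 3: keep known values
scores = evaluate(recon[held_out], R[held_out])        # Step 4 on held-out entries
print({k: round(v, 3) for k, v in scores.items()})
```

Evaluating on the held-out entries is one reasonable reading of "compare prediction matrix and original matrix"; comparing the full matrices instead only requires passing `recon` and `R` to `evaluate`.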
Experiment 3: Artificial Dataset Creation
1. Create a low-rank (250 components) reconstruction using SVD from the Books-Users sparse matrix (iterations = 1).
2. Add zero-mean Gaussian random noise to every element.
3. Round up to the nearest integers in the range 1-10.
4. Randomly drop 20% of rows & columns.
5. Increase the sparsity of the artificial dataset: increase sparsity by 10% in every iteration (by randomly setting elements to zero).
6. Perform SVD and evaluate the results.
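
A possible sketch of the artificial-dataset pipeline is shown below. It is only an approximation of the steps listed: the noise standard deviation is not given in this write-up, so `noise_std` is a made-up parameter; rounding uses `np.rint` (nearest integer) followed by clipping to 1-10, where `np.ceil` would be the alternative reading of "round up"; each sparsity level is applied independently to the full artificial matrix; and the function names are hypothetical.

```python
import numpy as np


def low_rank(M, r):
    """Rank-r reconstruction from the top r singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]


def make_artificial(R, r, noise_std, rng, drop_frac=0.2):
    """Steps 1-4: low-rank reconstruction, noise, rounding, row/column dropping."""
    noisy = low_rank(R, r) + rng.normal(0.0, noise_std, size=R.shape)  # steps 1-2
    ratings = np.clip(np.rint(noisy), 1, 10)                           # step 3
    n_rows = int(round(R.shape[0] * (1 - drop_frac)))
    n_cols = int(round(R.shape[1] * (1 - drop_frac)))
    keep_rows = np.sort(rng.choice(R.shape[0], size=n_rows, replace=False))
    keep_cols = np.sort(rng.choice(R.shape[1], size=n_cols, replace=False))
    return ratings[keep_rows][:, keep_cols]                            # step 4


def sparsify(M, fraction, rng):
    """Step 5: set the given fraction of all elements to zero at random."""
    out = M.copy()
    n_zero = int(round(fraction * M.size))
    idx = rng.choice(M.size, size=n_zero, replace=False)
    out.flat[idx] = 0
    return out


# Toy 20 x 15 source matrix with ratings 1-10 and roughly half the entries unknown.
rng = np.random.default_rng(0)
R = rng.integers(1, 11, size=(20, 15)).astype(float)
R[rng.random(R.shape) < 0.5] = 0

artificial = make_artificial(R, r=3, noise_std=0.5, rng=rng)

# Steps 5-6: sweep sparsity from 0% to 90% and record the reconstruction RMSE.
results = []
for level in [i / 10 for i in range(10)]:
    sparse = sparsify(artificial, level, rng)
    rmse = float(np.sqrt(np.mean((low_rank(sparse, 3) - artificial) ** 2)))
    results.append((level, rmse))
print(results)
```

Because every entry of the artificial matrix is known (all values lie in 1-10), the error at each sparsity level can be measured against the complete matrix, which is not possible with the original dataset.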
Algorithm Used:
Results
Experiment 1: Iterative Results of SVD Recommender Prediction Algorithm using Different Filling Methods (Iterations = 5)
Experiment 2: Increasing & Decreasing Sparsity on User-Books Sparse Matrix (Iterations = 20)
Experiment 3: Increasing Sparsity on Artificial Dataset in Range 0% to 90%
What we learned?

As the user-item dataset matrix becomes sparser, the recommendations become less accurate.

Different strategies to fill in unknown values in the user-item matrix

Iterative SVD algorithm for recommendation
References :
1. Various Implementations of Collaborative Filtering.
3. SVD