4 May 2016 • on Final

ParaRec Final Report

Summary

We have implemented and optimized a parallel collaborative filtering engine which features:

A compact data structure and related algorithms that we designed ourselves (we have not come across any data compression work in the collaborative filtering papers we read)
A multi-GPU, matrix-based solution
A parallel locality-sensitive hashing (LSH) preprocessing algorithm for user clustering
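
To give a rough idea of what LSH-based user clustering looks like, here is a minimal sketch. It assumes random-hyperplane (sign) hashing over dense rating vectors, which is only one possible LSH family and not necessarily the one we used; the names `makeHyperplanes`, `signature`, and `clusterUsers` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_map>
#include <vector>

using Vec = std::vector<float>;

// Hypothetical helper: draw numBits (at most 32) random hyperplanes,
// each with one Gaussian weight per item.
std::vector<Vec> makeHyperplanes(int numBits, int numItems, unsigned seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<float> gauss(0.0f, 1.0f);
    std::vector<Vec> planes(numBits, Vec(numItems));
    for (auto& plane : planes)
        for (auto& w : plane) w = gauss(rng);
    return planes;
}

// Bit b of the signature is set when the rating vector lies on the
// non-negative side of hyperplane b.
std::uint32_t signature(const Vec& ratings, const std::vector<Vec>& planes) {
    std::uint32_t sig = 0;
    for (std::size_t b = 0; b < planes.size(); ++b) {
        float dot = 0.0f;
        for (std::size_t i = 0; i < ratings.size(); ++i)
            dot += ratings[i] * planes[b][i];
        if (dot >= 0.0f) sig |= (1u << b);
    }
    return sig;
}

// Group user indices by signature. Users whose rating vectors point in
// similar directions are likely to share a bucket.
std::unordered_map<std::uint32_t, std::vector<int>>
clusterUsers(const std::vector<Vec>& users, const std::vector<Vec>& planes) {
    std::unordered_map<std::uint32_t, std::vector<int>> buckets;
    for (std::size_t u = 0; u < users.size(); ++u)
        buckets[signature(users[u], planes)].push_back(static_cast<int>(u));
    return buckets;
}
```

Each user's signature depends only on that user's ratings and the shared hyperplanes, so the signature computation can run in parallel across users; only the final bucketing step needs coordination.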

Our deliverables include:

Performance graphs for the multiple algorithms we have implemented.

Here is a link to our final presentation slides:

Slides

Background, Approach, and Results

We have put up a separate writeup for each optimization technique we used. Detailed explanations of the optimization techniques, algorithms, designs, and results can be found in the following posts:

More Results

We measured performance using CycleTimer.h, the same timer used in all previous assignments.
For the matrix solution, the precise setup is in the slides above. The compressed data structure and nearest neighbor search implementations were tested on the latedays cluster.
Graphs can be found in those reports.
We used MovieLens datasets ranging from 100K to 10M ratings.
For a discussion of what limits our speedup, please read the reports above.
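
As an illustration of the kind of wall-clock measurement described above, here is a minimal sketch. It uses `std::chrono` as a stand-in so that it compiles on its own; it is not the CycleTimer.h code, and the helper name `minSeconds` is hypothetical.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: run `work` `runs` times and return the fastest
// wall-clock time in seconds. Taking the minimum reduces noise from
// other jobs on a shared machine. Returns -1.0 if runs <= 0.
double minSeconds(const std::function<void()>& work, int runs) {
    double best = -1.0;
    for (int r = 0; r < runs; ++r) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto end = std::chrono::steady_clock::now();
        double elapsed = std::chrono::duration<double>(end - start).count();
        if (best < 0.0 || elapsed < best) best = elapsed;
    }
    return best;
}
```

Speedup is then the baseline time divided by the optimized time, with both measured the same way on the same dataset.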

References

List of work

We each did an equal amount of work on this project.