Matrix factorization is a technique for reducing the size of the User-Item matrix (or any matrix, for that matter) by identifying its factors.

For example, consider the matrix below:

$$ \left( \begin{matrix}3 & 4 & 5 \\ 6 & 8 & 10 \end{matrix} \right) = \left( \begin{matrix}1 \\ 2 \end{matrix} \right) \left( \begin{matrix}3 & 4 & 5 \end{matrix} \right) $$

The final result is a 2 × 3 matrix, which is the product of two smaller matrices: Part A (2 × 1) and Part B (1 × 3).

Now let’s say we have 200 movies and 100 users. What would be the size of the User-Item matrix? It would have 100 rows and 200 columns, which is 20,000 cells.

Imagine the scale for a large organization, where users and movies number far more than a couple of hundred. This creates a significant data storage cost and a performance bottleneck as the size keeps increasing. The tables below show how each single increase in users or movies affects the size of the matrix.

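Increase in users by 1:

| Users | Movies | User-Item Matrix Size |
|-------|--------|-----------------------|
| 100   | 200    | 20000                 |
| 101   | 200    | 20200                 |
| 102   | 200    | 20400                 |
| 103   | 200    | 20600                 |
| 104   | 200    | 20800                 |
| 105   | 200    | 21000                 |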

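Increase in movies by 1:

| Users | Movies | User-Item Matrix Size |
|-------|--------|-----------------------|
| 100   | 200    | 20000                 |
| 100   | 201    | 20100                 |
| 100   | 202    | 20200                 |
| 100   | 203    | 20300                 |
| 100   | 204    | 20400                 |
| 100   | 205    | 20500                 |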

In contrast, we can identify some attributes (features) and describe how movies and users relate to those attributes, which reduces the size of what we store.

Let’s say that in the above case we have 10 different attributes by which we want to group movies and users’ likings. We create a matrix of 100 users with 10 features and call it the User Feature Matrix. Similarly, we create a Movie Feature Matrix of 200 movies with 10 features, as shown below:

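  • MATRIX A: User Feature Matrix with 10 features
  • MATRIX B: Movie Feature Matrix with 10 features
  • Total Size (A + B): total number of cells stored in the system

| Users | Features | MATRIX A | Movies | Features | MATRIX B | Total |
|-------|----------|----------|--------|----------|----------|-------|
| 100   | 10       | 1000     | 200    | 10       | 2000     | 3000  |
| 101   | 10       | 1010     | 201    | 10       | 2010     | 3020  |
| 102   | 10       | 1020     | 202    | 10       | 2020     | 3040  |
| 103   | 10       | 1030     | 203    | 10       | 2030     | 3060  |
| 104   | 10       | 1040     | 204    | 10       | 2040     | 3080  |
| 105   | 10       | 1050     | 205    | 10       | 2050     | 3100  |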
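The storage figures above follow from simple arithmetic. The sketch below compares the two approaches; the helper names `full_size` and `factored_size` are made up for this illustration.

```python
def full_size(users: int, movies: int) -> int:
    """Cells in the full User-Item matrix (users x movies)."""
    return users * movies


def factored_size(users: int, movies: int, features: int) -> int:
    """Cells in the User Feature matrix plus the Movie Feature matrix."""
    return users * features + movies * features


# Compare both representations for a few of the sizes used in the tables.
for users, movies in [(100, 200), (101, 201), (105, 205)]:
    print(users, movies, full_size(users, movies), factored_size(users, movies, 10))
```

The full matrix grows with the product of users and movies, while the factored form grows only with their sum times the number of features.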

Example

Let’s take an example of movies M1 to M5 with two features, Comedy and Horror, and give each movie a feature score based on the ratings it received or on any other criterion. Choosing these scores is the heart of the problem, and it is a very iterative process.

Final result:

Let’s compare the predicted result with the actual result:

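Then we compute the error as the sum of the squared differences between the actual and predicted values:

$$ (3 - 1.8)^2 + (2 - 1.3)^2 + (3 - 2.3)^2 + \dots $$

We take the derivative of this error and use it to adjust the User Feature and Movie Feature matrices.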
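The adjust-by-derivative step can be sketched as plain gradient descent. This is only an illustration under stated assumptions: the ratings below are hypothetical (not the values from the example above), 0 is assumed to mean "not rated", and the learning rate and step count are arbitrary choices.

```python
import numpy as np

# Hypothetical ratings for 3 users and 5 movies; 0 marks a missing rating.
ratings = np.array([
    [3.0, 2.0, 0.0, 5.0, 1.0],
    [4.0, 0.0, 3.0, 4.0, 2.0],
    [0.0, 1.0, 5.0, 2.0, 4.0],
])
observed = ratings > 0  # only rated cells contribute to the error

rng = np.random.default_rng(0)
n_features = 2  # e.g. Comedy and Horror
user_features = rng.random((3, n_features))
movie_features = rng.random((n_features, 5))


def squared_error(users: np.ndarray, movies: np.ndarray) -> float:
    """Sum of squared differences over the observed ratings."""
    diff = (ratings - users @ movies) * observed
    return float(np.sum(diff ** 2))


initial_error = squared_error(user_features, movie_features)

learning_rate = 0.005  # assumed value, small enough to keep updates stable
for _ in range(5000):
    diff = (ratings - user_features @ movie_features) * observed
    # Derivatives of the squared error with respect to each factor matrix.
    user_grad = -2 * diff @ movie_features.T
    movie_grad = -2 * user_features.T @ diff
    user_features -= learning_rate * user_grad
    movie_features -= learning_rate * movie_grad

final_error = squared_error(user_features, movie_features)
print(initial_error, final_error)
```

Each pass nudges both feature matrices in the direction that lowers the error on the known ratings, which is the iterative process described above.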

In the end, the two feature matrices can be used to predict the empty cells of the User-Item matrix by taking the product of the User Feature and Movie Feature matrices, so we don’t need to build the big User-Item matrix.
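As a rough sketch of that prediction step, a single cell can be computed from one user row and one movie column without ever building the full matrix. The feature values and the `predict` helper below are hypothetical.

```python
import numpy as np

# Hypothetical learned factors: 2 users x 2 features, 2 features x 3 movies.
user_features = np.array([[1.0, 0.5],
                          [0.2, 1.5]])
movie_features = np.array([[3.0, 1.0, 2.0],
                           [0.5, 2.0, 1.0]])


def predict(user: int, movie: int) -> float:
    """Predicted rating: dot product of one user row and one movie column."""
    return float(user_features[user] @ movie_features[:, movie])


print(predict(0, 1))
print(predict(1, 2))
```

Only the two small feature matrices need to be kept in memory; any cell of the large matrix is computed on demand.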
