This post is PART-2 of a two-part series. The first part, PART-1, covered matrix decomposition using Singular Value Decomposition (SVD), randomized Singular Value Decomposition (rSVD) and Nonnegative Matrix Factorization (NMF), mainly for image compression.

This PART-2 will look at video background removal using randomized SVD and robust PCA.

In this post we will try to reproduce, in R, the methodology presented in Python in Chapter 3: Background Removal with Robust PCA, from the free online course “Computational Linear Algebra for Coders” offered by Fast.ai; you can get more info about the course here. All the original material of the Fast.ai course is written in Python and can be downloaded here.

In this post I will use clear examples to show how SVD (Singular Value Decomposition), randomized SVD and robust PCA (Principal Component Analysis) can be applied to a video feed in order to remove the background from surveillance videos.

The main parts of the post include background removal of:

• B/W video using randomized SVD
• B/W video using Randomized robust PCA
• Color video using Randomized robust PCA

You can also get basic ideas about SVD in my previous post.

# Reshaping a video file

First of all, to work with video in R we need to install the Rvision package from GitHub. You can run the code below or follow the instructions on Rvision_Github. This package will allow us to work with video files.

For this post, we will use a 350-frame color video file, `video_example.mp4`, which must be located in the working directory.

The video dimensions and color space can be obtained using the functions below.

Each frame of the video has a dimension of 240 x 320 pixels. Because we loaded a 3-channel video, each frame holds 3 images of 240 x 320 pixels: one for the Red channel, one for the Green channel and one for the Blue channel. In total, the video has 350 frames x 3 channels (RGB) x 240 x 320 pixels = 80,640,000 numeric values.

Below you can find the video we just loaded.

Next, we will convert each frame to grayscale in order to simplify the example (you can find the code for color video at the end of this post).

After that, we will reshape each grayscale frame of 240 x 320 pixels into a long vector of 240 x 320 = 76800 values.

We will do this for all 350 frames, so we will have one vector per frame.

At the end, we will combine the vectors into a matrix `M` of dimensions 76800 x 350, where each column represents one frame of the video.
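The reshaping step can be sketched on a small synthetic array instead of the real video. The array `frames`, its toy dimensions and the random pixel values are assumptions made only for illustration; with the real video, each frame would come from Rvision.

```r
# Synthetic stand-in for a grayscale video: height x width x frames
# (toy sizes; the real video is 240 x 320 x 350)
set.seed(1)
height <- 4; width <- 5; n_frames <- 3
frames <- array(runif(height * width * n_frames),
                dim = c(height, width, n_frames))

# One column per frame: each frame is flattened column by column
M <- matrix(0, nrow = height * width, ncol = n_frames)
for (i in seq_len(n_frames)) {
  M[, i] <- as.vector(frames[, , i])
}

dim(M)  # 20 rows (pixels) by 3 columns (frames)
```

With the real video the same loop would give 76800 rows and 350 columns.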

The size in memory of the matrix is:
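As a rough cross-check, the footprint of a 76800 x 350 matrix can be estimated by hand. This sketch assumes the values are stored as 8-byte doubles, which may not match how the frames were actually stored; the small matrix `small` is a hypothetical stand-in used only to check that assumption.

```r
# Estimated size of a 76800 x 350 matrix, assuming 8-byte doubles
n_values <- 76800 * 350
size_mib <- n_values * 8 / 1024^2
size_mib  # roughly 205 MiB

# Check the bytes-per-value assumption on a much smaller matrix
small <- matrix(0, nrow = 768, ncol = 350)
bytes_per_value <- as.numeric(object.size(small)) / length(small)
bytes_per_value  # slightly above 8 because of the object header
```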

In the for loop above, we converted a grayscale video (a 3D array of height x width x frames) into a 2D matrix. Let’s see how this matrix looks when we plot it:

In the image above, the ‘y’ axis (height x width) is huge compared with the ‘x’ axis (frames), so it’s not possible to get an idea of what this 2D matrix looks like. We need to use the rasterImage() function to stretch the ‘x’ axis across the page.

The same matrix is plotted below, where the ‘x’ axis represents time (each frame) and the ‘y’ axis is the vectorized form of each frame. Now you can get an idea of the movement of the people in the video (the parabolic lines), while the background shows up as the horizontal lines.
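A minimal sketch of the stretching idea with `rasterImage()`, using a random tall matrix as a stand-in for `M`. The helper `to_unit()` and the toy data are assumptions for illustration; the point is only that the image is drawn into a fixed rectangle regardless of the matrix's aspect ratio.

```r
set.seed(1)
M <- matrix(runif(2000 * 30), nrow = 2000, ncol = 30)  # tall, thin toy matrix

# rasterImage() needs numeric values scaled to [0, 1]
to_unit <- function(x) (x - min(x)) / (max(x) - min(x))
img <- to_unit(M)

pdf(NULL)  # off-screen device, so the sketch also runs without a display
plot(c(0, 1), c(0, 1), type = "n", axes = FALSE,
     xlab = "frame", ylab = "pixel")
# Stretch the matrix over the whole plot region, whatever its shape
rasterImage(img, xleft = 0, ybottom = 0, xright = 1, ytop = 1,
            interpolate = FALSE)
dev.off()
```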

We can check the transformation by reverting the process for a specific frame. For example, we can recover frame number 250 by reshaping column 250 of the matrix `M` from a vector back into a matrix. The example follows.
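The round trip can be sketched on toy data as follows. The sizes and the frame index are assumptions chosen for illustration; for the real video the index would be 250 and the frame size 240 x 320.

```r
set.seed(1)
height <- 4; width <- 5; n_frames <- 3
frames <- array(runif(height * width * n_frames),
                dim = c(height, width, n_frames))

# Flatten: one column per frame
M <- matrix(frames, nrow = height * width, ncol = n_frames)

# Revert: reshape one column back into a height x width frame
frame_id <- 2
recovered <- matrix(M[, frame_id], nrow = height, ncol = width)

# The recovered frame should equal the original one
all.equal(recovered, frames[, , frame_id])
```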

# Randomized SVD (video B/W)

Now that we have learned to reshape a video file into a 2D matrix, we can apply matrix decomposition methods (see PART-1) for video background removal.

We will start by applying randomized SVD to the matrix `M`. For this example we will use the rsvd package with a low-rank decomposition value of k=2.

The dimensions of the SVD decomposed matrices are:
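To make the shapes concrete, here is a minimal base-R sketch of a basic randomized SVD (random projection, a few power iterations, then an exact SVD of a small matrix). It is not the rsvd package's implementation; the function `randomized_svd()`, its parameters `p` and `q`, and the toy matrix are assumptions for illustration.

```r
# Basic randomized SVD sketch (not the rsvd package's code)
randomized_svd <- function(A, k, p = 10, q = 2) {
  l <- min(k + p, nrow(A), ncol(A))          # target rank plus oversampling
  Omega <- matrix(rnorm(ncol(A) * l), nrow = ncol(A), ncol = l)
  Y <- A %*% Omega                           # sample the column space of A
  for (i in seq_len(q)) {
    Y <- A %*% crossprod(A, Y)               # power iteration: A (A^T Y)
  }
  Q <- qr.Q(qr(Y))                           # orthonormal basis, m x l
  s <- svd(crossprod(Q, A))                  # exact SVD of the small l x n matrix
  list(u = (Q %*% s$u)[, 1:k, drop = FALSE],
       d = s$d[1:k],
       v = s$v[, 1:k, drop = FALSE])
}

set.seed(1)
M <- matrix(runif(200 * 30), nrow = 200, ncol = 30)  # toy "pixels x frames"
res <- randomized_svd(M, k = 2)

dim(res$u)     # pixels x k
length(res$d)  # k singular values
dim(res$v)     # frames x k
```

For the real 76800 x 350 matrix and k=2, the same shapes would be 76800 x 2 for `u`, a vector of length 2 for `d`, and 350 x 2 for `v`.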

Now, we will reconstruct the video using the U, D and V decomposed matrices. For that we will apply the formula:

M_recovery = U · D · V^T (see PART-1 for details)

Because we used a low-rank value (k=2), the reconstructed 2D matrix (M_recovery) is not expected to match the original 2D matrix (M) exactly. Instead, we will get a matrix that generalizes each frame of the video and focuses on the static pixels. In other words, ‘M_recovery’ will capture the video background and leave out the moving objects.

Let’s plot the reconstruction from the decomposed matrices. As you can see below, there is no movement in the next image:

As we can see, the dimensions of the reconstructed 2D matrix `rSVD_k2_re` match those of the original `M` matrix:

If we plot a frame (e.g. frame 250) of the recovery matrix, we will get the background without moving objects or people (below, on the left side of the image). We can also subtract the recovery matrix from the original matrix in order to get only the moving objects or people (below, on the right side of the image).
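The reconstruction and subtraction can be sketched on a toy "video" with a static background and one moving blob. For simplicity this sketch uses base R's exact `svd()` truncated to rank k rather than randomized SVD; the helper `low_rank()` and all the toy data are assumptions for illustration.

```r
# Rank-k reconstruction U D V^T from an exact SVD
low_rank <- function(A, k) {
  s <- svd(A, nu = k, nv = k)
  s$u %*% diag(s$d[1:k], nrow = k) %*% t(s$v)
}

# Toy video: 10 x 10 pixel frames, 20 frames, static background
set.seed(1)
height <- 10; width <- 10; n_frames <- 20
background <- runif(height * width)
M <- background %o% rep(1, n_frames)
for (j in seq_len(n_frames)) {
  idx <- ((j - 1) * 5 + 1):(j * 5)   # a 5-pixel blob that moves each frame
  M[idx, j] <- M[idx, j] + 1
}

M_recovery <- low_rank(M, k = 2)     # approximates the background
M_moving   <- M - M_recovery         # what is left: mostly the moving blob

# Reshape one column of each matrix back into a frame
frame_id <- 10
bg_frame <- matrix(M_recovery[, frame_id], nrow = height, ncol = width)
fg_frame <- matrix(M_moving[, frame_id], nrow = height, ncol = width)
```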

Below we perform a small analysis of what happens for different k values. As you can see in the images below, the best value for video background removal is k = 2. This is because we don’t want the reconstructed matrix to equal the original matrix; in fact, we want the opposite, so that the moving objects are left out of the reconstructed background. You can go to PART-1 for more details about the k value.
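The effect of k can be illustrated numerically on toy data: as k grows, the rank-k reconstruction gets closer to the original matrix, so more of the movement leaks into the "background". The toy video and the chosen k values are assumptions for illustration, and the exact `svd()` is used in place of randomized SVD.

```r
# Toy video: static background plus a blob that moves each frame
set.seed(1)
n_pix <- 100; n_frames <- 20
background <- runif(n_pix)
M <- background %o% rep(1, n_frames)
for (j in seq_len(n_frames)) {
  idx <- ((j - 1) * 5 + 1):(j * 5)
  M[idx, j] <- M[idx, j] + 1
}

s <- svd(M)
k_values <- c(1, 2, 5, 10, 20)

# Relative reconstruction error (Frobenius norm) for each k
rel_error <- sapply(k_values, function(k) {
  recon <- s$u[, 1:k, drop = FALSE] %*%
    diag(s$d[1:k], nrow = k) %*%
    t(s$v[, 1:k, drop = FALSE])
  norm(M - recon, "F") / norm(M, "F")
})
names(rel_error) <- paste0("k=", k_values)
rel_error  # shrinks as k grows; at full rank the video is reproduced
```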

# Randomized robust PCA (video B/W)

Now we will perform the same procedure, but instead of rSVD we will use randomized robust principal component analysis. Note that there is a mathematical relation between SVD and Principal Component Analysis (PCA), but it is outside the scope of this post; you can search for it on the web.

Robust PCA decomposes a matrix into two matrices, L and S, whose sum gives back the original matrix:

M = L + S

• M is the original matrix
• L is low-rank
• S is sparse

The term low-rank means that the matrix has a lot of redundant information; in our example that’s the background. Sparse refers to a matrix with mostly zero entries; in our example that’s the foreground, i.e. the moving people (in the case of corrupted video data, the sparse matrix captures the corrupted data).

Next, we will pass the original 2D video matrix `M` to the rrpca() function from the rsvd package.
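What "low-rank" and "sparse" mean can be seen by building a toy `M` from a known `L` and `S`. This sketch only illustrates the two definitions; it does not run robust PCA, and all the data are assumptions for illustration.

```r
set.seed(1)
n_pix <- 50; n_frames <- 20

# Low-rank part: the same background repeated in every frame
background <- runif(n_pix)
L <- background %o% rep(1, n_frames)

# Sparse part: only 30 of the 1000 entries are non-zero
S <- matrix(0, nrow = n_pix, ncol = n_frames)
S[sample(length(S), 30)] <- 1

M <- L + S

qr(L)$rank    # 1: every column carries the same information
mean(S == 0)  # 0.97: almost all entries are zero
```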
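A hedged sketch of the call on toy data, assuming the rsvd package is installed and that `rrpca()` returns a list with components `L` and `S`, as its documentation describes. The toy matrix and the object name `rPCA` are assumptions for illustration; the call is skipped when the package is missing.

```r
# Toy video: static background plus a few bright "moving" entries
set.seed(1)
n_pix <- 200; n_frames <- 40
background <- runif(n_pix)
M <- background %o% rep(1, n_frames)
M[sample(length(M), 100)] <- 2

if (requireNamespace("rsvd", quietly = TRUE)) {
  rPCA <- rsvd::rrpca(M)   # default settings
  print(dim(rPCA$L))       # low-rank part, same shape as M
  print(dim(rPCA$S))       # sparse part, same shape as M
} else {
  message("Package 'rsvd' is not installed; skipping the decomposition.")
}
```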

Next, we will recover the same frame number 250 from the decomposed matrices. On the left side of the image below you can see the low-rank matrix (background). On the right side is the sparse matrix (movement).

We can also create a video file by writing an image file for each frame in a for loop and then packing all the image files into an .mp4 video file. To do that you must install ffmpeg; on macOS this is as easy as typing in the Terminal:

On Windows check out here.

Executing the code below we will create the images of the frames and then the video in black and white:
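One possible shape for this step, sketched on three random frames. The file names (`frame_%04d.png`, `video_bw.mp4`), the frame rate and the output folder are hypothetical choices for illustration; the ffmpeg call only runs if ffmpeg is found on the PATH and the PNG files were written.

```r
set.seed(1)
height <- 240; width <- 320; n_frames <- 3
frames <- array(runif(height * width * n_frames),
                dim = c(height, width, n_frames))

out_dir <- file.path(tempdir(), "frames")
dir.create(out_dir, showWarnings = FALSE)
frame_files <- file.path(out_dir,
                         sprintf("frame_%04d.png", seq_len(n_frames)))

# Write one PNG per frame (skipped if this R build lacks PNG support)
if (capabilities("png")) {
  for (i in seq_len(n_frames)) {
    png(frame_files[i], width = width, height = height)
    par(mar = c(0, 0, 0, 0), xaxs = "i", yaxs = "i")
    plot(c(0, 1), c(0, 1), type = "n", axes = FALSE, xlab = "", ylab = "")
    rasterImage(frames[, , i], 0, 0, 1, 1, interpolate = FALSE)
    dev.off()
  }
}

# Pack the numbered images into an .mp4 with ffmpeg
ffmpeg_args <- c("-y", "-framerate", "25",
                 "-i", shQuote(file.path(out_dir, "frame_%04d.png")),
                 "-pix_fmt", "yuv420p",
                 shQuote(file.path(out_dir, "video_bw.mp4")))
if (nzchar(Sys.which("ffmpeg")) && all(file.exists(frame_files))) {
  system2("ffmpeg", ffmpeg_args)
}
```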

Finally, here you will find the decomposed video:

# Randomized robust PCA (color video)

In the code above we simplified the example by working with a grayscale video. Let’s make it a little more complex by working with color video.

In this case we will have a taller 2D matrix: its dimensions will be 230400 x 350, instead of the 76800 x 350 of the grayscale example. Note that 230400 comes from 240 x 320 x 3 color channels.
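The color reshaping can be sketched on a small 4D array. The array `video`, its layout (height x width x channel x frame) and the toy sizes are assumptions for illustration; the real video may be laid out differently when read with Rvision.

```r
set.seed(1)
height <- 4; width <- 5; channels <- 3; n_frames <- 6
video <- array(runif(height * width * channels * n_frames),
               dim = c(height, width, channels, n_frames))

# One column per frame; the three channels are stacked within each column
M_color <- matrix(video, nrow = height * width * channels, ncol = n_frames)
dim(M_color)  # 60 rows by 6 columns

# Row count for the real video
240 * 320 * 3  # 230400
```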

We can see below the movement of people on the new 2D color matrix.

We can also recover any frame from the 2D video matrix in order to check the correct reshaping process.
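Recovering a color frame is the same reshape in reverse, this time into a 3D array. As before, the toy sizes and the height x width x channel x frame layout are assumptions for illustration.

```r
set.seed(1)
height <- 4; width <- 5; channels <- 3; n_frames <- 6
video <- array(runif(height * width * channels * n_frames),
               dim = c(height, width, channels, n_frames))
M_color <- matrix(video, nrow = height * width * channels, ncol = n_frames)

# Reshape one column back into a height x width x channel frame
frame_id <- 4
frame <- array(M_color[, frame_id], dim = c(height, width, channels))

# The recovered frame should equal the original one
all.equal(frame, video[, , , frame_id])
```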

Finally, we will run the `rrpca()` function on the full-color version of our matrix. This will create the `rPCA_k2` object, which contains the low-rank matrix and the sparse matrix.

Next, we can check the background removal and foreground removal for the color frame number 250.

And finally, we will create a video of the color version that includes all the frames of the original video.

Next, you can find the final video `video_decomposed_color.mp4`.

That’s the end of PART-2. I hope it was interesting. Leave some comments if you want!

Session Info:

Appendix, all the code:
