Este post corresponde a la segunda parte de dos posts. Puedes leer la primera parte en PARTE-1, ésta ha repasado la descomposición de matrices usando Singular Vector Decomposition (SVD), randomized Singular Vector Decomposition (rSVD) y Nonnegative Matrix Factorization (NMF) principalmente para la compresión de imagenes.

Esta segunda parte, analizará cómo realizar la eliminación del fondo de vídeos mediante randomized SVD y robust PCA.

En este post por tanto, trataremos de reproducir la metodología y el proceso aprendido con python en el Capítulo 3: Background Removal with Robust PCA, del curso online gratuito: “Computational Linear Algebra for Coders” ofrecido gratuitamente por Fast.ai. Puedes obtener más información sobre el curso aquí. Todo el material original del curso Fast.ai está escrito en python y se puede descargar aquí. En este post intentaremos reproducir los mismos resultados utilizando el código R.

Utilizaré ejemplos claros sobre cómo utilizar SVD (Singular Value Decomposition), randomized SVD y robust PCA (Principal Component Analysis) aplicado a vídeo para eliminar el fondo de los vídeos cómo lo pueden ser los de videovigilancia.

Las partes principales del post incluyen la eliminación del fondo de:

Vídeo en blanco y negro utilizando SVD y robust SVD
Vídeo en blanco y negro utilizando Randomized robust PCA
Vídeo en color utilizando Randomized robust PCA

También puedes obtener ideas básicas sobre la SVD en mi post anterior.

Transformando el archivo de vídeo

En primer lugar, para trabajar con vídeo en R necesitamos instalar el paquete Rvision de Github, puedes ejecutar el código de abajo o seguir las instrucciones de Rvision_Github. Este paquete nos permitirá trabajar con archivos de vídeo.

install.packages("devtools")
devtools::install_github("swarm-lab/Rvision")

install.packages("devtools")

devtools::install_github("swarm-lab/Rvision")

Para este post, usaremos un archivo de video a color de 350 fotogramas video_example.mp4 que debe estar ubicado en el directorio de trabajo.

path_to_video <- "video_example.mp4"
path_to_video

path_to_video <- "video_example.mp4"

path_to_video

## [1] "video_example.mp4"

1	## [1] "video_example.mp4"

Las dimensiones del vídeo y el espacio de color se pueden obtener utilizando las siguientes funciones.

library(Rvision)
my_video <- video(filename=path_to_video)
##
# video dimensions (height x width x frames):
dim(my_video)

library(Rvision)

my_video <- video(filename=path_to_video)

# video dimensions (height x width x frames):

dim(my_video)

## [1] 240 320 350

1	## [1] 240 320 350

# video codec:
codec(my_video)

# video codec:

codec(my_video)

## [1] ""

## [1] ""

# framerate:
fps(my_video)

# framerate:

fps(my_video)

## [1] 7

## [1] 7

# color space
colorspace(readFrame(my_video, 1))

# color space

colorspace(readFrame(my_video, 1))

## [1] "BGR"

1	## [1] "BGR"

Cada fotograma de vídeo tiene una dimensión de 240 x 320 píxels, además hay 3 imágenes de 240 x 320 píxels para cada fotograma ya que el vídeo es en color y por lo tanto contiene 3 canales (RGB), uno para el canal Rojo, otro para el canal Verde, y el último para el canal Azul. Así, en total disponemos de 350 fotogramas x 3 canales (RGB) x 240 x 320 píxels = 80640000 valores numéricos.

v_height <- dim(my_video)[1]
v_width <- dim(my_video)[2]

v_height <- dim(my_video)[1]

v_width <- dim(my_video)[2]

Abajo puedes ver el vídeo con el qual trabajaremos.

htmltools::tags$video(src=path_to_video, type="video/mp4", autoplay=NA, controls="controls", height=v_height, width=v_width)

1 2	htmltools::tags$video(src=path_to_video, type="video/mp4", autoplay=NA, controls="controls", height=v_height, width=v_width)

A continuación, convertiremos cada fotograma en una escala de grises para simplificar el ejemplo (puedes encontrar el código para trabajar con vídeo a color al final de este post).

Después de eso, transformaremos cada fotograma de 240 x 320 píxels (escala de grises) en un vector largo de 240 x 320 = 76800 valores.

Lo haremos para todos los 350 fotogramas, así que tendremos un vector para cada fotograma.

Al final, combinaremos los vectores-fotogramas para obtener una matriz M de dimensiones: 76800 x 350, donde cada columna representa cada fotograma del vídeo.

# resheaping all the video frames in a 2D matrix
v <- list()
num_frames <- nframes(my_video)
for(i in 1:num_frames){
      if(i/100 == i%/%100) cat("\nframe:",i,"/",num_frames)
      frame <- readFrame(my_video, i)
      frame <- changeColorSpace(frame, "GRAY")
      matrix_frame <- as.matrix(frame)
      vector_frame <- as.vector(matrix_frame)
      v[[i]] <- vector_frame

      #---------------------------------------------------------------------
      # if you want to display each frame procesed activate the 3 lines below
      #frame_temp <- frame
      #drawText(frame_temp, paste0(i,"/",num_frames), 2, 2, font_scale=1.2, thickness=2, color="red")
      #Rvision::display(frame_temp, window_name="display", height=200, width=300)
}

# resheaping all the video frames in a 2D matrix

v <- list()

num_frames <- nframes(my_video)

for(i in 1:num_frames){

if(i/100 == i%/%100) cat("\nframe:",i,"/",num_frames)

frame <- readFrame(my_video, i)

frame <- changeColorSpace(frame, "GRAY")

matrix_frame <- as.matrix(frame)

vector_frame <- as.vector(matrix_frame)

v[[i]] <- vector_frame

#---------------------------------------------------------------------

# if you want to display each frame procesed activate the 3 lines below

#frame_temp <- frame

#drawText(frame_temp, paste0(i,"/",num_frames), 2, 2, font_scale=1.2, thickness=2, color="red")

#Rvision::display(frame_temp, window_name="display", height=200, width=300)

}

## 
## frame: 100 / 350
## frame: 200 / 350
## frame: 300 / 350

## frame: 100 / 350

## frame: 200 / 350

## frame: 300 / 350

Rvision::destroyDisplay("display") # works with: Rvision::display()
release(my_video)

Rvision::destroyDisplay("display") # works with: Rvision::display()

release(my_video)

## Video released successfully.

1	## Video released successfully.

M <- do.call("rbind",v)

# matrix dimensions (frames | height x width):
dim(M)

M <- do.call("rbind",v)

# matrix dimensions (frames | height x width):

dim(M)

## [1]   350 76800

1	## [1] 350 76800

La matriz en memoria ocupa:

# matrix memory size:
print(object.size(M), units="Mb")

# matrix memory size:

print(object.size(M), units="Mb")

## 205.1 Mb

1	## 205.1 Mb

En la secuencia For_Loop anterior, hemos convertido un vídeo en escala de grises (es decir, una matriz multidimensional de alto x ancho x fotogramas) en una matriz 2D. Veamos cómo se ve si pintamos esta matriz:

# plot all the video frames in a 2D matrix
library(imager)
D <- as.cimg(M)
plot(D, interpolate=FALSE, xlab="frame", ylab="height_x_width")

# plot all the video frames in a 2D matrix

library(imager)

D <- as.cimg(M)

plot(D, interpolate=FALSE, xlab="frame", ylab="height_x_width")

En la imagen de arriba, el eje ‘y’ (alto x ancho) es enorme con respecto al eje ‘x’ (fotogramas). Así que no es posible visualizar correctamente la matriz 2D. Necesitamos usar la función rasterImage() para expandir el eje ‘x’ a lo largo de la pantalla.

Así, expandiendo esa matriz a lo largo del eje ‘x’, se observa más fácilmente la idea de movimiento del vídeo. En la imagen siguiente el eje `x’ representa el tiempo (cada fotograma), y el eje `y’ es la forma vectorizada de cada fotograma. Ahora puedes ver claramente el movimiento de la gente en el vídeo (las líneas parabólicas), y cómo el fondo que parece estar formado por líneas horizontales.

plot(c(0, dim(M)[1]), c(0, dim(M)[2]), type="n", xlab="frame", ylab="height_x_width")
rasterImage(as.raster(D), 0, 0, dim(M)[1], dim(M)[2], interpolate=FALSE)

plot(c(0, dim(M)[1]), c(0, dim(M)[2]), type="n", xlab="frame", ylab="height_x_width")

rasterImage(as.raster(D), 0, 0, dim(M)[1], dim(M)[2], interpolate=FALSE)

Podemos comprobar la transformación invirtiendo el proceso para un fotograma específico, y así tratar de recuperar el fotograma número 250 transformando la columna 250 de la matriz M de vector a matriz. A continuación se muestra el ejemplo.

# recovering a frame
REC <- matrix(M[250,], nrow=v_width, ncol=v_height, byrow=TRUE)
library(imager)
plot(as.cimg(REC))

# recovering a frame

REC <- matrix(M[250,], nrow=v_width, ncol=v_height, byrow=TRUE)

library(imager)

plot(as.cimg(REC))

Randomized SVD (video B/N)

Una vez que hemos aprendido a transformar un archivo de vídeo en una matriz 2D, ahora podemos aplicar métodos de descomposición de matrices (ver PARTE-1) para la eliminación del fondo de los vídeo.

Comenzaremos aplicando Randomized SVD en la matriz M, para este ejemplo usaremos el paquete rsvd con un valor de descomposición de bajo rango (low-rank decomposition) de k=2 sobre la matriz M.

library(rsvd)
rSVD_k2 <- rsvd(M, k=2)

library(rsvd)

rSVD_k2 <- rsvd(M, k=2)

The dimensions of the SVD decomposed matrices are:

dim(rSVD_k2$u)

1 2	dim(rSVD_k2$u)

## [1] 350   2

1	## [1] 350 2

NROW(rSVD_k2$d)

1 2	NROW(rSVD_k2$d)

## [1] 2

## [1] 2

dim(rSVD_k2$v)

1 2	dim(rSVD_k2$v)

## [1] 76800     2

1	## [1] 76800 2

Ahora, reconstruiremos el vídeo usando las matrices descompuestas U, D y V. Para ello aplicaremos la fórmula:

M_recovery = U · d · V^T (ver PARTE-1 para más detalles)

Debido a que usamos un valor de bajo rango (k=2) no se espera que la matriz 2D reconstruida (M_recovery) coincida exactamente con la matriz 2D original (M). En lugar de eso, obtendremos una matriz que generaliza cada fotograma del vídeo y se centrará en los píxels estáticos de la matriz, en pocas palabras, ‘M_recovery’ se centrará en el fondo del vídeo, y evitará los objetos en movimiento.

# reconstructing video using decomposed matrix
rSVD_k2_re <- rSVD_k2$u %*% diag(rSVD_k2$d) %*% t(rSVD_k2$v)

# reconstructing video using decomposed matrix

rSVD_k2_re <- rSVD_k2$u %*% diag(rSVD_k2$d) %*% t(rSVD_k2$v)

Si dibujamos la reconstrucción de las matrices descompuestas vemos, en la siguiente imagen, que el resultado es que no hay movimiento:

plot(c(0, dim(rSVD_k2_re)[1]), c(0, dim(rSVD_k2_re)[2]), type="n", xlab="frame", ylab="height_x_width")
rasterImage(as.raster(as.cimg(rSVD_k2_re)), 0, 0, dim(rSVD_k2_re)[1], dim(rSVD_k2_re)[2], interpolate=FALSE)

plot(c(0, dim(rSVD_k2_re)[1]), c(0, dim(rSVD_k2_re)[2]), type="n", xlab="frame", ylab="height_x_width")

rasterImage(as.raster(as.cimg(rSVD_k2_re)), 0, 0, dim(rSVD_k2_re)[1], dim(rSVD_k2_re)[2], interpolate=FALSE)

Como podemos ver, las dimensiones de la matriz 2D reconstruida rSVD_k2_re coinciden con la matriz M original:

# M_original
dim(M)

# M_original

dim(M)

## [1]   350 76800

1	## [1] 350 76800

# M_recovery
dim(rSVD_k2_re)

# M_recovery

dim(rSVD_k2_re)

## [1]   350 76800

1	## [1] 350 76800

Si mostramos un fotograma (por ejemplo el fotograma 250) de la matriz reconstruida, obtendremos una imagen con el fondo pero sin objetos o personas en movimiento (abajo a la izquierda de la imagen). También podemos restar la matriz reconstruida a la matriz original, para obtener así una imagen con sólo los objetos en movimiento o personas y sin el fondo (abajo a la derecha de la imagen).

      # recovering a frame
      FRAME <- 250
      REC_re_bg <- rSVD_k2_re[FRAME,]
      REC_re_mov <- M[FRAME,] - rSVD_k2_re[FRAME,]
      REC_re_mov <- REC_re_mov - min(REC_re_mov) # to avoid negative values
      FRAME_im <- as.cimg(matrix(c(REC_re_bg,REC_re_mov), nrow=v_width*2, ncol=v_height, byrow=TRUE))
      plot(as.cimg(FRAME_im))

# recovering a frame

FRAME <- 250

REC_re_bg <- rSVD_k2_re[FRAME,]

REC_re_mov <- M[FRAME,] - rSVD_k2_re[FRAME,]

REC_re_mov <- REC_re_mov - min(REC_re_mov) # to avoid negative values

FRAME_im <- as.cimg(matrix(c(REC_re_bg,REC_re_mov), nrow=v_width*2, ncol=v_height, byrow=TRUE))

plot(as.cimg(FRAME_im))

A continuación podemos realizar un pequeño análisis de lo que ocurre con los diferentes valores de k. Como se puede ver en las imágenes de abajo, el mejor valor para la eliminación del fondo de vídeo es k = 2, esto es porque no queremos obtener la matriz reconstruida igual a la matriz original, de hecho, queremos el efecto contrario, que se elimine el movimiento del fondo, puede ir a la PARTE-1 para más detalles sobre el valor k.

# changing k value
FRAME <- 250
par(mfrow=c(3,2), mar=c(2,2,0,0))
for(i in c(2,4,6,10,50,100)){
      rSVD_ <- rsvd(M, k=i)
      rSVD_re <- rSVD_$u %*% diag(rSVD_$d) %*% t(rSVD_$v)
      REC_re_bg <- matrix(rSVD_re[FRAME,], nrow=v_width, ncol=v_height, byrow=TRUE)
      REC_re_mov <- matrix(M[FRAME,] - rSVD_re[FRAME,], nrow=v_width, ncol=v_height, byrow=TRUE)
      REC_re_mov <- REC_re_mov - min(REC_re_mov) # to avoid negative values
      plot(imappend(list(as.cimg(REC_re_bg), as.cimg(REC_re_mov)), axis="x"), axes=FALSE)
      text(10,180,paste0("k = ",i), pos=4, col="white", cex=3)
}

# changing k value

FRAME <- 250

par(mfrow=c(3,2), mar=c(2,2,0,0))

for(i in c(2,4,6,10,50,100)){

rSVD_ <- rsvd(M, k=i)

rSVD_re <- rSVD_$u %*% diag(rSVD_$d) %*% t(rSVD_$v)

REC_re_bg <- matrix(rSVD_re[FRAME,], nrow=v_width, ncol=v_height, byrow=TRUE)

REC_re_mov <- matrix(M[FRAME,] - rSVD_re[FRAME,], nrow=v_width, ncol=v_height, byrow=TRUE)

REC_re_mov <- REC_re_mov - min(REC_re_mov) # to avoid negative values

plot(imappend(list(as.cimg(REC_re_bg), as.cimg(REC_re_mov)), axis="x"), axes=FALSE)

text(10,180,paste0("k = ",i), pos=4, col="white", cex=3)

}

Randomized robust PCA (video B/N)

Ahora vamos a realizar el mismo procedimiento pero en lugar de usar rSVD vamos a usar randomized robust principal component analysis. Ten en cuenta que existe una relación matemática entre el SVD y el Principal Component Analysis (PCA), pero no es el alcance de este post profundizar en esa relación, si estas interesado hay contenido muy bueno en Internet.

El Robust PCA descompone una matriz en dos matrices L y S, la suma de ellas da como resultado la matriz original:

M = L + S

M es la matriz original
L es la de bajo rango
S es la “sparse“

El término low-rank significa que la matriz tiene mucha información redundante, así que en nuestro ejemplo ese es el fondo del vídeo, mientras que sparse se refiere a la matriz con la mayoría de las entradas cero, así que en nuestro ejemplo corresponde a el primer plano o la gente en movimiento (en el caso de datos de vídeo corruptos la matriz sparse captura los datos corruptos).

A continuación a la matriz de vídeo 2D original M aplicaremos la función rrpca() del paquete rsvd.

library(rsvd)
# rrpca: robust principal component analysis
t1 <- Sys.time()
rPCA_k2 <- rrpca(M, maxiter=15, tol=1e-9, trace=TRUE)

library(rsvd)

# rrpca: robust principal component analysis

t1 <- Sys.time()

rPCA_k2 <- rrpca(M, maxiter=15, tol=1e-9, trace=TRUE)

## 
##  Iteration: 1  predicted rank = 1  target rank k = 2  Fro. error = 0.542590970126589
##  Iteration: 2  predicted rank = 1  target rank k = 2  Fro. error = 0.11882037441536
##  Iteration: 3  predicted rank = 1  target rank k = 2  Fro. error = 0.118820374415341
##  Iteration: 4  predicted rank = 1  target rank k = 2  Fro. error = 0.117552255682024
##  Iteration: 5  predicted rank = 1  target rank k = 2  Fro. error = 0.0777164622844202
##  Iteration: 6  predicted rank = 1  target rank k = 2  Fro. error = 0.0446984074714729
##  Iteration: 7  predicted rank = 2  target rank k = 3  Fro. error = 0.0301993087182703
##  Iteration: 8  predicted rank = 3  target rank k = 4  Fro. error = 0.0230020435092097
##  Iteration: 9  predicted rank = 5  target rank k = 3845  Fro. error = 0.0178484572936533
##  Iteration: 10  predicted rank = 8  target rank k = 9  Fro. error = 0.0139845535144859
##  Iteration: 11  predicted rank = 11  target rank k = 3851  Fro. error = 0.0104068420648371
##  Iteration: 12  predicted rank = 16  target rank k = 17  Fro. error = 0.00770065204336468
##  Iteration: 13  predicted rank = 22  target rank k = 3862  Fro. error = 0.00558375100833253
##  Iteration: 14  predicted rank = 27  target rank k = 28  Fro. error = 0.00403464775460438
##  Iteration: 15  predicted rank = 32  target rank k = 3872  Fro. error = 0.00286332386069142

## Iteration: 1 predicted rank = 1 target rank k = 2 Fro. error = 0.542590970126589

## Iteration: 2 predicted rank = 1 target rank k = 2 Fro. error = 0.11882037441536

## Iteration: 3 predicted rank = 1 target rank k = 2 Fro. error = 0.118820374415341

## Iteration: 4 predicted rank = 1 target rank k = 2 Fro. error = 0.117552255682024

## Iteration: 5 predicted rank = 1 target rank k = 2 Fro. error = 0.0777164622844202

## Iteration: 6 predicted rank = 1 target rank k = 2 Fro. error = 0.0446984074714729

## Iteration: 7 predicted rank = 2 target rank k = 3 Fro. error = 0.0301993087182703

## Iteration: 8 predicted rank = 3 target rank k = 4 Fro. error = 0.0230020435092097

## Iteration: 9 predicted rank = 5 target rank k = 3845 Fro. error = 0.0178484572936533

## Iteration: 10 predicted rank = 8 target rank k = 9 Fro. error = 0.0139845535144859

## Iteration: 11 predicted rank = 11 target rank k = 3851 Fro. error = 0.0104068420648371

## Iteration: 12 predicted rank = 16 target rank k = 17 Fro. error = 0.00770065204336468

## Iteration: 13 predicted rank = 22 target rank k = 3862 Fro. error = 0.00558375100833253

## Iteration: 14 predicted rank = 27 target rank k = 28 Fro. error = 0.00403464775460438

## Iteration: 15 predicted rank = 32 target rank k = 3872 Fro. error = 0.00286332386069142

cat("\n",round(difftime(Sys.time(), t1, units="mins"),1)," mins")

1 2	cat("\n",round(difftime(Sys.time(), t1, units="mins"),1)," mins")

## 
##  3.8  mins

1 2	## ## 3.8 mins

A continuación, recuperaremos el mismo fotograma núm. 250 de la matriz reconstruida. En el lado izquierdo de la imagen de abajo, se puede observar la matriz de bajo rango L (fondo). En el lado derecho, está la matriz sparse S (movimiento).

# recovering a split frame
FRAME <- 250
REC2_re_bg <- matrix(rPCA_k2$L[FRAME,], nrow=v_width, ncol=v_height, byrow=TRUE)
REC2_re_mov <- matrix(rPCA_k2$S[FRAME,], nrow=v_width, ncol=v_height, byrow=TRUE)
REC2_re_mov <- REC2_re_mov - min(REC2_re_mov) # to avoid negative values
plot(imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)),axis="x"))

# recovering a split frame

FRAME <- 250

REC2_re_bg <- matrix(rPCA_k2$L[FRAME,], nrow=v_width, ncol=v_height, byrow=TRUE)

REC2_re_mov <- matrix(rPCA_k2$S[FRAME,], nrow=v_width, ncol=v_height, byrow=TRUE)

REC2_re_mov <- REC2_re_mov - min(REC2_re_mov) # to avoid negative values

plot(imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)),axis="x"))

También podemos crear un archivo de vídeo si guardamos en un archivo la imagen de cada fotograma, esto se puede hacer fácilmente con una secuencia For-Loop. Luego sólo nos quedará juntar o empaquetar todos los archivos de imagenes en un único archivo de vídeo.mp4. Para ello se debe instalar ffmpeg, en MAC OSX es tan fácil como escribir en el Terminal:

> brew install ffmpeg

1 2	> brew install ffmpeg

Para Windows intenta buscar aquí.

Ejecutando el código de abajo crearemos las imágenes de los fotogramas y posteriormente el vídeo en blanco y negro:

# creating a video
#----------------------
# needs to install: ffmpeg: you can use "> brew install ffmpeg" on Terminal on MAC OS X
# creating frames
for(i in 1:dim(rPCA_k2$S)[1]){
      if(i/100 == i%/%100) cat("\nframe:",i,"/",num_frames)
      #FRAME_im <- as.cimg(matrix(c(rPCA_k2$L[i,],rPCA_k2$S[i,]), nrow=v_width*2, ncol=v_height, byrow=TRUE))
      FRAME_im_bg <- matrix(rPCA_k2$L[i,], nrow=v_width, ncol=v_height, byrow=TRUE)
      FRAME_im_mov <- matrix(rPCA_k2$S[i,], nrow=v_width, ncol=v_height, byrow=TRUE)
      FRAME_im_mov <- FRAME_im_mov - min(FRAME_im_mov) # to avoid negative values
      FRAME_im <- imappend(list(as.cimg(FRAME_im_bg), as.cimg(FRAME_im_mov)), axis="x")
      save.image(FRAME_im, file=paste0("./video_out/bw_",i,".png"))
}

# creating a video

#----------------------

# needs to install: ffmpeg: you can use "> brew install ffmpeg" on Terminal on MAC OS X

# creating frames

for(i in 1:dim(rPCA_k2$S)[1]){

if(i/100 == i%/%100) cat("\nframe:",i,"/",num_frames)

#FRAME_im <- as.cimg(matrix(c(rPCA_k2$L[i,],rPCA_k2$S[i,]), nrow=v_width*2, ncol=v_height, byrow=TRUE))

FRAME_im_bg <- matrix(rPCA_k2$L[i,], nrow=v_width, ncol=v_height, byrow=TRUE)

FRAME_im_mov <- matrix(rPCA_k2$S[i,], nrow=v_width, ncol=v_height, byrow=TRUE)

FRAME_im_mov <- FRAME_im_mov - min(FRAME_im_mov) # to avoid negative values

FRAME_im <- imappend(list(as.cimg(FRAME_im_bg), as.cimg(FRAME_im_mov)), axis="x")

save.image(FRAME_im, file=paste0("./video_out/bw_",i,".png"))

}

# creating video file
# "-pix_fmt yuv420p" args is for Apple Quicktime support
library(imager)
make.video(dname="./video_out/", fname="./video_out/video_decomposed_bw.mp4", pattern="bw_%d.png", fps=7, extra.args="-pix_fmt yuv420p")

# creating video file

# "-pix_fmt yuv420p" args is for Apple Quicktime support

library(imager)

make.video(dname="./video_out/", fname="./video_out/video_decomposed_bw.mp4", pattern="bw_%d.png", fps=7, extra.args="-pix_fmt yuv420p")

A continuación encontrarás el vídeo descompuesto:

htmltools::tags$video(src="./video_out/video_decomposed_bw.mp4", type="video/mp4", autoplay=NA, controls="controls", height=v_height, width=v_width*2)

1 2	htmltools::tags$video(src="./video_out/video_decomposed_bw.mp4", type="video/mp4", autoplay=NA, controls="controls", height=v_height, width=v_width*2)

Randomized robust PCA (vídeo a color)

En código previamente mostrado, simplifica el problema ya que trabaja con vídeo en blanco y negro, intentemos añadir un nivel más de complejidad añadiendo el color a los vídeos!!!.

En este caso, tendremos una matriz 2D más alta, esto significa que la dimensión de la matriz será 350 x 230400, en lugar de 350 x 76800 del ejemplo B/N. Ten en cuenta que 230400 viene de 240 x 320 x 3 canales de color (RGB).

library(Rvision)
my_video <- video(filename=path_to_video)
v <- list()
num_frames <- nframes(my_video)
for(i in 1:num_frames){
      if(i/100 == i%/%100) cat("\nframe:",i,"/",num_frames)
      frame <- readFrame(my_video, i)
      matrix_frame <- as.matrix(frame, nrow=240, ncol=320, byrow=FALSE)
      vector_frame <- as.vector(aperm(matrix_frame, c(2,1,3)))
      v[[i]] <- vector_frame
}

library(Rvision)

my_video <- video(filename=path_to_video)

v <- list()

num_frames <- nframes(my_video)

for(i in 1:num_frames){

if(i/100 == i%/%100) cat("\nframe:",i,"/",num_frames)

frame <- readFrame(my_video, i)

matrix_frame <- as.matrix(frame, nrow=240, ncol=320, byrow=FALSE)

vector_frame <- as.vector(aperm(matrix_frame, c(2,1,3)))

v[[i]] <- vector_frame

}

## 
## frame: 100 / 350
## frame: 200 / 350
## frame: 300 / 350

## frame: 100 / 350

## frame: 200 / 350

## frame: 300 / 350

M <- do.call("rbind",v)
release(my_video)

M <- do.call("rbind",v)

release(my_video)

## Video released successfully.

1	## Video released successfully.

# matrix dimensions (frames | height x width x 3color):
dim(M)

# matrix dimensions (frames | height x width x 3color):

dim(M)

## [1]    350 230400

1	## [1] 350 230400

Podemos ver abajo el movimiento de las personas en la nueva matriz 2D de color, esta vez incluyendo el color!!.

# plot all the video
dim(M)

# plot all the video

dim(M)

## [1]    350 230400

1	## [1] 350 230400

library(imager)
M2 <- array(M, dim=c(dim(M)[1],v_width*v_height,3))
dim(M2)

library(imager)

M2 <- array(M, dim=c(dim(M)[1],v_width*v_height,3))

dim(M2)

## [1]   350 76800     3

1	## [1] 350 76800 3

D <- as.cimg(M2, x=dim(M2)[1], y=dim(M2)[2], cc=3)
plot(c(0, dim(M2)[1]), c(0, dim(M2)[2]), type="n", xlab="frame", ylab="height_x_width")
rasterImage(as.raster(D), 0, 0, dim(M2)[1], dim(M2)[2], interpolate=FALSE)

D <- as.cimg(M2, x=dim(M2)[1], y=dim(M2)[2], cc=3)

plot(c(0, dim(M2)[1]), c(0, dim(M2)[2]), type="n", xlab="frame", ylab="height_x_width")

rasterImage(as.raster(D), 0, 0, dim(M2)[1], dim(M2)[2], interpolate=FALSE)

También podemos reconstruir cualquier fotograma de la matriz de vídeo 2D para comprobar que el proceso de transformación sea correcto.

# recovering an original frame
REC <- array(M2[250,,], dim=c(v_width,v_height,1,3))
dim(REC)

# recovering an original frame

REC <- array(M2[250,,], dim=c(v_width,v_height,1,3))

dim(REC)

## [1] 320 240   1   3

1	## [1] 320 240 1 3

library(imager)
plot(as.cimg(REC))

library(imager)

plot(as.cimg(REC))

Finalmente, utilizaremos la función rrpca() para la versión a todo color de nuestra matriz. Esto creará el objeto rPCA_k2 que contendrá la matriz low-rank y la matriz sparse.

library(rsvd)
# rrpca: robust principal component analysis
t1 <- Sys.time()
rPCA_k2 <- rrpca(M, maxiter=15, tol=1e-9, trace=TRUE)

library(rsvd)

# rrpca: robust principal component analysis

t1 <- Sys.time()

rPCA_k2 <- rrpca(M, maxiter=15, tol=1e-9, trace=TRUE)

## 
##  Iteration: 1  predicted rank = 1  target rank k = 2  Fro. error = 0.543081549596879
##  Iteration: 2  predicted rank = 1  target rank k = 2  Fro. error = 0.121377101431907
##  Iteration: 3  predicted rank = 1  target rank k = 2  Fro. error = 0.12137710143193
##  Iteration: 4  predicted rank = 1  target rank k = 2  Fro. error = 0.119268896142694
##  Iteration: 5  predicted rank = 1  target rank k = 2  Fro. error = 0.0775172016215707
##  Iteration: 6  predicted rank = 1  target rank k = 2  Fro. error = 0.0456250775118491
##  Iteration: 7  predicted rank = 2  target rank k = 3  Fro. error = 0.03159229215968
##  Iteration: 8  predicted rank = 3  target rank k = 4  Fro. error = 0.0242080663974625
##  Iteration: 9  predicted rank = 5  target rank k = 11525  Fro. error = 0.0186652453361901
##  Iteration: 10  predicted rank = 8  target rank k = 9  Fro. error = 0.0145118074287039
##  Iteration: 11  predicted rank = 12  target rank k = 11532  Fro. error = 0.0108247928491098
##  Iteration: 12  predicted rank = 17  target rank k = 18  Fro. error = 0.0079549172034063
##  Iteration: 13  predicted rank = 22  target rank k = 11542  Fro. error = 0.00581362182550793
##  Iteration: 14  predicted rank = 28  target rank k = 29  Fro. error = 0.00424831948772267
##  Iteration: 15  predicted rank = 34  target rank k = 11554  Fro. error = 0.00304577578151052

## Iteration: 1 predicted rank = 1 target rank k = 2 Fro. error = 0.543081549596879

## Iteration: 2 predicted rank = 1 target rank k = 2 Fro. error = 0.121377101431907

## Iteration: 3 predicted rank = 1 target rank k = 2 Fro. error = 0.12137710143193

## Iteration: 4 predicted rank = 1 target rank k = 2 Fro. error = 0.119268896142694

## Iteration: 5 predicted rank = 1 target rank k = 2 Fro. error = 0.0775172016215707

## Iteration: 6 predicted rank = 1 target rank k = 2 Fro. error = 0.0456250775118491

## Iteration: 7 predicted rank = 2 target rank k = 3 Fro. error = 0.03159229215968

## Iteration: 8 predicted rank = 3 target rank k = 4 Fro. error = 0.0242080663974625

## Iteration: 9 predicted rank = 5 target rank k = 11525 Fro. error = 0.0186652453361901

## Iteration: 10 predicted rank = 8 target rank k = 9 Fro. error = 0.0145118074287039

## Iteration: 11 predicted rank = 12 target rank k = 11532 Fro. error = 0.0108247928491098

## Iteration: 12 predicted rank = 17 target rank k = 18 Fro. error = 0.0079549172034063

## Iteration: 13 predicted rank = 22 target rank k = 11542 Fro. error = 0.00581362182550793

## Iteration: 14 predicted rank = 28 target rank k = 29 Fro. error = 0.00424831948772267

## Iteration: 15 predicted rank = 34 target rank k = 11554 Fro. error = 0.00304577578151052

cat("\n",round(difftime(Sys.time(), t1, units="mins"),1)," mins")

1 2	cat("\n",round(difftime(Sys.time(), t1, units="mins"),1)," mins")

## 
##  12.4  mins

1 2	## ## 12.4 mins

A continuación, comprobamos como ha sido la eliminación de fondo y la separación de los objetos en primer plano para el mismo fotograma 250.

# recovering a split frame
FRAME <- 250
REC2_re_bg <- array(rPCA_k2$L[250,], dim=c(v_width,v_height,1,3))
REC2_re_mov <- array(rPCA_k2$S[250,], dim=c(v_width,v_height,1,3))
REC2_re_mov <- REC2_re_mov - min(REC2_re_mov) # to avoid negative values
plot(imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)),axis="x"))

# recovering a split frame

FRAME <- 250

REC2_re_bg <- array(rPCA_k2$L[250,], dim=c(v_width,v_height,1,3))

REC2_re_mov <- array(rPCA_k2$S[250,], dim=c(v_width,v_height,1,3))

REC2_re_mov <- REC2_re_mov - min(REC2_re_mov) # to avoid negative values

plot(imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)),axis="x"))

Y finalmente, crearemos para la versión en color, un vídeo que incluya todos los fotogramas del vídeo original.

# creating a video
#----------------------
# creating frames
library(abind)
for(i in 1:dim(rPCA_k2$S)[1]){
      if(i/100 == i%/%100) cat("\nframe:",i,"/",num_frames)
      REC2_re_bg <- array(rPCA_k2$L[i,], dim=c(v_width,v_height,1,3))
      REC2_re_mov <- array(rPCA_k2$S[i,], dim=c(v_width,v_height,1,3))
      REC2_re_mov <- REC2_re_mov - min(REC2_re_mov) # to avoid negative values
      #FRAME_im <- as.cimg(array(abind(X,Y,along=1), dim=c(v_width*2,v_height,1,3)))
      FRAME_im <- imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)),axis="x")
      save.image(FRAME_im, file=paste0("./video_out/color_",i,".png"))
}

# creating a video

#----------------------

# creating frames

library(abind)

for(i in 1:dim(rPCA_k2$S)[1]){

if(i/100 == i%/%100) cat("\nframe:",i,"/",num_frames)

REC2_re_bg <- array(rPCA_k2$L[i,], dim=c(v_width,v_height,1,3))

REC2_re_mov <- array(rPCA_k2$S[i,], dim=c(v_width,v_height,1,3))

REC2_re_mov <- REC2_re_mov - min(REC2_re_mov) # to avoid negative values

#FRAME_im <- as.cimg(array(abind(X,Y,along=1), dim=c(v_width*2,v_height,1,3)))

FRAME_im <- imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)),axis="x")

save.image(FRAME_im, file=paste0("./video_out/color_",i,".png"))

}

# creating video file
# "-pix_fmt yuv420p" args is for Apple Quicktime support
library(imager)
make.video(dname="./video_out/", fname="./video_out/video_decomposed_color.mp4", pattern="color_%d.png", fps=7, extra.args="-pix_fmt yuv420p")

# creating video file

# "-pix_fmt yuv420p" args is for Apple Quicktime support

library(imager)

make.video(dname="./video_out/", fname="./video_out/video_decomposed_color.mp4", pattern="color_%d.png", fps=7, extra.args="-pix_fmt yuv420p")

A continuación, puedes encontrar el vídeo final: video_decomposed_color.mp4.

htmltools::tags$video(src="./video_out/video_decomposed_color.mp4", type="video/mp4", autoplay=NA, controls="controls", height=v_height, width=v_width*2)

1 2	htmltools::tags$video(src="./video_out/video_decomposed_color.mp4", type="video/mp4", autoplay=NA, controls="controls", height=v_height, width=v_width*2)

Y hasta aquí hemos llegado al final de la PARTE-2.

Espero que haya sido interesante. Deja algunos comentarios u opiniones si quieres!

Session Info:

------------------------------------
Total R execution time:  21.9 mins 
------------------------------------

------------------------------------

Total R execution time: 21.9 mins

------------------------------------

 setting  value                       
 version  R version 3.4.3 (2017-11-30)
 os       macOS High Sierra 10.13.5   
 system   x86_64, darwin15.6.0        
 ui       RStudio                     
 language (EN)                        
 collate  es_ES.UTF-8                 
 tz       Europe/Madrid               
 date     2018-08-18

setting value

version R version 3.4.3 (2017-11-30)

os macOS High Sierra 10.13.5

system x86_64, darwin15.6.0

ui RStudio

language (EN)

collate es_ES.UTF-8

tz Europe/Madrid

date 2018-08-18

------------------------------------
Packages:

1 2	------------------------------------ Packages:

 [1] "abind - 1.4-5 - 2016-07-21 - CRAN (R 3.4.0)"                           
 [2] "bitops - 1.0-6 - 2013-08-17 - CRAN (R 3.4.0)"                          
 [3] "imager - 0.40.2 - 2017-04-24 - CRAN (R 3.4.0)"                         
 [4] "magrittr - 1.5 - 2014-11-22 - CRAN (R 3.4.0)"                          
 [5] "plyr - 1.8.4 - 2016-06-08 - CRAN (R 3.4.0)"                            
 [6] "RCurl - 1.95-4.11 - 2018-07-15 - cran (@1.95-4.)"                      
 [7] "reshape2 - 1.4.2 - 2016-10-22 - CRAN (R 3.4.0)"                        
 [8] "rsvd - 0.9 - 2017-12-08 - CRAN (R 3.4.3)"                              
 [9] "Rvision - 0.3 - 2018-06-29 - Github (swarm-lab/Rvision@65cee21)"       
[10] "RWordPress - 0.2-3 - 2018-03-04 - Github (duncantl/RWordPress@ce6d2d6)"
[11] "stringr - 1.2.0 - 2017-02-18 - CRAN (R 3.4.0)"                         
[12] "XMLRPC - 0.3-1 - 2018-08-17 - Github (duncantl/XMLRPC@add9496)"

[1] "abind - 1.4-5 - 2016-07-21 - CRAN (R 3.4.0)"

[2] "bitops - 1.0-6 - 2013-08-17 - CRAN (R 3.4.0)"

[3] "imager - 0.40.2 - 2017-04-24 - CRAN (R 3.4.0)"

[4] "magrittr - 1.5 - 2014-11-22 - CRAN (R 3.4.0)"

[5] "plyr - 1.8.4 - 2016-06-08 - CRAN (R 3.4.0)"

[6] "RCurl - 1.95-4.11 - 2018-07-15 - cran (@1.95-4.)"

[7] "reshape2 - 1.4.2 - 2016-10-22 - CRAN (R 3.4.0)"

[8] "rsvd - 0.9 - 2017-12-08 - CRAN (R 3.4.3)"

[9] "Rvision - 0.3 - 2018-06-29 - Github (swarm-lab/Rvision@65cee21)"

[10] "RWordPress - 0.2-3 - 2018-03-04 - Github (duncantl/RWordPress@ce6d2d6)"

[11] "stringr - 1.2.0 - 2017-02-18 - CRAN (R 3.4.0)"

[12] "XMLRPC - 0.3-1 - 2018-08-17 - Github (duncantl/XMLRPC@add9496)"

Appendix, all the code:

install.packages("devtools")
devtools::install_github("swarm-lab/Rvision")
path_to_video <- "video_example.mp4"
path_to_video
library(Rvision)
my_video <- video(filename = path_to_video)
## video dimensions (height x width x frames):
dim(my_video)

# video codec:
codec(my_video)

# framerate:
fps(my_video)

# color space
colorspace(readFrame(my_video, 1))
v_height <- dim(my_video)[1]
v_width <- dim(my_video)[2]
htmltools::tags$video(src = path_to_video, type = "video/mp4", autoplay = NA, 
    controls = "controls", height = v_height, width = v_width)
# resheaping all the video frames in a 2D matrix
v <- list()
num_frames <- nframes(my_video)
for (i in 1:num_frames) {
    if (i/100 == i%/%100) 
        cat("\nframe:", i, "/", num_frames)
    frame <- readFrame(my_video, i)
    frame <- changeColorSpace(frame, "GRAY")
    matrix_frame <- as.matrix(frame)
    vector_frame <- as.vector(matrix_frame)
    v[[i]] <- vector_frame

    #---------------------------------------------------------------------
    # if you want to display each frame procesed activate the 3 lines below
    # frame_temp <- frame drawText(frame_temp, paste0(i,'/',num_frames), 2, 2,
    # font_scale=1.2, thickness=2, color='red') Rvision::display(frame_temp,
    # window_name='display', height=200, width=300)
}
Rvision::destroyDisplay("display")  # works with: Rvision::display()
release(my_video)
M <- do.call("rbind", v)

# matrix dimensions (frames | height x width):
dim(M)
# matrix memory size:
print(object.size(M), units = "Mb")
# plot all the video frames in a 2D matrix
library(imager)
D <- as.cimg(M)
plot(D, interpolate = FALSE, xlab = "frame", ylab = "height_x_width")
plot(c(0, dim(M)[1]), c(0, dim(M)[2]), type = "n", xlab = "frame", ylab = "height_x_width")
rasterImage(as.raster(D), 0, 0, dim(M)[1], dim(M)[2], interpolate = FALSE)
# recovering a frame
REC <- matrix(M[250, ], nrow = v_width, ncol = v_height, byrow = TRUE)
library(imager)
plot(as.cimg(REC))
library(rsvd)
rSVD_k2 <- rsvd(M, k = 2)
dim(rSVD_k2$u)
NROW(rSVD_k2$d)
dim(rSVD_k2$v)
# reconstructing video using decomposed matrix
rSVD_k2_re <- rSVD_k2$u %*% diag(rSVD_k2$d) %*% t(rSVD_k2$v)
plot(c(0, dim(rSVD_k2_re)[1]), c(0, dim(rSVD_k2_re)[2]), type = "n", xlab = "frame", 
    ylab = "height_x_width")
rasterImage(as.raster(as.cimg(rSVD_k2_re)), 0, 0, dim(rSVD_k2_re)[1], dim(rSVD_k2_re)[2], 
    interpolate = FALSE)
# M_original
dim(M)

# M_recovery
dim(rSVD_k2_re)
# recovering a frame
FRAME <- 250
REC_re_bg <- rSVD_k2_re[FRAME, ]
REC_re_mov <- M[FRAME, ] - rSVD_k2_re[FRAME, ]
REC_re_mov <- REC_re_mov - min(REC_re_mov)  # to avoid negative values
FRAME_im <- as.cimg(matrix(c(REC_re_bg, REC_re_mov), nrow = v_width * 2, ncol = v_height, 
    byrow = TRUE))
plot(as.cimg(FRAME_im))
# changing k value
FRAME <- 250
par(mfrow = c(3, 2), mar = c(2, 2, 0, 0))
for (i in c(2, 4, 6, 10, 50, 100)) {
    rSVD_ <- rsvd(M, k = i)
    rSVD_re <- rSVD_$u %*% diag(rSVD_$d) %*% t(rSVD_$v)
    REC_re_bg <- matrix(rSVD_re[FRAME, ], nrow = v_width, ncol = v_height, byrow = TRUE)
    REC_re_mov <- matrix(M[FRAME, ] - rSVD_re[FRAME, ], nrow = v_width, ncol = v_height, 
        byrow = TRUE)
    REC_re_mov <- REC_re_mov - min(REC_re_mov)  # to avoid negative values
    plot(imappend(list(as.cimg(REC_re_bg), as.cimg(REC_re_mov)), axis = "x"), 
        axes = FALSE)
    text(10, 180, paste0("k = ", i), pos = 4, col = "white", cex = 3)
}
library(rsvd)
# rrpca: robust principal component analysis
t1 <- Sys.time()
rPCA_k2 <- rrpca(M, maxiter = 15, tol = 1e-09, trace = TRUE)
cat("\n", round(difftime(Sys.time(), t1, units = "mins"), 1), " mins")
# recovering a split frame
FRAME <- 250
REC2_re_bg <- matrix(rPCA_k2$L[FRAME, ], nrow = v_width, ncol = v_height, byrow = TRUE)
REC2_re_mov <- matrix(rPCA_k2$S[FRAME, ], nrow = v_width, ncol = v_height, byrow = TRUE)
REC2_re_mov <- REC2_re_mov - min(REC2_re_mov) # to avoid negative values plot(imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)), axis = "x")) # creating a video #---------------------- # needs to install: ffmpeg: you can use '> brew install ffmpeg' on Terminal
# on MAC OS X creating frames
for (i in 1:dim(rPCA_k2$S)[1]) {
    if (i/100 == i%/%100) 
        cat("\nframe:", i, "/", num_frames)
    # FRAME_im <- as.cimg(matrix(c(rPCA_k2$L[i,],rPCA_k2$S[i,]), nrow=v_width*2,
    # ncol=v_height, byrow=TRUE))
    FRAME_im_bg <- matrix(rPCA_k2$L[i, ], nrow = v_width, ncol = v_height, byrow = TRUE)
    FRAME_im_mov <- matrix(rPCA_k2$S[i, ], nrow = v_width, ncol = v_height, 
        byrow = TRUE)
    FRAME_im_mov <- FRAME_im_mov - min(FRAME_im_mov)  # to avoid negative values
    FRAME_im <- imappend(list(as.cimg(FRAME_im_bg), as.cimg(FRAME_im_mov)), 
        axis = "x")
    save.image(FRAME_im, file = paste0("./video_out/bw_", i, ".png"))
}
# creating video file '-pix_fmt yuv420p' args is for Apple Quicktime support
library(imager)
make.video(dname = "./video_out/", fname = "./video_out/video_decomposed_bw.mp4", 
    pattern = "bw_%d.png", fps = 7, extra.args = "-pix_fmt yuv420p")
htmltools::tags$video(src = "./video_out/video_decomposed_bw.mp4", type = "video/mp4", 
    autoplay = NA, controls = "controls", height = v_height, width = v_width * 
        2)
library(Rvision)
my_video <- video(filename = path_to_video)
v <- list()
num_frames <- nframes(my_video)
for (i in 1:num_frames) {
    if (i/100 == i%/%100) 
        cat("\nframe:", i, "/", num_frames)
    frame <- readFrame(my_video, i)
    matrix_frame <- as.matrix(frame, nrow = 240, ncol = 320, byrow = FALSE)
    vector_frame <- as.vector(aperm(matrix_frame, c(2, 1, 3)))
    v[[i]] <- vector_frame
}
M <- do.call("rbind", v)
release(my_video)

# matrix dimensions (frames | height x width x 3color):
dim(M)
# plot all the video
dim(M)
library(imager)
M2 <- array(M, dim = c(dim(M)[1], v_width * v_height, 3))
dim(M2)
D <- as.cimg(M2, x = dim(M2)[1], y = dim(M2)[2], cc = 3)
plot(c(0, dim(M2)[1]), c(0, dim(M2)[2]), type = "n", xlab = "frame", ylab = "height_x_width")
rasterImage(as.raster(D), 0, 0, dim(M2)[1], dim(M2)[2], interpolate = FALSE)
# recovering an original frame
REC <- array(M2[250, , ], dim = c(v_width, v_height, 1, 3))
dim(REC)
library(imager)
plot(as.cimg(REC))
library(rsvd)
# rrpca: robust principal component analysis
t1 <- Sys.time()
rPCA_k2 <- rrpca(M, maxiter = 15, tol = 1e-09, trace = TRUE)
cat("\n", round(difftime(Sys.time(), t1, units = "mins"), 1), " mins")
# recovering a split frame
FRAME <- 250
REC2_re_bg <- array(rPCA_k2$L[250, ], dim = c(v_width, v_height, 1, 3))
REC2_re_mov <- array(rPCA_k2$S[250, ], dim = c(v_width, v_height, 1, 3))
REC2_re_mov <- REC2_re_mov - min(REC2_re_mov)  # to avoid negative values
plot(imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)), axis = "x"))
# creating a video
#----------------------
# creating frames
library(abind)
for (i in 1:dim(rPCA_k2$S)[1]) {
    if (i/100 == i%/%100) 
        cat("\nframe:", i, "/", num_frames)
    REC2_re_bg <- array(rPCA_k2$L[i, ], dim = c(v_width, v_height, 1, 3))
    REC2_re_mov <- array(rPCA_k2$S[i, ], dim = c(v_width, v_height, 1, 3))
    REC2_re_mov <- REC2_re_mov - min(REC2_re_mov)  # to avoid negative values
    # FRAME_im <- as.cimg(array(abind(X,Y,along=1),
    # dim=c(v_width*2,v_height,1,3)))
    FRAME_im <- imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)), axis = "x")
    save.image(FRAME_im, file = paste0("./video_out/color_", i, ".png"))
}
# creating video file '-pix_fmt yuv420p' args is for Apple Quicktime support
library(imager)
make.video(dname = "./video_out/", fname = "./video_out/video_decomposed_color.mp4", 
    pattern = "color_%d.png", fps = 7, extra.args = "-pix_fmt yuv420p")
htmltools::tags$video(src = "./video_out/video_decomposed_color.mp4", type = "video/mp4", 
    autoplay = NA, controls = "controls", height = v_height, width = v_width * 
        2)

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

install.packages("devtools")

devtools::install_github("swarm-lab/Rvision")

path_to_video <- "video_example.mp4"

path_to_video

library(Rvision)

my_video <- video(filename = path_to_video)

## video dimensions (height x width x frames):

dim(my_video)

# video codec:

codec(my_video)

# framerate:

fps(my_video)

# color space

colorspace(readFrame(my_video, 1))

v_height <- dim(my_video)[1]

v_width <- dim(my_video)[2]

htmltools::tags$video(src = path_to_video, type = "video/mp4", autoplay = NA,

controls = "controls", height = v_height, width = v_width)

# resheaping all the video frames in a 2D matrix

v <- list()

num_frames <- nframes(my_video)

for (i in 1:num_frames) {

if (i/100 == i%/%100)

cat("\nframe:", i, "/", num_frames)

frame <- readFrame(my_video, i)

frame <- changeColorSpace(frame, "GRAY")

matrix_frame <- as.matrix(frame)

vector_frame <- as.vector(matrix_frame)

v[[i]] <- vector_frame

#---------------------------------------------------------------------

# if you want to display each frame procesed activate the 3 lines below

# frame_temp <- frame drawText(frame_temp, paste0(i,'/',num_frames), 2, 2,

# font_scale=1.2, thickness=2, color='red') Rvision::display(frame_temp,

# window_name='display', height=200, width=300)

}

Rvision::destroyDisplay("display") # works with: Rvision::display()

release(my_video)

M <- do.call("rbind", v)

# matrix dimensions (frames | height x width):

dim(M)

# matrix memory size:

print(object.size(M), units = "Mb")

# plot all the video frames in a 2D matrix

library(imager)

D <- as.cimg(M)

plot(D, interpolate = FALSE, xlab = "frame", ylab = "height_x_width")

plot(c(0, dim(M)[1]), c(0, dim(M)[2]), type = "n", xlab = "frame", ylab = "height_x_width")

rasterImage(as.raster(D), 0, 0, dim(M)[1], dim(M)[2], interpolate = FALSE)

# recovering a frame

REC <- matrix(M[250, ], nrow = v_width, ncol = v_height, byrow = TRUE)

library(imager)

plot(as.cimg(REC))

library(rsvd)

rSVD_k2 <- rsvd(M, k = 2)

dim(rSVD_k2$u)

NROW(rSVD_k2$d)

dim(rSVD_k2$v)

# reconstructing video using decomposed matrix

rSVD_k2_re <- rSVD_k2$u %*% diag(rSVD_k2$d) %*% t(rSVD_k2$v)

plot(c(0, dim(rSVD_k2_re)[1]), c(0, dim(rSVD_k2_re)[2]), type = "n", xlab = "frame",

ylab = "height_x_width")

rasterImage(as.raster(as.cimg(rSVD_k2_re)), 0, 0, dim(rSVD_k2_re)[1], dim(rSVD_k2_re)[2],

interpolate = FALSE)

# M_original

dim(M)

# M_recovery

dim(rSVD_k2_re)

# recovering a frame

FRAME <- 250

REC_re_bg <- rSVD_k2_re[FRAME, ]

REC_re_mov <- M[FRAME, ] - rSVD_k2_re[FRAME, ]

REC_re_mov <- REC_re_mov - min(REC_re_mov) # to avoid negative values

FRAME_im <- as.cimg(matrix(c(REC_re_bg, REC_re_mov), nrow = v_width * 2, ncol = v_height,

byrow = TRUE))

plot(as.cimg(FRAME_im))

# changing k value

FRAME <- 250

par(mfrow = c(3, 2), mar = c(2, 2, 0, 0))

for (i in c(2, 4, 6, 10, 50, 100)) {

rSVD_ <- rsvd(M, k = i)

rSVD_re <- rSVD_$u %*% diag(rSVD_$d) %*% t(rSVD_$v)

REC_re_bg <- matrix(rSVD_re[FRAME, ], nrow = v_width, ncol = v_height, byrow = TRUE)

REC_re_mov <- matrix(M[FRAME, ] - rSVD_re[FRAME, ], nrow = v_width, ncol = v_height,

byrow = TRUE)

REC_re_mov <- REC_re_mov - min(REC_re_mov) # to avoid negative values

plot(imappend(list(as.cimg(REC_re_bg), as.cimg(REC_re_mov)), axis = "x"),

axes = FALSE)

text(10, 180, paste0("k = ", i), pos = 4, col = "white", cex = 3)

}

library(rsvd)

# rrpca: robust principal component analysis

t1 <- Sys.time()

rPCA_k2 <- rrpca(M, maxiter = 15, tol = 1e-09, trace = TRUE)

cat("\n", round(difftime(Sys.time(), t1, units = "mins"), 1), " mins")

# recovering a split frame

FRAME <- 250

REC2_re_bg <- matrix(rPCA_k2$L[FRAME, ], nrow = v_width, ncol = v_height, byrow = TRUE)

REC2_re_mov <- matrix(rPCA_k2$S[FRAME, ], nrow = v_width, ncol = v_height, byrow = TRUE)

REC2_re_mov <- REC2_re_mov - min(REC2_re_mov) # to avoid negative values plot(imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)), axis = "x")) # creating a video #---------------------- # needs to install: ffmpeg: you can use '> brew install ffmpeg' on Terminal

# on MAC OS X creating frames

for (i in 1:dim(rPCA_k2$S)[1]) {

if (i/100 == i%/%100)

cat("\nframe:", i, "/", num_frames)

# FRAME_im <- as.cimg(matrix(c(rPCA_k2$L[i,],rPCA_k2$S[i,]), nrow=v_width*2,

# ncol=v_height, byrow=TRUE))

FRAME_im_bg <- matrix(rPCA_k2$L[i, ], nrow = v_width, ncol = v_height, byrow = TRUE)

FRAME_im_mov <- matrix(rPCA_k2$S[i, ], nrow = v_width, ncol = v_height,

byrow = TRUE)

FRAME_im_mov <- FRAME_im_mov - min(FRAME_im_mov) # to avoid negative values

FRAME_im <- imappend(list(as.cimg(FRAME_im_bg), as.cimg(FRAME_im_mov)),

axis = "x")

save.image(FRAME_im, file = paste0("./video_out/bw_", i, ".png"))

}

# creating video file '-pix_fmt yuv420p' args is for Apple Quicktime support

library(imager)

make.video(dname = "./video_out/", fname = "./video_out/video_decomposed_bw.mp4",

pattern = "bw_%d.png", fps = 7, extra.args = "-pix_fmt yuv420p")

htmltools::tags$video(src = "./video_out/video_decomposed_bw.mp4", type = "video/mp4",

autoplay = NA, controls = "controls", height = v_height, width = v_width *

library(Rvision)

my_video <- video(filename = path_to_video)

v <- list()

num_frames <- nframes(my_video)

for (i in 1:num_frames) {

if (i/100 == i%/%100)

cat("\nframe:", i, "/", num_frames)

frame <- readFrame(my_video, i)

matrix_frame <- as.matrix(frame, nrow = 240, ncol = 320, byrow = FALSE)

vector_frame <- as.vector(aperm(matrix_frame, c(2, 1, 3)))

v[[i]] <- vector_frame

}

M <- do.call("rbind", v)

release(my_video)

# matrix dimensions (frames | height x width x 3color):

dim(M)

# plot all the video

dim(M)

library(imager)

M2 <- array(M, dim = c(dim(M)[1], v_width * v_height, 3))

dim(M2)

D <- as.cimg(M2, x = dim(M2)[1], y = dim(M2)[2], cc = 3)

plot(c(0, dim(M2)[1]), c(0, dim(M2)[2]), type = "n", xlab = "frame", ylab = "height_x_width")

rasterImage(as.raster(D), 0, 0, dim(M2)[1], dim(M2)[2], interpolate = FALSE)

# recovering an original frame

REC <- array(M2[250, , ], dim = c(v_width, v_height, 1, 3))

dim(REC)

library(imager)

plot(as.cimg(REC))

library(rsvd)

# rrpca: robust principal component analysis

t1 <- Sys.time()

rPCA_k2 <- rrpca(M, maxiter = 15, tol = 1e-09, trace = TRUE)

cat("\n", round(difftime(Sys.time(), t1, units = "mins"), 1), " mins")

# recovering a split frame

FRAME <- 250

REC2_re_bg <- array(rPCA_k2$L[250, ], dim = c(v_width, v_height, 1, 3))

REC2_re_mov <- array(rPCA_k2$S[250, ], dim = c(v_width, v_height, 1, 3))

REC2_re_mov <- REC2_re_mov - min(REC2_re_mov) # to avoid negative values

plot(imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)), axis = "x"))

# creating a video

#----------------------

# creating frames

library(abind)

for (i in 1:dim(rPCA_k2$S)[1]) {

if (i/100 == i%/%100)

cat("\nframe:", i, "/", num_frames)

REC2_re_bg <- array(rPCA_k2$L[i, ], dim = c(v_width, v_height, 1, 3))

REC2_re_mov <- array(rPCA_k2$S[i, ], dim = c(v_width, v_height, 1, 3))

REC2_re_mov <- REC2_re_mov - min(REC2_re_mov) # to avoid negative values

# FRAME_im <- as.cimg(array(abind(X,Y,along=1),

# dim=c(v_width*2,v_height,1,3)))

FRAME_im <- imappend(list(as.cimg(REC2_re_bg), as.cimg(REC2_re_mov)), axis = "x")

save.image(FRAME_im, file = paste0("./video_out/color_", i, ".png"))

}

# creating video file '-pix_fmt yuv420p' args is for Apple Quicktime support

library(imager)

make.video(dname = "./video_out/", fname = "./video_out/video_decomposed_color.mp4",

pattern = "color_%d.png", fps = 7, extra.args = "-pix_fmt yuv420p")

htmltools::tags$video(src = "./video_out/video_decomposed_color.mp4", type = "video/mp4",

autoplay = NA, controls = "controls", height = v_height, width = v_width *

Share it!:

un blog sobre "data science"

Descomposición de matrices, compresión de imágenes, y cómo quitar el fondo de vídeos usando R [PARTE 2]

Transformando el archivo de vídeo

Randomized SVD (video B/N)

Randomized robust PCA (video B/N)

Randomized robust PCA (vídeo a color)

Leave a Reply Cancel reply