Moving Object Detection by Robust PCA solved via a Linearized Symmetric Alternating Direction Method

Project by Ali Harakeh and Jungwook Lee

Based on Paper by Guyon, Charles, Thierry Bouwmans, and El-Hadi Zahzah [1]

Problem formulation
Proposed solution
Data sources
Solution
Visualization of the results
Analysis and conclusions
References

Problem formulation

The given problem is to assign the pixels of images with "moving" and "static" labels, given a of set of $K$ successive video frames with image size of $m$ by $n$ (pixels) captured with a static camera. There are no assumptions on the characteristics of the initial frame (it can contain moving objects, it is not limited to only background) or to the size and shape of the moving objects. Hence, the problem cannot be solved by background subtraction.

Proposed solution

Robust PCA Problem

Let $D$ be a matrix of all of the pixels in the given video frames. The matrix $D$ is generated as the following:

Unfold the each frame by changing it from an by matrix to a ( $\times$ ) x 1 column vector.
Concatenate all frames to get an ( x ) x matrix $\in R^{(n\times m) \times K}$ .

The observation matrix $D$ is assumed to be represented as $D = L + S$ , where $L$ is a Low Rank matrix and S is a Sparse Matrix with a small portion of non-zero entries. Intuitively, the low rank matrix L represents the background. This is because the value of the each element in every row of $D$ should remain the same if the camera is static. However, this is not always true. When a moving object visits the scene, some values change. These values are represented by the Sparse Matrix S.

LSADM Optimization Formulation

The linearized symmetric alternating direction method aims to solve a constrained optimization problem by solving a series of unconstrained problem with a penalty term to the objective. The original RPCA problem can be formulated as the following:

$min\ (f(x)+g(y))\ s.t.\ x-y = 0$

The following is the augmented Lagrangian function for ADM, where the penalty function to the regular Lagrangian formaultion is added:

$L_{\mu} (x,y;\lambda) = f(x) + g(y) + < \lambda, x-y > + \frac{1}{2\mu}||x-y||^2$

Now one can solve the following optimization problem, which is solved by minimizing with respect to $x$ and $y$ sequentially:

$(x^k, y^k) = argmin_{x,y} L_{\mu}(x,y;\lambda^{k})$

The Symmetric ADM formulation differs by allowing $\lambda$ to be updated after minimizing the augmented lagragrangian w.r.t to $x$ and once more after $y$ .

$x^{k+1} = argmin_{x} L_{\mu}(x,y^k;\lambda^{k})$

$\lambda^{k+\frac{1}{2}} = \lambda^{k} + \frac{1}{\mu}(x^{k+1}-y^k)$

$y^{k+1} = argmin_{y} L_{\mu}(x^{k+1},y^k;\lambda^{k+{\frac{1}{2}}})$

$\lambda^{k+1} = \lambda^{k+\frac{1}{2}} + \frac{1}{\mu}(x^{k+1}-y^{k+1})$

By assuming differentiability, it is possible to linearize the dual variable update functions as such:

$\lambda^{k+\frac{1}{2}} = \nabla f(x^{k+1})$

$\lambda^{k+1} = -\nabla g(y^{k+1})$

The following linearized dual variable can be used to replace the augmented lagrangian function as such.

$L_{f} (x,y) = f(x) + g(y) + < \nabla f(x), x-y > + \frac{1}{2\mu}||x-y||^2$

$L_{g} (x,y) = f(x) + g(y) + < \nabla g(y), x-y > + \frac{1}{2\mu}||x-y||^2$

One can minimizing the above two function to obtain $x^{k+1}$ and $y^{k+1}$ .

RPCA with LSADM

To extract the two matrices for the RPCA problem, the following constrained convex minimization problem is proposed:

$min. \hspace{0.1in} ||L||_{*} + \lambda||S||_1$

$s.t. \hspace{0.1in} D = L + S$

The nuclear norm $||.||_{*}$ is used as a surrogate (convex relaxation) to the non-convex $rank(.)$ function. Minimizing the nuclear norm will generate a low rank solution. For sparsity, minimization is done over the L-1 norm of S. Lambda is a balancing parameter.

Both norms are convex, and the sum of two convex functions is convex. Hence, this is a convex optimization problem that can be solved using any convex solver.

The nuclear norm and the L1 norm are replaced by their smooth approximation derived using nestrov's smoothing technique to make it differentiable for LSADM:

$||L||_* \approx f_\sigma(L) = max(<L,W> - \frac{\sigma}{2}||W||^2_F)\hspace{0.05in}|\hspace{0.05in} ||W||_F \leq 1$

$||S||_1 \approx g_\sigma(S) = max(<S,Z> - \frac{\sigma}{2}||Z||^2_2 \hspace{0.05in}|\hspace{0.05in} ||Z||_\infty \leq \rho)$

Where $W$ is the singular value decomposition of ( $L/\sigma$ ). Alternatively,

$W_{\sigma}(L) = UDiag(min(\gamma, 1))V^T$

And $Z$ is,

$Z_{\sigma}(S) = min(\rho, max(\frac{S}{\sigma}, -\rho))$

The final minimization problem becomes, which is solved using the LSADM formulation:

$min. \hspace{0.1in} f_\sigma(L) + g_\sigma(S)$

$s.t. \hspace{0.1in} D = L + S$

Data sources

We use the data from the Wallflower dataset [2] for demonstration purposes. The dataset can be obtained from the following link: https://www.microsoft.com/en-us/research/project/test-images-for-wallflower-paper/

Solution

The following section explains the implementation of the algorithm for moving object detection by RPCA solved via LSADM.

Data Preprocessing

In this section, the color image is converted to grayscale image and saved for use in the algorithm. The D matrix is also generated in this step.

% Clear all data
close all; clear; clc;

% Sequence with num_frames images that is contained in D.
% define the input matrix
data_start = 0;
data_end = 1000;
num_frames = data_end-data_start;

img_h = 120;
img_w = 160;

% Define input directory
gray_dir = 'dataset/MovedObject_gray/b%05d.bmp';

% Read images, unfold them to a img_h*img_width vector, and concatenate
% them into matrix D.
D=[];
for i = 1:num_frames
    data_path_gray = sprintf(gray_dir, data_start+i-1);
    I = imread(data_path_gray);
    img = reshape(I, 1, [])';
    D = [D,img];
end
D=double(D);

Parameter Initialization

The optimization parameters are generated and initialized in the following section.

L_0 = D; % L = background
S_0 = zeros(size(D)); % S = moving object matrix
Y = zeros(size(D)); % Y = lagrange multipliers
mu = 30/norm(sign(D),2);

% Nesterov's smoothing terms, need to be larger than 0
sigma = 1;
rho = 1;

% Iterator
max_iter = 2;

% Set the initial values
L = L_0;
S = S_0;

% Stopping Criterion
e = 1e-7;

Optimization Algorithm

The following section implements the LSADM optimization.

for k = 1:max_iter
    % Isolate for L where X = L/sigma
    X = mu.*Y - S + D;

    % Nesterov's smoothing approximation of nuclear norm
    [U, G, V] = svd(X, 'econ');
    G = diag(G);
    G = G - mu.*(G./max(G, mu+sigma));

    % Save intermediate Low rank matrix
    L_k = L;

    % Reconstruct L matrix using updated singular values
    L = U*diag(G)*V';

    % Evaluate Lagrangian for f(L)
    Y = Y - (L+S-D)./mu;

    % Evaluate update S matrix with new Dual variable
    B = Y - (L-D)./mu;
    S_k = S; % Save intermediate Sparse matrix

    % Nesterov's smoothing for l1 norm
    S = mu.*B - mu.*min(rho, max(-rho, mu.*B./(sigma+mu)));

    % Lagrangian update for g(S)
    Y = Y - (L-S-D)/mu;

    % stop if stopping criteria is met
    d = norm((L+S-D)./mu, 'fro')/norm(D, 'fro');

    if (d < e)
        L = L_k;
        S = S_k;
        break;
    end
end
L = L_k;
S = S_k;

Manual Thresholding S

The thresholding method was not mentioned in the paper and is not in the scope of the given problem. And thus, a threshold is manually selected to visualize the segmentation results.

% Generate mask
F=abs(S);
F(F<0.00017)=0;
F(F~=0) = 1;

Visualization of the results

Here the result of the optimization is shown. 2 video frame intances are picked to demonstrate the results. The binary segmentation mask and the sparse matrix containing the moving object is shown. To prove robustness multiple scenes are used to demonstrate the capability of the algorithm. Lastly, the final figure proves indication that the strict equality contraint is kept throughout the optimization.

Moved Object Scene

In this scene, a man moves into a room and sits down in the chair.

figure(1)
subplot(3,2,1);
Frame12 = D(:,650);
imshow(mat2gray(reshape(Frame12,120,160)))
title('Video Frame at 650');
ylabel('Original')

subplot(3,2,3);
Frame12=reshape(S(:,650),120,160);
imshow(mat2gray(reshape(Frame12,120,160)))
ylabel('Sparse Matrix')

subplot(3,2,5);
Frame12=reshape(F(:,650),120,160);
imshow(mat2gray(Frame12))
ylabel('Binary Mask')

subplot(3,2,2);
Frame12 = D(:,700);
Frame12 = mat2gray(reshape(Frame12,120,160));
imshow(Frame12)
title('Video Frame at 700');

subplot(3,2,4);
Frame12=mat2gray(reshape(S(:,700),120,160));
imshow(Frame12)

subplot(3,2,6);
Frame12=mat2gray(reshape(F(:,700),120,160));
imshow(Frame12)

Camouflage Scene

In this scene, a man moves infront a computer screen.

openfig('camouflage.fig');

Lightswitch Scene

In this scene, a man moves into the room to turn the light off the room.

openfig('lightswitch.fig');

Image Reconstruction Constraint

Here, the low rank and the sparse matrix is added to give proof that the equality constraints are consistent.

subplot(2,1,1);
Frame12 = D(:,650);
imshow(mat2gray(reshape(Frame12,120,160)))
title('Video Frame at 650');
ylabel('Original')

subplot(2,1,2);
Frame12=reshape(F(:,650),120,160);
FrameL=reshape(L(:,650),120,160);
imshow(mat2gray(Frame12+FrameL))
ylabel('L + S');

Analysis and conclusions

We replicated the authors' work and found that our results coincide with the results shown in the paper. However, the threshold used on matrix S is not stated in the paper and we had to estimate it empiracly. Also, we found that the stopping criterion given does not work, d never reaches a value less than 1e-7. In fact, d oscillates around two values, barely changing. It has to be noted that this might be an implementation issue on our side.

The solution has many positive points. For example, it does not need the first frame/frames to be free of moving objects. It is very fast to converge, typically requiring only 2 iterations and thus is super fast. However, the method has a huge drawback, it relies on an empirically determined threshold that is not fixed. Changing the sequence of images require fine tuning the threshold manually.

Lastly in terms of object detection, the current method does not work best when the moving object breaks the sparsity assumption. If the moving object occupies a significant area of the image, the RPCA problem is not well-suited for solving such a problem.

References

Guyon, Charles, Thierry Bouwmans, and El-Hadi Zahzah. "Moving object detection by robust pca solved via a linearized symmetric alternating direction method." International Symposium on Visual Computing. Springer Berlin Heidelberg, 2012.
K. Toyama, J. Krumm, B. Brumitt, and B. Meyers, “Wallflower: Principles and practice of background maintenance,” in International Conference on Computer Vision (ICCV), vol. 1, 1999, pp. 255–261.