Problem description
I want to do some Structure-from-Motion using OpenCV. So far I have the fundamental matrix and the essential matrix. Given the essential matrix, I am doing an SVD to get R and T.
My problem is that I have 2 possible solutions for R and 2 possible solutions for T, which leads to 4 solutions for the overall pose, of which only one is correct. How can I find the correct solution?
Here is my code:
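import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.Scalar;

private void calculateRT(Mat E, Mat R, Mat T) {
    Mat w = new Mat();
    Mat u = new Mat();
    Mat vt = new Mat();
    Mat empty = new Mat(); // unused third gemm operand (beta = 0)
    Mat tmp = new Mat();

    // Project E onto a valid essential matrix: singular values (1, 1, 0).
    Mat diag = new Mat(3, 3, CvType.CV_64FC1);
    double[] diagVal = {1, 0, 0, 0, 1, 0, 0, 0, 0};
    diag.put(0, 0, diagVal);
    Mat newE = new Mat();
    Core.SVDecomp(E, w, u, vt);
    // gemm computes alpha*src1*src2 + beta*src3, so u * diag * vt takes two calls.
    Core.gemm(u, diag, 1, empty, 0, tmp);
    Core.gemm(tmp, vt, 1, empty, 0, newE);
    Core.SVDecomp(newE, w, u, vt);
    publishProgress("U: " + u.dump());
    publishProgress("W: " + w.dump());
    publishProgress("vt: " + vt.dump());

    double[] W_Values = {0, -1, 0, 1, 0, 0, 0, 0, 1};
    Mat W = new Mat(3, 3, CvType.CV_64FC1);
    W.put(0, 0, W_Values);
    double[] Wt_values = {0, 1, 0, -1, 0, 0, 0, 0, 1}; // transpose of W
    Mat Wt = new Mat(3, 3, CvType.CV_64FC1);
    Wt.put(0, 0, Wt_values);

    // Two candidate rotations: R1 = u * W * vt, R2 = u * Wt * vt
    Mat R1 = new Mat();
    Mat R2 = new Mat();
    Core.gemm(u, W, 1, empty, 0, tmp);
    Core.gemm(tmp, vt, 1, empty, 0, R1);
    Core.gemm(u, Wt, 1, empty, 0, tmp);
    Core.gemm(tmp, vt, 1, empty, 0, R2);
    publishProgress("R1: " + R1.dump());
    publishProgress("R2: " + R2.dump());

    // Two candidate translations: T1 = third column of u, T2 = -T1
    Mat T1 = new Mat();
    Mat T2 = new Mat();
    u.col(2).copyTo(T1);
    Core.multiply(T1, new Scalar(-1.0), T2);
    publishProgress("T1: " + T1.dump());
    publishProgress("T2: " + T2.dump());

    // TODO Here I have to find the correct combination for R1 R2 and T1 T2
}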
Recommended answer
There is a theoretical ambiguity when reconstructing the relative Euclidean pose of two cameras from their essential matrix. This ambiguity is linked to the fact that, given a 2D point in an image, the classic pinhole camera model cannot tell whether the corresponding 3D point is in front of the camera or behind it. To remove this ambiguity, you need to know one point correspondence in the images: since these two 2D points are assumed to be the projections of a single 3D point lying in front of both cameras (it is visible in both images), this makes it possible to choose the right R and T.
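To make the "in front of both cameras" condition concrete, here is a minimal sketch of a depth-sign test in plain Java. It is not the method from the thesis referenced in this answer; it is a commonly used alternative that solves lambda2 * x2n = lambda1 * R * x1n + T in the least-squares sense and checks that both depths are positive. The class and method names are hypothetical, the convention P2 = K2 * [ R | T ] is assumed, and x1n, x2n are assumed to be normalized homogeneous points (inv(K) * x with third component 1).

```java
// Sketch only: a depth-sign test, not the method from the cited thesis.
final class CheiralityCheck {

    // Computes R * v for a 3x3 matrix R and a 3-vector v.
    static double[] mul(double[][] R, double[] v) {
        double[] out = new double[3];
        for (int i = 0; i < 3; i++) {
            out[i] = R[i][0] * v[0] + R[i][1] * v[1] + R[i][2] * v[2];
        }
        return out;
    }

    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    // Least-squares depths {lambda1, lambda2} satisfying
    // lambda2 * x2n = lambda1 * R * x1n + T (assumed convention P2 = K2 * [R | T]).
    static double[] depths(double[][] R, double[] T, double[] x1n, double[] x2n) {
        double[] a = mul(R, x1n);
        double aa = dot(a, a), bb = dot(x2n, x2n), ab = dot(a, x2n);
        double aT = dot(a, T), bT = dot(x2n, T);
        double det = aa * bb - ab * ab; // zero when the two rays are parallel
        double lambda1 = (ab * bT - bb * aT) / det;
        double lambda2 = (aa * bT - ab * aT) / det;
        return new double[] {lambda1, lambda2};
    }

    // True when the point has positive depth in both cameras.
    static boolean inFrontOfBothCameras(double[][] R, double[] T,
                                        double[] x1n, double[] x2n) {
        double[] d = depths(R, T, x1n, x2n);
        return d[0] > 0 && d[1] > 0;
    }
}
```

Running inFrontOfBothCameras over the four (R, T) combinations should accept only one of them for a well-conditioned correspondence. With noisy data it is probably safer to test several correspondences and keep the combination that most of them agree on.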
For that purpose, one method is explained in § 6.1.4 (p. 47) of the following PhD thesis: "Geometry, constraints and computation of the trifocal tensor", by C. Ressl (PDF). The following gives the outline of this method. I'll denote the two corresponding 2D points by x1 and x2, the two camera matrices by K1 and K2, and the essential matrix by E12.
i. Compute the SVD of the essential matrix E12 = U * S * V'. If det(U) < 0, set U = -U. If det(V) < 0, set V = -V.
ii. Define W = [0,-1,0; 1,0,0; 0,0,1], R2 = U * W * V' and T2 = third column of U.
iii. Define M = [ R2'*T2 ]x, X1 = M * inv(K1) * x1 and X2 = M * R2' * inv(K2) * x2.
iv. If X1(3) * X2(3) < 0, set R2 = U * W' * V' and recompute M and X1.
v. If X1(3) < 0, set T2 = -T2.
vi. Define P1_E = K1 * [ I | 0 ] and P2_E = K2 * [ R2 | T2 ].
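Steps i and ii, together with the two alternatives that steps iv and v switch between, produce the four candidate poses from the question. Here is a minimal plain-Java sketch of that part only. It assumes U and V have already been obtained from an SVD computed elsewhere (for instance with OpenCV), and the class and method names are hypothetical.

```java
// Sketch only: U and V are assumed to come from an SVD computed elsewhere.
final class PoseCandidates {

    static double[][] mul(double[][] a, double[][] b) {
        double[][] c = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                for (int k = 0; k < 3; k++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    static double[][] transpose(double[][] a) {
        double[][] t = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                t[j][i] = a[i][j];
        return t;
    }

    static double det(double[][] a) {
        return a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
             - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
             + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]);
    }

    // Step i: flip the sign of a matrix whose determinant is negative.
    static double[][] withPositiveDet(double[][] a) {
        if (det(a) >= 0) return a;
        double[][] n = new double[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                n[i][j] = -a[i][j];
        return n;
    }

    // Step ii and its alternative from step iv: { U*W*V', U*W'*V' }.
    static double[][][] rotations(double[][] U, double[][] V) {
        double[][] u = withPositiveDet(U);
        double[][] vt = transpose(withPositiveDet(V));
        double[][] W = {{0, -1, 0}, {1, 0, 0}, {0, 0, 1}};
        return new double[][][] {
            mul(mul(u, W), vt),
            mul(mul(u, transpose(W)), vt)
        };
    }

    // Step ii and its alternative from step v: { T, -T }, T = third column of U.
    static double[][] translations(double[][] U) {
        double[][] u = withPositiveDet(U);
        double[] t = {u[0][2], u[1][2], u[2][2]};
        return new double[][] {t, {-t[0], -t[1], -t[2]}};
    }
}
```

Thanks to the sign fix in step i, both returned rotations have determinant +1, so they are proper rotations and not reflections.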
The notation ' denotes the transpose, and the notation [.]x used in step iii corresponds to the skew-symmetric operator. Applying the skew-symmetric operator to a 3x1 vector e = [e_1; e_2; e_3] results in the following (see the Wikipedia article on cross product):
[e]x = [0,-e_3,e_2; e_3,0,-e_1; -e_2,e_1,0]
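As a small illustration, the operator can be written in plain Java as follows. The class and method names are hypothetical. Multiplying [e]x by a vector v gives the cross product e x v, which is why the operator appears in step iii.

```java
// Sketch only: the skew-symmetric (cross-product) matrix of a 3-vector.
final class SkewSymmetric {

    // Builds [e]x so that [e]x * v equals the cross product e x v.
    static double[][] skew(double[] e) {
        return new double[][] {
            {0, -e[2], e[1]},
            {e[2], 0, -e[0]},
            {-e[1], e[0], 0}
        };
    }

    // Multiplies a 3x3 matrix by a 3-vector.
    static double[] apply(double[][] m, double[] v) {
        double[] out = new double[3];
        for (int i = 0; i < 3; i++) {
            out[i] = m[i][0] * v[0] + m[i][1] * v[1] + m[i][2] * v[2];
        }
        return out;
    }
}
```

A quick sanity check is that apply(skew(e), e) is the zero vector, since any vector crossed with itself vanishes.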
Finally, note that the norm of T2 will always be 1, since it is one of the columns of an orthogonal matrix. This means that you won't be able to recover the true distance between the two cameras. For that purpose, you need to know the true distance between two points in the scene and take that into account to calculate the true distance between the cameras.
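The rescaling can be sketched as follows, assuming two scene points have already been reconstructed with the unit-norm T2 (by whatever triangulation you use) and that their true separation is known. The class and method names are hypothetical.

```java
// Sketch only: recover metric scale from one known scene distance.
final class BaselineScale {

    // Ratio between the known distance and the reconstructed distance
    // of two 3D points p and q obtained with the unit-norm translation.
    static double scaleFactor(double[] p, double[] q, double trueDistance) {
        double dx = p[0] - q[0], dy = p[1] - q[1], dz = p[2] - q[2];
        double reconstructed = Math.sqrt(dx * dx + dy * dy + dz * dz);
        return trueDistance / reconstructed;
    }

    // Translation in real-world units; its norm is the camera-to-camera distance.
    static double[] scaledTranslation(double[] unitT, double scale) {
        return new double[] {unitT[0] * scale, unitT[1] * scale, unitT[2] * scale};
    }
}
```

Because T2 has norm 1, the scale factor itself is the true distance between the cameras, expressed in the same unit as the known scene distance.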