I have a calibrated camera (intrinsic matrix and distortion coefficients) and I want to find the camera position, given some 3D points and their corresponding 2D points in the image.

I know that `cv::solvePnP` could help me, and after reading this and this I understand that the outputs of solvePnP, `rvec` and `tvec`, are the rotation and translation of the object in the camera coordinate system. So I need to find out the camera rotation/translation in the world coordinate system.

From the links above it seems that the code is straightforward, in python:

`found, rvec, tvec = cv2.solvePnP(object_3d_points, object_2d_points, camera_matrix, dist_coefs)`

`rotM = cv2.Rodrigues(rvec)[0]`

`cameraPosition = -np.matrix(rotM).T * np.matrix(tvec)`

I don’t know Python/NumPy (I’m using C++), but this does not make a lot of sense to me:

- rvec, tvec output from solvePnP are 3×1 matrices, i.e. 3-element vectors
- cv2.Rodrigues(rvec) is a 3×3 matrix
- cv2.Rodrigues(rvec)[0] is a 3×1 matrix, a 3-element vector
- cameraPosition is a 3×1 * 1×3 matrix multiplication, which is a 3×3 matrix. How can I use this in OpenGL with simple `glTranslatef` and `glRotate` calls?

**Answer**

If by “world coordinates” you mean “object coordinates”, you have to take the inverse of the transformation returned by the PnP algorithm.

There is a trick for inverting transformation matrices that avoids a general matrix inversion, which is usually expensive, and it explains the Python code. Given a transformation `[R|t]`, we have that `inv([R|t]) = [R'|-R'*t]`, where `R'` is the transpose of `R` (this works because `R` is a rotation matrix, so its inverse is its transpose). So, you can code (not tested):

```
cv::Mat rvec, tvec;
solvePnP(..., rvec, tvec, ...);
// rvec is 3x1, tvec is 3x1
cv::Mat R;
cv::Rodrigues(rvec, R); // R is 3x3
R = R.t(); // rotation of inverse
tvec = -R * tvec; // translation of inverse
cv::Mat T = cv::Mat::eye(4, 4, R.type()); // T is 4x4
R.copyTo(T(cv::Range(0,3), cv::Range(0,3))); // copies R into the top-left 3x3 block of T
tvec.copyTo(T(cv::Range(0,3), cv::Range(3,4))); // copies tvec into the last column of T
// T is a 4x4 matrix with the pose of the camera in the object frame
```
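To see the inversion trick in isolation, here is a minimal sketch in plain C++ without OpenCV. The names `Pose`, `invertRigid` and `apply` are hypothetical helpers made up for this illustration; they mirror the `R.t()` and `-R * tvec` steps above and assume `R` is a proper rotation matrix.

```cpp
#include <array>

using Mat3 = std::array<std::array<double, 3>, 3>;  // indexed as [row][col]
using Vec3 = std::array<double, 3>;

// Hypothetical helper type: a rigid transform x_out = R * x_in + t.
struct Pose {
    Mat3 R;
    Vec3 t;
};

// Inverts a rigid transform using inv([R|t]) = [R'|-R'*t].
// Only valid when R is a rotation matrix (inverse equals transpose).
Pose invertRigid(const Pose& p) {
    Pose inv{};
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            inv.R[i][j] = p.R[j][i];  // transpose of R
        }
    }
    for (int i = 0; i < 3; ++i) {
        inv.t[i] = 0.0;
        for (int j = 0; j < 3; ++j) {
            inv.t[i] -= inv.R[i][j] * p.t[j];  // -R' * t
        }
    }
    return inv;
}

// Applies the transform to a point: R * x + t.
Vec3 apply(const Pose& p, const Vec3& x) {
    Vec3 y{};
    for (int i = 0; i < 3; ++i) {
        y[i] = p.t[i];
        for (int j = 0; j < 3; ++j) {
            y[i] += p.R[i][j] * x[j];
        }
    }
    return y;
}
```

If `p` holds the object-to-camera transform from solvePnP, then `invertRigid(p).t` is the camera centre in object coordinates. As a sanity check, applying `p` to that point should give the origin of the camera frame.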

**Update:** Later, to use `T` with OpenGL you have to keep in mind that the axes of the camera frame differ between OpenCV and OpenGL.

OpenCV uses the reference frame usually used in computer vision: X points to the right, Y down, Z to the front (as in this image). The frame of the camera in OpenGL is: X points to the right, Y up, Z to the back (as in the left hand side of this image). So, you need to apply a 180-degree rotation around the X axis. The formula for this rotation matrix is on Wikipedia.

```
// T is your 4x4 matrix in the OpenCV frame
cv::Mat RotX = ...; // 4x4 matrix with a 180 deg rotation around X
cv::Mat Tgl = T * RotX; // OpenGL camera in the object frame
```
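The `RotX` placeholder can be written out explicitly: since cos(180°) = -1 and sin(180°) = 0, the rotation is the diagonal matrix diag(1, -1, -1, 1). The following is a sketch in plain C++ rather than `cv::Mat`; the names `Mat4`, `rotX180`, `multiply` and `toOpenGLCamera` are assumptions made for this example.

```cpp
#include <array>

using Mat4 = std::array<std::array<double, 4>, 4>;  // indexed as [row][col]

// Homogeneous 4x4 rotation of 180 degrees around the X axis.
// With cos(180 deg) = -1 and sin(180 deg) = 0 it reduces to diag(1, -1, -1, 1).
Mat4 rotX180() {
    Mat4 m{};  // all zeros
    m[0][0] = 1.0;
    m[1][1] = -1.0;
    m[2][2] = -1.0;
    m[3][3] = 1.0;
    return m;
}

// Plain 4x4 matrix product a * b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 out{};
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            for (int k = 0; k < 4; ++k) {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    return out;
}

// Tgl = T * RotX: camera pose in the object frame with OpenGL camera axes.
Mat4 toOpenGLCamera(const Mat4& T) {
    return multiply(T, rotX180());
}
```

Multiplying on the right by this matrix simply negates the second and third columns of `T` (the camera's Y and Z axes) and leaves the translation column untouched.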

These transformations are always confusing and I may be wrong at some step, so take this with a grain of salt.

Finally, take into account that matrices in OpenCV are stored in row-major order in memory, while OpenGL expects them in column-major order.
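As a rough illustration of the storage-order point, the sketch below flattens a row-major 4x4 matrix into the 16-element column-major layout that legacy OpenGL calls such as `glLoadMatrixd` or `glMultMatrixd` expect. `Mat4` and `toColumnMajor` are hypothetical names for this example, not part of OpenCV or OpenGL.

```cpp
#include <array>

using Mat4 = std::array<std::array<double, 4>, 4>;  // row-major, indexed as [row][col]

// Flattens a row-major 4x4 matrix into a column-major array of 16 doubles.
// Element (r, c) ends up at index c * 4 + r.
std::array<double, 16> toColumnMajor(const Mat4& m) {
    std::array<double, 16> out{};
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            out[c * 4 + r] = m[r][c];
        }
    }
    return out;
}
```

In the column-major result the translation occupies indices 12, 13 and 14, which is where OpenGL looks for it. The `.data()` pointer of the returned array could then be handed to the OpenGL call.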

**Attribution** *Source: Link, Question Author: nkint, Answer Author: ChronoTrigger*