ChArUco Pose Estimation
A ChArUco board is a chessboard with an ArUco marker placed inside each white square.
An ArUco marker is a square fiducial marker with a binary ID pattern inside it:
marker id=51
+---------+
| black |
| pattern |
| id bits |
+---------+
An ArUco board lets OpenCV detect square marker IDs and corners directly. A ChArUco board goes one step further: OpenCV detects the ArUco markers first, then uses the known board layout to infer the chessboard intersection corners between the markers. Its pipeline is roughly:
- detect ArUco markers
- use marker IDs to understand board layout
- interpolate/find chessboard corners
- use those corners for calibration or pose estimation
This matters because marker corners are useful, but they are not always the most accurate features. The marker border is thick, and the internal binary pattern is designed for robust ID decoding, not necessarily for the best subpixel corner localization.
ChArUco corners are chessboard intersections, so they are usually better 2D features for pose estimation.
The goal of the whole pipeline is to build accurate 2D-to-3D correspondences:
known 3D board point <--> detected 2D image point
Then solvePnP estimates the pose that best explains those correspondences.
Step 1 - Detect ArUco Corners
When OpenCV detects ArUco markers, it returns marker IDs and four image corners per marker:
marker_corners, marker_ids, rejected = detector.detectMarkers(gray)
# marker_ids:
# [51, 56, 58, ...]
# marker_corners:
# 4 image points per marker
Each detected marker gives four 2D pixel corners:
id=51 corner 0
id=51 corner 1
id=51 corner 2
id=51 corner 3
For ChArUco, the detector gives both the raw ArUco detections and the interpolated ChArUco corners:
charuco_corners, charuco_ids, marker_corners, marker_ids = (
    charuco_detector.detectBoard(gray)
)
# marker_corners, marker_ids:
# raw ArUco marker detections
# charuco_corners, charuco_ids:
# detected/interpolated ChArUco chessboard corners
The runtime flow is:
camera image
↓
detect ArUco markers
↓
use marker IDs to determine board position/orientation
↓
interpolate visible ChArUco chessboard corners
↓
match ChArUco corner IDs to known 3D board points
↓
solve board pose with solvePnP / solvePnPRansac
So ChArUco depends on ArUco. You usually do not manually choose “ArUco or ChArUco” at runtime. If you call CharucoDetector.detectBoard(), the ArUco detection step is part of the ChArUco detection process.
Step 2 - Matching Image Points to Board Points
For pose estimation, the important outputs are:
charuco_ids
charuco_corners
Each ChArUco corner ID corresponds to a known 3D point on the physical board.
At runtime, this call:
object_points, image_points = self.board.matchImagePoints(
    charuco_corners,
    charuco_ids,
)
means:
For each detected ChArUco corner ID:
    find the known 3D location of that corner on the board
    pair it with the detected 2D pixel location in the image
The result is a matched list:
object_points[i] <--> image_points[i]
For example:
object_points[0] = [0.045, 0.045, 0.000] meters
image_points[0] = [612.3, 381.7] pixels
object_points[1] = [0.090, 0.045, 0.000] meters
image_points[1] = [655.8, 379.9] pixels
The object points are 3D coordinates in the board frame. Since the board is planar, their z values are usually zero:
X_board = [x, y, 0]
The image points are 2D pixel coordinates:
u, v = pixel location in the camera image
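The ID-to-3D lookup can be pictured with a small NumPy sketch. This is only an illustration of the idea, not how you should compute the points in practice: matchImagePoints() reads them from the board object. The sketch assumes corner IDs run row by row over the interior corners and that the first interior corner sits one square in from the board origin; squares_x = 5 is a made-up value, and the 45 mm square length is inferred from the example numbers above.

```python
import numpy as np

def charuco_object_points(charuco_ids, squares_x, square_length):
    """Map ChArUco corner IDs to 3D board points (illustrative layout only)."""
    ids = np.asarray(charuco_ids).ravel()
    cols = ids % (squares_x - 1)   # interior corners per row = squares_x - 1
    rows = ids // (squares_x - 1)
    x = (cols + 1) * square_length
    y = (rows + 1) * square_length
    z = np.zeros_like(x)           # planar board, so z is always zero
    return np.stack([x, y, z], axis=1)

# Hypothetical board: 5 squares wide, 45 mm squares.
points = charuco_object_points([0, 1], squares_x=5, square_length=0.045)
print(points)
```

With these assumptions, IDs 0 and 1 map to the same two object points shown in the example above.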
Step 3 - What solvePnP Solves
Given:
3D board points
matching 2D image points
camera intrinsics K
distortion coefficients D
solvePnP estimates:
rvec, tvec
These describe the rigid transform from the board frame to the camera frame.
In equation form:
X_camera = R * X_board + t
Then OpenCV projects the camera-frame 3D point into the image:
X_board -> X_camera -> pixel coordinate
The solver chooses R and t so that the projected pixels land close to the detected ChArUco pixels:
projected pixel ≈ detected ChArUco pixel
This is the reprojection-error idea.
Pose Direction: The Important Part
The subtle but important part is the direction of the returned pose.
OpenCV returns:
rvec, tvec = transform from object/board frame to camera frame
Meaning:
X_camera = R @ X_board + t
Which is not directly:
the camera origin expressed in the board frame
What RANSAC Adds
Plain solvePnP() uses all point correspondences. If one or two ChArUco corners are wrong, the final pose can be pulled away from the correct solution.
solvePnPRansac() is more robust because it tries to find a pose that agrees with most of the points while rejecting outliers.
A simplified mental model is:
best_pose = None
best_inliers = []
for trial in range(100):
    # 1. sample a small subset of 2D/3D matches
    # 2. solve a candidate pose from that subset
    # 3. project all 3D points into the image
    # 4. measure reprojection error for every point
    # 5. keep points whose error is small enough
    if len(inliers) > len(best_inliers):
        best_pose = candidate_pose
        best_inliers = inliers
return best_pose, best_inliers
The reprojectionError parameter of solvePnPRansac() sets the inlier threshold in pixels: the smaller it is, the stricter the inlier test.
For example, if:
max_reproj_error_px = 2.0
then a point is considered an inlier if:
distance(projected_pixel, detected_pixel) < 2 px
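The inlier test itself is easy to write down in NumPy. This sketch only illustrates the thresholding step and is not OpenCV's internal implementation; the function name inlier_mask and the projected pixel values are made up, and the third detected point is deliberately placed 10 px off.

```python
import numpy as np

def inlier_mask(projected_px, detected_px, max_reproj_error_px=2.0):
    """Boolean mask of points whose reprojection error is under the threshold."""
    projected = np.asarray(projected_px, dtype=float).reshape(-1, 2)
    detected = np.asarray(detected_px, dtype=float).reshape(-1, 2)
    errors = np.linalg.norm(projected - detected, axis=1)
    return errors < max_reproj_error_px

# Hypothetical projections next to detections; the last pair is a bad match.
projected = [[612.0, 381.0], [655.0, 380.0], [700.0, 379.0]]
detected = [[612.3, 381.7], [655.8, 379.9], [690.0, 379.0]]
mask = inlier_mask(projected, detected)
print(mask)
```

The first two points are less than 1 px from their projections and pass the test, while the third fails it and would be excluded from the pose fit.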
Mental model:
solvePnP:
    fit pose using all points
solvePnPRansac:
    find a pose while rejecting bad point matches
What solvePnPRefineLM() Does
After RANSAC finds a good inlier set, you can refine the pose:
cv2.solvePnPRefineLM(
    object_points[inlier_indices],
    image_points[inlier_indices],
    camera_matrix,
    dist_coeffs,
    rvec,
    tvec,
)
This runs a local nonlinear optimization. It slightly adjusts rvec and tvec to reduce reprojection error on the inlier points.
Important Warning About Custom Board Frames
matchImagePoints() returns object points in the coordinate frame used by the OpenCV board object.
So if self.board is a normal OpenCV CharucoBoard, the returned object_points are in OpenCV’s board frame. They are not automatically converted into your custom bottom-left checker frame.
This means that the call:
object_points, image_points = self.board.matchImagePoints(
    charuco_corners,
    charuco_ids,
)
does not automatically use your custom board_point_to_checker() convention. You can still use matchImagePoints() as-is, in which case the returned pose means:
OpenCV board frame -> camera frame
or:
T_camera_board_opencv
This is easiest if you only need a consistent board pose and do not care where the board origin is placed physically.
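If you do care about a custom origin, one option is to keep the OpenCV pose and compose it with a fixed transform between the two board frames. The sketch below uses 4x4 homogeneous matrices; T_board_opencv_custom and every number in it are hypothetical and depend entirely on your board geometry and your own frame convention.

```python
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Pose from solvePnP: OpenCV board frame -> camera frame (example values).
T_camera_board_opencv = make_transform(np.eye(3), [0.0, 0.0, 0.6])

# Hypothetical fixed offset: custom frame -> OpenCV board frame.
# Here the custom frame is rotated 180 degrees about x and shifted 0.18 m in y.
R_flip = np.diag([1.0, -1.0, -1.0])
T_board_opencv_custom = make_transform(R_flip, [0.0, 0.18, 0.0])

# Custom board frame -> camera frame.
T_camera_custom = T_camera_board_opencv @ T_board_opencv_custom

custom_origin_in_camera = T_camera_custom @ np.array([0.0, 0.0, 0.0, 1.0])
print(custom_origin_in_camera[:3])
```

The alternative is to build the object points in your custom frame yourself and pass those to solvePnP, in which case the returned pose is already expressed relative to the custom frame.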
OpenCV Version Warning
If you generate and print a ChArUco board with one OpenCV version, then detect it with another, make sure the board pattern convention is compatible.
This is especially worth checking for boards generated before OpenCV 4.6.0. If the physical printed board was generated with the older convention, you may need to enable the legacy pattern setting when constructing the board.
The practical rule is:
The printed board layout and the OpenCV board object must agree.
If they do not agree, marker detection may still work, but the ChArUco corner IDs or board geometry can be inconsistent, which can break pose estimation.