Robotics - [Bugs - 1] SLAM Related "Epic Bugs"

G2O Optimization Vertex Updates, Compiler-Specific Bugs

Posted by Rico's Nerd Cluster on February 1, 2025

Below is a list of bugs that took me multiple hours, if not days, to troubleshoot and analyze. These are the “epic bugs” that are worth remembering for my career.

Epic Bug 1: G2O Optimization Didn’t Update Vertex

Summary

This bug took several hours (if not days) to debug. It appeared as if G2O was ignoring the optimization: the vertex (pose) remained unchanged despite running optimizer.optimize(1).

After ruling out common culprits like:

  • Vertex not being added correctly
  • Edges not referencing the right vertex
  • Incorrect Jacobian implementation
  • G2O configuration/setup issues

I eventually traced the issue to a subtle but critical misunderstanding in the error term formulation.

Context

In a point-to-line 2D ICP formulation, the error term e is typically calculated as the distance from a point to a line. The simplified (but effective) version of that is:

\[ e = a p_x + b p_y + c \]

Where a, b, and c define a line (ax + by + c = 0), and $(p_x, p_y)$ is the point.
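For reference, the standard point-to-line distance $d$ relates to this simplified error as shown below. The two agree up to sign only when the coefficients are normalized so that $a^2 + b^2 = 1$; whether the line fit used here normalizes them is not stated, so treat that as an assumption.

\[ d = \frac{|a p_x + b p_y + c|}{\sqrt{a^2 + b^2}} = \frac{|e|}{\sqrt{a^2 + b^2}} \]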

In my case, the point came from source_cloud, expressed in the body frame. However, the line coefficients a, b, c were fit in the map frame, using nearest neighbors from the target cloud.

The Mistake

I precomputed the error term and stored it inside a struct:

_error[0] = point_line_data_vec_ptr_->at(point_idx_).error_;

And upstream, this was assigned as:

ret.at(idx).error_ = source_pt * point_coord; // Wrong!

The problem? This source_pt is in the body frame, so the stored error never involves the pose being optimized. It is as if the optimization were done relative to the body frame, not the map/submap frame. Because the error is now invariant to pose changes, the optimization has no gradient to follow, and G2O doesn’t change the pose, even if the edges are correctly wired.

What Threw Me Off

  • Point-to-line distances are frame-invariant.
  • But scaled error terms like ap_x + bp_y + c are not.
  • That mistake causes the optimizer to think the current pose is already optimal — so it just stays put.

It was like optimizing with the body frame assumed to be the map frame — a silent bug with no crash or warning, just no progress.
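The effect can be reproduced without G2O. The sketch below is a minimal illustration, not the original code: Point2D, Pose2D, Line2D, and the three functions are hypothetical stand-ins for the real Vec2d and VertexSE2 types. It contrasts an error evaluated on the body-frame point (which ignores the pose) with one evaluated on the point transformed into the map frame.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal types; the original code uses Vec2d and g2o's VertexSE2.
struct Point2D { double x, y; };
struct Pose2D { double x, y, theta; };   // body-to-map transform
struct Line2D { double a, b, c; };       // a*x + b*y + c = 0, fit in the map frame

// Transform a body-frame point into the map frame.
inline Point2D to_map(const Pose2D &T, const Point2D &p) {
    const double cs = std::cos(T.theta), sn = std::sin(T.theta);
    return {cs * p.x - sn * p.y + T.x, sn * p.x + cs * p.y + T.y};
}

// Buggy version: plugs the body-frame point straight into the map-frame line.
// The pose argument is unused, so the value cannot change during optimization.
inline double error_body_frame(const Line2D &l, const Pose2D & /*pose*/, const Point2D &p_body) {
    return l.a * p_body.x + l.b * p_body.y + l.c;
}

// Fixed version: transforms the point with the current pose first.
inline double error_map_frame(const Line2D &l, const Pose2D &pose, const Point2D &p_body) {
    const Point2D pw = to_map(pose, p_body);
    return l.a * pw.x + l.b * pw.y + l.c;
}
```

With the line y = 1 (a = 0, b = 1, c = -1) and a body-frame point (2, 0), moving the pose from (0, 0, 0) to (0, 1, 0) changes the map-frame error from -1 to 0, while the body-frame error stays at -1 for every pose.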

The Fix

Don’t precompute error_ using body frame coordinates. Instead, compute e = ax + by + c dynamically in computeError() using the transformed map-frame point.

The corrected version is now:

#include <cmath>

class EdgeICP2D_PT2Line : public g2o::BaseUnaryEdge<1, double, VertexSE2> {
    ....
    void computeError() override {
        auto *pose = static_cast<const VertexSE2 *>(_vertices[0]);
        // Source point in the body frame, stored as range and bearing.
        double r     = source_scan_objs_ptr_->at(point_idx_).range;
        double theta = source_scan_objs_ptr_->at(point_idx_).angle;
        // Line coefficients (a, b, c), fit in the map frame.
        double a = point_line_data_vec_ptr_->at(point_idx_).params_[0];
        double b = point_line_data_vec_ptr_->at(point_idx_).params_[1];
        double c = point_line_data_vec_ptr_->at(point_idx_).params_[2];
        // Transform the point into the map frame with the current pose estimate.
        Vec2d pw = pose->estimate() * Vec2d{r * std::cos(theta), r * std::sin(theta)};
        // Evaluate the residual in the same frame the line was fit in.
        _error[0] = a * pw.x() + b * pw.y() + c;
    }
};

Reference to the fixed code

Lessons Learned

  • Optimized pipelines are hard to debug. To maximize vectorization, it’s tempting to calculate intermediate results in parallel and store them. However, when something goes wrong downstream, especially with a conceptual math error, the cause may be a silent assumption made upstream.
  • Coordinate frames matter: Even when the math looks simple, subtle frame mismatches can render your optimizer useless.
  • Scaled point-to-line errors are not frame invariant: If you’re using ap_x + bp_y + c, you must express the point in the same frame as the line.
  • Verbose mode helps: G2O’s setVerbose(true) didn’t show errors, but the chi² staying constant was a hint that nothing was being optimized.
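A constant chi² can also be caught earlier with a quick perturbation test on the residual itself. The sketch below is a hypothetical helper, not part of G2O: it assumes the residual can be wrapped as a callable of the 2D pose (x, y, theta), nudges each parameter by a small eps, and reports whether the value moves at all.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical helper type: a scalar residual as a function of a 2D pose (x, y, theta).
using Residual2D = std::function<double(double, double, double)>;

// Returns true if nudging at least one pose parameter changes the residual.
// A result of false means the residual is flat in x, y, and theta at this pose.
inline bool residual_depends_on_pose(const Residual2D &residual, double x, double y,
                                     double theta, double eps = 1e-6, double tol = 1e-12) {
    const double e0 = residual(x, y, theta);
    const bool moves_x = std::abs(residual(x + eps, y, theta) - e0) > tol;
    const bool moves_y = std::abs(residual(x, y + eps, theta) - e0) > tol;
    const bool moves_t = std::abs(residual(x, y, theta + eps) - e0) > tol;
    return moves_x || moves_y || moves_t;
}
```

A precomputed error like the one in this bug returns false at every pose. A correct residual can still be flat in one direction (for example, sliding along the line), which is why the check only fails when all three perturbations leave the value unchanged.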

Epic Bug 2: Compiler Bugs

Here I’m not writing about “giant” bugs, but small tricky ones.

Cannot Find Overloaded Operators

I had a bug where an operator<< was defined in namespace1. Because I spent most of my time developing inside this namespace, I forgot that its test, where namespaces are spelled out explicitly, also needed to bring namespace1 into scope.
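One way this can happen is sketched below. The operand type is an assumption, since the original types are not shown: here the overload takes a std::vector<int>, which lives in std, so argument-dependent lookup never searches namespace1. Code inside namespace1 finds the overload by ordinary lookup, while code outside has to name it explicitly. format_inside and format_outside are hypothetical names for illustration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

namespace namespace1 {
// Hypothetical overload: the operand type belongs to std, not namespace1,
// so argument-dependent lookup (ADL) will not look in namespace1 for it.
inline std::ostream &operator<<(std::ostream &os, const std::vector<int> &v) {
    for (int x : v) os << x << ' ';
    return os;
}

inline std::string format_inside(const std::vector<int> &v) {
    std::ostringstream oss;
    oss << v;  // found by ordinary unqualified lookup inside namespace1
    return oss.str();
}
}  // namespace namespace1

// Code outside namespace1, such as a test file.
inline std::string format_outside(const std::vector<int> &v) {
    using namespace1::operator<<;  // without this line, `oss << v` does not compile
    std::ostringstream oss;
    oss << v;
    return oss.str();
}
```

If the operand type were itself declared in namespace1, ADL would find the operator from anywhere and the using-declaration would not be needed.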

Non-Dependent static_assert in if constexpr Always Fails In Older Compiler

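With gcc 14.2, the snippet below compiles: the static_assert(false) sits in a branch of if constexpr that is discarded, so it is never evaluated. With gcc 10.1, the same static_assert(false) fires even when that branch is never instantiated, because its condition does not depend on a template parameter. Here is a proposal for the fix in newer compilers.

The code snippet is linked here, but I’m posting it anyway: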

#include <type_traits>

// Non-dependent constant: its value is known as soon as the template is parsed.
inline constexpr bool always_false = false;

// Dependent constant: its value is only known once T is substituted.
template <typename T>
inline constexpr bool templated_always_false = false;

// This compiles in gcc 14.2.
template <typename Foo>
void my_func() {
    if constexpr (std::is_same_v<Foo, int>) {
        // do something
    } else {
        // Fine on older compilers too: the condition depends on Foo, so the
        // assertion is only checked if this branch is actually instantiated.
        static_assert(templated_always_false<Foo>, "Unsupported Foo type");

        // Non-dependent: gcc 10.1 rejects this even if the branch is never instantiated.
        static_assert(false, "Unsupported Foo type");
    }
}
  • Use gcc --version to check your compiler’s version!

PCL Is A Worm Hole

  • PCL does not support in-place filtering. Use a tmp cloud instead:
    
      voxel_filter_.setInputCloud(local_map_);
      voxel_filter_.filter(*tmp_cloud_);
      // swap or copy the filtered result back into local_map_
      local_map_->swap(*tmp_cloud_);