Skip to content

Walkthrough

This module relies on:

  • https://github.com/leskovec/pyC_part.II

The structure is deliberately like:

  1. write a small C++ “science kernel” that uses external libraries (say Eigen + GSL)
  2. build it as a normal C++ library (so you can also call it from a C++ program)
  3. wrap the same library with pybind11 so Python can call it too
  4. use a small Python script to import the module and inspect the returned data

The key teaching point is not just “Eigen and GSL work”, but:

When you bind library code, you usually want to bind your own small API, not the whole library. Here, our API is a single function get_my_eigvals(n) that returns a small result struct.


1) The C++ API: a header that defines what we expose

The library header lib/lib.h defines:

  • a small struct result that stores outputs in STL containers
  • the function we want to call from both C++ and Python: get_my_eigvals(n)
#pragma once

#include <iostream>

// eigen: matrix and eigenvalues
#include <Eigen/Dense>
#include <Eigen/Eigenvalues>

// gsl: random number generator
#include <GSL/gsl_rng.h>

// struct to store the result
struct result {
    std::vector<double> eigval;
    std::vector<std::vector<double>> matrix;
};

// wrapper function that does all the work
result get_my_eigvals(const int n);

Why the result struct uses std::vector

Python doesn’t always know how to handle all types of containers; while it knows Eigen, it does not know GSL, so safe way for general use is to have pybind11 automatically convert STL containers to Python containers.

So we store:

  • eigval as std::vector<double> → becomes a Python list[float]
  • matrix as std::vector<std::vector<double>> → becomes a Python “list of lists”

This is a very common beginner-friendly strategy:

do your numerics in Eigen/GSL, but serialize the output into STL containers at the boundary.


2) The numerical kernel: Eigen + GSL in lib/lib.cxx

Here is the implementation. This file is where the “science work” happens.

#include "lib.h"


result get_my_eigvals(const int n){
    Eigen::MatrixXd A = Eigen::MatrixXd::Zero(n,n);

    // use GSL to generate random numbers
    gsl_rng * r = gsl_rng_alloc (gsl_rng_taus);
    gsl_rng_set(r, 1);

    // populate A with random numbers from GSL
    for (int i=0; i<n; i++){
        for (int j=0; j<n; j++){
            A(i,j) = gsl_rng_uniform (r);
        }
    }

    // make a self-adjoint matrix
    Eigen::MatrixXd B = A.transpose()*A;

    // solve the eigensystem
    Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigsys(B);

    // store the eigenvalues and the matrix to result and return
    // note, python won't know what to do with Eigen Vectors and Matrices,
    // but it can handle stl containers like std::vector

    result res;
    res.eigval.resize(n);
    for(int i=0; i<n; ++i){
      res.eigval[i] = eigsys.eigenvalues()(i);
    }
    res.matrix.resize(n);
    for(int i=0; i<n; ++i){
      res.matrix[i].resize(n);
      for(int j=0; j<n; ++j){
        res.matrix[i][j] = B(i,j);
      }
    }

    return res;
}

Let’s explain what it does in the same order as the code (for information purposes).

Step A — construct an n×n matrix A

Eigen::MatrixXd A = Eigen::MatrixXd::Zero(n,n);

We start with zeros and then fill A with random numbers.

Step B — generate random numbers with GSL

gsl_rng * r = gsl_rng_alloc (gsl_rng_taus);
gsl_rng_set(r, 1);
A(i,j) = gsl_rng_uniform (r);
  • gsl_rng_* creates a random number generator object
  • gsl_rng_uniform(r) produces numbers in [0, 1)
  • the seed is fixed (1) so the example is reproducible (important for teaching)

Step C — build a symmetric positive semi-definite matrix B = Aᵀ A

Eigen::MatrixXd B = A.transpose()*A;

This is the trick used in the slides: Aᵀ A is guaranteed to be symmetric and positive semi-definite, so it is a good input for a self-adjoint eigensolver.

Step D — solve the eigensystem

Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> eigsys(B);

Because B is symmetric, we use the self-adjoint solver.

Step E — copy results into STL containers for Python

The code then copies:

  • eigenvalues into res.eigval
  • the matrix B into res.matrix

This is the crucial “bridge” move. Yes, it is a copy — but it is very explicit, easy to understand, and works reliably.


3) Building the library: lib/CMakeLists.txt

We treat the Eigen/GSL code as a library first. That’s what add_library(lib …) + add_subdirectory(lib) accomplish.

Here is the library’s CMake file:

target_sources(lib PUBLIC
    "${CMAKE_CURRENT_SOURCE_DIR}/lib.h"
    "${CMAKE_CURRENT_SOURCE_DIR}/lib.cxx"
)

find_package(Eigen3 3.4 REQUIRED NO_MODULE)
target_link_libraries(lib PUBLIC Eigen3::Eigen)

find_package(GSL REQUIRED)
target_link_libraries(lib PUBLIC GSL::gsl GSL::gslcblas)

target_include_directories(lib PUBLIC "${CMAKE_CURRENT_SOURCE_DIR}" 
                                       "${Eigen_INCLUDE_DIRS}"  
                                       "${GSL_INCLUDE_DIRS}")
target_compile_definitions(lib PUBLIC BASE_VERSION=1.0)

What this does:

  • compiles lib.h + lib.cxx into a static library target called lib
  • finds Eigen and GSL
  • links them into lib
  • exports include directories so anything that links lib can include "lib.h"

This structure is “real project structure”: the core numerical code is not trapped inside a Python binding.


4) A pure C++ client: main.cxx

Before binding anything, the repo also shows how we would use the same function from a C++ program.

#include "lib.h"
#include <iostream>

int main(int argc, char* argv[]){

    if (argc < 2 ){
      std::cout << "Usage: " << argv[0] << " n" << std::endl;
      std::cout << "n: number of eigenvalues to compute" << std::endl;
      return 1;
    }

    int n = atoi(argv[1]);
    std::cout<< "computing eigenvalues of a " 
             << n << "x" << n << " matrix" 
             << std::endl;
    result r = get_my_eigvals(n);
    std::cout << "eigenvalues: ";
    for(int i=0; i<n; ++i){
      std::cout<< r.eigval[i] << " ";
    } 
    std::cout << std::endl;
    return 0;
}

That’s good scientific/engineering practice:

  • test the kernel without Python
  • keep a known-good reference executable
  • only then wrap it

5) The pybind11 module: eigvals.cxx

Let's take a look at the “eigvals.cxx” module.

#include <pybind11/pybind11.h>
#include <pybind11/stl.h>

#include <sstream>

#include "lib.h"

namespace py = pybind11;

PYBIND11_MODULE(eigvals, m) {
  m.doc() = "get eigenvalues from a positive definite random matrix";
  // add the results struct
  py::class_<result>(m, "result")
    .def(py::init<>())
    .def_readwrite("eigval", &result::eigval)
    .def_readwrite("matrix", &result::matrix);

  // add the get my evals function
  m.def("get_my_eigvals", &get_my_eigvals, "get eigenvalues from a positive definite random matrix", py::arg("n"),py::return_value_policy::reference);
}

Let’s unpack the binding layer.

The module definition

PYBIND11_MODULE(eigvals, m) { ... }

This produces a Python module you import as:

import eigvals

Exposing the result struct as a Python class

py::class_<result>(m, "result")
  .def(py::init<>())
  .def_readwrite("eigval", &result::eigval)
  .def_readwrite("matrix", &result::matrix);

This is the class-binding analogue of m.def(...):

  • it creates a Python-visible class named result
  • it exposes two public members (eigval, matrix) as Python attributes

Because we included pybind11/stl.h, the STL containers are automatically converted.

Exposing the function

m.def("get_my_eigvals", &get_my_eigvals, ...);

This makes eigvals.get_my_eigvals(n) available in Python. Since get_my_eigvals returns a result object by value, pybind11 materializes a Python-visible result instance from that returned C++ object. The binding also explicitly specifies:

py::return_value_policy::reference

Pedagogically, what matters here is the intent:

  • we return a struct (not just a float)
  • Python receives an instance of the bound result class
  • its members become Python containers

(For production bindings you’d be more careful about lifetime; for a workshop this is fine.)


6) The “no-copying .so around” trick: source_py/test.py

In earlier modules you copied the built .so into a Python folder. In this module we introduce a better practice: install the module and add it to sys.path.

Here is the repo’s test.py:

import sys

sys.path.insert(0, '/path/to/pyC_part.II/install/pymodule')

import eigvals as ev


n=4
a = ev.get_my_eigvals(n)

print("matrix:")
for i in range(n):
    print(a.matrix[i][:])

print("eigenvalues:")
print(a.eigval)

print(type(a))
print(type(a.matrix))
print(type(a.eigval))

The important line is:

sys.path.insert(0, ".../install/pymodule")

That corresponds to the install rule in the top-level CMakeLists.txt:

install(TARGETS eigvals DESTINATION ${CMAKE_INSTALL_PREFIX}/pymodule)

So the workflow is:

  • build + install into an install/ prefix
  • Python imports from install/pymodule/

In practice it is much better than manually copying .so files around.


What you should take away

This module is “long” because it teaches a real workflow:

  • core numerical code in a library (lib)
  • two front ends: a C++ executable and a Python module
  • a small, explicit API (get_my_eigvals(n) returning a result struct)
  • STL containers at the boundary (because they map cleanly to Python)

Once you can structure projects like this, you can wrap much more serious scientific libraries without turning your binding code into a mess.