Content

This module relies on:

  • https://github.com/leskovec/pyC_part.I

In Module 03 we mostly bound functions. In this module we do the next big thing: we bind a C++ class.

The teaching repo implements a small C++ class VectorInt that stores integers in a std::vector<int>, and then exposes it to Python as a class called PyVectorInt.

Why this design is useful for beginners:

  • you get to see how a C++ object becomes a Python object
  • you learn the “Python dunder methods” (__len__, __getitem__, __iter__, …) that make the class feel natural
  • you learn how to expose overloaded methods, enums, and even a callback

The core theme is still the same: a thin binding layer maps C++ → Python. But now the map is richer than a single m.def(...) call.


The C++ class: a vector hidden behind an object

The class declaration (from vector_int.h) is short but instructive:

#pragma once

#include <vector>
#include <string>

enum Access { READONLY, READWRITE };

class VectorInt {
  using iterator = typename std::vector<int>::iterator;

public:
  VectorInt();
  VectorInt(const size_t n);

  iterator begin();
  iterator end();

  void push_back(const int val);
  void push_back(const int val, const size_t n);

  int get(const size_t idx) const;
  size_t size() const;

  bool is_empty() const;
  void clear();
  std::string to_string() const;

private:
  std::vector<int> _vec;

public:
  Access _access;
};

A few things to notice immediately:

  • VectorInt is a normal C++ class. There is nothing “Python-ish” in it.
  • The actual data lives in _vec, a private std::vector<int>.
  • There is a tiny “policy knob” _access of type Access (an enum).
  • The class provides iteration (begin(), end()), which will matter when we expose __iter__ in Python.

If you only remember one idea from this module, make it this:

We are not binding std::vector<int> directly. We are binding a C++ class that uses std::vector<int> internally.

That is a very common pattern in real projects: hide implementation details behind a stable interface.


What the C++ methods actually do

The implementation (vector_int.cpp) is deliberately simple.

Constructors

VectorInt::VectorInt() : _vec(), _access(Access::READWRITE) { }

VectorInt::VectorInt(const size_t n) : _vec(n), _access(Access::READWRITE) { }

  • Default constructor: start with an empty vector.
  • Sized constructor: create a vector of n value-initialized integers, so every element starts at 0.
  • Both default the access policy to READWRITE.
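
To make the sized constructor concrete, here is a small standalone C++ sketch (no pybind11 involved). The helper name make_sized is hypothetical and only mirrors what the member initializer _vec(n) does:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring the member initializer _vec(n):
// std::vector<int>(n) value-initializes its elements, so every int is 0.
std::vector<int> make_sized(std::size_t n) {
    return std::vector<int>(n);
}
```

Calling make_sized(3) yields a vector holding three zeros, which is the state a VectorInt built with n = 3 starts in.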

Push back

void VectorInt::push_back(const int val) {
  _vec.push_back(val);
}

void VectorInt::push_back(const int val, const size_t n) {
  for (size_t i = 0; i < n; ++i) {
    _vec.push_back(val);
  }
}

This is a perfect example of why classes are nice for teaching bindings:

  • You can expose the “natural” method name push_back to Python
  • But you must also deal with the fact that C++ allows overloads (same name, different signatures)

We’ll handle that in the binding layer.

Index access and size

int VectorInt::get(const size_t idx) const {
  return _vec[idx];
}

size_t VectorInt::size() const {
  return _vec.size();
}

These are the methods we’ll reuse for Python __getitem__ and __len__.

String representation

std::string VectorInt::to_string() const {
  std::stringstream ss;
  ss << "[";
  for (size_t i = 0; i < size() - 1; ++i) {
    ss << get(i) << ", ";
  }
  ss << get(size() - 1) << "]";
  return ss.str();
}

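This constructs a human-readable string like [1, 2, 3]. In Python terms, we’ll use this as both __repr__ (for debugging) and __str__ (for printing). One caveat: size() returns an unsigned size_t, so for an empty vector size() - 1 wraps around to a huge value and the function reads out of bounds. As written, to_string() is only safe on non-empty vectors.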
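
One way to handle the empty case is sketched below. This is not the repo's code: to_string_safe is a hypothetical free function that works on a plain std::vector<int>, so it can be compiled on its own.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of to_string that is safe for empty input.
// The separator is written before every element except the first,
// so no "size() - 1" arithmetic on an unsigned value is needed.
std::string to_string_safe(const std::vector<int>& values) {
    std::ostringstream ss;
    ss << "[";
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i > 0) {
            ss << ", ";
        }
        ss << values[i];
    }
    ss << "]";
    return ss.str();
}
```

With this version an empty vector formats as [] and a vector holding 1, 2, 3 formats as [1, 2, 3]. The same loop shape could be moved into VectorInt::to_string.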


The binding file: turning a C++ class into a Python class

The binding code lives in module.cpp. This is the “bridge” file.

Headers: why we include these

#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <pybind11/functional.h>

#include "vector_int.h"
namespace py = pybind11;

  • pybind11/pybind11.h: core binding API
  • pybind11/stl.h: type conversions for STL types (lists ↔ vectors, etc.)
  • pybind11/functional.h: needed when binding std::function (for callbacks)

Even if you never use callbacks, it’s good to know that functional.h is the “opt-in” header for Python↔C++ callables.


Binding a class with py::class_

The basic form is:

py::class_<VectorInt>(m, "PyVectorInt", py::dynamic_attr())

Read this as:

  • “Expose the C++ type VectorInt
  • … inside module m
  • … under the Python name PyVectorInt.”

The extra py::dynamic_attr() is a Python quality-of-life feature: it lets users attach new attributes to an instance at runtime, just as they can with most pure-Python classes.

Example of what this enables in Python:

v = vecint.PyVectorInt()
v.label = "trial run 7"   # dynamically added attribute

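If you omit py::dynamic_attr(), that assignment raises AttributeError: by default, a pybind11 class only accepts attributes that were explicitly bound (for example with def_readwrite or def_property).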


Making the class feel Pythonic with dunder methods

Constructors → __init__

.def(py::init<>())              // PyVectorInt()
.def(py::init<const size_t>())  // PyVectorInt(n)

This gives Python two ways to construct the object:

  • empty vector
  • pre-sized vector

Printing: __repr__ and __str__

.def("__repr__", &VectorInt::to_string)
.def("__str__",  &VectorInt::to_string)

In Python:

  • print(v) calls __str__
  • the interactive prompt uses __repr__

Here we keep them identical, which is fine for a teaching example.

Indexing: v[i]

.def("__getitem__", &VectorInt::get)

Now Python can do:

v[0]

Note: the C++ get() currently uses _vec[idx] and does not range-check, so an invalid index is undefined behavior in C++ rather than a clean Python IndexError. For real code you would typically add bounds checking and raise an exception if idx is invalid.
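
A minimal sketch of such a bounds check follows. checked_get is a hypothetical free function standing in for a checked VectorInt::get. It relies on std::vector::at, which throws std::out_of_range. pybind11's built-in exception translation turns that exception into a Python IndexError.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical bounds-checked replacement for VectorInt::get.
// std::vector::at throws std::out_of_range for an invalid index
// instead of silently reading past the end of the buffer.
int checked_get(const std::vector<int>& values, std::size_t idx) {
    return values.at(idx);
}
```

Inside the class, the equivalent change would be to return _vec.at(idx) from get().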

Length: len(v)

.def("__len__", &VectorInt::size)

Now Python can do:

len(v)

Iteration: for x in v: ...

This is where things become interesting:

.def("__iter__", [](VectorInt& vec) {
    return py::make_iterator(vec.begin(), vec.end());
}, py::keep_alive<0,1>())

What’s going on here?

  • make_iterator(begin, end) creates a Python iterator object that walks the C++ iterators.
  • keep_alive<0,1>() is important: it tells pybind11 to keep the C++ container (vec) alive as long as the Python iterator exists.

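Without keep_alive, the container could be destroyed while the iterator is still in use (for example, when the iterator is the only thing left that refers to a temporary object), leaving the iterator pointing at freed memory. Note that keep_alive only handles lifetime: modifying the vector during iteration (for example with push_back) can still invalidate the underlying C++ iterators.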
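
The lifetime idea behind keep_alive can be illustrated in plain C++. The sketch below is only an analogy, not how pybind11 is implemented (pybind11 works with Python reference counts). OwningCursor, make_cursor, and sum_via_cursor are hypothetical names. The cursor co-owns its container through a std::shared_ptr, so the data outlives the scope that created it.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Analogy only: an iterator-like cursor that co-owns its container,
// much like keep_alive<0,1> ties the container's lifetime to the iterator.
class OwningCursor {
public:
    explicit OwningCursor(std::shared_ptr<std::vector<int>> owner)
        : _owner(std::move(owner)), _pos(0) {}

    bool done() const { return _pos >= _owner->size(); }
    int next() { return (*_owner)[_pos++]; }

private:
    std::shared_ptr<std::vector<int>> _owner;  // keeps the data alive
    std::size_t _pos;
};

// The local handle "data" goes out of scope on return, but the cursor
// still holds a reference, so the vector is not destroyed.
OwningCursor make_cursor() {
    auto data = std::make_shared<std::vector<int>>(std::vector<int>{1, 2, 3});
    return OwningCursor(data);
}

// Walks the cursor after the creating scope has ended.
int sum_via_cursor() {
    OwningCursor cursor = make_cursor();
    int total = 0;
    while (!cursor.done()) {
        total += cursor.next();
    }
    return total;
}
```

If the cursor stored a raw pointer or a raw iterator instead, sum_via_cursor would read from a destroyed vector. That is the failure mode keep_alive is there to prevent on the Python side.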


Exposing regular methods (including overloads)

Overloaded push_back

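C++ allows overloads, so the expression &VectorInt::push_back on its own is ambiguous: the compiler cannot tell which of the two functions you mean. In the binding we therefore select each overload explicitly and register both under the same Python name. At call time, pybind11 picks the registered overload that matches the arguments passed: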

.def("push_back",
     static_cast<void (VectorInt::*)(const int)>(&VectorInt::push_back),
     "Insert one element", py::arg("elem"))

.def("push_back",
     static_cast<void (VectorInt::*)(const int, const size_t)>(&VectorInt::push_back),
     "Insert elem n times", py::arg("elem"), py::arg("n"))

The static_cast<...> is the crucial piece: it disambiguates which overload you are binding.

In Python, both appear under the same name:

v.push_back(7)
v.push_back(7, 10)   # insert 10 copies
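
The static_cast trick is ordinary C++ and can be tried without pybind11. The sketch below uses a hypothetical stand-in class, MiniVector, with the same two push_back overloads, and calls each one through a member-function pointer:

```cpp
#include <cstddef>
#include <vector>

// Minimal stand-in for VectorInt with the same two push_back overloads.
class MiniVector {
public:
    void push_back(const int val) { _vec.push_back(val); }
    void push_back(const int val, const std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            _vec.push_back(val);
        }
    }
    std::size_t size() const { return _vec.size(); }

private:
    std::vector<int> _vec;
};

// "&MiniVector::push_back" alone names an overload set, not one function.
// Each static_cast selects a single overload by its signature.
// (Top-level const on a parameter is not part of the function type.)
const auto push_one =
    static_cast<void (MiniVector::*)(int)>(&MiniVector::push_back);
const auto push_many =
    static_cast<void (MiniVector::*)(int, std::size_t)>(&MiniVector::push_back);

// Calls both selected overloads through their member-function pointers.
std::size_t demo_size() {
    MiniVector v;
    (v.*push_one)(7);       // appends one element
    (v.*push_many)(7, 10);  // appends ten more elements
    return v.size();
}
```

When compiling as C++14 or later, pybind11 also provides py::overload_cast, which expresses the same selection by listing only the parameter types.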

Other methods

These are straightforward bindings:

.def("is_empty", &VectorInt::is_empty)
.def("clear",    &VectorInt::clear)
.def("size",     &VectorInt::size)

The repo also binds length as a read-only property:

.def_property_readonly("length", &VectorInt::size)

So both of these work in Python:

v.size()
v.length

Binding a public data member: .def_readwrite

The class has a public member _access. The binding uses:

.def_readwrite("access", &VectorInt::_access)

That creates a Python attribute v.access that is both readable and writable.

This is a teaching-friendly way to show: you can expose data members too, not just methods. In real projects, you’ll often prefer a property (def_property) so you can validate on write, but def_readwrite is a great first step.


Binding an enum: Access

The repo exposes the C++ enum to Python:

py::enum_<Access>(m, "Access")
  .value("READONLY",  Access::READONLY)
  .value("READWRITE", Access::READWRITE);

So Python users can write:

v.access = vecint.Access.READONLY

This is much better than inventing “magic integers” in Python.


A callback example: calling Python from C++

The binding also includes a method that accepts a callable:

.def("doForAllElements",
     [](VectorInt& vec, std::function<void(int)>& f) {
         for (int elem : vec) {
             f(elem);
         }
     });

This is a big conceptual milestone:

  • Python passes a function (or lambda) into C++
  • C++ iterates over its internal vector
  • C++ calls back into Python for each element

In Python, it looks like:

def printer(x):
    print("element:", x)

v.doForAllElements(printer)

This is not the fastest way to do numerical work (calling Python repeatedly is expensive), but it is extremely useful for demonstrating:

  • how std::function acts as the “callable type” on the C++ side
  • why you included pybind11/functional.h
  • how easy it is to compose C++ containers with Python-side behavior

A good teaching point here is the performance instinct:

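If you call Python once per element, you pay the boundary cost once per element. For performance, prefer operations that process the whole container (or a large chunk of it) in a single C++ call, and treat callbacks as a tool for flexibility rather than speed.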
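
The two styles can be compared in a standalone C++ sketch. Both function names below are hypothetical. for_all_elements mirrors the callback loop from the binding, while sum_elements stands for a "chunked" operation that keeps the whole loop in C++ and returns one result.

```cpp
#include <functional>
#include <vector>

// Callback style: f is invoked once per element. If f wraps a Python
// callable, each invocation crosses the C++/Python boundary.
void for_all_elements(const std::vector<int>& values,
                      const std::function<void(int)>& f) {
    for (int elem : values) {
        f(elem);
    }
}

// Chunked style: the loop stays entirely in C++ and the caller receives
// a single result, so the boundary is crossed only once per call.
long long sum_elements(const std::vector<int>& values) {
    long long total = 0;
    for (int elem : values) {
        total += elem;
    }
    return total;
}
```

Both styles can compute the same sum. The difference is how many calls cross the language boundary: one per element with the callback, one in total with the chunked function.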


Takeaway

This module is much more than “bind a vector”. It shows the core patterns you’ll reuse for real classes:

  • constructors → py::init
  • Python behavior via dunder methods (__len__, __getitem__, __iter__, …)
  • overload resolution with static_cast
  • properties and public members (def_property_readonly, def_readwrite)
  • enums (py::enum_)
  • callbacks (std::function + pybind11/functional.h)

Once you can read and write a binding like this, you’re ready to expose more complex scientific objects while keeping the same goals in view: clean APIs, few boundary crossings, and predictable memory behavior.