Symbolica 1.0: Symbolic mathematics in Rust + two new open-source crates – Symbolica

Support Symbolica

This blog post uses the computational framework Symbolica. Symbolica is free for hobbyists and a single-core instance is free for non-commercial use. Would you like to have new features added? Your organization can purchase a license and support the Symbolica project!

Introduction

Today marks the release of Symbolica 1.0 🎉🎉🎉! Symbolica is a library for Rust and Python that can do symbolic and numeric mathematics. It also marks the release of the MIT-licensed crates Numerica and Graphica that were extracted from Symbolica, totalling 18.5k lines of open-sourced code.

In this blog post I want to introduce the new packages and go over some design decisions. But first, what is Symbolica?

What can you do with Symbolica?

With Symbolica you can easily manipulate expressions with symbols as if they were numbers. For example, you can solve a linear system with parameters:

\[ \pmatrix { t & 2\\ 3 & 1+t }\pmatrix{ x\\ y } =\pmatrix{ 3\\ 4 } \]

Here is what this looks like in your language of choice:

from symbolica import *
t = S('t')
res = Matrix.from_nested([[t, 2], [3, 1 + t]]).solve(Matrix.vec([3, 4]))
print(res)

Create a matrix with rational polynomials as entries (you could also use the general expression type Atom, but rational polynomials will be the fastest choice):

use symbolica::{
    atom::{Atom, AtomCore},
    domains::rational_polynomial::RationalPolynomialField,
    domains::{integer::Z, rational::Q},
    parse,
    tensors::matrix::Matrix,
};

fn main() {
    let t = parse!("t");
    let data = [t.clone(), 2.into(), 3.into(), t + 1];
    let rhs: [Atom; 2] = [3.into(), 4.into()];
    let res = Matrix::from_linear(
        data.into_iter()
            .map(|x| x.to_rational_polynomial(&Q, &Z, None))
            .collect(),
        2,
        2,
        RationalPolynomialField::<_, u8>::new(Z),
    )
    .unwrap();
    let rhs = Matrix::new_vec(
        rhs.into_iter()
            .map(|x| x.to_rational_polynomial(&Q, &Z, None))
            .collect(),
        RationalPolynomialField::new(Z),
    );

    let res = res.solve(&rhs).unwrap();
    println!("{}", res);
}

which yields \[ x = \frac{-5+3t}{-6+t+t^2}, y = \frac{-9+4t}{-6+t+t^2} \]

You could now substitute \(t\) with numbers (exact or approximate), or just keep the expression as is, and perform further manipulations such as series expansions¹ or create a very fast numerical evaluator compiled as ASM/SIMD/CUDA/C++ (see this blog post).

¹ Or factor the denominators and find out that the system has no solution at \(t=2\) and \(t=-3\).

² It is not an easy task to extend the functionality as the algorithms are vastly different!

Most other packages that solve linear systems are purely numeric (supporting only floating point numbers) and do not support the parameter \(t\)². On the other hand, most computer algebra systems (Mathematica, Maple, etc) that can manipulate symbols force their users to code their program logic in a proprietary language. With Symbolica you can deeply embed symbolic representations of data inside your program logic in Rust or Python.

Introducing Numerica and Graphica

Symbolica is source available, and completely free to use for hobbyists. For academics, the use of one core per device is free. As of version 1.0, I split off features from Symbolica into separate crates and repositories and released them with the MIT license. The crates are numerica and graphica.

Numerica

Numerica is an open-source mathematics crate that provides high-performance number types, such as error-tracking floats and finite field structs.

Here is a (non-complete) summary of features:

Abstractions over rings, Euclidean domains, fields and floats (see Section 4.3)
High-performance Integer with automatic up-and-downgrading to arbitrary precision types
Rational numbers with reconstruction algorithms
Fast finite field arithmetic
Error-tracking floating point types
Generic dual numbers for automatic (higher-order) differentiation
Matrix operations and linear system solving
Numerical integration using Vegas algorithm with discrete layer support

Solving linear systems

Here is an example of solving a linear system over the rational numbers with Numerica:

let a = Matrix::from_linear(
    vec![
        1.into(), 2.into(), 3.into(),
        4.into(), 5.into(), 16.into(),
        7.into(), 8.into(), 9.into(),
    ],
    3, 3, Q,
)
.unwrap();

let b = Matrix::new_vec(vec![1.into(), 2.into(), 3.into()], Q);

let r = a.solve(&b).unwrap();

which yields \(\pmatrix{-\frac{1}{3}, \frac{2}{3}, 0}\). See here for an example of solving systems over finite fields.

Error-tracking floats

When doing inexact computations with floating point numbers, a dramatic loss of precision can occur due to large cancellations. Numerica can track this using ErrorPropagatingFloat, which wraps around a floating point type such as f64 or Float (arbitrary precision). In the following example we use it to track what happens when we compute \[ \frac{e^x-1.}{x} \] for small x = 1e-50 where both x and 1. have 60 decimal digits of precision (≈200 binary digits):

let x = ErrorPropagatingFloat::new(Float::with_val(200, 1e-50), 60.);
let r = (x.exp() - x.one()) / x; // x.one() creates 1 with the same precision as `x`
assert_eq!(r.get_precision(), Some(10.205999132796238));

There are only 10 precise digits left! Thankfully, in this case there is a way to prevent precision loss with Numerica. If the 1 is exact, you can use an overloaded addition of floats and integers/rationals:

let x = ErrorPropagatingFloat::new(Float::with_val(200, 1e-50), 60.);
let r = (x.exp() - 1) / x;
assert_eq!(r.get_precision(), Some(59.69897000433602));

The precision degradation was prevented!

Graphica

Graphica is an open-source graph crate for Rust that enables the generation, manipulation and canonization of multi-edge graphs with mixed directed and undirected edges and arbitrary node and edge data. It uses a custom version of McKay’s graph canonization algorithm (the algorithm implemented in nauty) to write graphs in their canonical form. When two graphs are in canonical form, checking if two graphs a and b are the same (“isomorphic”), reduces to checking if their representation is literally the same. Thus, you can put canonical graphs in a HashSet and just check if a (canonized) graph has been seen before. Graphica also supports generation of graphs by specifying external nodes and the set of allowed vertex signatures. For example:

let g = HalfEdge::undirected("g");
let q = HalfEdge::incoming("q");
let gs = Graph::<_, &str>::generate(
    &[(1, g), (2, g)],
    &[vec![g, g, g], vec![q.flip(), q, g], vec![g, g, g, g]],
    GenerationSettings::new().max_loops(2),
)
.unwrap();

let r = gs.keys().next().unwrap().to_mermaid();
println!("{}", r);

yields

graph TD;
  0[" "];
  1[" "];
  2[" "];
  3[" "];
  4["1"];
  5["2"];
  0 ---|"g"| 1;
  0 -->|"q"| 3;
  1 -->|"q"| 0;
  2 -->|"q"| 1;
  2 ---|"g"| 5;
  3 -->|"q"| 2;
  3 ---|"g"| 4;

which is a Feynman diagram for quantum chromodynamics (gs are gluons and qs are quarks, which flow in a loop).

Design of Symbolica

A safe global state

Symbolica uses a global state to store all defined symbols (variable and function names inside mathematical expressions) and their properties. Mutable global states can be tricky, but Symbolica has some safety features. Most notably, the state is append-only: once a symbol is defined, it can never be redefined. As a result, previously constructed expressions can never be invalidated by changes to the state.

What about name collisions? A dependency could define a symbol with conflicting properties. This problem is solved by using namespaces. The way to define symbols is through a macro that assigns the current crate name as the default namespace:

let (x, y) = symbol!("x", "y");
let x_other = symbol!("someproject::x");

This code defines (or fetches if it was already defined) the symbols yourcrate::x, yourcrate::y and someproject::x and stores the file and line it was defined at. If a symbol is redefined with different attributes, the resulting error will display where the symbol was first defined.

Parsing an expression uses the same logic, defining new symbols it encounters in the current crate’s namespace. For example,

let e = parse!("(x+1)^2");

parses as (yourcrate::x+1)^2.

Here is a sketch of the implementation of symbol! that wraps a symbol with a namespace:

macro_rules symbol! {
    ($e: expr) => $crate::atom::NamespacedSymbol {
        symbol: format!("{}::{}", env!("CARGO_CRATE_NAME"), $e),
        namespace: env!("CARGO_CRATE_NAME"),
        file: file!(),
        line: line!(),
    }
}

See my blog post for more details about global states.

Shipping to Python

Symbolica provides fully typed Python bindings using the pyo3 crate. A global, shared state is a major challenge: in Rust, there is only one Symbolica instance³ shared among a crate and its dependencies, but in Python there is one instance per shared object library.

³ if a semver compatible version is used by all the dependencies

Each shared object library (module), has its own independent instance of Symbolica and therefore its own state and symbols. This is a problem if a module exposes Symbolica types in its API: you cannot pass a symbolica.Expression from symbolica.so into a function from module1.so that is supposed to take a symbolica.Expression as these are different types and have incompatible representations.

The solution to this problem is symbolica-community, a small open-source project that glues several Rust crates together and creates one Python module out of it. It is this community-enhanced Symbolica that is published on PyPi. Each community module must implement SymbolicaCommunityModule:

pub trait SymbolicaCommunityModule {
    /// The name of the submodule. Must be used in all defined Python structures, as such:
    /// ```
    /// #[pyclass(module = "symbolica.community.NAME")]
    /// struct MyPythonStruct {}
    /// ```
    fn get_name() -> String;

    /// Register all classes, functions and methods in the submodule `m`.
    fn register_module(m: &Bound<'_, PyModule>) -> PyResult<()>;
}

and register all functions, classes, etc in the module provided as an argument to register_module.

The symbolica-community library essentially consists of these two functions:

fn register_extension<T: SymbolicaCommunityModule>(m: &Bound<'_, PyModule>) -> PyResult<()> {
    let child_module = PyModule::new(m.py(), &T::get_name())?;
    T::register_module(&child_module)?;
    m.add_submodule(&child_module)?;

    m.py().import("sys")?.getattr("modules")?.set_item(
        format!("symbolica.community.{}", T::get_name()),
        &child_module,
    )?;
    Ok(())
}

#[pymodule]
fn core(m: &Bound<'_, PyModule>) -> PyResult<()> {
    symbolica::create_symbolica_module(m)?;
    register_extension::<example_extension::CommunityModule>(m)?;
    // add multiple extensions here
    Ok(())
}

With this setup, the community contributions can easily be accessed from Python:

from symbolica.community.example_extension import *

Type hinting and documentation is automatically gathered using the pyo3-stub-gen crate.

In the future, symbolica-community could be completely auto-generated from a list of crates, which expose a struct that implements SymbolicaCommunityModule.

Abstracting mathematics

Many algorithms can be performed for a whole class of types instead of a specific type. An example is row reduction, used in Gaussian elimination. Here is an example of a single row reduction step:

\[ \pmatrix { 3 & 2\\ 1 & 5 }\rightarrow \pmatrix{ 3 & 2\\ 0 & 5 - 2 \frac{1}{3} } = \pmatrix{ 3 & 2\\ 0 & \frac{13}{3} } \] To subtract a scaled version of row one from row two, we need types that have a multiplication, subtraction and division operator. We call these types fields.

Let’s build some abstractions. First we define a set, that has an associated type that represents the type of the elements of the set. For example, the set of integers has Integer as its element type.

pub trait Set {
    type Element;
}

Next, we define a ring, which is a set with an addition and multiplication operator.

Nominative typing

Rust uses nominative typing where types have to explicitly implement a trait. Some other languages like Python and C++ rely on structural typing, where a type that implements addition and multiplication could be considered a ring. This may lead to surprising bugs, as not everything that walks and quacks like a duck is a duck!

pub trait Ring: Set {
    fn add(&self, a: &Self::Element, b: &Self::Element) -> Self::Element;
    fn sub(&self, a: &Self::Element, b: &Self::Element) -> Self::Element;
    fn mul(&self, a: &Self::Element, b: &Self::Element) -> Self::Element;
}

We can keep going! A Euclidean domain is a ring⁴ with a quotient-and-remainder function quot_rem.

⁴ I am skipping some steps and conditions in the domain hierarchy.

Division with remainder

For example, \(\text{quot\_rem}(7,3) = (2, 1)\) as \(7 = 2 \cdot 3 + 1\).

In Rust, it looks like:

pub trait EuclideanDomain: Ring {
    fn quot_rem(&self, a: &Self::Element, b: &Self::Element) -> (Self::Element, Self::Element);
    fn gcd(&self, a: &Self::Element, b: &Self::Element) -> Self::Element;
}

Finally, we define a field, for which every non-zero element has an inverse:

pub trait Field: EuclideanDomain {
    fn div(&self, a: &Self::Element, b: &Self::Element) -> Self::Element;
    fn inv(&self, a: &Self::Element) -> Self::Element;
}

A simple example that uses the abstractions above is an integer ring:

struct IntegerRing {}

impl Set for IntegerRing {
    type Element = Integer;
}

impl Ring for IntegerRing {
    fn add(&self, a: &Integer, b: &Integer) -> Integer { a + b }
}

impl EuclideanDomain for IntegerRing {
    fn quot_rem(&self, a: &Integer, b: &Integer) -> (Integer, Integer) {}
    fn gcd(&self, a: &Integer, b: &Integer) -> Integer {}
}

The IntegerRing is empty as two integers can be added without any extra knowledge. This is different for finite fields, where the modulus must be known⁵. Note that the IntegerRing does not implement Field, as for example the integer 2 does not have an inverse.

⁵ One could store the modulus inside every finite field element, but this wastes quite some space.

We can now construct a Matrix that is generic over rings:

struct Matrix<R: Ring> {
    data: Vec<R::Element>,
    n_rows: usize,
    n_cols: usize,
    ring: R,
}

This definition of Matrix is essentially the same as the one in Numerica. Now we can implement functions on it, such as element-wise addition:

impl<R: Ring> Matrix<R> {
    fn add(&self, other: &Matrix<R>) -> Matrix<R> {
        assert_eq!(self.n_rows, other.n_rows);
        assert_eq!(self.n_cols, other.n_cols);

        let mut data = Vec::with_capacity(self.data.len());
        for (e1, e2) in self.data.iter().zip(&other.data) {
            data.push(self.ring.add(e1, e2));
        }
        Matrix {data, n_rows: self.n_rows, n_cols: self.n_cols, ring: self.ring.clone() }
    }
}

And we can also implement row_reduce, which requires a field:

impl<R: Field> Matrix<R> {
    fn row_reduce(&mut self) {
        // implementation here
     }
}

The same Matrix::add implementation can be used for all rings. In Numerica, some example rings are Integer, FiniteField, and in Symbolica we have (among others) MultivariatePolynomial and Atom. Matrix::row_reduce requires a field, for example a FiniteField, Rational or RationalPolynomial.

Symbolica is packed with structures generic over rings. For example, there is MultivariatePolynomial<R: Ring>, which is itself an element of PolynomialRing. The ring traits are now open source and part of Numerica.

If you want to learn more about computer algebra, for example how to solve non-linear systems or factor polynomials, see my lectures notes for the Math for Machines course.

Conclusion

It was an exciting journey to Symbolica 1.0. The Rust API evolved quite a bit to a version that looks quite similar to the Python API (while being much more flexible):

(x, y) = S('x', 'y', is_scalar=True) # Python

let (x, y) = symbol!("x", "y"; Scalar); // Rust

Rust has been tremendously helpful to get this project off the ground (see this video for more design talk). Not having to worry about memory corruption, the ease of zero-cost abstractions and pyo3 for Python binding generation have sped up development.

Of course there are some still rough edges. It is not possible yet to have mutually exclusive traits, leading to some limitations related to overlapping blanket implementations (see here for a nice workaround). Then there are some overzealous trait overlap errors when using associated types, an ICE, and lack of specialization (some methods implemented on Ring can be performed slightly faster for rings that are also Field). For all these problems, I could find acceptable workarounds.

Want to learn more? Join the discussion on the Symbolica Discord or Zulip!

Support Symbolica

Symbolica is free for hobbyists and a single-core instance is free for non-commercial use. Would you like to use additional features or have unrestricted access? Your organization can purchase a license and support the Symbolica project!