Expressions

Representation

A mathematical expression in Symbolica can consist of rational numbers, variables, functions, and the operations addition, multiplication, and exponentiation. An example Symbolica expression is:

\[ f(x,y)^2 + x \cdot y + 1\]

Division and negation are represented through a power of -1 and a multiplication by -1 respectively:

\[ \frac{1}{x} \rightarrow x^{-1},\qquad -x = -1*x \]

The addition and multiplication operator are viewed as n-ary operators, instead of binary operators. What that means is all factors or summands are treated at the same level. In a tree representation it looks like:

flowchart TD
    Mul --> Mulb{Mul}
    Mul --> a
    Mulb --> b
    Mulb --> c
Figure 1: Binary operator
flowchart TD
    Mul --> a
    Mul --> b
    Mul --> c
Figure 2: N-ary operator

We can represent any Symbolica expression as a tree, where we refer to each node as an atom. For example, the expression

\[\frac{4}{3} (x+1) f(x,y^z,3)\] can be viewed as:

flowchart TD
    Mul --> Num{Num 4/3}
    Mul --> Add
    Add --> Varx{Var x}
    Add --> Num2{Num 1}
    Mul --> Fn{Fn f}
    Fn --> Varxx{Var x}
    Fn --> Pow{Pow}
    Pow --> Vary{Var y}
    Pow --> Varz{Var z}
    Fn --> Num3{Num 3}

Note that although we can represent the expression by a tree, this is not the way Symbolica represents, them as this would take too much memory. Symbolica uses a custom linear, compressed expression that is approximately 8 times smaller than Mathematica’s and 3 times smaller than Form’s. This way, many terms can fit in memory at the same time. See Memory Representation for more details.

Emulation using functions

Even though Symbolica does not (yet) natively support vectors or other mathematical constructs you may want to use, functions are flexible enough to capture these construct. For example, a vector with an index could be represented as vec(p,mu), representing \(p^\mu\), a dot product \(p_1 \cdot p_2\) could be represented as dot(p1,p2), etc.

Creating Symbolica expressions

In the following example we create a Symbolica expression f(x)*(1+y) via string parsing:

from symbolica import Expression
e = Expression.parse('f(x)*(1+y)')
print(e)
use symbolica::atom::Atom;

fn main() {
    let input = Atom::parse("f(x)*(1+y)").unwrap();
    println!("{}", input);
}

The parsing rules are liberal. Whitespace characters (space, tab, newline) are allowed to appear between tokens. A variable name and function name must start with a non-numerical unicode character and may be followed by any non-whitespace character. A number must consist of numerical characters (0-9) and may contain digit separators _ (underscore) or (thin space, U+2009). The multiplication operator may be implicit, which means that it may be inserted when applicable. Here are some examples of valid input:

2x
3(2+x)
x^2y
2 3_000_123
x2    y

Atoms / expressions can also be constructed from combinations of variables, functions and numbers:

from symbolica import Expression
x, y = Expression.vars('x', 'y')
f = Expression.fun('f')
e = f(x)*(1+y)
print(e)

To define a single variable, use Expression.var and for a single function Expression.fun. To define multiple variables / functions at the same time, use Expression.vars / Expression.funs.

use the fun! macro to create Symbolica functions:

use symbolica::{
    fun,
    atom::{Atom, FunctionBuilder},
    state::State,
};

fn main() -> Result<(), String> {
    let x = Atom::parse("x")?;
    let y = Atom::parse("y")?;
    let f_id = State::get_symbol("f")?;

    let f = fun!(f_id, x, y, Atom::new_num(2));

    let xb = (-(&y + &x + 2) * &y * 6).npow(5) / &y * &f / 4;

    println!("{}", xb);

    Ok(())
}

Function attributes

Functions can be defined to be symmetric or antisymmetric in their arguments. This has the effect that the function arguments will be sorted according to the canonical form. For example, f(3,2,1) where f is symmetric:

from symbolica import Expression
f = Expression.fun('f', is_symmetric=True)
e = f(3,2,1)
print(e)
use symbolica::{
    fun,
    atom::{Atom, FunctionBuilder},
    state::{FunctionAttribute, State},
};

fn main() -> Result<(), String> {
    let f = State::get_symbol_with_attributes("f", &[FunctionAttribute::Symmetric])?;
    let a = fun!(
        f,
        Atom::new_num(3),
        Atom::new_num(2),
        Atom::new_num(1)
    );

    println!("{}", a);
    Ok(())
}

yields f(1,2,3). Antisymmetric functions will also be sorted and will obtain a minus sign for every swapped argument.

Functions can also be defined as linear, which will linearize every argument. Combining the attributes symmetric and linear gives the familiar dot product:

from symbolica import Expression
f = Expression.fun('f', is_symmetric=True, is_linear=True)
e = Expression.parse("f(3*x+y,x+2*y+z)")
print(e)
use symbolica::{
    atom::Atom,
    state::{FunctionAttribute, State},
};

fn main() {
    let _f = State::get_symbol_with_attributes("f", &[FunctionAttribute::Linear, FunctionAttribute::Symmetric]);

    let out = Atom::parse("f(3*x+y,x+2*y+z)").unwrap();

    println!("{}", out);
}

which yields f(x,y)+3*f(x,z)+f(y,z)+3*f(x,x)+2*f(y,y)+6*f(x,y).

Once attributes have been defined for functions, they cannot be changed anymore as this would invalidate previously built expressions. When parsing a new function, the default attributes will be assigned. Therefore, make sure to define all function attributes before parsing any expressions.

Walking the expression tree

The expression tree can be navigated by looping through the nodes and leafs:

from symbolica import *

def walk_tree(a: Expression, depth: int = 0):
    if a.get_type() in [AtomType.Var, AtomType.Num]:
        print(' '*depth, a)
    if a.get_type() == AtomType.Fn:
        print(' '*depth, 'Fun', a.get_name())
        for arg in a:
            walk_tree(arg, depth + 1)
    if a.get_type() == AtomType.Mul:
        print(' '*depth, 'Mul')
        for arg in a:
            walk_tree(arg, depth + 1)
    if a.get_type() == AtomType.Add:
        print(' '*depth, 'Add')
        for arg in a:
            walk_tree(arg, depth + 1)
    if a.get_type() == AtomType.Pow:
        print(' '*depth, 'Pow')
        walk_tree(a[0], depth + 1)
        walk_tree(a[1], depth + 1)


e = Expression.parse('x^2*y + f(1,2,3)')
walk_tree(e)
use symbolica::{
    id::AtomTreeIterator,
    atom::{Atom, AtomView},
};

fn tree_walk(a: AtomView, depth: usize) {
    match a {
        AtomView::Num(_) | AtomView::Var(_) => println!("{:indent$}{}", "", a, indent = depth),
        AtomView::Fun(f) => {
            println!("{:indent$}Fun {}", "", f.get_symbol(), indent = depth);
            for a in f.iter() {
                tree_walk(a, depth + 1);
            }
        }
        AtomView::Pow(p) => {
            println!("{:indent$}", " ", indent = depth);
            let (base, exp) = p.get_base_exp();

            tree_walk(base, depth + 1);
            tree_walk(exp, depth + 1);
        }
        AtomView::Mul(m) => {
            println!("{:indent$}Mul", "", indent = depth);
            for a in m.iter() {
                tree_walk(a, depth + 1);
            }
        }
        AtomView::Add(a) => {
            println!("{:indent$}Add", "", indent = depth);
            for a in a.iter() {
                tree_walk(a, depth + 1);
            }
        }
    }
}

fn main() {
    let expr: Atom = Atom::parse("x^2*y + f(1,2,3)").unwrap();
    tree_walk(expr.as_view(), 0);
}

which yields:

 Add
  Fun f
   1
   2
   3
  Mul
   Pow
    x
    2
   y

To modify parts of the expression, use pattern matching with replacement.

Output formatting

Symbolica can format its output in a customizable way. For example, it can write Mathematica-compatible output or Latex output. Here is an example:

from symbolica import Expression
a = Expression.parse('128378127123 z^(2/3)*w^2/x/y + y^4 + z^34 + x^(x+2)+3/5+f(x,x^2)')
print(a.pretty_str(multiplication_operator=' ', num_exp_as_superscript=True, number_thousands_separator='_'))
print(a.pretty_str(latex=True))
use symbolica::{printer::AtomPrinter, atom::Atom};

fn main() {
    let n =
        Atom::parse("128378127123 z^(2/3)*w^2/x/y + y^4 + z^34 + x^(x+2)+3/5+f(x,x^2)").unwrap();

    let mut p = AtomPrinter::new(n.as_view());
    p.print_opts.number_thousands_separator = Some('_');
    p.print_opts.multiplication_operator = ' ';
    p.print_opts.num_exp_as_superscript = true;
    println!("{}", p);

    let mut p = AtomPrinter::new(n.as_view());
    p.print_opts.latex = true;
    println!("{}", p);
}

which yields:

z³⁴+x^(x+2)+y⁴+f(x,x²)+128_378_127_123 z^(2/3) w² x⁻¹ y⁻¹+3/5
$$z^{34}+x^{x+2}+y^{4}+f(x,x^{2})+128378127123 z^{\frac{2}{3}} w^{2} \frac{1}{x} \frac{1}{y}+\frac{3}{5}$$

Canonical form

An expression has a normal form or canonical form, which is one preferred way in which it is written. For example, the normal form of 1+1 is 2, and the normal form of x*x is x^2. The procedure of normalizing an expression is performed after every operation, so that you should never see a non-canonical expression. Each computer algebra system has its own rules to decide which form is the normal form (or may not even write expression in a unique form!).

The (recursive) rules for normalization for Symbolica are defined per atom. For a

  • Variable
    • Always normalized
  • Number
    • Remove the common divisor between the numerator and denominator
  • Sum
    • Normalize and sort all summands
    • Remove summands that are 0
    • Add summands that are equal apart from their coefficient (\(x+ 2x \rightarrow 3x\))
  • Product
    • Normalize and sort all factors
    • If any factor is 0, return the number 0
    • Remove factor of 1
    • Convert equal factors apart from their power into a power (\(x x^a \rightarrow x^{a+1}\))
  • Power
    • Normalize the base and exponent
    • Normalize \(\text{num}^\text{num}\), \(0^a \rightarrow 1\), and \(1^a \rightarrow 1\)
  • Function
    • Normalize all arguments
    • Flatten arguments of built-in function arg(...)
    • Sort all arguments when the function is symmetric

The comparison of two atoms, which is used for sorting, is well-defined, however the convention could change between Symbolica versions.

Note that normalization is supposed to be quick, since it happens often. This means that expensive operations, such as expansion and factorization are not considered. For example, \((x+1)^2\) is a Symbolica normal form and \(1+2x+x^2\) is as well, even though they are mathematically equivalent.

Term streaming

In many scenarios an expression contains many terms, where each term is relatively small. Also often, symbolic operations can be performed term by term. The list of common operations that cannot be done term per term is short:

  • Factorization
  • GCD computation
  • Horner scheme construction
  • Pattern match \(a+b\)

The following operations can be done term per term:

  • Most pattern matches, such as \(y f(x)\)
  • Expansion
  • Differentiation
  • Integration

When using some global memory, operations such as finding the maximum term, or counting the number of terms that satisfy a certain condition can also be done term-by-term.

With this in mind, we realize that for many operations the expression does not need to be in memory as a whole at the same time: only one term needs to be in memory. Once it is processed, it can be written to disk. We call this term streaming and it allows for huge expressions to be processed.

Symbolica can convert an expression into a stream and it provides an easy way to perform operations on terms in parallel by using a map. The input of the map is one term and the output should be an expression. In the example below a derivative is applied to each term, in parallel.

from symbolica import Expression
x, y = Expression.vars('x', 'y')
e = x +x**2 + x*y + y**2 + x*y**4
e = e.map(Transformer().derivative(x))
print(e)

The Transformer() can be viewed as a Python lambda function

lambda x: x.derivative()

and is explained more here.

use symbolica::{
    state::State,
    atom::Atom,
    streaming::TermStreamer,
};

fn main() {
    let x = State::get_symbol("x");
    let input = Atom::parse("x + x^2 + x^y + y^2 + x*y^4").unwrap();

    let mut stream = TermStreamer::new_from(input);

    stream = stream.map(|workspace, term| {
        let mut out = Atom::new();
        term.derivative_into(x, workspace, &state, &mut out);
        out
    });

    let res = stream.to_expression();
    println!("\t+ {}", res.printer(&state));
}

By default, map uses all system cores.

Global information

Information that involves comparison between terms can be collected using appropriate synchronized structures, like atomics or mutexes. Currently, this is only possible in Rust. Note that due to Rust’s borrow checker, there can be no memory problems such as race conditions.

In the example below, the maximum numerical argument of the function f over all terms is determined (in parallel).

Not available yet.

Obtain the maximal function argument using an atomic integer.

use std::sync::atomic::{AtomicI64, Ordering};

use symbolica::{
    coefficient::CoefficientView,
    id::{Condition, Match, MatchSettings, Pattern},
    atom::{Atom, AtomView},
    state::State,
    streaming::TermStreamer,
};

fn main() {
    let x_ = State::get_symbol("x_");
    let input = Atom::parse("f(1)+f(2)+f(3)+f(4)").unwrap();
    let pattern = Pattern::parse("f(x_)").unwrap();

    let stream = TermStreamer::new_from(input);

    let max_argument = AtomicI64::new(0);

    // map every term in the expression
    stream.map(|_, term| {
        if let Some((_, _, _, m)) = pattern
            .pattern_match(
                term.as_view(),
                &Condition::default(),
                &MatchSettings::default(),
            )
            .next()
        {
            if let Some(Match::Single(n)) = m.get(x_) {
                if let AtomView::Num(nn) = n {
                    if let CoefficientView::Natural(a, 1) = nn.get_coeff_view() {
                        max_argument.fetch_max(a, Ordering::SeqCst);
                    }
                }
            }
        }

        term
    });

    println!("Max argument: {}", max_argument.load(Ordering::SeqCst));
}