Expressions

Representation

A mathematical expression in Symbolica can consist of rational numbers, variables, functions, and the operations addition, multiplication, and exponentiation. An example Symbolica expression is:

\[ f(x,y)^2 + x \cdot y + 1\]

Division and negation are represented through a power of -1 and a multiplication by -1 respectively:

\[ \frac{1}{x} \rightarrow x^{-1},\qquad -x = -1*x \]

The addition and multiplication operator are viewed as n-ary operators, instead of binary operators. What that means is all factors or summands are treated at the same level. In a tree representation it looks like:

flowchart TD
    Mul --> Mulb{Mul}
    Mul --> a
    Mulb --> b
    Mulb --> c
Figure 1: Binary operator
flowchart TD
    Mul --> a
    Mul --> b
    Mul --> c
Figure 2: N-ary operator

We can represent any Symbolica expression as a tree, where we refer to each node as an atom. For example, the expression

\[\frac{4}{3} (x+1) f(x,y^z,3)\] can be viewed as:

flowchart TD
    Mul --> Num{Num 4/3}
    Mul --> Add
    Add --> Varx{Var x}
    Add --> Num2{Num 1}
    Mul --> Fn{Fn f}
    Fn --> Varxx{Var x}
    Fn --> Pow{Pow}
    Pow --> Vary{Var y}
    Pow --> Varz{Var z}
    Fn --> Num3{Num 3}

Note that although we can represent the expression by a tree, this is not the way Symbolica represents, them as this would take too much memory. Symbolica uses a custom linear, compressed expression that is approximately 8 times smaller than Mathematica’s and 3 times smaller than Form’s. This way, many terms can fit in memory at the same time. See Memory Representation for more details.

Emulation using functions

Even though Symbolica does not (yet) natively support vectors or other mathematical constructs you may want to use, functions are flexible enough to capture these construct. For example, a vector with an index could be represented as vec(p,mu), representing \(p^\mu\), a dot product \(p_1 \cdot p_2\) could be represented as dot(p1,p2), etc.

Creating Symbolica expressions

In the following example we create a Symbolica expression f(x)*(1+y) via string parsing:

from symbolica import Expression
e = Expression.parse('f(x)*(1+y)')
print(e)
use symbolica::atom::{Atom, AtomCore};

fn main() {
    let input = Atom::parse("f(x)*(1+y)").unwrap();
    println!("{}", input);
}

The parsing rules are liberal. Whitespace characters (space, tab, newline) are allowed to appear between tokens. A variable name and function name must start with a non-numerical unicode character and may be followed by any non-whitespace character. Functions may have parentheses or square brackets. A number must consist of numerical characters (0-9) and may contain digit separators _ (underscore) or (thin space, U+2009). The multiplication operator may be implicit, which means that it may be inserted when applicable. Here are some examples of valid input:

f[5,y] + f(x)
2x
3(2+x)
x^2y
2 * 3_000_123
x2    y

Whitespace between two numbers (for example 5 6 or 5\n6) will result in a parsing error instead of an implicit multiplication (5*6), as it is very likely that the user intended this to be one number (56).

Floating point numbers

Floating point numbers can be parsed in decimal notation or scientific notation:

1.3
1.4e-4

The precision of a floating point – the number of precise decimal digits - can be specified by appending it after a backtick to the float. For example 5.6e-2`1.2, denotes that the float 5.6e-2 has 1.2 significant decimal digits.

Floating point numbers without an explicit precision are assumed to have at least the same precision as 64-bit double floating point numbers (53 binary digits, or 15.9 decimal digits). During computations in Symbolica, the precision of floating points is tracked. For more advanced precision tracking, see the ErrorPropagatingFloat type.

Using Python’s decimal class, you can parse a specific number of stable digits, e.g. Decimal('0.01234') has 4 stable decimal digits.

Manual construction

Atoms / expressions can also be constructed from combinations of variables, functions and numbers:

from symbolica import Expression
x, y, f = Expression.symbol('x', 'y', 'f')
e = f(x)*(1+y)
print(e)

use the fun! macro to create Symbolica functions and the symb! macro to quickly define symbols:

use symbolica::{
    fun,
    atom::{Atom, AtomCore},
};

fn main() -> Result<(), String> {
    let (x, y, f_id) = symb!("x", "y", "f");

    let f = fun!(f_id, x, y, 2);

    let xb = (-(y + x + 2) * y * 6).npow(5) / y * &f / 4;

    println!("{}", xb);

    Ok(())
}
Shorthand notation

In Python you can use S instead of Expression.symbol, E instead of Expression.parse and N instead of Expression.num!

Symbol attributes

Symbols can have special properties that alters their behaviour. Most of these properties apply when the symbol is used as a function. For example, a symbol can be (anti)symmetric. This has the effect that the function arguments will be sorted according to the canonical form. For example, f(3,2,1) where f is symmetric:

from symbolica import Expression
f = Expression.symbol('f', is_symmetric=True)
e = f(3,2,1)
print(e)
use symbolica::{
    fun,
    atom::{Atom, FunctionBuilder},
    state::{FunctionAttribute, State},
};

fn main() -> Result<(), String> {
    let f = Symbol::new_with_attributes("f", &[FunctionAttribute::Symmetric])?;
    let a = fun!(
        f,
        Atom::new_num(3),
        Atom::new_num(2),
        Atom::new_num(1)
    );

    println!("{}", a);
    Ok(())
}

yields f(1,2,3). Antisymmetric functions will also be sorted and will obtain a minus sign for every swapped argument.

Symbols can also be defined as linear, which will linearize every argument. Combining the attributes symmetric and linear gives the familiar dot product:

from symbolica import Expression
f = Expression.symbol('f', is_symmetric=True, is_linear=True)
e = Expression.parse("f(3*x+y,x+2*y+z)")
print(e)
use symbolica::{
    atom::Atom,
    state::{FunctionAttribute, State},
};

fn main() {
    let _f = Symbol::new_with_attributes("f", &[FunctionAttribute::Linear, FunctionAttribute::Symmetric]);

    let out = Atom::parse("f(3*x+y,x+2*y+z)").unwrap();

    println!("{}", out);
}

which yields f(x,y)+3*f(x,z)+f(y,z)+3*f(x,x)+2*f(y,y)+6*f(x,y).

Important

Once symbols have been defined, their attributes cannot be changed anymore as this would invalidate previously built expressions. When parsing an expression, the default attributes will be assigned to previously undefined symbols. Therefore, make sure to define symbols with special attributes before parsing any expressions.

Walking the expression tree

The expression tree can be navigated by looping through the nodes and leafs:

from symbolica import *

def walk_tree(a: Expression, depth: int = 0):
    if a.get_type() in [AtomType.Var, AtomType.Num]:
        print(' '*depth, a)
    if a.get_type() == AtomType.Fn:
        print(' '*depth, 'Fun', a.get_name())
        for arg in a:
            walk_tree(arg, depth + 1)
    if a.get_type() == AtomType.Mul:
        print(' '*depth, 'Mul')
        for arg in a:
            walk_tree(arg, depth + 1)
    if a.get_type() == AtomType.Add:
        print(' '*depth, 'Add')
        for arg in a:
            walk_tree(arg, depth + 1)
    if a.get_type() == AtomType.Pow:
        print(' '*depth, 'Pow')
        walk_tree(a[0], depth + 1)
        walk_tree(a[1], depth + 1)


e = Expression.parse('x^2*y + f(1,2,3)')
walk_tree(e)
use symbolica::{
    id::AtomTreeIterator,
    atom::{Atom, AtomView},
};

fn tree_walk(a: AtomView, depth: usize) {
    match a {
        AtomView::Num(_) | AtomView::symbol(_) => println!("{:indent$}{}", "", a, indent = depth),
        AtomView::symbol(f) => {
            println!("{:indent$}Fun {}", "", f.get_symbol(), indent = depth);
            for a in f.iter() {
                tree_walk(a, depth + 1);
            }
        }
        AtomView::Pow(p) => {
            println!("{:indent$}", " ", indent = depth);
            let (base, exp) = p.get_base_exp();

            tree_walk(base, depth + 1);
            tree_walk(exp, depth + 1);
        }
        AtomView::Mul(m) => {
            println!("{:indent$}Mul", "", indent = depth);
            for a in m.iter() {
                tree_walk(a, depth + 1);
            }
        }
        AtomView::Add(a) => {
            println!("{:indent$}Add", "", indent = depth);
            for a in a.iter() {
                tree_walk(a, depth + 1);
            }
        }
    }
}

fn main() {
    let expr: Atom = Atom::parse("x^2*y + f(1,2,3)").unwrap();
    tree_walk(expr.as_view(), 0);
}

which yields:

 Add
  Fun f
   1
   2
   3
  Mul
   Pow
    x
    2
   y

To modify parts of the expression, use pattern matching with replacement.

Output formatting

Symbolica can format its output in a customizable way. For example, it can write Mathematica-compatible output or Latex output. Here is an example:

from symbolica import Expression
a = Expression.parse('128378127123 z^(2/3)*w^2/x/y + y^4 + z^34 + x^(x+2)+3/5+f(x,x^2)')
print(a.pretty_str(multiplication_operator=' ', num_exp_as_superscript=True, number_thousands_separator='_'))
print(a.pretty_str(latex=True))
use symbolica::{printer::AtomPrinter, atom::Atom};

fn main() {
    let n =
        Atom::parse("128378127123 z^(2/3)*w^2/x/y + y^4 + z^34 + x^(x+2)+3/5+f(x,x^2)").unwrap();

    let mut p = AtomPrinter::new(n.as_view());
    p.print_opts.number_thousands_separator = Some('_');
    p.print_opts.multiplication_operator = ' ';
    p.print_opts.num_exp_as_superscript = true;
    println!("{}", p);

    let mut p = AtomPrinter::new(n.as_view());
    p.print_opts.latex = true;
    println!("{}", p);
}

which yields:

z³⁴+x^(x+2)+y⁴+f(x,x²)+128_378_127_123 z^(2/3) w² x⁻¹ y⁻¹+3/5
$$z^{34}+x^{x+2}+y^{4}+f(x,x^{2})+128378127123 z^{\frac{2}{3}} w^{2} \frac{1}{x} \frac{1}{y}+\frac{3}{5}$$

Canonical form

An expression has a normal form or canonical form, which is one preferred way in which it is written. For example, the normal form of 1+1 is 2, and the normal form of x*x is x^2. The procedure of normalizing an expression is performed after every operation, so that you should never see a non-canonical expression. Each computer algebra system has its own rules to decide which form is the normal form (or may not even write expression in a unique form!).

The (recursive) rules for normalization for Symbolica are defined per atom. For a

  • Variable
    • Always normalized
  • Number
    • Remove the common divisor between the numerator and denominator
  • Sum
    • Normalize and sort all summands
    • Remove summands that are 0
    • Add summands that are equal apart from their coefficient (\(x+ 2x \rightarrow 3x\))
  • Product
    • Normalize and sort all factors
    • If any factor is 0, return the number 0
    • Remove factor of 1
    • Convert equal factors apart from their power into a power (\(x x^a \rightarrow x^{a+1}\))
  • Power
    • Normalize the base and exponent
    • Normalize \(\text{num}^\text{num}\), \(0^a \rightarrow 1\), and \(1^a \rightarrow 1\)
  • Function
    • Normalize all arguments
    • Flatten arguments of built-in function arg(...)
    • Sort all arguments when the function is symmetric

The comparison of two atoms, which is used for sorting, is well-defined, however the convention could change between Symbolica versions.

Note that normalization is supposed to be quick, since it happens often. This means that expensive operations, such as expansion and factorization are not considered. For example, \((x+1)^2\) is a Symbolica normal form and \(1+2x+x^2\) is as well, even though they are mathematically equivalent.