Term streaming
In many scenarios an expression contains many terms, where each term is relatively small. Also often, symbolic operations can be performed term by term. The list of common operations that cannot be done term per term is short:
- Factorization
- GCD computation
- Horner scheme construction
- Pattern match \(a+b\)
- …
The following operations can be done term per term:
- Most pattern matches, such as \(y f(x)\)
- Expansion
- Differentiation
- Integration
- …
When using some global memory, operations such as finding the maximum term, or counting the number of terms that satisfy a certain condition can also be done term-by-term.
With this in mind, we realize that for many operations the expression does not need to be in memory as a whole at the same time: only one term needs to be in memory. Once it is processed, it can be written to disk. We call this term streaming and it allows for huge expressions to be processed.
Symbolica can convert an expression into a stream and it provides an easy way to perform operations on terms in parallel by using a map
. The input of the map
is one term and the output should be an expression. In the example below a derivative is applied to each term, in parallel.
from symbolica import *
= Expression.symbol('x', 'x_', 'f')
x, x_, f = Expression.parse('x+ f(x) + 2*f(y) + 7*f(z)')
e
= TermStreamer(n_cores=2)
s
s.push(e)= s.map(Transformer().replace_all(f(x_), f(x) + x))
e print(s.to_expression())
More about Transformer
s can be read here.
let input = Atom::parse("x+ f(x) + 2*f(y) + 7*f(z)").unwrap();
let pattern = Pattern::parse("f(x_)").unwrap();
let rhs = Pattern::parse("f(x) + x").unwrap();
let mut stream = TermStreamer::<CompressorWriter<_>>::new(TermStreamerConfig {
: 4,
n_cores: ".".to_owned(),
path: 40,
max_mem_bytes});
.push(input);
stream
// map every term in the expression
= stream.map(|x| pattern.replace_all(x.as_view(), &rhs, None, None).expand());
stream
let res = stream.to_expression();
println!("\t+ {}", res);
Normalization
Since every term is processed independently, there is no merging of terms (for example x + 2x = 3x
). To force the normalization of the expression, the user can call normalize
on the term stream. This will sort the entire expression using a merge sort (potentially disk-based).
Global information
Information that involves comparison between terms can be collected using appropriate synchronized structures, like atomics or mutexes. Currently, this is only possible in Rust. Note that due to Rust’s borrow checker, there can be no memory problems such as race conditions.
In the example below, the maximum numerical argument of the function f
over all terms is determined (in parallel).
Not available yet.
Obtain the maximal function argument using an atomic integer.
use std::sync::atomic::{AtomicI64, Ordering};
use symbolica::{
coefficient::CoefficientView,
id::{Condition, Match, MatchSettings, Pattern},
atom::{Atom, AtomCore, AtomView},
streaming::TermStreamer,
};
fn main() {
let x_ = Symbol::new("x_");
let input = Atom::parse("f(1)+f(2)+f(3)+f(4)").unwrap();
let pattern = Pattern::parse("f(x_)").unwrap();
let mut stream = TermStreamer::default();
.push(input);
stream
let max_argument = AtomicI64::new(0);
// map every term in the expression
.map(|_, term| {
streamif let Some((_, _, _, m)) = pattern
.pattern_match(
.as_view(),
term&Condition::default(),
&MatchSettings::default(),
).next()
{
if let Some(Match::Single(n)) = m.get(x_) {
if let AtomView::Num(nn) = n {
if let CoefficientView::Natural(a, 1) = nn.get_coeff_view() {
.fetch_max(a, Ordering::SeqCst);
max_argument}
}
}
}
term});
println!("Max argument: {}", max_argument.load(Ordering::SeqCst));
}