🦀 Scientific Computing Benchmark: Rust 🦀 vs Zig ⚡ vs the father C 👴
by Daniel Boros
Jul 7
I've always been a big fan of low-level, hardware-oriented programming languages — especially when it comes to scientific computing. As the maintainer of stochastic-rs, I’ve long wanted to try writing similar code in Zig and C.
I have to admit: I'm not an expert in either Zig or C. That’s exactly why this benchmark is meant to show what happens if someone (like me) who’s used to Rust goes online, reads a few docs, browses some StackOverflow threads, and just tries to implement a basic algorithm "quick and dirty."
Instead of yet another trivial million-times for loop, I picked something more "scientific": a stochastic process that’s widely known in stochastic calculus and mathematical finance — specifically, the Ornstein–Uhlenbeck (OU) process.
We generate 1,000 trajectories, each of length 500,000. The process is not too complex: it is a simple diffusive stochastic process where each increment comes from a normal distribution and is accumulated using a basic Euler–Maruyama scheme.
Mathematically, the OU process is defined as:

$$dX_t = \theta (\mu - X_t)\,dt + \sigma\,dW_t$$

where:
- $\theta$ is the speed of mean reversion,
- $\mu$ is the long-term mean,
- $\sigma$ is the volatility,
- $dW_t$ is the standard Brownian motion increment.
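All three implementations below discretize this SDE with the same Euler–Maruyama update, starting from $X_0 = 0$ and using independent increments $\Delta W_i \sim \mathcal{N}(0, \Delta t)$; this is exactly what the inner loops in the code compute:

$$X_{i+1} = X_i + \theta (\mu - X_i)\,\Delta t + \sigma\,\Delta W_i$$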
🚀 Let's see the code
🦀 Rust solution
Before diving in: all tests were run on a MacBook Pro M1. I deliberately didn't use stochastic-rs (which already contains a lot of optimizations), but instead wrote a straightforward, iterative generator. All compiler settings were left at their defaults, and I used the rand and rand_distr crates. The code was run with:
cargo run --release
use rand::SeedableRng;
use rand_distr::{Distribution, Normal};
use std::time::Instant;

const N: usize = 500_000;
const T: f64 = 1.0;
const DT: f64 = T / N as f64;
const THETA: f64 = 1.0;
const MU: f64 = 1.0;
const SIGMA: f64 = 1.0;
const NUM_RUNS: usize = 1000;

fn run_once(rng: &mut rand::rngs::SmallRng) -> u128 {
    let start = Instant::now();

    // Pre-generate the Gaussian increments with standard deviation sqrt(dt).
    let mut gn = [0.0; N - 1];
    let normal = Normal::new(0.0, DT.sqrt()).unwrap();
    for i in 0..N - 1 {
        gn[i] = normal.sample(rng);
    }

    // Euler–Maruyama update of the OU process.
    let mut ou = [0.0; N];
    for i in 1..N {
        ou[i] = ou[i - 1] + THETA * (MU - ou[i - 1]) * DT + SIGMA * gn[i - 1];
    }

    // Keep the result alive so the loops aren't optimized away.
    std::hint::black_box(&ou);
    start.elapsed().as_nanos()
}

fn main() {
    let mut total_ns: u128 = 0;
    let mut rng = rand::rngs::SmallRng::from_rng(&mut rand::rng());
    for _ in 0..NUM_RUNS {
        total_ns += run_once(&mut rng);
    }
    let avg_ns = total_ns / NUM_RUNS as u128;
    println!("Average elapsed time over {} runs: {} ns", NUM_RUNS, avg_ns);
    println!("Total: {} ns", total_ns);
}
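For anyone who wants to reproduce this: the post doesn't list crate versions, but a dependency section roughly like the one below should be enough. The exact versions are my assumption rather than something taken from the original setup (note that SmallRng lives behind the small_rng feature):

[dependencies]
rand = { version = "0.9", features = ["small_rng"] }
rand_distr = "0.5"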
⚡ Zig solution
In Zig, as a "complete beginner," I quickly ran into some module system struggles before I even got the code to compile.
For random number generation, I used zprob. The following code was run with:
zig build run -Doptimize=ReleaseFast
const std = @import("std");
const zprob = @import("zprob");

const num_runs = 1000;

fn run_once(allocator: *std.mem.Allocator) !u64 {
    const n = 500_000;
    const t = 1.0;
    const dt = t / @as(f64, n);

    var gn = try allocator.alloc(f64, n - 1);
    defer allocator.free(gn);
    var ou = try allocator.alloc(f64, n);
    defer allocator.free(ou);

    const theta = 1.0;
    const mu = 1.0;
    const sigma = 1.0;

    // Seed the PRNG from the clock and wrap it in zprob's normal sampler.
    const seed: u64 = @intCast(std.time.microTimestamp());
    var prng = std.Random.DefaultPrng.init(seed);
    var rand = prng.random();
    var normal = zprob.Normal(f64).init(&rand);

    var timer = try std.time.Timer.start();

    // Gaussian increments with standard deviation sqrt(dt).
    for (0..n - 1) |i| {
        const sample = try normal.sample(0.0, std.math.sqrt(dt));
        gn[i] = sample;
    }

    // Euler–Maruyama update of the OU process.
    ou[0] = 0.0;
    for (1..n) |i| {
        ou[i] = ou[i - 1] + theta * (mu - ou[i - 1]) * dt + sigma * gn[i - 1];
    }

    return timer.read();
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    var allocator = gpa.allocator();
    var total_ns: u64 = 0;
    for (0..num_runs) |_| {
        const elapsed_ns = try run_once(&allocator);
        total_ns += elapsed_ns;
    }
    const avg_ns = total_ns / num_runs;
    std.debug.print("Average elapsed time over {} runs: {d} ns\n", .{ num_runs, avg_ns });
    std.debug.print("Total: {d} ns\n", .{total_ns});
}
👴 C solution
Surprisingly, my C solution went more smoothly than Zig. For sanity (and simplicity), I implemented my own normal distribution generator using the classic Box-Muller method.
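As a quick reminder, Box–Muller takes two independent uniform samples $U_1, U_2 \in (0, 1]$ and turns them into two independent standard normal samples:

$$Z_0 = \sqrt{-2 \ln U_1}\,\cos(2\pi U_2), \qquad Z_1 = \sqrt{-2 \ln U_1}\,\sin(2\pi U_2)$$

In the loop below, both values are scaled by $\sqrt{\Delta t}$, so each iteration fills two slots of the increment buffer.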
The code was compiled with:
gcc -O3 -march=native -o main main.c -lm
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>

#define N 500000
#define T 1.0
#define DT (T / (double)N)
#define THETA 1.0
#define MU 1.0
#define SIGMA 1.0
#define NUM_RUNS 1000

/* Uniform sample in (0, 1], so log() below never sees zero. */
double rand_uniform() {
    return ((double)rand() + 1.0) / ((double)RAND_MAX + 1.0);
}

long run_once() {
    double *gn = malloc((N - 1) * sizeof(double));
    double *ou = malloc(N * sizeof(double));
    if (gn == NULL || ou == NULL) {
        printf("Memory allocation failed!\n");
        exit(1);
    }

    /* Seed the C library PRNG (second-resolution time). */
    srand((unsigned int)time(NULL));
    const double dt_sqrt = sqrt(DT);

    struct timespec start_time, end_time;
    clock_gettime(CLOCK_MONOTONIC, &start_time);

    /* Box-Muller: each iteration produces two normal samples. */
    int i = 0;
    while (i < N - 1) {
        double u1 = rand_uniform();
        double u2 = rand_uniform();
        double r = sqrt(-2.0 * log(u1));
        double theta = 2.0 * M_PI * u2;
        double z0 = r * cos(theta);
        double z1 = r * sin(theta);
        gn[i] = z0 * dt_sqrt;
        if (i + 1 < N - 1) {
            gn[i + 1] = z1 * dt_sqrt;
        }
        i += 2;
    }

    /* Euler-Maruyama update of the OU process. */
    ou[0] = 0.0;
    for (int j = 1; j < N; j++) {
        ou[j] = ou[j - 1] + THETA * (MU - ou[j - 1]) * DT + SIGMA * gn[j - 1];
    }

    clock_gettime(CLOCK_MONOTONIC, &end_time);
    long seconds = end_time.tv_sec - start_time.tv_sec;
    long nanoseconds = end_time.tv_nsec - start_time.tv_nsec;
    long elapsed_ns = seconds * 1000000000L + nanoseconds;

    free(gn);
    free(ou);
    return elapsed_ns;
}

int main() {
    long total_ns = 0;
    for (int run = 0; run < NUM_RUNS; run++) {
        total_ns += run_once();
    }
    long avg_ns = total_ns / NUM_RUNS;
    printf("Average elapsed time over %d runs: %ld ns\n", NUM_RUNS, avg_ns);
    printf("Total: %ld ns\n", total_ns);
    return 0;
}
⚔️ The ugly truth
Somewhere deep down, I was hoping Rust would be the fastest — and it actually was! Of course, this result is likely a skill issue on my part, due to my lack of deep knowledge in Zig and C. But again, this benchmark was meant to illustrate what happens when a "regular guy" jumps online and tries to code something after a few Google searches and some docs reading.
📊 Results
| Language | Avg per run | Total (1000 runs) |
|---|---|---|
| Rust 🦀 | 3.08 ms | 3077 ms (~3.1 s) |
| Zig ⚡ | 3.58 ms | 3584 ms (~3.6 s) |
| C 👴 | 3.61 ms | 3611 ms (~3.6 s) |
As you can see, Rust came out on top here, finishing about half a second faster in total, which works out to roughly 16-17% faster than Zig and C in this particular setup.
💬 Let's discuss!
All code is available on GitHub: rust-dd/probability-benchmark.
Feel free to join the discussion and let’s talk about what I could (should) have done differently, possible optimizations, or other interesting approaches! Discussion available here: Discussion #1
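To kick that discussion off, one obvious candidate on the Rust side would be fusing the two loops so the increments are consumed as they are generated instead of going through the gn buffer. This is just an untested sketch (it reuses the constants and imports from the Rust listing above), and whether it actually moves the needle is exactly the kind of thing worth measuring:

fn run_once_fused(rng: &mut rand::rngs::SmallRng) -> u128 {
    let start = Instant::now();
    let normal = Normal::new(0.0, DT.sqrt()).unwrap();

    // Accumulate the OU path directly; each increment is sampled on the fly.
    let mut ou = [0.0; N];
    for i in 1..N {
        let dw: f64 = normal.sample(rng);
        ou[i] = ou[i - 1] + THETA * (MU - ou[i - 1]) * DT + SIGMA * dw;
    }

    std::hint::black_box(&ou);
    start.elapsed().as_nanos()
}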
Conclusion
Although I'm a Rust user (so I clearly can’t claim to be fully unbiased), if I put on my poker face and imagine this was my first time ever writing code, Rust still gave me by far the best developer experience. Next would be C, where — surprisingly — after a bit of Googling and some help from ChatGPT, I was able to compile the code relatively easily. However, this was definitely not the case with Zig: module installation was quite a struggle at first, and since Zig is still a young language, even large language models don’t provide very relevant support yet.