Tag
Rust
Rust is my primary language for systems that must be both fast and correct. I use it for the Polybius trading engine: async Tokio runtime, zero-copy WebSocket feeds, and a DuckDB-backed ledger.
Blog
June 1, 2026
An FFT for every GPU: ferrum-gpu and gpufft
Two Python packages, one idea: GPU FFTs that don't care whose GPU you own. gpufft wraps cuFFT and VkFFT for cross-vendor transforms today (NVIDIA, AMD, Intel, Apple); ferrum-gpu writes its kernels in pure Rust, compiled to PTX by cuda-oxide, and lands within 1.3-3.7× of cuFFT. Both are on PyPI.
March 17, 2026
Price as Geometry: Resolution, Coarse-Graining, and the Structure of Market Noise
A rigorous tour through stationary and non-stationary models of price evolution, with geometric analysis at the forefront. From the random walk null and Black-Scholes as flat geometry, through mean reversion as curved Riemannian diffusion, wavelets, geometric harmonics, and information geometry, anchored throughout by empirical evidence from BTC/ETH millisecond data.
March 9, 2026
BTC/ETH Lead-Lag: Resolution-Dependent Direction Reversal on Binance Spot
BTC/ETH lead-lag on Binance spot reverses direction with sampling resolution: ETH leads at 1 ms, BTC leads at 100 ms, and the crossover falls at 15–20 ms. Covers January and the full year 2025.
Engineering
Pathwise
High-performance SDE simulation toolkit. Rust core with Python bindings via PyO3. Supports Brownian motion, GBM, Ornstein-Uhlenbeck, and custom processes with rayon-parallelised Monte Carlo.
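As a rough illustration of the kind of simulation described here, the following is a minimal Euler-Maruyama sketch for GBM in plain Rust. It is not Pathwise's API: every name (`SplitMix64`, `gbm_euler_terminal`, `gbm_mean_terminal`) is hypothetical, the generator is a small stand-in so the sketch needs no dependencies, and the path loop is sequential rather than rayon-parallelised.

```rust
use std::f64::consts::PI;

/// SplitMix64: a tiny deterministic generator, used only to keep this sketch dependency-free.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform draw in (0, 1], so the logarithm below is always finite.
    fn uniform(&mut self) -> f64 {
        ((self.next_u64() >> 11) as f64 + 1.0) / (1u64 << 53) as f64
    }

    /// Standard normal draw via the Box-Muller transform.
    fn normal(&mut self) -> f64 {
        let (u1, u2) = (self.uniform(), self.uniform());
        (-2.0 * u1.ln()).sqrt() * (2.0 * PI * u2).cos()
    }
}

/// Euler-Maruyama terminal value of dS = mu*S dt + sigma*S dW on [0, t_end].
fn gbm_euler_terminal(
    s0: f64,
    mu: f64,
    sigma: f64,
    t_end: f64,
    steps: usize,
    rng: &mut SplitMix64,
) -> f64 {
    let dt = t_end / steps as f64;
    let sqrt_dt = dt.sqrt();
    let mut s = s0;
    for _ in 0..steps {
        let dw = sqrt_dt * rng.normal(); // Brownian increment over one step
        s += mu * s * dt + sigma * s * dw;
    }
    s
}

/// Monte Carlo mean of the terminal value over `paths` independent paths.
fn gbm_mean_terminal(
    s0: f64,
    mu: f64,
    sigma: f64,
    t_end: f64,
    steps: usize,
    paths: usize,
    seed: u64,
) -> f64 {
    let mut rng = SplitMix64(seed);
    let sum: f64 = (0..paths)
        .map(|_| gbm_euler_terminal(s0, mu, sigma, t_end, steps, &mut rng))
        .sum();
    sum / paths as f64
}
```

The sample mean should approach `s0 * exp(mu * t_end)` as the path count grows, which gives a quick sanity check on any scheme of this shape.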
ferrum-gpu
Pure-Rust GPU compute substrate with Python bindings. FFT kernels compile from Rust source straight to PTX via cuda-oxide (no CUDA C in the build) and run on NVIDIA GPUs today; cross-vendor support through spirv-oxide and Vulkan is on the v0.2 roadmap. It provides a `no_std` `Backend` trait, typed `Device<B>` and `Buffer<T, B>` facades, and 1D/2D radix-2 Stockham C2C kernels, cross-validated against `numpy.fft` across 29 GPU integration tests with relative error between 1e-4 and 1e-3. Published on PyPI.
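For readers unfamiliar with the radix-2 Stockham scheme, here is a CPU-side reference sketch in plain Rust. It is not ferrum-gpu's kernel code: `C64`, `stockham_fft`, and `naive_dft` are hypothetical names, and the check is against a naive DFT rather than `numpy.fft`. The point it illustrates is that Stockham ping-pongs between two buffers and needs no bit-reversal pass.

```rust
use std::f64::consts::PI;

/// Minimal complex number, just enough for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
struct C64 {
    re: f64,
    im: f64,
}

impl C64 {
    fn new(re: f64, im: f64) -> Self {
        C64 { re, im }
    }
    fn add(self, o: C64) -> C64 {
        C64::new(self.re + o.re, self.im + o.im)
    }
    fn sub(self, o: C64) -> C64 {
        C64::new(self.re - o.re, self.im - o.im)
    }
    fn mul(self, o: C64) -> C64 {
        C64::new(self.re * o.re - self.im * o.im, self.re * o.im + self.im * o.re)
    }
}

/// Forward radix-2 Stockham autosort FFT (length must be a power of two).
fn stockham_fft(input: &[C64]) -> Vec<C64> {
    let total = input.len();
    assert!(total.is_power_of_two(), "length must be a power of two");
    let mut x = input.to_vec();
    let mut y = vec![C64::new(0.0, 0.0); total];
    let mut n = total; // current sub-transform length
    let mut s = 1; // current stride
    while n > 1 {
        let m = n / 2;
        let theta0 = -2.0 * PI / n as f64;
        for p in 0..m {
            let angle = theta0 * p as f64;
            let w = C64::new(angle.cos(), angle.sin()); // twiddle factor
            for q in 0..s {
                let a = x[q + s * p];
                let b = x[q + s * (p + m)];
                y[q + s * (2 * p)] = a.add(b);
                y[q + s * (2 * p + 1)] = a.sub(b).mul(w);
            }
        }
        // Ping-pong: this stage's output is the next stage's input.
        std::mem::swap(&mut x, &mut y);
        n = m;
        s *= 2;
    }
    x
}

/// O(n^2) reference DFT used only for cross-checking.
fn naive_dft(input: &[C64]) -> Vec<C64> {
    let n = input.len();
    (0..n)
        .map(|k| {
            let mut acc = C64::new(0.0, 0.0);
            for (j, &v) in input.iter().enumerate() {
                let angle = -2.0 * PI * ((j * k) % n) as f64 / n as f64;
                acc = acc.add(v.mul(C64::new(angle.cos(), angle.sin())));
            }
            acc
        })
        .collect()
}
```

Comparing `stockham_fft` with `naive_dft` on a small input mirrors, in miniature, the cross-validation idea described above.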
gpufft
Cross-vendor GPU FFT for Rust, backed by VkFFT on Vulkan and cuFFT on CUDA. A single trait surface runs identically on NVIDIA, AMD, and Intel; buffers and plans are typed at the backend-and-scalar level, so mixing a Vulkan buffer with a CUDA plan, or an `f32` plan with a `Complex64` buffer, is a compile error. `plan_c2c`, `plan_r2c`, and `plan_c2r` at `f32` and `f64` over 1D, 2D, and 3D. Ships a dual-backend manylinux Python wheel cross-validated against `numpy`. Sibling to ferrum-gpu, sharing its FFT API.
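The compile-time guarantee can be pictured with phantom type parameters. The sketch below is an assumption about the general technique, not gpufft's actual definitions: `Vulkan`, `Cuda`, `Buffer`, `Plan`, and their methods are hypothetical stand-ins, and the "device memory" is just a host `Vec`.

```rust
use std::marker::PhantomData;

// Hypothetical backend markers; the real crate's types may differ.
struct Vulkan;
struct Cuda;

/// Buffer tagged with its scalar type `T` and backend `B`.
struct Buffer<T, B> {
    data: Vec<T>, // host-side stand-in for device memory
    _backend: PhantomData<B>,
}

impl<T: Clone, B> Buffer<T, B> {
    fn from_slice(data: &[T]) -> Self {
        Buffer { data: data.to_vec(), _backend: PhantomData }
    }

    fn len(&self) -> usize {
        self.data.len()
    }
}

/// Plan tagged with the same two parameters as the buffers it accepts.
struct Plan<T, B> {
    len: usize,
    _marker: PhantomData<(T, B)>,
}

impl<T: Clone, B> Plan<T, B> {
    fn new(len: usize) -> Self {
        Plan { len, _marker: PhantomData }
    }

    /// Accepts only a buffer with identical `T` and `B`; length is checked at run time.
    fn execute(&self, buf: &mut Buffer<T, B>) -> Result<usize, String> {
        if buf.len() != self.len {
            return Err(format!("plan length {} != buffer length {}", self.len, buf.len()));
        }
        // A real backend would launch the transform here.
        Ok(buf.len())
    }
}

// The following would be rejected by the compiler (mismatched types),
// because the plan and the buffer carry different backend markers:
//
//     let plan: Plan<f32, Cuda> = Plan::new(4);
//     let mut buf: Buffer<f32, Vulkan> = Buffer::from_slice(&[0.0f32; 4]);
//     plan.execute(&mut buf);
```

Because the markers are zero-sized, this kind of tagging costs nothing at run time; only genuinely dynamic properties such as length are left for a runtime check.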
Cartan
Riemannian geometry on smooth manifolds. Connection forms, parallel transport, holonomy groups, and geodesic flows. Documentation at cartan.sotofranco.dev.
Volterra
Covariant active nematics solver built on Cartan. Dimensionally agnostic: any type implementing `cartan_core::Manifold` serves as the simulation domain. Beris-Edwards equations on arbitrary Riemannian geometries via discrete exterior calculus.
von Karman
Multi-precision pseudospectral Navier-Stokes solver. ETD-RK4 time integration with Kassam-Trefethen contour-integral evaluation of the coefficients, 3/2-rule dealiasing of the cross product, and adaptive CFL time stepping. Five initial conditions, energy spectrum diagnostics, HDF5 snapshots, Parquet time series.
Elworthy
JIT compiler that specialises Bismut-Elworthy-Li formulas into SIMD kernels for unbiased Monte Carlo Greeks on non-stationary SDEs. Symbolic AST, Cranelift lowering (scalar and 2-lane F64X2), multi-dimensional Heston driver, pathwise and likelihood-ratio Malliavin parameter Greeks (machine-checked with SymPy). European call price and BEL delta cross-validated against Black-Scholes closed form and the independent blackscholes crate; both agree within four Monte Carlo standard errors. About 22× faster than a tree-walking interpreter on GBM paths.
Kloeden
Benchmark companion to pathwise and elworthy comparing hand-written SIMD C++ with Rust (LLVM + Cranelift). The same Brownian-increment fixture runs across four implementations, measuring single-thread pinned-core throughput for scalar Euler / Milstein / Taylor 1.5 schemes on GBM. A digital-delta correctness table shows that naive pathwise differentiation silently returns 0 in both languages, while the Bismut-Elworthy-Li constant-flow weight matches the analytic value within four Monte Carlo standard errors (bitwise-identical between hand-rolled C++, hand-rolled Rust, and `elworthy_rt::from_paths`). Named after Peter Kloeden.
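The digital-delta failure mode is easy to reproduce in a few lines. The sketch below is a standalone illustration, not code from kloeden or elworthy: all names are hypothetical, the generator is a small dependency-free stand-in, and GBM is sampled exactly at maturity instead of being stepped. It assumes a cash-or-nothing call paying 1 when S_T > K, for which the Bismut-Elworthy-Li weight under GBM reduces to W_T / (S_0 σ T).

```rust
use std::f64::consts::PI;

/// SplitMix64 plus Box-Muller: a dependency-free stand-in for a real RNG.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform draw in (0, 1], so the logarithm below is always finite.
    fn uniform(&mut self) -> f64 {
        ((self.next_u64() >> 11) as f64 + 1.0) / (1u64 << 53) as f64
    }

    fn normal(&mut self) -> f64 {
        let (u1, u2) = (self.uniform(), self.uniform());
        (-2.0 * u1.ln()).sqrt() * (2.0 * PI * u2).cos()
    }
}

/// Black-Scholes delta of a cash-or-nothing call paying 1 if S_T > K.
fn digital_delta_analytic(s0: f64, k: f64, r: f64, sigma: f64, t: f64) -> f64 {
    let d2 = ((s0 / k).ln() + (r - 0.5 * sigma * sigma) * t) / (sigma * t.sqrt());
    let pdf = (-0.5 * d2 * d2).exp() / (2.0 * PI).sqrt();
    (-r * t).exp() * pdf / (s0 * sigma * t.sqrt())
}

/// Returns (naive pathwise estimate, BEL estimate, BEL standard error).
fn digital_delta_mc(
    s0: f64,
    k: f64,
    r: f64,
    sigma: f64,
    t: f64,
    paths: usize,
    seed: u64,
) -> (f64, f64, f64) {
    let mut rng = SplitMix64(seed);
    let disc = (-r * t).exp();
    let (mut pw_sum, mut bel_sum, mut bel_sq) = (0.0_f64, 0.0_f64, 0.0_f64);
    for _ in 0..paths {
        let w_t = t.sqrt() * rng.normal();
        let s_t = s0 * ((r - 0.5 * sigma * sigma) * t + sigma * w_t).exp();
        let payoff = if s_t > k { 1.0 } else { 0.0 };

        // Naive pathwise: payoff'(S_T) * dS_T/dS_0. The indicator's derivative
        // is 0 almost everywhere, so every sample contributes exactly 0.
        let payoff_derivative = 0.0;
        pw_sum += disc * payoff_derivative * (s_t / s0);

        // BEL: differentiate the measure instead of the payoff.
        let sample = disc * payoff * w_t / (s0 * sigma * t);
        bel_sum += sample;
        bel_sq += sample * sample;
    }
    let n = paths as f64;
    let bel_mean = bel_sum / n;
    let var = (bel_sq / n - bel_mean * bel_mean).max(0.0);
    (pw_sum / n, bel_mean, (var / n).sqrt())
}
```

The pathwise estimator fails without raising any error, which is why a correctness table is the natural way to expose it; the weighted estimator never differentiates the payoff and so is unaffected by the discontinuity.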
Mermin
$k$-atic alignment analysis of fluorescence microscopy. Minkowski tensor shape descriptors, multiscale structure tensor fields, topological defect detection via cartan-geo $\mathrm{SO}(3)$ holonomy, persistent homology, spatial statistics, and Landau-de Gennes parameter fitting. Outputs calibrated parameters for forward simulation in volterra. 7 crates on crates.io, Python bindings on PyPI.
inferCNAsc
Copy number variation and ascertainment bias inference for single-cell genomics. Rust compute core with Python bindings via PyO3/maturin. Published on PyPI.
cuda-oxide #117
Merged upstream PR to NVlabs/cuda-oxide. feat(codegen): fma contraction and an opt -O3 pass to match nvcc defaults. Merged 2026-06-18T05:37:45Z.
cuda-oxide #256
Merged upstream PR to NVlabs/cuda-oxide. feat(cargo-oxide): add `emit-ltoir` to build a crate's LTOIR in one step. Merged 2026-06-20T12:56:37Z.
cuda-oxide #257
Merged upstream PR to NVlabs/cuda-oxide. fix(cargo-oxide): rebuild cached backend when its source advances. Merged 2026-06-20T12:57:09Z.