## Performance Notes
pyhuge 0.3 runtime is dominated by numerical solver cost.
### Practical guidance
- Use `ct` for fast threshold-style path baselines.
- Use `mb` or `glasso` when selection quality matters more than raw speed.
- `stars` is slower than `ric`/`ebic` because it resamples repeatedly.
- Reuse transformed data from `huge_npn(...)` when running multiple methods.
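The last point — transform once, then run several methods on the same matrix — can be sketched with plain NumPy. Here `scaled_ranks` is a stand-in for `huge_npn` (pyhuge's real transform also applies a Gaussian quantile step), and correlation thresholding at two levels stands in for multiple method runs:

```python
import numpy as np

def scaled_ranks(x):
    # Stand-in for huge_npn: map each column to uniform ranks in (0, 1).
    n = x.shape[0]
    return (x.argsort(axis=0).argsort(axis=0) + 1) / (n + 1)

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 10))

# Transform once...
z = scaled_ranks(x)

# ...then reuse the same transformed matrix for several runs, here
# represented by correlation thresholding at two sparsity levels.
corr = np.corrcoef(z, rowvar=False)
graphs = {thr: (np.abs(corr) > thr) for thr in (0.1, 0.3)}
```

The point of the pattern is that the rank transform is paid for once, not once per method.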
### Optional acceleration
The optional `pyhuge._native_core` extension accelerates selected kernels (e.g., the threshold path and sparsity computations). Check availability:
```python
import pyhuge

print(pyhuge.test()["native_extension"])
```
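In code that should degrade gracefully when the extension (or even pyhuge itself) is absent, the check can be wrapped so a missing module never raises. A minimal sketch; `native_available` is a hypothetical helper, not part of pyhuge's API:

```python
import importlib.util

def native_available(module="pyhuge._native_core"):
    # True when the module (or compiled extension) is importable here.
    # find_spec raises ModuleNotFoundError when a parent package of a
    # dotted name is absent, so treat that the same as "not available".
    try:
        return importlib.util.find_spec(module) is not None
    except ModuleNotFoundError:
        return False
```

Callers can then branch once at startup instead of wrapping every accelerated call in try/except.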
### Benchmark pattern
```python
import time

import numpy as np
from pyhuge import huge

# Synthetic data: 300 samples, 100 variables.
x = np.random.default_rng(0).normal(size=(300, 100))

t0 = time.perf_counter()
fit = huge(x, method="mb", nlambda=10, verbose=False)
print("sec", time.perf_counter() - t0, "path", len(fit.path))
```
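For less noisy numbers, the single-run pattern can be repeated and the best wall-clock time kept. A generic sketch; `bench` is a hypothetical helper, shown with a stand-in workload in place of a `huge(...)` call:

```python
import time

def bench(fn, repeats=3):
    # Returns (best_seconds, last_result); best-of-N damps scheduler noise.
    best, result = float("inf"), None
    for _ in range(repeats):
        t0 = time.perf_counter()
        result = fn()
        best = min(best, time.perf_counter() - t0)
    return best, result

# Stand-in workload; replace with e.g.
# lambda: huge(x, method="mb", nlambda=10, verbose=False).
sec, out = bench(lambda: sum(i * i for i in range(100_000)))
```

Best-of-N is usually preferable to the mean here, since timing noise is one-sided (runs only get slower, not faster, than the true cost).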
### Native vs R parity report
A dedicated script produces reproducible parity metrics against a local R `huge` installation (when available):
```shell
cd python-package
python scripts/r_parity_report.py --out parity_report.json
```
Current behavior:

- `ct` + `stars` parity is evaluated by default.
- `glasso` + `ebic` parity is evaluated via the native C++ backend.
Use the JSON output to track drift after solver or selection changes.
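Drift tracking can be as simple as diffing two reports metric by metric. In this sketch the `{"metrics": {...}}` shape and the metric names are assumptions for illustration — the real `parity_report.json` schema may differ, so adapt the key access:

```python
import json

# Two reports as they might be written on different commits; the shape
# and metric names here are hypothetical.
baseline = json.loads(
    '{"metrics": {"ct_stars_jaccard": 0.98, "glasso_ebic_maxdiff": 1e-06}}')
current = json.loads(
    '{"metrics": {"ct_stars_jaccard": 0.97, "glasso_ebic_maxdiff": 5e-06}}')

def drift(base, cur, rel_tol=0.05):
    # Flag metrics whose relative change exceeds rel_tol.
    flagged = {}
    for name, b in base["metrics"].items():
        c = cur["metrics"].get(name)
        if c is None:
            continue
        denom = abs(b) if b else 1.0
        change = abs(c - b) / denom
        if change > rel_tol:
            flagged[name] = change
    return flagged

flagged = drift(baseline, current)
```

Running such a check in CI after solver or selection changes turns silent numerical drift into a visible failure.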