Hatchet News

I am sharing a research-grade, open-source trading execution framework that achieves a median end-to-end decision latency of 890 nanoseconds on commodity hardware.

The project is designed for education, systems research, and latency instrumentation, not for live trading. It focuses on understanding exactly where every nanosecond goes in a trading execution path.
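
To make the "median latency" idea concrete, here is a minimal sketch of how per-stage samples could be collected and summarized. It is not taken from the repository: `time_ns` and `percentile_ns` are hypothetical helper names, and `std::chrono::steady_clock` is used only for portability. The clock's own overhead can be a noticeable fraction of a sub-microsecond measurement, so a real harness would likely read a cycle counter instead.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: time a single call of fn and return elapsed nanoseconds.
template <typename Fn>
std::uint64_t time_ns(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return static_cast<std::uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count());
}

// Hypothetical helper: nearest-rank percentile of latency samples, p in (0, 100].
// Takes the vector by value so the caller's samples keep their original order.
inline std::uint64_t percentile_ns(std::vector<std::uint64_t> samples, double p) {
    if (samples.empty()) return 0;
    std::sort(samples.begin(), samples.end());
    auto rank = static_cast<std::size_t>(
        std::ceil(p / 100.0 * static_cast<double>(samples.size())));
    rank = std::clamp<std::size_t>(rank, 1, samples.size());
    return samples[rank - 1];
}
```

Calling `percentile_ns(samples, 50.0)` gives the median, and `percentile_ns(samples, 99.0)` gives a tail figure, which is often more informative than the median for this kind of system.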

Key features:

- Kernel-bypass networking: Direct userspace access to NICs via custom drivers, 20-50 ns RX latency
- Lock-free SPSC/MPSC queues: Zero-copy architecture
- SIMD feature extraction: About 40 ns per update using AVX-512
- Deterministic replay: Bit-identical execution paths, SHA-256 verified
- Nanosecond-level metrics: Full audit logs and performance dashboard
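
For readers unfamiliar with the lock-free SPSC pattern, a minimal C++17 sketch follows. It is an illustration under my own assumptions, not the project's implementation: `SpscQueue` is a hypothetical name, the 64-byte cache-line size is assumed, and this version copies elements rather than being zero-copy.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>

// Hypothetical single-producer/single-consumer ring buffer (illustration only).
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Call from the producer thread only. Returns false when the ring is full.
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;
        slots_[head & (Capacity - 1)] = value;
        // Release: the slot write becomes visible before the new head does.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Call from the consumer thread only. Returns nullopt when the ring is empty.
    std::optional<T> try_pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return std::nullopt;
        T value = slots_[tail & (Capacity - 1)];
        // Release: the slot read completes before the producer may reuse it.
        tail_.store(tail + 1, std::memory_order_release);
        return value;
    }

private:
    // Assumed 64-byte cache lines: keep the two indices on separate lines
    // so producer and consumer do not false-share.
    alignas(64) std::atomic<std::size_t> head_{0};
    alignas(64) std::atomic<std::size_t> tail_{0};
    alignas(64) std::array<T, Capacity> slots_{};
};
```

The indices increase monotonically and are masked on access, which lets the queue use all `Capacity` slots and tell "full" from "empty" without a spare slot.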

Technical stack: C++17 and Rust, NUMA-aware memory allocation, cache-line alignment, inline assembly for hot paths.

The framework is modular, allowing experimentation with different NIC drivers, feature extraction pipelines, or order-flow models such as Hawkes processes or Avellaneda-Stoikov logic. Everything is open source and documented.
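
As an example of the kind of model that could be plugged in, here is a small sketch of the standard Avellaneda-Stoikov quoting formulas: reservation price r = s - q·γ·σ²·(T - t) and total spread γ·σ²·(T - t) + (2/γ)·ln(1 + γ/k). The struct and function names are hypothetical and the code is not taken from the repository.

```cpp
#include <cmath>

// Hypothetical result type: reservation price and symmetric quotes around it.
struct Quotes {
    double reservation;
    double bid;
    double ask;
};

// Avellaneda-Stoikov quotes (illustrative sketch).
//   mid        current mid price s
//   inventory  signed position q
//   gamma      risk-aversion parameter (> 0)
//   sigma      volatility of the mid price
//   k          order-arrival decay parameter (> 0)
//   time_left  remaining horizon T - t
inline Quotes avellaneda_stoikov_quotes(double mid, double inventory, double gamma,
                                        double sigma, double k, double time_left) {
    const double risk_term = gamma * sigma * sigma * time_left;
    // Long inventory shifts the reservation price down, short inventory shifts it up.
    const double reservation = mid - inventory * risk_term;
    const double spread = risk_term + (2.0 / gamma) * std::log1p(gamma / k);
    return {reservation, reservation - 0.5 * spread, reservation + 0.5 * spread};
}
```

With zero inventory the quotes sit symmetrically around the mid price. A long position lowers both quotes, which makes selling more likely and buying less likely.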

Links:

- Live demo: https://submicro.krishnabajpai.me/
- Source code: https://github.com/krish567366/submicro-execution-engine
- Bare-metal NIC drivers: https://baremetalnic.krishnabajpai.me/

I would welcome feedback from anyone working on low-latency systems, networking, or HFT research.

Some questions for discussion:

- Which part of the execution path is typically hardest to optimize?
- What measurement techniques do you trust for sub-microsecond systems?

This project is for research and educational purposes only. It does not connect to exchanges or execute real trades. It is intended as a sandbox for understanding ultra-low-latency execution.

I am happy to answer questions about methodology, performance, or design trade-offs.