Hatchet News

13h

arxiv.org

Lossless LLM compression for efficient GPU inference via dynamic-length float

326

106 CharlesW

Designed and developed by Tommy Chow (GitHub)