LANBench
Have you used LANBench to optimize your AI server? Share your performance results and tuning tips in the comments below.
This is the story of LANBench, the silent sentinel of the local network.
| Metric | What it measures | Good Threshold (LAN) |
| :--- | :--- | :--- |
| Time to First Token (TTFT) | Latency from request send to first token back. | < 100ms for streaming. |
| Token/s (throughput) | Tokens generated per second across the network. | > 80% of local speed. |
| P95 Latency | Worst-case latency for 95% of requests. | < 500ms for interactive use. |
| Request Failures | Timeouts or connection resets. | 0% on a healthy LAN. |
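To make the thresholds concrete, here is a minimal Python sketch of how per-request measurements might be reduced to these four metrics and checked against the table. It is not LANBench's own code: the `Sample` fields and the `summarize` and `check_thresholds` helpers are hypothetical names chosen for illustration. It also assumes that P95 refers to full request latency, that throughput is total tokens divided by total request time, and that the TTFT threshold is applied to the median.

```python
import math
import statistics
from dataclasses import dataclass


@dataclass
class Sample:
    """One benchmark request (hypothetical record layout)."""
    ttft_ms: float        # time from request send to first token, in ms
    latency_ms: float     # full request latency, in ms (assumed meaning of "latency")
    tokens: int           # tokens received over the network
    failed: bool = False  # True for a timeout or connection reset


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(samples, local_tokens_per_s):
    """Reduce raw samples to the four metrics in the table."""
    ok = [s for s in samples if not s.failed]
    if not ok:
        raise ValueError("no successful requests to summarize")
    total_s = sum(s.latency_ms for s in ok) / 1000
    tokens_per_s = sum(s.tokens for s in ok) / total_s
    return {
        "median_ttft_ms": statistics.median(s.ttft_ms for s in ok),
        "tokens_per_s": tokens_per_s,
        # Throughput over the LAN relative to the same model measured locally.
        "throughput_ratio": tokens_per_s / local_tokens_per_s,
        "p95_latency_ms": percentile([s.latency_ms for s in ok], 95),
        "failure_rate": (len(samples) - len(ok)) / len(samples),
    }


def check_thresholds(report):
    """Compare a report against the 'Good Threshold (LAN)' column."""
    return {
        "ttft": report["median_ttft_ms"] < 100,
        "throughput": report["throughput_ratio"] > 0.8,
        "p95": report["p95_latency_ms"] < 500,
        "failures": report["failure_rate"] == 0,
    }
```

The baseline `local_tokens_per_s` has to come from a separate run on the server itself, since the "80% of local speed" threshold is only meaningful relative to that number.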
At its core, LANBench is a benchmarking framework designed to test Large Language Models (LLMs) and AI inference servers over a Local Area Network (LAN). Traditional benchmarks run on the same machine as the model, which can mask network latency and serialization overhead. LANBench instead simulates a real-world client-server architecture, with the client and the server on separate machines.
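The client-side measurement this implies can be sketched in a few lines of Python. This is an illustration under stated assumptions, not LANBench's implementation: `time_stream` is a hypothetical helper, each streamed chunk is counted as one token, and the iterable is assumed to be lazy, so the request is only sent when iteration begins and the start timestamp is taken before the send.

```python
import time
from typing import Callable, Iterable


def time_stream(chunks: Iterable[str],
                clock: Callable[[], float] = time.perf_counter) -> dict:
    """Time a streamed response as seen by the client.

    `chunks` would normally wrap a streaming HTTP response from a server
    on another machine; any iterable works, which keeps the timing logic
    independent of the transport.
    """
    start = clock()
    first = None
    count = 0
    for _ in chunks:
        if first is None:
            first = clock()  # the first chunk marks time to first token
        count += 1           # assumption: one chunk equals one token
    end = clock()
    elapsed = end - start
    return {
        "ttft_ms": None if first is None else (first - start) * 1000,
        "tokens": count,
        "tokens_per_s": count / elapsed if elapsed > 0 else 0.0,
    }
```

Running the same function once against the server from a LAN client and once on the server itself gives the pair of numbers needed to see how much of the latency comes from the network and serialization rather than from the model.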