GLM-4.7-Flash benchmarks: 4,398 tok/s on H200, 112 tok/s on RTX 6000 Ada (GGUF)
🔴 r/LocalLLaMA by /u/LayerHot
technical
No analysis available for this story.
This story was indexed before article generation was enabled.
🤖 Classification Details
Detailed benchmark results with specific hardware, configurations, and reproducible metrics. Includes throughput tables, concurrency results, and quantization comparisons.