Hi, I’m an independent ML researcher seeking an arXiv endorsement for cs.LG. My paper introduces QEAC, a closed-loop co-optimization framework for sparse attention and KV cache quantization. I tested it on Mistral-7B and TinyLlama-1.1B, with strong results: 2% perplexity degradation at 4x compression, versus 20% for uniform 4-bit quantization.
My endorsement code is: Q9NXPT
Happy to share the PDF. Thank you!