Runpod•7mo ago
Ellroy

Can't deploy Qwen/Qwen2.5-14B-Instruct-1M on serverless

Steps to reproduce:

  1. Use Serverless vLLM quick deploy for Qwen/Qwen2.5-14B-Instruct-1M (image attached)
  2. Proceed with default config.
  3. Try to send a request.
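
For anyone reproducing step 3, here is a minimal sketch of the kind of request involved. The endpoint ID and API key are hypothetical placeholders, and the URL shape and `input` payload are assumptions about how a Runpod serverless vLLM endpoint is typically called, not something taken from the logs in this post.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute your own endpoint ID and API key.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-api-key"


def build_request(prompt: str) -> urllib.request.Request:
    """Build (but do not send) a synchronous request to the endpoint."""
    # Assumed URL shape for a serverless endpoint's synchronous route.
    url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync"
    # Assumed payload shape for the vLLM worker: a single text prompt.
    payload = {"input": {"prompt": prompt}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Sending the request is what triggers the worker to start and load the model.
    req = build_request("Hello")
    with urllib.request.urlopen(req, timeout=120) as resp:
        print(json.loads(resp.read()))
```

Any request of this kind should be enough to start a worker, which is when the engine initialization error appears in the logs.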

Error:

2025-06-18T12:58:36.147823280Z INFO 06-18 12:58:36 [model_runner.py:1170] Starting to load model Qwen/Qwen2.5-14B-Instruct-1M...
2025-06-18T12:58:36.449947523Z engine.py:116  2025-06-18 12:58:36,449 Error initializing vLLM engine: FlashAttentionImpl.__init__() got an unexpected keyword argument 'layer_idx'


How do I fix this?

I've been trying to troubleshoot this all morning. All help appreciated 🙏
Screenshot_2025-06-18_at_14.05.04.png