trying without the --. I thought that is how the args are being separated
trying without the --. I thought that is how the args are being separated

memory_efficient_attention_forward with inputs: query : shape=(2, 4096, 8, 40) (torch.float16) key : shape=(2, 4096, 8, 40) (torch.float16) value : shape=(2, 4096, 8, 40) (torch.float16) attn_bias : <class 'NoneType'> p : 0.0 cutlassF is not supported because: xFormers wasn't build with CUDA support flshattF is not supported because: xFormers wasn't build with CUDA support tritonflashattF is not supported because: xFormers wasn't build with CUDA support triton is not available requires A100 GPU smallkF is not supported because: xFormers wasn't build with CUDA support dtype=torch.float16 (supported: {torch.float32}) max(query.shape[-1] != value.shape[-1]) > 32 unsupported embed per head: 40




memory_efficient_attention_forwardcutlassFflshattFtritonflashattFsmallkF