Llama 3.1 via Ollama
You can now follow the tutorial on running Ollama in serverless environments (https://docs.runpod.io/tutorials/serverless/cpu/run-ollama-inference) to serve Llama 3.1.
We have tested this with Llama 3.1 8B, using a network volume and a 24 GB GPU PRO. Please let us know if this setup also works with other weights and GPUs.

Learn how to set up and run an Ollama server on a RunPod CPU instance for inference with this step-by-step tutorial.
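
Once the endpoint from the tutorial is deployed, a request to it might look like the sketch below. The endpoint ID, the API key variable, and the exact shape of the `input` payload are assumptions here, since they depend on how the handler from the tutorial forwards requests to Ollama, so adjust them to match your deployment.

```python
# Minimal sketch: send a Llama 3.1 prompt to a RunPod serverless endpoint
# that runs Ollama. ENDPOINT_ID is a placeholder; the inner "input" fields
# are assumptions that must match the handler built in the tutorial.
import os
import requests

ENDPOINT_ID = "your-endpoint-id"            # placeholder: your serverless endpoint ID
API_KEY = os.environ["RUNPOD_API_KEY"]      # your RunPod API key, read from the environment

# /runsync waits for the worker to finish and returns the result directly.
response = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "input": {
            # Assumed payload shape: the handler passes these fields to Ollama's
            # generate endpoint. Check the tutorial's handler for the exact schema.
            "model": "llama3.1",
            "prompt": "Why is the sky blue?",
        }
    },
    timeout=300,
)
response.raise_for_status()
print(response.json())
```

If you are testing against a pod where Ollama listens locally instead of a serverless endpoint, the same prompt can be sent straight to Ollama's own API at `http://localhost:11434/api/generate` with a JSON body containing `model` and `prompt`.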



