"Your Worker failed validation because it exceeded startup limits" (global scope)

I love developing with Workers, but I keep getting errors about the startup limits (CPU time). They seem quite random and hard to debug; pushing multiple times sometimes works. Coming from AWS Lambda, the global scope is not a bad place to cache certain data. How should we do that in Cloudflare Workers, e.g. reusing database connections? Take this code: it won't deploy, even though I am not doing anything in the global scope at cold start:
import {
  AICache,
  authorize,
  CorsResponse,
  OpenAIStreamHandler,
} from "@tm9657/backend-worker";
import GPT3Tokenizer from "gpt3-tokenizer";
import { Configuration, OpenAIApi } from "openai";

export interface Env {
  DATABASE_URL: string;
  OPENAI_KEY: string;
}

type AIRequestBody = {
  prompt: string;
  system?: string;
  top_p?: number;
  frequency_penalty?: number;
  max_tokens?: number;
};

let cache: AICache | null = null;

export default {
  async fetch(
    request: Request,
    env: Env,
    ctx: ExecutionContext
  ): Promise<Response> {
    const auth = await authorize(request);
    if (!auth) return new CorsResponse("Unauthorized", 401).finalize(request);

    const body: AIRequestBody = await request.json();
    if (!body || !body?.prompt)
      return new CorsResponse("Bad Request", 400).finalize(request);
    if (!cache) cache = new AICache();
    cache.init(env.DATABASE_URL);
    let openai: OpenAIApi = new OpenAIApi(
      new Configuration({
        apiKey: env.OPENAI_KEY,
      })
    );

    if (!openai)
      return new CorsResponse("Internal Server Error", 500).finalize(request);

    const usage = await cache.getUsage(auth.sub);
    if (!usage.getHardLimit())
      return new CorsResponse(
        "Monthly API limit hit, please upgrade your subscription!",
        429
      ).finalize(request);

    const tokenizer = new GPT3Tokenizer({ type: "gpt3" });
    const { readable, writable } = new TransformStream();

    const openAIRequest = await openai.createChatCompletion(
      {
        model: "gpt-3.5-turbo",
        messages: [
          { role: "system", content: body.system || "" },
          { role: "user", content: `${body.prompt}` },
        ],
        top_p: body.top_p || 0.05,
        max_tokens:
          4096 - tokenizer.encode(`${body.prompt} ${body.system}`).bpe.length,
        user: auth.sub,
        frequency_penalty: body.frequency_penalty || 1.0,
        stream: true,
      },
      { responseType: "stream" }
    );

    const writableStream = writable.getWriter();

    let total = "";
    const handler = new OpenAIStreamHandler(
      openAIRequest,
      async (message) => {
        total += message;
        await writableStream.write({ total, message });
      },
      async () => {
        await writableStream.close();
        await cache?.updateUsage(
          auth.sub,
          "chatGPT",
          tokenizer.encode(`${body.prompt} ${body.system} ${total}`).bpe.length
        );
      }
    ).promise();

    return new CorsResponse(readable).finalize(request);
  },
};
6 Replies
James · 13mo ago
Debugging CPU startup time issues in Workers leaves a lot to be desired, unfortunately. Profiling is very tough. Almost certainly one of the libraries you're importing (gpt3-tokenizer, openai, etc.) is doing something in the global scope on startup that exceeds the ~400ms of CPU startup time you get. You could try to load these asynchronously via await import() if possible, depending on your use case.
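For illustration, a rough sketch of that await import() approach, assuming gpt3-tokenizer turns out to be the heavy module (whether this actually helps depends on your bundler keeping the dynamic import lazy):

import type GPT3Tokenizer from "gpt3-tokenizer"; // type-only, erased at build time

// Cached instance so the dynamic import and construction only happen once per isolate.
let tokenizer: GPT3Tokenizer | null = null;

export default {
  async fetch(request: Request): Promise<Response> {
    if (!tokenizer) {
      // The import runs on the first request, so its module-level work counts
      // against request CPU time instead of the Worker startup limit.
      const { default: GPT3TokenizerCtor } = await import("gpt3-tokenizer");
      tokenizer = new GPT3TokenizerCtor({ type: "gpt3" });
    }
    const count = tokenizer.encode("hello world").bpe.length;
    return new Response(`tokens: ${count}`);
  },
};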
James · 13mo ago
As for reusing database connections, I believe the recommended approach right now is to use a Durable Object: https://developers.cloudflare.com/workers/learning/using-durable-objects/
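For illustration, a rough sketch of that Durable Object approach; DatabaseProxy, createConnection, and the DATABASE_PROXY binding are placeholders rather than anything from the docs or a specific client library:

// Stand-in for whatever your database client exposes to open a connection.
declare function createConnection(url: string): Promise<unknown>;

export interface Env {
  DATABASE_URL: string;
  DATABASE_PROXY: DurableObjectNamespace; // requires a durable_objects binding in wrangler.toml
}

export class DatabaseProxy {
  private connection: unknown | null = null;

  constructor(private state: DurableObjectState, private env: Env) {}

  async fetch(request: Request): Promise<Response> {
    // The connection is opened once per Durable Object instance and reused
    // across every request routed to that instance.
    if (!this.connection) {
      this.connection = await createConnection(this.env.DATABASE_URL);
    }
    // ...run whatever query `request` describes against this.connection...
    return new Response("ok");
  }
}

// Worker side: send all database traffic to one named instance.
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const id = env.DATABASE_PROXY.idFromName("primary");
    return env.DATABASE_PROXY.get(id).fetch(request);
  },
};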
Felix || TM9657 · 13mo ago
Thank you for your quick answer, James 🙂 I will try to debug it further. The DX just suffers a bit when you have to pray for each function to ship after writing it against the more limited Workers API. On caching: is it okay to dynamically populate all the global objects, i.e. initialize them with null and set them on demand?
James · 13mo ago
If you're confident there's no chance of anything leaking between requests when caching in global scope, then yeah that should be fine
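For illustration, a minimal sketch of that pattern; ExpensiveClient is a placeholder for whatever gets cached (AICache, the OpenAI client, a parsed config, ...):

// Placeholder for an object that is expensive to construct.
class ExpensiveClient {
  constructor(public readonly databaseUrl: string) {}
}

// Nothing heavy runs at startup; the first request on an isolate pays the
// init cost and later requests on the same isolate reuse the object.
let cached: ExpensiveClient | null = null;

export default {
  async fetch(request: Request, env: { DATABASE_URL: string }): Promise<Response> {
    // env is only available inside the handler, so lazy init here is the natural spot.
    if (!cached) cached = new ExpensiveClient(env.DATABASE_URL);
    // Never store request-specific state (auth, body, user IDs) on `cached`,
    // or it can leak between requests that share the isolate.
    return new Response(`connected to ${cached.databaseUrl}`);
  },
};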
Felix || TM9657 · 13mo ago
:MeowHeartCloudflare: :plus1green: chance (good enough)