Approach for long-running backend operations

Hello all, I wanted to get a feel for the best approach when it comes to server operations that take a long time to complete - say hours or days. I've been thinking of starting a Python service that takes care of these ops. The frontend sends a request to the service, the service does its thing via Celery, and when it finishes it updates Wasp's DB via an API request. This means each long-running operation needs an entity in the DB so we can set its status. This way the web app is always up to date with operation status, doesn't hang, and the operation doesn't get interrupted on page refresh or browser exit. Is there a simpler approach here?
15 Replies
kapa.ai · 4w ago
Hi @aspiringaiguy. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

For long-running backend operations (hours/days), your approach of offloading the work to a separate service (like a Python service with Celery) and tracking status in the database is solid and aligns with best practices for web apps. This ensures your main Wasp server remains responsive, and users can track progress without blocking the UI. However, Wasp also has built-in support for background jobs, which might simplify your setup if your requirements fit within its capabilities. Wasp jobs:
- Run in the background, separate from the main request/response cycle.
- Persist between server restarts.
- Can be retried if they fail.
- Can be scheduled for the future or set to recur.
- Are managed via PostgreSQL using pg-boss, which is suitable for low-volume production use cases and doesn't require extra infrastructure beyond PostgreSQL.

You can declare a job in your Wasp app and have it perform long-running work. For example:
job myLongJob {
  executor: PgBoss,
  perform: {
    fn: import { doLongTask } from "@src/workers/longTask"
  }
}
You'd then trigger this job from your server-side code, and you can store job status in your database (using an entity) to keep the frontend updated on progress or completion. This approach keeps everything within the Wasp ecosystem and avoids the need for a separate Python/Celery service, unless you have requirements (like heavy CPU workloads or advanced orchestration) that Wasp's job system doesn't cover yet. Note that Wasp jobs are not suitable for CPU-heavy workloads, and for more advanced needs, you might still want to use an external service as you described. In summary:
- For most use-cases, Wasp's built-in jobs (with status tracking via entities) are the simplest approach.
- For very long-running, CPU-intensive, or complex workflows, your external service approach is appropriate.

More details on Wasp jobs and their limitations can be found in the official docs: Wasp Recurring Jobs and Declaring Jobs.
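[Editor's note: to make the "trigger the job and store status in an entity" part concrete, here is a minimal sketch of a Wasp action doing both. The action name startLongTask, the LongOperation entity, and its status field are hypothetical, not from the docs; the declaration would need entities: [LongOperation] for context.entities to include it.]

// src/operations.ts - sketch only, names are placeholders
import { myLongJob } from 'wasp/server/jobs'
import { type StartLongTask } from 'wasp/server/operations'

export const startLongTask: StartLongTask<void, string> = async (_args, context) => {
  // Create a tracking row first so the frontend can poll its status.
  const op = await context.entities.LongOperation.create({
    data: { status: 'QUEUED' },
  })
  // Hand the work to the background job; the action returns immediately.
  await myLongJob.submit({ operationId: op.id })
  return op.id
}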
aspiringaiguy (OP) · 4w ago
Thanks @kapa.ai . I was under the impression that Jobs were only for recurring tasks and that I couldn't trigger them via backend operations. Thanks for clearing that up. Are there limitations to Jobs that I should be aware of?
kapa.ai · 4w ago
Let me check the knowledge base for you again. One moment! :books:

You're welcome! It's a common misconception: Wasp Jobs are not limited to recurring tasks. You can trigger them directly from your backend operations or any NodeJS code, not just on a schedule. For example, you can submit a job like this:
import { mySpecialJob } from 'wasp/server/jobs'

const submittedJob = await mySpecialJob.submit({ job: "Johnny" })
Or, if you want to delay execution:
await mySpecialJob
  .delay(10) // delay in seconds
  .submit({ name: "Johnny" })
Limitations and considerations (see Job Definition and Usage):
- Executor: Currently, Wasp only supports the PgBoss executor, which requires your app's database to be PostgreSQL. If you use another database (like SQLite), jobs will not work and you'll get an error if you try to use them (Declaring Jobs).
- CPU-Intensive Work: Jobs run alongside your server code, sharing the same CPU. They are not suitable for CPU-heavy workloads. For such cases, consider using an external worker system (pg-boss considerations).
- Scaling: Wasp does not yet support running jobs in separate processes or scaling job workers independently from your main server. All jobs run in the same Node.js process as your server (pg-boss considerations).
- Job Modification: If you change or remove a scheduled job, you may need to manually clean up old schedules in the pgboss schema of your database to avoid stale jobs (pg-boss considerations).
- Production Suitability: The current job system is recommended for low-volume production use cases. For high-throughput or advanced orchestration, you may need a more robust solution.

In summary, Wasp Jobs are flexible and can be triggered on-demand or scheduled, but they are best suited for lightweight, non-CPU-intensive background tasks in apps using PostgreSQL. For more demanding scenarios, an external job system may still be necessary.
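[Editor's note: for the retry behavior mentioned above, the docs expose pg-boss options on the job declaration via executorOptions. A sketch extending the earlier declaration; the option values here are illustrative, and the exact keys follow pg-boss's send() options.]

job myLongJob {
  executor: PgBoss,
  perform: {
    fn: import { doLongTask } from "@src/workers/longTask",
    executorOptions: {
      pgBoss: {=json { "retryLimit": 2, "retryDelay": 60 } json=}
    }
  }
}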
aspiringaiguy (OP) · 4w ago
I like the fact that Jobs are connected to the DB so the operation status is up to date. How about I use Jobs to trigger long-running tasks in my Python service? The heavy workload goes to the service but the Job keeps track of it:

Backend operation -> starts Job -> Job triggers Python service

What do you think @kapa.ai?
kapa.ai · 4w ago
Let me check the knowledge base for you again. One moment! :books:

Your approach - using Wasp Jobs to trigger long-running tasks in a Python service, while leveraging the job system to track status - is both reasonable and aligns with how Wasp is designed to be used.

How this works in Wasp:
- You can define a Job in Wasp that, when executed, calls out to your Python service (for example, by making an HTTP request or spawning a child process).
- The Job itself will be tracked in the database via pg-boss, so you can monitor its status (queued, running, completed, failed, etc.).
- You can also update your own entities in the database from within the Job to provide more granular or custom status updates for your frontend.

Relevant details from the knowledge sources:
- Wasp Jobs are not limited to recurring tasks; you can trigger them from backend operations and use them for background processing (Job Definition and Usage).
- It is possible (and sometimes recommended) to run external processes (like a Python script or service) from your Wasp server code, either by spawning a process or making HTTP requests (Discord: adding python apps, Discord: GPTs).
- Wasp Jobs persist between server restarts and can be used to track the status of background operations (Recurring Jobs).

Considerations:
- Wasp Jobs themselves run in the Node.js process, so the actual heavy work should be offloaded to your Python service, as you plan.
- You can update the status of the operation in your database from within the Job, ensuring your frontend is always up to date.
- Make sure to handle error cases and timeouts, since the Job will be waiting for the Python service to respond or complete.
Summary: Your plan is a good fit for Wasp's current capabilities. Use Jobs to orchestrate and track the status, and delegate the heavy lifting to your Python service. This gives you the best of both worlds: robust status tracking and scalable, language-agnostic background processing. If you need more details on how to call external services from a Wasp Job, let me know!
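[Editor's note: a sketch of what the Job's perform function might look like when it only dispatches the work. The PYTHON_SERVICE_URL env var, the /tasks endpoint, and the LongOperation entity are all assumptions for illustration, not anything the docs prescribe.]

// src/workers/longTask.ts - sketch only, names are placeholders
import { type MyLongJob } from 'wasp/server/jobs'

export const doLongTask: MyLongJob<{ operationId: string }, void> = async (args, context) => {
  // Dispatch the heavy work to the Python service instead of doing it in Node.js.
  const res = await fetch(`${process.env.PYTHON_SERVICE_URL}/tasks`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ operationId: args.operationId }),
  })
  if (!res.ok) throw new Error(`Python service returned ${res.status}`)
  // Record the hand-off; the Python service reports progress from here on.
  await context.entities.LongOperation.update({
    where: { id: args.operationId },
    data: { status: 'DISPATCHED' },
  })
}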
aspiringaiguy (OP) · 4w ago
How would I handle timeouts, actually? Or server restarts? The Job persists, yes, but the connection with my Python service will be severed, no? If there is no straightforward solution here then I'd prefer the Python service to use Celery for long-running jobs (with persistence) and communicate directly with the DB to update status. What do you think @kapa.ai?
kapa.ai · 4w ago
Let me check the knowledge base for you again. One moment! :books:

You're correct in your concerns: while Wasp Jobs persist across server restarts, the actual execution context (the Node.js process) will be interrupted if the server restarts. This means that if your Job is in the middle of waiting for a response from your Python service (for example, via an HTTP request), that connection will be severed and the Job will not automatically resume from where it left off. Wasp Jobs are tracked in the database, but the in-process execution is not automatically resumed or retried at the exact point of failure after a restart (Recurring Jobs, Declaring Jobs).

Timeouts: Wasp does not provide built-in timeout handling for Jobs calling external services. You would need to implement your own timeout logic within the Job function (e.g., using a timeout on your HTTP request to the Python service). If the Job fails or times out, you can use the retry features of pg-boss to attempt the Job again, but this will re-run the whole Job, not resume a partial execution.

Server Restarts: If the server restarts during a Job's execution, the Job will be interrupted. When the server comes back up, pg-boss can retry failed Jobs depending on your configuration, but again, this means the Job will start from the beginning, not from where it left off.

Your alternative approach: If you need robust handling of long-running operations that can survive server restarts and require precise control over execution and persistence, your idea of using Celery in the Python service (with its own persistence and retry mechanisms) and having it update the database directly for status is a solid and production-proven approach. This way, the Python service is responsible for the long-running work and its reliability, and your Wasp app simply reads the status from the database.

Summary:
- Wasp Jobs are great for orchestrating and tracking background work, but are limited by the Node.js process lifecycle.
- For truly robust, long-running, and restart-resilient jobs, using a dedicated job system like Celery in your Python service (with direct DB updates for status) is preferable.

The knowledge sources do not mention any built-in Wasp support for resuming in-progress Jobs after a server restart or for advanced timeout management, so your proposed approach is the more reliable solution for your requirements.
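[Editor's note: if the Job does make a blocking HTTP call to the Python service, the do-it-yourself timeout logic mentioned above can be as simple as an abort signal on the request. A sketch; the 30-second value and the endpoint are arbitrary examples.]

// Bound the call so a hung Python service fails the Job promptly;
// pg-boss can then retry the whole Job according to its retry options.
const res = await fetch(`${process.env.PYTHON_SERVICE_URL}/tasks`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ operationId }),
  signal: AbortSignal.timeout(30_000), // aborts the request after 30s
})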
aspiringaiguy (OP) · 4w ago
Alright @kapa.ai, how would you recommend the Python service communicate with the DB? I'm assuming this flow:

Backend op -> triggers Python service long op -> Python op updates DB with status till completion

Would the service do this via API requests to Wasp? So I should expose each DB op I want to do in the service as an API endpoint? Or is there a simpler way to directly communicate with the DB?
kapa.ai · 4w ago
Let me check the knowledge base for you again. One moment! :books:

Based on the knowledge sources, you have two main options for how your Python service can update the database with operation status:

1. Communicate via Wasp API Endpoints

You can expose custom API endpoints in your Wasp app for the Python service to call. This is a recommended approach in the Wasp community, as it allows you to leverage Wasp's authentication, authorization, and Prisma/TypeScript type safety. You would:
- Create an API route in Wasp for each DB operation you want the Python service to perform.
- Have your Python service make HTTP requests to these endpoints to update the status.

This approach is considered safer and more maintainable, as you're not exposing your database credentials and you can control access via API keys or other mechanisms. It also allows you to keep your business logic centralized in your Wasp app. Here's a relevant discussion:
Another idea though: While you can connect directly with external tools to the DB and update the data in Postgres db, there is another approach also: creating an API route with Wasp, and then having those external tools send data to that API route. ... I would advise this approach. ... you are not giving your external tools the credentials for the whole Postgres DB, but only for that one API route -> so limited access. That is also easier to revoke -> if you want, you can change just that API key, not for the whole database. So I would recommend this as an approach!
(Frontend React Template integration into OpenSaas template, Discord)

2. Direct Database Access

Alternatively, your Python service could connect directly to the database (e.g., PostgreSQL) and update the status itself. This is technically possible, especially since Wasp uses standard databases, but it comes with some caveats:
- You must manage database credentials securely.
- You lose the benefits of Wasp's type safety and business logic enforcement.
- Schema changes in Wasp may break your Python service if it's not kept in sync.

This approach is less recommended unless you have a strong reason to bypass the Wasp server.
Summary: The recommended approach is to expose API endpoints in your Wasp app for your Python service to call when updating operation status. This keeps your architecture secure, maintainable, and aligned with Wasp best practices. Direct DB access is possible but less safe and maintainable. If you need an example of how to set up a custom API in Wasp, let me know!
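[Editor's note: a sketch of such a status-update endpoint, assuming an api updateOpStatus declaration in main.wasp with entities: [LongOperation] and a POST httpRoute. The entity, header name, and env var are hypothetical.]

// src/apis/status.ts - sketch only, names are placeholders
import { type UpdateOpStatus } from 'wasp/server/api'

export const updateOpStatus: UpdateOpStatus = async (req, res, context) => {
  // Authenticate the Python service with a shared API key instead of DB credentials.
  if (req.headers['x-api-key'] !== process.env.PYTHON_SERVICE_API_KEY) {
    return res.status(401).json({ error: 'unauthorized' })
  }
  const { operationId, status } = req.body
  await context.entities.LongOperation.update({
    where: { id: operationId },
    data: { status },
  })
  res.json({ ok: true })
}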
aspiringaiguy (OP) · 4w ago
Thanks @kapa.ai so to summarize: Wasp backend triggers long ops by sending a request to our Python service. The Python service handles resource intensive tasks and job persistence via something like Celery. The service updates Wasp's DB via exposed API ops to keep track of the execution status. This also requires that we create a DB entity for long op tracking and management. Our Wasp application would refer to this entity to know job status or trigger downstream ops based on execution status.
kapa.ai · 4w ago
Let me check the knowledge base for you again. One moment! :books:

Your summary is accurate and aligns well with the best practices discussed in the Wasp community and documentation:
- The Wasp backend triggers long-running operations by sending a request to your Python service.
- The Python service handles resource-intensive tasks and job persistence (using something like Celery).
- The Python service updates Wasp's database by making requests to custom API endpoints you expose in your Wasp app. This is the recommended approach, as it keeps your database secure and leverages Wasp's type safety and business logic (Discord: Frontend React Template integration).
- You will need to create a database entity in Wasp to track and manage the status of these long operations. Your Wasp application can then refer to this entity to display job status or trigger downstream operations based on execution status (Docs: Defining an Entity).

This architecture keeps your system maintainable, secure, and robust, while allowing you to handle long-running, resource-intensive tasks outside of the Wasp server process.
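[Editor's note: for the tracking entity, the frontend can read status through an ordinary Wasp query. A sketch with hypothetical names; the LongOperation entity would be declared in the Prisma schema with at least id and status fields, and the query declaration needs entities: [LongOperation].]

// src/operations.ts - sketch only, names are placeholders
import { type GetOperationStatus } from 'wasp/server/operations'

export const getOperationStatus: GetOperationStatus<{ id: string }, { status: string } | null> =
  async ({ id }, context) => {
    const op = await context.entities.LongOperation.findUnique({ where: { id } })
    return op ? { status: op.status } : null
  }

The client can then poll this query to keep the page current.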
martinsos · 4w ago
Hey @aspiringaiguy, just to comment on your initial question as I haven't read the whole convo: yes, this sounds about right! And it makes sense that the backend triggers those, yeah, not the client directly. While you might have the Python service write to Wasp's DB directly via SQL queries, you could also (if you want to utilize Prisma) have the Python service call a custom API route on the Wasp server and pass data to it, and then the Wasp server writes it down to the database using Prisma - just an idea, it is a bit more complex and you need to authenticate the microservice against the Wasp server (although otherwise you need to do the same against the DB, I guess).
aspiringaiguy (OP) · 3w ago
Thanks @martinsos, I was thinking of using an API key to authenticate the Python service against my Wasp server, and any read/write operations I need to do to the DB would be exposed as API endpoints from Wasp. Is that in line with your suggestion?
martinsos · 3w ago
Yeah, that makes sense! Then you have the Wasp server as the center, both regarding DB logic and authentication, and Python just uses it. How are you going to go about implementing the API key - just a secret that only the Wasp server and Python service know?
aspiringaiguy (OP) · 2w ago
Yeah, an env var secret in both servers. Do you have a more secure approach?
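[Editor's note: a shared env-var secret is the standard minimal setup here. One common hardening step on top of it is comparing the presented key in constant time, so response timing doesn't leak information about the secret. A sketch using Node's built-in crypto module; the env var name is the hypothetical one from the earlier endpoint sketch.]

import { timingSafeEqual } from 'node:crypto'

// Constant-time comparison of the presented key against the expected secret.
function isValidApiKey(provided: string | undefined): boolean {
  const expected = process.env.PYTHON_SERVICE_API_KEY ?? ''
  // Length check short-circuits (timingSafeEqual requires equal-length buffers);
  // revealing the key's length alone is generally considered acceptable.
  if (!provided || provided.length !== expected.length) return false
  return timingSafeEqual(Buffer.from(provided), Buffer.from(expected))
}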
