DB loads insanely slow
Hello, we’re using WunderGraph + Neon and noticing load times of up to 7 seconds for a simple query. We are on the free plan as we’re testing things out, but is it really supposed to be this slow?

34 Replies
stormy-gold•2y ago
Hey Niles - have you run an EXPLAIN ANALYZE on your query?
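For readers landing here later, a minimal sketch of what that looks like from a Node app (the users table and query are made up; the point is that EXPLAIN ANALYZE reports server-side timings, separate from any network or application overhead):
```ts
// Rough sketch, not from the thread: run EXPLAIN ANALYZE from Node with
// node-postgres ("pg"). The table name is a placeholder. "Execution Time"
// in the output is measured on the server, so it excludes network and
// application overhead.
import { Client } from "pg";

async function explainQuery(): Promise<void> {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  const { rows } = await client.query(
    "EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM users WHERE id = 1"
  );
  // Each row holds one line of the plan in a "QUERY PLAN" column.
  for (const row of rows) console.log(row["QUERY PLAN"]);
  await client.end();
}

explainQuery().catch(console.error);
```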
conscious-sapphire•2y ago
On top of the suggestion by @Mike J, try running the same queries in the SQL editor or via psql vs your application to rule out an application-level bottleneck.
rival-black•2y ago
Hi there, how is everyone doing?
We noticed these delays even when executing some simple queries against the db from pgAdmin.
5 seconds to execute a query against a table with only one row.

rival-black•2y ago
These are the Neon instance details:

passive-yellowOP•2y ago
Thanks @rpintos@spacedev.io -- @ShinyPokemon @Mike J, let us know if you need anything else
rival-black•2y ago
This is another example; it took 6 seconds. Executing it several times, it takes anywhere from 1.5 secs to 6 secs.

conscious-sapphire•2y ago
1. Was the query issued against an idle endpoint? In other words, does this only happen on a cold start, or is it consistently slow even if issued again and again?
2. Does EXPLAIN ANALYZE provide any insight?
3. Are you located far from US-east?
4. Feel free to share your project ID so we can test on our end, and see if this requires deeper investigation.
rival-black•2y ago
1. Yes, seems to be in a cold start. Could we have a minimum set of instances awake to prevent that cold start?
2. Will check
3. We are in South America (Uruguay/Argentina)
4. Sure, https://console.neon.tech/app/projects/tiny-sound-92403171 > tiny-sound-92403171 is the ID, right?
conscious-sapphire•2y ago
Yep, that is the ID! Ok, I think the geography combined with the cold start is the issue. You can disable auto-suspend to avoid cold starts. This will increase your compute cost, so you might want to do it only on your primary (production) branch, and not on dev branches:
https://neon.tech/docs/guides/auto-suspend-guide
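Besides the console UI described in that guide, the same change can presumably be made through the Neon API. The sketch below is an assumption from memory, so the suspend_timeout_seconds field and the meaning of -1 ("never suspend") should be verified against the docs before use; PROJECT_ID, ENDPOINT_ID and NEON_API_KEY are placeholders:
```ts
// Assumed shape of the Neon API call; field names and the -1 value are
// from memory and should be double-checked against the Neon docs.
async function disableAutosuspend(): Promise<void> {
  const { PROJECT_ID, ENDPOINT_ID, NEON_API_KEY } = process.env;
  const res = await fetch(
    `https://console.neon.tech/api/v2/projects/${PROJECT_ID}/endpoints/${ENDPOINT_ID}`,
    {
      method: "PATCH",
      headers: {
        Authorization: `Bearer ${NEON_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ endpoint: { suspend_timeout_seconds: -1 } }),
    }
  );
  console.log(res.status, await res.json());
}

disableAutosuspend().catch(console.error);
```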
rival-black•2y ago
Ok, let me read this and I'll come back later. Thx!
passive-yellowOP•2y ago
@ShinyPokemon -- Any chance you could help a small startup like ours with some credits to bootstrap Neon into our stack? 🙂 We have $5K in Mongo credits, but would really like to use you guys moving forward
conscious-sapphire•2y ago
DM-ing you.
extended-salmon•2y ago
@Niles @rpintos@spacedev.io
Could it be that the slow inserts were experienced around 2024-01-24T05:02:24.60551 UTC and 2024-01-23T16:51:09.573237 UTC? At those timestamps, I can see that your endpoint couldn't start from the pools: instead of the usual 125-250ms cold start, you experienced one cold start of 5857ms and another of 7387ms.
As described in our blog post "Cold starts just got hot" (see https://neon.tech/blog/cold-starts-just-got-hot), to reduce the cold start duration we maintain pools of already started "empty" compute endpoints. When a customer request comes in, we reconfigure one of those already started compute endpoints and hand it to the customer, which lets us fulfil the request in less than 500ms on average. The real startup duration of an empty compute endpoint is actually longer than this and can take up to a few seconds. When those pools are exhausted, the compute endpoint is only started once the request is received, and in that case the cold start duration corresponds to the normal startup duration of the compute endpoint, which is the problem you experienced at the timestamps above.
I will bring this to the attention of our engineering team and request an increase in the pool size, which should resolve the problem immediately.
extended-salmon•2y ago
(I confirm that the relevant pool size was increased 🤗 )
rival-black•2y ago
Hey hey @Yanic. Thx for all the explanation.
Regarding the slow experience, it was happening pretty much every time.
Now I have tried several queries and it has improved a lot, so thx for that 🙌 .
We also think that once we deploy our server in the US, we are gonna get even better throughput.
extended-salmon•2y ago
I'm glad to read that the performance improved!
If you observe a recurrence of this issue, please capture the timestamp and feel free to ping me.
I would be happy to dig into our logs to help you 🙂
passive-yellowOP•2y ago
Thanks @Yanic -- you guys are awesome!
Hey @Yanic @ShinyPokemon -- we upgraded to the "Pro" plan and are still observing extremely slow loading times for basic queries and mutations
The Project ID is flat-term-59420511
passive-yellowOP•2y ago

passive-yellowOP•2y ago
We are seeing times of 10-11 seconds for basic queries
passive-yellowOP•2y ago
@Yanic @ShinyPokemon -- we maxed out the CPU to 7 and deactivated sleep mode and it still takes 2 seconds for basic queries


passive-yellowOP•2y ago
5 seconds for a basic mutation and 3 seconds for a basic query

passive-yellowOP•2y ago
These times are not feasible for us and I'm afraid we have to look at other options if this is Neon's maxed out performance
conscious-sapphire•2y ago
1. It looks like you changed the “default” compute size for new computes. Make sure you change it for your existing main branch compute by following the “Edit a compute endpoint” guide here https://neon.tech/docs/manage/endpoints
2. Looks like you’re showing an API response time screenshot. How long does the query take to execute in the SQL Editor in the Neon Console? That will give you a more realistic indication of whether the issue is with your backend, or the database.
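One rough way to act on point 2 outside the SQL Editor: time a trivial query straight against Postgres from wherever the backend runs, so database latency can be separated from everything the API layer adds. A sketch, with the connection string as a placeholder:
```ts
// Rough sketch: time a trivial query directly against the database with
// node-postgres, bypassing the API layer entirely. The first query warms
// the connection; the loop then measures steady-state round trips.
import { Client } from "pg";

async function timeRoundTrips(): Promise<void> {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  await client.query("SELECT 1"); // warm-up
  for (let i = 1; i <= 5; i++) {
    const start = performance.now();
    await client.query("SELECT 1");
    console.log(`round trip ${i}: ${(performance.now() - start).toFixed(1)} ms`);
  }
  await client.end();
}

timeRoundTrips().catch(console.error);
```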
extended-salmon•2y ago

extended-salmon•2y ago
In the past, when you faced the cold start issue, those numbers reached up to 7 seconds in comparison.
So, at the moment, you're not facing any cold start issues.
Two questions:
1) are you using prisma?
2) are you using a pooled connection?
If the answer to both questions is yes, do you by any chance have "pgbouncer=true" in your connection string? (See the sketch just below.)
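For reference, roughly what those connection strings tend to look like; host, credentials and database name are placeholders, and the exact parameters should be checked against the Neon and Prisma docs:
```ts
// Placeholder values throughout -- only the shape matters here.
// The "-pooler" host is Neon's pooled (PgBouncer) endpoint, and
// pgbouncer=true tells Prisma to skip prepared statements, which
// PgBouncer in transaction mode does not support.
const DIRECT_URL =
  "postgresql://user:password@ep-example-123456.us-east-2.aws.neon.tech/neondb?sslmode=require";
const POOLED_URL =
  "postgresql://user:password@ep-example-123456-pooler.us-east-2.aws.neon.tech/neondb?sslmode=require&pgbouncer=true&connect_timeout=15";

// With Prisma, the pooled URL typically goes in `url` and the direct URL
// in `directUrl` (used for migrations).
export { DIRECT_URL, POOLED_URL };
```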
Cold starts are irrelevant for the project 'flat-term-59420511', since auto-suspend is disabled.
That being said, I believe that you massively overprovisioned your endpoint,
which will eventually translate into unnecessary costs for you.
Till ~2PM UTC, your endpoint was configured with 1/4 CU (0.25 vCPU and 1GB of mem)
Here are the utilisation charts for this EP when running 1/4CU
extended-salmon•2y ago


extended-salmon•2y ago
at 2PM UTC, you bumped up the config to 7 CU for this endpoint (7vCPU and 28GB of mem)
extended-salmon•2y ago

extended-salmon•2y ago
It seems that you slightly increased your workload as well, with mem consumption peaking at nearly double the previous level.
But even so, you're only using a tiny fraction of the computing resources available;
the contention is clearly not on the CPU or memory side.
extended-salmon•2y ago


extended-salmon•2y ago
The cache hit ratio is constantly at 100%, meaning that the observed slowness isn't caused by disk access, nor by network operations to retrieve the data from the pageserver or cold storage.
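For anyone wanting to sanity-check a figure like this from the client side, one common approximation is the shared-buffer hit ratio from pg_stat_database (note this is plain Postgres accounting and may not be exactly the metric Neon's internal charts use). A sketch:
```ts
// Rough sketch: approximate the buffer cache hit ratio for the current
// database from pg_stat_database (blks_hit = served from shared buffers,
// blks_read = blocks that had to be read in).
import { Client } from "pg";

async function cacheHitRatio(): Promise<void> {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  const { rows } = await client.query(`
    SELECT blks_hit,
           blks_read,
           round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 4) AS hit_ratio
    FROM pg_stat_database
    WHERE datname = current_database()
  `);
  console.log(rows[0]);
  await client.end();
}

cacheHitRatio().catch(console.error);
```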
extended-salmon•2y ago
A (very) quick parsing of the logs for this endpoint also shows that this endpoint is massively over-provisioned for the workload running:

extended-salmon•2y ago
Honestly, I would suggest that you reduce the number of CU allocated to your endpoint at the earliest.
The problem experienced is clearly not caused by either CPU or mem contention.
I'm happy that you're spending money on our services, but at the moment I feel you are wasting money and resources, which is not what I would call a good customer experience.
Can you please ping me in DM:
1) your connection string
2) a precise timestamp at which the slow query was experienced
3) if possible, a debug log, or any kind of applicative logs at your disposal
I will raise a support case on your behalf and dig into our logs to clarify where the problem comes from.
But to be clear: the behaviour reported is NOT normal and we absolutely can do (much) better than this!
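To make items 2 and 3 easy to capture, a hypothetical per-query logging wrapper like the one below (names are made up) records a UTC timestamp and a duration for every statement, which is exactly what's needed to look things up in the logs:
```ts
// Hypothetical helper: wrap queries so each call logs a UTC timestamp and
// its duration, giving support an exact point in time to search the logs for.
// Call client.connect() once at application startup before using loggedQuery.
import { Client } from "pg";

const client = new Client({ connectionString: process.env.DATABASE_URL });

export async function loggedQuery(sql: string, params: any[] = []) {
  const startedAt = new Date().toISOString(); // UTC timestamp to report
  const t0 = performance.now();
  try {
    return await client.query(sql, params);
  } finally {
    const ms = (performance.now() - t0).toFixed(1);
    console.log(`[db] ${startedAt} ${ms}ms ${sql}`);
  }
}
```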
conscious-sapphire•2y ago
You’re a star Yanic!
I caught up with the guys at SpaceDev and it looks like Neon’s returning responses in about 2ms. Their database and application regions are different, so that’s adding overhead. They’re going to change region and dig into the application layer some more.