Problem with Redis after migrations (ECONNRESET)

Hi! I have a service with Medusa.js (Node), Redis and Postgres which I've been running since September. I've had no problems with this service before, but since I migrated to the new databases the service sometimes stops working (not immediately, but after a while, seemingly at random times). It doesn't crash, but the API just times out or returns 404s. I've found the logs that I think point to the culprit, but I haven't found a solution yet. Could it have something to do with Redis timing out or hitting memory limits? That's just a guess on my part so far. Here are the logs:
Error: read ECONNRESET
    at TCP.onStreamRead (node:internal/stream_base_commons:217:20) {
  errno: -104,
  code: 'ECONNRESET',
  syscall: 'read'
}
AbortError: Ready check failed: Redis connection lost and command aborted. It might have been processed.
    at RedisClient.flush_and_error (/app/node_modules/redis/index.js:298:23)
    at RedisClient.connection_gone (/app/node_modules/redis/index.js:603:14)
    at Socket.<anonymous> (/app/node_modules/redis/index.js:227:14)
    at Object.onceWrapper (node:events:632:26)
    at Socket.emit (node:events:517:28)
    at TCP.<anonymous> (node:net:350:12) {
  code: 'UNCERTAIN_STATE',
  command: 'INFO'
}
[ioredis] Unhandled error event: Error: read ECONNRESET
Error: read ECONNRESET
    at TCP.onStreamRead (node:internal/stream_base_commons:217:20) {
  errno: -104,
  code: 'ECONNRESET',
  syscall: 'read'
}
AbortError: Ready check failed: Redis connection lost and command aborted. It might have been processed.
    at RedisClient.flush_and_error (/app/node_modules/redis/index.js:298:23)
    at RedisClient.connection_gone (/app/node_modules/redis/index.js:603:14)
    at Socket.<anonymous> (/app/node_modules/redis/index.js:227:14)
    at Object.onceWrapper (node:events:632:26)
    at Socket.emit (node:events:517:28)
    at TCP.<anonymous> (node:net:350:12) {
  code: 'UNCERTAIN_STATE',
  command: 'INFO'
}
[ioredis] Unhandled error event: Error: read ECONNRESET
32 Replies
Percy (5mo ago)
Project ID: 9b3bd973-ba5c-4ef3-9f43-32f499f7ba19
Rasmus Lian (5mo ago)
9b3bd973-ba5c-4ef3-9f43-32f499f7ba19
codico (4mo ago)
FYI, I've started experiencing this recently as well, with a PG instance; I didn't use to have these issues before
Rasmus Lian (4mo ago)
Hm, really strange. It's really messing our product up at the moment
codico (4mo ago)
I'm not seeing it on both of my services, so it could be an app-layer issue, but it didn't use to happen 🤷‍♂️
Brody (4mo ago)
oftentimes this happens when you aren't closing connections and the idle timeout is reached
codico (4mo ago)
On my end the websockets are maybe suspicious; I'll take a look at some websocket settings
Brody (4mo ago)
for postgres pooled clients specifically, this is solved by setting the pool minimum to 0 so that all connections are released and marked as closed
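A minimal sketch of that suggestion, assuming a knex/tarn-style pool (neither project in this thread has confirmed using knex, so the option names and values are illustrative):

```js
// Hypothetical knex setup: tarn (knex's pool) accepts min/max directly.
const knex = require("knex")({
  client: "pg",
  connection: process.env.DATABASE_URL,
  pool: {
    min: 0,                   // keep no idle connections open, per the suggestion above
    max: 10,                  // cap on concurrent connections
    idleTimeoutMillis: 30000, // destroy a pooled connection after 30s of sitting idle
  },
});
```

With min: 0 the pool tears connections down once they go idle, so the database's (or proxy's) idle timeout never gets a chance to reset them while the app still thinks they're usable.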
Rasmus Lian (4mo ago)
Where do I set that?
Brody (4mo ago)
your issue looks to be with redis; either way, you would need to reference the documentation for your database client
Rasmus Lian (4mo ago)
Ah true. Is there a similar pool minimum setting for Redis?
Brody (4mo ago)
not sure, you would need to reference the documentation for your database client
Rasmus Lian (4mo ago)
So in my case that would be Medusa, you mean?
Brody (4mo ago)
medusa is not a database client, the redis npm package is
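As a reference point, the "[ioredis] Unhandled error event" lines in the original logs come from an ioredis instance that has no 'error' listener attached. A minimal sketch of handling and retrying a dropped socket, assuming ioredis and a single connection URL:

```js
const Redis = require("ioredis");

const redis = new Redis(process.env.REDIS_URL, {
  // reconnect with a small backoff instead of giving up after a reset socket
  retryStrategy: (times) => Math.min(times * 200, 2000),
  // enable TCP keepalive probes so an idle connection isn't silently dropped
  keepAlive: 10000,
});

// without this listener, a read ECONNRESET surfaces as "Unhandled error event"
redis.on("error", (err) => {
  console.error("Redis connection error:", err.message);
});
```

The AbortError with UNCERTAIN_STATE in the logs comes from the older node-redis client (the redis npm package), which exposes a similar retry_strategy option.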
Rasmus Lian (4mo ago)
Okay, will check there
codico (4mo ago)
Thank you for the help, Brody! If the connections to the DB are idle, does that mean there are no DB operations? I should have constant traffic 🤔 maybe I have some other underlying issue, or I'm misunderstanding. Nonetheless, trying min: 0 and hoping for the best! 😄
Brody (4mo ago)
what tech stack are you using?
codico (4mo ago)
nestjs with socketio, and typeorm to connect to PG. I'm suspicious of socketio as well
Brody (4mo ago)
does typeorm have a pool.min setting?
codico (4mo ago)
They have a poolSize, which is the max, but they also accept extra and pass it on to the underlying driver. So that should accept min, I believe
Brody (4mo ago)
what's the underlying driver in use?
codico (4mo ago)
pg. And as you say that, I'm realizing that maybe it doesn't have a min setting and I need to use idleTimeoutMillis or allowExitOnIdle instead
Brody (4mo ago)
allowExitOnIdle seems like what we want
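Passed through TypeORM's extra, that would look roughly like the sketch below; the keys under extra are pg Pool options, and the values are just placeholders:

```js
const { DataSource } = require("typeorm");

// Hypothetical NestJS/TypeORM data source; everything under `extra`
// is handed straight to the underlying pg Pool constructor.
const dataSource = new DataSource({
  type: "postgres",
  url: process.env.DATABASE_URL,
  extra: {
    max: 10,                  // pg Pool: upper bound on pooled clients
    idleTimeoutMillis: 30000, // pg Pool: disconnect a client after 30s idle in the pool
    allowExitOnIdle: true,    // pg Pool: don't let idle pooled clients keep the event loop alive
  },
});
```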
codico (4mo ago)
Thank you for the help, I'll try it out!
Brody (4mo ago)
let me know how that goes!
meng_socal (4mo ago)
the problem here is the proxy and the TCP protocol
Brody (4mo ago)
sorry, but the issue here does not lie with railway; they have internal monitoring for these kinds of things and nothing has been reported. these errors are due to how the client is handling connections
meng_socal (4mo ago)
What do you know!
latrapo (3mo ago)
@Rasmus Lian I have the same issue with medusa; it only started after we migrated. Have you found a solution?
Rasmus Lian (3mo ago)
@latrapo Yes, I think I solved it by upgrading Medusa and the Redis services (the cache service and notification provider) and making sure the Redis config is correct. I think that was it. I'm afk atm so can't give you more right now
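For anyone landing here later: in Medusa v1, "making sure the Redis config is correct" usually amounts to pointing both the project config and the Redis-backed modules at the same connection URL. A rough sketch of medusa-config.js; the exact module names and keys depend on your Medusa version, so treat this as a starting point rather than Rasmus's exact setup:

```js
// medusa-config.js (sketch): assumes @medusajs/event-bus-redis and
// @medusajs/cache-redis are installed and REDIS_URL points at the new instance
const REDIS_URL = process.env.REDIS_URL;

module.exports = {
  projectConfig: {
    redis_url: REDIS_URL,
    database_url: process.env.DATABASE_URL,
    database_type: "postgres",
  },
  modules: {
    eventBus: {
      resolve: "@medusajs/event-bus-redis",
      options: { redisUrl: REDIS_URL },
    },
    cacheService: {
      resolve: "@medusajs/cache-redis",
      options: { redisUrl: REDIS_URL },
    },
  },
};
```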
latrapo (3mo ago)
Thanks! Will try doing that in the meantime 🙏
Brody (3mo ago)
make sure you aren't keeping any idle connections around, and be sure to close a connection when you're done with it
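A minimal sketch of that last point, assuming ioredis and a graceful-shutdown hook (the handler names are illustrative):

```js
const Redis = require("ioredis");
const redis = new Redis(process.env.REDIS_URL);

async function shutdown() {
  // quit() sends QUIT and waits for pending replies before closing the socket,
  // so no idle connection is left behind for the server or proxy to reset later
  await redis.quit();
  process.exit(0);
}

process.on("SIGTERM", shutdown);
process.on("SIGINT", shutdown);
```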