Novu•2y ago

MongoDB : Jobs table - High number of records returned

I noticed that we have 750k records in the jobs table in MongoDB, and the vast majority are marked with a state of 'complete'. We are seeing a high load on our AWS DocumentDB and were wondering whether we could safely delete the completed job records. On the side, why are we seeing around 10k records being read per minute? This happens non-stop overnight, when the system is not seeing much use.

8 Replies

dmulliganOP•2y ago

Just answering this for anyone else who has issues with AWS DocumentDB. I was seeing slow performance and, in this question, wanted to know whether the number of records in the database was a factor. I still need to investigate whether records get cleaned up on their own; I have a feeling they may not, as expireAt is not set for jobs or messages. The actual issue here was missing indexes. There appears to be a compatibility issue when it comes to creating indexes in DocumentDB. Once I identified this and created the missing indexes on the job and message tables, I noticed a huge speed increase.
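
For anyone wanting to check the expireAt point on their own data, here is a minimal mongo shell sketch. It assumes the collections are named jobs and messages and that the expiry field is called expireAt, as described above; the helper name countWithoutExpiry is made up for illustration.

```javascript
// Sketch only: counts documents that have no expireAt field at all.
// `db` is the shell's database handle; collection names are assumptions.
function countWithoutExpiry(db, collectionNames) {
  const counts = {};
  for (const name of collectionNames) {
    counts[name] = db
      .getCollection(name)
      .countDocuments({ expireAt: { $exists: false } });
  }
  return counts;
}
```

In the shell this could be called as countWithoutExpiry(db, ["jobs", "messages"]). A count close to the collection size would suggest that nothing is being expired automatically.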

Zac Clifton•2y ago

What you found out is what I was going to recommend. I would also recommend setting up the expireAt index on the executionDetails and Notifications collections. If you figure out why the indexes are not being created, write it here and I would be happy to put it in the documentation.
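
A rough sketch of what setting up such an expireAt index could look like from the mongo shell. This assumes a standard TTL index on a date field named expireAt; the helper name ensureTtlIndexes is hypothetical, and the collection names should be checked against the actual database, since their casing may differ from how they are written in this thread.

```javascript
// Sketch only: creates a TTL index on expireAt for each named collection.
// `db` is the shell's database handle; names passed in are assumptions.
function ensureTtlIndexes(db, collectionNames) {
  const created = [];
  for (const name of collectionNames) {
    // expireAfterSeconds: 0 means a document becomes eligible for removal
    // once the date stored in its own expireAt field has passed.
    const result = db
      .getCollection(name)
      .createIndex({ expireAt: 1 }, { expireAfterSeconds: 0 });
    created.push({ collection: name, result: result });
  }
  return created;
}
```

In the shell this might be run as ensureTtlIndexes(db, ["executionDetails", "notifications"]). Documents that have no expireAt value, or a non-date value, are not removed by a TTL index, so existing records without the field would still need a separate cleanup.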

todd•2y ago

If it isn't too lazy on my behalf, could you link me to the code that does this? I am maintaining a "self-hosted" MongoDB Atlas (rather than DocumentDB) and trying to get on top of the migrations. Thanks in advance, and thanks @dmulligan for the heads up.

Zac Clifton•2y ago

@Gali Ainouz-Baum Would you be able to point us in the right direction?

Gali Baum•2y ago

@Zac Clifton @todd do you mean this? In the schema? https://github.com/novuhq/novu/blob/next/libs/dal/src/repositories/execution-details/execution-details.schema.ts#L116

Zac Clifton•2y ago

Yes, thank you! @todd here is where we create the index.

dmulliganOP•2y ago

@todd only a few of the indexes were created to get us out of a hole. I will loop back around and check to see what others need to be created.

db.messages.createIndex({"_subscriberId":1})
db.messages.createIndex({"_environmentId":1})
db.jobs.createIndex({"_environmentId":1})
db.jobs.createIndex({"_subscriberId":1})
db.jobs.createIndex({"_organizationId":1})
db.jobs.createIndex({"_parentId":1})

We noticed a speed increase for fetching a non-cached feed for a subscriber, from around 10-15 seconds down to 50-80 ms.

todd•2y ago

Sorry, caught up. Great work all. @Gali Ainouz-Baum @Zac Clifton thanks, I see now where this class of work is done. @dmulligan excellent work; basically, an index reconciliation needs to happen (code to actuals, add and possibly remove).

Would these be fair statements:
* the collections are provisioned as part of a bootstrap process (rather than an out-of-process migration)
* provisioning the collections includes schemas and indexes
* any changes, functional (schema) or performance (index), should be included in the code base (and thus statement one then applies them for all)

What I saw as a recommendation was a hand tweak for performance that would break these statements. The thing I wonder is: why is Novu in production not experiencing what @dmulligan is reporting? 🙂
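
The "code to actuals" reconciliation could be sketched roughly like this. It is not how Novu does it; the function names are made up for illustration, the expected list would have to be copied by hand from the schema files, and the actual list is assumed to be the array of index documents (each with a key field) that getIndexes() returns in the shell.

```javascript
// Turn a key spec such as { _subscriberId: 1 } into a comparable string.
function keySignature(keySpec) {
  return Object.entries(keySpec)
    .map(([field, direction]) => `${field}:${direction}`)
    .join(",");
}

// expected: { collectionName: [keySpec, ...] } taken from the code base
// actual:   { collectionName: [indexDoc, ...] } as returned by getIndexes()
function diffIndexes(expected, actual) {
  const missing = [];
  const extra = [];
  const names = new Set([...Object.keys(expected), ...Object.keys(actual)]);
  for (const name of names) {
    const want = new Set((expected[name] || []).map(keySignature));
    const have = new Set(
      (actual[name] || []).map((indexDoc) => keySignature(indexDoc.key))
    );
    for (const sig of want) {
      if (!have.has(sig)) missing.push(`${name} ${sig}`);
    }
    for (const sig of have) {
      // The default _id index always exists, so it is never reported.
      if (sig !== "_id:1" && !want.has(sig)) extra.push(`${name} ${sig}`);
    }
  }
  return { missing, extra };
}
```

In the shell, something like diffIndexes({ jobs: [{ _environmentId: 1 }] }, { jobs: db.jobs.getIndexes() }) would list what to add under missing and candidates for removal under extra. It only compares key specs, not options such as unique or expireAfterSeconds.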
