MongoDB & mongoose | How to automatically remove element from array after certain amount of time?

I'm creating an Instagram clone and am currently working on stories. I need them to be visible to followers for only 24h, so I was thinking of giving each user an activeStories field, which would be an array of the IDs of stories that user posted within the last 24h. But I'm not sure how I would remove them after 24h. Is there a built-in way to handle this, or am I going to have to handle it myself?
11 Replies
Tenkes (3mo ago)
As far as I can see, that's only for deleting a whole document after a given time. So it would delete the whole story, and I just want to remove it from the user's activeStories field, because I'm still going to use that story after 24h (letting the user see all of their stories in a history, adding stories to highlights, etc.). However, maybe I could make stories disappear after 24h but save them in a separate collection for longer use? Something like "storyHistory" or "allStories", and just give each document an authorId, which would be the author of those stories? So I'd have something like:
{
  authorId: '...',
  stories: [
    '...',
    '...',
    '...'
  ]
}
And for highlights I was already planning on making a separate collection. I'm not sure how effective this would be, or whether there's a better way. Alternatively, I could keep stories in the stories collection but make a separate collection for stories posted within the last 24h. So I'd have a collection named something like activeStories, where I'd save a story as soon as it's created and have it deleted automatically after 24h with TTL?
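A minimal sketch of how the TTL idea in this message could be expressed. The collection name `activeStories` comes from the message; the `createdAt` field and the helper names are assumptions made for illustration. MongoDB's TTL indexes are single-field indexes on a date field with an `expireAfterSeconds` option, and the background task that removes expired documents runs periodically, so deletion happens shortly after the 24h mark rather than exactly at it.

```javascript
// Assumed names: `createdAt` field, `activeStoryTtlIndex`, `buildActiveStory`.
const DAY_IN_SECONDS = 24 * 60 * 60;

// Keys and options for a TTL index: documents become eligible for
// deletion once `createdAt` is more than 24h in the past.
const activeStoryTtlIndex = {
  keys: { createdAt: 1 },
  options: { expireAfterSeconds: DAY_IN_SECONDS },
};

// Builds the short-lived document that points at the permanent story.
// `createdAt` must be a real Date for the TTL index to act on it.
function buildActiveStory(authorId, storyId, now = new Date()) {
  return { authorId, storyId, createdAt: now };
}

// With the official MongoDB Node.js driver, the index would be created
// roughly like this (not executed here, it needs a live connection):
//   await db.collection('activeStories')
//     .createIndex(activeStoryTtlIndex.keys, activeStoryTtlIndex.options);

console.log(activeStoryTtlIndex.options.expireAfterSeconds); // 86400
```

The permanent document in the stories collection is untouched by this; only the pointer document in activeStories expires.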
glutonium (3mo ago)
hmm.. it does sound like a viable option to me. u can try it, and if u stumble upon a better solution u can update later on?
ErickO (3mo ago)
One small detail here: that deletes stories in the Mongo DB, which are, as you said, IDs of stories. But what happens to the actual stories? Are they kept in storage, or what?
There are 2 approaches I can think of that I would use in a real app.
If the stories are to be deleted from storage, I would set up a CRON job that automatically deletes all stories older than 24h AND the references to them in the database.
The 2nd approach is based on what I see Instagram does, which is:
- The story is no longer available to be viewed after 24h
- The story can be retrieved and saved
Meaning that you can't see the stories but they AREN'T deleted; the author could still see them and upload them again after some time. In that case, since we're not deleting anything from storage, all you really need to do is set the date a story was published. Then, when you're retrieving someone's stories, you filter out all stories older than 24h.
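Both approaches in this message come down to comparing a story's publish date against a 24h cutoff. A minimal sketch of the two query filters, assuming each story document stores its author in `authorId` and its publish date in a `createdAt` field (the field name `createdAt` and the function names are illustrative, not from the thread):

```javascript
const DAY_IN_MS = 24 * 60 * 60 * 1000;

// Filter for the second approach: one author's stories from the last 24h.
function activeStoriesFilter(authorId, now = new Date()) {
  const cutoff = new Date(now.getTime() - DAY_IN_MS);
  return { authorId, createdAt: { $gte: cutoff } };
}

// Filter for the first approach: everything a cleanup job would delete.
function expiredStoriesFilter(now = new Date()) {
  const cutoff = new Date(now.getTime() - DAY_IN_MS);
  return { createdAt: { $lt: cutoff } };
}

// With a Mongoose model (here a hypothetical `Story`), these would be
// passed straight to the query methods, for example:
//   const stories = await Story.find(activeStoriesFilter(userId));
//   await Story.deleteMany(expiredStoriesFilter());

const now = new Date('2024-01-02T00:00:00Z');
console.log(activeStoriesFilter('abc', now).createdAt.$gte.toISOString());
// 2024-01-01T00:00:00.000Z
```

The filtering itself runs inside the database, so nothing has to be removed from any array when a story turns 24h old.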
Tenkes (3mo ago)
The plan was to have two collections:
1. activeStories - fields would be something like authorId and storyId, and documents would be automatically deleted after 24h.
2. stories - which would never be automatically deleted.
So when a user creates a story, it would also create an activeStory. When I need stories posted within the last 24h I'd get them from the activeStories collection; otherwise I'd use stories.
However, that was the alternative option. My main plan was to have an activeStories field on each user, but I'm not sure how I would automatically remove expired stories from that field.
I was thinking of filtering by date as well, but I believe it would be more efficient to have all active stories in one place, so I don't have to go through ALL stories and filter the ones posted by one author within the last 24h. Instead I'd just go through User.activeStories.
ErickO (3mo ago)
Generally, I would not recommend doing things like that. When you have fields that are updated after certain conditions are met, it is called a "cascade". The problem with cascades is that they get ugly pretty fast. Imagine 10 stories are uploaded at 10:30 and 24h later 10 stories are deleted. No big deal, right? But now imagine a million stories were uploaded at the same time: in 24h a million updates will be made to your database. Not so good now. Of course, at your scale you needn't worry about this, but it's something to keep in mind.
Now, remember that databases, even ones like Mongo, have INDEXES, which make lookups much faster than what you could do with JavaScript's array.filter() or anything like that. YOU aren't going to filter anything, the database will.
Tenkes (3mo ago)
Yeah... I hadn't thought of that, haha. I don't go by the current scale of my apps: I try to make them as efficient as possible, and I'm always thinking about how this would work if I had millions of users/stories, since I'm not doing it JUST for fun and someone else is going to look at my code as well (employers etc.). Which is why I'm wondering whether your way would still be a good approach if the app had a lot of users and stories? I don't know much about databases, so I'm not sure how indexes work. Isn't it going to loop through all the documents, or?
ErickO (3mo ago)
Cascades have their uses; it's a bit of a complex subject tbh, but the main issue here is the added complexity when really you can just filter old stories out.
Databases index columns (or in MongoDB, "fields") with a data structure called a B-tree. B-trees are SUPER efficient at lookups: depending on your settings and whatnot, you could find a specific piece of data among 10 million in like 5 lookups.
It is not a loop. A loop would make lookups O(n), which is linear time, but B-trees take O(log(n)) time.
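To make the O(n) versus O(log(n)) difference concrete, here is a rough back-of-the-envelope sketch. It uses a simplified model in which every tree node splits the remaining keys into `keysPerNode` groups; the value 100 is an arbitrary assumption for illustration, not a figure taken from MongoDB.

```javascript
// Counts how many levels a tree needs so that `keysPerNode` branches per
// level are enough to tell `totalKeys` entries apart. One lookup step is
// spent per level, so this approximates the steps of an indexed lookup.
function lookupsNeeded(totalKeys, keysPerNode) {
  let levels = 0;
  let reachable = 1;
  while (reachable < totalKeys) {
    reachable *= keysPerNode;
    levels += 1;
  }
  return levels;
}

const totalStories = 10_000_000;

console.log(lookupsNeeded(totalStories, 100)); // 4  (wide, B-tree-like nodes)
console.log(lookupsNeeded(totalStories, 2));   // 24 (plain binary search)
console.log(totalStories);                     // worst case for a linear loop
```

Under these assumptions the count lands in the same ballpark as the "like 5 lookups" estimate above, while a loop over every document would touch up to all 10 million.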
Tenkes (3mo ago)
I see, I'm gonna look more into that. Thank you for helping :)
ErickO (3mo ago)
B-tree
In this tutorial, you will learn what a B-tree is. Also, you will find working examples of search operation on a B-tree in C, C++, Java and Python.
ErickO (3mo ago)
Anyway, TL;DR: indexes are fast, and you are not going to be slowed down by the method I talked about. Of course, you then have to read about how to add indexes to Mongo 👍 (it's not hard).
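As a starting point for that reading, a hedged sketch of what an index for the "filter by author and date" query might look like. The field names `authorId` and `createdAt` and the `Story`/`storySchema` names are assumptions for illustration; `Schema.prototype.index()` is the Mongoose call for declaring an index on a schema.

```javascript
// Compound index keys: group entries by author, newest stories first.
// 1 means ascending order, -1 means descending order.
const storyIndexKeys = { authorId: 1, createdAt: -1 };

// With Mongoose, the index would be declared on the schema before the
// model is compiled (not executed here, it needs the mongoose package):
//   storySchema.index(storyIndexKeys);
//   const Story = mongoose.model('Story', storySchema);
//
// Whether a query actually uses the index can be checked with explain(),
// looking for an IXSCAN stage rather than COLLSCAN in the winning plan:
//   await Story.find({ authorId, createdAt: { $gte: cutoff } })
//     .explain('executionStats');

console.log(Object.keys(storyIndexKeys)); // [ 'authorId', 'createdAt' ]
```

Putting `authorId` first lets the database jump to one author's entries and then read only the recent ones, which matches the query shape discussed in this thread.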