Best container for ~1 million objects

What would be the best container choice for ~1 million objects? The objects would be in order, but I would need the ability to jump to any random object quickly (which makes me think dictionary). At some point I'd also need to add filtering/search based on the properties of these objects. Performance is the main concern. The idea is that the user can view the details (properties) of one object at a time and use a prev/next/jump-to control to move between them. Eventually they will also want to export the objects after a filter is applied.
Jimmacle 4mo ago
dictionaries aren't ordered; if your entries are sortable you'd want to use a sorted list
did you already rule out using a database? they're designed specifically for storing, querying, and ordering information efficiently
WEIRD FLEX 4mo ago
there is also SortedDictionary, though it's not that different from SortedList
Jimmacle 4mo ago
depending on how performant you want the filtering/search to be, you'll basically be reinventing a database table with multiple indexes, e.g. multiple sorted lists, each sorted on a different property of the objects
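To make the "multiple sorted lists as indexes" idea concrete, here is a minimal sketch in Python (chosen for brevity; the thread itself is about .NET containers). The `Record` shape, its field names, and the `RecordStore` class are all hypothetical. It keeps one list ordered by the unique number plus a second list ordered by date, and uses binary search on each.

```python
import bisect
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:  # hypothetical object shape, not from the thread
    number: int    # unique number used for "jump to"
    created: date  # date property used for range filtering
    payload: str


class RecordStore:
    def __init__(self, records):
        # Primary order: records sorted by their unique number.
        self.by_number = sorted(records, key=lambda r: r.number)
        self.numbers = [r.number for r in self.by_number]
        # Secondary index: positions into by_number, sorted by date.
        order = sorted(range(len(self.by_number)),
                       key=lambda i: self.by_number[i].created)
        self.date_keys = [self.by_number[i].created for i in order]
        self.date_positions = order

    def position_of(self, number):
        """Binary search for a record's position; None if absent."""
        i = bisect.bisect_left(self.numbers, number)
        if i < len(self.numbers) and self.numbers[i] == number:
            return i
        return None

    def between(self, start, end):
        """Records with start <= created <= end, in number order."""
        lo = bisect.bisect_left(self.date_keys, start)
        hi = bisect.bisect_right(self.date_keys, end)
        return [self.by_number[p] for p in sorted(self.date_positions[lo:hi])]


store = RecordStore([
    Record(30, date(2024, 3, 1), "c"),
    Record(10, date(2024, 1, 1), "a"),
    Record(20, date(2024, 2, 1), "b"),
])
```

Every additional filterable property needs another index like `date_keys`, and each one has to be kept in sync on insert and delete, which is the bookkeeping a database does for you.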
WEIRD FLEX 4mo ago
iirc SortedDictionary is faster at inserting and removing items (O(log n) vs O(n) for SortedList), while SortedList uses less memory and lets you read entries by index
Nacho Man Randy Cabbage
hmm, yeah database is probably out. might be able to get away with a sqlite db
i figured a dictionary would be good for the random access though. trying to get away without doing any paging: the user could want the first object, then the last, then the middle, or anywhere in between, and it has to be instant. each object will essentially have a primary key that I was gonna use as the dictionary key
Keswiik 4mo ago
It's always possible to maintain an in-memory cache to access specific objects, but what is the use case here? What objects are you trying to sort and access? Why do you need instant access to random objects and the ability to sort and filter?
Nacho Man Randy Cabbage
use case is to let the users see the data one object at a time. they are trained to look for discrepancies from one object to the next. sometimes they know where to look, but sometimes they spend all day going through one at a time or jumping back etc. sometimes they know a specific range to look in and want to filter it that way. and the discrepancies aren't something that can be coded for, because they're completely different per list of objects
Keswiik 4mo ago
And are all of these objects loaded locally on a user's machine? How are they gaining access?
DaVinki 4mo ago
Would a dictionary + a linked list work? You wouldn’t be able to jump to arbitrary elements by index, but you could by key, and the list can be kept sorted
Nacho Man Randy Cabbage
a huge binary file
Jimmacle 4mo ago
you could load it into an in-memory sqlite database
ultimately you're going to be recreating the functionality of a database anyway
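As a rough illustration of the in-memory SQLite suggestion, the sketch below uses Python's standard `sqlite3` module. The table name, column names, and the `build_db` helper are assumptions made for the example, and the generated rows stand in for whatever would actually be parsed out of the binary file.

```python
import sqlite3


def build_db(rows):
    """rows: iterable of (number, created_iso, payload) tuples.

    Hypothetical helper; in practice the tuples would come from
    parsing the binary file.
    """
    conn = sqlite3.connect(":memory:")  # database lives only in RAM
    conn.execute(
        "CREATE TABLE objects ("
        " number INTEGER PRIMARY KEY,"  # unique number doubles as the key
        " created TEXT NOT NULL,"       # ISO-8601 dates sort correctly as text
        " payload BLOB)"
    )
    with conn:  # one transaction for the whole bulk insert
        conn.executemany("INSERT INTO objects VALUES (?, ?, ?)", rows)
    # Build the secondary index after the bulk load.
    conn.execute("CREATE INDEX idx_objects_created ON objects(created)")
    return conn


# Placeholder data standing in for the parsed file contents.
conn = build_db(
    (n, f"2024-01-{(n % 28) + 1:02d}", b"data") for n in range(1, 1001)
)
count = conn.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
row = conn.execute(
    "SELECT number, created FROM objects WHERE number = ?", (500,)
).fetchone()
```

Lookups by `number` go through the primary key, so a jump to an arbitrary object does not scan the table.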
Nacho Man Randy Cabbage
ugh that's a good point
Keswiik 4mo ago
How long do these users spend looking at each object? Because it sounds like you don't really need instant access, and a user isn't going to notice the time it takes to fetch data from either a sqlite database or from some web-based service.
Nacho Man Randy Cabbage
depends, they can spend weeks on one file. it's a very specific use case that I can't really go into detail on lol.
Keswiik 4mo ago
I don't need the specifics, I am mostly asking to get a feel as to how quickly users access these objects and what kind of speed is needed.
Nacho Man Randy Cabbage
if they are going one by one in order (or backwards), they don't want to wait. it's essentially a details screen with a next/prev button. then there's also a textbox where they can enter a number to jump to, since each of these objects has a unique number. now, sometimes they have a specific range of objects they know they want to focus on (each object has a date property they like to filter on), so they want to filter down to the objects between two dates first and then do the one-by-one look. and a final piece of functionality is exporting all or some of the objects to some output after filtering, but performance doesn't matter much there.
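The navigation described above (next/prev, jump to a number, optional date range) maps onto a few small queries. This is only a sketch using Python's `sqlite3`; the `objects` table, its columns, and the `step` and `jump_to` helpers are hypothetical names, and the five rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (number INTEGER PRIMARY KEY, created TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_created ON objects(created)")
conn.executemany(
    "INSERT INTO objects VALUES (?, ?)",
    [(1, "2024-01-05"), (2, "2024-02-10"), (3, "2024-02-20"),
     (4, "2024-03-15"), (5, "2024-02-25")],
)


def step(conn, current, forward=True, start="0000-01-01", end="9999-12-31"):
    """Number of the next (or previous) object inside the date filter."""
    # op and order are fixed internal strings, never user input.
    op, order = (">", "ASC") if forward else ("<", "DESC")
    row = conn.execute(
        f"SELECT number FROM objects "
        f"WHERE number {op} ? AND created BETWEEN ? AND ? "
        f"ORDER BY number {order} LIMIT 1",
        (current, start, end),
    ).fetchone()
    return row[0] if row else None


def jump_to(conn, number):
    """Fetch one object by its unique number, or None if it is missing."""
    return conn.execute(
        "SELECT number, created FROM objects WHERE number = ?", (number,)
    ).fetchone()
```

With the sample rows, `step(conn, 3, start="2024-02-01", end="2024-02-29")` skips object 4 (dated in March) and lands on object 5, so the same prev/next buttons work with or without a filter.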
Keswiik 4mo ago
I'd go with jim's suggestion of sqlite then, it should be more than enough to handle that kind of workflow. It would be simple to implement batched loading of objects to maintain a set of them in memory at any given time, plus all of the sorting and exporting that you mention.
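One possible shape for the batched loading and export mentioned here, again sketched with Python's `sqlite3`: keep a small window of rows around the current object in memory, and stream rows straight from a query when exporting. The `load_window` and `export_csv` helpers, the `radius` parameter, the CSV output format, and the sample data are all assumptions for illustration.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (number INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO objects VALUES (?, ?)",
    [(n, f"item-{n}") for n in range(1, 101)],
)


def load_window(conn, center, radius=10):
    """Up to `radius` rows before `center`, then `center` and `radius` after."""
    before = conn.execute(
        "SELECT number, payload FROM objects WHERE number < ? "
        "ORDER BY number DESC LIMIT ?",
        (center, radius),
    ).fetchall()
    after = conn.execute(
        "SELECT number, payload FROM objects WHERE number >= ? "
        "ORDER BY number ASC LIMIT ?",
        (center, radius + 1),
    ).fetchall()
    return before[::-1] + after  # restore ascending order


def export_csv(conn, out, low, high):
    """Stream rows in [low, high] to a file-like object; returns the row count."""
    writer = csv.writer(out)
    writer.writerow(["number", "payload"])
    written = 0
    for row in conn.execute(
        "SELECT number, payload FROM objects "
        "WHERE number BETWEEN ? AND ? ORDER BY number",
        (low, high),
    ):
        writer.writerow(row)
        written += 1
    return written


window = load_window(conn, 50, radius=5)
```

The UI would serve next/prev from `window` and only call `load_window` again when the user nears an edge or jumps elsewhere, so most clicks never touch the database.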
Nacho Man Randy Cabbage
I will try it out. Thanks all!