Call API with large amount of data without lagging user
I have an API that can return about 200,000 words at once. I want to store them on a Node.js server so the user can fetch some of these words without any lag
Does anyone know how to do it
Context:
I am making a Dictionary API and I want it to return at least 5 words similar to what the user is typing, while they are typing.
I thought about using that API which returns 200k English words, and then making requests to it to do that task. I just don't know how to make it return 200k words without slowing down
I thought about hosting the server somewhere so it already has a constant where the words are stored, or even storing the data in a database and then having Node.js do the necessary work
Idk how to approach it
you don't. If you have 200k potential results, you paginate (so only fetch the first 20-100 results). Something like 99.7% of google searches never go past the first page, so it's pointless to send the user 200k words
But it needs to fetch at least 5 words while the user is typing
5 and 200,000 are not the same
Like this but less
why would you need to fetch 200000 records to show 5?
most APIs won't even give you that many in one go
not in my experience at least
There is one that does
you can't fetch, then load, and then filter 200,000 records without lagging out the user
it's a megabyte's worth of transfer, and then reading that into memory and performing operations on it, there's going to be a hitch at the very least
I still don't know why you want to fetch 200000 records to begin with
I want to store those 200,000 words somewhere so I don't need to fetch them on each request to the server
that may be against the TOS of the api
https://random-word-api.herokuapp.com/all
It just retrieves this
okay, so what are you trying to build?
Why did the error occur?
no idea, it doesn't work on chrome or firefox
This is the site of the api
https://random-word-api.herokuapp.com/home
I just clicked the link
that doesn't work either for me
Why
I am using chrome on smartphone and it works fine
Weird
it works on my phone, not on my PC
I still don't know what you're trying to build, but I doubt fetching 200k words into the browser's memory is the way to go about it
you could, if you really need to, put up a loading spinner and just have the user wait, it shouldn't take too long on a modern device and internet connection
it's three entire books worth of words though, so 🤷
I don't want to fetch it in the browser, I want to keep it stored on a server
The frontend code will only fetch the words needed
But those words need to come from somewhere
so you're only loading the 200k words once, then fetching small sets from the backend?
Yes
200k words/rows in mysql or even sqlite is nothing, so if you're filtering and only sending a small amount of data to the user, there is nothing to worry about
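A minimal sketch of the "filter on the server, send only a few words" idea. It swaps the database for a sorted in-memory array, which is an assumption and not what was suggested above; with a real database the equivalent would be something like a `LIKE 'coo%' LIMIT 5` query on an indexed column. `WORDS` is a tiny stand-in for the 200k list and `findByPrefix` is a made-up helper name.

```javascript
// Tiny stand-in for the real 200k list; in practice it would be loaded once at startup.
const WORDS = ['cook', 'cookie', 'cookies', 'cool', 'cope', 'copy', 'core', 'corn', 'apple']
  .map((w) => w.toLowerCase())
  .sort();

// Binary search: index of the first entry that is >= target.
function lowerBound(sorted, target) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Return up to `limit` words starting with `prefix` (hypothetical helper).
function findByPrefix(sorted, prefix, limit = 5) {
  const p = prefix.toLowerCase();
  if (p === '') return [];
  const out = [];
  for (let i = lowerBound(sorted, p); i < sorted.length && out.length < limit; i++) {
    if (!sorted[i].startsWith(p)) break; // past the block of matching words
    out.push(sorted[i]);
  }
  return out;
}

console.log(findByPrefix(WORDS, 'coo')); // [ 'cook', 'cookie', 'cookies', 'cool' ]
```

Each request only walks the handful of entries it returns, so the size of the full list barely matters for response time.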
you'll want to look into debouncing so that you don't send a request for every single keypress though
So the way is to store the code on a database?
that sentence doesn't make a lot of sense
Ic
you store data in a database, not code
Yeah, I bugged there
I wanted to type So the way is to store the words on a database
yes
I want the request to be sent when the user stops typing
that's part of what debouncing does too yeah
it basically waits say... 200ms after each keypress before sending the request
so if you're typing at a reasonable speed, it won't send anything because each keypress cancels the wait for the previous one and restarts the timer
For that setTimeout is enough, isn't it
yup
you store the return value in a variable in the appropriate scope, then on each next keypress you clearTimeout and setTimeout again
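Roughly what that looks like, as a sketch: the timer id lives in a closure, and every call clears the previous timer before starting a new one. `debounce`, `requestSuggestions` and `onInput` are illustrative names, and the 20ms wait is only there to keep the demo short (the 200ms mentioned above is the default).

```javascript
// Returns a wrapped version of `fn` that only runs after `waitMs` without new calls.
function debounce(fn, waitMs = 200) {
  let timerId; // kept in the closure so every call sees the same timer
  return function debounced(...args) {
    clearTimeout(timerId); // cancel the wait started by the previous keypress
    timerId = setTimeout(() => fn(...args), waitMs);
  };
}

// Hypothetical usage: `requestSuggestions` stands in for the real request to the server.
const sent = [];
const requestSuggestions = (text) => sent.push(text);
const onInput = debounce(requestSuggestions, 20);

onInput('c');
onInput('co');
onInput('coo'); // only this call reaches requestSuggestions
```

In the browser this would be wired up with something like `input.addEventListener('input', (e) => onInput(e.target.value))`.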
Ic
Thx
no worries 🙂
also, cache stuff on the client side
words almost never change meaning
if someone looks for "cookies" frequently, also consider caching the results on your server too
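A small sketch of that caching idea, assuming the lookup is an async function that takes the query string. `withCache` and `fakeLookup` are made-up names, and the same wrapper could sit on the client or in front of the database on the server.

```javascript
// Wrap an async lookup so repeated queries are answered from memory.
function withCache(lookup) {
  const cache = new Map(); // normalized query -> Promise of results
  return function cachedLookup(query) {
    const key = query.trim().toLowerCase();
    if (!cache.has(key)) {
      const pending = Promise.resolve(lookup(key));
      cache.set(key, pending);
      // Forget failed lookups so a later retry can succeed.
      pending.catch(() => cache.delete(key));
    }
    return cache.get(key);
  };
}

// Hypothetical lookup that counts how often it really runs.
let calls = 0;
const fakeLookup = async (q) => {
  calls += 1;
  return [q + 's'];
};
const cached = withCache(fakeLookup);
```

Storing the promise rather than the result means two identical queries fired close together share one request. This version never evicts anything, which is fine for data that almost never changes but would need a size limit for a long-running server.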
If you want to be able to search through 200k entries for similar words, I'm assuming at a speed fast enough to show results while typing, you’ll want a graph database. So you would pre-compile similar words as well as partials tied to each other, and then you look at the connected nodes for the most similar ones, however many you need
You’d be storing a lot more than 200k bits of data, but you'd gain speed https://en.m.wikipedia.org/wiki/Space%E2%80%93time_tradeoff
For partials you want at least the first 2 letters but ideally 3 or even better 4. Then once you get past the partials you can look at the words linked to the partial.
Be aware that for 4 letters there are over 450k combinations (26^4 = 456,976), while all 3 letters is about 17.5k (26^3 = 17,576), before exclusions. However, most of these can be removed, as nothing will be linked to “ojmw” for instance
Bonus points for factoring how common a word is into how closely it's linked to other words
Adding in fuzzy search is then a little easier as you can fuzzy search for a partial and it cuts down on what you need to search through
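One way to read the fuzzy-on-partials idea, as a sketch: compare what the user typed against the short list of known partials using edit distance, then only search the words linked to the partials that are close enough. Levenshtein distance is my choice here, not something specified above, and `nearbyPartials` is a made-up name.

```javascript
// Classic Levenshtein edit distance using a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let prevDiag = row[0]; // distance for (i-1, j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for (i-1, j), saved before overwriting
      row[j] = Math.min(
        row[j] + 1, // delete a[i-1]
        row[j - 1] + 1, // insert b[j-1]
        prevDiag + (a[i - 1] === b[j - 1] ? 0 : 1) // substitute or match
      );
      prevDiag = above;
    }
  }
  return row[b.length];
}

// Keep only the partials within `maxDistance` edits of what was typed.
function nearbyPartials(partials, typed, maxDistance = 1) {
  return partials.filter((p) => editDistance(p, typed) <= maxDistance);
}

console.log(nearbyPartials(['coo', 'cop', 'the', 'cat'], 'cok')); // [ 'coo', 'cop' ]
```

Because partials are at most a few letters long, each comparison is tiny, and the expensive part (looking through full words) only happens for the partials that survive.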
Something like this graph would end up exploding for large scale databases but you can tailor how much information you have to how much space you have
that doesnt work well for english
i, a, an, as, it, he, on, at ...
all valid words that are 1-2 letters long
Sorry I should have been clearer, I should have said "up to and including 3 or even better 4", so you would also have all single letters and all double letters
since 2 letter words are an edge-case, and finite, might as well just have those as part of a pre-computed search?
That’s what I meant
then what you said makes sense
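A simplified sketch of the partials idea as clarified above, with every partial of 1 up to 4 letters included. It uses a plain `Map` as a stand-in for the graph database, which is an assumption to keep the example small; `buildPrefixIndex` and `suggest` are made-up names.

```javascript
// Map every 1..maxLen-letter partial to the words that start with it.
function buildPrefixIndex(words, maxLen = 4) {
  const index = new Map();
  for (const word of words) {
    const w = word.toLowerCase();
    for (let len = 1; len <= Math.min(maxLen, w.length); len++) {
      const partial = w.slice(0, len);
      if (!index.has(partial)) index.set(partial, []);
      index.get(partial).push(w);
    }
  }
  return index;
}

// Look up the bucket for the typed text, then narrow it down if more was typed.
function suggest(index, typed, maxLen = 4, limit = 5) {
  const t = typed.toLowerCase();
  if (t === '') return [];
  const bucket = index.get(t.slice(0, maxLen)) || [];
  return bucket.filter((w) => w.startsWith(t)).slice(0, limit);
}

const index = buildPrefixIndex(['a', 'an', 'as', 'at', 'the', 'then', 'there', 'these', 'thesis']);
console.log(suggest(index, 'an')); // [ 'an' ]
console.log(suggest(index, 'thesi')); // [ 'thesis' ]
```

Only partials that actually occur get an entry, so nothing is ever stored for something like "ojmw". Each word is referenced from up to 4 buckets, which is the extra space being traded for lookup speed.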
OP, essentially if there’s any operation that happens frequently, you want to already have that data ready. The data you’re working with doesn’t change much, so you can afford to spend time creating the database. Things don’t have to be super rigid either: if any 3-letter partials have a lot of matches, it might be worth splitting those up into 4-letter partials so there's less work to do. You might also want to treat things like “th” “sh” “ch” “st” “ph” etc. as a single letter, so “the” is considered 2 letters; this lets you avoid losing speed when multi-letter phonemes are used
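A rough sketch of the "treat th/sh/ch/st/ph as one letter" idea: split a word into tokens where those pairs count as a single unit, then build the partial from the first few tokens. The pair list is taken from the message above; `tokenize` and `partialKey` are made-up names, and the greedy left-to-right matching is an assumption.

```javascript
// Letter pairs that should count as a single "letter" when building partials.
const DIGRAPHS = ['th', 'sh', 'ch', 'st', 'ph'];

// Split a word into tokens, greedily taking a known pair before a single letter.
function tokenize(word) {
  const w = word.toLowerCase();
  const tokens = [];
  let i = 0;
  while (i < w.length) {
    const pair = w.slice(i, i + 2);
    if (DIGRAPHS.includes(pair)) {
      tokens.push(pair);
      i += 2;
    } else {
      tokens.push(w[i]);
      i += 1;
    }
  }
  return tokens;
}

// Build a partial from the first `tokenCount` tokens instead of the first N characters.
function partialKey(word, tokenCount = 3) {
  return tokenize(word).slice(0, tokenCount).join('');
}

console.log(tokenize('the')); // [ 'th', 'e' ]
console.log(partialKey('there')); // 'ther'
```

With this, "there" and "these" fall into different 3-token buckets ("ther" and "thes") even though they share their first three characters.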