I am trying to understand the necessity of Kafka
Suppose I have a dashboard that shows the user stock changes over a span of time (e.g. 7 days, 1 month, 1 year, 5 years, etc.). We have a database where the stock data gets updated every 2 hours.
Can I just read the data from time series db? Is Kafka even necessary in this use case?
It depends on what that script does and how you want to handle it... Kafka gives you the option to aggregate and manipulate data and then share it with your consumers.
If you are already doing that in your script, you can just skip Kafka. In this use case, to my knowledge, you wouldn't gain anything by adding it; on the contrary, it would just be a useless step. However, if you don't do aggregation etc. in the script and just save raw data in the time_series_db, then putting Kafka as a layer between your db and your app would make sense.
However, this can also depend on what you want to do. If you want realtime data then Kafka would be a better solution than having to query every X seconds to get the newest data from your db.
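To make the "query every X seconds" option concrete, here's a minimal polling sketch. Everything here is hypothetical: `FAKE_DB` and `query_since` stand in for whatever client and query your actual time series db uses.

```python
import time

# Stand-in for the time series DB: (timestamp, price) rows for one ticker.
FAKE_DB = [(1, 101.2), (2, 101.9), (3, 102.4)]

def query_since(since_ts):
    """Hypothetical DB query: rows strictly newer than `since_ts`."""
    return [row for row in FAKE_DB if row[0] > since_ts]

def poll_latest(rounds=3, interval_s=0.01):
    """Poll every `interval_s` seconds; return all rows seen, oldest first."""
    seen_until = 0
    collected = []
    for _ in range(rounds):
        new_rows = query_since(seen_until)
        if new_rows:
            seen_until = max(ts for ts, _ in new_rows)
            collected.extend(new_rows)
        time.sleep(interval_s)
    return collected
```

With data that updates every 2 hours, a polling interval measured in minutes is plenty; the latency win from a push-based broker only matters at much tighter timescales.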
kafka as a whole is a giant broker setup
you can have multiple producers streaming data into it and multiple subscribers reading from it
kafka can send into the time_series_db, to consumer2, consumer3, consumer4, etc
if it's just reading data and storing the data, a "simpler" queue would be enough
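The fan-out being described can be sketched with an in-memory stand-in for a broker (this is not real Kafka, just the publish/subscribe shape; `TinyBroker` and the topic name are made up for illustration):

```python
from collections import defaultdict

class TinyBroker:
    """In-memory stand-in for a broker: one producer, many subscribers."""
    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Every subscriber on the topic gets its own copy of the message.
        for cb in self.subscribers[topic]:
            cb(message)

broker = TinyBroker()
db_writes, dashboard_events = [], []
broker.subscribe("stock-ticks", db_writes.append)         # consumer1: time_series_db writer
broker.subscribe("stock-ticks", dashboard_events.append)  # consumer2: dashboard feed
broker.publish("stock-ticks", {"ticker": "ACME", "price": 102.4})
```

The point of the sketch: with only one consumer (the db), the broker adds nothing; it earns its keep once several independent consumers need the same stream.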
And yes, this is also a valid point, you need to deploy Kafka somewhere, or use something like Pusher.
kafka infra is heavy af + expensive
cuz usually it's a cluster + something to manage it
@JulieCezar Atm it is in Upstash. I am purely doubting the necessity for the time being
I wouldn't have the FE read directly from Kafka. Reading from time series db is perfectly fine.
Kafka is useful for things such as when you want to keep two other services up to date with some stream of data. In your case that stream comes from the script which gets the stock data.
Then you can have a lambda/server or whatever read from the kafka stream and update an elasticsearch cluster so you can run aggregations or other random access patterns on that stock data.
You can also have another lambda/server read from the kafka stream and update the time series db.
And from there your FE can have all the query patterns it wants to access the data. It can read from your Elasticsearch cluster or it can read from your time series db. Regardless, you have a system where all your data will be consistent with the data from your original script.
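The two-consumer setup above can be sketched like this. The key Kafka-ish property is that each downstream service keeps its own read position (offset) in a shared log, so the Elasticsearch writer and the time series db writer can lag independently yet converge on the same data. `log`, `Consumer`, and the sink lists are all hypothetical stand-ins, not a real client API:

```python
# The stream produced by the stock-fetching script, as an append-only log.
log = [
    {"ticker": "ACME", "price": 101.2},
    {"ticker": "ACME", "price": 102.4},
]

class Consumer:
    """Each consumer tracks its own offset into the shared log."""
    def __init__(self, sink):
        self.offset = 0
        self.sink = sink  # stand-in for the Elasticsearch / time-series-db writer

    def poll(self, log):
        # Drain everything this consumer hasn't seen yet.
        while self.offset < len(log):
            self.sink.append(log[self.offset])
            self.offset += 1

es_index, tsdb_rows = [], []
es_consumer, tsdb_consumer = Consumer(es_index), Consumer(tsdb_rows)

es_consumer.poll(log)                              # Elasticsearch catches up first
log.append({"ticker": "ACME", "price": 103.0})     # script produces a new tick
tsdb_consumer.poll(log)                            # tsdb catches up, sees all three
es_consumer.poll(log)                              # ES picks up only the new record
```

Both sinks end up with the same three records in the same order, which is the consistency property described above.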
For your case though, unless the write throughput straight to your time series db is a limiting factor, I'd skip kafka as the middleman entirely
thanks. That's a good answer, and now I know what's next. I have also asked in the Upstash server and they responded