"unknown interaction" due to high traffic

When my bot gets heavy burst loads, users see "interaction failed" and the bot logs "unknown interaction". My button handler calls deferReply on its first line, so it should respond within the 3 s window. This goes on for around 15-30 minutes before the bot actually starts responding. In some cases the message shows the deferred reply state and stays there for around 5-10 minutes before finally changing to the actual response; in other cases it never updates at all, probably because the interaction expired.

For some data: the first event we used the bot for had around 600 button interactions per minute (about 10 per second), which caused zero problems. The next few events sat at around 2k interactions per minute (about 33 per second), and that's when the problems started. These figures are averaged over 30 minutes, which means the initial burst would have been much higher.
14 Replies
ganr (OP), 2mo ago
I'm on discord.js 14.21 and Node 22/24.
d.js docs, 2mo ago
tag suggestion for @ganr:
Common causes of DiscordAPIError[10062]: Unknown interaction:
- Initial response took more than 3 seconds ➞ defer the response *.
- Wrong interaction object inside a collector.
- Two processes handling the same command (the first consumes the interaction, so it won't be valid for the other instance)
* Note: you cannot defer modal or autocomplete value responses
ganr (OP), 2mo ago
did you read what i wrote
TapRo, 2mo ago
It is possible that there is enough load on the bot that it's not able to defer all interactions within 3 seconds either, that has happened to me before, but I am not 100% sure
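One way to check whether load is eating into the 3-second window is to log how old each interaction already is when the handler starts running. The following is a minimal sketch, assuming the `createdTimestamp` value that discord.js exposes on interactions and roughly synchronized clocks between the bot host and Discord. `TimedInteraction` and `msLeftToAcknowledge` are hypothetical names for illustration, not part of discord.js.

```typescript
// Discord requires an initial response (reply, defer, ...) within 3 seconds.
const INITIAL_RESPONSE_WINDOW_MS = 3000;

// Only the field this sketch needs; discord.js interactions expose it.
interface TimedInteraction {
  createdTimestamp: number; // ms since Unix epoch
}

// Milliseconds left before the interaction can no longer be acknowledged.
// A value at or below zero means deferReply() would likely fail with 10062.
function msLeftToAcknowledge(
  interaction: TimedInteraction,
  now: number = Date.now(),
): number {
  const ageMs = now - interaction.createdTimestamp;
  return INITIAL_RESPONSE_WINDOW_MS - ageMs;
}

// Example: an interaction created 2.5 s before the handler started.
const left = msLeftToAcknowledge({ createdTimestamp: 10000 }, 12500);
console.log(`ms left to acknowledge: ${left}`);
```

If the logged value is regularly near or below zero during a burst, the events are already late when they reach the handler, and moving work out of the handler would probably not help on its own.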
ganr (OP), 2mo ago
That could happen, but I don't think that's the case. Is doing DB reads and writes going to slow the bot down so much that it can't defer the interactions? I don't think any interactions go through within the first 10 minutes of the high load coming in.
ganr (OP), 2mo ago
Also, from the memory usage on my server, I can see that it can handle it.
[attached image]
ganr (OP), 2mo ago
For the first 10 minutes, all interactions would come back as "interaction failed" anyway. I have updated the bot so that the button interaction handler only does a deferReply and then puts the rest of the logic into a queue:
import { randomBytes } from "node:crypto";
import { ButtonInteraction, MessageFlags } from "discord.js";
// jobQueue is the poster's own queue module (not shown in the thread).

// Class method: acknowledge the interaction first, then hand the work to the queue.
async handler(interaction: ButtonInteraction): Promise<void> {
  // Defer immediately so Discord gets a response within the 3-second window.
  try {
    await interaction.deferReply({ flags: MessageFlags.Ephemeral });
  } catch (e) {
    console.error("[Handler] Failed to defer reply:", e);
    return;
  }

  // Generate unique job ID
  const jobId = `${interaction.id}-${randomBytes(8).toString("hex")}`;

  // Add job to queue with all necessary data
  jobQueue.addJob({
    jobId,
    interactionToken: interaction.token,
    interactionId: interaction.id,
    customId: interaction.customId,
    userId: interaction.user.id,
    guildId: interaction.guildId,
    channelId: interaction.channelId,
    member: interaction.member!,
    retryCount: 0,
    createdAt: Date.now(),
  });

  console.log(
    `[Handler] Job ${jobId} added to queue for user ${interaction.user.id}`
  );
}
If it were really my other logic slowing the interaction handler down, then this should mitigate that, since the queue is a separate process. I thought interactions don't have rate limits, i.e. that rate limits are not applied to the interaction endpoints?
ganr (OP), 2mo ago
In this setup, the button handler's only task is to defer the reply and store all relevant information in Redis. A separate server then retrieves this data and performs the other operations, so that any potential slowdowns don't affect the bot's ability to respond to interactions within 3 s. I've also implemented retry logic, so any failed attempts are automatically retried after a few seconds. Alternatively, you can use something like Upstash workflows or Amazon SQS instead of running your own queue.

Hopefully this implementation will get tested, as I still don't know whether the game community is going to use the bot or not. If it does get used, I'll post an update on its results, and I'll also ask for more money :)
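The worker side of such a setup could look roughly like the sketch below. It is not the poster's implementation: the retry limit, the base delay, and the exponential backoff are assumptions, and `QueuedJob`, `decideRetry`, and `editOriginalResponseUrl` are hypothetical names. What is not an assumption is that an interaction token stays valid for 15 minutes and that a separate process can edit the deferred response through Discord's "edit original interaction response" REST route using only the application ID and that token.

```typescript
// Subset of the job payload stored by the handler above.
interface QueuedJob {
  jobId: string;
  interactionToken: string;
  retryCount: number;
  createdAt: number; // ms since epoch, set when the job was enqueued
}

// Interaction tokens are valid for 15 minutes.
const TOKEN_LIFETIME_MS = 15 * 60 * 1000;
const MAX_RETRIES = 3; // assumption
const BASE_RETRY_DELAY_MS = 2000; // assumption: "a few seconds"

function isTokenExpired(job: QueuedJob, now: number = Date.now()): boolean {
  return now - job.createdAt >= TOKEN_LIFETIME_MS;
}

// Exponential backoff: 2 s, 4 s, 8 s, ...
function nextRetryDelayMs(retryCount: number): number {
  return BASE_RETRY_DELAY_MS * Math.pow(2, retryCount);
}

type RetryDecision =
  | { action: "retry"; delayMs: number }
  | { action: "drop"; reason: string };

// Decide whether a failed job is worth retrying at all.
function decideRetry(job: QueuedJob, now: number = Date.now()): RetryDecision {
  if (job.retryCount >= MAX_RETRIES) {
    return { action: "drop", reason: "max retries reached" };
  }
  const delayMs = nextRetryDelayMs(job.retryCount);
  // No point retrying if the token will be dead by the time the retry runs.
  if (isTokenExpired(job, now + delayMs)) {
    return { action: "drop", reason: "interaction token expired" };
  }
  return { action: "retry", delayMs };
}

// URL to PATCH in order to replace the deferred response from the worker.
function editOriginalResponseUrl(applicationId: string, token: string): string {
  return `https://discord.com/api/v10/webhooks/${applicationId}/${token}/messages/@original`;
}
```

Because `createdAt` is recorded at enqueue time, slightly after the interaction itself was created, the expiry check here is a little optimistic; storing the interaction's own creation timestamp in the job would make it exact.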
Amgelo, 2mo ago
Unfortunately you can't have buttons with modals though (a modal has to be the first response, so it can't follow a defer), and all your button replies are ephemeral. If those two things aren't an issue, then go ahead.
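If modal buttons were ever needed alongside the defer-everything approach, one option would be to decide the initial response per button before deferring. The sketch below assumes a hypothetical convention where modal-opening buttons carry a `modal:` prefix in their `customId`; neither the prefix nor `chooseInitialResponse` comes from discord.js or from the thread.

```typescript
type InitialResponse = "showModal" | "deferReply";

// Hypothetical naming convention for buttons that must open a modal.
const MODAL_PREFIX = "modal:";

// Pick the first response for a button. A modal must be the first response,
// so those buttons skip the defer; everything else is deferred and queued.
function chooseInitialResponse(customId: string): InitialResponse {
  return customId.indexOf(MODAL_PREFIX) === 0 ? "showModal" : "deferReply";
}

console.log(chooseInitialResponse("modal:feedback"));
console.log(chooseInitialResponse("claim-reward"));
```

In the handler, the `"showModal"` branch would call `interaction.showModal(...)` directly, and only the `"deferReply"` branch would go through the queue.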
ganr (OP), 2mo ago
yes for my use case it's fine
ganr (OP), 2mo ago
This is how my deployment is set up:
[attached image]
