tRPC rate limiting endpoints

Benn 4/30/2023
I am currently having some problems with a race condition in my tRPC Next.js API.

Essentially, I have an enforceGuildPermissions middleware, which checks whether the user making the request has permission to get the data for that guild.

The data is stored in my Redis cache for 3 seconds. This works okay sometimes, but a single page fires 3-4 different tRPC requests (guild, role and channel), and because they all run concurrently, the caching code doesn't get a chance to populate the cache before the next request runs. Each request therefore calls the Discord API itself, and the last one (channel) ends up getting rate limited.

middleware route procedure:
const enforceGuildPermissions = enforceUserLoggedIn.unstable_pipe(
  async ({ ctx, next, rawInput }) => {
    const guildId: unknown = (rawInput as { guildId?: unknown })?.guildId;
    if (!guildId) throw new TRPCError({ code: 'BAD_REQUEST' });

    const webUser = await cache.webUsers.get(ctx.session.user.id);
    let guilds = webUser?.guilds;
    if (!guilds) {
      guilds = await getUserGuilds(ctx.session);
    }

    if (!guilds) throw new TRPCError({ code: 'UNAUTHORIZED' });
    
    // if the user is not in the guild return unauth
    const foundGuild = guilds.find((guild) => guild.id === guildId);
    if (!foundGuild) throw new TRPCError({ code: 'UNAUTHORIZED' });

    return next({
      ctx: {
        session: { ...ctx.session, user: ctx.session.user },
      },
    });
  }
);

export const guildProcedure = t.procedure.use(enforceGuildPermissions);
Benn 4/30/2023
Function which needs to be rate limited (using the Redis cache):
export const getUserGuilds = async (
  session: Session
): Promise<CachedUserGuild[] | null> => {
  if (!session.user.accessToken || !session.user.id) return null;

  const webUser = await cache.webUsers.get(session.user.id);
  if (webUser) return webUser.guilds;

  const response = await fetch(discord...)
  const guilds = await response.json();
  if (!response.ok || guilds.length <= 0) return null;
  
  // add guilds to cache
  await cache.webUsers.create(session.user.id, guilds);

  return guilds;
};


A big band-aid fix would be to just add an artificial wait:
    if (!guilds) {
      // wait for a concurrent request to populate the cache, then re-read it
      await new Promise((resolve) => setTimeout(resolve, 1000));
      const retried = await cache.webUsers.get(ctx.session.user.id);
      guilds = retried?.guilds;
      if (!guilds) {
        throw new TRPCError({ code: 'UNAUTHORIZED' });
      }
    }

But obviously this is not very elegant.
Benn 5/2/2023
bump
nlucas 5/3/2023
This isn’t really a tRPC problem; caching and rate limiting are hard
nlucas 5/3/2023
Probably simplest to recognise rate limits and retry the task though
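A minimal sketch of the "recognise rate limits and retry" idea, assuming the rate-limited response has status 429 and carries a `Retry-After` header in seconds. `fetchWithRetry`, `maxRetries` and `sleep` are hypothetical names used for illustration; they are not part of tRPC or of any Discord library.

```typescript
// Minimal shape of the response we rely on; real `fetch` Responses satisfy it.
interface RetryableResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Hypothetical helper: calls `doFetch` again while the server answers 429,
// up to `maxRetries` extra attempts, then returns whatever it got last.
async function fetchWithRetry<R extends RetryableResponse>(
  doFetch: () => Promise<R>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<R> {
  for (let attempt = 0; ; attempt++) {
    const response = await doFetch();
    if (response.status !== 429 || attempt >= maxRetries) return response;

    // Assumption: Retry-After is given in seconds. Fall back to 1 second
    // when the header is missing or cannot be parsed.
    const seconds = Number(response.headers.get('Retry-After'));
    const waitMs = Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : 1000;
    await sleep(waitMs);
  }
}
```

The existing `fetch` call inside `getUserGuilds` could then be wrapped as `fetchWithRetry(() => fetch(...))`. This only hides the symptom: every concurrent request still hits Discord, just later.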
alex / KATT 5/3/2023
You can dedupe the call to Discord
alex / KATT 5/3/2023
Have a fn to do it passed down through the context and memoize the promise
alex / KATT 5/3/2023
try running that
Benn 5/3/2023
Hi, thank you for the response. But I'm not sure I follow. Since this is hosted on serverless infrastructure (Vercel), I don't think it is possible to use an in-memory cache to stop this from happening? I may be misunderstanding your solution though.
alex / KATT 5/4/2023
You can use the above to make sure two calls to Discord aren't done within the same request
alex / KATT 5/4/2023
if you use that the thing will only be called once per request
alex / KATT 5/4/2023
so batching will be deduped
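The per-request dedupe described above could look roughly like the following. This is a sketch under assumptions, not the code that was actually used in this thread: `memoizePromise`, `createContext` and `getGuilds` are hypothetical names, and `Guild` stands in for `CachedUserGuild`. It relies on tRPC building one context per incoming HTTP request, so batched procedure calls share the same memoized function.

```typescript
type Guild = { id: string };

// Wraps an async loader so it runs at most once; every caller, including
// concurrent ones, receives the same promise. Note that a rejected promise
// is memoized too, which is usually fine for the lifetime of one request.
function memoizePromise<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (!pending) pending = load();
    return pending;
  };
}

// Hypothetical context factory, meant to be called once per HTTP request.
// `fetchGuilds` is whatever actually talks to the cache and to Discord.
function createContext(fetchGuilds: () => Promise<Guild[] | null>) {
  return { getGuilds: memoizePromise(fetchGuilds) };
}
```

In the middleware, `getUserGuilds(ctx.session)` would then become something like `ctx.getGuilds()`, with the context created as `createContext(() => getUserGuilds(session))`. Because the memo lives on the request's context, it needs no long-lived in-memory state and works on serverless hosting.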
alex / KATT 5/4/2023
if you have many users you gotta do something fancier where you can synthesise the promise in redis or something
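One possible reading of "something fancier" is to coordinate through the shared store, so that only one instance calls Discord while the others wait for the cached result. The sketch below assumes this interpretation. `SharedStore`, `acquireLock` and `loadOnceAcrossInstances` are hypothetical names; with Redis, `acquireLock` would typically map to the `SET key value NX PX <ms>` command, which succeeds only for the first caller until the key expires.

```typescript
// Hypothetical abstraction over a shared cache such as Redis.
interface SharedStore<T> {
  get(key: string): Promise<T | null>;
  set(key: string, value: T, ttlMs: number): Promise<void>;
  // Resolves to true only for the first caller until the lock expires.
  acquireLock(key: string, ttlMs: number): Promise<boolean>;
}

interface LoadOptions {
  cacheTtlMs: number;
  lockTtlMs: number;
  pollMs: number;
  maxPolls: number;
}

async function loadOnceAcrossInstances<T>(
  store: SharedStore<T>,
  key: string,
  load: () => Promise<T>,
  opts: LoadOptions = { cacheTtlMs: 3000, lockTtlMs: 5000, pollMs: 100, maxPolls: 50 },
): Promise<T> {
  const cached = await store.get(key);
  if (cached !== null) return cached;

  // The lock winner does the expensive call and fills the cache.
  if (await store.acquireLock(`lock:${key}`, opts.lockTtlMs)) {
    const value = await load();
    await store.set(key, value, opts.cacheTtlMs);
    return value;
  }

  // Everyone else polls the cache until the winner has written the value.
  for (let i = 0; i < opts.maxPolls; i++) {
    await new Promise<void>((resolve) => setTimeout(resolve, opts.pollMs));
    const value = await store.get(key);
    if (value !== null) return value;
  }
  throw new Error(`Timed out waiting for ${key}`);
}
```

If the lock holder fails, waiters in this sketch time out rather than retry; a production version would need to handle that case, and the lock TTL should exceed the slowest expected upstream call.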
Benn 5/4/2023
Thank you very much, when I get some time I shall look at this.
Benn 5/6/2023
@alex / KATT hi, I had time to do this now. Thank you SO much! This issue was slowing my site down a lot, and now it is fast once again!
Benn 5/6/2023
But just so I understand, why does this solution not scale? You say I may have to deal with the promise in Redis? Why is using memo not okay?
nlucas 5/6/2023
I haven’t looked at the solution, but (as a thought experiment) what happens if you have 2, 3, 15 parallel instances of the API running? (Horizontal scaling)
Benn 5/6/2023
Hmm, yes okay. I see what you are saying. This depends on how Vercel runs Next.js API endpoints, and I'm not entirely sure how it does. For now, this is working just fine, but I should look into this just in case. Thank you very much for the help 🙂