k8s Traefik bouncer: decision not applied
Hi,
On an existing k8s v1.28 + Traefik 3.3 cluster, I’m trying to integrate CrowdSec and its Traefik bouncer as a Traefik plugin.
I see traefik log acquisition is correctly done on the agent:
But when I manually add a ban decision, it’s as if the middleware doesn’t check the decisions and lets the request pass. What could I be doing wrong?
Here is the relevant part of my traefik configuration:
The traefik logs show the plugin correctly loaded:
Any help is welcome, thanks in advance 🙂
Here is the middleware configuration:
Here is the ingressroute using the middleware:
What is the output of cscli bouncers list in the LAPI pod?
Hi blotus, thank you for your assistance:
ok so traefik did not even try to connect to LAPI
nothing in the traefik logs ?
note: I just deleted everything and rebuilt: now I have traefik logs saying the bouncer plugin fails to reach crowdsec service:
when I take a random pod, I can reach the service:
do you have any kind of firewalling / network policies / ... in the cluster ?
we don't restrict which pods can contact LAPI, so if you can resolve, it should work in theory
nothing non standard in rancher2
very weird: so from a random pod in the same ns as traefik:
but from the traefik pod:
hum… could be a calico issue?
maybe
For the resolution issue, could be something in the traefik pod (eg, a different search domain from what you have in the other pod for some reason)
you can maybe try with the full service FQDN
indeed, I’ll try with the FQDN service
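For anyone repeating this comparison between pods, here is a minimal sketch that builds the Service FQDN and checks whether the short name and the full name resolve from the pod it runs in. It assumes Python is available in that pod (or in a debug container attached to it) and the default cluster domain cluster.local; the Service name crowdsec-service and namespace crowdsec are hypothetical placeholders, not values taken from this thread.

```python
import socket


def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified in-cluster DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def resolves(hostname: str) -> bool:
    """Return True if the hostname resolves from where this script runs."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Hypothetical names: replace with the real LAPI Service and namespace.
    short_name = "crowdsec-service"
    full_name = service_fqdn("crowdsec-service", "crowdsec")
    for name in (short_name, full_name):
        state = "resolves" if resolves(name) else "does NOT resolve"
        print(f"{name} -> {state}")
```

If the short name fails but the FQDN resolves, the pod's search domains are the likely culprit; if both fail, the pod is probably not using the cluster DNS at all.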
FYI I have this configuration for traefik:
don’t remember why I added that, it was 6yo
ok, it’s a dns issue at traefik level: FQDN doesn’t work on traefik pod, but using the service IP works
SVC IP are stable so it could be usable, but I’ll try to solve it before turning to raw IP
Thank you blotus, I’ll keep you posted
So, the DNS issue is solved (I removed all the outdated dns and hostNetwork stuff from the traefik pod definition)
Now the plugin can resolve the service, but gets a 403
I guess the plugin doesn’t get the LAPI shared password.
If I understand correctly, the plugin itself knows nothing about the LAPI password; it is passed in from the middleware
The last API Pull is still null
the middleware is configured as follows:
am I missing something, @blotus ?
most likely a bad API key
how did you generate the one you are using in the bouncer ?
fully random 25 chars of digits and letters
sorry, I meant did you use cscli bouncers add manually, passed it to the LAPI pod with an env var, ... ?
hum… the same key is in the lapi pod env BOUNCER_KEY_TRAEFIK but I’m pretty sure it’s not the key that was given at cscli bouncers add
can I generate a new bouncer key or should I destroy and create a new one?
in the LAPI pod:
cscli bouncers delete TRAEFIK
then restart the pod, it will recreate the bouncer in the DB automatically with the key provided in the env var
I guess you maybe changed the value of the key during your tests? (the start script cannot detect if the key has changed, so if you change the value in the env var once the bouncer has been created, it will not get updated in the DB)
I certainly tried to change the key some days ago, indeed
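To tell a bad key apart from a plugin problem, the key can be tested against LAPI directly. The sketch below assumes the usual bouncer call, a GET on /v1/decisions with the key in the X-Api-Key header; the URL http://crowdsec-service.crowdsec:8080 and the test IP are hypothetical placeholders. A 403 with "access forbidden" would point at a key that does not match what is stored in the LAPI database, while a 200 means the key is accepted.

```python
import sys
import urllib.error
import urllib.request


def build_decisions_request(lapi_url: str, api_key: str, ip: str) -> urllib.request.Request:
    """Build the GET request a bouncer would send to ask LAPI about one IP."""
    url = f"{lapi_url.rstrip('/')}/v1/decisions?ip={ip}"
    return urllib.request.Request(url, headers={"X-Api-Key": api_key})


def check_key(lapi_url: str, api_key: str, ip: str = "192.0.2.1") -> tuple[int, str]:
    """Send the request and return (HTTP status, response body)."""
    req = build_decisions_request(lapi_url, api_key, ip)
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers still carry a body worth printing (e.g. a 403).
        return err.code, err.read().decode()


if __name__ == "__main__":
    # Usage: python check_key.py <lapi_url> <api_key>
    # e.g. python check_key.py http://crowdsec-service.crowdsec:8080 "$BOUNCER_KEY"
    status, body = check_key(sys.argv[1], sys.argv[2])
    print(status, body)
```

Running it with the value of BOUNCER_KEY_TRAEFIK before and after deleting the bouncer and restarting the LAPI pod should show whether the stored key and the env var agree.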
better 🙂
Thank you @blotus ! So it was a mix of a bad DNS setting in the Traefik pod and a bad key configured
Let me check a manual ban…
A small scare because of the decision pull delay, but it worked! Thanks again
the default delay is 10s, but you can configure it in the bouncer if you want
That’s fine for me. Is it possible to mark this post as solved?
sure
/resolve
Perfect!
Time to configure recaptcha and possibly redis to share the decisions. Have a good night!
Sorry to ping you again, @blotus, but I’m surprised by an issue this morning: the bouncer was no longer able to reach the LAPI. I didn’t take a screenshot, but I saw the previous TRAEFIK bouncer in cscli bouncer list, plus a new TRAEFIK@10.42.10.230 entry that had never pulled. Since the LAPI key BOUNCER_KEY_TRAEFIK is still in the env, I did the same thing as last time: deleted all the bouncers, then restarted the LAPI pod. The traefik pod can reach the LAPI but can’t authenticate: {"message":"access forbidden"}. From the traefik logs in debug, the plugin says:
what could have gone wrong here? 😕
Is your cluster running on a hard disk or an SSD?
this one was an HDD node, not an SSD
I guess I should try to schedule it on an SSD node?
Yeah, only because a HEAD request seems near instant, which means no network problems, but reading from the database is slow
hum, so putting the db on something other than sqlite could help as well. Which backends do you support?
the major ones like mariadb, postgres, mysql
but we prefer to say postgres
which is the best of these 3 ^^
postgres
we had problems with mysql and mariadb in the past if you want to set up clustering
but postgres has never had any issues
ok, I’ll evict the lapi from hdd nodes, and possibly try to create a PG instance
ok so moved to another node, but there’s no decisions at all: is there a way to force sync of decisions? I guess it was the subscribed blocklists
These will be classed as new entities; did you enroll them / accept them if they are pending in the console?
Unless there is some k8s way to move data, but are you using sqlite or pg?
indeed, I just validated the enrollment
sqlite for now, but I’ll certainly create a pg instance
it'll be more stable