LBG (Lloyds Banking Group) Infrastructure Engineer (WAF)
Thursday, 2nd July 2026 · 16:00 - 17:30 UK Time (90 mins)
⭐ Your Core STAR Stories (LBG Values)
Lead with the impact. Emphasize "I", not "We". Explicitly bridge to the WAF role at the end.
S/T: BRDs from BAs were poor quality. Easy route was blaming them. I investigated the inputs instead.
A: Found a structural blocker: BAs (younger women/POC) couldn't get Exec time; PMs (older white men) could. I boldly escalated this access imbalance to the Execs to force delegation.
R: Execs delegated, BRD quality spiked, multiple workstreams sped up.
WAF Bridge: "If WAF rules are poorly configured, I don't blame the rule. I investigate the input—did the engineer have access? You fix the system, not the symptom."
S/T: Retail team trying to bypass compliance controls on machines due to commercial friction.
A: Didn't just say "no." Dug into *why*. Reviewed objectives, interpreted data, worked collaboratively to find a middle ground where their commercial goal proceeded securely.
R: Machines remained compliant, risk mitigated, business operated efficiently.
WAF Bridge: "This is WAF engineering. App teams want to deploy fast, security wants to block. My job is finding the safe middle ground so the business moves securely."
S/T: Pulled into multiple SME projects, couldn't personally execute all my sub-team's reviews.
A: Hired independent thinkers. Wrote the risk matrix, built procedures, trained them. Held 1-on-1s but provided coaching/frameworks, not direct answers. Trusted them to execute.
R: Team outperformed others, completing more work in less time. Model was adopted elsewhere.
WAF Bridge: "This is how you scale infra. Write the runbooks, build escalation matrices, train engineers, and trust them on-call. Doing it yourself creates a single point of failure."
S/T: Running my LLM platform, shipping logs via Alloy to Grafana Loki. Discovered `httpx` was logging full request URLs at INFO level, meaning the live Gemini API key (passed as a `?key=` query parameter) was bleeding into centralized logs on every call.
A: Didn't just filter the log (a band-aid). Fixed the root cause: moved the key from the query string to the `x-goog-api-key` header so it structurally couldn't appear in URLs. Added defense-in-depth by suppressing `httpx` INFO logs across all worker entrypoints. Shipped the fix through the GitLab CI pipeline rather than a manual hotfix.
R: Leak immediately stopped, verified in Grafana. This log-suppression and header-auth pattern became standard for all future API integrations.
WAF Bridge: "This is why WAF engineers care about HTTP structure. Query strings are part of the URL, and URLs routinely end up in access logs on proxies, load balancers, and WAFs; header values usually aren't logged by default. Fixing it at the protocol level rather than just writing a log filter is how you prevent the leak permanently."
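A minimal sketch of the header-auth plus log-suppression pattern from this story, using only the standard library. The endpoint URL, the sample key, and the function names are hypothetical; only the `x-goog-api-key` header name and the `httpx` logger name come from the notes above.

```python
import logging
from urllib.parse import urlencode

API_URL = "https://example.invalid/v1/generate"  # hypothetical endpoint, for illustration only


def build_request(api_key: str, params: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Return (url, headers) with the key carried in a header, never in the URL."""
    query = urlencode(params)
    url = f"{API_URL}?{query}" if query else API_URL
    headers = {"x-goog-api-key": api_key}
    return url, headers


def quiet_http_client_logs() -> None:
    """Defense in depth: lift the httpx logger above INFO so request URLs are not logged."""
    logging.getLogger("httpx").setLevel(logging.WARNING)


url, headers = build_request("secret-key-123", {"alt": "json"})
quiet_http_client_logs()
```

The point to make in the room: the URL can now be logged anywhere along the path without exposing the key, and the logger change is a second layer rather than the fix itself.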
S/T: 5,000 affiliate banners non-compliant. Manual email process was failing.
A: Investigated backend, found templates held centrally. Proposed code replacement. Commercial pushed back on "boring" banners, so I collaborated with localization to make them compliant AND commercial. Handed to engineering.
R: All 5,000 replaced via one code change. Zero regulatory action.
WAF Bridge: "This is policy-as-code. Instead of manually updating 5,000 WAF rules, you change the template they inherit from. Fix at the root to eliminate human error."
• Difficult Person: PokerStars Commercial Pushback (finding middle ground on the banners).
• Mistake: Contabo Monitoring (deployed logging but forgot alerts—deploying WAF in detect mode is useless if nobody looks at the alerts).
• Improve Process: LCCP Reg Filter (GitHub tool replacing manual reg searching).
Why This Role Fits You
When they ask "why LBG?" and "why this role?" — here's what's genuine, not generic.
The Panel & Questions to Ask Them
Ask Him: "What is the biggest operational headache the WAF team is facing right now that you're hoping this hire will take off your plate?"
Ask Him: "When a new WAF ruleset rolls out, what does the staging process look like to guarantee zero customer disruption?"
Ask Him: "How much friction are you seeing between strict policy-as-code requirements and the deployment speed expected by SRE teams during emergencies?"
WAF Request Flow
How all traffic passes through the WAF — the end-to-end picture
A WAF (Web Application Firewall) is a bouncer at the club door. Every single request to your website or API passes through it before reaching your backend servers. The bouncer checks each person against a rulebook and either lets them through or throws them out.
The WAF operates as a reverse proxy. The flow:
| Term | Definition |
|---|---|
| Reverse Proxy | A server that sits in front of web servers and forwards client requests to them. |
| TLS Termination | The process of decrypting HTTPS traffic at the proxy/WAF level rather than the final destination server. |
| SNI | Extension to TLS that lets a client name the hostname it is connecting to at the start of the handshake, so a proxy hosting many sites can present the right certificate. |
HTTP/S · DNS · TLS Fundamentals
The protocol knowledge Shephard will probe
DNS is the address book of the internet. TLS is the padlock on the envelope. HTTP/S is the letter inside. The WAF reads all of these to decide if the request is legitimate.
DNS & TLS in WAF context
- A/CNAME records point to the WAF IP, not origin.
- Edge CDN WAFs use Anycast routing.
- TLS termination: WAF decrypts inbound, re-encrypts outbound. Holds private key.
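- Cipher suites: Client and server negotiate the key exchange and encryption algorithms. Modern: TLS 1.3, ECDHE.
- mTLS: Both sides present a cert. On the backend connection, the WAF presents a client cert as well, so the origin can verify the traffic came through the WAF.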
| Term | Definition |
|---|---|
| Anycast | Routing method where multiple servers advertise the same IP address; the network routes each client to the topologically closest one. |
| TTL (Time To Live) | Setting that tells a DNS resolver how long to cache a record before querying again. |
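To make the TLS bullets concrete, here is a small sketch of a client-side TLS context with a minimum protocol version, built with Python's standard `ssl` module. The function name and the certificate paths in the comments are hypothetical; this illustrates the concepts and is not how any particular WAF is configured.

```python
import ssl


def make_client_context(
    min_version: ssl.TLSVersion = ssl.TLSVersion.TLSv1_2,
) -> ssl.SSLContext:
    """Build a client context that verifies the server and refuses old TLS versions."""
    ctx = ssl.create_default_context()  # verifies the server cert and hostname
    ctx.minimum_version = min_version
    # For mTLS the client would also present its own cert (hypothetical paths):
    # ctx.load_cert_chain(certfile="client.pem", keyfile="client.key")
    return ctx


ctx = make_client_context()
# When a socket is wrapped, server_hostname is what gets sent as SNI:
# ctx.wrap_socket(sock, server_hostname="www.example.com")
```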
WAF Vendor Landscape
Edge, on-prem appliance, cloud-native — and what LBG runs
LBG runs Cloudflare/Akamai at the Edge, F5/Imperva in their Data Centers, and GCP Cloud Armor natively in the cloud. You have hands-on with Cloudflare and Cloud Armor.
| Type | Vendors | Your position |
|---|---|---|
| Edge / SaaS | Cloudflare, Akamai | ✓ Cloudflare in production |
| On-prem | F5 BIG-IP, Imperva | No hands-on, frame as transferable IaC |
| Cloud-native | GCP Cloud Armor, AWS WAF | ✓ Cloud Armor badge (Friday) |
Terraform & Configuration Drift
State management, the "3 AM Emergency" fix, and Drift
Terraform keeps a record (the state file) of exactly what it built. If someone goes around you and clicks in the console during an emergency, Terraform notices on its next plan: it sees that the live environment differs from its code. That gap is called Configuration Drift.
If you don't fix the drift, the next automated pipeline run will overwrite your emergency fix and bring the attack right back.
Handling the 3 AM Emergency (Drift Management)
- The Break-Glass Action: During a live incident, speed wins. You log into the WAF console and manually block the IP. You do not wait for a CI/CD pipeline when data is at risk.
- The Drift: Real infra no longer matches Code or State.
- The Clean-Up: Once mitigated, backport the fix. Write the code, run `terraform plan` (should show 0 changes), and merge. Or use `terraform import` if you created a completely new resource manually.
- Prevention: Run `terraform plan` on a schedule in GitHub Actions. If it detects drift, it alerts the team so manual changes aren't forgotten.
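The scheduled drift check could be sketched as a small wrapper around `terraform plan -detailed-exitcode`, which exits 0 when there is nothing to change, 1 on error, and 2 when the plan contains changes. The function names and status labels are hypothetical, and exit code 2 also fires for code changes that have not been applied yet, so treat "drift" here as "live and code disagree".

```python
import subprocess


def classify_plan_exit(code: int) -> str:
    """Map terraform's -detailed-exitcode result onto a status label."""
    if code == 0:
        return "in_sync"  # plan succeeded, no changes
    if code == 2:
        return "drift"    # plan succeeded, changes present
    return "error"        # plan itself failed


def check_drift(workdir: str) -> str:
    """Run a non-interactive plan in workdir and classify the result."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)
```

A scheduled job would call `check_drift(...)` on each workspace and raise an alert on anything other than `"in_sync"`.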
CI/CD + Policy-as-Code
How WAF rules go from commit to production safely
Every WAF rule change goes through a pipeline — like a spell-checker that runs automatically. Policy-as-code is the spell-checker.
- Terraform plan: Pipeline runs `terraform plan` and exports the plan as JSON (`terraform show -json`).
- Policy check: OPA (Open Policy Agent) evaluates JSON against rules (e.g., "TLS minimum version must be 1.2").
- Safe Rule Deployment: Never deploy directly into Block mode. Deploy in Log-Only mode → Monitor for false positives → Flip to Block.
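A rough Python stand-in for the policy check step. Real OPA policies are written in Rego, so this only illustrates the idea of evaluating plan JSON against a rule. It assumes the `resource_changes` layout of Terraform's JSON plan output and a `google_compute_ssl_policy` resource with a `min_tls_version` attribute; the sample plan and resource names are made up.

```python
import json

# Assumption: accepted values for google_compute_ssl_policy.min_tls_version
ALLOWED_MIN_TLS = {"TLS_1_2"}


def tls_violations(plan: dict) -> list[str]:
    """Return addresses of SSL policies whose planned minimum TLS version is too low."""
    violations = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "google_compute_ssl_policy":
            continue
        after = (rc.get("change") or {}).get("after")
        if after is None:  # resource is being deleted, nothing to check
            continue
        if after.get("min_tls_version") not in ALLOWED_MIN_TLS:
            violations.append(rc.get("address", "<unknown>"))
    return violations


# Hypothetical plan excerpt for illustration
sample_plan = json.loads("""
{
  "resource_changes": [
    {"address": "google_compute_ssl_policy.modern",
     "type": "google_compute_ssl_policy",
     "change": {"actions": ["create"], "after": {"min_tls_version": "TLS_1_2"}}},
    {"address": "google_compute_ssl_policy.legacy",
     "type": "google_compute_ssl_policy",
     "change": {"actions": ["update"], "after": {"min_tls_version": "TLS_1_0"}}}
  ]
}
""")

violations = tls_violations(sample_plan)
```

In a pipeline, a non-empty `violations` list would fail the job before `terraform apply` ever runs.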
Layer 7 DDoS & Application Attacks
Why L7 is harder than L3/4 — and how to defend
Layer 7 attacks are smarter. They send real, valid-looking website requests. The WAF has to figure out the difference between 10,000 real customers and 10,000 bots all doing the same thing.
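One common L7 defense is per-client rate limiting. The sketch below is a toy sliding-window limiter; the class name, the limit, and the use of an IP as the client key are illustrative assumptions, and real WAFs usually key on richer signals than IP alone.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within the last `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # client key -> timestamps of allowed requests

    def allow(self, client_key: str, now: float) -> bool:
        hits = self._hits[client_key]
        # Drop timestamps that have aged out of the window
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=3, window_seconds=10.0)
decisions = [limiter.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 3.0, 12.0)]
# decisions == [True, True, True, False, True]
```

The fourth request is throttled because three are already inside the window; by t=12 the earlier hits have expired.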
Bot Protection
Distinguishing legitimate automation from malicious bots
Bot protection is about telling the difference between a real browser and an automated script, such as one replaying leaked username/password pairs against your login page (credential stuffing).
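Credential stuffing tends to show up as one source failing logins for many different usernames. A minimal sketch of that signal, with a hypothetical threshold and made-up log data:

```python
from collections import defaultdict


def suspected_stuffing_sources(failed_logins, max_distinct_users=3):
    """Return source IPs whose failed logins span more than max_distinct_users usernames."""
    users_by_ip = defaultdict(set)
    for ip, username in failed_logins:
        users_by_ip[ip].add(username)
    return sorted(ip for ip, users in users_by_ip.items() if len(users) > max_distinct_users)


# Hypothetical failed-login events: (source IP, attempted username)
events = [
    ("203.0.113.7", "alice"), ("203.0.113.7", "bob"),
    ("203.0.113.7", "carol"), ("203.0.113.7", "dave"),
    ("198.51.100.2", "erin"), ("198.51.100.2", "erin"),
]
suspects = suspected_stuffing_sources(events)
```

A user who mistypes their own password twice stays under the threshold; the source cycling through four accounts does not.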
OWASP API Security Top 10
The attack types the WAF needs to defend against
The ones most relevant for WAF configuration are BOLA (stealing other people's data by guessing IDs), broken authentication (credential stuffing), and resource exhaustion (L7 DDoS).
GCP Cloud Armor ⭐
Google's WAF — you did the badge Friday — lean into this
Cloud Armor is Google's WAF. You write security policies and attach them to your backend services. Rules are evaluated in priority order, lowest number first, and the first match wins.
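The priority model can be simulated in a few lines. This is a toy evaluator, not the Cloud Armor API: the `Rule` class and the match functions are hypothetical, while the lowest-number-first ordering and the default rule at priority 2147483647 follow Cloud Armor's documented behaviour.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    priority: int
    action: str
    matches: Callable[[dict], bool]


def evaluate(rules: list[Rule], request: dict) -> str:
    """Apply the action of the first matching rule, lowest priority number first."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(request):
            return rule.action
    raise ValueError("no rule matched; a policy needs a catch-all default rule")


policy = [
    Rule(2147483647, "allow", lambda r: True),  # default rule
    Rule(2000, "allow", lambda r: r["path"].startswith("/health")),
    Rule(1000, "deny(403)", lambda r: r["ip"].startswith("198.51.100.")),
]

blocked = evaluate(policy, {"ip": "198.51.100.9", "path": "/health"})
allowed = evaluate(policy, {"ip": "203.0.113.5", "path": "/accounts"})
```

The first request matches both the deny and the health-check allow rule, but the deny has the lower number, so it wins.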
False Positive / Negative Triage
The ongoing balancing act — and how to diagnose methodically
Tuning a WAF is a permanent balancing act. Too tight = customers can't use your service. Too loose = attackers get through. Make surgical fixes, not blunt rule removals.
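A toy illustration of a surgical fix: instead of disabling a rule that produces a false positive, you exclude one parameter on one path. The signature check, class names, paths, and parameters are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Exclusion:
    path_prefix: str
    param: str


@dataclass
class SqliRule:
    exclusions: list = field(default_factory=list)

    def blocks(self, path: str, params: dict) -> bool:
        """Return True if any non-excluded parameter trips the (toy) signature."""
        for name, value in params.items():
            excluded = any(
                path.startswith(e.path_prefix) and name == e.param
                for e in self.exclusions
            )
            if excluded:
                continue
            if "' or " in value.lower():  # deliberately crude signature
                return True
        return False


review = {"comment": "Is it 'new' or 'used'?"}
attack = {"user": "x' OR 1=1"}

untuned = SqliRule()
tuned = SqliRule(exclusions=[Exclusion("/reviews", "comment")])
```

With the exclusion in place, the customer review goes through while the same rule still blocks the payload on `/login`, and even on other `/reviews` parameters.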
Observability & Telemetry
What to monitor, how to alert, LBG's actual stack
If you can't see what your WAF is doing, you're flying blind. Good observability answers: Is it blocking the right things? Is it affecting real customers? How much latency is it adding?
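Two of those questions reduce to simple numbers you can compute from WAF logs: block rate and added latency. A minimal sketch, assuming each log record carries an `action` and a `waf_ms` field (both field names are hypothetical):

```python
import math


def summarize(records: list) -> dict:
    """Compute block rate and nearest-rank p95 of WAF-added latency."""
    if not records:
        raise ValueError("no records to summarize")
    total = len(records)
    blocked = sum(1 for r in records if r["action"] == "block")
    latencies = sorted(r["waf_ms"] for r in records)
    p95_index = max(0, math.ceil(0.95 * total) - 1)  # nearest-rank percentile
    return {"block_rate": blocked / total, "p95_waf_ms": latencies[p95_index]}


# Hypothetical log sample: 20 requests, 2 of them blocked
sample = [
    {"action": "block" if i in (5, 12) else "allow", "waf_ms": i}
    for i in range(1, 21)
]
summary = summarize(sample)
```

A sudden jump in block rate after a rule change is the classic false-positive signal; a creeping p95 suggests the ruleset is getting expensive.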
Incident Response & Blast Radius
The "If I do this, then what?" supply-chain mindset
It's never just "find a solution and do it". It's "If I do this, what changes? Can it wait? Does it need to happen now?"
A knee-jerk fix often causes a bigger outage than the actual attack. You have to mitigate the immediate threat, check the dependencies, look for hidden footprints, and then apply a permanent fix.
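The "check the dependencies" step is essentially a graph walk: before changing something, list everything downstream of it. A small sketch with a made-up dependency map (the component names are hypothetical):

```python
from collections import deque


def blast_radius(dependents: dict, changed: str) -> list:
    """Return every component reachable downstream of the changed one."""
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, []):
            if nxt not in seen and nxt != changed:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


# component -> components that depend on it
graph = {
    "waf-policy": ["public-api", "mobile-login"],
    "public-api": ["partner-feed"],
    "mobile-login": [],
}
affected = blast_radius(graph, "waf-policy")
```

If the list is longer than expected, that is the cue to ask whether the change can wait or needs a narrower scope.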