Final Stage · Candidate Briefing

Interview Day × AI Pioneer

One day, four stages, one job: prove I can drop into an unscoped problem and ship something a domain expert actually uses. PortSwigger is going all-in on AI, and the Pioneer is the roving, tribe-agnostic builder who makes that real. This is the plan — logistics, panel, culture, and the live build.

Date

Wed 10 June 2026

Window

10:15 – 15:00 BST · breaks built in

Where

Booths Park, Knutsford WA16 8ZS

Role

AI Pioneer · roving builder · not tribe-aligned

Stages on the day

task · lunch · swig · final

On the panel

davies · mackay · philpott · craig

SwigFactor traits

culture is the bar

Live build to ship

artifact, not slides

01 The Journey · arrive calm

Macclesfield → Booths Park

~20 min drive · A537 westbound via Monk's Heath

SK11 7DG

Brookfield Lane, Macclesfield

A537 W

straight rural road

Monk's Heath

A34 crossroads · AM congestion

A537 → A50

into Knutsford

WA16 8ZS

6 Booths Park, Chelford Rd

Depart

09:35 am

Target arrival

10:10 am

First activity

10:15 am

Buffer built in

35 min

On arrival · 220-acre estate

Booths Park isn't one office block — it's parkland with historic and modern buildings. Don't let the scale eat the clock.

Turn in off Chelford Rd.
Follow the estate road to visitor parking — free, onsite.
Sign in at Booths Hall.
Main reception. Vehicle reg is taken here.
Get walked to Springwood.
The wing housing the technical teams & meeting rooms.
Settle for ~5 min.
Water, breathe, phone on silent. Land calm, not cortisol-spiked.

02 Run of Day · 10:15 → 15:00

10:15

~90 min

Task-Based Activity

The Live Test · with Lucy Davies · Jamie Mackay · Simon Wood · Adam Spencer

The hands-on build — and they gave no brief in advance. Expect a surprise, unscoped problem dropped on me cold, chosen to fit the roving scope of the role. The job isn't a clever demo: it's spotting the agentic opportunity inside someone's messy manual process and showing how I'd turn it into a working tool. See the method below.

Assessing: Build instinct · Context acquisition · Agentic thinking · Prompt-injection awareness

Play: Solve what they need, not just what they ask. Talk through trust boundaries — this is a security firm. Ship something real by the end of the slot.

12:00

~60 min

Behavioural Lunch

On-site restaurant · host Julian Philpott

Unstructured, relaxed — and the most quietly decisive hour of the day. PortSwigger filters hard for people they actually enjoy being around. This is the "Niceness" test in the wild.

Assessing: Niceness · Emotional intelligence · Cultural fit

Play: Shift focus outward — ask about their experience of "Have fun", the tribes, Movember. Warmth to everyone, catering staff included. Decompress without dropping the antenna.

13:00

~60 min

'SwigFactor' Interview

Culture crucible · behavioural · 11 core traits

The dedicated cultural-alignment session. Every story should map to at least one of the eleven SwigFactor traits, above all humility, altruism, tenacity and EQ. Lead with collaborative wins and admitted knowledge gaps, never intellectual superiority.

Assessing: Humility & ego-at-the-door · Tenacity on unscoped work · Winning over a sceptic

Play: Have three stories ready: subordinated my ego to a non-technical expert · refused to quit an undocumented problem · turned an automation-resistant stakeholder into an advocate.

14:00

→ 15:00 wrap

Final Synthesising Interview

Strategic close · with Alex Craig

Synthesis of the day: strategic alignment, technical validity, any latent red flags. Open by showing how earlier feedback already moved my thinking — that proves coachability. Reaffirm I thrive roving, with no need for a permanent team or "home".

Assessing: Strategic maturity · Technical depth · Comfort with the roving identity

Play: Close with calibrated questions (see below). Energy without performing — quietly, genuinely lit up by the work.

03 The Panel · tailor to each

Alex Craig

Research & Engineering · King's Award DNA

Will judge

Technical validity. Foundational AI mechanics, prompt-injection risk, agentic workflow construction, taming non-deterministic LLM output.

Land it: Be precise and honest about failure modes. Treat security as a first-class design constraint, not an afterthought.

Jamie Mackay

Office of the CEO · Burp / DAST strategy

Will judge

Systemic value. How the roving role feeds the cyclical vision — picking the right internal processes to automate for asymmetric returns.

Land it: Frame impact as freed human capital redeployed to "build better products", not lines of code.

Julian Philpott

Community & culture · Movember lead

Will judge

The human dimension. Niceness, empathy for legacy workflows, integrating into team micro-cultures with zero technological arrogance.

Land it: Show genuine curiosity about people. Respect the expert who knows the messy reality better than I do.

Lucy Davies

People & Talent · day contact

Will judge

Long-term fit through the "immune system" lens — EQ, resilience, and the capacity to handle constant context-switching as a Pioneer.

Land it: Be a continuously-evolving high performer who thrives on freedom but stays culturally aligned.

04 SwigFactor · 11 traits · the real bar

01 Niceness

Colleagues genuinely enjoy your company.

→ Build instant rapport with sceptical domain experts.

02 Humility

Anxious over-achiever, ego at the door.

→ Embed alongside experts, never above them.

03 Altruism

Genuine pleasure in helping others.

→ The whole role is altruistic — relieve others' grind.

04 Communication

Clear in writing and out loud.

→ Translate AI mechanics into language that drives adoption.

05 Emotional Intelligence

Self-aware, adapts to any personality.

→ Navigate the friction of workflows being made obsolete.

06 Aptitude for Growth

Learns fast, dives into the unknown.

→ New domain every two weeks — legal to finance.

07 Leadership Potential

Contextual, shared leadership.

→ Drive adoption by influence and value, no title.

08 Can-do

Glass-half-full, visualises success.

→ Believe an elegant solution exists in the mess.

09 Tenacity

Perseveres where there's no playbook.

→ Push through the final 20% that derails automation.

10 Energy

Infectious — without performing.

→ Genuinely lit up; pull sceptics into the future.

11 Conscientiousness

Pride and meticulous care.

→ Ship reliable, secure, evaluated artifacts.

! Immune system

Triggers on arrogance, command-and-control, info-hoarding, politics.

→ Any one of these can sink technical brilliance.

05 The Live Test · build, don't pitch

The task · deliberately unscoped

I won't be told the domain in advance

Call one was contracts, call two was recruitment — and the final task arrived with no brief at all. That's the test: the role is roving and tribe-agnostic, so they'll almost certainly hand me a surprise problem I've never seen. So I don't rehearse a domain. I bring a method that survives any domain — a repeatable, hyper-accelerated cycle I can drop onto whatever they put in front of me.

The one rule: no slides, no abstract strategy. The only proof is a working artifact a domain expert would actually use — built live, measured against how it was done before.

My proof · already built

The Agent Design Suite

I don't walk in empty-handed. I bring a working library of agent flows mapped to the Double Diamond and an EARS spec gate — the exact method below, already running.


📅 Sprint Map · double diamond
◈ Flow Board · interactive kanban
⬡ Agent Flows · wireframes
⇢ Pipeline · spec build

The 7-day hyper-accelerated cycle · mapped to Double Diamond + EARS

Days 1–2 · Context & data mining

Rapid immersion

maps to ↓

◆ DISCOVER · diverge

System access, historical logs, tickets, previous work. Define the problem scope and identify the historical ground truth — what "good" already looked like.

Deep Research · Data Extraction · Fuzzy Search

Days 3–4 · Seeding & prototyping

Build the judge from EARS

maps to ↓

◆ DEFINE · converge

Curate historical data without hand-creating it. Build an LLM-as-a-Judge straight from EARS requirements, seeding a 100-pair gold dataset from history. Gate = spec.ears.md.

Design & Prototype · EARS Review · spec.ears.md
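A minimal sketch of what a spec.ears.md gate could check before any judge is built from it. It assumes the commonly described EARS templates (ubiquitous "The ... shall", event-driven "When", state-driven "While", unwanted-behaviour "If ... then", optional-feature "Where"). The names `classify_ears` and `gate`, the regex patterns and the sample requirements are all hypothetical, not taken from the toolkit.

```python
import re
from typing import List, Optional

# Keyword-led EARS templates, checked in order. These patterns are an
# assumption based on the commonly described EARS forms, not the toolkit's gate.
EARS_PATTERNS = [
    ("unwanted", re.compile(r"^if\b.+\bthen\b.+\bshall\b", re.I)),
    ("event", re.compile(r"^when\b.+\bshall\b", re.I)),
    ("state", re.compile(r"^while\b.+\bshall\b", re.I)),
    ("optional", re.compile(r"^where\b.+\bshall\b", re.I)),
    ("ubiquitous", re.compile(r"^the\b.+\bshall\b", re.I)),
]


def classify_ears(requirement: str) -> Optional[str]:
    """Return the name of the EARS template a requirement matches, else None."""
    text = requirement.strip()
    for name, pattern in EARS_PATTERNS:
        if pattern.search(text):
            return name
    return None


def gate(requirements: List[str]) -> List[str]:
    """Return the requirements that fail the gate (match no EARS template)."""
    return [r for r in requirements if classify_ears(r) is None]


if __name__ == "__main__":
    spec = [
        "When a contract is uploaded, the tool shall extract the renewal date.",
        "Make it fast",
    ]
    print(gate(spec))  # lists the lines that need rewriting before the judge is built
```

A requirement that passes the gate has a trigger and a "shall" response, which is roughly what a judge prompt needs in order to give a binary verdict.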

Day 5 · Calibrated validation

Expert as final auditor

maps to ↓

◆ DEVELOP · diverge

The domain expert audits only the hardest edge cases. Align the automated judge's scores to human judgment until they agree ≥95% of the time; then the judge scales.

Build Orchestrator · Eval Framework
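One way the ≥95% calibration check might look, assuming both the judge and the expert give binary pass/fail verdicts on the same audited cases. The function names and the sample verdicts are hypothetical; only the 95% threshold comes from the plan above.

```python
from typing import List

AGREEMENT_TARGET = 0.95  # the >=95% threshold from the Day 5 step


def agreement_rate(judge: List[bool], expert: List[bool]) -> float:
    """Fraction of audited cases where the judge's verdict matches the expert's."""
    if len(judge) != len(expert):
        raise ValueError("judge and expert verdict lists must be the same length")
    if not judge:
        raise ValueError("need at least one audited case")
    matches = sum(j == e for j, e in zip(judge, expert))
    return matches / len(judge)


def disagreements(judge: List[bool], expert: List[bool]) -> List[int]:
    """Indices of cases to review with the expert and fold back into the judge."""
    return [i for i, (j, e) in enumerate(zip(judge, expert)) if j != e]


def judge_can_scale(judge: List[bool], expert: List[bool]) -> bool:
    """True once agreement reaches the target."""
    return agreement_rate(judge, expert) >= AGREEMENT_TARGET


if __name__ == "__main__":
    expert = [True] * 20
    judge = [True] * 19 + [False]  # one disagreement in twenty
    print(agreement_rate(judge, expert), disagreements(judge, expert))
```

Raw agreement can look flattering when almost every case passes, so it is worth checking the disagreement list itself rather than the headline number alone.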

Days 6–7 · Optimise & deploy

Verify ROI, ship

maps to ↓

◆ DELIVER · converge

Final regression checks against the seeded gold dataset, verify the ROI, integrate the full pipeline to production — monitoring, runbooks & handoff included.

Monitoring · Documentation · Evals

Evaluation: one centralized loop

CENTRALIZED > SILOED

1. One system, not many. Tracing, automated scoring and human review live together, so production failures automatically become new CI/CD test cases. Siloed tools cause whack-a-mole regressions.

2. Scale human judgment. One domain expert, a "benevolent dictator", gives binary pass/fail critiques that program the judges every department reuses.

3. The 80/20 rule. Automate the 80% that's boilerplate and repetitive; route the 20% of novel complexity to the human expert.

4. Watch baseline drift. Centralized telemetry flags review times creeping up, catching operational decay even when technical metrics look pristine.
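The baseline-drift point can be sketched as a simple window comparison over logged expert review times. The window size, the 20% tolerance and the function name are assumptions for illustration; a real setup would tune them against its own telemetry.

```python
from statistics import mean
from typing import List


def review_time_drift(minutes: List[float], window: int = 5, tolerance: float = 0.2) -> bool:
    """Flag drift when the latest window's mean review time exceeds the
    first window's mean by more than `tolerance` (20% by default)."""
    if len(minutes) < 2 * window:
        return False  # not enough history to compare two full windows
    baseline = mean(minutes[:window])
    recent = mean(minutes[-window:])
    return recent > baseline * (1 + tolerance)


if __name__ == "__main__":
    creeping = [10, 10, 10, 10, 10, 12, 13, 14, 15, 16]
    print(review_time_drift(creeping))
```

The check looks only at how long the expert spends per review, so it can fire even while judge scores and pass rates look unchanged.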

Three layers of verification

Layer	Method	Cadence	Cost · speed
Unit tests	Deterministic — regex, JSON, schema	Daily	Fast & free
Integrated evals	LLM-as-a-Judge — golden dataset	Every PR	Moderate
Human review	HITL — critique shadowing	Weekly / spot	Slow & costly
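A minimal sketch of the deterministic layer in the table above: parse the model output as JSON, then check fields, types and one regex. The output contract (`summary`, `risk_level`, `review_date`) is hypothetical and would be replaced by whatever the real task produces.

```python
import json
import re
from typing import List

# Hypothetical output contract for illustration only: the field names,
# allowed values and date format are assumptions, not part of any real task.
REQUIRED_FIELDS = {"summary": str, "risk_level": str, "review_date": str}
ALLOWED_RISK = {"low", "medium", "high"}
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def check_output(raw: str) -> List[str]:
    """Return a list of failures; an empty list means the output passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(data, dict):
        return ["top level is not an object"]
    failures = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            failures.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            failures.append(f"wrong type: {field}")
    risk = data.get("risk_level")
    if isinstance(risk, str) and risk not in ALLOWED_RISK:
        failures.append("risk_level outside allowed values")
    date = data.get("review_date")
    if isinstance(date, str) and not ISO_DATE.match(date):
        failures.append("review_date is not YYYY-MM-DD")
    return failures


if __name__ == "__main__":
    print(check_output('{"summary": "ok", "risk_level": "low", "review_date": "2026-06-10"}'))
```

Checks like these need no model call, which is why they can run on every change before the slower judge and human layers.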

Operational success = T_baseline − T_assisted

The metric is the drop in cognitive load and time-in-motion for the expert — not LLM accuracy or token latency.
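The formula above as a small worked calculation. The 40-hour and 5-hour figures are illustrative numbers, not measurements, and `relative_saving` is an added convenience that the plan itself does not define.

```python
def operational_success(t_baseline: float, t_assisted: float) -> float:
    """Time saved per task: T_baseline - T_assisted (both in the same unit)."""
    if t_baseline < 0 or t_assisted < 0:
        raise ValueError("durations must be non-negative")
    return t_baseline - t_assisted


def relative_saving(t_baseline: float, t_assisted: float) -> float:
    """Saved time as a fraction of the baseline (hypothetical helper)."""
    if t_baseline <= 0:
        raise ValueError("baseline must be positive")
    return operational_success(t_baseline, t_assisted) / t_baseline


if __name__ == "__main__":
    # Illustrative figures only: 40 expert-hours before, 5 with the tool.
    print(operational_success(40.0, 5.0), relative_saving(40.0, 5.0))
```

A negative result means the assisted workflow takes the expert longer than before, which is the outcome accuracy metrics alone would not reveal.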

06 The Close · questions & tripwires

Calibrated questions to ask

? When the Pioneer compresses a legacy workflow from five days to five hours, how does PortSwigger prefer to redeploy that freed-up human capacity toward building better products?

? As I drop into domains that haven't cracked AI yet, what cultural resistance points have you seen when introducing automation, and how does leadership like those to be navigated?

? How do you keep OKRs measuring real outcomes over activity when a sprint's value is qualitative, like trust earned in a sceptical team?

The two retention questions (their lens)

"If they were considering an offer elsewhere, how hard would we try to keep them?"

"Knowing what we now know — would we still hire them?"

Both must land an enthusiastic yes. Everything I say and do today is read against these two.

no arrogance · no command-and-control · no info-hoarding · no politics · no laziness