4
Stages on the day
task · lunch · swig · final
4
On the panel
davies · mackay · philpott · craig
11
SwigFactor traits
culture is the bar
1
Live build to ship
artifact, not slides
01 The Journey · arrive calm
Macclesfield → Booths Park
~20 min drive · A537 westbound via Monk's Heath
SK11 7DG
Brookfield Lane, Macclesfield
A537 W
straight rural road
Monk's Heath
A34 crossroads · AM congestion
A537 → A50
into Knutsford
WA16 8ZS
6 Booths Park, Chelford Rd
Depart
09:35 am
Target arrival
10:10 am
First activity
10:15 am
Buffer built in
35 min window for a ~20 min drive (~15 min spare)
On arrival · 220-acre estate

Booths Park isn't one office block — it's parkland with historic and modern buildings. Don't let the scale eat the clock.

  1. Turn in off Chelford Rd.
    Follow the estate road to visitor parking — free, onsite.
  2. Sign in at Booths Hall.
    Main reception. Vehicle reg is taken here.
  3. Get walked to Springwood.
    The wing housing the technical teams & meeting rooms.
  4. Settle for ~5 min.
    Water, breathe, phone on silent. Land calm, not cortisol-spiked.
02 Run of Day · 10:15 → 15:00
10:15
~90 min
Task-Based Activity
The Live Test with Lucy Davies · Jamie Mackay · Simon Wood · Adam Spencer
The hands-on build — and they gave no brief in advance. Expect a surprise, unscoped problem dropped on me cold, chosen to fit the roving scope of the role. The job isn't a clever demo: it's spotting the agentic opportunity inside someone's messy manual process and showing how I'd turn it into a working tool. See the method below.
Assessing: Build instinct · Context acquisition · Agentic thinking · Prompt-injection awareness
Play: Solve what they need, not just what they ask. Talk through trust boundaries — this is a security firm. Ship something real by the end of the slot.
12:00
~60 min
Behavioural Lunch
On-site restaurant · host Julian Philpott
Unstructured, relaxed — and the most quietly decisive hour of the day. PortSwigger filters hard for people they actually enjoy being around. This is the "Niceness" test in the wild.
Assessing: Niceness · Emotional intelligence · Cultural fit
Play: Shift focus outward — ask about their experience of "Have fun", the tribes, Movember. Warmth to everyone, catering staff included. Decompress without dropping the antenna.
13:00
~60 min
'SwigFactor' Interview
Culture crucible · behavioural · 11 core traits
The dedicated cultural-alignment session. Every story should map to the eleven SwigFactor traits — humility, altruism, tenacity, EQ. Lead with collaborative wins and admitted knowledge gaps, never intellectual superiority.
Assessing: Humility & ego-at-the-door · Tenacity on unscoped work · Winning over a sceptic
Play: Have 3 ready: subordinated my ego to a non-technical expert · refused to quit an undocumented problem · turned an automation-resistant stakeholder into an advocate.
14:00
→ 15:00 wrap
Final Synthesising Interview
Strategic close with Alex Craig
Synthesis of the day: strategic alignment, technical validity, any latent red flags. Open by showing how earlier feedback already moved my thinking — that proves coachability. Reaffirm I thrive roving, with no need for a permanent team or "home".
Assessing: Strategic maturity · Technical depth · Comfort with the roving identity
Play: Close with calibrated questions (see below). Energy without performing — quietly, genuinely lit up by the work.
03 The Panel · tailor to each
AC
Alex Craig
Research & Engineering · King's Award DNA
Will judge
Technical validity. Foundational AI mechanics, prompt-injection risk, agentic workflow construction, taming non-deterministic LLM output.
Land it: Be precise and honest about failure modes. Treat security as a first-class design constraint, not an afterthought.
JM
Jamie Mackay
Office of the CEO · Burp / DAST strategy
Will judge
Systemic value. How the roving role feeds the cyclical vision — picking the right internal processes to automate for asymmetric returns.
Land it: Frame impact as freed human capital redeployed to "build better products", not lines of code.
JP
Julian Philpott
Community & culture · Movember lead
Will judge
The human dimension. Niceness, empathy for legacy workflows, integrating into team micro-cultures with zero technological arrogance.
Land it: Show genuine curiosity about people. Respect the expert who knows the messy reality better than I do.
LD
Lucy Davies
People & Talent · day contact
Will judge
Long-term fit through the "immune system" lens — EQ, resilience, and the capacity to handle constant context-switching as a Pioneer.
Land it: Be a continuously-evolving high performer who thrives on freedom but stays culturally aligned.
04 SwigFactor · 11 traits · the real bar
01 Niceness
Colleagues genuinely enjoy your company.
Build instant rapport with sceptical domain experts.
02 Humility
Anxious over-achiever, ego at the door.
Embed alongside experts, never above them.
03 Altruism
Genuine pleasure in helping others.
The whole role is altruistic — relieve others' grind.
04 Communication
Clear in writing and out loud.
Translate AI mechanics into language that drives adoption.
05 Emotional Intelligence
Self-aware, adapts to any personality.
Navigate the friction of workflows being made obsolete.
06 Aptitude for Growth
Learns fast, dives into the unknown.
New domain every two weeks — legal to finance.
07 Leadership Potential
Contextual, shared leadership.
Drive adoption by influence and value, no title.
08 Can-do
Glass-half-full, visualises success.
Believe an elegant solution exists in the mess.
09 Tenacity
Perseveres where there's no playbook.
Push through the final 20% that derails automation.
10 Energy
Infectious — without performing.
Genuinely lit up; pull sceptics into the future.
11 Conscientiousness
Pride and meticulous care.
Ship reliable, secure, evaluated artifacts.
! Immune system
Triggers on arrogance, command-and-control, info-hoarding, politics.
Any one of these can sink technical brilliance.
05 The Live Test · build, don't pitch
The task · deliberately unscoped

I won't be told the domain in advance

Call one was contracts, call two was recruitment — and the final task arrived with no brief at all. That's the test: the role is roving and tribe-agnostic, so they'll almost certainly hand me a surprise problem I've never seen. So I don't rehearse a domain. I bring a method that survives any domain — a repeatable, hyper-accelerated cycle I can drop onto whatever they put in front of me.

The one rule: no slides, no abstract strategy. The only proof is a working artifact a domain expert would actually use — built live, measured against how it was done before.
My proof · already built

The Agent Design Suite

I don't walk in empty-handed. I bring a working library of agent flows mapped to the Double Diamond and an EARS spec gate — the exact method below, already running.

The 7-day hyper-accelerated cycle · mapped to Double Diamond + EARS
Days 1–2 · Context & data mining
Rapid immersion
maps to ↓
◆ DISCOVER · diverge

System access, historical logs, tickets, previous work. Define the problem scope and identify the historical ground truth — what "good" already looked like.

Deep Research · Data Extraction · Fuzzy Search
Days 3–4 · Seeding & prototyping
Build the judge from EARS
maps to ↓
◆ DEFINE · converge

Curate the dataset from historical records rather than hand-writing examples. Build an LLM-as-a-Judge straight from the EARS requirements and seed a 100-pair gold dataset from that history. Gate = spec.ears.md.

Design & Prototype · EARS Review · spec.ears.md
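A minimal sketch of how a judge rubric could be derived from EARS requirements. It assumes one requirement per line written in the standard EARS templates ("When X, the Y shall Z", and so on); the actual layout of spec.ears.md is not described here, and `parse_ears`, `judge_prompt` and `Requirement` are hypothetical names used for illustration only.

```python
import re
from dataclasses import dataclass
from typing import Optional

# EARS keyword -> template name. No keyword means the "ubiquitous" template.
_PREFIXES = {
    "when": "event-driven",
    "while": "state-driven",
    "if": "unwanted-behaviour",
    "where": "optional-feature",
}

# Optional "<keyword> <condition>," prefix, then "the <system> shall <response>".
_EARS = re.compile(
    r"^(?:(?P<kw>when|while|if|where)\s+(?P<cond>.+?),\s*(?:then\s+)?)?"
    r"the\s+(?P<system>.+?)\s+shall\s+(?P<response>.+?)\.?$",
    re.IGNORECASE,
)


@dataclass
class Requirement:
    pattern: str
    condition: Optional[str]
    system: str
    response: str


def parse_ears(line: str) -> Optional[Requirement]:
    """Parse one EARS-style sentence; return None if it does not fit a template."""
    m = _EARS.match(line.strip())
    if m is None:
        return None
    kw = m.group("kw")
    pattern = _PREFIXES[kw.lower()] if kw else "ubiquitous"
    return Requirement(pattern, m.group("cond"), m.group("system"), m.group("response"))


def judge_prompt(req: Requirement) -> str:
    """Turn one requirement into a binary pass/fail question for the judge."""
    context = f" Context: {req.condition}." if req.condition else ""
    return (
        f"Requirement: the {req.system} shall {req.response}.{context} "
        "Does the output satisfy it? Answer PASS or FAIL."
    )
```

Lines that return `None` are the useful signal at the gate: a requirement that does not parse is probably too vague to judge against.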
Day 5 · Calibrated validation
Expert as final auditor
maps to ↓
◆ DEVELOP · diverge

The domain expert audits only the hardest edge cases. Align the automated judge's scores to human judgment until they agree on ≥95% of cases; only then does the judge scale.

Build Orchestrator · Eval Framework
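The ≥95% calibration gate can be sketched as a plain agreement rate over binary verdicts. This is an illustration under assumptions: verdicts are stored as booleans (True = pass), the function names are hypothetical, and raw agreement is used rather than a chance-corrected statistic.

```python
from typing import List, Sequence


def agreement_rate(judge: Sequence[bool], human: Sequence[bool]) -> float:
    """Fraction of cases where the automated judge matches the human verdict."""
    if len(judge) != len(human) or not judge:
        raise ValueError("need two non-empty verdict lists of equal length")
    matches = sum(j == h for j, h in zip(judge, human))
    return matches / len(judge)


def judge_is_calibrated(
    judge: Sequence[bool], human: Sequence[bool], threshold: float = 0.95
) -> bool:
    """True once agreement reaches the threshold (0.95 mirrors the target above)."""
    return agreement_rate(judge, human) >= threshold


def disagreements(judge: Sequence[bool], human: Sequence[bool]) -> List[int]:
    """Indices the expert should look at next: where judge and human differ."""
    return [i for i, (j, h) in enumerate(zip(judge, human)) if j != h]
```

The `disagreements` list is what goes back to the expert, so their time stays on the cases where the judge is still wrong.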
Days 6–7 · Optimise & deploy
Verify ROI, ship
maps to ↓
◆ DELIVER · converge

Final regression checks against the seeded gold dataset, verify the ROI, integrate the full pipeline to production — monitoring, runbooks & handoff included.

Monitoring · Documentation · Evals
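A sketch of the final regression check against the seeded gold dataset, assuming the dataset is a list of (input, expected output) pairs. `run_regression`, `pipeline` and `matches` are hypothetical names; in practice `matches` could be an exact comparison or a call to the calibrated judge.

```python
from typing import Callable, List, Sequence, Tuple


def run_regression(
    gold_pairs: Sequence[Tuple[str, str]],
    pipeline: Callable[[str], str],
    matches: Callable[[str, str], bool],
    min_pass_rate: float = 1.0,
) -> Tuple[bool, float, List[str]]:
    """Re-run every gold input and report (ok, pass_rate, failing_inputs)."""
    if not gold_pairs:
        raise ValueError("gold dataset is empty")
    # Collect the inputs whose fresh output no longer matches the expected one.
    failures = [
        inp for inp, expected in gold_pairs
        if not matches(pipeline(inp), expected)
    ]
    pass_rate = 1 - len(failures) / len(gold_pairs)
    return pass_rate >= min_pass_rate, pass_rate, failures
```

The default `min_pass_rate` of 1.0 is an assumption (no regressions tolerated before deploy); a lower bar would need to be agreed with the domain expert.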

Evaluation: one centralized loop

CENTRALIZED > SILOED
1. One system, not many. Tracing, automated scoring and human review live together, so production failures automatically become new CI/CD test cases. Siloed tools cause whack-a-mole regressions.
2. Scale human judgment. One domain expert, a "benevolent dictator", gives binary pass/fail critiques that program the judges every department reuses.
3. The 80/20 rule. Automate the 80% that's boilerplate and repetitive; route the 20% of novel complexity to the human expert.
4. Watch baseline drift. Centralized telemetry flags review times creeping up, catching operational decay even when technical metrics look pristine.
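The baseline-drift check in point 4 could look something like this. It is a sketch under assumptions: review times are logged in minutes per item, and the 20% tolerance is an illustrative figure, not one taken from these notes.

```python
from statistics import mean
from typing import Sequence, Tuple


def review_time_drift(
    baseline_minutes: Sequence[float],
    recent_minutes: Sequence[float],
    tolerance: float = 0.20,
) -> Tuple[bool, float]:
    """Compare recent expert review times with the baseline window.

    Returns (drifted, ratio), where ratio is recent mean / baseline mean and
    drifted is True when reviews are taking more than `tolerance` longer.
    statistics.mean raises StatisticsError if either window is empty.
    """
    ratio = mean(recent_minutes) / mean(baseline_minutes)
    return ratio > 1 + tolerance, ratio
```

A flag here is a prompt to go and sit with the expert, since accuracy metrics alone would not show that the tool has started costing them time.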

Three layers of verification

Layer | Method | Cadence | Cost · speed
Unit tests | Deterministic: regex, JSON, schema | Daily | Fast & free
Integrated evals | LLM-as-a-Judge on the golden dataset | Every PR | Moderate
Human review | HITL critique shadowing | Weekly / spot | Slow & costly
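As an illustration of the first layer, here is a deterministic check on a model's raw output using only JSON parsing, a tiny schema and a regex. The schema (a `verdict` and a `reason` field) is a hypothetical example, not a format defined in these notes.

```python
import json
import re
from typing import List

# Hypothetical output contract: both keys required, both strings.
REQUIRED_KEYS = {"verdict": str, "reason": str}
VERDICT_RE = re.compile(r"PASS|FAIL")


def check_output(raw: str) -> List[str]:
    """Return a list of problems with the raw output; an empty list means it passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    errors = []
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in data:
            errors.append(f"missing key: {key}")
        elif not isinstance(data[key], expected_type):
            errors.append(f"wrong type for {key}")

    verdict = data.get("verdict")
    # fullmatch rejects near-misses such as "PASS " or "pass".
    if isinstance(verdict, str) and not VERDICT_RE.fullmatch(verdict):
        errors.append("verdict must be PASS or FAIL")
    return errors
```

Checks like this need no model call, which is why they can run on every change before the slower judge and human layers.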
Operational success = T_baseline − T_assisted
The metric is the drop in cognitive load and time-in-motion for the expert — not LLM accuracy or token latency.
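A worked example of the metric, using the five-days-to-five-hours compression from the closing questions as illustrative figures. It assumes an 8-hour working day (so five days = 40 hours); that conversion and the function name are my assumptions.

```python
from typing import Tuple


def operational_success(t_baseline_hours: float, t_assisted_hours: float) -> Tuple[float, float]:
    """Return (hours saved, fraction of baseline time saved) for one workflow."""
    if t_baseline_hours <= 0:
        raise ValueError("baseline time must be positive")
    saved = t_baseline_hours - t_assisted_hours
    return saved, saved / t_baseline_hours


# Five 8-hour days before, five hours with the tool.
saved_hours, saved_fraction = operational_success(5 * 8, 5)
print(saved_hours, saved_fraction)  # 35 0.875
```

Both inputs have to be measured the same way, on the expert's real work, for the subtraction to mean anything.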
06 The Close · questions & tripwires

Calibrated questions to ask

? When the Pioneer compresses a legacy workflow from five days to five hours, how does PortSwigger prefer to redeploy that freed-up human capacity toward building better products?
? As I drop into domains that haven't cracked AI yet, what cultural resistance points have you seen when introducing automation, and how does leadership like to see those navigated?
? How do you keep OKRs measuring real outcomes over activity when a sprint's value is qualitative, such as trust earned in a sceptical team?

The two retention questions (their lens)

"If they were considering an offer elsewhere, how hard would we try to keep them?"
"Knowing what we now know — would we still hire them?"

Both must land an enthusiastic yes. Everything I say and do today is read against these two.

no arrogance · no command-and-control · no info-hoarding · no politics · no laziness