Research Methodology

How Scores Are Calculated

From question responses to dimension scores and composite measures

The Scoring Framework

Each assessment produces a set of dimension scores (one per dimension, on a 0-to-100 scale) and a single composite score that summarises the overall result. These numbers are not subjective ratings: they are calculated mechanically from your responses using a consistent formula applied to every participant.

Scores are used for two purposes: assigning your archetype and enabling peer comparison. Understanding how they are calculated helps you interpret what a high or low score actually means.

Question Types and Their Role in Scoring

Five question types appear across the three studies. Each plays a different role in the scoring calculation.

Tradeoff Pairs

The primary scoring input. Each question presents two statements and asks which better describes your experience, on a scale from "Definitely A" to "Definitely B" with "Equal" in the middle. Internally, the responses are coded from -2 (strongly A) to +2 (strongly B).

Neither option is designed to be obviously correct. The forced choice between two valid approaches counters the acquiescence bias (the tendency to agree with everything) that undermines many agreement-based surveys. Your genuine preference is revealed by where you land when both options have merit.

Every tradeoff pair maps to one specific dimension. The dimension score is built primarily from the mean of these paired responses.
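The coding and averaging described above can be sketched in Python. This is an illustrative reconstruction: the labels for the intermediate response options ("Somewhat A" and "Somewhat B") are assumptions, since the text names only the endpoints and the midpoint.

```python
# Illustrative sketch of tradeoff coding. The intermediate option labels
# ("Somewhat A"/"Somewhat B") are assumed; the text names only the endpoints
# ("Definitely A"/"Definitely B") and the midpoint ("Equal").
TRADEOFF_CODES = {
    "Definitely A": -2,
    "Somewhat A": -1,
    "Equal": 0,
    "Somewhat B": 1,
    "Definitely B": 2,
}

def code_tradeoff(response: str) -> int:
    """Return the internal -2..+2 code for one tradeoff response."""
    return TRADEOFF_CODES[response]

def dimension_raw_mean(responses: list[str]) -> float:
    """Mean of the coded tradeoff responses mapped to one dimension."""
    codes = [code_tradeoff(r) for r in responses]
    return sum(codes) / len(codes)
```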

Scenario Questions

Realistic workplace situations with three follow-up questions each. Scenarios use the same -2 to +2 response scale as tradeoffs, but they contribute to dimension scores at a reduced weight (0.3 to 0.5 of a tradeoff's weight) because they measure behavioural tendency rather than direct preference.

Each scenario question is followed by a confidence probe: "How confident are you in this answer?" The confidence rating is not included in dimension scores but is used to detect patterns across the assessment.

MaxDiff Questions

From a list of statements, you select which is most like your experience and which is least like your experience. MaxDiff produces sharper differentiation than rating scales because it forces a genuine ranking rather than allowing everything to cluster near the top.

MaxDiff responses do not feed directly into dimension scores. They are used as signals in archetype assignment, helping to differentiate between profiles that have similar dimension scores but different underlying patterns.

Likert Items

Agreement statements rated on a five-point scale from Strongly Disagree to Strongly Agree. Likert items provide absolute intensity measures that complement the relative comparisons from tradeoff pairs.

Like MaxDiff, Likert responses do not feed directly into dimension scores. They serve as modifiers and tiebreakers in archetype assignment, and they are used to detect paradoxes where stated attitudes contradict measured behaviour.

Confidence Probes

A three-point response collected after each scenario question: Easy (clear, settled orientation), Hard to decide (genuinely torn), or Neither fits (the framework does not match your situation). Six probes are collected per study.

Confidence patterns reveal where you are certain and where you face genuine ambiguity. A cluster of "Hard to decide" responses on a particular dimension signals a real tension point worth exploring.

How Dimension Scores Are Calculated

All dimension scores follow the same general pattern: start with the mean of the tradeoff pairs that map to that dimension, add weighted contributions from relevant scenario questions, then normalise the result to a 0-to-100 scale.

The Calculation in Plain English

  1. Average the tradeoffs. For each dimension, take the mean of the two or three tradeoff responses that belong to it. This gives a raw score between -2 and +2.
  2. Add scenario contributions. Relevant scenario questions contribute at a fraction of their face value (typically 0.3 of a tradeoff). This shifts the raw score slightly based on how you respond to realistic workplace situations.
  3. Normalise to 0-100. The raw score (now covering a wider range, because scenarios extended it) is rescaled so that the minimum possible raw score maps to 0 and the maximum maps to 100.

Scores are always clamped to the 0-100 range. It is not possible to score below 0 or above 100.
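Under the stated assumptions (a 0.3 scenario weight and a raw scale that is symmetric around zero), the three steps plus clamping can be sketched as follows. The exact formula is a reconstruction from the description above, not the published implementation.

```python
def dimension_score(tradeoffs, scenarios, scenario_weight=0.3):
    """Illustrative 0-100 dimension score from coded responses.

    tradeoffs: -2..+2 codes for the dimension's tradeoff pairs
    scenarios: -2..+2 codes for the relevant scenario questions
    scenario_weight: fraction of a tradeoff's weight (0.3-0.5 in the text)
    """
    # Step 1: mean of the tradeoff codes gives a raw score in [-2, +2].
    raw = sum(tradeoffs) / len(tradeoffs)
    # Step 2: add the down-weighted scenario mean, widening the raw range.
    if scenarios:
        raw += scenario_weight * (sum(scenarios) / len(scenarios))
    # Step 3: rescale so the minimum possible raw score maps to 0 and the
    # maximum to 100, then clamp to the 0-100 range.
    max_raw = 2 + (2 * scenario_weight if scenarios else 0)
    score = (raw + max_raw) / (2 * max_raw) * 100
    return max(0.0, min(100.0, score))
```

In this sketch, a respondent who answers every question at the B extreme scores 100, a perfectly balanced respondent scores 50, and clamping guarantees the published 0-100 bounds.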

What the 0-100 Scale Means

Each dimension has a direction. One end of the scale (low scores, nearer to 0) represents one pattern of responses; the other end (high scores, nearer to 100) represents the opposite pattern. The specific meaning of each pole depends on the study.

A score of 50 means your responses were evenly balanced between the two poles. It is not a neutral or average score in a normative sense: it simply means you did not lean consistently toward either end.

Per-Study Scoring Details

Study 1: Will AI Replace Me? (AI Vulnerability)

Four dimensions are scored from 12 tradeoff pairs (three per dimension) plus six scenario questions. Each dimension covers a different facet of role exposure to AI disruption.

Task Exposure
  Low score: production-focused work (creating content, running processes)
  High score: curation-focused work (selecting, evaluating, synthesising)

Skill Replaceability
  Low score: routine, pattern-based skills (easier for AI to replicate)
  High score: novel, synthesis-based skills (harder for AI to replicate)

Adaptation Speed
  Low score: individual execution focus (working alone on defined tasks)
  High score: coordination focus (enabling others, bridging functions)

Organisational Buffer
  Low score: explicit, documentable knowledge (can be codified and automated)
  High score: tacit, judgement-based knowledge (harder to replicate)

Composite: Vulnerability Index. The four dimension scores are combined with different weights to produce a single Vulnerability Index from 0 to 100. The index is designed so that higher B-pole scores (curation, novelty, coordination, tacit knowledge) reduce vulnerability. Task Exposure carries the largest weight (30%), followed by Skill Replaceability and Adaptation Speed (25% each), and Organisational Buffer (20%). A high Vulnerability Index indicates greater exposure to AI disruption. A low index indicates stronger natural defences.
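With the weights stated above, the index might be computed as in the sketch below. Inverting each dimension (100 minus its score) encodes the stated design that higher B-pole scores reduce vulnerability; the exact published formula may differ.

```python
# Weights from the text. Treating vulnerability as the weighted average of
# the inverted dimension scores is an assumption consistent with the stated
# design (higher B-pole scores reduce vulnerability), not the published formula.
VULNERABILITY_WEIGHTS = {
    "task_exposure": 0.30,
    "skill_replaceability": 0.25,
    "adaptation_speed": 0.25,
    "organisational_buffer": 0.20,
}

def vulnerability_index(scores: dict[str, float]) -> float:
    """Weighted 0-100 index: higher means greater exposure to AI disruption."""
    return sum(w * (100 - scores[dim])
               for dim, w in VULNERABILITY_WEIGHTS.items())
```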

Study 2: Are We Adopting AI Fast Enough? (AI Adoption)

Three main dimensions are scored from 10 tradeoff pairs plus six scenario questions. A fourth dimension, Future Orientation, is derived separately and excluded from the composite.

Usage Depth
  Low score: AI is embedded in existing tools and processes (IT-selected)
  High score: AI is self-selected and used autonomously outside standard workflows

Tool Breadth
  Low score: AI impact is individual (your own productivity only)
  High score: AI impact extends to the team (shared workflows, coordination)

Integration Level
  Low score: AI used for predictable, checklist-driven tasks
  High score: AI used adaptively for exploration, reasoning, and judgement

Future Orientation
  Low score: assumes better AI tools will solve coordination problems automatically
  High score: sees structural design as essential (AI works best when workflows are deliberately built around it)

Composite: Simple average. The composite for Study 2 is the straightforward average of Usage Depth, Tool Breadth, and Integration Level. Future Orientation is excluded because it is diagnostic rather than a direct measure of current adoption depth. A high composite indicates mature, broad AI integration. A low composite indicates early-stage or narrow adoption.
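As a minimal sketch of the Study 2 composite:

```python
def adoption_composite(usage_depth: float, tool_breadth: float,
                       integration_level: float) -> float:
    """Study 2 composite: the plain mean of the three scored dimensions.
    Future Orientation is deliberately excluded, as described above."""
    return (usage_depth + tool_breadth + integration_level) / 3
```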

Study 3: What's Holding Back My Use of AI? (Structural Friction)

Three friction dimensions are scored from 10 tradeoff pairs plus six scenario questions. Study 3 uses a different scoring logic from the other two studies.

Each tradeoff pair in Study 3 pits two friction types against each other: Activation versus Knowledge, Knowledge versus Decision, or Activation versus Decision. Your response determines how much each friction type receives from that pair. Leaning strongly toward one option does not mean the other is absent: it means one is more prominent than the other in your experience.
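One plausible reading of this allocation, sketched below, splits each pairwise response linearly between the two friction types it pits against each other. The linear split is an assumption: the text does not specify the exact rule, and each friction type also accumulates credit from its other pairings across the study.

```python
def allocate_pair(response: int) -> tuple[float, float]:
    """Split one Study 3 pairwise response between its two friction types.

    response: -2 (strongly the A-side friction) .. +2 (strongly the B-side).
    Returns (a_share, b_share), which sum to 1. The linear split is an
    illustrative assumption, not the published allocation rule.
    """
    b_share = (response + 2) / 4
    return (1 - b_share, b_share)
```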

Activation Friction measures barriers to getting started: waiting for approvals, chasing people, and coordination overhead before work can begin.

Knowledge Friction measures gaps in accessible knowledge: scattered information, expertise trapped in specific people, and documentation that does not exist.

Decision Friction measures constraints on decisions: reasoning that was never recorded, decisions that get revisited because stakeholders were excluded, and conflicting directions.

Composite: Maximum friction score. The composite for Study 3 is the highest of the three friction dimension scores, not the average. This design reflects a key insight: the dominant friction type defines the overall friction experience. If Activation Friction is 85 and the other two are 30, the composite is 85. Averaging would obscure the severity of the dominant barrier.
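The max rule itself is straightforward:

```python
def friction_composite(activation: float, knowledge: float,
                       decision: float) -> float:
    """Study 3 composite: the dominant friction score, not the average."""
    return max(activation, knowledge, decision)
```

Using the text's own example, scores of 85, 30, and 30 give a composite of 85, where a plain average would report roughly 48 and understate the dominant barrier.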

Score Labels

Dimension scores and composite scores receive descriptive labels to make them easier to interpret. These labels are the same across all studies for dimension scores, and study-specific for composite scores.

Dimension Score Labels

0-39 (Low): your responses lean toward the A-pole of this dimension; the B-pole characteristics are not prominent in your profile.
40-69 (Moderate): your responses are balanced or mixed for this dimension; it plays a role in your profile without defining it.
70-100 (High): your responses lean strongly toward the B-pole; this dimension is a defining characteristic of your profile.
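The label mapping can be expressed directly. How fractional scores at a band boundary (e.g. 39.5) are handled is an assumption; the thresholds themselves come from the table above.

```python
def dimension_label(score: float) -> str:
    """Map a 0-100 dimension score to its descriptive label.
    Boundary handling for fractional scores is assumed."""
    if score < 40:
        return "Low"
    if score < 70:
        return "Moderate"
    return "High"
```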

Composite Score Labels

See the Benchmarks page for composite score label thresholds and what each level means for each study.

What Scores Are Not

Scores are not judgements of quality or performance. A high Vulnerability Index does not mean you are a poor performer: it means your current role overlaps significantly with AI capabilities and adaptation is strategically important. A low AI Adoption composite does not mean you are behind: it may reflect deliberate choices about where AI adds value for your work.

No single score is a complete picture. Profiles are assigned from patterns across all four dimensions, not from the composite alone. Two people with the same composite can have very different dimension patterns, and very different archetypes as a result. The Archetypes page explains how the combination of dimension scores determines which profile you receive.

Scores are calculated from self-report. The multi-method design (tradeoffs, scenarios, MaxDiff, Likert) reduces but does not eliminate self-report bias. Results are most useful as a structured starting point for reflection, not as a definitive external measurement.