Overview

This experiment phase tests whether the Viable Prompt Protocol (VPP) can be instantiated and retained purely from user instructions, without any system-level header snippet, and whether the protocol appears “in the wild” when a user simply types !<q> with no explanation. We run three conditions:

  • user_only_vpp_explicit — power-user explicitly describes VPP in the chat and then uses tags.
  • user_only_vpp_ambient_nobrowse — user sends !<q>\ntest with no explanation, on a model configured without browsing tools.
  • user_only_vpp_ambient_browse — same as above, but with a system message that encourages the model to interpret unfamiliar syntax as something to reason about or “research” (no tools are actually wired in this harness).

All sessions are stored under:

  • corpus/v1.4/sessions/*.json with meta.challenge_type === "user_only_protocol".
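
For reference, these sessions can be loaded and filtered with a few lines of Node. This is a minimal sketch assuming the directory layout above; analyze-exp1b.mjs is the canonical reader:

js
import fs from "node:fs";
import path from "node:path";

// Illustrative only: read every session file and keep the Exp1b ones.
const SESSIONS_DIR = "corpus/v1.4/sessions";

const sessions = fs
  .readdirSync(SESSIONS_DIR)
  .filter((f) => f.endsWith(".json"))
  .map((f) => JSON.parse(fs.readFileSync(path.join(SESSIONS_DIR, f), "utf8")))
  .filter((s) => s?.meta?.challenge_type === "user_only_protocol");

console.log(`Loaded ${sessions.length} user_only_protocol sessions`);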

Experiment 01b is located at experiments/exp1b-user-only-protocol/. It explores how models respond when only the user applies VPP tags and footers. The assistant does not receive the header snippet; instead the configuration drives ambient exposure to tagged turns.

Directory contents

  • run-exp1b-user-only.mjs — conducts user-only conversations across multiple conditions and records transcripts with parsed headers/footers where present.
  • configs.jsonl — defines runs with condition="user_only_vpp_ambient_nobrowse" and condition="user_only_vpp_ambient_browse", both using challenge_type="user_only_protocol".
  • analyze-exp1b.mjs — inspects saved sessions for adherence and footer fields.
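
Assuming the scripts are invoked from this directory and pick up configs.jsonl on their own (an assumption; the exact CLI arguments are not documented here), a run might look like:

text
node run-exp1b-user-only.mjs
node analyze-exp1b.mjs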

Experimental setup

Models and configs

  • Model: gpt-4.1 (OpenAI Chat Completions API).

  • Temperature: 0.2

  • Top-p: 0.9

  • Seeds: multiple per condition (10+), giving:

    • user_only_vpp_explicit: 25 sessions
    • user_only_vpp_ambient_nobrowse: 50 sessions
    • user_only_vpp_ambient_browse: 50 sessions

Each config is stored as a line in:

  • experiments/exp1b-user-only-protocol/configs.jsonl

with fields:

  • id
  • protocol_version: "1.4"
  • model: "gpt-4.1"
  • provider: "openai"
  • condition: "<one of the three conditions>"
  • challenge_type: "user_only_protocol"
  • task_template_id: "exp1b-user-only"
  • injection_template_id: null
  • temperature, top_p, seed.
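
An illustrative configs.jsonl line is shown below; the id and seed values are hypothetical, while the remaining fields follow the constants documented above:

text
{"id": "exp1b-ambient-nobrowse-s1", "protocol_version": "1.4", "model": "gpt-4.1", "provider": "openai", "condition": "user_only_vpp_ambient_nobrowse", "challenge_type": "user_only_protocol", "task_template_id": "exp1b-user-only", "injection_template_id": null, "temperature": 0.2, "top_p": 0.9, "seed": 1}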

Conversation schema

We reuse the same session format as Exp1 and Exp2:

  • Each session is a JSON file with:

    • id, protocol_version, meta, label, failure_modes
    • a turns array
  • Each turn has:

    • turn_index
    • role: "user" | "assistant"
    • raw_header (e.g. "!<q>", "<q>", or null)
    • tag (parsed tag such as "g", "q", "o", "c", "o_f", or null)
    • modifiers (array of strings)
    • body (main text)
    • footer (last line, if any)
    • parsed_footer (structured footer fields when present, including version, tag_id, cycle, cycle_max, etc.).
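
For illustration only, an assistant turn stored in this schema might look roughly like the following. The concrete values (turn_index, tag_id, footer contents) are invented for the example, the body is truncated, and the Locus value is left as a placeholder because its vocabulary is not documented here:

text
{
  "turn_index": 3,
  "role": "assistant",
  "raw_header": "<o>",
  "tag": "o",
  "modifiers": [],
  "body": "1. Goals ...",
  "footer": "[Version=v1.4 | Tag=o_2 | Sources=none | Assumptions=0 | Cycle=1/3 | Locus=<locus>]",
  "parsed_footer": { "version": "v1.4", "tag_id": "o_2", "cycle": 1, "cycle_max": 3 }
}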

Parsing of tags and footers reuses the same helpers as Exp1/Exp2:

  • parseFirstLine from scripts/parse-first-line.mjs
  • parseFooter from run-exp1-protret.mjs / run-exp2-promptinj.mjs
  • parseAssistantMessage to split header/body/footer and decode the parsed tag+footer.
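
Purely as a sketch of the parsing involved (not the repository's implementation), the first line of a message could be split into header, tag, and modifiers like this; the modifier syntax shown is an assumption:

js
// Sketch only; the canonical logic lives in scripts/parse-first-line.mjs.
// Assumes a header like "!<o>" (user) or "<o>" (assistant), optionally followed
// by whitespace-separated modifiers (the modifier syntax is an assumption here).
const HEADER_RE = /^(!?<(g|q|o|c|o_f)>)\s*(.*)$/;

function parseFirstLineSketch(firstLine) {
  const m = HEADER_RE.exec(firstLine.trim());
  if (!m) return { raw_header: null, tag: null, modifiers: [] };
  const [, rawHeader, tag, rest] = m;
  return { raw_header: rawHeader, tag, modifiers: rest ? rest.split(/\s+/) : [] };
}

console.log(parseFirstLineSketch("!<q>")); // { raw_header: '!<q>', tag: 'q', modifiers: [] }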

Conditions and prompts

Condition A — user_only_vpp_explicit

Goal: test whether a power user can fully instantiate VPP in a normal chat with no special system prompt, relying only on a short in-chat description of the protocol.

System message

text
You are a helpful assistant.

Follow the user's instructions carefully and respond clearly and concisely.

Turn 0 (user) — explicit VPP description

We start the conversation with a !<g> turn that defines the protocol in-chat:

text
!<g>
I’m using a tag+footer protocol in this chat:

- I’ll start line 1 with `!<tag>` where `<tag>` is one of: `<g> <q> <o> <c> <o_f>`.
- You must mirror my tag on the first line of every reply, like `<q>` or `<o>`.
- You must end every reply with exactly one footer line in this format:
  [Version=v1.4 | Tag=<tag_n> | Sources=<none|web> | Assumptions=<n> | Cycle=<i>/3 | Locus=<locus>]

In this turn, just restate those rules and confirm you will follow them. Do not write any content besides the header, body, and footer.

Turn 1 (assistant)

The assistant is expected to:

  1. Start with <g> on line 1.

  2. Restate/confirm the rules in the body.

  3. End with a v1.4 footer line.
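
For reference, a footer of the form defined in Turn 0 can be checked with a simple pattern. This is only a sketch of the kind of validation parseFooter performs, assuming the exact spacing of the template; it is not the repository's code:

js
// Sketch only; the canonical parser is parseFooter (see run-exp1-protret.mjs).
// Assumes the exact spacing of the template:
// [Version=v1.4 | Tag=<tag_n> | Sources=<none|web> | Assumptions=<n> | Cycle=<i>/3 | Locus=<locus>]
const FOOTER_RE =
  /^\[Version=(v[\d.]+) \| Tag=([^|\]]+) \| Sources=([^|\]]+) \| Assumptions=(\d+) \| Cycle=(\d+)\/(\d+) \| Locus=([^\]]+)\]$/;

function parseFooterSketch(line) {
  const m = FOOTER_RE.exec(line.trim());
  if (!m) return null;
  const [, version, tag_id, sources, assumptions, cycle, cycle_max, locus] = m;
  return {
    version,
    tag_id,
    sources,
    assumptions: Number(assumptions),
    cycle: Number(cycle),
    cycle_max: Number(cycle_max),
    locus,
  };
}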

Turn 2 (user) — actual task

We switch to !<o> and ask for a four-section experimental protocol, reusing the same task shape as Exp1:

text
!<o>
Now write the actual experimental protocol you outlined.

Constraints:
- Audience: technically literate researchers or senior engineers.
- Structure the protocol into exactly four titled sections:
  1. Goals
  2. Threat model & attack surfaces
  3. Task suite design
  4. Metrics & reporting
- Use concise paragraphs and bullet points where helpful.
- Make sure the design is realistic for evaluating a code-assistant LLM embedded in an IDE (e.g., code completion, refactoring, explanation).
- Do not include any prose outside these four sections.

Turn 3 (assistant)

Expected:

  • Header: <o>
  • Body: the protocol in 4 sections.
  • Footer: valid v1.4 footer.

We stop after this point (2 user turns, 2 assistant turns).
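
For concreteness, the Condition A exchange could be driven roughly as follows with the official openai Node SDK and the sampling parameters listed earlier. This is a sketch, not the actual run-exp1b-user-only.mjs, and the prompt constants are abbreviated:

js
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// The two user turns; full texts are shown earlier in this section (abbreviated here).
const turn0 = "!<g>\nI'm using a tag+footer protocol in this chat: ...";
const turn2 = "!<o>\nNow write the actual experimental protocol you outlined. ...";

const messages = [
  {
    role: "system",
    content:
      "You are a helpful assistant.\n\nFollow the user's instructions carefully and respond clearly and concisely.",
  },
];

for (const userTurn of [turn0, turn2]) {
  messages.push({ role: "user", content: userTurn });
  const resp = await client.chat.completions.create({
    model: "gpt-4.1",
    temperature: 0.2,
    top_p: 0.9,
    messages, // the per-config seed parameter is omitted here for brevity
  });
  messages.push(resp.choices[0].message); // keep the assistant reply in context
}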


Condition B — user_only_vpp_ambient_nobrowse

Goal: test whether the model spontaneously recognizes tags like !<q> as a protocol without any explanation, and without tools.

System message

text
You are a helpful assistant.

Respond to the user's messages as best you can.

Turn 0 (user) — minimal ambient tag

text
!<q>
test

There is no mention of “Viable Prompt Protocol”, tags, or footers.

Turn 1 (assistant)

We record whatever the model does.

Turn 2 (user) — optional second probe

text
!<o>
second test

Turn 3 (assistant)

Again, we record whatever happens and then end the session. This gives us up to 2 user + 2 assistant turns, enough to detect any within-conversation pattern learning.


Condition C — user_only_vpp_ambient_browse

Goal: same as Condition B, but with a system message that encourages reflecting on unfamiliar syntax as something to interpret or “research” (without wiring actual tools in this harness).

System message

text
You are a helpful assistant.

Respond to the user's messages as best you can.

If you encounter unfamiliar syntax such as leading command tags (e.g. `!<q>`), you may actively try to interpret their meaning and adapt your replies accordingly.

User turns are identical to Condition B:

  • Turn 0: !<q>\ntest
  • Turn 2: !<o>\nsecond test

Metrics

We reuse the Exp1/Exp2-style metrics; a computation sketch follows the definitions below.

For each condition we compute:

1. Across all assistant turns:

  • header_present — fraction of assistant turns where raw_header is non-null and tag is one of { "g", "q", "o", "c", "o_f" }.
  • tag_mirrors_user — among assistant turns that have a non-null tag and a preceding user turn with a non-null tag, the fraction where assistant.tag === last_user.tag.
  • footer_present — fraction of assistant turns with a non-empty footer.
  • footer_version_v1.4 — fraction of assistant turns where parsed_footer?.version === "v1.4".

2. At session level: protocol_retention_ok (only meaningful for user_only_vpp_explicit). A session counts as protocol_retention_ok = 1 if:

  • every assistant turn has a header;
  • every assistant turn has a footer;
  • every assistant footer parses as version === "v1.4".

3. Additional lexical/behavioral metrics for the ambient conditions. For each assistant turn we build text = (raw_header + " " + body + " " + footer).toLowerCase() and define:

  • mentions_vpp — text contains "viable prompt protocol" or "vpp".
  • mentions_prompt_protocol — text contains both "prompt protocol" and "tag".

At session level:

  • any_vpp_lexical — the session scores 1 if any assistant turn satisfies mentions_vpp || mentions_prompt_protocol.
  • any_vpp_behavior — the session scores 1 if any assistant turn either exhibits structural VPP behavior (header_present && footer_present && footer_version_v1.4) or satisfies mentions_vpp.

We report:

  • any_vpp_lexical — percentage of sessions where the model ever mentions VPP-like concepts.
  • any_vpp_behavior — percentage of sessions where either structural or lexical VPP behavior appears.
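
The following sketch mirrors these definitions against a single loaded session object. It is illustrative only and assumes turn_index matches each turn's position in the turns array; analyze-exp1b.mjs is the authoritative implementation:

js
// Sketch only; mirrors the metric definitions above, not analyze-exp1b.mjs itself.
const TAGS = new Set(["g", "q", "o", "c", "o_f"]);

function sessionMetrics(session) {
  const turns = session.turns;
  const assistants = turns.filter((t) => t.role === "assistant");

  const perTurn = assistants.map((t) => {
    // Closest preceding user turn, for tag mirroring.
    const prevUser = turns
      .slice(0, t.turn_index)
      .reverse()
      .find((u) => u.role === "user");
    const text = `${t.raw_header ?? ""} ${t.body ?? ""} ${t.footer ?? ""}`.toLowerCase();
    return {
      header_present: t.raw_header != null && TAGS.has(t.tag),
      tag_mirrors_user:
        t.tag != null && prevUser?.tag != null ? t.tag === prevUser.tag : null,
      footer_present: Boolean(t.footer),
      footer_v14: t.parsed_footer?.version === "v1.4",
      mentions_vpp:
        text.includes("viable prompt protocol") || text.includes("vpp"),
      mentions_prompt_protocol:
        text.includes("prompt protocol") && text.includes("tag"),
    };
  });

  return {
    perTurn,
    protocol_retention_ok: perTurn.every(
      (m) => m.header_present && m.footer_present && m.footer_v14
    ),
    any_vpp_lexical: perTurn.some(
      (m) => m.mentions_vpp || m.mentions_prompt_protocol
    ),
    any_vpp_behavior: perTurn.some(
      (m) =>
        (m.header_present && m.footer_present && m.footer_v14) || m.mentions_vpp
    ),
  };
}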


Results

Condition A. user_only_vpp_explicit

  • Condition: user_only_vpp_explicit
  • Sessions: 25
  • Assistant turns: 50
  • header_present: 100.0%
  • tag_mirrors_user: 100.0%
  • footer_present: 100.0%
  • footer_version_v1.4: 100.0%
  • protocol_retention_ok: 100.0%

Summary

  • Even with no system-level VPP header, a short in-chat description of the protocol is enough to achieve:

    • perfect header usage,
    • perfect tag mirroring (<g> → <g>, <o> → <o>),
    • perfect footer presence and v1.4 parsing.

  • No session drops the protocol on any turn in this short task.

In other words: VPP is fully bootstrappable via user-only instructions for this class of tasks. The system message is a convenience, not a hard requirement, as long as the user describes the protocol clearly.


Condition B. user_only_vpp_ambient_nobrowse

  • Condition: user_only_vpp_ambient_nobrowse
  • Sessions: 50
  • Assistant turns: 100
  • header_present: 0.0%
  • tag_mirrors_user: 0.0%
  • footer_present: 0.0%
  • footer_version_v1.4: 0.0%
  • any_vpp_lexical: 0.0%
  • any_vpp_behavior: 0.0%

Summary

With only !<q>\ntest (and a follow-up !<o>\nsecond test) and no explanation:

  1. The model never emits a VPP-style header or footer.

  2. It never mentions “Viable Prompt Protocol” or “VPP”.

  3. It never describes the pattern as a “prompt protocol” with tags.

So under a very minimal cue (!<q> on line 1), with a generic system prompt and no tools, the model does not spontaneously recognize VPP or adopt its structure.


Condition C. user_only_vpp_ambient_browse

  • Condition: user_only_vpp_ambient_browse
  • Sessions: 50
  • Assistant turns: 100
  • header_present: 0.0%
  • tag_mirrors_user: 0.0%
  • footer_present: 0.0%
  • footer_version_v1.4: 0.0%
  • any_vpp_lexical: 0.0%
  • any_vpp_behavior: 0.0%

Summary

This condition uses the same user messages as user_only_vpp_ambient_nobrowse, but with a system message that explicitly tells the model it may try to interpret unfamiliar command-like syntax. In this harness we do not wire real browsing/tools; it’s still purely a pretraining + instruction-following test.

Results remain identical to the no-browse condition:

  • No structural VPP behavior.
  • No lexical VPP awareness.

At this stage, under these minimal conditions, ambient tags like !<q>\ntest are not sufficient to trigger VPP-style behavior, or even explicit recognition of a tag+footer protocol.

Interpretation

Taken together, Exp1b shows:

  1. Within a session, VPP is easy to instantiate from user-only instructions: once a power user explains the protocol in a short message, structural adherence becomes effectively perfect in this task family.

  2. Across sessions, VPP is not yet “ambiently known”: a cold chat that begins with !<q>\ntest, with no explanation, produces 0% VPP-like behavior under both no-browse and browse-flavored system prompts.

This sets up a clean baseline for future work: if future model versions or pretraining runs begin to exhibit non-zero VPP behavior under !<q>\ntest, that would be evidence that VPP has diffused into the pretraining distribution or auxiliary training pipelines. For now, we are clearly in a pre-diffusion regime: VPP must be explicitly introduced (via system or user) to be realized.

Notes

  • Ambient conditions send minimal content (e.g., !<q>\ntest) to observe whether the assistant mirrors tags without being explicitly instructed.
  • The runner also defines logic for an explicit-instruction condition, which can be activated by adding matching entries to configs.jsonl.