Transparent AI Framework — Protocol for Open AI System
A public, versioned, transparent, replayable semantic commons — governed openly and designed to resist capture.
Version: 0.1.0
Status: Draft
Author: Technology Shield
Last Updated: 2026-03-25
1. Preamble
The large AI systems being built today are funded by billions of dollars, trained on opaque data, governed by invisible rules, and designed to concentrate power.
This protocol is a countermeasure.
It does not claim to produce unbiased truth. Instead, it creates a system where:
- All inputs are inspectable
- All transforms are declared
- All rankings are attributable
- All disputes are preserved
- Alternative interpretations can coexist
You do not eliminate bias by claiming neutrality. You reduce bias by making every assumption legible, forkable, challengeable, and reversible.
That is the heart of this protocol.
2. Protocol Layers
The Transparent Semantic Commons operates across five layers:
┌──────────────────────────────────────────────────────┐
│ 5. RETRIEVAL LAYER │
│ Search, ranking, neighbourhood lookup, │
│ contradiction surfacing, competing views │
├──────────────────────────────────────────────────────┤
│ 4. GOVERNANCE LAYER │
│ Reputation, dispute resolution, weighting rules, │
│ moderation, anti-poisoning, forkability │
├──────────────────────────────────────────────────────┤
│ 3. PROOF LAYER │
│ Signatures, hashes, timestamps, lineage, │
│ challenge records, verification │
├──────────────────────────────────────────────────────┤
│ 2. TRANSFORMATION LAYER │
│ Chunking, normalisation, tagging, translation, │
│ embedding, summarisation │
├──────────────────────────────────────────────────────┤
│ 1. SOURCE LAYER │
│ Raw public artefacts: text, images, audio, │
│ datasets, claims, arguments │
└──────────────────────────────────────────────────────┘
Each layer has clear responsibilities, and objects at each layer are defined by the Open Embeddings Schema (separate document).
3. Layer 1 — Source
Purpose
Accept and preserve raw public artefacts with verified provenance.
Rules
| Rule | Description |
|---|---|
| All artefacts must be content-hashed | Integrity verifiable at any time |
| All artefacts must be signed by their submitter | Attribution is non-repudiable |
| All artefacts must declare a licence | Consumption rights are clear |
| All artefacts must be stored durably | Content must survive individual node failure |
| No silent modification | Updates create new artefact versions linked to the original |
Storage
Artefacts are stored off-chain in durable decentralised storage (IPFS, Arweave, Filecoin, or federated node operators). Content hashes and signatures are anchored on-chain for tamper-evidence.
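The hashing and versioning rules above can be illustrated with a minimal sketch. It uses SHA-256 from the Python standard library as a stand-in for a real content identifier; the `ArtefactRecord` fields and the `submit` and `verify` helpers are hypothetical names chosen for illustration, and signing and on-chain anchoring are deliberately left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def content_hash(data: bytes) -> str:
    """Hex SHA-256 digest, used here as a stand-in for a content ID."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ArtefactRecord:
    content_id: str           # hash of the raw bytes
    licence: str              # declared licence, e.g. "CC-BY-4.0"
    submitter: str            # submitter identifier (placeholder string)
    previous: Optional[str]   # content_id of the version this one updates


def submit(data: bytes, licence: str, submitter: str,
           previous: Optional[str] = None) -> ArtefactRecord:
    """Create a record; an update links back to the original, never replaces it."""
    if not licence:
        raise ValueError("artefact must declare a licence")
    return ArtefactRecord(content_hash(data), licence, submitter, previous)


def verify(data: bytes, record: ArtefactRecord) -> bool:
    """Integrity check: the bytes must hash to the recorded content ID."""
    return content_hash(data) == record.content_id
```

Because the identifier is derived from the bytes, an edited artefact necessarily gets a new `content_id`, and the `previous` field is what ties the new version to the one it updates.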
4. Layer 2 — Transformation
Purpose
Record every operation applied to a source artefact, with full transparency about who did it, what software they used, what model they ran, and what parameters they chose.
Rules
| Rule | Description |
|---|---|
| Every transformation must be recorded | No hidden transforms |
| Every transformation must declare its toolchain | Software, model, version, parameters |
| Every transformation must be signed | Accountability for the transform |
| Transformations are append-only | You cannot silently re-run a transform |
| Multiple transforms per artefact are expected | Different models, chunking strategies, and approaches coexist |
Transform Types
| Type | Description |
|---|---|
| chunk | Split an artefact into smaller units |
| translate | Translate to another language |
| classify | Assign classification labels |
| summarize | Produce a summary |
| embed | Produce a vector embedding |
| redact | Remove sensitive content (with declaration) |
| normalize | Standardise format or encoding |
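As a rough sketch of how a transformation record and an append-only log might look, the following uses only the Python standard library. The field names, the `TransformLog` class, and the JSON-based record ID are assumptions made for illustration; the authoritative object layout belongs to the Open Embeddings Schema, and signatures are represented only by a signer identifier.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import List, Tuple

TRANSFORM_TYPES = {"chunk", "translate", "classify", "summarize",
                   "embed", "redact", "normalize"}


@dataclass(frozen=True)
class TransformRecord:
    transform_type: str
    input_id: str                              # content ID of the input object
    output_id: str                             # content ID of the result
    software: str                              # declared toolchain
    model: str
    version: str
    parameters: Tuple[Tuple[str, str], ...]    # declared parameters as pairs
    signer: str                                # who is accountable

    def record_id(self) -> str:
        """Hash of a canonical JSON form, so any change yields a new ID."""
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TransformLog:
    """Append-only: records can be added and read, never replaced or removed."""

    def __init__(self) -> None:
        self._records: List[TransformRecord] = []

    def append(self, record: TransformRecord) -> str:
        if record.transform_type not in TRANSFORM_TYPES:
            raise ValueError(f"unknown transform type: {record.transform_type}")
        self._records.append(record)
        return record.record_id()

    def for_artefact(self, input_id: str) -> List[TransformRecord]:
        """All transforms of one artefact; several are expected to coexist."""
        return [r for r in self._records if r.input_id == input_id]
```

Re-running a transform with a different model or parameter produces a second record with a different ID, so both results stay visible side by side.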
5. Layer 3 — Proof
Purpose
Provide cryptographic evidence that the provenance chain is intact, that objects have not been tampered with, and that challenges and disputes are preserved.
Mechanisms
| Mechanism | Purpose |
|---|---|
| Content-addressed IDs | Object identity is derived from content; any change produces a different ID |
| Ed25519 signatures | Every object is signed by its creator |
| DID-based identity | Signers are identified by decentralised identifiers |
| Timestamp anchoring | Object timestamps are anchored on a public ledger |
| Provenance chain verification | Any embedding can be traced back through its transform to the source artefact |
| Challenge preservation | Challenges are immutable; they cannot be silently dismissed |
| Governance decision publication | Every governance action is a signed, public record |
Verification Protocol
For any object retrieved from the commons:
- Hash verification — Does the object's content match its declared hash?
- Signature verification — Is the signature valid for the declared signer?
- Identity resolution — Does the signer's DID resolve to a valid public key?
- Lineage verification — Does the provenance chain back to the source artefact verify?
- Challenge check — Are there open challenges against this object or any object in its chain?
- Governance check — Are there governance decisions (tombstones, supersessions) affecting this object?
Result: verified | unverifiable | disputed | tombstoned | superseded
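The six checks above could be combined roughly as follows. This is a sketch under stated assumptions: `CommonsObject` and its fields are hypothetical, DID resolution and signature checking are passed in as callables (`resolve_key`, `check_signature`) rather than implemented, and the precedence among results (governance state first, then disputes) is an assumption, since the protocol does not define one.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CommonsObject:
    content: bytes
    declared_hash: str
    signer: str                                 # DID of the signer
    signature: str
    parent: Optional["CommonsObject"] = None    # previous link in the provenance chain
    open_challenges: int = 0
    governance_state: Optional[str] = None      # "tombstoned", "superseded", or None


def sha256_id(content: bytes) -> str:
    return "sha256:" + hashlib.sha256(content).hexdigest()


def verify_object(obj: CommonsObject,
                  resolve_key: Callable[[str], Optional[str]],
                  check_signature: Callable[[str, bytes, str], bool]) -> str:
    """Walk the chain back to the source, then apply governance and challenge checks."""
    disputed = False
    node: Optional[CommonsObject] = obj
    while node is not None:
        # Hash verification
        if sha256_id(node.content) != node.declared_hash:
            return "unverifiable"
        # Identity resolution, then signature verification
        key = resolve_key(node.signer)
        if key is None or not check_signature(key, node.content, node.signature):
            return "unverifiable"
        # Challenge check covers every object in the chain
        if node.open_challenges > 0:
            disputed = True
        node = node.parent                      # lineage verification continues upward
    # Governance check (assumed to take precedence over open disputes)
    if obj.governance_state in ("tombstoned", "superseded"):
        return obj.governance_state
    return "disputed" if disputed else "verified"
```

Injecting the resolver and signature checker keeps the sketch independent of any particular DID method or cryptographic library.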
6. Layer 4 — Governance
Purpose
Govern the commons in a way that resists capture by any single party, organisation, or alliance.
6.1 Core Governance Rules
These rules are constitutional — they define the minimum governance that all participants accept:
| # | Rule |
|---|---|
| 1 | No silent edits. Every change creates a new versioned record. |
| 2 | No hard deletes. Only tombstones and supersession. |
| 3 | Every transformation signed. No anonymous transforms. |
| 4 | Every moderation decision public. No shadow bans, no invisible removal. |
| 5 | Every ranking policy versioned. No invisible algorithm changes. |
| 6 | Every challenge preserved. Challenges cannot be deleted, only resolved. |
| 7 | Every fork portable. Any community can fork the full dataset, embeddings, policies, and reputation. |
| 8 | Consumers should contribute back. Pure extraction without reciprocity is discouraged through governance mechanisms. |
6.2 Roles
| Role | Responsibility | Power |
|---|---|---|
| Contributors | Submit artefacts, embeddings, claims, evidence links | Create objects |
| Validators | Verify provenance chains and object integrity | Flag invalid objects |
| Domain Stewards | Curate quality within a specific domain | Recommend, prioritise, tag |
| Challenge Reviewers | Evaluate and resolve challenges | Issue governance decisions |
| Protocol Maintainers | Evolve the protocol and schema | Propose protocol changes (subject to community approval) |
| Index Operators | Run retrieval indexes | Serve queries (cannot alter underlying data) |
Anti-capture constraint: No single role may control all of:
- Data admission
- Ranking
- Dispute resolution
These three powers must be held by different parties.
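A simple check for this constraint might look like the sketch below. It takes the stricter reading, that each of the three powers is held by different parties, so any party appearing under more than one power is reported. The power names and the `capture_violations` function are hypothetical.

```python
from typing import Dict, Set

PROTECTED_POWERS = ("data_admission", "ranking", "dispute_resolution")


def capture_violations(holders: Dict[str, Set[str]]) -> Set[str]:
    """Return the parties that hold more than one protected power.

    `holders` maps each power name to the set of parties exercising it.
    An empty result means the separation constraint is satisfied.
    """
    powers_by_party: Dict[str, Set[str]] = {}
    for power in PROTECTED_POWERS:
        for party in holders.get(power, set()):
            powers_by_party.setdefault(party, set()).add(power)
    return {party for party, powers in powers_by_party.items() if len(powers) > 1}
```

A check like this could run whenever roles are assigned, with the result published alongside other governance records.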
6.3 Challenge Resolution Process
Challenge opened
│
▼
Evidence gathering period (configurable, default 14 days)
│
▼
Review by Challenge Reviewers (minimum 3 independent reviewers)
│
├── Upheld → Governance action (tombstone, supersede, flag)
├── Dismissed → Challenge marked as dismissed with rationale
└── Escalated → Broader community vote
│
▼
Appeal window (configurable, default 30 days)
│
├── No appeal → Decision final
└── Appeal → Re-review with expanded panel
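The flow above can be read as a small state machine. The sketch below is one possible encoding, not a normative one: the state names are invented for illustration, it assumes an escalated challenge ends in an upheld or dismissed outcome after the community vote, and it assumes the appeal window applies to every outcome. The defaults (14 days, 30 days, 3 reviewers) come from the diagram.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Set, Tuple

EVIDENCE_DAYS = 14     # default evidence gathering period
APPEAL_DAYS = 30       # default appeal window
MIN_REVIEWERS = 3      # minimum independent reviewers

TRANSITIONS = {
    "opened": {"evidence_gathering"},
    "evidence_gathering": {"in_review"},
    "in_review": {"upheld", "dismissed", "escalated"},
    "escalated": {"upheld", "dismissed"},     # outcome of the broader community vote
    "upheld": {"final", "re_review"},         # no appeal, or appeal
    "dismissed": {"final", "re_review"},
    "re_review": {"upheld", "dismissed"},     # expanded panel decides again
    "final": set(),
}


@dataclass
class Challenge:
    opened_on: date
    state: str = "opened"
    reviewers: Set[str] = field(default_factory=set)
    history: List[Tuple[str, str]] = field(default_factory=list)  # never truncated

    def evidence_deadline(self) -> date:
        return self.opened_on + timedelta(days=EVIDENCE_DAYS)

    def advance(self, new_state: str, rationale: str = "") -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot move from {self.state} to {new_state}")
        if self.state == "in_review" and len(self.reviewers) < MIN_REVIEWERS:
            raise ValueError("at least 3 independent reviewers are required")
        if new_state == "dismissed" and not rationale:
            raise ValueError("a dismissal must record its rationale")
        self.history.append((new_state, rationale))
        self.state = new_state


def appeal_deadline(decided_on: date) -> date:
    return decided_on + timedelta(days=APPEAL_DAYS)
```

There is no transition that removes a challenge: it can only move forward to a recorded outcome, which matches the rule that challenges are resolved rather than deleted.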
6.4 Forkability
Forkability is the ultimate anti-capture safety valve.
If a community believes governance has been captured, it must be able to fork all of the following:
| What Can Be Forked | How |
|---|---|
| The dataset (artefacts) | Content-addressed; any node can replicate |
| The embeddings | Content-addressed; stored off-chain |
| The governance policies | Versioned, public documents |
| The reputation graph | Exported as signed objects |
| The retrieval policies | Public, versioned objects |
| The challenge history | Immutable, content-addressed |
The fork inherits the full history. The new community can then modify governance going forward while preserving the shared past.
7. Layer 5 — Retrieval
Purpose
Provide transparent, policy-driven retrieval that makes bias visible rather than invisible.
7.1 Retrieval Policy as a Public Object
Retrieval policies are themselves signed, versioned objects:
{
  "type": "RetrievalPolicy",
  "policy_id": "cid:bafy...",
  "candidate_sources": ["public", "science", "english"],
  "similarity_metric": "cosine",
  "recency_weight": 0.1,
  "reputation_weight": 0.2,
  "challenge_penalty": 0.5,
  "diversity_requirement": {
    "minority_view_floor": 0.15,
    "contradiction_surfacing": true,
    "max_single_source_share": 0.3
  },
  "jurisdictional_filters": [],
  "version": "1.0.0",
  "published_by": "did:key:z6Mk...",
  "signature": "ed25519:..."
}
Different communities can publish different retrieval policies while sharing the same underlying commons. This is how pluralism works in practice.
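To make the idea of competing policies concrete, here is a sketch of how the weights in a policy might shape a ranking. The scoring formula is an assumption made purely for illustration (the protocol does not specify how `recency_weight`, `reputation_weight`, and `challenge_penalty` combine), and the candidate fields `recency`, `reputation`, and `challenged` are hypothetical values normalised to the range 0 to 1.

```python
from typing import Dict, List

# Two hypothetical policies over the same commons.
POLICY_A = {"recency_weight": 0.1, "reputation_weight": 0.2, "challenge_penalty": 0.5}
POLICY_B = {"recency_weight": 0.0, "reputation_weight": 0.0, "challenge_penalty": 0.5}


def score(candidate: Dict, policy: Dict) -> float:
    """Assumed formula: similarity plus weighted bonuses, scaled down if challenged."""
    base = (candidate["similarity"]
            + policy["recency_weight"] * candidate["recency"]
            + policy["reputation_weight"] * candidate["reputation"])
    if candidate["challenged"]:
        base *= 1.0 - policy["challenge_penalty"]
    return base


def rank(candidates: List[Dict], policy: Dict) -> List[Dict]:
    """Order candidates best-first under the given policy."""
    return sorted(candidates, key=lambda c: score(c, policy), reverse=True)
```

Under these assumptions the same two candidates can come back in a different order depending on which policy is applied, and because the policy is a public object, that difference is inspectable.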
7.2 Bias-Resistant Retrieval Pattern
For any query, the retrieval response includes:
{
  "query": "...",
  "policy_used": "cid:bafy...",
  "model_spaces_used": ["bge-m3", "e5-large-v2"],
  "results": {
    "supporting": [
      {
        "embedding_id": "cid:bafy...",
        "similarity": 0.94,
        "provenance_status": "verified",
        "challenge_status": "none"
      }
    ],
    "contradicting": [
      {
        "embedding_id": "cid:bafy...",
        "similarity": 0.87,
        "provenance_status": "verified",
        "challenge_status": "none"
      }
    ],
    "disputed": [
      {
        "embedding_id": "cid:bafy...",
        "challenge_id": "cid:bafy...",
        "challenge_status": "open"
      }
    ]
  },
  "policy_explanation": "Results include top 10 supporting and top 5 contradicting. Minority view floor of 15% applied. Disputed objects surfaced separately."
}
Instead of returning one answer, the system returns:
- Closest matches (supporting evidence)
- Strongest counter-matches (contradicting evidence)
- Unresolved disputes (challenged objects)
- Policy explanation (which rules shaped the response)
- Provenance status (verification state of each result)
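Because the response exposes its counter-evidence, its open disputes, and the policy that shaped it, a reader can see how the ranking was produced instead of having to trust a single invisible ranking.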
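One way the diversity requirement could be applied when assembling a response is sketched below. It assumes that "minority view" slots are filled from the contradicting results and that `max_single_source_share` caps how many results any one source may supply; both readings, along with the `assemble` function and the item fields, are illustrative assumptions.

```python
import math
from typing import Dict, List


def assemble(supporting: List[Dict], contradicting: List[Dict], k: int,
             minority_view_floor: float,
             max_single_source_share: float) -> Dict[str, List[Dict]]:
    """Pick up to k results from two best-first lists.

    Reserves a share of slots for contradicting results and caps the
    number of results taken from any single source.
    """
    # Small epsilons guard against floating-point rounding at exact boundaries.
    reserved = math.ceil(minority_view_floor * k - 1e-9)
    per_source_cap = max(1, math.floor(max_single_source_share * k + 1e-9))
    counts: Dict[str, int] = {}

    def take(items: List[Dict], limit: int) -> List[Dict]:
        picked: List[Dict] = []
        for item in items:
            if len(picked) == limit:
                break
            if counts.get(item["source"], 0) >= per_source_cap:
                continue                      # this source has reached its share
            counts[item["source"]] = counts.get(item["source"], 0) + 1
            picked.append(item)
        return picked

    minority = take(contradicting, reserved)              # fill reserved slots first
    majority = take(supporting, k - len(minority))
    return {"supporting": majority, "contradicting": minority}
```

With `k=10`, a floor of 0.15, and a share of 0.3, this reserves two slots for contradicting results and lets any single source contribute at most three results.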
7.3 Index Operator Rules
Anyone can run an index. Index operators:
- Must declare which retrieval policies they support
- Must not alter underlying object data
- Must serve provenance chains on request
- Must surface challenges and disputes
- Must declare their funding and governance
- Can be forked (anyone can run a competing index)
8. Anti-Capture Mechanisms
8.1 Structural
| Mechanism | How It Resists Capture |
|---|---|
| Content addressing | Data cannot be silently altered |
| Multiple embedding spaces | No single model priesthood |
| Multiple index operators | No single retrieval monopoly |
| Public governance decisions | No shadow moderation |
| Forkability | Exit option prevents lock-in |
| Role separation | No single party controls admission + ranking + disputes |
8.2 Economic
| Mechanism | How It Resists Capture |
|---|---|
| Contribution receipts | Consumers are expected to give back; pure extraction is visible and discouraged |
| Influence caps | Contribution does not buy unlimited influence |
| Transparent funding | Index operators and stewards declare funding sources |
| No pay-for-ranking | Objects cannot be promoted through payment |
8.3 Social
| Mechanism | How It Resists Capture |
|---|---|
| Public challenge process | Anyone can challenge any object |
| Appeal process | Governance decisions can be appealed |
| Minority view floors | Retrieval policies must surface dissenting views |
| Contradiction surfacing | Counter-evidence is returned alongside supporting evidence |
| Community forking | Disagree? Fork. History is preserved. |
9. Bootstrapping Plan
Phase 1 — Foundation
| Deliverable | Description |
|---|---|
| Manifesto | Public statement of principles and intent |
| Protocol specification | This document, formalised and versioned in a Git repository |
| Schema repository | The Open Embeddings Schema as a versioned spec |
| Reference node | A single operational node demonstrating the protocol |
| Sample corpus | A small, curated dataset with full provenance |
| Challenge flow demo | A working example of the challenge and resolution process |
| Community home | A Matrix space for working groups and discussion |
Phase 2 — Community
| Deliverable | Description |
|---|---|
| Signed contributors | First cohort of contributors with verified DIDs |
| Multiple embedding providers | Embeddings from at least 3 different models |
| Public retrieval explorer | A web UI for browsing artefacts, embeddings, claims, and challenges |
| Contradiction surfacing demo | Retrieval that shows supporting and contradicting evidence |
| Governance handbook | Published rules for challenge review and governance decisions |
Phase 3 — Federation
| Deliverable | Description |
|---|---|
| Multi-node federation | Multiple independent nodes sharing the commons |
| Reputation/stake system | Contribution-weighted influence |
| Alternate index operators | At least 2 independent retrieval services |
| Client applications | Tools for researchers, journalists, educators |
| Domain communities | Specialised communities (science, law, history, etc.) |
Phase 4 — Maturity
| Deliverable | Description |
|---|---|
| Cross-language support | Artefacts and embeddings in multiple languages |
| Institutional adoption | Libraries, universities, NGOs contributing |
| Governance evolution | Community-driven protocol changes |
| Ecosystem growth | Third-party tools and integrations |
10. Community and Communication
Recommended Stack
| Layer | Platform | Purpose |
|---|---|---|
| Core collaboration | Matrix | Working groups, design discussions, governance rooms, moderated community channels |
| Public publishing | Fediverse / ActivityPub | Public announcements, essays, manifestos, recruiting aligned people, federated discussion |
| Discovery | AT Protocol / Bluesky | Short-form thought leadership, finding protocol and open-web people, sharing milestones |
| Specification | Git repository | Versioned protocol and schema specifications |
| In-person | FOSDEM-style events | Finding serious contributors, presenting the protocol, meeting adjacent communities |
Why This Stack
- Matrix is an open network for decentralised communication, governed by a foundation committed to open standards. It fits a serious protocol-building community.
- ActivityPub is a W3C Recommendation for decentralised social networking. It provides reach without platform dependence.
- AT Protocol is an open framework for public conversation with account portability. It is good for discovery and attracting technical contributors.
- Git provides permanence, versioning, and collaboration for specifications.
- FOSDEM and similar events connect the open-source and decentralised-web communities in person.
11. Charter
Transparent Semantic Commons Charter v0.1
- No hidden transforms.
- No silent edits.
- No opaque ranking.
- No single embedding monopoly.
- Every claim must be linkable to evidence.
- Every object must be challengeable.
- Every governance action must be public.
- Every community must be able to fork.
- Consumers should contribute back.
- The commons exists for collective flourishing, not extraction.
12. The Hard Truths
The enemy is not just secrecy. It is also:
- Convenience — closed systems are easier to use
- Apathy — most people will not participate in governance
- Scale asymmetry — billion-dollar systems have more resources
- Moderation burden — open systems attract abuse
- Slow institutional capture — governance can be co-opted gradually
This protocol wins only if it is:
- Simpler to inspect than closed systems
- Easier to fork than captured systems
- Useful in practice, not merely morally appealing
A working prototype matters more than a perfect philosophy.
13. Naming
Candidate names for the broader initiative:
| Name | Rationale |
|---|---|
| Transparent Semantic Commons | Says exactly what it is: transparent, semantic, and common |
| Open Semantic Commons | Emphasises openness |
| Civic Knowledge Ledger | Emphasises the civic and knowledge dimensions |
| Forkable Truth Protocol | Emphasises the anti-capture mechanism |
The recommended name is Transparent Semantic Commons — it is precise, unglamorous, and resistant to marketing capture.
14. Relationship to Technology Shield
Technology Shield contributes this protocol and framework as part of its commitment to cybersecurity that serves people, not just organisations.
The Transparent AI Framework connects to Technology Shield's other collateral:
| Collateral | Relationship |
|---|---|
| Pattern Blueprint | The protocol layers are themselves architectural patterns |
| Security Reference Architecture | The commons infrastructure follows zoning principles |
| Secure Development Framework | Protocol node software is built with the SDF pipeline |
| Secure Collaborative Development | The open-source commons benefits from SCDA patterns for community contribution |
| Shield Business | Future integration for organisations tracking AI supply chain risk |
15. What to Build First
One repo. One schema. One Matrix room. One public manifesto. One explorer UI. One sample corpus. One challenge flow.
Start there. Everything else follows.