LLMConsent

Consent doesn’t stop at training: data and attribution in the agentic era

2026-06-28T02:00:00+00:00

We spent a lot of words last time on why it is so hard to trace a model’s output back to the data baked into its weights. The short version: during pre-training, influence is diffuse, the geometry fights you, and exact per-output attribution is mostly a fantasy.

Here is the thing almost no one is saying out loud. That problem, the hard one, is about a world that is already half gone. The frontier is not a model trained once on a frozen corpus and then shipped. The frontier is an agent: a model wired to tools, reading live data, taking actions, calling other agents, and carrying memory between sessions. And in that world, the data question is different, the consent question is different, and oddly enough, attribution gets easier in exactly the place training made it hard.

The shift in one sentence

Training-era consent asked, “may you learn from my data?” Agentic consent has to ask, “may you read this, use it, and act on it right now, for this purpose, within these limits, and who is accountable when you do?”

That is a different shape of question. It is about the present tense, not the past. It is scoped to a task, not a corpus. And it is revocable in a way that a trained weight never really is, because the agent has to come back and ask again the next time it acts.

Where the data actually flows now

When people picture AI and data, they picture the scrape: a crawler hoovering up the open web into a training set. Agents move data in ways that picture misses entirely.

An agent reads your live context. It opens your documents, your email, your calendar, your codebase, through tool calls and connectors. None of that is training data. It never touches the weights. It flows into the context window, shapes one set of actions, and is gone. The influence is real and immediate, and it is also, crucially, observable, because it happened in a request you can log rather than in a gradient update you cannot.

An agent takes actions. It sends the email, books the flight, files the ticket, moves the money, merges the pull request. The output is no longer just text on a screen. It is an effect in the world, and effects in the world have consequences and need accountability in a way that a paragraph does not.

An agent talks to other agents. It delegates a subtask, hands off context, pulls a memory another agent wrote. Your data crosses not just system boundaries but agent boundaries, and the chain of who-told-whom is where responsibility either gets tracked or gets lost.

Each of these is a consent surface, and none of them is covered by “did you agree to be in the training set.”

Attribution flips

Here is the part that should change how people think about this. The closer data sits to the moment of use, the more attributable it becomes. We made that point about retrieval and in-context use last time. Agents live entirely at that end of the spectrum.

An agent’s behavior is, in principle, fully traceable. There is a prompt. There is a set of tool calls with inputs and outputs. There is a context window with known contents. There is a sequence of actions with timestamps. If you instrument the agent, you do not have to estimate which data influenced which action with an inverse Hessian and a prayer. You can read it off the trace.

This is a genuine inversion. Pre-training attribution is hard because the causal path is smeared across billions of parameters. Agentic attribution is tractable because the causal path is a log. The catch is that the log only exists if the protocol requires it to exist. Attribution in the agentic era is not a research problem. It is an instrumentation mandate, and standards are how mandates get enforced across vendors who would otherwise each log whatever they felt like, or nothing.

That is why LCS-003, the agent permission standard, is built around capabilities and an audit trail rather than vague trust. An agent should carry an explicit, bounded grant: these actions, this spending limit, this rate, these domains, expiring then, revocable now, and every action it takes against that grant should be recorded. The point is not bureaucracy. The point is that “the agent did something with my data” should always have an answer to “what, exactly, and on whose authority.”
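To make the shape of such a grant concrete, here is a minimal sketch of a check-and-record step. The `Grant` fields, the `authorize` function, and the reason strings are all hypothetical names chosen for illustration, not taken from LCS-003, and rate limiting is left out to keep it short.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Grant:
    """Hypothetical bounded grant. Field names are illustrative, not from LCS-003."""
    grant_id: str
    actions: frozenset
    domains: frozenset
    spend_limit_cents: int
    expires_at: datetime
    spent_cents: int = 0
    revoked: bool = False


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)


def authorize(grant, log, action, domain, cost_cents, now):
    """Check one action against the grant and record the decision either way."""
    if grant.revoked:
        reason = "revoked"
    elif now >= grant.expires_at:
        reason = "expired"
    elif action not in grant.actions:
        reason = "action not granted"
    elif domain not in grant.domains:
        reason = "domain not granted"
    elif grant.spent_cents + cost_cents > grant.spend_limit_cents:
        reason = "spend limit exceeded"
    else:
        reason = "ok"
        grant.spent_cents += cost_cents  # only successful actions consume budget
    log.entries.append({
        "grant_id": grant.grant_id, "action": action, "domain": domain,
        "cost_cents": cost_cents, "at": now.isoformat(), "reason": reason,
    })
    return reason == "ok"
```

Refusals are logged as well as approvals, so the question "what, exactly, and on whose authority" has an answer even for the actions that never happened.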

The new hard problems

Making attribution tractable does not make the agentic world simple. It trades one set of hard problems for another, and the new ones are about accountability across boundaries rather than statistics inside a model.

Delegation. Agent A is allowed to act for you. It delegates to agent B, which delegates to a tool run by a third party. Whose consent governs B’s action? Does your grant flow down the chain, and if so, with what limits? When B does something wrong, the audit trail needs to walk back up to the authority that permitted it. Delegation without traceable chains is just laundering responsibility, and it is the default unless a standard says otherwise. This is the part of LCS-003 we argue about the most.
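One way to picture a traceable chain is a delegation link that can only narrow what its parent holds, with a pointer back to that parent. This is a sketch under assumptions: `Delegation`, `delegate`, and `authority_chain` are hypothetical names, and the narrowing rule shown is one possible answer to the open question above, not the one LCS-003 has settled on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Delegation:
    """Hypothetical link in a delegation chain (not an LCS-003 structure)."""
    holder: str                 # who may act under this link
    actions: frozenset          # what they may do
    spend_limit_cents: int
    parent: Optional["Delegation"] = None


def delegate(parent, holder, actions, spend_limit_cents):
    """Create a child link that can only narrow the parent's authority."""
    actions = frozenset(actions)
    if not actions <= parent.actions:
        raise PermissionError("child asks for actions the parent does not hold")
    if spend_limit_cents > parent.spend_limit_cents:
        raise PermissionError("child spend limit exceeds the parent's")
    return Delegation(holder, actions, spend_limit_cents, parent)


def authority_chain(link):
    """Walk from the acting agent back up to the original grantor."""
    chain = []
    while link is not None:
        chain.append(link.holder)
        link = link.parent
    return chain
```

If agent B books a flight under a link delegated by agent A, `authority_chain` returns B, then A, then the person who granted A, which is the walk back up that the audit trail needs.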

Shared memory. Agents are more useful when they remember, and more useful still when they can share what they remember so you stop re-explaining yourself to every new tool. But a memory has provenance: who created it, from whose data, under what consent, with what time to live. LCS-004 treats memory as something with an owner and an access policy rather than a free-floating fact, precisely so that consent and attribution survive the handoff between agents. A memory pool with no provenance is a quiet way to strip consent off data by passing it around until nobody remembers where it came from.
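As a rough illustration of memory that carries its provenance with it, the sketch below attaches owner, author, consent reference, and time to live to each record and returns them alongside the content on every read. The `MemoryRecord` shape and `read_memory` are hypothetical and not drawn from LCS-004.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class MemoryRecord:
    """Hypothetical record shape. Field names are illustrative, not from LCS-004."""
    content: str
    owner: str            # the person the memory belongs to
    created_by: str       # the agent that wrote it
    consent_ref: str      # identifier of the consent it was created under
    created_at: datetime
    ttl: timedelta
    readers: frozenset    # agents the owner has allowed to read it


def read_memory(record, agent, now):
    """Return content plus provenance, or None if the read is not permitted."""
    if now >= record.created_at + record.ttl:
        return None                      # past its time to live
    if agent not in record.readers:
        return None                      # not covered by the access policy
    return {
        "content": record.content,
        "provenance": {
            "owner": record.owner,
            "created_by": record.created_by,
            "consent_ref": record.consent_ref,
        },
    }
```

Because the provenance travels in the return value, an agent that passes the memory on has no excuse for dropping where it came from.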

Your standing model. As agents get more personal, there is pressure to build a persistent representation of you that any of them can consult. Done badly, that is every platform secretly assembling its own profile. Done well, it is a thing you own and grant access to deliberately. That is the bet behind LCS-002, the digital twin: one representation, owned by the person, with tiered and revocable access, instead of a dozen unaccountable shadow profiles.

Compensation gets more natural, not less

There is a quiet upside in all of this for the compensation question that training made so painful.

When influence is diffuse across weights, paying for it fairly is close to impossible, which is most of why nobody has. When data is used by an agent at a specific moment for a specific action, the usage is metered almost for free. There was a call. It read these inputs. It happened at this time. That is a billable event in a way that “your text marginally shifted a few million parameters” never was.

So the economics that felt hand-wavy in the training context become ordinary in the agentic one. Per-use, per-action settlement against the terms in a consent token is just invoicing. LCS-001 already carries the rate fields for it. The agentic era is where they start to mean something concrete rather than aspirational, because the meter finally exists.
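The "just invoicing" claim can be sketched in a few lines. Everything here is an assumption made for illustration: the `UsageEvent` shape, the per-use rate table, and the choice of integer micro-units of currency. LCS-001's actual rate fields may look quite different.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One metered use of data covered by a consent token (hypothetical shape)."""
    token_id: str
    action: str
    at: str  # ISO 8601 timestamp of the call


def settle(events, rate_micros_per_use):
    """Total owed per consent token, in integer micro-units of currency."""
    owed = defaultdict(int)
    for event in events:
        owed[event.token_id] += rate_micros_per_use[event.token_id]
    return dict(owed)


events = [
    UsageEvent("tok-a", "read_document", "2026-06-01T10:00:00+00:00"),
    UsageEvent("tok-a", "read_document", "2026-06-01T10:05:00+00:00"),
    UsageEvent("tok-b", "read_calendar", "2026-06-01T10:06:00+00:00"),
]
invoice = settle(events, {"tok-a": 1500, "tok-b": 400})
# invoice == {"tok-a": 3000, "tok-b": 400}
```

Integers are used instead of floats so that many small charges add up without rounding drift.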

This needs to ride alongside the tools, not behind them

Agents are standardizing how they reach the world. Tool and context protocols are becoming shared infrastructure, which is good: it is how the agent ecosystem avoids fragmenting into a hundred incompatible connectors. But a tooling standard answers “how does the agent call this?” It does not answer “was the agent allowed to, on whose authority, within what limits, and is it logged?”

That second question is the one a consent layer answers, and it has to sit right next to the tool call, not in a terms-of-service document nobody reads. Permission checked at the moment of action. Limits enforced by the runtime. The action recorded against the grant. When consent lives beside the tool call, attribution and accountability come almost for free, because the trace is already there. When it lives somewhere else, you are back to trusting that everyone behaves, which is how we got here.

Where this leaves us

The training-data fight will keep going, and it matters. But planning the entire future of AI consent around it is like writing careful rules for letters while everyone moves to live calls. The action has moved to agents that read, act, and remember in real time, and that move is good news in one specific way: it puts data use back at the surface, where it can be seen, scoped, logged, and paid for.

The harder, more interesting problems are now about accountability across delegation chains, provenance across shared memory, and who is answerable when an autonomous system acts. Those are exactly the problems LCS-002, LCS-003, and LCS-004 exist to take on, and they are the least finished parts of the standard, which is to say the parts where showing up early counts for the most.

If agentic systems are what you build or worry about, this is the work. The drafts are on GitHub, and the argument is open at the contribute page.

The attribution problem: tracing an output back to your data

2025-12-28T00:00:00+00:00

If you want to compensate people when their data is used to build a model, you eventually run into a question that sounds simple and is not: which data was used, and how much did it matter? Call it the attribution problem. Most public debate skips past it, because it is easier to argue about whether scraping is fair than to admit that even if everyone agreed it should be paid for, we would not currently know how to divide the money.

This post is an attempt to lay the problem out plainly, separate the parts that are genuinely hard from the parts that just sound hard, and explain the stance LLMConsent takes as a result. It gets technical. That is on purpose. A consent protocol that hand-waves over attribution is selling something.

Two questions that get confused

The first thing to do is split “was my data used” into two very different questions, because they have very different answers.

Provenance, or membership, asks: was this specific data point part of the training set? This is a question about a fact in the past: did the file enter the pipeline?

Influence, or attribution proper, asks: how much did this data point shape the model’s parameters, and through them, a particular output? This is a question about cause and effect inside a system with hundreds of billions of moving parts.

These get blurred together constantly, and it matters, because provenance is mostly tractable and influence mostly is not. If you can keep them separate, a lot of the confusion clears up.

Provenance is the easy half

If you control the training pipeline, you know what went in. You can hash every document at ingestion, commit those hashes to a log, and later prove that a given file was or was not part of a given training run. Cryptographically this is ordinary. Merkle trees and signed manifests have done this for decades in other contexts. There is no deep research problem in proving “this document was in corpus version 7.”
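Here is a small sketch of the hash-and-commit idea using a Merkle tree over document bytes. The function names are made up for this example, the one-byte prefixes are there only to keep leaf and interior hashes distinct, and duplicating the last node on odd levels is a simplification that a production design would want to revisit. It assumes at least one document.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(documents):
    """Build every level of the tree: leaf hashes first, root last."""
    level = [h(b"\x00" + doc) for doc in documents]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # odd level: pair the last node with itself
        level = [h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Sibling hashes needed to recompute the root from one leaf."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index              # the node was paired with itself
        proof.append((level[sibling], index % 2 == 0))  # (hash, sibling_is_right)
        index //= 2
    return proof


def verify(document, proof, root):
    """Check that a document is committed to by the published root."""
    node = h(b"\x00" + document)
    for sibling, sibling_is_right in proof:
        pair = node + sibling if sibling_is_right else sibling + node
        node = h(b"\x01" + pair)
    return node == root
```

The pipeline operator publishes only the root for a given corpus version. Anyone holding a document and its proof can later check membership without seeing the rest of the corpus.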

The harder version of provenance is when you do not control the pipeline and want to find out after the fact whether your data was in someone’s model. That is membership inference, and it is a statistical attack rather than a clean proof. You probe the model, look at how confidently it predicts your text, and infer membership from the gap between how it treats data it has seen and data it has not. It works better on outliers and memorized content, worse on ordinary text that looks like everything else. It gives you a probability, not a receipt.
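A toy version of that probe, assuming you can already obtain per-token log-probabilities from the model for your text and for a set of texts you know it has not seen. The function names and the ranking score are illustrative choices, not a published attack, and the result is a rank rather than a calibrated probability.

```python
def mean_nll(token_logprobs):
    """Average negative log-likelihood the model assigns to a text."""
    return -sum(token_logprobs) / len(token_logprobs)


def membership_score(target_logprobs, known_nonmember_nlls):
    """Fraction of known non-member texts that the target is predicted better than.

    Close to 1.0: the model is unusually confident on the target, which is
    weak evidence of membership. It is evidence, not proof.
    """
    target = mean_nll(target_logprobs)
    better = sum(1 for nll in known_nonmember_nlls if target < nll)
    return better / len(known_nonmember_nlls)
```

The weakness described above shows up directly: ordinary text that looks like everything else gets a low loss whether or not it was in the training set, so its score says little.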

For a consent protocol the lesson is to lean on the clean version. If consent is checked and recorded at the moment data enters a pipeline, provenance stops being a forensic question and becomes a logged fact. You do not have to reverse engineer membership from the outside if permission was verified on the way in.

Influence is the hard half

Now the difficult question. A model produced a sentence. Your blog post was somewhere in the trillions of tokens it trained on. How much credit does your post deserve for that sentence?

The honest baseline answer is that there is a precise definition of influence and it is almost never computable. The definition is counterfactual: the influence of your data point is the difference between the model you got and the model you would have gotten if that point had been left out. Train with it, train without it, compare. This is the leave-one-out ideal, and it is the thing every practical method is trying to approximate, because actually doing it means retraining a frontier model once per data point. Nobody is retraining a model a trillion times to settle a royalty.
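The counterfactual is easy to state in code once the model is small enough to retrain. The sketch below uses a deliberately trivial "model" (the mean of the training values, which is the constant that minimizes squared error) so that the retraining loop is visible. The function names are invented for this example.

```python
def fit(values):
    """The constant predictor that minimizes squared error is the mean."""
    return sum(values) / len(values)


def squared_loss(model, target):
    return (model - target) ** 2


def leave_one_out_effects(train, target):
    """For each training point: loss on the target with it, minus loss without it.

    Positive means including the point made the prediction worse;
    negative means including it helped.
    """
    with_all = squared_loss(fit(train), target)
    effects = []
    for i in range(len(train)):
        without_i = train[:i] + train[i + 1:]   # one full retrain per point
        effects.append(with_all - squared_loss(fit(without_i), target))
    return effects
```

The loop body is a complete retrain. With four training points that is four fits of a mean; with a frontier model and a web-scale corpus it is the trillion retrains nobody is going to run.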

So the field builds estimators. A few worth knowing by name:

Influence functions, introduced to modern machine learning by Koh and Liang in 2017, estimate the leave-one-out effect without retraining, using the model’s gradients and a term involving the inverse Hessian of the loss. The math is clean. The cost is brutal, because that Hessian has one entry for every pair of parameters, so its size grows with the square of the model, and you have to approximate it aggressively to run it on anything large. Anthropic’s 2023 work scaled influence functions to large language models using an approximation called EK-FAC, and one of their findings is instructive: the sequences that influence a given output are often related by theme and reasoning pattern rather than by surface wording. Influence is real, but it is diffuse and a little alien. It does not point at one source.
TracIn takes a different route, tracing influence by following the dot product of gradients across training checkpoints. Intuitively, a training example influenced an output if updating on it would have reduced the loss on that output. Cheaper in some setups, still approximate.
Datamodels, from Ilyas and collaborators in 2022, fit a surrogate that predicts a model’s behavior as a function of which training examples were included. It treats the model as something to be regressed against its own training set. Powerful, and also expensive to construct, because you train many models on many subsets to fit the surrogate.
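Of the three, the gradient-tracing idea is the simplest to show at toy scale. The sketch below uses a one-parameter model `y = w * x`, so the gradient dot product collapses to an ordinary product. It is a simplified illustration of the checkpoint-sum idea with made-up function names, not the published TracIn implementation.

```python
def grad(w, example):
    """Gradient of 0.5 * (w*x - y)**2 with respect to w."""
    x, y = example
    return (w * x - y) * x


def train_with_checkpoints(train, lr, epochs):
    """Plain SGD on a one-parameter model, saving w at the start of each epoch."""
    w, checkpoints = 0.0, []
    for _ in range(epochs):
        checkpoints.append(w)
        for example in train:
            w -= lr * grad(w, example)
    return checkpoints


def traced_influence(checkpoints, lr, train_example, test_example):
    """Sum over checkpoints of lr * grad(train example) * grad(test example)."""
    return sum(lr * grad(w, train_example) * grad(w, test_example)
               for w in checkpoints)


train = [(1.0, 2.0), (2.0, 4.0), (1.0, -1.0)]
checkpoints = train_with_checkpoints(train, lr=0.1, epochs=3)
test_example = (1.0, 2.0)
scores = [traced_influence(checkpoints, 0.1, z, test_example) for z in train]
# The first two examples agree with the test point (y = 2x) and score positive;
# the third pulls the other way and scores negative.
```

A positive score means a step on that training example would also have lowered the loss on the test example at the saved checkpoints. In a real model each gradient is a vector with one entry per parameter, which is where the cost comes back.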

Every one of these is an approximation of the counterfactual, each with its own failure modes, and none of them is cheap enough to run per output per data point at the scale of a real product. That is not a temporary engineering gap. The geometry of how these models store information is working against you.

Why the geometry fights back

A neural network does not file your blog post in a drawer labeled “your blog post.” Training distributes what it learns across the weights. A single concept is spread over many neurons, and a single neuron participates in many concepts. The interpretability literature calls the first property distributed representation and the second polysemanticity, and recent work on superposition suggests models pack more features than they have dimensions, overlapping them because they can usually get away with it.

The practical consequence is that there is no spot in the weights you can point to and say “that came from this person.” The information is smeared. Your data nudged millions of parameters by tiny amounts, and so did everyone else’s, and the output you care about is the joint result of all those nudges passed through a deeply nonlinear function. Asking for one source’s share of one output is a bit like asking which raindrop is responsible for a particular eddy in a river.

The exception: memorization

There is one regime where attribution gets dramatically easier, and it is the one that makes headlines. Sometimes a model does not generalize from a piece of data; it memorizes it and can reproduce it close to verbatim. Work by Carlini and others has shown you can extract memorized training data from large models, that larger models memorize more, and that the single biggest driver of memorization is duplication. Data that appears many times in the corpus is far more likely to be regurgitated, which is also why deduplicating training data measurably reduces memorization.

Memorized content is attributable almost by definition. If the model emits your text verbatim, the link is not statistical, it is visible. This is the easy and legally loud case. But it is the minority case. Most of what a model does is generalization, recombination, the diffuse kind of influence that the methods above can only estimate. Building an entire compensation regime on the memorized tail would miss most of how these systems actually use data.

After pre-training, the ground shifts

Everything so far is about pre-training, the giant first pass over the open corpus. The picture changes, somewhat for the better, in the stages that come after.

Fine-tuning runs on smaller, curated datasets. The counterfactual is still the right definition and still not free, but the numbers are friendlier. With thousands or millions of examples instead of trillions, influence estimates are more stable and the set of candidate sources is bounded. Attribution here is hard but no longer hopeless.

Reinforcement learning from human feedback is harder again in a different way. The training signal is preference data, humans choosing between outputs, and that signal shapes behavior globally rather than injecting retrievable facts. Tracing a specific model behavior back to a specific preference label is murky, and the labelers themselves are a data source whose contribution almost no one accounts for.

Retrieval-augmented generation is the pleasant surprise. When a model answers by pulling documents into its context at run time, the documents it used are not a mystery. They are right there in the request. Attribution becomes logging. This is why citations in retrieval systems are tractable while citations for pre-training knowledge are not. If you want clean, per-use attribution today, retrieval is where it already exists.
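"Attribution becomes logging" can be taken almost literally. In this sketch, `generate` is a stand-in for whatever model call a system uses, and the record layout is an assumption made for illustration. The point is only that the list of sources is known before the answer exists.

```python
import hashlib


def answer_with_sources(question, retrieved, generate):
    """Answer from retrieved documents and return an attribution record with it.

    retrieved: list of (source_id, text) pairs placed in the context.
    generate:  any callable taking (question, context) and returning a string.
    """
    context = "\n\n".join(text for _, text in retrieved)
    answer = generate(question, context)
    record = {
        "question": question,
        "sources": [
            {"source_id": source_id,
             "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest()}
            for source_id, text in retrieved
        ],
        "answer": answer,
    }
    return answer, record
```

Hashing the text pins the record to the exact version of each document that was in the context, not just its name.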

In-context use is the trivial case. Whatever you put in the prompt, the model saw, and you know exactly what you put there. That is not really training, but it is increasingly how data reaches models in practice, and it points at where this is all heading.

Notice the pattern. The closer data sits to the moment of use, the more attributable it becomes. The deeper it is baked into the weights, the less. A consent system should take that seriously instead of pretending one mechanism fits all of it.

What LLMConsent does about it

Given all of the above, designing the protocol around perfect per-output attribution would be designing around something that does not exist. So we do not.

We separate the two questions on purpose. Provenance is treated as a verifiable fact, established when consent is checked at ingestion, not reconstructed later by attack. LCS-001 is built around that grant and check moment, so the record of what was permitted exists before any training happens, not as an afterthought.

For influence, we bound rather than measure. A consent token carries a maximum influence ceiling, expressed in basis points, that caps how much any single source is allowed to shape a model. Capping influence is far more achievable than measuring it exactly after the fact, and it turns an impossible accounting problem into a tractable policy one. You do not need to know that a source contributed 0.0007 of an output if the terms already said it may contribute at most a set fraction.
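This post does not spell out how a ceiling is enforced, so the following is one possible proxy and should be read as an assumption: cap each source's share of the training tokens, with the ceiling given in basis points (one basis point is one hundredth of a percent, so 10,000 basis points is the whole corpus). Data share is not the same thing as influence. It is simply something a pipeline can check before training starts.

```python
def share_bps(tokens_from_source, total_tokens):
    """A source's share of the corpus in basis points (1 bp = 0.01%)."""
    return tokens_from_source * 10_000 / total_tokens


def sources_over_ceiling(token_counts, ceilings_bps):
    """Sources whose share of the corpus exceeds their stated ceiling.

    token_counts: {source_id: tokens contributed}
    ceilings_bps: {source_id: maximum share in basis points}
    Compared in integers to avoid floating-point edge cases at the boundary.
    """
    total = sum(token_counts.values())
    return sorted(
        source for source, n in token_counts.items()
        if n * 10_000 > ceilings_bps[source] * total
    )
```

For example, a source contributing 50 tokens out of 10,000 holds a 50 basis point share, which would violate a 25 basis point ceiling and satisfy a 50 basis point one.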

Where attribution genuinely is available, retrieval, memorization, fine-tuning sets, the protocol expects implementations to use it, and the economics can be exact in those cases. Where it is not, the model is consent at ingestion plus bounded influence plus usage-based settlement, rather than a fantasy of tracing every token to a payee. We would rather ship a mechanism that is honest about its approximations than a precise-sounding one that quietly cannot be implemented.

There is also revocation, which drags in another genuinely open problem: machine unlearning. Removing the effect of a data point from an already-trained model is its own research frontier, with approaches that range from full retraining of data shards to approximate methods with real limits. The token in LCS-001 can signal that consent is revocable and that unlearning is requested, but we are not going to pretend the ecosystem can perfectly forget on command yet. It cannot. Naming the gap in the standard is better than hiding it.

The honest summary

Provenance is a solved problem if you record consent at the right moment. Influence during pre-training is, for now, fundamentally approximate, and the shape of these models suggests it will stay that way. Attribution improves the closer you get to the point of use, which is exactly where AI is moving with retrieval and agents. A serious consent protocol should prove what it can prove, bound what it cannot measure, and refuse to dress up estimates as receipts.

If you work on training data attribution, influence estimation, or unlearning, we would genuinely like to be wrong about the hard parts. The standards are on GitHub, and the place to argue is an issue or a proposal.

Introducing LLMConsent

2025-09-15T00:00:00+00:00

For the last two years the conversation about AI and data has been stuck in the same loop. A model ships. Someone notices their work is in it. A lawsuit gets filed, or a company adds an opt-out form that almost nobody finds, and everyone moves on until the next model ships. Nothing about the underlying machinery changes, because there is no underlying machinery. There is no agreed way for a person to say “yes, you can use this, under these terms” and for a system to check that answer before it acts.

That gap is what we are trying to close. Today we are releasing the first draft of LLMConsent, an open protocol for consent between humans and AI systems.

Why this is a protocol, not a product

The instinct most people have is to build a product. A consent dashboard, a licensing marketplace, a “data union” with a slick app. We think that instinct is wrong, or at least incomplete, and it is worth saying why up front.

A product governs whoever uses that product. If one company builds the best consent tool in the world, it still only covers the data flowing through that one company. The moment your data crosses into a different model, a different vendor, a different jurisdiction, the rules reset to zero. Consent that cannot travel with the data is not really consent. It is a setting on someone else’s server.

The things that actually solved this class of problem on the internet were not products. They were agreements. TCP/IP is an agreement about how to move packets. HTTP is an agreement about how to move documents. TLS is an agreement about how to do it without everyone watching. None of them are owned. Their value came precisely from the fact that everyone could implement them and nobody could revoke them.

AI is missing an agreement of that kind, and the missing one is about consent. Who is allowed to use what, for which purpose, for how long, and at what price. LLMConsent is our proposal for that agreement.

What it actually is

LLMConsent is a set of open standards. The core of it is a consent token: a signed, checkable statement that scopes how a piece of data or a person’s digital representation may be used. Train on it or do not. Run inference with it or do not. Let an agent act on it or do not. Bound how much any single source can influence a model. Set an expiry. Attach a price. Revoke it later if you change your mind.
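To give a feel for "signed and checkable", here is a minimal sketch of the grant, check, and revoke steps. It is not the LCS-001 format: the claim names are invented, expiry is a plain integer timestamp, and an HMAC over a shared key stands in for the public-key signature a real deployment would more likely use.

```python
import hashlib
import hmac
import json


def _payload(claims):
    """Canonical bytes for signing: sorted keys, no extra whitespace."""
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_token(claims, key):
    """Grant: wrap the claims with a keyed signature (HMAC-SHA256 stand-in)."""
    sig = hmac.new(key, _payload(claims), hashlib.sha256).hexdigest()
    return {"claims": claims, "sig": sig}


def check_token(token, key, use, now, revoked_ids):
    """Check: signature valid, not revoked, not expired, and the use is allowed."""
    expected = hmac.new(key, _payload(token["claims"]), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False                      # claims were altered after signing
    claims = token["claims"]
    if claims["token_id"] in revoked_ids:
        return False                      # revoke: owner changed their mind
    if now >= claims["expires_at"]:
        return False
    return use in claims["allowed_uses"]
```

A caller would run `check_token` immediately before each use, passing the current time and the latest set of revoked token identifiers, so that a revocation takes effect the next time the data is about to be touched.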

There are four core standards in this first release.

LCS-001 defines the consent token itself and the grant, check, and revoke lifecycle that everything else is built on.
LCS-002 describes a digital twin, a persistent, user-owned model that AI systems can reference with permission instead of each one rebuilding a private profile of you from scratch.
LCS-003 covers agent permissions: what an autonomous agent is allowed to do on your behalf, with spending limits, rate limits, delegation rules, and an audit trail.
LCS-004 handles memory shared across agents, so context can follow you between systems without you losing control of what gets remembered.

They build on each other in that order. You can read all four today. They are drafts, and they will change, and that is the point.

What it is not

Because of the company this protocol keeps, a few clarifications are worth making plainly.

This is not a token, a coin, or an investment of any kind. There is nothing to buy. The protocol uses cryptography because consent needs to be verifiable and signatures are how you verify things, but the goal is consent management, not speculation.

This is not a single company’s API with an open-source sticker on it. There are no admin keys. No one, including us, can freeze, seize, or quietly rewrite someone’s consent. If the project succeeds, it should outlive any of its current maintainers, the same way HTTP outlived the people who first wrote it down.

And it is not finished. The standards are early. The reference SDKs are mostly unwritten. We are publishing now, in this state, on purpose, because a standard written behind closed doors and revealed as a finished thing is not a standard. It is a press release. We would rather get the hard questions early.

On licensing

The code is under MIT. The standards and documentation are under CC BY 4.0. The split is deliberate. Implementations should be as easy to build on as possible, and specifications should be free to quote, translate, and fork as long as the attribution stays intact. Nobody should have to ask us for permission to implement a protocol whose entire purpose is permission.

How it is governed

The process is modeled on the ones that built the open internet: the IETF’s RFC process, the W3C, and the BIP and EIP improvement-proposal traditions. Anyone can propose a standard. Proposals are reviewed in public. Things move forward by rough consensus and working code, not by a vote we control. Accepted standards become immutable, and changing one means writing a new proposal that supersedes it, with the reasoning on the record.

We are in the bootstrap phase, and we are honest about what that means. Right now there is a small group doing the early work, and a founding maintainer with a tie-breaking vote that is meant to sunset. The plan is to give that power away as the community grows, not to accumulate it.

What we are asking for

If you have read this far, you are probably the kind of person we need.

Read LCS-001 and tell us where it is wrong. Try to implement a piece of it and tell us where the spec is ambiguous or impossible. Build an SDK in a language we have not covered. Open a proposal for the standard we forgot. Argue with us in the open.

The questions in front of us are genuinely unsolved. How do you attribute a model’s behavior back to the data that shaped it, and how much of that is even possible? How should compensation work when influence is diffuse? What does consent mean for an agent that acts in real time rather than a model trained once? We do not have clean answers to all of these. We have a structure to work them out in, and an insistence that the work happen in public.

This is the ground floor. Come build it with us.

The specs are on GitHub. Conversation happens on Discord. And if you just want to follow along, the email list is at the bottom of this page.