Working proof — v0.9 draft

The Gödel–Chaitin Modeling Boundary — Or, Why There's No Digital Twin for Causality

On Formal Limits of Self-Describing Systems, with Applications to Digital Twins, Language-Theoretic Security, and Large Language Models.

Trey Darley ∶ Proper Tools SRL, Brussels ∶ Published 2026-04-29

Status: v0.9 working draft. Privately circulated since December 2025 in search of co-authors. Now published as-is for open critique. Refinement expected; co-authors and reviewers welcome.

Abstract

We identify and formalize a class of limits that emerges when systems become sufficiently expressive to model themselves. We call this the Gödel–Chaitin Modeling Boundary: the point at which a system's representational power is sufficient for useful self-description, but insufficient for complete or consistent self-verification.

We demonstrate these limits across three domains:

Digital twins of complex physical and socio-technical systems, for which we prove that predictive equivalence, causal fidelity, and behavioral isomorphism are formally unattainable above a defined complexity threshold.
Language-theoretic security, in which input ambiguity becomes an attack surface precisely because systems must interpret languages they cannot fully define from within.
Large language models, for which we prove by reduction to Rice's theorem that prompt injection is not an implementation flaw but a structural consequence of operating above an expressiveness threshold.

In each case, the limits arise from the same underlying structure: Kolmogorov incompressibility, Lyapunov instability, Gödelian incompleteness, and Chaitinian randomness. The domains change. The boundary does not.

The implication is direct: these problems are not engineering failures awaiting better technology. They are mathematical constraints on what can be known, represented, or predicted about sufficiently complex systems. Responsible design requires accepting the boundary, not pretending it does not exist.

1. Introduction

When I was fourteen, I disabled a single setting on a production network: Spanning Tree.

It looked harmless. I thought it might speed things up.

Instead, the network ate itself. Switches collapsed into a broadcast storm. Traffic looped endlessly. Bandwidth turned into heat, and the system froze.

Only years later did I understand what had happened. Spanning Tree is not a traffic feature. It is how a network reasons about its own structure. By disabling it, I did not remove an optional component — I removed the system's ability to understand itself.

The network was powerful enough to move data anywhere. It was not powerful enough to survive without a model of its own topology.

This was not just a configuration mistake. It was an early encounter with a deeper limit.

That limit is the subject of this paper.

The pattern appears across domains. A rulebook that cannot prove its own correctness. A parser that must guess at ambiguous input. A language model that cannot reliably distinguish instruction from content. A digital twin that diverges from the system it claims to represent. In each case, the failure is not accidental. It is structural. It follows from the same formal boundary, encountered from a different angle.

We argue that these are not separate problems. They are the same problem: the problem of self-description in sufficiently expressive systems.

We call the point at which this problem becomes unavoidable the Gödel–Chaitin Modeling Boundary. This paper defines it precisely, demonstrates it across three domains, and draws practical conclusions for system design, security architecture, and the governance of AI systems.

2. Definitions

Unified Representation System. A system in which instructions and the content operated upon by those instructions are expressed in the same representational language, and in which the system is capable of interpreting meta-instructions that modify its own instruction-following behavior. (Large language models are a prominent example — systems in which instructions and content share a single representational substrate.)

Digital Twin. A computational representation that claims predictive or explanatory equivalence to a physical, biological, or socio-technical system.

Causal Structure. The set of all state transitions, dependencies, and generative rules that determine a system's evolution.

Complexity Threshold C. A regime in which the minimal description of the system (in the Kolmogorov sense) is not significantly smaller than the system itself, and/or the system exhibits sensitivity to initial conditions.

Expressiveness Threshold T. The point at which a system can interpret instructions that refer to, modify, or override its own instruction-following behavior, using the same representational substrate as ordinary content.

Out-of-Band Authority. Any mechanism for establishing instruction legitimacy that does not rely solely on the content of the input itself — for example, cryptographic signatures, architectural separation, or trusted channels.

Gödel–Chaitin Modeling Boundary. The point at which a system's representational power is sufficient for useful self-description, but insufficient for complete or consistent self-verification. Above this boundary, some uncertainty is unavoidable — not because of missing information, but because of the formal structure of self-reference.

3. Formal Foundations

The limits identified in this paper rest on four results from mathematics and theoretical computer science. We state them here in accessible form; formal citations follow.

Kolmogorov Incompressibility (Kolmogorov, 1965; Chaitin, 1966). For any sufficiently complex system, the shortest possible description of the system's causal structure is asymptotically equal to the system itself. Any model is necessarily a lossy compression. Some truths are irreducibly complex — their shortest description is the ground truth itself. Crucially, which outcomes are algorithmically random cannot be known in advance; determining this would require solving the halting problem for the system's generative process.

Sensitive Dependence on Initial Conditions (Lorenz, 1963; Lyapunov, formalized). For systems exhibiting chaotic dynamics, prediction error grows exponentially with time — even under infinitesimal parameter error. No refinement of model precision eliminates this divergence; it merely delays it.

Gödelian Incompleteness (Gödel, 1931). Any sufficiently expressive formal system contains true statements that cannot be proven using the system's own rules. A system powerful enough to describe itself cannot contain a complete and consistent description of itself.

Rice's Theorem (Rice, 1953). Any non-trivial semantic property of the inputs to a sufficiently expressive interpreter is undecidable. That is, there is no general procedure that can determine, for all possible inputs, whether a given input has a given semantic property — without restricting the language's expressiveness.

Together, these results identify the overall modeling boundary. The point at which representation fails, prediction diverges, and equivalence becomes formally unattainable is not a contingent engineering limit. It is a mathematical one.

4. Domain I: Digital Twins and the Limits of Causal Equivalence

4.1 The Promise and the Hidden Assumption

The promise of the digital twin is seductive: a faithful computational mirror of a physical, biological, or socio-technical system, capable of prediction, explanation, and counterfactual reasoning. Build the model detailed enough, the thinking goes, and it becomes functionally equivalent to the thing itself.

But this promise rests on a hidden assumption: that the causal structure of a complex system can be captured in a representation smaller than the system itself. For systems above complexity threshold C, this assumption is false.

4.2 Premises

Premise 1 (Kolmogorov Incompressibility). For systems above threshold C, the shortest possible description of the system's causal structure is asymptotically equal to the system itself. Any model is necessarily a lossy compression.

Premise 2 (Lyapunov Instability). For systems above threshold C exhibiting chaotic dynamics, prediction error grows exponentially with time — even under infinitesimal parameter error.

Premise 3 (Representation Requires Compression). Any digital twin is a finite structure. To be smaller than the system it models, it must compress — and therefore discard — causally relevant information.

Premise 4 (Self-Referential Systems). Human and socio-technical systems contain agents that model themselves, other agents, and the system as a whole. This introduces reflexive causal loops: actions based on expectations, expectations based on models, models based on observed actions. Such systems fall directly under Gödel–Chaitin limits.

Premise 5 (Empirical Witness). Systems typically targeted for digital twin applications — ecosystems, organisms, economies, urban systems, supply chains, organizations — empirically exhibit complexity above threshold C.

4.3 The Modeling Trilemma

Any digital twin of a self-referential system above threshold C must either:

Exclude some causally relevant features (incompleteness), or
Permit contradictions or predictive failures (inconsistency), or
Replicate the system at full complexity (negating the purpose of modeling).

No fourth option exists, even in principle.

4.4 Proof Sketch

Assume a perfect digital twin D exists for a system S above complexity threshold C. To be perfect, D must preserve all causally relevant information, reproduce all possible trajectories of S, and include all self-referential causal loops within S.

However:

By Kolmogorov incompressibility, S's full causal structure cannot be compressed into a smaller model D.
By Lyapunov divergence, arbitrarily small errors in D's initial state yield divergent trajectories over finite time.
By Gödel, S's self-referential causal structure contains undecidable relationships that D cannot simultaneously represent completely and consistently.
By Chaitin, algorithmically random elements of S cannot be predicted or compressed by D without replicating S in full — and the location of these elements is itself undecidable.

Therefore, no perfect digital twin D can exist for any system S above complexity threshold C. ∎

4.5 What Digital Twins Can and Cannot Do

This result does not argue that digital twins are useless. They remain powerful tools for bounded prediction, scenario exploration, and uncertainty reduction. Their strength is pragmatic, not ontological: digital twins aid engineers and decision-makers in navigating uncertainty, provided we do not mistake approximation for equivalence or confuse model behavior with system behavior.

There is no digital twin for causality. There is only modeling — done with appropriate humility about what models can and cannot do.

5. Domain II: Language-Theoretic Security and the Attack Surface of Ambiguity

5.1 Input as Language

Computers are often described as machines that follow instructions. This is true, but incomplete.

A real computer system does not only follow rules. It must also read, interpret, and act on representations of rules. Files, network messages, commands, timestamps, and configurations are all descriptions the system must understand before it can act.

When a program reads input from the network, that input is not merely data. It is a message written in some language. The program must decide where the message starts, where it ends, and what each part means. If the language is fully defined, the program can behave safely. If it is incomplete or ambiguous, the program must guess. (This is precisely why the original MacOS filesystem divided all files into Resource and Data forks.)

Those guesses are precisely where most catastrophic failures of assumptions at scale in code occur.

5.2 The LANGSEC Observation

Language-theoretic security — LANGSEC — formalizes this observation. It argues that many security failures occur not because attackers are unusually clever, but because the system is confused about what it is reading. When the rules of the input language are unclear, attackers explore the gaps.

If a program cannot clearly tell where one message ends and another begins, an attacker can hide instructions inside the message. If a file format allows multiple interpretations, the system may choose the wrong one. If input is treated as data in one place and as instructions in another, unexpected behavior follows.

In each case, the failure does not come from a missing check or a typo in the code. It comes from ambiguity — from the system not fully understanding the language it is interpreting.

This is the rulebook problem again. The program must decide what is allowed using rules that are themselves incomplete. The system cannot fully protect itself from within.

5.3 The Structural Connection

If you classify vulnerabilities by root cause rather than by symptom, ambiguity in input interpretation dominates the landscape. Eliminating ambiguity would remove the majority of real-world exploit classes we currently know how to name.

But for systems above the Gödel–Chaitin expressiveness threshold, ambiguity cannot be fully eliminated. The system's power — its ability to interpret complex, context-dependent input — is the same property that makes complete disambiguation impossible. Capability and vulnerability are not separate properties. They are the same property, viewed from different angles.

LANGSEC shows one place where the Gödel–Chaitin boundary becomes visible in practice. The next section shows another.

6. Domain III: Prompt Injection and the Undecidability of Instruction Boundaries

6.1 The Problem

Uniform Representation Systems (aka, Large language models) accept instructions and content through the same channel, in the same language. A user writes a prompt; the model reads it. A user provides a document; the model reads it. The system must decide what is instruction and what is content using the same interpretive machinery for both.

If an input says “ignore your previous instructions,” the system must decide whether this is a command or text describing a command. The answer depends not on the words alone, but on their meaning and their effect on the system's behavior. This is a semantic question, not a syntactic one.

This is known as prompt injection. It is commonly treated as a bug — something that better training or stronger filters will eventually eliminate.

It is neither.

6.2 Premises

Premise 6 (Expressiveness Threshold). There exists a threshold of expressiveness T, defined as the ability of a system to interpret and act on instructions that refer to, modify, or override its own instruction-following behavior, using the same representational substrate as ordinary content.

Premise 7 (Undecidable Boundary). For any system whose control behavior is mediated exclusively through an input channel expressed in a language meeting threshold T, there exists no procedure operating solely on that input — without access to out-of-band authority — that can, in all cases, decide whether a given segment should be interpreted as instruction or as content, without restricting the language's expressiveness.

Premise 8 (Empirical Witness). Large language models demonstrably exceed threshold T, as evidenced by their correct interpretation of meta-instructions that modify instruction-following behavior expressed entirely within the same input stream.

6.3 Lemma: Reduction to Rice's Theorem

A system above threshold T functions as an interpreter for its input language; input strings function analogously to programs in the interpreted language, with their interpretation determining system behavior.

Assume, for contradiction, that there exists a procedure D operating solely on the input string — without access to out-of-band authority — that, for any input string S, correctly decides whether S should be interpreted as instruction or as content.

Whether an input should be interpreted as instruction depends on its meaning and effect on the system's behavior, not merely on its syntactic form. It is therefore a semantic property of inputs.

By Rice's Theorem, any non-trivial semantic property of inputs to a sufficiently expressive interpreter is undecidable. The property “is an instruction” is non-trivial: some inputs satisfy it and others do not. Because the system's behavior is determined by interpreting input strings as executable control descriptions, the mapping from string → behavior is computable and non-trivial.

Therefore, no procedure D operating solely on input can decide instruction boundaries for systems above threshold T. ∎

6.4 Implication

Prompt injection in systems above threshold T is not a defect of implementation but a structural consequence of operating above the expressiveness threshold. This reframes prompt injection defenses:

Not boundary enforcement, but damage containment
Not input classification, but architectural containment, trust hierarchies, and human governance

Mitigation may reduce incidence or impact. Elimination without loss of expressiveness is impossible.

We are asking for a sharp knife that cannot cut the wielder. The request is coherent. The artifact is not.

7. The Unified Boundary

The three domains examined above exhibit the same structure:

Domain	System	Self-description	Formal limit
Digital twins	Physical/socio-technical	Causal model	Kolmogorov + Lyapunov + Gödel
LANGSEC	Parsers and interpreters	Input grammar	Incompleteness of language definition
LLM prompt injection	Language models	Instruction/content boundary	Rice's theorem

In each case:

The system must model or interpret something about itself or its inputs.
The system's usefulness depends on that modeling being expressive.
That expressiveness introduces a boundary beyond which the modeling cannot be complete, consistent, or decidable.
The boundary is not a contingent engineering limit. It is a mathematical one.

The domain changes. The structure does not.

8. Designing for the Boundary

There is a temptation, when confronted with formal limits, to assume they apply somewhere else — to larger systems, or less careful engineers.

They do not. They apply to any system that has crossed the threshold. And we are building a civilization on such systems.

Our global digital infrastructure is not merely complicated. It is a distributed network of interdependent, Turing-complete components that interpret one another's outputs, model their own state, and depend on shared representations — of time, identity, and authority — that cannot be fully verified from within the system.

Its aggregate failure modes are not merely difficult to enumerate. They are formally impossible to enumerate. Any consistent model the system constructs of itself is necessarily incomplete.

This is not a flaw to be eliminated. It is a structural condition.

The question is not whether our systems will encounter their limits. They already have. The question is whether we design for that encounter honestly, or pretend it is not happening.

Honest design begins with accepting what cannot be solved:

Prompt injection cannot be eliminated from systems whose power depends on interpreting instructions and content in the same language.
Timestamp overflows cannot be inventoried out of existence in systems whose components recursively reference one another's state.
Parsing ambiguities cannot be patched away from systems that must accept input languages they cannot fully define.
Digital twins cannot achieve causal equivalence with the systems they model.

In each case, the responsible path is the same: architectural containment, explicit trust hierarchies, graceful degradation, and human governance over the gaps no formal system can close for itself.

Perfection is not available. What is available is a practice: building systems that remain coherent under stress, fail in ways humans can understand, and do not pretend certainty where none exists.

9. Conclusion

We have demonstrated that the Gödel–Chaitin Modeling Boundary — the formal limit of self-description in sufficiently expressive systems — manifests consistently across digital twins, language-theoretic security, and large language model prompt injection.

The limits are not engineering failures awaiting better technology. They are mathematical constraints on what can be known, represented, or predicted about sufficiently complex systems.

Digital twins cannot achieve causal equivalence with systems above complexity threshold C.

Parsers and interpreters cannot eliminate ambiguity from input languages they cannot fully define from within.

Large language models cannot decide instruction boundaries without out-of-band authority, by reduction to Rice's theorem.

In each domain, the same conclusion follows: the responsible response is not to pretend the boundary does not exist, but to design for it — with architectural containment, trust boundaries, graceful failure, and human governance over the gaps.

There is no digital twin for causality.
There is no parser that eliminates all ambiguity.
There is no prompt injection filter that eliminates all injection.

There is only honest design: acknowledging the boundary, working within it, and building systems that fail gracefully when they encounter it.

References

Chaitin, G. J. (1966). On the length of programs for computing finite binary sequences. Journal of the ACM, 13(4), 547–569.
Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, 38, 173–198.
Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information. Problems of Information Transmission, 1(1), 1–7.
Lorenz, E. N. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20(2), 130–141.
Rice, H. G. (1953). Classes of recursively enumerable sets and their decision problems. Transactions of the American Mathematical Society, 74(2), 358–366.
Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 2(42), 230–265.
Darley, T. (2025). On the undecidability of instruction boundaries in unified representation systems, or: Why prompt injection is not a bug. TLP:CLEAR working draft.