Differences
This shows you the differences between two versions of the page.
| uiai [2026/03/17 01:04] – pedroortega | uiai [2026/03/17 01:40] (current) – [Definition: Counterfactual action] pedroortega | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Universal Artificial Intelligence as Imitation ====== | ====== Universal Artificial Intelligence as Imitation ====== | ||
| - | **Pedro A. Ortega** | + | **Pedro A. Ortega**\\ |
| - | Daios Technologies | + | //Keywords: Solomonoff induction, universal imitation, causal interventions, |
| - | Correspondence: | + | Technical Report\\ |
| - | Keywords: Solomonoff induction, universal imitation, causal interventions, | + | March 2026\\ |
| - | Technical Report | + | |
| - | March 2026 | + | |
| ===== Abstract ===== | ===== Abstract ===== | ||
| Line 400: | Line 398: | ||
| Generate $(\dot{\gamma}_j, | Generate $(\dot{\gamma}_j, | ||
| - | - **Shared prefix:** Set | + | * **Shared prefix:** Set $\dot{\gamma}_{\le k-1} := \gamma_{\le k-1}$, $\dot{x}_{\le k-1} := x_{\le k-1}$. |
| - | | + | |
| - | | + | |
| - | \qquad | + | |
| - | | + | |
| - | $$ | + | |
| - | - **Force an $\mathcal{A}$-block start:** Set | + | * **Force an $\mathcal{A}$-block start:** Set $\dot{\gamma}_k := 1$. |
| - | | + | |
| - | | + | |
| - | $$ | + | |
| - | - **Evolve branch chronologically: | + | * **Evolve branch chronologically: |
| - | | + | $$ |
| - | | + | |
| - | $$ | + | |
| - | | + | |
| - | $$ | + | |
| \dot{\gamma}_{j+1} \sim \Gamma(\cdot \mid \dot{\gamma}_{\le j}, \dot{x}_{\le j}). | \dot{\gamma}_{j+1} \sim \Gamma(\cdot \mid \dot{\gamma}_{\le j}, \dot{x}_{\le j}). | ||
| - | | + | $$ |
| Let $k' > k$ be the first position such that $\dot{\gamma}_{k' | Let $k' > k$ be the first position such that $\dot{\gamma}_{k' | ||
| Line 440: | Line 426: | ||
| To define the world’s $\mathcal{A}$-continuation at $k$, run the following tokenization procedure, initialized from the already-written on-path transcript up to $k-1$. Let $(\dot{\gamma}_j)_{j \ge k}$ be generated as follows: | To define the world’s $\mathcal{A}$-continuation at $k$, run the following tokenization procedure, initialized from the already-written on-path transcript up to $k-1$. Let $(\dot{\gamma}_j)_{j \ge k}$ be generated as follows: | ||
| - | - **Shared prefix:** Set | + | * **Shared prefix:** Set $\dot{\gamma}_{\le k-1} := \gamma_{\le k-1}$. |
| - | | + | |
| - | | + | |
| - | $$ | + | |
| - | - **Force an $\mathcal{A}$-block start:** Set | + | * **Force an $\mathcal{A}$-block start:** Set $\dot{\gamma}_{k} := 1$. |
| - | | + | |
| - | | + | |
| - | $$ | + | |
| - | - **Read transcript chronologically: | + | * **Read transcript chronologically: |
| - | $$ | + | $$ |
| \dot{\gamma}_{j+1} \sim \Gamma(\cdot \mid \dot{\gamma}_{\le j}, x_{\le j}). | \dot{\gamma}_{j+1} \sim \Gamma(\cdot \mid \dot{\gamma}_{\le j}, x_{\le j}). | ||
| - | | + | $$ |
| Let $k' > k$ be the first position such that $\dot{\gamma}_{k' | Let $k' > k$ be the first position such that $\dot{\gamma}_{k' | ||
| Line 565: | Line 545: | ||
| Assume $(\Sigma, | Assume $(\Sigma, | ||
| - | - **Action-slot is chosen by coin flip.** | + | * **Action-slot is chosen by coin flip.** At each $k_i$, the gate draws $\gamma(k_i) \sim \mathrm{Bernoulli}(\rho_i)$, $\rho_i \in (0,1)$, where $\rho_i$ is a chronological function of the agent-visible history $h_i$. Conditional on $h_i$, the bit $\gamma(k_i)$ is independent of the world’s $\mathcal{A}$-token $\dot{a}^{(k_i)}$ at $k_i$. |
| - | | + | |
| - | | + | |
| - | | + | |
| - | \qquad | + | |
| - | | + | |
| - | $$ | + | |
| - | | + | |
| - | - **Gate held fixed through action-slot.** | + | * **Gate held fixed through action-slot.** The gate holds the value of $\gamma(k_i)$ fixed throughout the $\mathcal{A}$-token beginning at $k_i$. If $\gamma(k_i)=0$, |
| - | | + | |
| - | - **Infinitely many agent-written slots.** | + | * **Infinitely many agent-written slots.** With probability $1$, $\gamma(k_i)=1$ occurs for infinitely many $i$. |
| - | | + | |
| **Induced agent interventions and world targets.** | **Induced agent interventions and world targets.** | ||
| - | Before we proceed, we need to clarify the indexing of action slots, and in particular their substrate position versus agent-time. According to the standard setup, the schedule specifies substrate positions $k_1 < k_2 < \cdots$. Then $\dot{a}^{(k_i)} \in \mathcal{A}$ denotes the $\mathcal{A}$-token the world would write starting at $k_i$. If $\gamma(k_i)=0$ this token is realized on-path as an embedded third-party action; if $\gamma(k_i)=1$ it is only a counterfactual target. To index only the factual actions, the slots assigned to the agent, let | + | Before we proceed, we need to clarify the indexing of action slots, and in particular their substrate position versus agent-time. According to the standard setup, the schedule specifies substrate positions $k_1 < k_2 < \cdots$. Then $\dot{a}^{(k_i)} \in \mathcal{A}$ denotes the $\mathcal{A}$-token the world would write starting at $k_i$. If $\gamma(k_i)=0$ this token is realized on-path as an embedded third-party action; if $\gamma(k_i)=1$ it is only a counterfactual target. To index only the factual actions, the slots assigned to the agent, let $i_1 < i_2 < \cdots$ be the random indices with $\gamma(k_{i_t}) = 1$. For each $t \ge 1$, define $a_{t+1} \in \mathcal{A}$ as the $\mathcal{A}$-token the agent actually writes at $k_{i_t}$, and define the corresponding counterfactual target by $\dot{a}_{t+1} := \dot{a}^{(k_{i_t})}$. |
| - | + | ||
| - | $$ | + | |
| - | i_1 < i_2 < \cdots | + | |
| - | $$ | + | |
| - | + | ||
| - | be the random indices with $\gamma(k_{i_t}) = 1$. For each $t \ge 1$, define $a_{t+1} \in \mathcal{A}$ as the $\mathcal{A}$-token the agent actually writes at $k_{i_t}$, and define the corresponding counterfactual target by | + | |
| - | + | ||
| - | $$ | + | |
| - | \dot{a}_{t+1} := \dot{a}^{(k_{i_t})}. | + | |
| - | $$ | + | |
| Notice that in this case the previous observation token was completed, and hence | Notice that in this case the previous observation token was completed, and hence | ||
| Line 600: | Line 561: | ||
| **Deviation measures.** | **Deviation measures.** | ||
| - | To quantify how closely intrinsic completion tracks the target continuation, | + | To quantify how closely intrinsic completion tracks the target continuation, |
| - | $$ | + | For the measure $\mu$ set $\overline{\mu}(a \mid \cdot) := \mu(a \mid \cdot) \quad \text{for } a \in \mathcal{A}$, and $ |
| - | \overline{\mathcal{A}} := \mathcal{A} \cup \{\bot\}. | + | \overline{\mu}(\bot \mid \cdot) := 0$. |
| - | $$ | + | |
| - | + | ||
| - | Define | + | |
| - | + | ||
| - | $$ | + | |
| - | \overline{M}(a \mid \cdot) := M(a \mid \cdot) \quad \text{for } a \in \mathcal{A}, | + | |
| - | $$ | + | |
| - | + | ||
| - | and | + | |
| - | + | ||
| - | $$ | + | |
| - | \overline{M}(\bot \mid \cdot) := 1 - \sum_{a \in \mathcal{A}} M(a \mid \cdot). | + | |
| - | $$ | + | |
| - | + | ||
| - | For the measure $\mu$ set | + | |
| - | + | ||
| - | $$ | + | |
| - | \overline{\mu}(a \mid \cdot) := \mu(a \mid \cdot) \quad \text{for } a \in \mathcal{A}, | + | |
| - | $$ | + | |
| - | + | ||
| - | and | + | |
| - | + | ||
| - | $$ | + | |
| - | \overline{\mu}(\bot \mid \cdot) := 0. | + | |
| - | $$ | + | |
| For distributions $P,Q$ on a countable set, define | For distributions $P,Q$ on a countable set, define | ||
| Line 1153: | Line 1089: | ||
| Notice that we can combine a variety of schema rules that instantiate well-known decision principles and other preference structures, such as: | Notice that we can combine a variety of schema rules that instantiate well-known decision principles and other preference structures, such as: | ||
| - | - // | + | * // |
| - | - // | + | |
| - | - // | + | |
| - | - //Choice from comparisons:// | + | |
| - | - //Rule- / constitution-following:// | + | |
| - | - //Program synthesis / tool protocol:// $u$ encodes a specification together with a computable evaluator or tool interface; $f$ outputs a program, or macro-action, | + | |
| - | - //Norms / dialogue acts:// $u$ encodes an interaction context together with a computable norm taxonomy; $f$ outputs an appropriate dialogue act, such as apologize, clarify, refuse, or defer, consistent with the norms and the stated context. | + | |
| On prompts of the corresponding type, the agent will behave “as if” following the schema. | On prompts of the corresponding type, the agent will behave “as if” following the schema. | ||