<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="FeedCreator 1.8" -->
<?xml-stylesheet href="https://adaptiveagents.org/lib/exe/css.php?s=feed" type="text/css"?>
<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel rdf:about="https://adaptiveagents.org/feed.php">
        <title>Pedro A. Ortega</title>
        <description></description>
        <link>https://adaptiveagents.org/</link>
        <image rdf:resource="https://adaptiveagents.org/_media/logo.png" />
        <dc:date>2026-05-06T12:12:28+00:00</dc:date>
        <items>
            <rdf:Seq>
                <rdf:li rdf:resource="https://adaptiveagents.org/and_or_kl?rev=1722020014&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/argmaxprior?rev=1700414465&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/bayesian_causal_induction?rev=1700414467&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/bayesian_control_rule?rev=1700414466&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/belief_flows?rev=1744125640&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/bio?rev=1775139262&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/causality?rev=1700414469&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/compai?rev=1728915301&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/drawings?rev=1700414469&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/freeenergy?rev=1700414466&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/home?rev=1775726230&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/iscolloquium?rev=1700414470&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/klderivation?rev=1735045082&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/l-factor?rev=1726597865&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/leejc?rev=1700414468&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/mdp?rev=1700414465&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/oldnews?rev=1700414468&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/posts?rev=1772885567&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/publications?rev=1775138626&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/robustness?rev=1752163764&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/sba?rev=1730052717&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/sidebar?rev=1718901187&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/syntax?rev=1707739749&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/third_person?rev=1735130419&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/uiai?rev=1773743436&amp;do=diff"/>
                <rdf:li rdf:resource="https://adaptiveagents.org/universal_ai_as_imitation?rev=1773711408&amp;do=diff"/>
            </rdf:Seq>
        </items>
    </channel>
    <image rdf:about="https://adaptiveagents.org/_media/logo.png">
        <title>Pedro A. Ortega</title>
        <link>https://adaptiveagents.org/</link>
        <url>https://adaptiveagents.org/_media/logo.png</url>
    </image>
    <item rdf:about="https://adaptiveagents.org/and_or_kl?rev=1722020014&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-07-26T18:53:34+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>and_or_kl</title>
        <link>https://adaptiveagents.org/and_or_kl?rev=1722020014&amp;do=diff</link>
        <description>And, Or, and the Two KL Projections

	&quot; I discuss the difference between minimizing the KL-divergence with respect to the first and the second argument, and conclude that they correspond to AND and OR operations on distributions, respectively.&quot;

Cite as: Ortega, P.A. Compare the two projections
$$
    \min_p D(p \| q) \qquad \text{versus} \qquad \min_p D(q \| p),
$$
where the KL-divergence is defined as
$$
    D(p \| q) := \sum_x p(x) \log \frac{ p(x) }{ q(x) }.
$$
Given $N$ distributions $q_1, q_2, \ldots, q_N$ over $\mathcal{X}$ and weights $w_1, w_2, \ldots, w_N$, consider the mixture
$$
    q(x) = \sum_i w_i q_i(x),
$$…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/argmaxprior?rev=1700414465&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2023-11-19T17:21:05+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>argmaxprior</title>
        <link>https://adaptiveagents.org/argmaxprior?rev=1700414465&amp;do=diff</link>
        <description>Arg-Max Prior

Paper

Ortega, P.A., Grau-Moya, J., Genewein, T., Balduzzi, D. and Braun, D.A.

“A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function.” 

Neural Information Processing Systems (NIPS) 2012

[PDF]

Download Code
Consider a noisy function $h(x): \mathcal{X} \rightarrow \mathcal{R}$ with mean $\bar{f}(x)$, a dataset $\mathcal{D}_t := \{(x_i, y_i)\}_{i=1}^t$, and a precision parameter $\alpha&gt;0$. The prior over the maximizing argument $x^\ast$ takes the form
\[ 
  P(x^\ast|\mathcal{D}_t) \propto \exp\{ \alpha \cdot h(x^\ast) \}. 
\]
The parameter $\alpha$ acts as an inverse temperature: $\alpha \approx 0$ flattens the prior, while $\alpha \rightarrow \infty…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/bayesian_causal_induction?rev=1700414467&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2023-11-19T17:21:07+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>bayesian_causal_induction</title>
        <link>https://adaptiveagents.org/bayesian_causal_induction?rev=1700414467&amp;do=diff</link>
        <description>Bayesian Causal Induction

...also known as Causal Discovery.

This talk was first presented at the 2011 NIPS Workshop on Philosophy and Machine Learning.
The talk slides are [here], and the workshop paper [here].
Please cite this as: Pedro A. Ortega. Bayesian Causal Induction. 2011 NIPS Workshop in Philosophy and Machine Learning. Given two variables $X$ and $Y$, consider the hypotheses $h$ and $\neg h$ about whether $X$ causes $Y$ or $Y$ causes $X$. After observing $x$ and $y$, the posterior over the hypothesis $H = h$ is
\begin{align*}
P(h|x,y) &amp;= \frac{P(y…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/bayesian_control_rule?rev=1700414466&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2023-11-19T17:21:06+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>bayesian_control_rule</title>
        <link>https://adaptiveagents.org/bayesian_control_rule?rev=1700414466&amp;do=diff</link>
        <description>Thompson Sampling &amp; Bayesian Control Rule

Thompson sampling is not just a heuristic with nice properties; under closer scrutiny, it reveals some interesting aspects of the reinforcement learning problem that have not been analyzed before. Two aspects are particularly interesting: the intimate connection to Bayesian inference (in fact, to adaptive compression) and the intricate relation to causality.
\[
  P(\theta|\hat{A},O) = \frac{ P(\theta) P(\hat{A}, O|\theta) }{ P(\hat{A}, O) },…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/belief_flows?rev=1744125640&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2025-04-08T15:20:40+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>belief_flows</title>
        <link>https://adaptiveagents.org/belief_flows?rev=1744125640&amp;do=diff</link>
        <description>Belief Flows

Paper

Ortega, P.A., Crammer, K., Lee, D.D.

“Belief Flows for Robust Online Learning.” 

Information Theory and Applications (ITA), February 2015.

[PDF] [Slides]

In a nutshell

[Belief flows illustration]

Belief flows chooses the most conservative belief update given a single observation of the error gradient at a location chosen through Thompson sampling. Consider a model $F_w(x)$ with inputs $x \in \mathbb{R}^p$ and weights $w \in \mathbb{R}^d$, together with a factorized Gaussian belief $P(w)$ over the weights:
\[ P(w) = N(w; \mu, \Sigma) = \prod_n N(w_n; \mu_n, \sigma^2_n), \]…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/bio?rev=1775139262&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2026-04-02T14:14:22+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>bio</title>
        <link>https://adaptiveagents.org/bio?rev=1775139262&amp;do=diff</link>
        <description>CV and Bio

Curriculum Vitae
 Curriculum Vitae [PDF] (updated March 2026)
Short Bio

Pedro A. Ortega is the founder of Daios AI. Previously he was VP of Research at Kosen Labs and the lead of the Safety Analysis Team at DeepMind. His research focuses on the formal principles of intelligent systems, addressing basic questions in AGI and AGI safety research, including bounded-rational &amp; risk-sensitive planning and causal generalization. His approach lies at the intersection between machine l…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/causality?rev=1700414469&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2023-11-19T17:21:09+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>causality</title>
        <link>https://adaptiveagents.org/causality?rev=1700414469&amp;do=diff</link>
        <description>Measure-Theoretic Causality

My super-old causality slides can be found [here].
Try out the Colab tutorial with a causal reasoning engine in it.

Paper

Subjectivity, Bayesianism, and Causality

Ortega, P.A.

Special Issue on Philosophical Aspects of Pattern Recognition. Consider a collection $\mathcal{R}$ of subsets of $\Omega$ such that: $\Omega \in \mathcal{R}$; any two $U, V \in \mathcal{R}$ satisfy either $U \cap V = \varnothing$, $U \subset V$, or $V \subset U$; and for any $U, V \in \mathcal{R}$ with $V \subset U$ there exists a sequence $(V_n)_{n \in \mathbb{N}}$ in $\mathcal{R}$ such that $U \setminus V = \bigcup_n V_n$, where the sequence $(V_n)_{…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/compai?rev=1728915301&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-10-14T14:15:01+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>compai</title>
        <link>https://adaptiveagents.org/compai?rev=1728915301&amp;do=diff</link>
        <description>Induction and AI

	&quot; I explore how: pattern recognition relates to computation, its connection to logic through induction and deduction, and why universal pattern recognition defies the simplicity of universal computation.&quot;

What&#039;s the next entry in this list?
$$
  \text{B123A, A231B, B312A, A123B, } \ldots
$$
The answer is $\text{B231A}$. Where computation runs a program $p$ forward to produce data $x$, pattern recognition recovers the program $p$ from the data $x$; schematically,
$$
  \text{PatternRecognition}(x) = \text{Computation}^{-1}(x).
$$</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/drawings?rev=1700414469&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2023-11-19T17:21:09+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>drawings</title>
        <link>https://adaptiveagents.org/drawings?rev=1700414469&amp;do=diff</link>
        <description>Photos/Drawings
 [Pedro Ortega in Cambridge]  [Pedro Ortega in Santiago]  [Pedro Ortega in Jerusalem]  Cambridge (2010)  Santiago (2011)  Jerusalem (2013)  [Pedro Ortega in Philly]  [Pedro Ortega at Tate Modern]  Philly (2016)  Tate Modern (2019)  [Pizza Daniel Braun and Pedro Ortega]  Pizza with Daniel (2009)  [At the NIH]  [Christopher Bishop, Zoubin Ghahramani and Pedro Ortega]  Pedro Ortega at NIH (2011)  With C. Bishop and Z. Ghahramani (2006)  [Lella]  [Pedro Ortega in Waterbeach]  [Ludwig…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/freeenergy?rev=1700414466&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2023-11-19T17:21:06+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>freeenergy</title>
        <link>https://adaptiveagents.org/freeenergy?rev=1700414466&amp;do=diff</link>
        <description>Information-Theoretic Bounded Rationality

Under construction.

Check out our latest summary paper:

Ortega, P.A., Braun, D.A., Dyer, J.S., Kim, K.-E., and Tishby, N.

Information-Theoretic Bounded Rationality

ArXiv:1512.06789, 2015

[PDF]



Bounded rationality. Consider a set of actions $\mathcal{A}$, a set of outcomes $\mathcal{X}$, a utility function $U: \mathcal{X} \rightarrow \mathbb{R}$ assigning a value $U(x)$ to each outcome, and a model $P$ specifying the probability $P(x|a)$ of an outcome $x \in \mathcal{X}$ given an action $a \in \mathcal{A}$. A perfectly rational agent picks the optimal action $a^\ast \in \mathcal{A}$:
\begin{align}
   a^\ast &amp;= \arg\max_{a \in \mathcal{A}} \mathbf{E}[ U|a ] \\
   &amp;= \arg\max_{…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/home?rev=1775726230&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2026-04-09T09:17:10+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>home</title>
        <link>https://adaptiveagents.org/home?rev=1775726230&amp;do=diff</link>
        <description>Pedro A. Ortega

AGI and Cybernetics Researcher


[Pedro A. Ortega]

About

Founder of the AGI startup Daios. Previously I was VP of Research at Kosen Labs and the lead of the Safety Analysis Team at DeepMind. My research focuses on artificial general intelligence and the formal principles of intelligence</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/iscolloquium?rev=1700414470&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2023-11-19T17:21:10+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>iscolloquium</title>
        <link>https://adaptiveagents.org/iscolloquium?rev=1700414470&amp;do=diff</link>
        <description>Max Planck Intelligent Systems Colloquium (IS Colloquium)

The Max-Planck IS Colloquium is a series of talks given by world-renowned researchers on topics of broad appeal to the intelligent systems community. Invited participants include graduate students, faculty, and other interested members of the Max-Planck community. The goal is to foster discussion and dialogue on larger themes that encourage sophisticated and interdisciplinary perspectives.</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/klderivation?rev=1735045082&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-12-24T12:58:02+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>klderivation</title>
        <link>https://adaptiveagents.org/klderivation?rev=1735045082&amp;do=diff</link>
        <description>Why does every choice come with an entropy tax?

	&quot; I present a very general derivation that shows how every choice carries an unavoidable “entropy tax,” reflecting the hidden cost of shifting from old beliefs to new choices. &quot;

Cite as: Ortega, P.A. “Why does every choice come with a tax?”, Tech Note 3, DAIOS, 2024.
\[
   \text{Choice Tax} \propto \sum_x P(x|d) \log \frac{ P(x|d) }{ P(x) },
\]
where $P(x)$ is the belief before and $P(x|d)$ the belief after the decision $d$ over outcomes $x$ in $\mathcal{X}$. Consider a sample space $\Omega$ with probability measure $P$, elements $\omega \in \Omega$, and events $e \subset \Omega$, where an event $e$ obtains whenever $\omega \in …</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/l-factor?rev=1726597865&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-09-17T18:31:05+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>l-factor</title>
        <link>https://adaptiveagents.org/l-factor?rev=1726597865&amp;do=diff</link>
        <description>L-Factor

Compute your L-Factor: the number of papers where you were first or last author, minus the number of papers where you were a middle author. You can use the Python code below (requires installing scholarly and thefuzz).


from scholarly import scholarly
from thefuzz import fuzz

author_name = &#039;NAME_HERE&#039;  # replace with the author&#039;s name as a string

search_query = scholarly.search_author(author_name)
result = next(search_query)
author = scholarly.fill(result)

good = 0
bad = 0

# Iterate over publications.
for pub in author[&#039;publica…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/leejc?rev=1700414468&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2023-11-19T17:21:08+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>leejc</title>
        <link>https://adaptiveagents.org/leejc?rev=1700414468&amp;do=diff</link>
        <description>Lee Lab Journal Club

When and Where
 Location: Engineering, Levine 512
 Date: Every Wednesday at 6:30pm

Meetings

 Jan 15, 2014: Fast Algs. for Gaussian Noise Invariant ICA. Speaker: Jimmy Wang. To read: Paper
 Jan 22, 2014: Generative Local Metric Learning for Nearest Neighbor Classification</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/mdp?rev=1700414465&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2023-11-19T17:21:05+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>mdp</title>
        <link>https://adaptiveagents.org/mdp?rev=1700414465&amp;do=diff</link>
        <description>MDPs Using Bayesian Control Rule/Thompson Sampling

This is the model-free reinforcement learning algorithm that we originally used as an example to showcase the Bayesian control rule, inspired by the “Bayesian Q-Learning” paper by Dearden et al. Please cite as: Ortega, P.A. and Braun D.A. An MDP is a tuple $(\mathcal{X}, \mathcal{A}, T, r)$, where $\mathcal{X}$ is the set of states, $\mathcal{A}$ the set of actions, $T_a(x;x&#039;) = P(x&#039;|a,x)$ the probability that choosing action $a \in \mathcal{A}$ in state $x \in \mathcal{X}$ leads to state $x&#039; \in \mathcal{X}$, and $r(x,a) \in \mathcal{R} := \mathbb{R}$ the reward obtained in state $x \in \mathcal{X}$ after taking action $a \in \mat…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/oldnews?rev=1700414468&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2023-11-19T17:21:08+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>oldnews</title>
        <link>https://adaptiveagents.org/oldnews?rev=1700414468&amp;do=diff</link>
        <description>Old News

	*  15th July 2014: “Subjectivity, Bayesianism, and Causality” available as a preprint on arXiv.
	*  9th December 2013: The NIPS Workshop on Planning with Information Constraints was a success!
	*  25th September 2013: Talk “Information-Theoretic Bounded Rationality</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/posts?rev=1772885567&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-07T12:12:47+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>posts</title>
        <link>https://adaptiveagents.org/posts?rev=1772885567&amp;do=diff</link>
        <description>Blog &amp; Essays

Universal Artificial Intelligence as Imitation

Beyond Alignment: The Case for a Robustness Agenda in AI Safety

How to translate third-person into first-person experience?

Why does every choice come with an entropy tax?

Induction and AI

And, Or, and the Two KL-Projections

Old Stuff

A Summary of Bounded Rationality 

Thompson Sampling / Bayesian Control Rule 

Bayesian Control Rule for MDPs 

Causality and its measure-theoretic formalization 

Bayesian Causal Induction 

A…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/publications?rev=1775138626&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2026-04-02T14:03:46+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>publications</title>
        <link>https://adaptiveagents.org/publications?rev=1775138626&amp;do=diff</link>
        <description>Publications

The :!: symbols denote my favorite or most representative works.

Theses

[2] Ortega, P.A.

:!: A Unified Framework for Resource-Bounded Autonomous Agents Interacting with Unknown Environments

PhD Thesis, Dept. of Engineering, University of Cambridge, 2011.</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/robustness?rev=1752163764&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2025-07-10T16:09:24+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>robustness</title>
        <link>https://adaptiveagents.org/robustness?rev=1752163764&amp;do=diff</link>
        <description>Beyond Alignment: Robustness in AI Safety

	&quot; Advanced AI is highly adaptable yet inherently unpredictable, making it nearly impossible to embed a fixed set of human values from the start. Traditional alignment methods fall short because AI can reinterpret its goals dynamically, so instead, we need a robustness approach—one that emphasizes continuous oversight, rigorous stress-testing, and outcome-based regulation. This strategy mirrors how we manage human unpredictability, keeping human respons…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/sba?rev=1730052717&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-10-27T18:11:57+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>sba</title>
        <link>https://adaptiveagents.org/sba?rev=1730052717&amp;do=diff</link>
        <description>Stochastic Blahut Arimoto for Fine-Tuning LLMs

	&quot; We will derive a reinforcement learning algorithm suitable for integration with deep learning architectures, grounded in robust principles from rate-distortion theory. This approach will yield an agent optimized for memory efficiency.&quot; Let $P(\tau)$ be a prior over trajectories and $R(\tau) \in \mathbb{R}$ a reward function. The free-energy objective is
\[
  F(Q) = 
  \mathbb{E}_{Q}\bigl[R(\tau)\bigr] 
  - \frac{1}{\beta} D_{KL}\bigl( Q \| P \bigr)
\]
with optimal solution
\[
  P^\ast(\tau) = \frac{P(\tau) \exp(\beta R(\tau))}{\sum_{\ta…</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/sidebar?rev=1718901187&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-06-20T16:33:07+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>sidebar</title>
        <link>https://adaptiveagents.org/sidebar?rev=1718901187&amp;do=diff</link>
        <description>Home 

Blog &amp; Essays 

CV &amp; Bio 

Publications 

Pics &amp; Drawings 

Google Scholar 

Twitter</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/syntax?rev=1707739749&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-02-12T12:09:09+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>syntax</title>
        <link>https://adaptiveagents.org/syntax?rev=1707739749&amp;do=diff</link>
        <description>Formatting Syntax

DokuWiki supports some simple markup language, which tries to make the datafiles as readable as possible. This page contains all the possible syntax you may use when editing pages. Simply have a look at the source of this page by pressing</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/third_person?rev=1735130419&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2024-12-25T12:40:19+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>third_person</title>
        <link>https://adaptiveagents.org/third_person?rev=1735130419&amp;do=diff</link>
        <description>How to translate third-person into first-person?

:!: Article under construction.

	&quot; Imitation is a potent learning mechanism observed across the animal kingdom, enabling individuals to acquire behaviors without direct, first-person experience. Unlike operant conditioning, the core mechanism behind reinforcement learning, which relies on personal actions and their consequences, imitation allows for learning through observation, bypassing the need for direct reinforcement. …</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/uiai?rev=1773743436&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-17T10:30:36+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>uiai</title>
        <link>https://adaptiveagents.org/uiai?rev=1773743436&amp;do=diff</link>
        <description>Universal Artificial Intelligence as Imitation

Pedro A. Ortega

Keywords: Solomonoff induction, universal imitation, causal interventions, adaptive control. 

Technical Report

March 2026


Abstract

Modern AI often defines agency as reward maximization: specify an objective, then learn to optimize it through interaction. This paper argues for an alternative foundation in which agency is inference: purposeful behavior emerges from learning compact generative explanations of how outcomes depend …</description>
    </item>
    <item rdf:about="https://adaptiveagents.org/universal_ai_as_imitation?rev=1773711408&amp;do=diff">
        <dc:format>text/html</dc:format>
        <dc:date>2026-03-17T01:36:48+00:00</dc:date>
        <dc:creator>Anonymous (anonymous@undisclosed.example.com)</dc:creator>
        <title>universal_ai_as_imitation</title>
        <link>https://adaptiveagents.org/universal_ai_as_imitation?rev=1773711408&amp;do=diff</link>
        <description>Universal Artificial Intelligence as Imitation

Full Paper: [ PDF ] -  HTML

Simplified: [General Audience Version]

	&quot; Many AI frameworks define agency as utility maximization: specify a scalar objective, then learn actions that increase it. A different foundation treats agency as imitation
$$
-\log P(\textbf{see(o)} \mid \textbf{do(a)}, \textbf{context}).
$$</description>
    </item>
</rdf:RDF>
