Publications
- Theses
- Articles and Conference Papers
  - 2026
  - 2023
  - 2022
  - 2021
  - 2020
  - 2019
  - 2018
  - 2017
  - 2016
  - 2015
  - 2014
  - 2013
  - 2012
  - 2011
  - 2010
  - 2009 and earlier

Publications

Marked entries denote my favorite or most representative works.

Theses

[2] Ortega, P.A.
A Unified Framework for Resource-Bounded Autonomous Agents Interacting with Unknown Environments
PhD Thesis, Dept. of Engineering, University of Cambridge, 2011.
Thesis supervisor: Zoubin Ghahramani
Thesis committee: Marcus Hutter and Carl E. Rasmussen
[PDF]

[1] Ortega, P.A.
Design of Interactive Processing Mechanisms for the Analysis of Brain Waves (in Spanish)
Dissertation, School of Physical and Mathematical Sciences, University of Chile, 2005.
[PDF]

Articles and Conference Papers

2026

[57] Safe AI Should be Bounded and Multi-Agent
Hyland D., Jarne Ornia D., Bishop N., Dyer J., Macmillan-Scott O., Gaveniak T., Calinescu A., Wooldridge M., Rosas F., Ortega P.A.
Position Paper
[PDF]

[56] Bounded Rationality, Hedging, and Generalization
Ortega, P.A.
Under Review, ArXiv:2605.15340, 2026
[PDF] [HTML]

[55] Universal Artificial Intelligence as Imitation
Ortega, P.A.
Under Review
[PDF] [PDF-general audience]

2023

[54] Neural Networks and the Chomsky Hierarchy
Delétang G., Ruoss A., Grau-Moya J., Genewein T., Wenliang L.K., Catt E., Cundy C., Hutter M., Legg S., Veness J., Ortega P.A.
International Conference on Learning Representations (ICLR), 2023
[PDF]

2022

[53] Beyond Bayes-optimality: meta-learning what you know you don't know
Grau-Moya J., Delétang G., Kunesch M., Genewein T., Catt E., Wenliang L.K., Ruoss A., Cundy C., Veness J., Wang J.X., Hutter M., Summerfield C., Legg S., Ortega P.A.
ArXiv:2207.02098, 2022
[PDF]

[52] Your Policy Regularizer is Secretly an Adversary
Brekelmans R., Genewein T., Grau-Moya J., Delétang G., Kunesch M., Legg S., Ortega P.A.
Transactions on Machine Learning Research, 2022
[PDF]

2021

[51] Model-Free Risk-Sensitive Reinforcement Learning
Delétang G., Grau-Moya J., Kunesch M., Genewein T., Brekelmans R., Legg S., Ortega P.A.
DeepMind Technical Report, ArXiv:2111.02907, 2021
[PDF]

[50] Shaking the foundations: delusions in sequence models for interaction and control
Ortega P.A., Kunesch M., Delétang G., Genewein T., Grau-Moya J., Veness J., Buchli J., Degrave J., Piot B., Perolat J., Everitt T., Tallec C., Parisotto E., Erez T., Chen Y., de Freitas N., Legg S.
DeepMind Technical Report, ArXiv:2110.10819, 2021
[PDF]

[49] Causal Analysis of Agent Behavior for AI Safety
Delétang G., Grau-Moya J., Martic M., Genewein T., McGrath T., Mikulik V., Kunesch M., Legg S., Ortega P.A.
ArXiv:2103.03938, 2021
[PDF]

[48] From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization
Perolat J., Munos R., Lespiau J.-B., Omidshafiei S., Rowland M., Ortega P.A., Burch N., Anthony T., Balduzzi D., De Vylder B., Piliouras G., Lanctot M., Tuyls K.
ICML 2021
[PDF]

2020

[47] Agent Incentives: A Causal Perspective
Everitt T., Carey R., Langlois E., Ortega P.A., Legg S.
AAAI Conference on Artificial Intelligence, 2020.
[PDF]

[46] Meta-trained agents implement Bayes-optimal agents
Mikulik V., Delétang G., McGrath T., Genewein T., Martic M., Legg S., Ortega P.A.
Neural Information Processing Systems (NeurIPS), 2020.
[PDF]

[45] Algorithms for Causal Reasoning in Probability Trees
Genewein T., McGrath T., Delétang G., Mikulik V., Martic M., Legg S., Ortega P.A.
ArXiv:2010.12237, 2020
[PDF] [Colab Tutorial]

[44] Action and Perception as Divergence Minimization
Hafner D., Ortega P.A., Ba J., Parr T., Friston K., Heess N.
arXiv:2009.01791, 2020
[PDF]

2019

[43] Meta reinforcement learning as task inference
Humplik J., Galashov A., Hasenclever L., Ortega P.A., Teh Y.W., Heess N.
arXiv:1905.06424, 2019
[PDF]

[42] Intrinsic Social Motivation via Causal Influence in Multi-Agent RL
Jaques N., Lazaridou A., Hughes E., Gulcehre C., Ortega P.A., Strouse D.J., Leibo J.Z., de Freitas N.
International Conference on Machine Learning (ICML), 2019
[PDF]

[41] Meta-learning of Sequential Strategies
Ortega P.A., Wang J.X., Rowland M., Genewein T., Kurth-Nelson Z., Pascanu R., Heess N., Veness J., Pritzel A., Sprechmann P., Jayakumar S.M., McGrath T., Miller K., Azar M., Osband I., Rabinowitz N., György A., Chiappa S., Osindero S., Teh Y.W., van Hasselt H., de Freitas N., Botvinick M., Legg S.
DeepMind Technical Report, 2019
[PDF]

[40] Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings
Everitt T., Ortega P.A., Barnes E., Legg S.
arXiv:1902.09980, 2019
[PDF]

[39] Causal Reasoning from Meta-reinforcement Learning
Dasgupta I., Wang J., Chiappa S., Mitrovic J., Ortega P.A., Raposo D., Hughes E., Battaglia P., Botvinick M., Kurth-Nelson Z.
arXiv:1901.08162, 2019
[PDF]

2018

[38] Bayesian Optimistic Kullback-Leibler Exploration
Lee K., Kim G.-H., Ortega P.A., Lee D.D., and Kim K.-E.
Machine Learning, 2018
[PDF]

[37] Modelling Friends and Foes
Ortega, P.A. and Legg, S.
ArXiv:1807.00196, 2018
[PDF]

2017

[36] AI safety gridworlds.
Leike, J., Martic, M., Krakovna, V., Ortega, P.A., Everitt, T., Lefrancq, A., Orseau, L. and Legg, S.
ArXiv:1711.09883, 2017
[PDF]

2016

[35] Memory controls time perception and intertemporal choices
Ortega, P.A. and Tishby, N.
ArXiv:1604.05129, 2016
[PDF]

[34] Human Decision-Making under Limited Time.
Ortega, P.A. and Stocker, A.A.
Neural Information Processing Systems (NIPS), 2016.
[PDF]

[33] Bayesian Reinforcement Learning with Behavioral Feedback.
Hong, T., Lee, J., Kim, K.-E., Ortega, P.A., and Lee, D.D.
International Joint Conference on Artificial Intelligence (IJCAI), 2016.
[PDF]

[32] Decision-making under ambiguity is modulated by visual framing, but not by motor vs. non-motor context. Experiments and an information-theoretic ambiguity model.
Grau-Moya, J. and Ortega, P.A. and Braun, D.A.
PLoS One, 11(4):e0153179, 2016.
[PDF]

2015

[31] Information-Theoretic Bounded Rationality
Ortega, P.A., Braun, D.A., Dyer, J.S., Kim, K.-E., and Tishby, N.
ArXiv:1512.06789, 2015
[PDF]

[30] Commentary: What is epistemic value in free energy models of learning and acting? A bounded rationality perspective.
Ortega, P.A. and Braun, D.A.
Cognitive Neuroscience, 2015.
[PDF]

[29] Subjectivity, Bayesianism, and Causality
Ortega, P.A.
Special Issue on Philosophical Aspects of Pattern Recognition
Pattern Recognition Letters, pp. 63-70, 2015
[PDF]

[28] Causal reasoning in a prediction task with hidden causes
Ortega, P.A. and Lee, D.D. and Stocker, A.A.
37th Annual Cognitive Science Society Meeting (CogSci), 2015
[PDF] [PDF Slides] [Poster]

[27] Reactive bandits with attitude
Ortega, P.A. and Kim, K.-E. and Lee, D.D.
18th International Conference on Artificial Intelligence and Statistics (AISTATS), 2015
[PDF] [Poster]

[26] Belief flows for robust online learning
Ortega, P.A. and Crammer, K. and Lee, D.D.
Information Theory and Applications (ITA), pp. 70-77, 2015
[PDF] [PDF Slides]

[25] Perceptual adaptation: Getting ready for the future
Wei, X.-X. and Ortega, P.A. and Stocker, A.A.
Computational and Systems Neuroscience (Cosyne), 2015
Won the best poster award

2014

[24] Information-theoretic bounded rationality and $ϵ$-optimality
Braun, D.A. and Ortega, P.A.
Entropy 16(8), 4662-4676, 2014
[PDF]

[23] An Adversarial Interpretation of Information-Theoretic Bounded Rationality
Ortega, P.A. and Lee, D.D.
Twenty-Eighth AAAI Conference on Artificial Intelligence (AAAI '14), 2014
[PDF] [Poster]

[22] Generalized Thompson Sampling for Sequential Decision-Making and Causal Inference
Ortega, P.A. and Braun, D.A.
Complex Adaptive Systems Modeling 2:2, 2014
[PDF]

[21] Dynamic Belief State Representations
Lee, D.D., Ortega, P.A. and Stocker, A.A.
Current Opinion in Neurobiology 25, pp. 221–227, 2014
[PDF]

[20] Monte Carlo Methods for Exact & Efficient Solution of the Generalized Optimality Equations
Ortega, P.A., Braun, D.A. and Tishby, N.
IEEE International Conference on Robotics and Automation (ICRA), 2014
[PDF]

2013

[19] An Adversarial Interpretation of Information-Theoretic Bounded Rationality
Ortega, P.A. and Lee, D.D.
NIPS Workshop on Planning with Information Constraints, 2013
[PDF]

[18] Thermodynamics as a theory of decision-making with information processing costs
Ortega, P.A. and Braun, D.A.
Proceedings of the Royal Society A 20120683, 2013.
[PDF]

[17] Metabolic cost as an organizing principle for cooperative learning
Balduzzi D., Ortega, P.A. and Besserve, M.
Advances in Complex Systems 2013.
[PDF]

2012

[16] Adaptive Coding of Actions and Observations
Ortega, P.A. and Braun, D.A.
NIPS Workshop on Information in Perception and Action, 2012.
[PDF]

[15] A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function
Ortega, P.A., Grau-Moya, J., Genewein, T., Balduzzi, D. and Braun, D.A.
Neural Information Processing Systems (NIPS) 2012
[PDF] [CODE]

[14] Risk-Sensitivity in Bayesian Sensorimotor Integration
Grau-Moya, J., Ortega, P.A. and Braun, D.A.
PLOS Computational Biology 8(9):e1002698, 2012
[PDF]

[13] Free Energy and the Generalized Optimality Equations for Sequential Decision Making
Ortega, P.A. and Braun, D.A.
European Workshop on Reinforcement Learning 2012
[PDF]

2011

[12] Bayesian Causal Induction
Ortega, P.A.
NIPS Workshop on Philosophy and Machine Learning, 2011.
[PDF]

[11] Information, Utility and Bounded Rationality
Ortega, P.A. and Braun, D.A.
The fourth conference on artificial general intelligence, pp. 269-274, 2011.
[PDF]

[10] Reinforcement Learning and the Bayesian Control Rule
Ortega, P.A. and Braun, D.A. and Godsill, S.J.
The fourth conference on artificial general intelligence, pp. 281-285, 2011.
[PDF]

[9] Motor coordination: When two have to act as one
Braun, D.A. and Ortega P.A. and Wolpert D.M.
2011 Special issue of Experimental Brain Research on Joint Action
[PDF]

[8] Path Integral Control and Bounded Rationality
Braun, D.A. and Ortega, P.A. and Theodorou, E. and Schaal, S.
2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Paris.
[PDF]

2010

[7] A minimum relative entropy principle for learning and acting
Ortega, P.A. and Braun, D.A.
Journal of Artificial Intelligence Research 38, pp. 475-511, 2010.
[PDF]

[6] A minimum relative entropy principle for adaptive control in linear quadratic regulators
Braun, D.A. and Ortega, P.A.
Proceedings of the 7th international conference on informatics in control, automation and robotics, pp. 103-108, 2010
[PDF]

[5] A conversion between utility and information
Ortega, P.A. and Braun, D.A.
The third conference on artificial general intelligence, pp. 115-120, 2010
[PDF]

[4] A Bayesian rule for adaptive control based on causal interventions
Ortega, P.A. and Braun, D.A.
The third conference on artificial general intelligence, pp. 121-126, 2010
[PDF]

2009 and earlier

[3] Nash equilibria in multi-agent motor interactions
Braun D.A., Ortega P.A. and Wolpert D.M.
PLoS Computational Biology 5(8):e1000468, 2009
[PDF]

[2] Error Backpropagation with Generalized Functional Composition
Bassi, A. and Ortega, P.A.
Technical Report, Department of Computer Science, University of Chile (2006)
[PDF]

[1] A Medical Claim Fraud/Abuse Detection System based on Data Mining: A Case Study in Chile
Ortega, P.A. and Figueroa, C. and Ruz, G.
DMIN 2006:224-231
[PDF]
