Click on [|•|] to show an entry's abstract, and on the abstract to hide it. Click on a title to open the paper on arXiv, or on the authors to open it on ar5iv.
[|•|] Bootstrapping Motor Skill Learning with Motion Planning (2021)   -   Abbatematteo, Ben and Rosen, Eric and Tellex, Stefanie and Konidaris, George   [|•|]
[|•|] Relative Entropy Regularized Policy Iteration (2018)   -   Abdolmaleki, Abbas and Springenberg, Jost Tobias and Degrave, Jonas and Bohez, Steven and Tassa, Yuval and Belov, Dan and Heess, Nicolas and Riedmiller, Martin   [|•|]
[|•|] Towards Characterizing Divergence in Deep Q-Learning (2019)   -   Achiam, Joshua and Knight, Ethan and Abbeel, Pieter   [|•|]
[|•|] Legged Locomotion in Challenging Terrains using Egocentric Vision (2022)   -   Agarwal, Ananye and Kumar, Ashish and Malik, Jitendra and Pathak, Deepak   [|•|]
[|•|] Deep Reinforcement Learning at the Edge of the Statistical Precipice (2021)   -   Agarwal, Rishabh and Schwarzer, Max and Castro, Pablo Samuel and Courville, Aaron and Bellemare, Marc G.   [|•|]
[|•|] Understanding the impact of entropy on policy optimization (2018)   -   Ahmed, Zafarali and Roux, Nicolas Le and Norouzi, Mohammad and Schuurmans, Dale   [|•|]
[|•|] OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning (2020)   -   Ajay, Anurag and Kumar, Aviral and Agrawal, Pulkit and Levine, Sergey and Nachum, Ofir   [|•|]
[|•|] Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards (2020)   -   Amin, Susan and Gomrokchi, Maziar and Aboutalebi, Hossein and Satija, Harsh and Precup, Doina   [|•|]
[|•|] A Survey of Exploration Methods in Reinforcement Learning (2021)   -   Amin, Susan and Gomrokchi, Maziar and Satija, Harsh and Hoof, Herke van and Precup, Doina   [|•|]
[|•|] Input Convex Neural Networks (2016)   -   Amos, Brandon and Xu, Lei and Kolter, J. Zico   [|•|]
[|•|] Hindsight Experience Replay (2017)   -   Andrychowicz, Marcin and Wolski, Filip and Ray, Alex and Schneider, Jonas and Fong, Rachel and Welinder, Peter and McGrew, Bob and Tobin, Josh and Abbeel, Pieter and Zaremba, Wojciech   [|•|]
[|•|] Layer-wise learning of deep generative models (2012)   -   Arnold, Ludovic and Ollivier, Yann   [|•|]
[|•|] A Brief Survey of Deep Reinforcement Learning (2017)   -   Arulkumaran, Kai and Deisenroth, Marc Peter and Brundage, Miles and Bharath, Anil Anthony   [|•|]
[|•|] Breaking the Curse of Dimensionality with Convex Neural Networks (2014)   -   Bach, Francis   [|•|]
[|•|] The Option-Critic Architecture (2016)   -   Bacon, Pierre-Luc and Harb, Jean and Precup, Doina   [|•|]
[|•|] Never Give Up: Learning Directed Exploration Strategies (2020)   -   Badia, Adrià Puigdomènech and Sprechmann, Pablo and Vitvitskyi, Alex and Guo, Daniel and Piot, Bilal and Kapturowski, Steven and Tieleman, Olivier and Arjovsky, Martín and Pritzel, Alexander and Bolt, Andew and Blundell, Charles   [|•|]
[|•|] Compatible Value Gradients for Reinforcement Learning of Continuous Deep Policies (2015)   -   Balduzzi, David and Ghifary, Muhammad   [|•|]
[|•|] Efficient Online Reinforcement Learning with Offline Data (2023)   -   Ball, Philip J. and Smith, Laura and Kostrikov, Ilya and Levine, Sergey   [|•|]
[|•|] Ready Policy One: World Building Through Active Learning (2020)   -   Ball, Philip and Parker-Holder, Jack and Pacchiano, Aldo and Choromanski, Krzysztof and Roberts, Stephen   [|•|]
[|•|] Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization (2020)   -   Barde, Paul and Roy, Julien and Jeon, Wonseok and Pineau, Joelle and Pal, Christopher and Nowrouzezahrai, Derek   [|•|]
[|•|] Successor Features for Transfer in Reinforcement Learning (2016)   -   Barreto, André and Dabney, Will and Munos, Rémi and Hunt, Jonathan J. and Schaul, Tom and Hasselt, Hado van and Silver, David   [|•|]
[|•|] Rearrangement: A Challenge for Embodied AI (2020)   -   Batra, Dhruv and Chang, Angel X. and Chernova, Sonia and Davison, Andrew J. and Deng, Jia and Koltun, Vladlen and Levine, Sergey and Malik, Jitendra and Mordatch, Igor and Mottaghi, Roozbeh and Savva, Manolis and Su, Hao   [|•|]
[|•|] Relational inductive biases, deep learning, and graph networks (2018)   -   Battaglia, Peter W. and Hamrick, Jessica B. and Bapst, Victor and Sanchez-Gonzalez, Alvaro and Zambaldi, Vinicius and Malinowski, Mateusz and Tacchetti, Andrea and Raposo, David and Santoro, Adam and Faulkner, Ryan and Gulcehre, Caglar and Song, Francis and Ballard, Andrew and Gilmer, Justin and Dahl, George and Vaswani, Ashish and Allen, Kelsey and Nash, Charles and Langston, Victoria and Dyer, Chris and Heess, Nicolas and Wierstra, Daan and Kohli, Pushmeet and Botvinick, Matt and Vinyals, Oriol and Li, Yujia and Pascanu, Razvan   [|•|]
[|•|] Learning to Continually Learn (2020)   -   Beaulieu, Shawn and Frati, Lapo and Miconi, Thomas and Lehman, Joel and Stanley, Kenneth O. and Clune, Jeff and Cheney, Nick   [|•|]
[|•|] Training in Task Space to Speed Up and Guide Reinforcement Learning (2019)   -   Bellegarda, Guillaume and Byl, Katie   [|•|]
[|•|] A Geometric Perspective on Optimal Representations for Reinforcement Learning (2019)   -   Bellemare, Marc G. and Dabney, Will and Dadashi, Robert and Taiga, Adrien Ali and Castro, Pablo Samuel and Roux, Nicolas Le and Schuurmans, Dale and Lattimore, Tor and Lyle, Clare   [|•|]
[|•|] A Distributional Perspective on Reinforcement Learning (2017)   -   Bellemare, Marc G. and Dabney, Will and Munos, Rémi   [|•|]
[|•|] The Cramer Distance as a Solution to Biased Wasserstein Gradients (2017)   -   Bellemare, Marc G. and Danihelka, Ivo and Dabney, Will and Mohamed, Shakir and Lakshminarayanan, Balaji and Hoyer, Stephan and Munos, Rémi   [|•|]
[|•|] Unifying Count-Based Exploration and Intrinsic Motivation (2016)   -   Bellemare, Marc G. and Srinivasan, Sriram and Ostrovski, Georg and Schaul, Tom and Saxton, David and Munos, Remi   [|•|]
[|•|] Representation Learning: A Review and New Perspectives (2012)   -   Bengio, Yoshua and Courville, Aaron and Vincent, Pascal   [|•|]
[|•|] Model-Based Action Exploration for Learning Dynamic Motion Skills (2018)   -   Berseth, Glen and Panne, Michiel van de   [|•|]
[|•|] LEAF: Latent Exploration Along the Frontier (2020)   -   Bharadhwaj, Homanga and Garg, Animesh and Shkurti, Florian   [|•|]
[|•|] Proximal Distilled Evolutionary Reinforcement Learning (2019)   -   Bodnar, Cristian and Day, Ben and Lió, Pietro   [|•|]
[|•|] Universal Successor Features Approximators (2018)   -   Borsa, Diana and Barreto, André and Quan, John and Mankowitz, Daniel and Munos, Rémi and Hasselt, Hado van and Silver, David and Schaul, Tom   [|•|]
[|•|] Practical Gauss-Newton Optimisation for Deep Learning (2017)   -   Botev, Aleksandar and Ritter, Hippolyt and Barber, David   [|•|]
[|•|] A Theory of Universal Learning (2020)   -   Bousquet, Olivier and Hanneke, Steve and Moran, Shay and Handel, Ramon van and Yehudayoff, Amir   [|•|]
[|•|] On Identifiability in Transformers (2019)   -   Brunner, Gino and Liu, Yang and Pascual, Damián and Richter, Oliver and Ciaramita, Massimiliano and Wattenhofer, Roger   [|•|]
[|•|] Modern Koopman Theory for Dynamical Systems (2021)   -   Brunton, Steven L. and Budišić, Marko and Kaiser, Eurika and Kutz, J. Nathan   [|•|]
[|•|] Exploration by Random Network Distillation (2018)   -   Burda, Yuri and Edwards, Harrison and Storkey, Amos and Klimov, Oleg   [|•|]
[|•|] Offline Reinforcement Learning at Multiple Frequencies (2022)   -   Burns, Kaylee and Yu, Tianhe and Finn, Chelsea and Hausman, Karol   [|•|]
[|•|] Capturability-based Pattern Generation for Walking with Variable Height (2018)   -   Caron, Stéphane and Escande, Adrien and Lanari, Leonardo and Mallein, Bastien   [|•|]
[|•|] Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning (2022)   -   Castanet, Nicolas and Lamprier, Sylvain and Sigaud, Olivier   [|•|]
[|•|] Robust Feedback Motion Policy Design Using Reinforcement Learning on a 3D Digit Bipedal Robot (2021)   -   Castillo, Guillermo A. and Weng, Bowen and Zhang, Wei and Hereid, Ayonga   [|•|]
[|•|] Learning Action Representations for Reinforcement Learning (2019)   -   Chandak, Yash and Theocharous, Georgios and Kostas, James and Jordan, Scott and Thomas, Philip S.   [|•|]
[|•|] Goal-Conditioned Reinforcement Learning with Imagined Subgoals (2021)   -   Chane-Sane, Elliot and Schmid, Cordelia and Laptev, Ivan   [|•|]
[|•|] Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills (2021)   -   Chebotar, Yevgen and Hausman, Karol and Lu, Yao and Xiao, Ted and Kalashnikov, Dmitry and Varley, Jake and Irpan, Alex and Eysenbach, Benjamin and Julian, Ryan and Finn, Chelsea and Levine, Sergey   [|•|]
[|•|] Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for Robotic Bin Picking (2022)   -   Chen, Kai and Cao, Rui and James, Stephen and Li, Yichuan and Liu, Yun-Hui and Abbeel, Pieter and Dou, Qi   [|•|]
[|•|] Decision Transformer: Reinforcement Learning via Sequence Modeling (2021)   -   Chen, Lili and Lu, Kevin and Rajeswaran, Aravind and Lee, Kimin and Grover, Aditya and Laskin, Michael and Abbeel, Pieter and Srinivas, Aravind and Mordatch, Igor   [|•|]
[|•|] Neural Ordinary Differential Equations (2018)   -   Chen, Ricky T. Q. and Rubanova, Yulia and Bettencourt, Jesse and Duvenaud, David   [|•|]
[|•|] InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (2016)   -   Chen, Xi and Duan, Yan and Houthooft, Rein and Schulman, John and Sutskever, Ilya and Abbeel, Pieter   [|•|]
[|•|] Symbolic Discovery of Optimization Algorithms (2023)   -   Chen, Xiangning and Liang, Chen and Huang, Da and Real, Esteban and Wang, Kaiyuan and Liu, Yao and Pham, Hieu and Dong, Xuanyi and Luong, Thang and Hsieh, Cho-Jui and Lu, Yifeng and Le, Quoc V.   [|•|]
[|•|] Randomized Ensembled Double Q-Learning: Learning Fast Without a Model (2021)   -   Chen, Xinyue and Wang, Che and Zhou, Zijian and Ross, Keith   [|•|]
[|•|] Optimal transport natural gradient for statistical manifolds with continuous sample space (2018)   -   Chen, Yifan and Li, Wuchen   [|•|]
[|•|] Adversarially Trained Actor Critic for Offline Reinforcement Learning (2022)   -   Cheng, Ching-An and Xie, Tengyang and Jiang, Nan and Agarwal, Alekh   [|•|]
[|•|] Divide & Conquer Imitation Learning (2022)   -   Chenu, Alexandre and Perrin-Gilbert, Nicolas and Sigaud, Olivier   [|•|]
[|•|] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion (2023)   -   Chi, Cheng and Xu, Zhenjia and Feng, Siyuan and Cousineau, Eric and Du, Yilun and Burchfiel, Benjamin and Tedrake, Russ and Song, Shuran   [|•|]
[|•|] Lyapunov-based Safe Policy Optimization for Continuous Control (2019)   -   Chow, Yinlam and Nachum, Ofir and Faust, Aleksandra and Duenez-Guzman, Edgar and Ghavamzadeh, Mohammad   [|•|]
[|•|] Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning (2019)   -   Chu, Casey and Blanchet, Jose and Glynn, Peter   [|•|]
[|•|] Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models (2018)   -   Chua, Kurtland and Calandra, Roberto and McAllister, Rowan and Levine, Sergey   [|•|]
[|•|] Better Exploration with Optimistic Actor-Critic (2019)   -   Ciosek, Kamil and Vuong, Quan and Loftin, Robert and Hofmann, Katja   [|•|]
[|•|] Phasic Policy Gradient (2020)   -   Cobbe, Karl and Hilton, Jacob and Klimov, Oleg and Schulman, John   [|•|]
[|•|] Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents (2017)   -   Conti, Edoardo and Madhavan, Vashisht and Such, Felipe Petroski and Lehman, Joel and Stanley, Kenneth O. and Clune, Jeff   [|•|]
[|•|] Hierarchical Behavioral Repertoires with Unsupervised Descriptors (2018)   -   Cully, Antoine and Demiris, Yiannis   [|•|]
[|•|] Quality and Diversity Optimization: A Unifying Modular Framework (2017)   -   Cully, Antoine and Demiris, Yiannis   [|•|]
[|•|] Implicit Quantile Networks for Distributional Reinforcement Learning (2018)   -   Dabney, Will and Ostrovski, Georg and Silver, David and Munos, Rémi   [|•|]
[|•|] Distributional Reinforcement Learning with Quantile Regression (2017)   -   Dabney, Will and Rowland, Mark and Bellemare, Marc G. and Munos, Rémi   [|•|]
[|•|] Primal Wasserstein Imitation Learning (2020)   -   Dadashi, Robert and Hussenot, Léonard and Geist, Matthieu and Pietquin, Olivier   [|•|]
[|•|] Continuous Control with Action Quantization from Demonstrations (2021)   -   Dadashi, Robert and Hussenot, Léonard and Vincent, Damien and Girgin, Sertan and Raichuk, Anton and Geist, Matthieu and Pietquin, Olivier   [|•|]
[|•|] Deep Gaussian Processes (2012)   -   Damianou, Andreas C. and Lawrence, Neil D.   [|•|]
[|•|] Natural Neural Networks (2015)   -   Desjardins, Guillaume and Simonyan, Karen and Pascanu, Razvan and Kavukcuoglu, Koray   [|•|]
[|•|] Sharp Minima Can Generalize For Deep Nets (2017)   -   Dinh, Laurent and Pascanu, Razvan and Bengio, Samy and Bengio, Yoshua   [|•|]
[|•|] Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning (2019)   -   Doan, Thang and Mazoure, Bogdan and Abdar, Moloud and Durand, Audrey and Pineau, Joelle and Hjelm, R. Devon   [|•|]
[|•|] GAN Q-learning (2018)   -   Doan, Thang and Mazoure, Bogdan and Lyle, Clare   [|•|]
[|•|] Tutorial on Variational Autoencoders (2016)   -   Doersch, Carl   [|•|]
[|•|] Adapting Auxiliary Losses Using Gradient Similarity (2018)   -   Du, Yunshu and Czarnecki, Wojciech M. and Jayakumar, Siddhant M. and Farajtabar, Mehrdad and Pascanu, Razvan and Lakshminarayanan, Balaji   [|•|]
[|•|] Online Trajectory Planning Through Combined Trajectory Optimization and Function Approximation: Application to the Exoskeleton Atalante (2019)   -   Duburcq, Alexis and Chevaleyre, Yann and Bredeche, Nicolas and Boéris, Guilhem   [|•|]
[|•|] Reactive Stepping for Humanoid Robots using Reinforcement Learning: Application to Standing Push Recovery on the Exoskeleton Atalante (2022)   -   Duburcq, Alexis and Schramm, Fabian and Boéris, Guilhem and Bredeche, Nicolas and Chevaleyre, Yann   [|•|]
[|•|] First return, then explore (2020)   -   Ecoffet, Adrien and Huizinga, Joost and Lehman, Joel and Stanley, Kenneth O. and Clune, Jeff   [|•|]
[|•|] Go-Explore: a New Approach for Hard-Exploration Problems (2019)   -   Ecoffet, Adrien and Huizinga, Joost and Lehman, Joel and Stanley, Kenneth O. and Clune, Jeff   [|•|]
[|•|] RvS: What is Essential for Offline RL via Supervised Learning? (2021)   -   Emmons, Scott and Eysenbach, Benjamin and Kostrikov, Ilya and Levine, Sergey   [|•|]
[|•|] Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO (2020)   -   Engstrom, Logan and Ilyas, Andrew and Santurkar, Shibani and Tsipras, Dimitris and Janoos, Firdaus and Rudolph, Larry and Madry, Aleksander   [|•|]
[|•|] IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures (2018)   -   Espeholt, Lasse and Soyer, Hubert and Munos, Remi and Simonyan, Karen and Mnih, Volodymir and Ward, Tom and Doron, Yotam and Firoiu, Vlad and Harley, Tim and Dunning, Iain and Legg, Shane and Kavukcuoglu, Koray   [|•|]
[|•|] Diversity is All You Need: Learning Skills without a Reward Function (2018)   -   Eysenbach, Benjamin and Gupta, Abhishek and Ibarz, Julian and Levine, Sergey   [|•|]
[|•|] Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification (2021)   -   Eysenbach, Benjamin and Levine, Sergey and Salakhutdinov, Ruslan   [|•|]
[|•|] Maximum Entropy RL (Provably) Solves Some Robust RL Problems (2021)   -   Eysenbach, Benjamin and Levine, Sergey   [|•|]
[|•|] Imitating Past Successes can be Very Suboptimal (2022)   -   Eysenbach, Benjamin and Udatha, Soumith and Levine, Sergey and Salakhutdinov, Ruslan   [|•|]
[|•|] Contrastive Learning as Goal-Conditioned Reinforcement Learning (2022)   -   Eysenbach, Benjamin and Zhang, Tianjun and Salakhutdinov, Ruslan and Levine, Sergey   [|•|]
[|•|] Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization (2016)   -   Finn, Chelsea and Levine, Sergey and Abbeel, Pieter   [|•|]
[|•|] Deep Spatial Autoencoders for Visuomotor Learning (2015)   -   Finn, Chelsea and Tan, Xin Yu and Duan, Yan and Darrell, Trevor and Levine, Sergey and Abbeel, Pieter   [|•|]
[|•|] Bootstrapped Meta-Learning (2021)   -   Flennerhag, Sebastian and Schroecker, Yannick and Zahavy, Tom and Hasselt, Hado van and Silver, David and Singh, Satinder   [|•|]
[|•|] Automatic Goal Generation for Reinforcement Learning Agents (2017)   -   Florensa, Carlos and Held, David and Geng, Xinyang and Abbeel, Pieter   [|•|]
[|•|] Reverse Curriculum Generation for Reinforcement Learning (2017)   -   Florensa, Carlos and Held, David and Wulfmeier, Markus and Zhang, Michael and Abbeel, Pieter   [|•|]
[|•|] Differentiable Quality Diversity (2021)   -   Fontaine, Matthew C. and Nikolaidis, Stefanos   [|•|]
[|•|] Noisy Networks for Exploration (2017)   -   Fortunato, Meire and Azar, Mohammad Gheshlaghi and Piot, Bilal and Menick, Jacob and Osband, Ian and Graves, Alex and Mnih, Vlad and Munos, Remi and Hassabis, Demis and Pietquin, Olivier and Blundell, Charles and Legg, Shane   [|•|]
[|•|] An Introduction to Deep Reinforcement Learning (2018)   -   Francois-Lavet, Vincent and Henderson, Peter and Islam, Riashat and Bellemare, Marc G. and Pineau, Joelle   [|•|]
[|•|] Brax – A Differentiable Physics Engine for Large Scale Rigid Body Simulation (2021)   -   Freeman, C. Daniel and Frey, Erik and Raichuk, Anton and Girgin, Sertan and Mordatch, Igor and Bachem, Olivier   [|•|]
[|•|] D4RL: Datasets for Deep Data-Driven Reinforcement Learning (2020)   -   Fu, Justin and Kumar, Aviral and Nachum, Ofir and Tucker, George and Levine, Sergey   [|•|]
[|•|] Diagnosing Bottlenecks in Deep Q-learning Algorithms (2019)   -   Fu, Justin and Kumar, Aviral and Soh, Matthew and Levine, Sergey   [|•|]
[|•|] A Minimalist Approach to Offline Reinforcement Learning (2021)   -   Fujimoto, Scott and Gu, Shixiang Shane   [|•|]
[|•|] Addressing Function Approximation Error in Actor-Critic Methods (2018)   -   Fujimoto, Scott and Hoof, Herke van and Meger, David   [|•|]
[|•|] Off-Policy Deep Reinforcement Learning without Exploration (2018)   -   Fujimoto, Scott and Meger, David and Precup, Doina   [|•|]
[|•|] Policy Optimization by Genetic Distillation (2017)   -   Gangwani, Tanmay and Peng, Jian   [|•|]
[|•|] Hierarchical Skills for Efficient Exploration (2021)   -   Gehring, Jonas and Synnaeve, Gabriel and Krause, Andreas and Usunier, Nicolas   [|•|]
[|•|] A Theory of Regularized Markov Decision Processes (2019)   -   Geist, Matthieu and Scherrer, Bruno and Pietquin, Olivier   [|•|]
[|•|] Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis (2018)   -   George, Thomas and Laurent, César and Bouthillier, Xavier and Ballas, Nicolas and Vincent, Pascal   [|•|]
[|•|] Reinforcement Learning from Passive Data via Latent Intentions (2023)   -   Ghosh, Dibya and Bhateja, Chethan and Levine, Sergey   [|•|]
[|•|] Learning to Reach Goals via Iterated Supervised Learning (2019)   -   Ghosh, Dibya and Gupta, Abhishek and Reddy, Ashwin and Fu, Justin and Devin, Coline and Eysenbach, Benjamin and Levine, Sergey   [|•|]
[|•|] Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective (2022)   -   Ghugare, Raj and Bharadhwaj, Homanga and Eysenbach, Benjamin and Levine, Sergey and Salakhutdinov, Ruslan   [|•|]
[|•|] Recall Traces: Backtracking Models for Efficient Reinforcement Learning (2018)   -   Goyal, Anirudh and Brakel, Philemon and Fedus, William and Singhal, Soumye and Lillicrap, Timothy and Levine, Sergey and Larochelle, Hugo and Bengio, Yoshua   [|•|]
[|•|] Feedback MPC for Torque-Controlled Legged Robots (2019)   -   Grandia, Ruben and Farshidian, Farbod and Ranftl, René and Hutter, Marco   [|•|]
[|•|] Variational Intrinsic Control (2016)   -   Gregor, Karol and Rezende, Danilo Jimenez and Wierstra, Daan   [|•|]
[|•|] Correlation and variable importance in random forests (2013)   -   Gregorutti, Baptiste and Michel, Bertrand and Saint-Pierre, Philippe   [|•|]
[|•|] Hamiltonian Neural Networks (2019)   -   Greydanus, Sam and Dzamba, Misko and Yosinski, Jason   [|•|]
[|•|] Provably Faster Gradient Descent via Long Steps (2023)   -   Grimmer, Benjamin   [|•|]
[|•|] A Review of Safe Reinforcement Learning: Methods, Theory and Applications (2022)   -   Gu, Shangding and Yang, Long and Du, Yali and Chen, Guang and Walter, Florian and Wang, Jun and Yang, Yaodong and Knoll, Alois   [|•|]
[|•|] Continuous Deep Q-Learning with Model-based Acceleration (2016)   -   Gu, Shixiang and Lillicrap, Timothy and Sutskever, Ilya and Levine, Sergey   [|•|]
[|•|] Neural Predictive Belief Representations (2018)   -   Guo, Zhaohan Daniel and Azar, Mohammad Gheshlaghi and Piot, Bilal and Pires, Bernardo A. and Munos, Rémi   [|•|]
[|•|] BYOL-Explore: Exploration by Bootstrapped Prediction (2022)   -   Guo, Zhaohan Daniel and Thakoor, Shantanu and Pîslar, Miruna and Pires, Bernardo Avila and Altché, Florent and Tallec, Corentin and Saade, Alaa and Calandriello, Daniele and Grill, Jean-Bastien and Tang, Yunhao and Valko, Michal and Munos, Rémi and Azar, Mohammad Gheshlaghi and Piot, Bilal   [|•|]
[|•|] Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning (2017)   -   Gupta, Abhishek and Devin, Coline and Liu, YuXuan and Abbeel, Pieter and Levine, Sergey   [|•|]
[|•|] Demonstration-Bootstrapped Autonomous Practicing via Multi-Task Reinforcement Learning (2022)   -   Gupta, Abhishek and Lynch, Corey and Kinman, Brandon and Peake, Garrett and Levine, Sergey and Hausman, Karol   [|•|]
[|•|] Meta-Reinforcement Learning of Structured Exploration Strategies (2018)   -   Gupta, Abhishek and Mendonca, Russell and Liu, YuXuan and Abbeel, Pieter and Levine, Sergey   [|•|]
[|•|] Towards Variable Assistance for Lower Body Exoskeletons (2019)   -   Gurriet, Thomas and Tucker, Maegan and Duburcq, Alexis and Boeris, Guilhem and Ames, Aaron D.   [|•|]
[|•|] World Models (2018)   -   Ha, David and Schmidhuber, Jürgen   [|•|]
[|•|] Learning to Walk in the Real World with Minimal Human Effort (2020)   -   Ha, Sehoon and Xu, Peng and Tan, Zhenyu and Levine, Sergey and Tan, Jie   [|•|]
[|•|] Latent Space Policies for Hierarchical Reinforcement Learning (2018)   -   Haarnoja, Tuomas and Hartikainen, Kristian and Abbeel, Pieter and Levine, Sergey   [|•|]
[|•|] Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning (2023)   -   Haarnoja, Tuomas and Moran, Ben and Lever, Guy and Huang, Sandy H. and Tirumala, Dhruva and Humplik, Jan and Wulfmeier, Markus and Tunyasuvunakool, Saran and Siegel, Noah Y. and Hafner, Roland and Bloesch, Michael and Hartikainen, Kristian and Byravan, Arunkumar and Hasenclever, Leonard and Tassa, Yuval and Sadeghi, Fereshteh and Batchelor, Nathan and Casarini, Federico and Saliceti, Stefano and Game, Charles and Sreendra, Neil and Patel, Kushal and Gwira, Marlon and Huber, Andrea and Hurley, Nicole and Nori, Francesco and Hadsell, Raia and Heess, Nicolas   [|•|]
[|•|] Composable Deep Reinforcement Learning for Robotic Manipulation (2018)   -   Haarnoja, Tuomas and Pong, Vitchyr and Zhou, Aurick and Dalal, Murtaza and Abbeel, Pieter and Levine, Sergey   [|•|]
[|•|] Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor (2018)   -   Haarnoja, Tuomas and Zhou, Aurick and Abbeel, Pieter and Levine, Sergey   [|•|]
[|•|] Soft Actor-Critic Algorithms and Applications (2018)   -   Haarnoja, Tuomas and Zhou, Aurick and Hartikainen, Kristian and Tucker, George and Ha, Sehoon and Tan, Jie and Kumar, Vikash and Zhu, Henry and Gupta, Abhishek and Abbeel, Pieter and Levine, Sergey   [|•|]
[|•|] TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow (2017)   -   Hafner, Danijar and Davidson, James and Vanhoucke, Vincent   [|•|]
[|•|] Deep Hierarchical Planning from Pixels (2022)   -   Hafner, Danijar and Lee, Kuang-Huei and Fischer, Ian and Abbeel, Pieter   [|•|]
[|•|] Dream to Control: Learning Behaviors by Latent Imagination (2019)   -   Hafner, Danijar and Lillicrap, Timothy and Ba, Jimmy and Norouzi, Mohammad   [|•|]
[|•|] Mastering Atari with Discrete World Models (2020)   -   Hafner, Danijar and Lillicrap, Timothy and Norouzi, Mohammad and Ba, Jimmy   [|•|]
[|•|] Mastering Diverse Domains through World Models (2023)   -   Hafner, Danijar and Pasukonis, Jurgis and Ba, Jimmy and Lillicrap, Timothy   [|•|]
[|•|] Towards General and Autonomous Learning of Core Skills: A Case Study in Locomotion (2020)   -   Hafner, Roland and Hertweck, Tim and Klöppner, Philipp and Bloesch, Michael and Neunert, Michael and Wulfmeier, Markus and Tunyasuvunakool, Saran and Heess, Nicolas and Riedmiller, Martin   [|•|]
[|•|] Hierarchical Few-Shot Imitation with Skill Transition Models (2021)   -   Hakhamaneshi, Kourosh and Zhao, Ruihan and Zhan, Albert and Abbeel, Pieter and Laskin, Michael   [|•|]
[|•|] On the role of planning in model-based deep reinforcement learning (2020)   -   Hamrick, Jessica B. and Friesen, Abram L. and Behbahani, Feryal and Guez, Arthur and Viola, Fabio and Witherspoon, Sims and Anthony, Thomas and Buesing, Lars and Veličković, Petar and Weber, Théophane   [|•|]
[|•|] Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration (2020)   -   Han, Seungyul and Sung, Youngchul   [|•|]
[|•|] Temporal Difference Learning for Model Predictive Control (2022)   -   Hansen, Nicklas and Wang, Xiaolong and Su, Hao   [|•|]
[|•|] IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies (2023)   -   Hansen-Estruch, Philippe and Kostrikov, Ilya and Janner, Michael and Kuba, Jakub Grudzien and Levine, Sergey   [|•|]
[|•|] Feedback Control of an Exoskeleton for Paraplegics: Toward Robustly Stable Hands-free Dynamic Walking (2018)   -   Harib, Omar and Hereid, Ayonga and Agrawal, Ayush and Gurriet, Thomas and Finet, Sylvain and Boeris, Guilhem and Duburcq, Alexis and Mungai, M. Eva and Masselin, Matthieu and Ames, Aaron D. and Sreenath, Koushil and Grizzle, Jessy   [|•|]
[|•|] Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery (2019)   -   Hartikainen, Kristian and Geng, Xinyang and Haarnoja, Tuomas and Levine, Sergey   [|•|]
[|•|] Deep Reinforcement Learning and the Deadly Triad (2018)   -   Hasselt, Hado van and Doron, Yotam and Strub, Florian and Hessel, Matteo and Sonnerat, Nicolas and Modayil, Joseph   [|•|]
[|•|] Soft Hindsight Experience Replay (2020)   -   He, Qiwei and Zhuang, Liansheng and Li, Houqiang   [|•|]
[|•|] Emergence of Locomotion Behaviours in Rich Environments (2017)   -   Heess, Nicolas and TB, Dhruva and Sriram, Srinivasan and Lemmon, Jay and Merel, Josh and Wayne, Greg and Tassa, Yuval and Erez, Tom and Wang, Ziyu and Eslami, S. M. Ali and Riedmiller, Martin and Silver, David   [|•|]
[|•|] Dropout Q-Functions for Doubly Efficient Reinforcement Learning (2021)   -   Hiraoka, Takuya and Imagawa, Takahisa and Hashimoto, Taisei and Onishi, Takashi and Tsuruoka, Yoshimasa   [|•|]
[|•|] Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets ()   -   Hong, Zhang-Wei and Kumar, Aviral and Karnik, Sathwik and Bhandwaldar, Abhishek and Srivastava, Akash and Pajarinen, Joni and Laroche, Romain and Gupta, Abhishek and Agrawal, Pulkit   [|•|]
[|•|] Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay (2016)   -   Hosu, Ionel-Alexandru and Rebedea, Traian   [|•|]
[|•|] Evolved Policy Gradients (2018)   -   Houthooft, Rein and Chen, Richard Y. and Isola, Phillip and Stadie, Bradly C. and Wolski, Filip and Ho, Jonathan and Abbeel, Pieter   [|•|]
[|•|] Planning Goals for Exploration (2023)   -   Hu, Edward S. and Chang, Richard and Rybkin, Oleh and Jayaraman, Dinesh   [|•|]
[|•|] A2C is a special case of PPO (2022)   -   Huang, Shengyi and Kanervisto, Anssi and Raffin, Antonin and Wang, Weixun and Ontañón, Santiago and Dossa, Rousslan Fernand Julien   [|•|]
[|•|] Dissecting Deep RL with High Update Ratios: Combatting Value Overestimation and Divergence (2024)   -   Hussing, Marcel and Voelcker, Claas and Gilitschenski, Igor and Farahmand, Amir-massoud and Eaton, Eric   [|•|]
[|•|] Broadly-Exploring, Local-Policy Trees for Long-Horizon Task Planning (2020)   -   Ichter, Brian and Sermanet, Pierre and Lynch, Corey   [|•|]
[|•|] A Closer Look at Deep Policy Gradients (2018)   -   Ilyas, Andrew and Engstrom, Logan and Santurkar, Shibani and Tsipras, Dimitris and Janoos, Firdaus and Rudolph, Larry and Madry, Aleksander   [|•|]
[|•|] Improving Regression Performance with Distributional Losses (2018)   -   Imani, Ehsan and White, Martha   [|•|]
[|•|] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015)   -   Ioffe, Sergey and Szegedy, Christian   [|•|]
[|•|] Population Based Training of Neural Networks (2017)   -   Jaderberg, Max and Dalibard, Valentin and Osindero, Simon and Czarnecki, Wojciech M. and Donahue, Jeff and Razavi, Ali and Vinyals, Oriol and Green, Tim and Dunning, Iain and Simonyan, Karen and Fernando, Chrisantha and Kavukcuoglu, Koray   [|•|]
[|•|] Reinforcement Learning with Unsupervised Auxiliary Tasks (2016)   -   Jaderberg, Max and Mnih, Volodymyr and Czarnecki, Wojciech Marian and Schaul, Tom and Leibo, Joel Z. and Silver, David and Kavukcuoglu, Koray   [|•|]
[|•|] Attention is not Explanation (2019)   -   Jain, Sarthak and Wallace, Byron C.   [|•|]
[|•|] Task-Embedded Control Networks for Few-Shot Imitation Learning (2018)   -   James, Stephen and Bloesch, Michael and Davison, Andrew J.   [|•|]
[|•|] 3D Simulation for Robot Arm Control with Deep Q-Learning (2016)   -   James, Stephen and Johns, Edward   [|•|]
[|•|] When to Trust Your Model: Model-Based Policy Optimization (2019)   -   Janner, Michael and Fu, Justin and Zhang, Marvin and Levine, Sergey   [|•|]
[|•|] Offline Reinforcement Learning as One Big Sequence Modeling Problem (2021)   -   Janner, Michael and Li, Qiyang and Levine, Sergey   [|•|]
[|•|] Fast Marching Tree: a Fast Marching Sampling-Based Method for Optimal Motion Planning in Many Dimensions (2013)   -   Janson, Lucas and Schmerling, Edward and Clark, Ashley and Pavone, Marco   [|•|]
[|•|] gradSim: Differentiable simulation for system identification and visuomotor control (2021)   -   Jatavallabhula, Krishna Murthy and Macklin, Miles and Golemo, Florian and Voleti, Vikram and Petrini, Linda and Weiss, Martin and Considine, Breandan and Parent-Levesque, Jerome and Xie, Kevin and Erleben, Kenny and Paull, Liam and Shkurti, Florian and Nowrouzezahrai, Derek and Fidler, Sanja   [|•|]
[|•|] Benchmarking Potential Based Rewards for Learning Humanoid Locomotion (2023)   -   Jeon, Se Hwan and Heim, Steve and Khazoom, Charles and Kim, Sangbae   [|•|]
[|•|] Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic (2023)   -   Ji, Tianying and Luo, Yu and Sun, Fuchun and Zhan, Xianyuan and Zhang, Jianwei and Xu, Huazhe   [|•|]
[|•|] Population-Guided Parallel Policy Search for Reinforcement Learning (2020)   -   Jung, Whiyoung and Park, Giseung and Sung, Youngchul   [|•|]
[|•|] Uncertainty-Aware Reinforcement Learning for Collision Avoidance (2017)   -   Kahn, Gregory and Villaflor, Adam and Pong, Vitchyr and Abbeel, Pieter and Levine, Sergey   [|•|]
[|•|] QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation (2018)   -   Kalashnikov, Dmitry and Irpan, Alex and Pastor, Peter and Ibarz, Julian and Herzog, Alexander and Jang, Eric and Quillen, Deirdre and Holly, Ethan and Kalakrishnan, Mrinal and Vanhoucke, Vincent and Levine, Sergey   [|•|]
[|•|] Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching (2021)   -   Kamienny, Pierre-Alexandre and Tarbouriech, Jean and Lamprier, Sylvain and Lazaric, Alessandro and Denoyer, Ludovic   [|•|]
[|•|] Recent Advances in Path Integral Control for Trajectory Optimization: An Overview in Theoretical and Algorithmic Perspectives (2023)   -   Kazim, Muhammad and Hong, JunGee and Kim, Min-Gyeom and Kim, Kwang-Ki K.   [|•|]
[|•|] Learning Stable Normalizing-Flow Control for Robotic Manipulation (2020)   -   Khader, Shahbaz Abdul and Yin, Hang and Falco, Pietro and Kragic, Danica   [|•|]
[|•|] Supervised Contrastive Learning (2020)   -   Khosla, Prannay and Teterwak, Piotr and Wang, Chen and Sarna, Aaron and Tian, Yonglong and Isola, Phillip and Maschinot, Aaron and Liu, Ce and Krishnan, Dilip   [|•|]
[|•|] Improving Variational Inference with Inverse Autoregressive Flow (2016)   -   Kingma, Diederik P. and Salimans, Tim and Jozefowicz, Rafal and Chen, Xi and Sutskever, Ilya and Welling, Max   [|•|]
[|•|] An Introduction to Variational Autoencoders (2019)   -   Kingma, Diederik P. and Welling, Max   [|•|]
[|•|] Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning (2019)   -   Koller, Torsten and Berkenkamp, Felix and Turchetta, Matteo and Boedecker, Joschka and Krause, Andreas   [|•|]
[|•|] Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning (2018)   -   Kostrikov, Ilya and Agrawal, Kumar Krishna and Dwibedi, Debidatta and Levine, Sergey and Tompson, Jonathan   [|•|]
[|•|] Offline Reinforcement Learning with Implicit Q-Learning (2021)   -   Kostrikov, Ilya and Nair, Ashvin and Levine, Sergey   [|•|]
[|•|] Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels (2020)   -   Kostrikov, Ilya and Yarats, Denis and Fergus, Rob   [|•|]
[|•|] An Efficiently Solvable Quadratic Program for Stabilizing Dynamic Locomotion (2013)   -   Kuindersma, Scott and Permenter, Frank and Tedrake, Russ   [|•|]
[|•|] Deep Successor Reinforcement Learning (2016)   -   Kulkarni, Tejas D. and Saeedi, Ardavan and Gautam, Simanta and Gershman, Samuel J.   [|•|]
[|•|] RMA: Rapid Motor Adaptation for Legged Robots (2021)   -   Kumar, Ashish and Fu, Zipeng and Pathak, Deepak and Malik, Jitendra   [|•|]
[|•|] Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction (2019)   -   Kumar, Aviral and Fu, Justin and Tucker, George and Levine, Sergey   [|•|]
[|•|] DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction (2020)   -   Kumar, Aviral and Gupta, Abhishek and Levine, Sergey   [|•|]
[|•|] Expanding Motor Skills through Relay Neural Networks (2017)   -   Kumar, Visak C. V. and Ha, Sehoon and Liu, C. Karen   [|•|]
[|•|] Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics (2020)   -   Kuznetsov, Arsenii and Shvechikov, Pavel and Grishin, Alexander and Vetrov, Dmitry   [|•|]
[|•|] Exploration in Deep Reinforcement Learning: A Survey (2022)   -   Ladosz, Pawel and Weng, Lilian and Kim, Minwoo and Oh, Hyondong   [|•|]
[|•|] Professor Forcing: A New Algorithm for Training Recurrent Networks (2016)   -   Lamb, Alex and Goyal, Anirudh and Zhang, Ying and Zhang, Saizheng and Courville, Aaron and Bengio, Yoshua   [|•|]
[|•|] CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery (2022)   -   Laskin, Michael and Liu, Hao and Peng, Xue Bin and Yarats, Denis and Rajeswaran, Aravind and Abbeel, Pieter   [|•|]
[|•|] Optimal Control via Combined Inference and Numerical Optimization (2021)   -   Layeghi, Daniel and Tonneau, Steve and Mistry, Michael   [|•|]
[|•|] Contrastive Representation Learning: A Framework and Review (2020)   -   Le-Khac, Phuc H. and Healy, Graham and Smeaton, Alan F.   [|•|]
[|•|] Robust Recovery Controller for a Quadrupedal Robot using Deep Reinforcement Learning (2019)   -   Lee, Joonho and Hwangbo, Jemin and Hutter, Marco   [|•|]
[|•|] Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization (2021)   -   Lee, Youngwoon and Lim, Joseph J. and Anandkumar, Anima and Zhu, Yuke   [|•|]
[|•|] Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients (2017)   -   Lehman, Joel and Chen, Jay and Clune, Jeff and Stanley, Kenneth O.   [|•|]
[|•|] State Representation Learning for Control: An Overview (2018)   -   Lesort, Timothée and Díaz-Rodríguez, Natalia and Goudou, Jean-François and Filliat, David   [|•|]
[|•|] End-to-End Training of Deep Visuomotor Policies (2015)   -   Levine, Sergey and Finn, Chelsea and Darrell, Trevor and Abbeel, Pieter   [|•|]
[|•|] Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection (2016)   -   Levine, Sergey and Pastor, Peter and Krizhevsky, Alex and Quillen, Deirdre   [|•|]
[|•|] Learning Multi-Level Hierarchies with Hindsight (2017)   -   Levy, Andrew and Konidaris, George and Platt, Robert and Saenko, Kate   [|•|]
[|•|] Visualizing the Loss Landscape of Neural Nets (2017)   -   Li, Hao and Xu, Zheng and Taylor, Gavin and Studer, Christoph and Goldstein, Tom   [|•|]
[|•|] Hierarchical Planning Through Goal-Conditioned Offline Reinforcement Learning (2022)   -   Li, Jinning and Tang, Chen and Tomizuka, Masayoshi and Zhan, Wei   [|•|]
[|•|] Understanding the Complexity Gains of Single-Task RL with a Curriculum (2022)   -   Li, Qiyang and Zhai, Yuexiang and Ma, Yi and Levine, Sergey   [|•|]
[|•|] Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning (2019)   -   Li, Richard and Jabri, Allan and Darrell, Trevor and Agrawal, Pulkit   [|•|]
[|•|] Active Hierarchical Exploration with Stable Subgoal Representation Learning (2021)   -   Li, Siyuan and Zhang, Jin and Wang, Jianhao and Yu, Yang and Zhang, Chongjie   [|•|]
[|•|] Solving Compositional Reinforcement Learning Problems via Task Reduction (2021)   -   Li, Yunfei and Wu, Yilin and Xu, Huazhe and Wang, Xiaolong and Wu, Yi   [|•|]
[|•|] Robust and Versatile Bipedal Jumping Control through Reinforcement Learning (2023)   -   Li, Zhongyu and Peng, Xue Bin and Abbeel, Pieter and Levine, Sergey and Berseth, Glen and Sreenath, Koushil   [|•|]
[|•|] Continuous control with deep reinforcement learning (2015)   -   Lillicrap, Timothy P. and Hunt, Jonathan J. and Pritzel, Alexander and Heess, Nicolas and Erez, Tom and Tassa, Yuval and Silver, David and Wierstra, Daan   [|•|]
[|•|] Dynamics-Aware Quality-Diversity for Efficient Learning of Skill Repertoires (2021)   -   Lim, Bryan and Grillotti, Luca and Bernasconi, Lorenzo and Cully, Antoine   [|•|]
[|•|] Learning Null Space Projections in Operational Space Formulation (2016)   -   Lin, Hsiu-Chin and Howard, Matthew   [|•|]
[|•|] From Motor Control to Team Play in Simulated Humanoid Football (2021)   -   Liu, Siqi and Lever, Guy and Wang, Zhe and Merel, Josh and Eslami, S. M. Ali and Hennes, Daniel and Czarnecki, Wojciech M. and Tassa, Yuval and Omidshafiei, Shayegan and Abdolmaleki, Abbas and Siegel, Noah Y. and Hasenclever, Leonard and Marris, Luke and Tunyasuvunakool, Saran and Song, H. Francis and Wulfmeier, Markus and Muller, Paul and Haarnoja, Tuomas and Tracey, Brendan D. and Tuyls, Karl and Graepel, Thore and Heess, Nicolas   [|•|]
[|•|] Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations (2018)   -   Locatello, Francesco and Bauer, Stefan and Lucic, Mario and Rätsch, Gunnar and Gelly, Sylvain and Schölkopf, Bernhard and Bachem, Olivier   [|•|]
[|•|] Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control (2018)   -   Lowrey, Kendall and Rajeswaran, Aravind and Kakade, Sham and Todorov, Emanuel and Mordatch, Igor   [|•|]
[|•|] Reset-Free Lifelong Learning with Skill-Space Planning (2020)   -   Lu, Kevin and Grover, Aditya and Abbeel, Pieter and Mordatch, Igor   [|•|]
[|•|] A Unified Approach to Interpreting Model Predictions (2017)   -   Lundberg, Scott and Lee, Su-In   [|•|]
[|•|] Self-Imitation Learning by Planning (2021)   -   Luo, Sha and Kasaei, Hamidreza and Schomaker, Lambert   [|•|]
[|•|] Dynamics-Regulated Kinematic Policy for Egocentric Pose Estimation (2021)   -   Luo, Zhengyi and Hachiuma, Ryo and Yuan, Ye and Kitani, Kris   [|•|]
[|•|] Embodied Scene-aware Human Pose Estimation (2022)   -   Luo, Zhengyi and Iwase, Shun and Yuan, Ye and Kitani, Kris   [|•|]
[|•|] From Universal Humanoid Control to Automatic Physically Valid Character Creation (2022)   -   Luo, Zhengyi and Yuan, Ye and Kitani, Kris M.   [|•|]
[|•|] Learning Dynamics and Generalization in Reinforcement Learning (2022)   -   Lyle, Clare and Rowland, Mark and Dabney, Will and Kwiatkowska, Marta and Gal, Yarin   [|•|]
[|•|] Learning Latent Plans from Play (2019)   -   Lynch, Corey and Khansari, Mohi and Xiao, Ted and Kumar, Vikash and Tompson, Jonathan and Levine, Sergey and Sermanet, Pierre   [|•|]
[|•|] Learn Zero-Constraint-Violation Policy in Model-Free Constrained Reinforcement Learning (2021)   -   Ma, Haitong and Liu, Changliu and Li, Shengbo Eben and Zheng, Sifa and Sun, Wenchao and Chen, Jianyu   [|•|]
[|•|] Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics (2017)   -   Mahler, Jeffrey and Liang, Jacky and Niyaz, Sherdil and Laskey, Michael and Doan, Richard and Liu, Xinyu and Ojea, Juan Aparicio and Goldberg, Ken   [|•|]
[|•|] Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning (2021)   -   Makoviychuk, Viktor and Wawrzyniak, Lukasz and Guo, Yunrong and Lu, Michelle and Storey, Kier and Macklin, Miles and Hoeller, David and Rudin, Nikita and Allshire, Arthur and Handa, Ankur and State, Gavriel   [|•|]
[|•|] Hamilton-Jacobi formulation for reach-avoid differential games (2009)   -   Margellos, Kostas and Lygeros, John   [|•|]
[|•|] New insights and perspectives on the natural gradient method (2014)   -   Martens, James   [|•|]
[|•|] Speed learning on the fly (2015)   -   Massé, Pierre-Yves and Ollivier, Yann   [|•|]
[|•|] PBCS : Efficient Exploration and Exploitation Using a Synergy between Reinforcement Learning and Motion Planning (2020)   -   Matheron, Guillaume and Perrin, Nicolas and Sigaud, Olivier   [|•|]
[|•|] The problem with DDPG: understanding failures in deterministic environments with sparse rewards (2019)   -   Matheron, Guillaume and Perrin, Nicolas and Sigaud, Olivier   [|•|]
[|•|] Leveraging exploration in off-policy algorithms via normalizing flows (2019)   -   Mazoure, Bogdan and Doan, Thang and Durand, Audrey and Hjelm, R. Devon and Pineau, Joelle   [|•|]
[|•|] State Representation Learning from Demonstration (2019)   -   Merckling, Astrid and Coninx, Alexandre and Cressot, Loic and Doncieux, Stéphane and Perrin-Gilbert, Nicolas   [|•|]
[|•|] Modified Actor-Critics (2019)   -   Merdivan, Erinc and Hanke, Sten and Geist, Matthieu   [|•|]
[|•|] Neural probabilistic motor primitives for humanoid control (2018)   -   Merel, Josh and Hasenclever, Leonard and Galashov, Alexandre and Ahuja, Arun and Pham, Vu and Wayne, Greg and Teh, Yee Whye and Heess, Nicolas   [|•|]
[|•|] Catch & Carry: Reusable Neural Controllers for Vision-Guided Whole-Body Tasks (2019)   -   Merel, Josh and Tunyasuvunakool, Saran and Ahuja, Arun and Tassa, Yuval and Hasenclever, Leonard and Pham, Vu and Erez, Tom and Wayne, Greg and Heess, Nicolas   [|•|]
[|•|] Discrete Sequential Prediction of Continuous Actions for Deep RL (2017)   -   Metz, Luke and Ibarz, Julian and Jaitly, Navdeep and Davidson, James   [|•|]
[|•|] Transformers are Sample-Efficient World Models (2022)   -   Micheli, Vincent and Alonso, Eloi and Fleuret, François   [|•|]
[|•|] Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt (2022)   -   Mindermann, Sören and Brauner, Jan and Razzak, Muhammed and Sharma, Mrinank and Kirsch, Andreas and Xu, Winnie and Höltgen, Benedikt and Gomez, Aidan N. and Morisot, Adrien and Farquhar, Sebastian and Gal, Yarin   [|•|]
[|•|] Longitudinal high-throughput TCR repertoire profiling reveals the dynamics of T cell memory formation after mild COVID-19 infection (2020)   -   Minervina, Anastasia A. and Komech, Ekaterina A. and Titov, Aleksei and Koraichi, Meriem Bensouda and Rosati, Elisa and Mamedov, Ilgar Z. and Franke, Andre and Efimov, Grigory A. and Chudakov, Dmitriy M. and Mora, Thierry and Walczak, Aleksandra M. and Lebedev, Yuri B. and Pogorelyy, Mikhail V.   [|•|]
[|•|] A geometrical introduction to screw theory (2012)   -   Minguzzi, E.   [|•|]
[|•|] Computational Geometry Column 42 (2001)   -   Mitchell, Joseph S. B. and O’Rourke, Joseph   [|•|]
[|•|] Asynchronous Methods for Deep Reinforcement Learning (2016)   -   Mnih, Volodymyr and Badia, Adrià Puigdomènech and Mirza, Mehdi and Graves, Alex and Lillicrap, Timothy P. and Harley, Tim and Silver, David and Kavukcuoglu, Koray   [|•|]
[|•|] Reinforcement Learning with Probabilistically Complete Exploration (2020)   -   Morere, Philippe and Francis, Gilad and Blau, Tom and Ramos, Fabio   [|•|]
[|•|] Convolutional neural network models for cancer type prediction based on gene expression (2019)   -   Mostavi, Milad and Chiu, Yu-Chiao and Huang, Yufei and Chen, Yidong   [|•|]
[|•|] Illuminating search spaces by mapping elites (2015)   -   Mouret, Jean-Baptiste and Clune, Jeff   [|•|]
[|•|] Regularizing Action Policies for Smooth Control with Reinforcement Learning (2020)   -   Mysore, Siddharth and Mabsout, Bassel and Mancuso, Renato and Saenko, Kate   [|•|]
[|•|] Near-Optimal Representation Learning for Hierarchical Reinforcement Learning (2018)   -   Nachum, Ofir and Gu, Shixiang and Lee, Honglak and Levine, Sergey   [|•|]
[|•|] Smoothed Action Value Functions for Learning Gaussian Policies (2018)   -   Nachum, Ofir and Norouzi, Mohammad and Tucker, George and Schuurmans, Dale   [|•|]
[|•|] Trust-PCL: An Off-Policy Trust Region Method for Continuous Control (2017)   -   Nachum, Ofir and Norouzi, Mohammad and Xu, Kelvin and Schuurmans, Dale   [|•|]
[|•|] Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning? (2019)   -   Nachum, Ofir and Tang, Haoran and Lu, Xingyu and Gu, Shixiang and Lee, Honglak and Levine, Sergey   [|•|]
[|•|] AWAC: Accelerating Online Reinforcement Learning with Offline Datasets (2020)   -   Nair, Ashvin and Gupta, Abhishek and Dalal, Murtaza and Levine, Sergey   [|•|]
[|•|] Overcoming Exploration in Reinforcement Learning with Demonstrations (2017)   -   Nair, Ashvin and McGrew, Bob and Andrychowicz, Marcin and Zaremba, Wojciech and Abbeel, Pieter   [|•|]
[|•|] Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning (2023)   -   Nakamoto, Mitsuhiko and Zhai, Yuexiang and Singh, Anikait and Mark, Max Sobol and Ma, Yi and Finn, Chelsea and Kumar, Aviral and Levine, Sergey   [|•|]
[|•|] Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics (2020)   -   Neunert, Michael and Abdolmaleki, Abbas and Wulfmeier, Markus and Lampe, Thomas and Springenberg, Jost Tobias and Hafner, Roland and Romano, Francesco and Buchli, Jonas and Heess, Nicolas and Riedmiller, Martin   [|•|]
[|•|] Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images (2014)   -   Nguyen, Anh and Yosinski, Jason and Clune, Jeff   [|•|]
[|•|] When Does Stochastic Gradient Algorithm Work Well? (2018)   -   Nguyen, Lam M. and Nguyen, Nam H. and Phan, Dzung T. and Kalagnanam, Jayant R. and Scheinberg, Katya   [|•|]
[|•|] On First-Order Meta-Learning Algorithms (2018)   -   Nichol, Alex and Achiam, Joshua and Schulman, John   [|•|]
[|•|] The Primacy Bias in Deep Reinforcement Learning (2022)   -   Nikishin, Evgenii and Schwarzer, Max and D’Oro, Pierluca and Bacon, Pierre-Luc and Courville, Aaron   [|•|]
[|•|] Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles (2011)   -   Ollivier, Yann and Arnold, Ludovic and Auger, Anne and Hansen, Nikolaus   [|•|]
[|•|] Training recurrent networks online without backtracking (2015)   -   Ollivier, Yann and Tallec, Corentin and Charpiat, Guillaume   [|•|]
[|•|] Riemannian metrics for neural networks II: recurrent networks and learning symbolic data sequences (2013)   -   Ollivier, Yann   [|•|]
[|•|] True Asymptotic Natural Gradient Optimization (2017)   -   Ollivier, Yann   [|•|]
[|•|] Representation Learning with Contrastive Predictive Coding (2018)   -   Oord, Aaron van den and Li, Yazhe and Vinyals, Oriol   [|•|]
[|•|] Dota 2 with Large Scale Deep Reinforcement Learning (2019)   -   OpenAI and Berner, Christopher and Brockman, Greg and Chan, Brooke and Cheung, Vicki and Dębiak, Przemysław and Dennison, Christy and Farhi, David and Fischer, Quirin and Hashme, Shariq and Hesse, Chris and Józefowicz, Rafal and Gray, Scott and Olsson, Catherine and Pachocki, Jakub and Petrov, Michael and Pinto, Henrique P. d. O. and Raiman, Jonathan and Salimans, Tim and Schlatter, Jeremy and Schneider, Jonas and Sidor, Szymon and Sutskever, Ilya and Tang, Jie and Wolski, Filip and Zhang, Susan   [|•|]
[|•|] Deep Exploration via Bootstrapped DQN (2016)   -   Osband, Ian and Blundell, Charles and Pritzel, Alexander and Roy, Benjamin Van   [|•|]
[|•|] Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? (2020)   -   Ota, Kei and Oiki, Tomoaki and Jha, Devesh K. and Mariyama, Toshisada and Nikovski, Daniel   [|•|]
[|•|] Vector Quantized Models for Planning (2021)   -   Ozair, Sherjil and Li, Yazhe and Razavi, Ali and Antonoglou, Ioannis and Oord, Aäron van den and Vinyals, Oriol   [|•|]
[|•|] Making Efficient Use of Demonstrations to Solve Hard Exploration Problems (2019)   -   Paine, Tom Le and Gulcehre, Caglar and Shahriari, Bobak and Denil, Misha and Hoffman, Matt and Soyer, Hubert and Tanburn, Richard and Kapturowski, Steven and Rabinowitz, Neil and Williams, Duncan and Barth-Maron, Gabriel and Wang, Ziyu and Freitas, Nando de and Team, Worlds   [|•|]
[|•|] TD-Regularized Actor-Critic Methods (2018)   -   Parisi, Simone and Tangkaratt, Voot and Peters, Jan and Khan, Mohammad Emtiyaz   [|•|]
[|•|] Lipschitz-constrained Unsupervised Skill Discovery (2022)   -   Park, Seohong and Choi, Jongwook and Kim, Jaekyeom and Lee, Honglak and Kim, Gunhee   [|•|]
[|•|] Controllability-Aware Unsupervised Skill Discovery (2023)   -   Park, Seohong and Lee, Kimin and Lee, Youngwoon and Abbeel, Pieter   [|•|]
[|•|] Predictable MDP Abstraction for Unsupervised Model-Based RL (2023)   -   Park, Seohong and Levine, Sergey   [|•|]
[|•|] Effective Diversity in Population Based Reinforcement Learning (2020)   -   Parker-Holder, Jack and Pacchiano, Aldo and Choromanski, Krzysztof and Roberts, Stephen   [|•|]
[|•|] Revisiting Natural Gradient for Deep Networks (2013)   -   Pascanu, Razvan and Bengio, Yoshua   [|•|]
[|•|] Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates (2019)   -   Penedones, Hugo and Riquelme, Carlos and Vincent, Damien and Maennel, Hartmut and Mann, Timothy and Barreto, Andre and Gelly, Sylvain and Neu, Gergely   [|•|]
[|•|] MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies (2019)   -   Peng, Xue Bin and Chang, Michael and Zhang, Grace and Abbeel, Pieter and Levine, Sergey   [|•|]
[|•|] Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning (2019)   -   Peng, Xue Bin and Kumar, Aviral and Zhang, Grace and Levine, Sergey   [|•|]
[|•|] AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control (2021)   -   Peng, Xue Bin and Ma, Ze and Abbeel, Pieter and Levine, Sergey and Kanazawa, Angjoo   [|•|]
[|•|] Learning Locomotion Skills Using DeepRL: Does the Choice of Action Space Matter? (2016)   -   Peng, Xue Bin and Panne, Michiel van de   [|•|]
[|•|] Non-local Policy Optimization via Diversity-regularized Collaborative Exploration (2020)   -   Peng, Zhenghao and Sun, Hao and Zhou, Bolei   [|•|]
[|•|] Accelerating Reinforcement Learning with Learned Skill Priors (2020)   -   Pertsch, Karl and Lee, Youngwoon and Lim, Joseph J.   [|•|]
[|•|] Demonstration-Guided Reinforcement Learning with Learned Skills (2021)   -   Pertsch, Karl and Lee, Youngwoon and Wu, Yue and Lim, Joseph J.   [|•|]
[|•|] TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis (2023)   -   Petrovich, Mathis and Black, Michael J. and Varol, Gül   [|•|]
[|•|] Computational Optimal Transport (2018)   -   Peyré, Gabriel and Cuturi, Marco   [|•|]
[|•|] Learning Compositional Neural Programs with Recursive Tree Search and Planning (2019)   -   Pierrot, Thomas and Ligner, Guillaume and Reed, Scott and Sigaud, Olivier and Perrin, Nicolas and Laterre, Alexandre and Kas, David and Beguir, Karim and Freitas, Nando de   [|•|]
[|•|] Diversity Policy Gradient for Sample Efficient Quality-Diversity Optimization (2020)   -   Pierrot, Thomas and Macé, Valentin and Chalumeau, Félix and Flajolet, Arthur and Cideron, Geoffrey and Beguir, Karim and Cully, Antoine and Sigaud, Olivier and Perrin-Gilbert, Nicolas   [|•|]
[|•|] Learning Compositional Neural Programs for Continuous Control (2020)   -   Pierrot, Thomas and Perrin, Nicolas and Behbahani, Feryal and Laterre, Alexandre and Sigaud, Olivier and Beguir, Karim and Freitas, Nando de   [|•|]
[|•|] First-order and second-order variants of the gradient descent in a unified framework (2018)   -   Pierrot, Thomas and Perrin, Nicolas and Sigaud, Olivier   [|•|]
[|•|] Multi-Objective Quality Diversity Optimization (2022)   -   Pierrot, Thomas and Richard, Guillaume and Beguir, Karim and Cully, Antoine   [|•|]
[|•|] Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning (2020)   -   Pitis, Silviu and Chan, Harris and Zhao, Stephen and Stadie, Bradly and Ba, Jimmy   [|•|]
[|•|] Skew-Fit: State-Covering Self-Supervised Reinforcement Learning (2019)   -   Pong, Vitchyr H. and Dalal, Murtaza and Lin, Steven and Nair, Ashvin and Bahl, Shikhar and Levine, Sergey   [|•|]
[|•|] Importance mixing: Improving sample reuse in evolutionary policy search methods (2018)   -   Pourchot, Aloïs and Perrin, Nicolas and Sigaud, Olivier   [|•|]
[|•|] Learning to Solve NP-Complete Problems - A Graph Neural Network for Decision TSP (2018)   -   Prates, Marcelo O. R. and Avelar, Pedro H. C. and Lemos, Henrique and Lamb, Luis and Vardi, Moshe   [|•|]
[|•|] A Survey on Offline Reinforcement Learning: Taxonomy, Review, and Open Problems (2022)   -   Prudencio, Rafael Figueiredo and Maximo, Marcos R. O. A. and Colombini, Esther Luna   [|•|]
[|•|] Estimating Training Data Influence by Tracing Gradient Descent (2020)   -   Pruthi, Garima and Liu, Frederick and Sundararajan, Mukund and Kale, Satyen   [|•|]
[|•|] Information geometry for multiparameter models: New perspectives on the origin of simplicity (2021)   -   Quinn, Katherine N. and Abbott, Michael C. and Transtrum, Mark K. and Machta, Benjamin B. and Sethna, James P.   [|•|]
[|•|] Automated curricula through setter-solver interactions (2019)   -   Racaniere, Sebastien and Lampinen, Andrew K. and Santoro, Adam and Reichert, David P. and Firoiu, Vlad and Lillicrap, Timothy P.   [|•|]
[|•|] Real-World Humanoid Locomotion with Reinforcement Learning (2023)   -   Radosavovic, Ilija and Xiao, Tete and Zhang, Bike and Darrell, Trevor and Malik, Jitendra and Sreenath, Koushil   [|•|]
[|•|] Smooth Exploration for Robotic Reinforcement Learning (2020)   -   Raffin, Antonin and Kober, Jens and Stulp, Freek   [|•|]
[|•|] Decoupling Value and Policy for Generalization in Reinforcement Learning (2021)   -   Raileanu, Roberta and Fergus, Rob   [|•|]
[|•|] Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations (2017)   -   Rajeswaran, Aravind and Kumar, Vikash and Gupta, Abhishek and Vezzani, Giulia and Schulman, John and Todorov, Emanuel and Levine, Sergey   [|•|]
[|•|] Towards Generalization and Simplicity in Continuous Control (2017)   -   Rajeswaran, Aravind and Lowrey, Kendall and Todorov, Emanuel and Kakade, Sham   [|•|]
[|•|] Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables (2019)   -   Rakelly, Kate and Zhou, Aurick and Quillen, Deirdre and Finn, Chelsea and Levine, Sergey   [|•|]
[|•|] Contrastive Language, Action, and State Pre-training for Robot Learning (2023)   -   Rana, Krishan and Melnik, Andrew and Sünderhauf, Niko   [|•|]
[|•|] Residual Skill Policies: Learning an Adaptable Skill-based Action Space for Reinforcement Learning for Robotics (2022)   -   Rana, Krishan and Xu, Ming and Tidd, Brendan and Milford, Michael and Sünderhauf, Niko   [|•|]
[|•|] Euclideanizing Flows: Diffeomorphic Reduction for Learning Stable Dynamical Systems (2020)   -   Rana, Muhammad Asif and Li, Anqi and Fox, Dieter and Boots, Byron and Ramos, Fabio and Ratliff, Nathan   [|•|]
[|•|] On the Convergence of Adam and Beyond (2019)   -   Reddi, Sashank J. and Kale, Satyen and Kumar, Sanjiv   [|•|]
[|•|] SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards (2019)   -   Reddy, Siddharth and Dragan, Anca D. and Levine, Sergey   [|•|]
[|•|] Neural Programmer-Interpreters (2015)   -   Reed, Scott and Freitas, Nando de   [|•|]
[|•|] Successor Feature Representations (2021)   -   Reinke, Chris and Alameda-Pineda, Xavier   [|•|]
[|•|] Extended Tree Search for Robot Task and Motion Planning (2021)   -   Ren, Tianyu and Chalvatzaki, Georgia and Peters, Jan   [|•|]
[|•|] Backplay: "Man muss immer umkehren" (2018)   -   Resnick, Cinjon and Raileanu, Roberta and Kapoor, Sanyam and Peysakhovich, Alexander and Cho, Kyunghyun and Bruna, Joan   [|•|]
[|•|] Variational Inference with Normalizing Flows (2015)   -   Rezende, Danilo Jimenez and Mohamed, Shakir   [|•|]
[|•|] Learning by Playing - Solving Sparse Reward Tasks from Scratch (2018)   -   Riedmiller, Martin and Hafner, Roland and Lampe, Thomas and Neunert, Michael and Degrave, Jonas and Wiele, Tom Van de and Mnih, Volodymyr and Heess, Nicolas and Springenberg, Jost Tobias   [|•|]
[|•|] A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets (2012)   -   Roux, Nicolas Le and Schmidt, Mark and Bach, Francis   [|•|]
[|•|] An Analysis of Categorical Distributional Reinforcement Learning (2018)   -   Rowland, Mark and Bellemare, Marc G. and Dabney, Will and Munos, Rémi and Teh, Yee Whye   [|•|]
[|•|] An overview of gradient descent optimization algorithms (2016)   -   Ruder, Sebastian   [|•|]
[|•|] Generative Class-conditional Autoencoders (2014)   -   Rudy, Jan and Taylor, Graham   [|•|]
[|•|] CAQL: Continuous Action Q-Learning (2019)   -   Ryu, Moonkyung and Chow, Yinlam and Anderson, Ross and Tjandraatmadja, Christian and Boutilier, Craig   [|•|]
[|•|] Dynamic Routing Between Capsules (2017)   -   Sabour, Sara and Frosst, Nicholas and Hinton, Geoffrey E.   [|•|]
[|•|] Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks (2016)   -   Salimans, Tim and Kingma, Diederik P.   [|•|]
[|•|] Graph networks as learnable physics engines for inference and control (2018)   -   Sanchez-Gonzalez, Alvaro and Heess, Nicolas and Springenberg, Jost Tobias and Merel, Josh and Riedmiller, Martin and Hadsell, Raia and Battaglia, Peter   [|•|]
[|•|] Group Sparse Regularization for Deep Neural Networks (2016)   -   Scardapane, Simone and Comminiello, Danilo and Hussain, Amir and Uncini, Aurelio   [|•|]
[|•|] Prioritized Experience Replay (2015)   -   Schaul, Tom and Quan, John and Antonoglou, Ioannis and Silver, David   [|•|]
[|•|] Generative Adversarial Networks are Special Cases of Artificial Curiosity (1990) and also Closely Related to Predictability Minimization (1991) (2019)   -   Schmidhuber, Juergen   [|•|]
[|•|] Learning to Modulate pre-trained Models in RL (2023)   -   Schmied, Thomas and Hofmarcher, Markus and Paischer, Fabian and Pascanu, Razvan and Hochreiter, Sepp   [|•|]
[|•|] Improving Model-Based Reinforcement Learning with Internal State Representations through Self-Supervision (2021)   -   Scholz, Julien and Weber, Cornelius and Hafez, Muhammad Burhan and Wermter, Stefan   [|•|]
[|•|] Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model (2019)   -   Schrittwieser, Julian and Antonoglou, Ioannis and Hubert, Thomas and Simonyan, Karen and Sifre, Laurent and Schmitt, Simon and Guez, Arthur and Lockhart, Edward and Hassabis, Demis and Graepel, Thore and Lillicrap, Timothy and Silver, David   [|•|]
[|•|] Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning (2020)   -   Schroecker, Yannick and Isbell, Charles   [|•|]
[|•|] Trust Region Policy Optimization (2015)   -   Schulman, John and Levine, Sergey and Moritz, Philipp and Jordan, Michael I. and Abbeel, Pieter   [|•|]
[|•|] High-Dimensional Continuous Control Using Generalized Advantage Estimation (2015)   -   Schulman, John and Moritz, Philipp and Levine, Sergey and Jordan, Michael and Abbeel, Pieter   [|•|]
[|•|] Proximal Policy Optimization Algorithms (2017)   -   Schulman, John and Wolski, Filip and Dhariwal, Prafulla and Radford, Alec and Klimov, Oleg   [|•|]
[|•|] DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems (2022)   -   Schumacher, Pierre and Häufle, Daniel and Büchler, Dieter and Schmitt, Syn and Martius, Georg   [|•|]
[|•|] Simultaneously Learning Vision and Feature-based Control Policies for Real-world Ball-in-a-Cup (2019)   -   Schwab, Devin and Springenberg, Tobias and Martins, Murilo F. and Lampe, Thomas and Neunert, Michael and Abdolmaleki, Abbas and Hertweck, Tim and Hafner, Roland and Nori, Francesco and Riedmiller, Martin   [|•|]
[|•|] Bigger, Better, Faster: Human-level Atari with human-level efficiency (2023)   -   Schwarzer, Max and Obando-Ceron, Johan and Courville, Aaron and Bellemare, Marc and Agarwal, Rishabh and Castro, Pablo Samuel   [|•|]
[|•|] State Entropy Maximization with Random Encoders for Efficient Exploration (2021)   -   Seo, Younggyo and Chen, Lili and Shin, Jinwoo and Lee, Honglak and Abbeel, Pieter and Lee, Kimin   [|•|]
[|•|] Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies (2021)   -   Seyde, Tim and Gilitschenski, Igor and Schwarting, Wilko and Stellato, Bartolomeo and Riedmiller, Martin and Wulfmeier, Markus and Rus, Daniela   [|•|]
[|•|] Learning to Plan Optimistically: Uncertainty-Guided Deep Exploration via Latent Model Ensembles (2020)   -   Seyde, Tim and Schwarting, Wilko and Karaman, Sertac and Rus, Daniela   [|•|]
[|•|] Solving Continuous Control via Q-learning (2022)   -   Seyde, Tim and Werner, Peter and Schwarting, Wilko and Gilitschenski, Igor and Riedmiller, Martin and Rus, Daniela and Wulfmeier, Markus   [|•|]
[|•|] Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning (2020)   -   Sharma, Archit and Ahn, Michael and Levine, Sergey and Kumar, Vikash and Hausman, Karol and Gu, Shixiang   [|•|]
[|•|] Dynamics-Aware Unsupervised Discovery of Skills (2019)   -   Sharma, Archit and Gu, Shixiang and Levine, Sergey and Kumar, Vikash and Hausman, Karol   [|•|]
[|•|] Sequential Interpretability: Methods, Applications, and Future Direction for Understanding Deep Learning Models in the Context of Sequential Data (2020)   -   Shickel, Benjamin and Rashidi, Parisa   [|•|]
[|•|] Residual Policy Learning (2018)   -   Silver, Tom and Allen, Kelsey and Tenenbaum, Josh and Kaelbling, Leslie   [|•|]
[|•|] Parrot: Data-Driven Behavioral Priors for Reinforcement Learning (2020)   -   Singh, Avi and Liu, Huihan and Zhou, Gaoyue and Yu, Albert and Rhinehart, Nicholas and Levine, Sergey   [|•|]
[|•|] Frugal Actor-Critic: Sample Efficient Off-Policy Deep Reinforcement Learning Using Unique Experiences (2024)   -   Singh, Nikhil Kumar and Saha, Indranil   [|•|]
[|•|] SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition (2022)   -   Slack, Dylan and Chow, Yinlam and Dai, Bo and Wichers, Nevan   [|•|]
[|•|] A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning (2022)   -   Smith, Laura and Kostrikov, Ilya and Levine, Sergey   [|•|]
[|•|] Eliminating all bad Local Minima from Loss Landscapes without even adding an Extra Unit (2019)   -   Sohl-Dickstein, Jascha and Kawaguchi, Kenji   [|•|]
[|•|] ES-MAML: Simple Hessian-Free Meta Learning (2019)   -   Song, Xingyou and Gao, Wenbo and Yang, Yuxiang and Choromanski, Krzysztof and Pacchiano, Aldo and Tang, Yunhao   [|•|]
[|•|] Local Search for Policy Iteration in Continuous Control (2020)   -   Springenberg, Jost Tobias and Heess, Nicolas and Mankowitz, Daniel and Merel, Josh and Byravan, Arunkumar and Abdolmaleki, Abbas and Kay, Jackie and Degrave, Jonas and Schrittwieser, Julian and Tassa, Yuval and Buchli, Jonas and Belov, Dan and Riedmiller, Martin   [|•|]
[|•|] Automatically Bounding the Taylor Remainder Series: Tighter Bounds and New Applications (2022)   -   Streeter, Matthew and Dillon, Joshua V.   [|•|]
[|•|] Do Differentiable Simulators Give Better Policy Gradients? (2022)   -   Suh, H. J. Terry and Simchowitz, Max and Zhang, Kaiqing and Tedrake, Russ   [|•|]
[|•|] FISAR: Forward Invariant Safe Reinforcement Learning with a Deep Neural Network-Based Optimizer (2020)   -   Sun, Chuangchuang and Kim, Dong-Ki and How, Jonathan P.   [|•|]
[|•|] A Survey of Deep Network Solutions for Learning Control in Robotics: From Reinforcement to Imitation (2016)   -   Tai, Lei and Zhang, Jingwei and Liu, Ming and Boedecker, Joschka and Burgard, Wolfram   [|•|]
[|•|] Novelty Search in Representational Space for Sample Efficient Exploration (2020)   -   Tao, Ruo Yu and François-Lavet, Vincent and Pineau, Joelle   [|•|]
[|•|] DeepMind Control Suite (2018)   -   Tassa, Yuval and Doron, Yotam and Muldal, Alistair and Erez, Tom and Li, Yazhe and Casas, Diego de Las and Budden, David and Abdolmaleki, Abbas and Merel, Josh and Lefrancq, Andrew and Lillicrap, Timothy and Riedmiller, Martin   [|•|]
[|•|] Action Branching Architectures for Deep Reinforcement Learning (2017)   -   Tavakoli, Arash and Pardo, Fabio and Kormushev, Petar   [|•|]
[|•|] On Bonus-Based Exploration Methods in the Arcade Learning Environment (2021)   -   Taïga, Adrien Ali and Fedus, William and Machado, Marlos C. and Courville, Aaron and Bellemare, Marc G.   [|•|]
[|•|] Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning (2021)   -   Team, DeepMind Interactive Agents and Abramson, Josh and Ahuja, Arun and Brussee, Arthur and Carnevale, Federico and Cassin, Mary and Fischer, Felix and Georgiev, Petko and Goldin, Alex and Gupta, Mansi and Harley, Tim and Hill, Felix and Humphreys, Peter C. and Hung, Alden and Landon, Jessica and Lillicrap, Timothy and Merzic, Hamza and Muldal, Alistair and Santoro, Adam and Scully, Guy and Glehn, Tamara von and Wayne, Greg and Wong, Nathaniel and Yan, Chen and Zhu, Rui   [|•|]
[|•|] Open-Ended Learning Leads to Generally Capable Agents (2021)   -   Team, Open Ended Learning and Stooke, Adam and Mahajan, Anuj and Barros, Catarina and Deck, Charlie and Bauer, Jakob and Sygnowski, Jakub and Trebacz, Maja and Jaderberg, Max and Mathieu, Michael and McAleese, Nat and Bradley-Schmieg, Nathalie and Wong, Nathaniel and Porcel, Nicolas and Raileanu, Roberta and Hughes-Fitt, Steph and Dalibard, Valentin and Czarnecki, Wojciech Marian   [|•|]
[|•|] Safe Reinforcement Learning by Imagining the Near Future (2022)   -   Thomas, Garrett and Luo, Yuping and Ma, Tengyu   [|•|]
[|•|] Learning Setup Policies: Reliable Transition Between Locomotion Behaviours (2021)   -   Tidd, Brendan and Hudson, Nicolas and Cosgun, Akansel and Leitner, Jurgen   [|•|]
[|•|] Learning When to Switch: Composing Controllers to Traverse a Sequence of Terrain Artifacts (2020)   -   Tidd, Brendan and Hudson, Nicolas and Cosgun, Akansel and Leitner, Jurgen   [|•|]
[|•|] Behavior Priors for Efficient Reinforcement Learning (2020)   -   Tirumala, Dhruva and Galashov, Alexandre and Noh, Hyeonwoo and Hasenclever, Leonard and Pascanu, Razvan and Schwarz, Jonathan and Desjardins, Guillaume and Czarnecki, Wojciech Marian and Ahuja, Arun and Teh, Yee Whye and Heess, Nicolas   [|•|]
[|•|] Invariant Funnels around Trajectories using Sum-of-Squares Programming (2010)   -   Tobenkin, Mark M. and Manchester, Ian R. and Tedrake, Russ   [|•|]
[|•|] Modular Safety-Critical Control of Legged Robots (2023)   -   Tosun, Berk and Samur, Evren   [|•|]
[|•|] Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards (2019)   -   Trott, Alexander and Zheng, Stephan and Xiong, Caiming and Socher, Richard   [|•|]
[|•|] Jump-Start Reinforcement Learning (2022)   -   Uchendu, Ikechukwu and Xiao, Ted and Lu, Yao and Zhu, Banghua and Yan, Mengyuan and Simon, Joséphine and Bennice, Matthew and Fu, Chuyuan and Ma, Cong and Jiao, Jiantao and Levine, Sergey and Hausman, Karol   [|•|]
[|•|] Discovering the Elite Hypervolume by Leveraging Interspecies Correlation (2018)   -   Vassiliades, Vassilis and Mouret, Jean-Baptiste   [|•|]
[|•|] Attention Is All You Need (2017)   -   Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia   [|•|]
[|•|] Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards (2017)   -   Vecerik, Mel and Hester, Todd and Scholz, Jonathan and Wang, Fumin and Pietquin, Olivier and Piot, Bilal and Heess, Nicolas and Rothörl, Thomas and Lampe, Thomas and Riedmiller, Martin   [|•|]
[|•|] Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks (2017)   -   Velez, Roby and Clune, Jeff   [|•|]
[|•|] SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration (2022)   -   Vezzani, Giulia and Tirumala, Dhruva and Wulfmeier, Markus and Rao, Dushyant and Abdolmaleki, Abbas and Moran, Ben and Haarnoja, Tuomas and Humplik, Jan and Hafner, Roland and Neunert, Michael and Fantacci, Claudio and Hertweck, Tim and Lampe, Thomas and Sadeghi, Fereshteh and Heess, Nicolas and Riedmiller, Martin   [|•|]
[|•|] Implicitly Regularized RL with Implicit Q-Values (2021)   -   Vieillard, Nino and Andrychowicz, Marcin and Raichuk, Anton and Pietquin, Olivier and Geist, Matthieu   [|•|]
[|•|] Krylov Subspace Descent for Deep Learning (2011)   -   Vinyals, Oriol and Povey, Daniel   [|•|]
[|•|] Convex optimization (2021)   -   Vorontsova, Evgeniya and Hildebrand, Roland and Gasnikov, Alexander and Stonyakin, Fedor   [|•|]
[|•|] Explainable CNN-attention Networks (C-Attention Network) for Automated Detection of Alzheimer’s Disease (2020)   -   Wang, Ning and Chen, Mingxuan and Subbalakshmi, K. P.   [|•|]
[|•|] Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation (2019)   -   Wang, Ruohan and Ciliberto, Carlo and Amadori, Pierluigi and Demiris, Yiannis   [|•|]
[|•|] Support-weighted Adversarial Imitation Learning (2020)   -   Wang, Ruohan and Ciliberto, Carlo and Amadori, Pierluigi and Demiris, Yiannis   [|•|]
[|•|] Benchmarking Model-Based Reinforcement Learning (2019)   -   Wang, Tingwu and Bao, Xuchan and Clavera, Ignasi and Hoang, Jerrick and Wen, Yeming and Langlois, Eric and Zhang, Shunshi and Zhang, Guodong and Abbeel, Pieter and Ba, Jimmy   [|•|]
[|•|] A Surrogate-Assisted Controller for Expensive Evolutionary Reinforcement Learning (2022)   -   Wang, Yuxing and Zhang, Tiantian and Chang, Yongzhe and Liang, Bin and Wang, Xueqian and Yuan, Bo   [|•|]
[|•|] Sample Efficient Actor-Critic with Experience Replay (2016)   -   Wang, Ziyu and Bapst, Victor and Heess, Nicolas and Mnih, Volodymyr and Munos, Remi and Kavukcuoglu, Koray and Freitas, Nando de   [|•|]
[|•|] Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies (2019)   -   Ward, Patrick Nadeem and Smofsky, Ariella and Bose, Avishek Joey   [|•|]
[|•|] Unsupervised Control Through Non-Parametric Discriminative Rewards (2018)   -   Warde-Farley, David and Wiele, Tom Van de and Kulkarni, Tejas and Ionescu, Catalin and Hansen, Steven and Mnih, Volodymyr   [|•|]
[|•|] Optimization-Based Control for Dynamic Legged Robots (2022)   -   Wensing, Patrick M. and Posa, Michael and Hu, Yue and Escande, Adrien and Mansard, Nicolas and Prete, Andrea Del   [|•|]
[|•|] Q-Learning in enormous action spaces via amortized approximate maximization (2020)   -   Wiele, Tom Van de and Warde-Farley, David and Mnih, Andriy and Mnih, Volodymyr   [|•|]
[|•|] Model Predictive Path Integral Control using Covariance Variable Importance Sampling (2015)   -   Williams, Grady and Aldrich, Andrew and Theodorou, Evangelos   [|•|]
[|•|] A Lyapunov Analysis of Momentum Methods in Optimization (2016)   -   Wilson, Ashia C. and Recht, Benjamin and Jordan, Michael I.   [|•|]
[|•|] Evolving simple programs for playing Atari games (2018)   -   Wilson, Dennis G. and Cussat-Blanc, Sylvain and Luga, Hervé and Miller, Julian F.   [|•|]
[|•|] Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance (2021)   -   Wu, Yanqiu and Chen, Xinyue and Wang, Che and Zhang, Yiming and Ross, Keith W.   [|•|]
[|•|] When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning (2022)   -   Xie, Annie and Tajwar, Fahim and Sharma, Archit and Finn, Chelsea   [|•|]
[|•|] Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization (2023)   -   Xu, Haoran and Jiang, Li and Li, Jianxiong and Yang, Zhuoran and Wang, Zhaoran and Chan, Victor Wai Kin and Zhan, Xianyuan   [|•|]
[|•|] VideoGPT: Video Generation using VQ-VAE and Transformers (2021)   -   Yan, Wilson and Zhang, Yunzhi and Abbeel, Pieter and Srinivas, Aravind   [|•|]
[|•|] Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation (2024)   -   Yang, Jonathan and Glossop, Catherine and Bhorkar, Arjun and Shah, Dhruv and Vuong, Quan and Finn, Chelsea and Sadigh, Dorsa and Levine, Sergey   [|•|]
[|•|] TRAIL: Near-Optimal Imitation Learning with Suboptimal Data (2021)   -   Yang, Mengjiao and Levine, Sergey and Nachum, Ofir   [|•|]
[|•|] Discovering Diverse Athletic Jumping Strategies (2021)   -   Yin, Zhiqi and Yang, Zeshi and Panne, Michiel van de and Yin, KangKang   [|•|]
[|•|] How transferable are features in deep neural networks? (2014)   -   Yosinski, Jason and Clune, Jeff and Bengio, Yoshua and Lipson, Hod   [|•|]
[|•|] Self-supervised Sequential Information Bottleneck for Robust Exploration in Deep Reinforcement Learning (2022)   -   You, Bang and Xie, Jingming and Chen, Youping and Peters, Jan and Arenz, Oleg   [|•|]
[|•|] Diffeomorphic Learning (2018)   -   Younes, Laurent   [|•|]
[|•|] COMBO: Conservative Offline Model-Based Policy Optimization (2021)   -   Yu, Tianhe and Kumar, Aviral and Rafailov, Rafael and Rajeswaran, Aravind and Levine, Sergey and Finn, Chelsea   [|•|]
[|•|] Learning Symmetric and Low-energy Locomotion (2018)   -   Yu, Wenhao and Turk, Greg and Liu, C. Karen   [|•|]
[|•|] On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning (2021)   -   Zhang, Baohe and Rajan, Raghu and Pineda, Luis and Lambert, Nathan and Biedenkapp, André and Chua, Kurtland and Hutter, Frank and Calandra, Roberto   [|•|]
[|•|] Three Mechanisms of Weight Decay Regularization (2018)   -   Zhang, Guodong and Wang, Chaoqi and Xu, Bowen and Grosse, Roger   [|•|]
[|•|] Understanding Hindsight Goal Relabeling from a Divergence Minimization Perspective (2022)   -   Zhang, Lunjun and Stadie, Bradly C.   [|•|]
[|•|] C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks (2021)   -   Zhang, Tianjun and Eysenbach, Benjamin and Salakhutdinov, Ruslan and Levine, Sergey and Gonzalez, Joseph E.   [|•|]
[|•|] Adversarially Regularized Autoencoders (2017)   -   Zhao, Jake and Kim, Yoon and Zhang, Kelly and Rush, Alexander M. and LeCun, Yann   [|•|]
[|•|] Domain Generalization: A Survey (2021)   -   Zhou, Kaiyang and Liu, Ziwei and Qiao, Yu and Xiang, Tao and Loy, Chen Change   [|•|]
[|•|] Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation (2021)   -   Zhu, Yifeng and Stone, Peter and Zhu, Yuke   [|•|]
[|•|] Transfer Learning in Deep Reinforcement Learning: A Survey (2020)   -   Zhu, Zhuangdi and Lin, Kaixiang and Jain, Anil K. and Zhou, Jiayu   [|•|]
[|•|] Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains (2019)   -   Zimmer, Matthieu and Weng, Paul   [|•|]
[|•|] A Bayesian Approach to Policy Recognition and State Representation Learning (2016)   -   Šošić, Adrian and Zoubir, Abdelhak M. and Koeppl, Heinz   [|•|]