Lifelong Learning: A Reinforcement Learning Approach
ICML Workshop 2017
Sydney, 10 August 2017