Zongzhang Zhang
A survey on deep reinforcement learning
Q Liu, JW Zhai, ZZ Zhang, S Zhong, Q Zhou, P Zhang, J Xu
Chinese Journal of Computers 41 (1), 1-27, 2018
Weighted double Q-learning
Z Zhang, Z Pan, MJ Kochenderfer
IJCAI-2017, 3455-3461, 2017
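A survey on deep reinforcement learning (in Chinese)
Q Liu, JW Zhai, ZZ Zhang, S Zhong, Q Zhou, P Zhang, J Xu
Chinese Journal of Computers 41 (1), 1-27, 2018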
A deep Bayesian policy reuse approach against non-stationary agents
Y Zheng, Z Meng, J Hao, Z Zhang, T Yang, C Fan
NeurIPS-2018, 954-964, 2018
Hierarchical deep multiagent reinforcement learning with temporal abstraction
H Tang, J Hao, T Lv, Y Chen, Z Zhang, H Jia, C Ren, Y Zheng, Z Meng, ...
arXiv preprint arXiv:1809.09332, 2018
Weighted double deep multiagent reinforcement learning in stochastic cooperative environments
Y Zheng, Z Meng, J Hao, Z Zhang
PRICAI-2018, 421-429, 2018
A survey on deep reinforcement learning
Q Liu, J Zhai, Z Zhang, S Zhong, Q Zhou
Chinese Journal of Computers 41 (01), 1-27, 2018
Deep Q-learning with prioritized sampling
J Zhai, Q Liu, Z Zhang, S Zhong, H Zhu, P Zhang, C Sun
ICONIP-2016, 13-22, 2016
Covering number for efficient heuristic-based POMDP planning
Z Zhang, D Hsu, WS Lee
ICML-2014, 28-36, 2014
Efficient deep reinforcement learning via adaptive policy transfer
T Yang, J Hao, Z Meng, Z Zhang, Y Hu, Y Chen, C Fan, W Wang, W Liu, ...
IJCAI-2020, 3094-3100, 2020
Covering number as a complexity measure for POMDP planning and learning
Z Zhang, M Littman, X Chen
AAAI-2012, 1853-1859, 2012
Triple-GAIL: A multi-modal imitation learning framework with generative adversarial nets
C Fei, B Wang, Y Zhuang, Z Zhang, J Hao, H Zhang, X Ji, W Liu
IJCAI-2020, 2929-2935, 2020
Thompson sampling based Monte-Carlo planning in POMDPs
A Bai, F Wu, Z Zhang, X Chen
ICAPS-2014, 28-36, 2014
Multi-agent incentive communication via decentralized teammate modeling
L Yuan, J Wang, F Zhang, C Wang, Z Zhang, Y Yu, C Zhang
AAAI-2022, 9466-9474, 2022
Efficient policy detecting and reusing for non-stationarity in Markov games
Y Zheng, J Hao, Z Zhang, Z Meng, T Yang, Y Li, C Fan
Autonomous Agents and Multi-Agent Systems 35 (1), 1-29, 2021
Efficient reinforcement learning in continuous state and action spaces with Dyna and policy approximation
S Zhong, Q Liu, Z Zhang, Q Fu
Frontiers of Computer Science 13 (1), 106-126, 2019
PLEASE: palm leaf search for POMDPs with large observation spaces
Z Zhang, D Hsu, WS Lee, ZW Lim, A Bai
ICAPS-2015, 249-257, 2015
FHHOP: A factored hybrid heuristic online planning algorithm for large POMDPs
Z Zhang, X Chen
UAI-2012, 934-943, 2012
Accelerating point-based POMDP algorithms via greedy strategies
Z Zhang, X Chen
SIMPAR-2010, 545-556, 2010
A framework of dual replay buffer: Balancing forgetting and generalization in reinforcement learning
L Zhang, Z Zhang, Z Pan, Y Chen, J Zhu, Z Wang, M Wang, C Fan
IJCAI Workshop on Scaling Up Reinforcement Learning (SURL), 2019