Tengyu Xu
Title · Cited by · Year
Finite-sample analysis for SARSA with linear function approximation
S Zou, T Xu, Y Liang
Advances in neural information processing systems 32, 2019
123 · 2019
Two time-scale off-policy TD learning: Non-asymptotic analysis over Markovian samples
T Xu, S Zou, Y Liang
Advances in Neural Information Processing Systems 32, 2019
66 · 2019
Improving sample complexity bounds for actor-critic algorithms
T Xu, Z Wang, Y Liang
arXiv preprint arXiv:2004.12956, 2020
60* · 2020
Non-asymptotic convergence analysis of two time-scale (natural) actor-critic algorithms
T Xu, Z Wang, Y Liang
arXiv preprint arXiv:2005.03557, 2020
38 · 2020
A primal approach to constrained policy optimization: Global optimality and finite-time analysis
T Xu, Y Liang, G Lan
36* · 2020
Algorithms for the estimation of transient surface heat flux during ultra-fast surface cooling
ZF Zhou, TY Xu, B Chen
International Journal of Heat and Mass Transfer 100, 1-10, 2016
34 · 2016
Reanalysis of variance reduced temporal difference learning
T Xu, Z Wang, Y Zhou, Y Liang
arXiv preprint arXiv:2001.01898, 2020
31 · 2020
Non-asymptotic convergence of Adam-type reinforcement learning algorithms under Markovian sampling
H Xiong, T Xu, Y Liang, W Zhang
Proceedings of the AAAI Conference on Artificial Intelligence 35 (12), 10460 …, 2021
23 · 2021
Enhanced first and zeroth order variance reduced algorithms for min-max optimization
T Xu, Z Wang, Y Liang, HV Poor
19* · 2020
When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models?
T Xu, Y Zhou, K Ji, Y Liang
arXiv preprint arXiv:1806.04339, 2018
18* · 2018
Doubly robust off-policy actor-critic: Convergence and optimality
T Xu, Z Yang, Z Wang, Y Liang
International Conference on Machine Learning, 11581-11591, 2021
14 · 2021
Sample complexity bounds for two timescale value-based reinforcement learning algorithms
T Xu, Y Liang
International Conference on Artificial Intelligence and Statistics, 811-819, 2021
14 · 2021
Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry
Z Chen, Y Zhou, T Xu, Y Liang
arXiv preprint arXiv:2102.04653, 2021
10 · 2021
Faster algorithm and sharper analysis for constrained Markov decision process
T Li, Z Guan, S Zou, T Xu, Y Liang, G Lan
arXiv preprint arXiv:2110.10351, 2021
5 · 2021
When will generative adversarial imitation learning algorithms attain global convergence?
Z Guan, T Xu, Y Liang
International Conference on Artificial Intelligence and Statistics, 1117-1125, 2021
4 · 2021
A unified off-policy evaluation approach for general value function
T Xu, Z Yang, Z Wang, Y Liang
arXiv preprint arXiv:2107.02711, 2021
1 · 2021
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward
T Xu, Y Liang
arXiv preprint arXiv:2206.06426, 2022
2022
Deterministic Policy Gradient: Convergence Analysis
H Xiong, T Xu, L Zhao, Y Liang, W Zhang
The 38th Conference on Uncertainty in Artificial Intelligence, 2022
2022
Model-Based Offline Meta-Reinforcement Learning with Regularization
S Lin, J Wan, T Xu, Y Liang, J Zhang
arXiv preprint arXiv:2202.02929, 2022
2022
PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method
Z Guan, T Xu, Y Liang
arXiv preprint arXiv:2110.06906, 2021
2021
Articles 1–20