Xujie Song
Publication
Distributional Soft Actor-Critic with Diffusion Policy
We proposed DSAC-D (Distributional Soft Actor-Critic with Diffusion Policy) to address two challenges: estimation bias in value functions and the representation of multimodal policies.
Tong Liu*, Yinuo Wang*, Xujie Song*, Wenjun Zou, Liangfa Chen, Likun Wang, Bin Shuai, Jingliang Duan, Shengbo Eben Li
PDF
LipsNet++: Unifying Filter and Controller into a Policy Network
We unified filtering and control into a single RL policy network, achieving state-of-the-art noise robustness and action smoothness in real-world control tasks.
Xujie Song, Liangfa Chen, Tong Liu, Wenxuan Wang, Yinuo Wang, Shentao Qin, Yinsong Ma, Jingliang Duan, Shengbo Eben Li
PDF
Video
Project page
Enhanced DACER Algorithm with Multimodal Q-value Distribution for Risk-Sensitive Stochastic Vehicle Environments
We proposed DACER++, an online multimodal distributional RL algorithm.
Tong Liu, Xujie Song, Wenxuan Wang, Wenjun Zou, Bin Shuai, Haoyu Gao, Weixian He, Jingliang Duan, Shengbo Eben Li
PDF
Diffusion Actor-Critic with Entropy Regulator
We proposed DACER, an online reinforcement learning algorithm that utilizes a diffusion model as the actor network to enhance the representational capacity of the policy.
Yinuo Wang, Likun Wang, Yuxuan Jiang, Wenjun Zou, Tong Liu, Xujie Song, Wenxuan Wang, Liming Xiao, Jiang Wu, Jingliang Duan, Shengbo Eben Li
PDF
LipsNet: A Smooth and Robust Neural Network with Adaptive Lipschitz Constant for High Accuracy Optimal Control
We proposed LipsNet, a smooth and robust neural network with an adaptive Lipschitz constant, to address the action fluctuation problem in reinforcement learning (RL).
Xujie Song, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Chen Chen, Bo Cheng, Bo Zhang, Junqing Wei, Xiaoming Simon Wang
PDF
Code
Poster
Video
ODE-based Smoothing Neural Network for Reinforcement Learning Tasks
We proposed SmODE, a variant of the neural ODE, to smooth control actions in RL. It incorporates a mapping function that estimates the rate of change of the system dynamics.
Yinuo Wang, Wenxuan Wang, Xujie Song, Tong Liu, Yuming Yin, Liangfa Chen, Likun Wang, Jingliang Duan, Shengbo Eben Li
PDF