Evaluating PPO and SAC Algorithms for Continuous Control
Role & Contributions
Lead developer for SAC and DDPG implementations
Conducted comprehensive hyperparameter studies
Performed comparative analysis of algorithms
Generated detailed performance metrics and visualizations
Date
October 2024 - December 2024
Overview
A comparative study implementing and analyzing three state-of-the-art reinforcement learning algorithms - Proximal Policy Optimization (PPO), Soft Actor-Critic (SAC), and Deep Deterministic Policy Gradient (DDPG) - for solving challenging continuous control tasks in the CarRacing-v2 environment.
Project Goals
Implement and evaluate multiple deep RL algorithms
Compare performance metrics including sample efficiency, stability, and final performance
Analyze hyperparameter sensitivity and optimization
Develop practical insights for algorithm selection in continuous control tasks
Technical Details
Environments
CarRacing-v2: a vision-based racing environment with:
96x96 RGB image observations
Continuous action space for steering, acceleration, and braking
Procedurally generated tracks
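To make the setup concrete, here is a minimal sketch of loading the environment and stepping it with random actions. It assumes Gymnasium (the maintained successor to OpenAI Gym) at a version that still registers CarRacing-v2, with the Box2D extras installed; the random-action loop is a placeholder, not the project's policy.

```python
import gymnasium as gym

# Assumes: pip install "gymnasium[box2d]" and a Gymnasium release
# that registers CarRacing-v2 (later releases renamed it to v3).
# continuous=True gives the 3-D action space: [steering, gas, brake].
env = gym.make("CarRacing-v2", continuous=True)

obs, info = env.reset(seed=0)
print(obs.shape)  # (96, 96, 3) RGB frame

for _ in range(100):
    action = env.action_space.sample()  # random policy as a stand-in
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```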
Algorithms Implemented
SAC Implementation (My Focus):
Dual critic networks for reduced overestimation bias
Automatic entropy tuning for exploration
Experience replay buffer for sample efficiency
State-of-the-art performance in continuous control
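The two SAC ingredients above combine in the critic target: bootstrap from the minimum of two target critics, minus an entropy bonus weighted by the learned temperature. A minimal PyTorch sketch follows; the policy's `sample` API, the replay-batch fields, and the hyperparameter names are illustrative assumptions, not the project's exact code.

```python
import torch

def sac_critic_target(batch, target_q1, target_q2, policy, log_alpha, gamma=0.99):
    """SAC bootstrap target: min of two target critics (reduces
    overestimation bias) minus the entropy bonus alpha * log pi."""
    with torch.no_grad():
        # Assumed API: policy.sample returns (action, log_prob).
        next_action, next_log_prob = policy.sample(batch.next_obs)
        min_q = torch.min(target_q1(batch.next_obs, next_action),
                          target_q2(batch.next_obs, next_action))
        alpha = log_alpha.exp()
        # Soft value: V(s') = min_i Q_i(s', a') - alpha * log pi(a'|s')
        return batch.reward + gamma * (1.0 - batch.done) * (min_q - alpha * next_log_prob)

def alpha_loss(log_alpha, log_prob, target_entropy):
    """Automatic entropy tuning: adjust alpha so the policy's entropy
    tracks a target (commonly -|A| for continuous action spaces)."""
    return -(log_alpha * (log_prob + target_entropy).detach()).mean()
```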
DDPG Implementation (My Focus):
Deterministic policy gradient approach
Actor-critic architecture with target networks
Ornstein-Uhlenbeck process for exploration
Specialized for continuous action spaces
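For the exploration point above, here is a short NumPy sketch of an Ornstein-Uhlenbeck noise process as typically used with DDPG. The parameter values are common defaults for illustration, not necessarily the ones tuned in this project.

```python
import numpy as np

class OUNoise:
    """Temporally correlated noise: dx = theta*(mu - x)*dt + sigma*sqrt(dt)*N(0, 1)."""
    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2):
        self.mu = mu * np.ones(size)
        self.theta, self.sigma, self.dt = theta, sigma, dt
        self.reset()

    def reset(self):
        # Start each episode at the long-run mean.
        self.x = self.mu.copy()

    def sample(self):
        self.x += self.theta * (self.mu - self.x) * self.dt \
                  + self.sigma * np.sqrt(self.dt) * np.random.standard_normal(self.x.shape)
        return self.x

# Typical usage (hypothetical actor): add noise to the deterministic
# action, then clip to the environment's action bounds.
# action = np.clip(actor(obs) + noise.sample(), env.action_space.low, env.action_space.high)
```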
PPO Implementation (Collaborative Work):
Clipped surrogate objective function
On-policy learning with improved stability
Value function and policy optimization
Robust performance across tasks
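The clipped surrogate objective is what gives PPO the stability noted above: it takes the pessimistic minimum of the unclipped and clipped probability-ratio terms, bounding how far one update can move the policy. A minimal PyTorch sketch, with assumed tensor names:

```python
import torch

def ppo_clip_loss(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Clipped surrogate: L = -E[min(r * A, clip(r, 1-eps, 1+eps) * A)],
    where r is the new/old action-probability ratio."""
    ratio = torch.exp(new_log_prob - old_log_prob)
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    # Negative because optimizers minimize; min() makes the bound pessimistic.
    return -torch.min(unclipped, clipped).mean()
```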
Key Results
SAC demonstrated superior sample efficiency
PPO showed better training stability
DDPG provided effective learning in continuous spaces
Successful navigation of complex racing scenarios
Technologies Used
Python
PyTorch/TensorFlow
OpenAI Gym
NumPy/Pandas
Matplotlib for visualization
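As an example of the visualization side, a small Matplotlib sketch of the kind of smoothed learning-curve comparison this project produced; the return arrays here are random placeholders, not the project's actual data.

```python
import numpy as np
import matplotlib.pyplot as plt

def smooth(x, k=20):
    """Simple moving average to tame noisy per-episode returns."""
    return np.convolve(x, np.ones(k) / k, mode="valid")

# Placeholder returns for each algorithm (illustrative only).
curves = {name: np.random.randn(500).cumsum() for name in ("PPO", "SAC", "DDPG")}

for name, returns in curves.items():
    plt.plot(smooth(returns), label=name)
plt.xlabel("Episode")
plt.ylabel("Return (smoothed)")
plt.title("Learning curves on CarRacing-v2")
plt.legend()
plt.show()
```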