Add extra call to clear_memory() to fix OOM when sending critic data (#91)

Signed-off-by: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
NVIDIA · Jan 25, 2024 · ebe1bcf
1 parent a988e9e
Showing 2 changed files with 2 additions and 0 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -7,6 +7,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 
 ### New features and optimizations
 - Added public-facing official Dockerfile for NeMo-Aligner
+- Memory optimization in PPO that helps avoid OOM in the actor when sending training data to the critic
 
 ### Breaking changes
 

diff --git a/nemo_aligner/algorithms/ppo.py b/nemo_aligner/algorithms/ppo.py
@@ -385,6 +385,7 @@ def fit(self):
                 timing_metrics["rollout_time"] = self.timer.get("rollout_time")
 
                 # send critic train
+                clear_memory()
                 self.rm_critic.train(ppo_rollout_data)
 
                 # logging
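
The diff above only adds a call to `clear_memory()`; the helper's body is not part of this commit. As a rough illustration of what such a helper typically does before a memory-heavy step, here is a minimal sketch. The name `clear_memory_sketch` and its return value are hypothetical, and the assumption that the real helper combines Python garbage collection with a CUDA cache flush is not confirmed by this diff.

```python
import gc
import weakref

try:
    import torch  # optional: the sketch degrades to a plain gc pass without it
except ImportError:
    torch = None


def clear_memory_sketch() -> int:
    """Hypothetical stand-in for clear_memory(); not the NeMo-Aligner code.

    Returns the number of unreachable objects found by the garbage collector.
    """
    cuda_ok = torch is not None and torch.cuda.is_available()
    if cuda_ok:
        # Wait for pending kernels so their buffers are no longer in use.
        torch.cuda.synchronize()
    # Drop unreachable Python objects (including ones holding tensors).
    found = gc.collect()
    if cuda_ok:
        # Return cached, unused blocks from PyTorch's allocator to the driver.
        torch.cuda.empty_cache()
    return found


class Rollout:
    """Toy object with a reference cycle, standing in for rollout buffers."""


buf = Rollout()
buf.me = buf  # the cycle keeps the object alive until a gc pass runs
ref = weakref.ref(buf)
del buf

clear_memory_sketch()
print(ref() is None)
```

Calling the helper right before `self.rm_critic.train(ppo_rollout_data)` fits the changelog's description: memory left over from the rollout phase is released before the actor sends training data to the critic, which is where the OOM was reported.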