Starting from the top left image, here is an interpretation of the graphs displayed:
rollout/ep_len_mean (Episode Length Mean): Shows the average number of steps per episode. The curve rises steeply, plateaus, and then dips slightly, which suggests that the agent is learning to complete tasks efficiently (or is hitting the terminal conditions sooner). On the whole, this is consistent with policy convergence.
rollout/ep_reward_mean (Episode Reward Mean): Shows the average reward per episode. The sharp increase followed by a plateau shows learning progress and then performance stabilization. This indicates successful training (as long as the plateau corresponds to the desired behaviour).
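To make the two "mean" curves above concrete, here is a minimal sketch of how such values are typically produced, assuming the logger averages over a sliding window of the most recent episodes. The function name windowed_means and the window size of 100 are assumptions for illustration, not something read off the plots.

```python
from collections import deque


def windowed_means(episodes, window=100):
    """Return a list of (mean_reward, mean_length) pairs, one per finished episode.

    `episodes` is an iterable of (total_reward, length) tuples.
    `window` is the assumed number of recent episodes averaged over.
    """
    buffer = deque(maxlen=window)  # oldest episodes fall out automatically
    means = []
    for reward, length in episodes:
        buffer.append((reward, length))
        mean_reward = sum(r for r, _ in buffer) / len(buffer)
        mean_length = sum(n for _, n in buffer) / len(buffer)
        means.append((mean_reward, mean_length))
    return means
```

Because each plotted point is an average over many episodes, the curves are smoother than per-episode values, and a plateau means the recent average has stopped changing, not that every episode is identical.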
rollout/exploration_rate: Shows the epsilon decay in an epsilon-greedy policy. It drops from ~0.5 to 0.01 early in training, which confirms the shift from exploration to exploitation. This suggests the epsilon decay schedule was well configured.
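As a rough illustration of the decay described above, here is a sketch of a linear epsilon schedule. The linear shape and the fraction parameter (the share of training over which epsilon decays) are assumptions; only the ~0.5 and 0.01 endpoints come from the plot.

```python
def linear_epsilon(progress, start=0.5, end=0.01, fraction=0.1):
    """Return epsilon for a given training progress in [0, 1].

    Epsilon falls linearly from `start` to `end` over the first `fraction`
    of training (an assumed value), then stays at `end`.
    """
    if progress >= fraction:
        return end
    return start + (progress / fraction) * (end - start)


# Sample the schedule at a few points of training progress.
for p in (0.0, 0.05, 0.1, 0.5, 1.0):
    print(f"progress={p:.2f} epsilon={linear_epsilon(p):.3f}")
```

A schedule like this produces the same shape as the plot: a short, steep drop followed by a long flat tail at the final value.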
time/fps (Frames Per Second): Shows the training speed. Since it increases and then stabilizes, the training pipeline is performing well. Although this is not critical for policy performance, it is helpful when profiling runs.
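For reference, a throughput figure like this is usually just environment steps divided by elapsed wall-clock time. The helper below is a hypothetical sketch of that arithmetic, not the logger's actual code.

```python
import time


def frames_per_second(steps_done, start_time, now=None):
    """Return steps per second since `start_time` (seconds, monotonic clock).

    `now` can be passed explicitly; otherwise the current monotonic time is used.
    """
    now = time.monotonic() if now is None else now
    elapsed = max(now - start_time, 1e-9)  # guard against division by zero
    return int(steps_done / elapsed)
```

For example, 1000 steps completed in 4 seconds gives frames_per_second(1000, 0.0, now=4.0) == 250. Since the value is a running average from the start of training, it tends to be noisy at first and flatten out later.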