Repository containing code for the course CS780: Deep Reinforcement Learning

Rajarshi1001/CS780

Assignment 1

  • Implementation of Bernoulli and Gaussian bandit environments using the Gymnasium library, simulated for different combinations of hyperparameters
  • Implementation of the learning strategies pureExploitation, pureExploration, epsilonGreedyExploration, decayingEpsilonGreedyExploration, softmaxExploration and UCBExploration, with simulations on both environments and hyperparameter tuning for each environment
  • Implementation of a Random Walk environment, with trajectories created by the generateTrajectory function for simulation
  • Implementation of MonteCarloPrediction (both FVMC and EVMC) and TemporalDifferencePrediction for estimating state values in the environment
  • Plots of the evolution of state values over episodes, including log-scale episode axes, and seed-averaged plots to reduce noise
  • Analysis of the variation of target values for a particular state in both environments
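
As an illustration of the epsilon-greedy strategy listed above, here is a minimal sketch on a two-armed Bernoulli bandit. It is not the repository's code: the function name run_epsilon_greedy, the arm probabilities and the default hyperparameters are assumptions chosen for the example, and it uses only the Python standard library instead of a Gymnasium environment.

```python
import random


def run_epsilon_greedy(probs, epsilon=0.1, steps=5000, seed=0):
    """Epsilon-greedy on a Bernoulli bandit whose arms pay 1 with probability probs[a].

    Hypothetical helper for illustration; returns (value estimates, pull counts).
    """
    rng = random.Random(seed)
    n_arms = len(probs)
    q = [0.0] * n_arms     # sample-average estimate of each arm's value
    counts = [0] * n_arms  # number of times each arm was pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                   # explore: uniform random arm
        else:
            arm = max(range(n_arms), key=lambda a: q[a])  # exploit: ties go to the lowest index
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        q[arm] += (reward - q[arm]) / counts[arm]         # incremental mean update
    return q, counts


if __name__ == "__main__":
    estimates, pulls = run_epsilon_greedy([0.2, 0.8])
    print("estimates:", estimates)
    print("pulls:", pulls)
```

Setting epsilon to 0 gives pure exploitation and setting it to 1 gives pure exploration, so the same loop covers three of the listed strategies.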

Assignment 2

  • Implementation of control algorithms: MonteCarloControl, SARSAControl, Q-learning, double Q-learning, SARSA($\lambda$) with eligibility traces and Q($\lambda$) with traces
  • Implementation of the model-based algorithms Dyna-Q and Trajectory Sampling to compute the optimal policy and the state values in the Random Maze Environment
  • Comparison of off-policy and on-policy control algorithms on this environment
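
The off-policy Q-learning update mentioned above can be sketched on a toy problem. The corridor below is a hypothetical stand-in for the Random Maze Environment, whose layout is not described here: states 0 and n_states - 1 are terminal, and reaching the right end pays +1. The function name q_learning and all hyperparameters are assumptions for illustration.

```python
import random


def q_learning(n_states=5, episodes=1000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (actions: 0 = left, 1 = right)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        s = n_states // 2                       # every episode starts in the middle
        while 0 < s < n_states - 1:
            if rng.random() < epsilon:
                a = rng.randrange(2)            # explore
            else:
                best = max(q[s])                # exploit, breaking ties at random
                a = rng.choice([i for i in range(2) if q[s][i] == best])
            s_next = s + 1 if a == 1 else s - 1
            r = 1.0 if s_next == n_states - 1 else 0.0
            terminal = s_next in (0, n_states - 1)
            # Off-policy target: bootstrap from the greedy value of the next state.
            target = r if terminal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    for state, values in enumerate(q_learning()):
        print(state, values)
```

Replacing max(q[s_next]) with the value of the action actually taken in s_next turns this into the on-policy SARSA update, which is the distinction the comparison above examines.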

Assignment 3

This assignment primarily covers the implementation of five value-based deep RL models, namely:

  • Neural Fitted Q Iteration (NFQ)
  • Deep Q Network (DQN)
  • Double Deep Q Network (DDQN)
  • Dueling Double Deep Q Network (D3QN)
  • Dueling Double Deep Q Network with Prioritized Experience Replay (D3QN-PER)

and two policy-based deep RL models, namely:

  • REINFORCE
  • Vanilla Policy Gradient (VPG)

on two OpenAI Gym environments, Cartpole-v0 and MountainCar-v1.
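
The difference between the DQN and Double DQN (DDQN) bootstrap targets can be shown without any neural network. In the sketch below, plain lists of Q-values stand in for the outputs of the online and target networks at the next state; the function names and the example numbers are assumptions for illustration, not taken from the repository.

```python
def dqn_target(reward, done, gamma, q_target_next):
    """DQN target: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def ddqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN target: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    q_online_next = [1.0, 2.0, 0.5]  # hypothetical online-network outputs for s'
    q_target_next = [3.0, 1.5, 0.2]  # hypothetical target-network outputs for s'
    print(dqn_target(1.0, False, 0.9, q_target_next))
    print(ddqn_target(1.0, False, 0.9, q_online_next, q_target_next))
```

With these numbers the DQN target bootstraps from the largest target-network value, while the DDQN target bootstraps from the target-network value of the action the online network prefers, which is the mechanism DDQN uses to reduce overestimation.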

Assignment 4

This assignment primarily covers the implementation of three deep RL models for continuous action spaces, namely:

  • Deep Deterministic Policy Gradient (DDPG)
  • Twin Delayed Deep Deterministic Policy Gradient (TD3)
  • Proximal Policy Optimization (PPO)

on three OpenAI Gym environments: Pendulum-v1, Hopper-v4 and HalfCheetah-v1.
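
Of the three methods, PPO is the easiest to illustrate in a few lines, because its core is the clipped surrogate objective. The sketch below computes that objective for a batch of samples in plain Python; the function name ppo_clip_objective, the clip_eps default of 0.2 and the inputs are assumptions for illustration, and a real implementation would compute this on tensors and maximise it by gradient ascent.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate: min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum removes any incentive to push the ratio outside the clip range.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    new_lp = [math.log(1.5), math.log(0.5), 0.0]  # hypothetical log-probabilities
    old_lp = [0.0, 0.0, 0.0]
    adv = [1.0, -1.0, 2.0]
    print(ppo_clip_objective(new_lp, old_lp, adv))
```

For a positive advantage the ratio's contribution is capped at 1 + clip_eps, and for a negative advantage it is floored at 1 - clip_eps, so a single update cannot move the policy arbitrarily far from the one that collected the data.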

Midsem

  • Implementation of Random Maze Environment and its simulations
  • Implementation of Policy Iteration and Value Iteration to compute the optimal policy and the state values in the environment, with a comparative analysis.
  • Implementation of Monte Carlo, n-step Temporal Difference and TD($\lambda$) algorithms to estimate the value of each state under the optimal policy, with a comparative analysis.
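
Value Iteration, listed above, can be sketched on a small tabular MDP. The three-state chain below is a hypothetical example, not the Random Maze Environment; the transition format P[s][a] = [(prob, next_state, reward, done), ...] and the names q_value and value_iteration are assumptions made for this illustration.

```python
def q_value(P, V, s, a, gamma):
    """Expected one-step return of taking action a in state s under value estimates V."""
    return sum(p * (r + (0.0 if done else gamma * V[s2])) for p, s2, r, done in P[s][a])


def value_iteration(P, gamma=0.9, theta=1e-8):
    """Sweep Bellman optimality backups until the largest change falls below theta."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma)) for s in P}
    return V, policy


# Hypothetical chain: state 2 is terminal, and entering it from state 1 pays +1.
P = {
    0: {"left": [(1.0, 0, 0.0, False)], "right": [(1.0, 1, 0.0, False)]},
    1: {"left": [(1.0, 0, 0.0, False)], "right": [(1.0, 2, 1.0, True)]},
    2: {"left": [(1.0, 2, 0.0, True)], "right": [(1.0, 2, 0.0, True)]},
}

if __name__ == "__main__":
    values, policy = value_iteration(P)
    print(values)
    print(policy)
```

Policy Iteration would instead alternate a full policy evaluation with a greedy improvement step; both methods should agree on the optimal values for a finite MDP like this one.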
