
# Losses

This package lists common losses across research domains. It is a work in progress: if you have a loss you want to contribute, please submit a PR!

Note

This module is a work in progress.

We’re cleaning up many of our losses. In the meantime, submit a PR to add your loss here!

# Reinforcement Learning

These are common losses used in RL.

## DQN Loss

pl_bolts.losses.rl.dqn_loss(batch, net, target_net, gamma=0.99)

Calculates the MSE loss using a mini batch from the replay buffer.

Parameters
• batch (Tuple[Tensor, Tensor]) – current mini batch of replay data

• net (Module) – main training network

• target_net (Module) – target network of the main training network

• gamma (float) – discount factor

Return type

Tensor

Returns

loss
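
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation. It assumes the batch carries actions, rewards, and done flags alongside the states, and it takes precomputed Q-values (what `net` and `target_net` would output) as nested lists instead of calling real networks. The name `dqn_loss_sketch` and all of its parameter names are hypothetical.

```python
from typing import List, Sequence


def dqn_loss_sketch(
    q_values: Sequence[Sequence[float]],       # assumed output of net(states)
    next_q_values: Sequence[Sequence[float]],  # assumed output of target_net(next_states)
    actions: Sequence[int],
    rewards: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,
) -> float:
    """Mean squared TD error over a mini batch (illustrative only)."""
    squared_errors: List[float] = []
    for q_row, next_row, action, reward, done in zip(
        q_values, next_q_values, actions, rewards, dones
    ):
        # Q-value the main network assigned to the action actually taken.
        predicted = q_row[action]
        # Bootstrap from the target network's best next action;
        # terminal transitions contribute no future value.
        next_value = 0.0 if done else max(next_row)
        target = reward + gamma * next_value
        squared_errors.append((predicted - target) ** 2)
    return sum(squared_errors) / len(squared_errors)


loss = dqn_loss_sketch(
    q_values=[[1.0, 2.0], [0.5, 0.0]],
    next_q_values=[[0.0, 4.0], [3.0, 1.0]],
    actions=[1, 0],
    rewards=[1.0, 2.0],
    dones=[False, True],
    gamma=0.5,
)
print(loss)  # 1.625
```

In the first sample the target is 1.0 + 0.5 * 4.0 = 3.0 against a prediction of 2.0. In the second, the episode has ended, so the target is just the reward of 2.0 against a prediction of 0.5.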

## Double DQN Loss

pl_bolts.losses.rl.double_dqn_loss(batch, net, target_net, gamma=0.99)

Calculates the MSE loss using a mini batch from the replay buffer. This improves on the original DQN loss by applying Double DQN: the actions chosen by the training network are used to pick the corresponding values from the target network. The source code is heavily commented to explain the process clearly.

Parameters
• batch (Tuple[Tensor, Tensor]) – current mini batch of replay data

• net (Module) – main training network

• target_net (Module) – target network of the main training network

• gamma (float) – discount factor

Return type

Tensor

Returns

loss
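
The Double DQN idea of selecting an action with the training network and evaluating it with the target network can be illustrated in plain Python. This is a hedged sketch under stated assumptions, not the library's code: the batch is assumed to include actions, rewards, and done flags, and the Q-values of both networks are passed in as precomputed nested lists. The name `double_dqn_loss_sketch` and its parameter names are hypothetical.

```python
from typing import List, Sequence


def double_dqn_loss_sketch(
    q_values: Sequence[Sequence[float]],              # assumed net(states)
    next_q_values_net: Sequence[Sequence[float]],     # assumed net(next_states)
    next_q_values_target: Sequence[Sequence[float]],  # assumed target_net(next_states)
    actions: Sequence[int],
    rewards: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,
) -> float:
    """Mean squared TD error with Double DQN targets (illustrative only)."""
    squared_errors: List[float] = []
    for q_row, net_row, target_row, action, reward, done in zip(
        q_values, next_q_values_net, next_q_values_target, actions, rewards, dones
    ):
        predicted = q_row[action]
        if done:
            next_value = 0.0
        else:
            # The training network picks the next action ...
            best_action = max(range(len(net_row)), key=lambda a: net_row[a])
            # ... and the target network supplies that action's value.
            next_value = target_row[best_action]
        target = reward + gamma * next_value
        squared_errors.append((predicted - target) ** 2)
    return sum(squared_errors) / len(squared_errors)


loss = double_dqn_loss_sketch(
    q_values=[[1.0, 2.0], [0.5, 0.0]],
    next_q_values_net=[[5.0, 1.0], [0.0, 0.0]],
    next_q_values_target=[[2.0, 4.0], [3.0, 1.0]],
    actions=[0, 0],
    rewards=[2.0, 2.0],
    dones=[False, True],
    gamma=0.5,
)
print(loss)  # 3.125
```

In the first sample the training network prefers action 0, so the target uses the target network's value 2.0 for that action, not its maximum of 4.0. Decoupling selection from evaluation in this way is what distinguishes the Double DQN target from the plain DQN target.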

## Per DQN Loss

pl_bolts.losses.rl.per_dqn_loss(batch, batch_weights, net, target_net, gamma=0.99)

Calculates the MSE loss, scaled by the priority weights of the batch, using a mini batch from the prioritized experience replay (PER) buffer.

Parameters
• batch (Tuple[Tensor, Tensor]) – current mini batch of replay data

• batch_weights (List) – how each of these samples is weighted in terms of priority

• net (Module) – main training network

• target_net (Module) – target network of the main training network

• gamma (float) – discount factor

Return type
Returns

loss and batch_weights
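
A priority-weighted loss of this kind can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the batch is assumed to include actions, rewards, and done flags, Q-values are passed in as precomputed nested lists, and the second return value is assumed to be the per-sample weighted squared error plus a small constant `eps`. The name `per_dqn_loss_sketch`, its parameter names, and `eps` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def per_dqn_loss_sketch(
    q_values: Sequence[Sequence[float]],       # assumed net(states)
    next_q_values: Sequence[Sequence[float]],  # assumed target_net(next_states)
    actions: Sequence[int],
    rewards: Sequence[float],
    dones: Sequence[bool],
    batch_weights: Sequence[float],
    gamma: float = 0.99,
    eps: float = 1e-5,  # assumption: keeps every returned weight strictly positive
) -> Tuple[float, List[float]]:
    """Priority-weighted mean squared TD error (illustrative only)."""
    weighted_errors: List[float] = []
    for q_row, next_row, action, reward, done, weight in zip(
        q_values, next_q_values, actions, rewards, dones, batch_weights
    ):
        predicted = q_row[action]
        next_value = 0.0 if done else max(next_row)
        target = reward + gamma * next_value
        # Scale each sample's squared error by its priority weight.
        weighted_errors.append(weight * (predicted - target) ** 2)
    loss = sum(weighted_errors) / len(weighted_errors)
    updated_weights = [w + eps for w in weighted_errors]
    return loss, updated_weights


loss, updated_weights = per_dqn_loss_sketch(
    q_values=[[1.0, 2.0], [0.5, 0.0]],
    next_q_values=[[0.0, 4.0], [3.0, 1.0]],
    actions=[1, 0],
    rewards=[1.0, 2.0],
    dones=[False, True],
    batch_weights=[1.0, 0.5],
    gamma=0.5,
)
print(loss)  # 1.0625
```

The squared errors are 1.0 and 2.25. After weighting by 1.0 and 0.5 they become 1.0 and 1.125, whose mean is 1.0625.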
