PYTHON Tutorial
Reinforcement learning is a type of machine learning where an agent learns to make decisions in an environment by interacting with it and receiving rewards or penalties for its actions.
import random
# Define the environment
states = [0, 1, 2, 3]
actions = ['left', 'right']
rewards = {
    (0, 'left'): 10,
    (0, 'right'): -1,
    (1, 'left'): -1,
    (1, 'right'): 10,
    (2, 'left'): 10,
    (2, 'right'): -1,
    (3, 'left'): -1,
    (3, 'right'): 10,
}
# Initialize the policy
policy = {state: random.choice(actions) for state in states}
# Interact with the environment
current_state = 0
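while current_state != 3:
    action = policy[current_state]
    # Transitions are uniformly random and do not depend on the action taken
    next_state = random.choice(states)
    reward = rewards[(current_state, action)]
    # Update the policy: keep a rewarded action, otherwise re-sample at random
    policy[current_state] = action if reward > 0 else random.choice(actions)
    current_state = next_state
# The reward table favors {0: 'left', 1: 'right', 2: 'left', 3: 'right'}.
# One episode is not guaranteed to find it: unvisited states keep their initial
# random action, and state 3 is never updated because the loop stops there.
print(policy)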
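The loop above only reacts to the most recent reward, so it does not accumulate what it has learned. As a hedged illustration, here is a minimal sketch of tabular Q-learning with epsilon-greedy exploration on the same reward table. Q-learning is swapped in here as a standard technique and is not the method used in the listing above. The function name q_learning, the hyperparameters (steps, alpha, gamma, epsilon, seed), and the choice to run for a fixed number of steps instead of stopping at state 3 are all assumptions made for this example.

```python
import random

states = [0, 1, 2, 3]
actions = ['left', 'right']
# Same reward table as the tutorial: 'left' pays in states 0 and 2, 'right' in 1 and 3
rewards = {
    (0, 'left'): 10, (0, 'right'): -1,
    (1, 'left'): -1, (1, 'right'): 10,
    (2, 'left'): 10, (2, 'right'): -1,
    (3, 'left'): -1, (3, 'right'): 10,
}


def q_learning(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Hypothetical helper: learn action values, then return the greedy policy."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    q = {(s, a): 0.0 for s in states for a in actions}
    state = 0
    for _ in range(steps):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: q[(state, a)])
        reward = rewards[(state, action)]
        # Assumption carried over from the tutorial: the next state is uniformly random
        next_state = rng.choice(states)
        # Q-learning update: move the estimate toward reward + discounted best next value
        best_next = max(q[(next_state, a)] for a in actions)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    # Greedy policy: the highest-valued action in each state
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in states}


learned_policy = q_learning()
print(learned_policy)
```

Because transitions here do not depend on the chosen action, the action with the larger immediate reward is also the better long-run choice, so with enough steps the greedy policy should agree with the reward table in every state, including state 3.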
Reinforcement learning has applications in: