Overview
RFAS (Roblox's First AI System) is a reinforcement learning-based solution for creating intelligent racing AI in Roblox games. The system uses the PPO (Proximal Policy Optimization) algorithm to train an Actor-Critic neural network that controls vehicle throttle and steering.
The system consists of two main components:
- A Flask API server that communicates with Roblox
- A PyTorch-based AI model that makes driving decisions
Key Features
- Real-time control decisions based on game state
- Adaptive learning through reinforcement feedback
- Continuous training to improve performance
- TensorBoard integration for monitoring training metrics
API Reference
The RFAS system exposes two main API endpoints for communication with Roblox games:
POST /state
Receives the current state of the vehicle and returns control commands.
Request Body
{
  "distance": float,   // Distance to track boundary or obstacles
  "speed": float,      // Current speed of the vehicle (normalized 0-1)
  "steer": float,      // Current steering value (-1 to 1)
  "throttle": float    // Current throttle value (-1 to 1)
}
Response
{
  "status": "State received",
  "throttle": float,   // Throttle command (-1, 0, or 1)
  "steer": float       // Steering command (-1, 0, or 1)
}
POST /reward
Sends reward feedback to the system for training purposes.
Request Body
{
  "reward": float,            // Base reward value
  "speed": float,             // Current speed for reward processing
  "steer": float,             // Current steering value
  "throttle": float,          // Current throttle value
  "collision": boolean,       // Optional: whether a collision occurred
  "lap_completion": boolean   // Optional: whether a lap was completed
}
Response
{
  "status": "Reward received"
}
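The optional fields may be omitted. A minimal sketch of a reward post with Python's requests library (values are illustrative only):
import requests

# Hypothetical reward post after a completed lap
payload = {
    "reward": 5.0,
    "speed": 0.62,
    "steer": 0.0,
    "throttle": 1.0,
    "lap_completion": True  # "collision" omitted
}
requests.post("http://localhost:5000/reward", json=payload)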
Model Architecture
RFAS uses an Actor-Critic architecture, which is a common approach in reinforcement learning. The model consists of:
Network Structure
- Input Layer: 4 neurons (distance, speed, steer, throttle)
- Hidden Layer 1: 128 neurons with ReLU activation
- Hidden Layer 2: 64 neurons with ReLU activation
- Actor Output: 2 neurons (throttle and steer) with tanh activation
- Critic Output: 1 neuron (state value estimation)
class ActorCritic(nn.Module):
    def __init__(self, input_dim, action_dim=2):
        super(ActorCritic, self).__init__()
        self.fc1 = nn.Linear(input_dim, 128)
        self.fc2 = nn.Linear(128, 64)
        self.actor = nn.Linear(64, action_dim)  # action head: throttle and steer
        self.critic = nn.Linear(64, 1)          # value head: state-value estimate

    def forward(self, x):
        # Shared feature extractor used by both heads
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return x

    def act(self, state):
        x = self.forward(state)
        action = torch.tanh(self.actor(x))
        throttle, steer = action[0].item(), action[1].item()
        # Discretize the continuous tanh outputs into {-1, 0, 1} commands
        throttle = 1 if throttle > 0.5 else -1 if throttle < -0.5 else 0
        steer = 1 if steer > 0.6 else -1 if steer < -0.5 else 0
        return throttle, steer

    def evaluate(self, state):
        x = self.forward(state)
        return self.critic(x)
The Actor component determines the actions to take (throttle and steering), while the Critic component evaluates how good the current state is, which helps in calculating the advantage for training.
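A minimal usage sketch of the class above (the state values are illustrative):
import torch

model = ActorCritic(input_dim=4)
state = torch.tensor([0.7, 0.55, 0.0, 1.0])  # distance, speed, steer, throttle
throttle, steer = model.act(state)           # discretized commands in {-1, 0, 1}
value = model.evaluate(state)                # critic's estimate of the state's value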
Training Process
RFAS uses Proximal Policy Optimization (PPO) with Generalized Advantage Estimation (GAE) for efficient and stable training. The process runs continuously in a separate thread while the API server handles incoming requests.
Key Training Components
PPO Algorithm
PPO is used to update the policy in a way that avoids destructively large updates, leading to more stable training. Key parameters (a sketch of the clipped objective follows the list):
- Clipping threshold: clip_epsilon = 0.2
- Multiple epochs (4) per batch
- Batch size of 32 experiences
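The following is a minimal sketch of the clipped surrogate objective at the heart of PPO; the function and variable names are illustrative, not taken from the RFAS source.
import torch

def ppo_policy_loss(new_log_probs, old_log_probs, advantages, clip_epsilon=0.2):
    # Probability ratio between the updated policy and the old policy
    ratio = torch.exp(new_log_probs - old_log_probs)
    # Clip the ratio so a single update cannot move too far from the old policy
    clipped = torch.clamp(ratio, 1.0 - clip_epsilon, 1.0 + clip_epsilon)
    # Pessimistic minimum of the two objectives, negated for gradient descent
    return -torch.min(ratio * advantages, clipped * advantages).mean()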
Advantage Estimation
GAE is used to compute advantages that inform the policy update (a sketch follows the list):
- gamma = 0.99 (discount factor)
- lambda_gae = 0.95 (smoothing parameter)
- Balances bias and variance in advantage estimates
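A sketch of how the advantages might be computed; it assumes per-step rewards plus critic value estimates with one extra bootstrap value at the end, and is not the exact RFAS implementation.
def compute_gae(rewards, values, gamma=0.99, lambda_gae=0.95):
    # values must hold len(rewards) + 1 entries: the last one is the
    # bootstrap value estimate for the state after the final reward
    advantages, gae = [], 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD error
        gae = delta + gamma * lambda_gae * gae                  # recursive GAE
        advantages.insert(0, gae)
    return advantages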
Reward Structure
The reward function promotes desired behaviors:
def process_reward(reward, state):
    speed = state["speed"]
    steer = state["steer"]
    throttle = state["throttle"]
    collision = state.get("collision", False)
    lap_completion = state.get("lap_completion", False)

    total_reward = reward
    if 0.5 <= speed <= 0.8:
        total_reward += 1    # Optimal speed range
    elif speed > 0.8:
        total_reward += 0.5  # Fast but not optimal
    if collision:
        total_reward -= 2    # Penalty for collisions
    if lap_completion:
        total_reward += 10   # Significant bonus for completing a lap
    return total_reward
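For example, a base reward of 0.1 at speed 0.6 with a completed lap yields 0.1 + 1 + 10 = 11.1, while the same step ending in a collision instead yields 0.1 + 1 - 2 = -0.9.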
Training Flow
1. Collect state data from Roblox through the API
2. Generate actions using the current policy
3. Receive rewards for the taken actions
4. Accumulate experiences in batches of 32
5. Compute advantages using GAE
6. Update the policy using the PPO objective (see the sketch after this list)
7. Log metrics to TensorBoard for monitoring
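As a rough sketch of how these steps could fit together, the update below reuses the hypothetical ppo_policy_loss and compute_gae helpers from earlier. It assumes a stochastic policy head that yields log-probabilities (the deterministic act() shown above does not produce these) and shows a single pass rather than the 4 epochs RFAS runs per batch.
import torch
import torch.nn.functional as F

def update_policy(model, optimizer, writer, states, rewards,
                  old_log_probs, new_log_probs, step):
    # Critic estimates for each state in the batch
    values = model.evaluate(states).squeeze(-1)
    # GAE advantages; a bootstrap value of 0.0 is assumed for this sketch
    advantages = torch.tensor(compute_gae(rewards, values.tolist() + [0.0]))
    returns = advantages + values.detach()

    policy_loss = ppo_policy_loss(new_log_probs, old_log_probs, advantages)
    value_loss = F.mse_loss(values, returns)
    loss = policy_loss + 0.5 * value_loss  # the 0.5 weighting is an assumption

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Metrics matching the Monitoring section
    writer.add_scalar("Loss/policy", policy_loss.item(), step)
    writer.add_scalar("Loss/value", value_loss.item(), step)
    writer.add_scalar("Loss/total", loss.item(), step)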
Integration Guide
Follow these steps to integrate RFAS into your Roblox racing game:
1. Setup the Server
First, ensure you have Python installed with the required dependencies:
pip install torch flask numpy tensorboard
Then, run the RFAS server script:
python rfas_server.py
2. Roblox HTTP Service Setup
In your Roblox game, enable the HTTP Service and create scripts to communicate with the RFAS API.
-- Example Roblox script
local HttpService = game:GetService("HttpService")
local serverUrl = "http://localhost:5000" -- Adjust if running on a different server
-- Note: localhost is only reachable when testing in Studio; a published
-- game runs on Roblox servers and needs a publicly accessible URL.

local function sendState(vehicle)
    -- Get vehicle state
    local state = {
        distance = vehicle.DistanceToEdge.Value,
        speed = vehicle.Speed.Value,
        steer = vehicle.SteerValue.Value,
        throttle = vehicle.ThrottleValue.Value
    }
    -- Send to API
    local response = HttpService:PostAsync(
        serverUrl.."/state",
        HttpService:JSONEncode(state)
    )
    -- Parse response
    local result = HttpService:JSONDecode(response)
    -- Apply controls to vehicle
    vehicle:ApplyControls(result.throttle, result.steer)
end

-- Call sendState periodically
while true do
    sendState(vehicleObject)
    task.wait(0.1) -- 10 times per second
end
3. Reward Feedback
Implement a system to send rewards to the RFAS server based on vehicle performance:
-- Example reward function
local function sendReward(vehicle, reward, collision, lapCompleted)
    local data = {
        reward = reward,
        speed = vehicle.Speed.Value,
        steer = vehicle.SteerValue.Value,
        throttle = vehicle.ThrottleValue.Value,
        collision = collision,
        lap_completion = lapCompleted
    }
    HttpService:PostAsync(
        serverUrl.."/reward",
        HttpService:JSONEncode(data)
    )
end

-- Example usage
vehicle.Collision:Connect(function()
    sendReward(vehicle, -1, true, false)
end)

checkpoints.LapCompleted:Connect(function()
    sendReward(vehicle, 5, false, true)
end)
Monitoring and Optimization
RFAS includes TensorBoard integration for monitoring training progress. The system logs:
- Policy Loss: Measures how much the policy is changing
- Value Loss: Measures how well the critic predicts state values
- Total Loss: Combined loss used for optimization
- Average Rewards: Tracks overall performance improvements
Viewing TensorBoard Logs
To view training metrics:
tensorboard --logdir=runs/TrainingMetrics
Then open your browser at http://localhost:6006 to view the dashboard.
Optimization Tips
- Adjust reward function: Customize process_reward to better reflect desired behaviors in your specific racing game.
- Tune hyperparameters: Experiment with gamma, lambda_gae, and clip_epsilon values for better performance.
- Expand state information: Add more input features like track curvature or upcoming obstacles for more informed decisions (see the sketch after this list).
- Adjust network size: For more complex environments, consider increasing the size of the hidden layers.
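For instance, expanding the state with two hypothetical inputs (track curvature and distance to the next obstacle) only requires widening the input layer and sending the extra fields from Roblox:
# Hypothetical 6-input variant: distance, speed, steer, throttle,
# plus track curvature and distance to the next obstacle
model = ActorCritic(input_dim=6)
state = {
    "distance": 0.7, "speed": 0.55, "steer": 0.0, "throttle": 1.0,
    "curvature": 0.1, "obstacle_distance": 0.9  # new, hypothetical fields
}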
Full Code Examples
Server Code (Python)
from flask import Flask, jsonify, request
import torch
import torch.nn as nn
import torch.optim as optim
import json
import time
from threading import Thread
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter("runs/TrainingMetrics")
app = Flask(__name__)
state_data_path = 'state_data.json'
reward_data_path = 'reward_data.json'
class ActorCritic(nn.Module):
    def __init__(self, input_dim, action_dim=2):
        super(ActorCritic, self).__init__()
        self.fc1 = nn.Linear(input_dim, 128)
        self.fc2 = nn.Linear(128, 64)
        self.actor = nn.Linear(64, action_dim)
        self.critic = nn.Linear(64, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return x

    def act(self, state):
        x = self.forward(state)
        action = torch.tanh(self.actor(x))
        throttle, steer = action[0].item(), action[1].item()
        throttle = 1 if throttle > 0.5 else -1 if throttle < -0.5 else 0
        steer = 1 if steer > 0.6 else -1 if steer < -0.5 else 0
        return throttle, steer

    def evaluate(self, state):
        x = self.forward(state)
        return self.critic(x)
model = ActorCritic(input_dim=4)
optimizer = optim.Adam(model.parameters(), lr=0.001)
clip_epsilon = 0.2
gamma = 0.99
lambda_gae = 0.95
# ... rest of the server code ...
Roblox Client Code (Lua)
-- RFAS Client Implementation
local HttpService = game:GetService("HttpService")
local RunService = game:GetService("RunService")

local RFAS = {}
RFAS.__index = RFAS

function RFAS.new(serverUrl, vehicle)
    local self = setmetatable({}, RFAS)
    self.serverUrl = serverUrl or "http://localhost:5000"
    self.vehicle = vehicle
    self.isRunning = false
    self.connection = nil
    return self
end

function RFAS:start()
    if self.isRunning then return end
    self.isRunning = true
    self.connection = RunService.Heartbeat:Connect(function()
        self:update()
    end)
    print("RFAS started")
end

function RFAS:stop()
    if not self.isRunning then return end
    if self.connection then
        self.connection:Disconnect()
        self.connection = nil
    end
    self.isRunning = false
    print("RFAS stopped")
end

function RFAS:update()
    -- Collect state data
    local state = self:getState()
    -- Send to API and get response
    local success, result = pcall(function()
        local response = HttpService:PostAsync(
            self.serverUrl.."/state",
            HttpService:JSONEncode(state)
        )
        return HttpService:JSONDecode(response)
    end)
    if success and result then
        -- Apply controls
        self:applyControls(result.throttle, result.steer)
    else
        warn("Failed to communicate with RFAS server:", result)
    end
end

function RFAS:getState()
    -- Implement this based on your vehicle setup
    return {
        distance = self.vehicle.DistanceSensor.Value,
        speed = self.vehicle.Velocity.Magnitude / 100, -- Normalize to 0-1
        steer = self.vehicle.SteerValue,
        throttle = self.vehicle.ThrottleValue
    }
end

function RFAS:applyControls(throttle, steer)
    -- Implement this based on your vehicle control system
    self.vehicle.ThrottleValue = throttle
    self.vehicle.SteerValue = steer
end

function RFAS:sendReward(reward, collision, lapCompleted)
    local data = {
        reward = reward,
        speed = self.vehicle.Velocity.Magnitude / 100,
        steer = self.vehicle.SteerValue,
        throttle = self.vehicle.ThrottleValue,
        collision = collision or false,
        lap_completion = lapCompleted or false
    }
    task.spawn(function()
        pcall(function()
            HttpService:PostAsync(
                self.serverUrl.."/reward",
                HttpService:JSONEncode(data)
            )
        end)
    end)
end

return RFAS
Need Help?
If you encounter any issues or have questions about implementing RFAS in your project, reach out to our community.