RFAS Documentation

Overview

RFAS (Roblox's First AI System) is a reinforcement learning-based solution for creating intelligent racing AI in Roblox games. The system uses the PPO (Proximal Policy Optimization) algorithm to train an Actor-Critic neural network that controls vehicle throttle and steering.

The system consists of two main components:

  • A Flask API server that communicates with Roblox
  • A PyTorch-based AI model that makes driving decisions

Key Features

  • Real-time control decisions based on game state
  • Adaptive learning through reinforcement feedback
  • Continuous training to improve performance
  • TensorBoard integration for monitoring training metrics

API Reference

The RFAS system exposes two main API endpoints for communication with Roblox games:

POST /state

Receives the current state of the vehicle and returns control commands.

Request Body

{ "distance": float, // Distance to track boundary or obstacles "speed": float, // Current speed of the vehicle (normalized 0-1) "steer": float, // Current steering value (-1 to 1) "throttle": float // Current throttle value (-1 to 1) }

Response

{ "status": "State received", "throttle": float, // Throttle command (-1, 0, or 1) "steer": float // Steering command (-1, 0, or 1) }

POST /reward

Sends reward feedback to the system for training purposes.

Request Body

{ "reward": float, // Base reward value "speed": float, // Current speed for reward processing "steer": float, // Current steering value "throttle": float, // Current throttle value "collision": boolean, // Optional: Whether a collision occurred "lap_completion": boolean // Optional: Whether a lap was completed }

Response

{ "status": "Reward received" }

Model Architecture

RFAS uses an Actor-Critic architecture, which is a common approach in reinforcement learning. The model consists of:

Actor-Critic Architecture

Network Structure

  • Input Layer: 4 neurons (distance, speed, steer, throttle)
  • Hidden Layer 1: 128 neurons with ReLU activation
  • Hidden Layer 2: 64 neurons with ReLU activation
  • Actor Output: 2 neurons (throttle and steer) with tanh activation
  • Critic Output: 1 neuron (state value estimation)
class ActorCritic(nn.Module):
    def __init__(self, input_dim, action_dim=2):
        super(ActorCritic, self).__init__()
        self.fc1 = nn.Linear(input_dim, 128)
        self.fc2 = nn.Linear(128, 64)
        self.actor = nn.Linear(64, action_dim)
        self.critic = nn.Linear(64, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return x

    def act(self, state):
        x = self.forward(state)
        action = torch.tanh(self.actor(x))
        throttle, steer = action[0].item(), action[1].item()
        throttle = 1 if throttle > 0.5 else -1 if throttle < -0.5 else 0
        steer = 1 if steer > 0.6 else -1 if steer < -0.5 else 0
        return throttle, steer

    def evaluate(self, state):
        x = self.forward(state)
        return self.critic(x)

The Actor component determines the actions to take (throttle and steering), while the Critic component evaluates how good the current state is, which helps in calculating the advantage for training.

Training Process

RFAS uses Proximal Policy Optimization (PPO) with Generalized Advantage Estimation (GAE) for efficient and stable training. The process runs continuously in a separate thread while the API server handles incoming requests.

Key Training Components

PPO Algorithm

PPO is used to update the policy in a way that avoids destructively large updates, leading to more stable training. Key parameters:

  • clip_epsilon = 0.2
  • Multiple epochs (4) per batch
  • Batch size of 32 experiences
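The parameters above enter the standard PPO clipped surrogate objective. The following is a minimal sketch of that loss, assuming log-probabilities of the taken actions are stored at collection time; the function and tensor names are illustrative, not taken from the RFAS source.

import torch

def ppo_policy_loss(new_log_probs, old_log_probs, advantages, clip_epsilon=0.2):
    # Probability ratio between the updated policy and the policy
    # that collected the batch
    ratio = torch.exp(new_log_probs - old_log_probs)

    # Unclipped and clipped surrogate objectives
    surr1 = ratio * advantages
    surr2 = torch.clamp(ratio, 1 - clip_epsilon, 1 + clip_epsilon) * advantages

    # Taking the elementwise minimum prevents destructively large policy updates
    return -torch.min(surr1, surr2).mean()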

Advantage Estimation

GAE is used to compute advantages that inform the policy update:

  • gamma = 0.99 (discount factor)
  • lambda_gae = 0.95 (smoothing parameter)
  • Balances bias and variance in advantage estimates
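As a rough sketch of how gamma and lambda_gae combine, the recursion below computes GAE for one batch. It assumes the critic's value estimates include one extra bootstrap value at the end and omits episode-termination masking; it is not the literal RFAS routine.

def compute_gae(rewards, values, gamma=0.99, lambda_gae=0.95):
    # rewards: per-step rewards for the batch
    # values:  critic estimates with one bootstrap value appended
    advantages = []
    gae = 0.0
    for t in reversed(range(len(rewards))):
        # TD error for step t
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of TD errors
        gae = delta + gamma * lambda_gae * gae
        advantages.insert(0, gae)
    return advantages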

Reward Structure

The reward function promotes desired behaviors:

def process_reward(reward, state):
    speed = state["speed"]
    steer = state["steer"]
    throttle = state["throttle"]
    collision = state.get("collision", False)
    lap_completion = state.get("lap_completion", False)

    total_reward = reward
    if 0.5 <= speed <= 0.8:
        total_reward += 1    # Optimal speed range
    elif speed > 0.8:
        total_reward += 0.5  # Fast but not optimal
    if collision:
        total_reward -= 2    # Penalty for collisions
    if lap_completion:
        total_reward += 10   # Significant bonus for completing a lap
    return total_reward

Training Flow

  1. Collect state data from Roblox through the API
  2. Generate actions using the current policy
  3. Receive rewards for the taken actions
  4. Accumulate experiences in batches of 32
  5. Compute advantages using GAE
  6. Update the policy using the PPO objective
  7. Log metrics to TensorBoard for monitoring

Integration Guide

Follow these steps to integrate RFAS into your Roblox racing game:

1. Setup the Server

First, ensure you have Python installed with the required dependencies:

pip install torch flask numpy tensorboard

Then, run the RFAS server script:

python rfas_server.py

2. Roblox HTTP Service Setup

In your Roblox game, enable the HTTP Service and create scripts to communicate with the RFAS API.

-- Example Roblox script
local HttpService = game:GetService("HttpService")
local serverUrl = "http://localhost:5000" -- Adjust if running on a different server

local function sendState(vehicle)
    -- Get vehicle state
    local state = {
        distance = vehicle.DistanceToEdge.Value,
        speed = vehicle.Speed.Value,
        steer = vehicle.SteerValue.Value,
        throttle = vehicle.ThrottleValue.Value
    }

    -- Send to API
    local response = HttpService:PostAsync(
        serverUrl.."/state",
        HttpService:JSONEncode(state)
    )

    -- Parse response
    local result = HttpService:JSONDecode(response)

    -- Apply controls to vehicle
    vehicle:ApplyControls(result.throttle, result.steer)
end

-- Call sendState periodically
while true do
    sendState(vehicleObject)
    wait(0.1) -- 10 times per second
end

3. Reward Feedback

Implement a system to send rewards to the RFAS server based on vehicle performance:

-- Example reward function
local function sendReward(vehicle, reward, collision, lapCompleted)
    local data = {
        reward = reward,
        speed = vehicle.Speed.Value,
        steer = vehicle.SteerValue.Value,
        throttle = vehicle.ThrottleValue.Value,
        collision = collision,
        lap_completion = lapCompleted
    }

    HttpService:PostAsync(
        serverUrl.."/reward",
        HttpService:JSONEncode(data)
    )
end

-- Example usage
vehicle.Collision:Connect(function()
    sendReward(vehicle, -1, true, false)
end)

checkpoints.LapCompleted:Connect(function()
    sendReward(vehicle, 5, false, true)
end)

Monitoring and Optimization

RFAS includes TensorBoard integration for monitoring training progress. The system logs:

  • Policy Loss: Measures how much the policy is changing
  • Value Loss: Measures how well the critic predicts state values
  • Total Loss: Combined loss used for optimization
  • Average Rewards: Tracks overall performance improvements
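These scalars are written with PyTorch's SummaryWriter. A minimal sketch of how such logging typically looks is shown below; the tag names and the step counter are illustrative assumptions, while the log directory matches the one used elsewhere in this documentation.

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/TrainingMetrics")

# Called once per PPO update; `step` is the update counter
def log_metrics(step, policy_loss, value_loss, total_loss, avg_reward):
    writer.add_scalar("Loss/Policy", policy_loss, step)
    writer.add_scalar("Loss/Value", value_loss, step)
    writer.add_scalar("Loss/Total", total_loss, step)
    writer.add_scalar("Reward/Average", avg_reward, step)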

Viewing TensorBoard Logs

To view training metrics:

tensorboard --logdir=runs/TrainingMetrics

Then open your browser at http://localhost:6006 to view the dashboard.

Optimization Tips

  • Adjust reward function: Customize process_reward to better reflect desired behaviors in your specific racing game.
  • Tune hyperparameters: Experiment with gamma, lambda_gae, and clip_epsilon values for better performance.
  • Expand state information: Add more input features like track curvature or upcoming obstacles for more informed decisions.
  • Adjust network size: For more complex environments, consider increasing the size of the hidden layers.
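As an example of the last two tips, a variant model with extra state inputs and wider hidden layers could be constructed as sketched below. The feature list and layer sizes are illustrative assumptions, and the Roblox client would need to send the additional fields.

import torch
import torch.nn as nn

# Hypothetical expanded state: distance, speed, steer, throttle,
# plus track curvature and distance to the next obstacle
EXPANDED_FEATURES = 6

class LargerActorCritic(nn.Module):
    def __init__(self, input_dim=EXPANDED_FEATURES, action_dim=2):
        super().__init__()
        self.fc1 = nn.Linear(input_dim, 256)   # wider hidden layers
        self.fc2 = nn.Linear(256, 128)
        self.actor = nn.Linear(128, action_dim)
        self.critic = nn.Linear(128, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return x

model = LargerActorCritic()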

Full Code Examples

Server Code (Python)

from flask import Flask, jsonify, request
import torch
import torch.nn as nn
import torch.optim as optim
import json
import time
from threading import Thread
from torch.utils.tensorboard import SummaryWriter

write = SummaryWriter("runs/TrainingMetrics")

app = Flask(__name__)

state_data_path = 'state_data.json'
reward_data_path = 'reward_data.json'

class ActorCritic(nn.Module):
    def __init__(self, input_dim, action_dim=2):
        super(ActorCritic, self).__init__()
        self.fc1 = nn.Linear(input_dim, 128)
        self.fc2 = nn.Linear(128, 64)
        self.actor = nn.Linear(64, action_dim)
        self.critic = nn.Linear(64, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return x

    def act(self, state):
        x = self.forward(state)
        action = torch.tanh(self.actor(x))
        throttle, steer = action[0].item(), action[1].item()
        throttle = 1 if throttle > 0.5 else -1 if throttle < -0.5 else 0
        steer = 1 if steer > 0.6 else -1 if steer < -0.5 else 0
        return throttle, steer

    def evaluate(self, state):
        x = self.forward(state)
        return self.critic(x)

model = ActorCritic(input_dim=4)
optimizer = optim.Adam(model.parameters(), lr=0.001)

clip_epsilon = 0.2
gamma = 0.99
lambda_gae = 0.95

# ... rest of the server code ...

Download Complete Server Script

Roblox Client Code (Lua)

-- RFAS Client Implementation
local HttpService = game:GetService("HttpService")
local RunService = game:GetService("RunService")

local RFAS = {}
RFAS.__index = RFAS

function RFAS.new(serverUrl, vehicle)
    local self = setmetatable({}, RFAS)
    self.serverUrl = serverUrl or "http://localhost:5000"
    self.vehicle = vehicle
    self.isRunning = false
    self.connection = nil
    return self
end

function RFAS:start()
    if self.isRunning then return end
    self.isRunning = true
    self.connection = RunService.Heartbeat:Connect(function()
        self:update()
    end)
    print("RFAS started")
end

function RFAS:stop()
    if not self.isRunning then return end
    if self.connection then
        self.connection:Disconnect()
        self.connection = nil
    end
    self.isRunning = false
    print("RFAS stopped")
end

function RFAS:update()
    -- Collect state data
    local state = self:getState()

    -- Send to API and get response
    local success, result = pcall(function()
        local response = HttpService:PostAsync(
            self.serverUrl.."/state",
            HttpService:JSONEncode(state)
        )
        return HttpService:JSONDecode(response)
    end)

    if success and result then
        -- Apply controls
        self:applyControls(result.throttle, result.steer)
    else
        warn("Failed to communicate with RFAS server:", result)
    end
end

function RFAS:getState()
    -- Implement this based on your vehicle setup
    return {
        distance = self.vehicle.DistanceSensor.Value,
        speed = self.vehicle.Velocity.Magnitude / 100, -- Normalize to 0-1
        steer = self.vehicle.SteerValue,
        throttle = self.vehicle.ThrottleValue
    }
end

function RFAS:applyControls(throttle, steer)
    -- Implement this based on your vehicle control system
    self.vehicle.ThrottleValue = throttle
    self.vehicle.SteerValue = steer
end

function RFAS:sendReward(reward, collision, lapCompleted)
    local data = {
        reward = reward,
        speed = self.vehicle.Velocity.Magnitude / 100,
        steer = self.vehicle.SteerValue,
        throttle = self.vehicle.ThrottleValue,
        collision = collision or false,
        lap_completion = lapCompleted or false
    }

    spawn(function()
        pcall(function()
            HttpService:PostAsync(
                self.serverUrl.."/reward",
                HttpService:JSONEncode(data)
            )
        end)
    end)
end

return RFAS

Download Complete Roblox Client Script

Need Help?

If you encounter any issues or have questions about implementing RFAS in your project, reach out to our community: