IRI - Vision and ML Based PvP Video Game Agent

Master Thesis

Vision and ML Based PvP Video Game Agent

Supervisor/s

Antonio Agudo Martínez

Information

If you are interested in the proposal, please contact the supervisors.

Description

Introduction
This project aims to develop an AI agent capable of playing a player-versus-player (PvP) video game using only visual input, without direct access to the game’s internal state. The agent will process real-time screen captures to identify relevant elements (players, objects, environmental features) and use this information to make strategic decisions. By relying solely on vision, the system mimics human perception, increasing its adaptability and potential to generalize across different games and genres.

Recent advances in computer vision and machine learning have enabled agents to interpret complex visual environments and make informed decisions based on observations rather than internal game data. This approach emphasizes perception and real-time analysis, allowing AI to operate under constraints similar to those faced by human players. Vision-based AI has applications beyond gaming, including robotics, autonomous navigation, and any system requiring real-time interpretation of dynamic visual information, making it a versatile and relevant research area.

Objectives
1. Vision Pipeline: Capture and process visual data to detect and track game elements.
2. Decision Model: Train an ML-based agent (reinforcement or supervised learning) to optimize its win rate.
3. Evaluation: Measure the AI’s performance relative to human players, assessing the time and number of games required to reach specific in-game milestones such as ranks, trophies, or medals.
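
As an illustration of the evaluation objective, the minimal sketch below counts how many games a player needs to reach a trophy milestone. The function name games_to_reach and the trophy values are hypothetical placeholders chosen for the example; they are not measured results.

```python
from typing import Optional, Sequence


def games_to_reach(trophy_history: Sequence[int], target: int) -> Optional[int]:
    """Return the 1-based number of the first game after which the trophy
    count is at least `target`, or None if the target is never reached."""
    for game_number, trophies in enumerate(trophy_history, start=1):
        if trophies >= target:
            return game_number
    return None


# Invented trophy counts after each game, for illustration only.
agent_history = [30, 58, 90, 120, 151]
human_history = [25, 60, 80, 95, 110, 140, 160]

agent_games = games_to_reach(agent_history, target=100)
human_games = games_to_reach(human_history, target=100)
```

The same comparison could be made on wall-clock time instead of game count by logging a timestamp with each game.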

Methodology
- Acquire gameplay footage and process it using computer vision techniques.
- Convert visual data into structured state representations.
- Train and test the agent in PvP scenarios, iteratively refining the model.
- Compare AI progress to human benchmarks in terms of skill development and achievement milestones.
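
The step of converting visual data into a structured state could look roughly like the sketch below. It assumes an upstream detector (not shown) that returns labelled detections with pixel coordinates and a confidence score. The Detection class, the label names, and the grid size are illustrative assumptions, not part of the proposal.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Detection:
    label: str         # hypothetical class name, e.g. "ally_unit"
    x: float           # centre of the detected object, in pixels
    y: float
    confidence: float  # detector score in [0, 1]


def to_grid_state(
    detections: Sequence[Detection],
    frame_w: int,
    frame_h: int,
    rows: int = 4,
    cols: int = 3,
    labels: Sequence[str] = ("ally_unit", "enemy_unit"),
    min_conf: float = 0.5,
) -> List[List[List[int]]]:
    """Count detections per grid cell and per label.

    Returns a rows x cols x len(labels) nested list that a decision
    model could consume as a compact state representation."""
    index = {name: i for i, name in enumerate(labels)}
    grid = [[[0 for _ in labels] for _ in range(cols)] for _ in range(rows)]
    for det in detections:
        # Skip low-confidence detections and labels the state does not track.
        if det.confidence < min_conf or det.label not in index:
            continue
        # Map pixel coordinates to a cell, clamping to the grid borders.
        col = max(0, min(int(det.x / frame_w * cols), cols - 1))
        row = max(0, min(int(det.y / frame_h * rows), rows - 1))
        grid[row][col][index[det.label]] += 1
    return grid


# Example with made-up detections on a 300 x 400 pixel frame.
example = [
    Detection("ally_unit", 10, 10, 0.9),
    Detection("enemy_unit", 290, 390, 0.8),
    Detection("enemy_unit", 150, 200, 0.2),  # dropped: below min_conf
]
state = to_grid_state(example, frame_w=300, frame_h=400)
```

A coarse grid of counts is only one possible encoding. Other options include raw object lists or learned embeddings, and the choice would depend on the game and the decision model.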

Motivation
- The video game for this project can be chosen freely, though Clash Royale is the initial candidate.
- The project is inspired by previous work on vision-based agents, such as the demonstration shown in this video: https://www.youtube.com/watch?v=6Gm-pnNieMU.
- Additionally, the project aims to develop an AI agent or framework that can perform well in the chosen game, while also being adaptable: it should be possible to retrain it using data gathered from other games and still achieve competent performance.

Requirements:
- Strong programming skills in Python.
- Solid understanding of machine learning concepts.

Preferable:
- Interest in computer vision, reinforcement learning, or AI for gaming.
- Experience with AI frameworks such as PyTorch.

This project will be supervised by Oriol Jiménez Ayguadé and Antonio Agudo.