Introduction
This project develops an intelligent system to solve a specific person-identification challenge in the sports domain.
The system works by ingesting an input image (e.g., a frame extracted from a video stream). It then employs a dual-approach methodology:
- Text-based Analysis: It uses Optical Character Recognition (OCR) to read any text that appears on-screen (e.g., a player’s nameplate).
- Image-based Analysis: It extracts visual features from the player in the image and compares them against a pre-existing database of known athletes to find the best match.
The ultimate purpose is to output the player’s correct name, automating a task that is tedious and time-consuming when performed manually.
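The image-based stream described above can be sketched as follows. This is a minimal illustration, not the project’s actual code: in the real system the feature vectors come from CLIP, whereas here the library holds hypothetical toy vectors so only the matching step is shown.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def best_match(query_vec, library):
    """Return (name, score) of the closest athlete in the feature library."""
    return max(
        ((name, cosine_similarity(query_vec, vec)) for name, vec in library.items()),
        key=lambda item: item[1],
    )

# Hypothetical feature library; in the PoC these vectors are CLIP embeddings
# of known athletes, stored during the "Train" phase.
library = {
    "Player A": [0.9, 0.1, 0.0],
    "Player B": [0.1, 0.8, 0.3],
}

name, score = best_match([0.85, 0.15, 0.05], library)  # -> ("Player A", ~0.996)
```

Because cosine similarity compares vector directions rather than magnitudes, it tolerates the global brightness and scale variations that raw pixel comparison would not.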
Team size
4 members
Industry
AI
Technology
Python, EasyOCR, CLIP (OpenAI), Llama3.1, Ollama, MMCV
Highlights
Value Delivered
- Feasibility Demonstration: The PoC successfully proved that combining OCR+LLM with the CLIP model is a viable and effective approach for automated player identification.
- Potential for Increased Efficiency: This solution lays the groundwork for a fully automated system that could dramatically reduce manual labor, accelerate content production workflows, and improve the accuracy of player tagging.
- Innovative Aspect: The creative combination of a Large Language Model to “understand” OCR output and an advanced vision model to “see” and compare images is the solution’s most unique and powerful feature, allowing it to overcome challenges that traditional methods struggle with.
Challenges
- Visual Identification Difficulties:
  - Images may be captured from a distance, making the player appear small and difficult to identify in detail.
  - A player’s appearance can change significantly (e.g., hairstyle changes from long to short), which reduces the effectiveness of traditional recognition methods based on simple facial or appearance matching.
- Data Dependency and Uncertainty:
  - The system requires a predefined list of potential players (specifically Japanese golfers) to perform the matching.
  - The presence of the player’s name on-screen is a critical assumption that still needs to be confirmed. If no text is available, the system must rely solely on visual identification.
- Inefficiency of the Current Solution:
  - The current process is manual identification by a human operator, which is slow, labor-intensive, prone to error, and requires specialized knowledge of the players.
Solutions
We proposed and designed a hybrid AI solution to comprehensively address the challenges.
- Intelligent Text Extraction:
  - Uses EasyOCR to scan the image and recognize any available text.
  - Feeds the extracted text into a Large Language Model (Llama3.1 8B) to intelligently parse the output, accurately identify the player’s name, and discard irrelevant text.
- Visual Feature-Based Recognition:
  - Employs the CLIP model from OpenAI to convert the image of the player into a unique visual feature vector.
  - Builds a library during a “Train” phase that stores the feature vectors of known athletes.
- Cross-Validation and Matching:
  - Compares the feature vector of the test image against the stored library using Cosine Similarity to find the closest visual match.
  - The final output is determined by combining the results from both the text-based and visual-based streams, enhancing overall accuracy and reliability.
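The text-based stream and the final fusion step can be sketched as follows. This is an illustrative simplification: in the PoC the noisy OCR output is parsed by Llama3.1 via Ollama, whereas here a deterministic roster lookup stands in for the LLM call so the end-to-end logic can run; the roster names and OCR lines are hypothetical examples.

```python
# Hypothetical predefined player list; the PoC assumes such a roster exists.
ROSTER = ["Hinako Shibuno", "Ai Suzuki", "Mone Inami"]

def build_prompt(ocr_lines):
    """Prompt asking the LLM to pick the player's name out of noisy OCR text."""
    return (
        "The following text was read from a golf broadcast frame:\n"
        + "\n".join(ocr_lines)
        + "\nReturn only the player's name, or NONE if no name is present."
    )

def extract_name(ocr_lines, roster=ROSTER):
    """Stand-in for the LLM step: match OCR output against the known roster."""
    joined = " ".join(ocr_lines).lower()
    for name in roster:
        if name.lower() in joined:
            return name
    return None

def identify_player(ocr_lines, visual_match):
    """Fuse both streams: trust on-screen text when present, else the CLIP match."""
    return extract_name(ocr_lines) or visual_match

# Example: the nameplate was read by OCR, so the text stream decides.
result = identify_player(["HOLE 7", "Hinako Shibuno", "-3"], visual_match="Ai Suzuki")
```

When no name is visible on-screen (the uncertain case noted under Challenges), `extract_name` returns `None` and the visual match is used as the fallback, which mirrors the fusion rule described above.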