Computer Vision and Cognitive Systems Project 2023/2024
Accurate detection and description of players and balls in tennis match images is crucial for detailed match analysis. This project aims to detect tennis players and balls reliably enough to support that analysis, including tracking in video data. Our approach uses the YOLO (You Only Look Once) model for object detection and the BLIP model to generate natural language descriptions that capture the spatial relationships between detected objects on the court.
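To make "spatial relationships between detected objects" concrete, here is a minimal sketch of how a relation could be derived from two bounding boxes. It is an illustration only, not the project's pipeline: the `Detection` type and `spatial_relation` helper are hypothetical names, the boxes are made-up pixel coordinates, and in the project the descriptions are produced by BLIP rather than by a rule like this.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object with a pixel-space bounding box (hypothetical type)."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def center(self) -> tuple[float, float]:
        # Midpoint of the box in image coordinates.
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)


def spatial_relation(a: Detection, b: Detection) -> str:
    """Describe where `a` lies relative to `b` (hypothetical rule-based helper)."""
    ax, ay = a.center
    bx, by = b.center
    dx, dy = ax - bx, ay - by
    # Pick the dominant axis of displacement; image y grows downward.
    if abs(dx) >= abs(dy):
        side = "to the left of" if dx < 0 else "to the right of"
    else:
        side = "above" if dy < 0 else "below"
    return f"the {a.label} is {side} the {b.label}"


# Made-up boxes standing in for detector output.
player = Detection("player", 100, 300, 180, 500)
ball = Detection("ball", 400, 380, 410, 390)
print(spatial_relation(ball, player))  # the ball is to the right of the player
```

A rule like this only compares box centers in the image plane; it ignores court perspective, which is one reason a learned captioning model is attractive for this task.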
To extend the analysis to video, we added a tracking component based on TrackNet, which lets us follow the continuous movements of players and balls across frames. Together, detection, description, and tracking cover both the static (single-image) and dynamic (video) aspects of a tennis match.
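As a rough illustration of the video side, TrackNet-style models produce a per-frame heatmap of likely ball locations. The sketch below reads a position off each heatmap by taking its peak and collects the positions into a trajectory. It is a simplification under stated assumptions: `ball_position` is a hypothetical helper, the 0.5 threshold is arbitrary, the heatmaps are tiny made-up grids, and this is not the project's actual post-processing.

```python
from typing import Optional


def ball_position(
    heatmap: list[list[float]], threshold: float = 0.5
) -> Optional[tuple[int, int]]:
    """Return (x, y) of the heatmap peak, or None if no cell reaches the threshold.

    Hypothetical helper: the threshold value is an assumption for illustration.
    """
    best_score = float("-inf")
    best_xy = None
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if score > best_score:
                best_score, best_xy = score, (x, y)
    return best_xy if best_score >= threshold else None


# Made-up 2x3 heatmaps for three consecutive frames.
frames = [
    [[0.0, 0.1, 0.0], [0.0, 0.9, 0.2]],  # clear peak at (1, 1)
    [[0.1, 0.1, 0.0], [0.0, 0.2, 0.1]],  # ball not visible
    [[0.0, 0.0, 0.8], [0.0, 0.0, 0.1]],  # clear peak at (2, 0)
]
trajectory = [ball_position(h) for h in frames]
print(trajectory)  # [(1, 1), None, (2, 0)]
```

Frames where the ball is occluded or out of view yield `None`, so downstream analysis has to tolerate gaps in the trajectory.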
Our results indicate that combining YOLO and BLIP detects players and balls both accurately and quickly. YOLO's strong object detection performance, together with BLIP's ability to generate detailed spatial descriptions, provides a robust basis for tennis match analysis. Integrating TrackNet for tracking extends the system to both real-time and recorded video, making it a more complete tool for tennis analytics.
Repository
- Repository (GitHub)
Paper
- Paper (GitHub)
Credits
- Andrea Grandi (@andrea-grandi)
- David Wuttke (@DInoWDave)
- Daniele Vellani (@franzione1)