How SkillShow automates youth sports video processing using Amazon Transcribe

Jun 24, 2025 - 17:00
This post is co-written with Tom Koerick from SkillShow.

The youth sports market was valued at $37.5 billion globally in 2022 and is projected to grow by 9.2% each year through 2030. Approximately 60 million young athletes participate in this market worldwide. SkillShow, a leader in youth sports video production, films over 300 events yearly in the youth sports industry, creating content for over 20,000 young athletes annually. This post describes how SkillShow used Amazon Transcribe and other Amazon Web Services (AWS) machine learning (ML) services to automate their video processing workflow, reducing editing time and costs while scaling their operations.

Challenge

As youth sports video production surges, manual video editing processes are becoming increasingly unsustainable. Since 2001, SkillShow has been at the forefront of sports video production, providing comprehensive video services for individuals, teams, and event organizers. They specialize in filming, editing, and distributing content that helps athletes showcase their skills to recruiters, build their personal brand on social media, and support their development training. As a trusted partner to major sports organizations including Perfect Game, 3Step Sports, USA Baseball, MLB Network, Under Armour, Elite11 football combines, and more, SkillShow has filmed hundreds of thousands of athletes and thousands of regional and national events across different sports and age groups.

Despite their market leadership, SkillShow faced significant operational challenges. With only seven full-time employees managing their expanding operation, they had to outsource to over 1,100 contractors annually. This reliance on outsourced editing not only increased operational costs but also resulted in a lengthy 3-week turnaround time per event, making it difficult to keep pace with the growing demand for youth sports content.

Managing approximately 230 TB of video data per year compounded these problems. This volume of data meant lengthy upload and download times for editors, expensive storage costs, and complex data management requirements. Each event’s raw footage needed to be securely stored, backed up, and made accessible to multiple editors, straining both technical resources and IT infrastructure. These challenges led SkillShow to halt new events in mid-2023, limiting their growth potential in a rapidly expanding market. The need for an efficient, scalable solution became critical to maintaining SkillShow’s position and meeting the growing demand for youth sports content, particularly in the post-COVID era, where recruiting videos have become essential for leagues and athletes alike.

Solution overview

To address these challenges, SkillShow partnered with AWS to develop an automated video processing pipeline. The team initially explored several approaches to automate player identification.

Facial recognition proved challenging due to varying video quality, inconsistent lighting conditions, and frequent player movement during games. Additionally, players often wore equipment such as helmets or protective gear that obscured their faces, making reliable identification difficult.

Text-based detection of jersey numbers and colors seemed promising at first, but presented its own set of challenges. Jersey numbers were frequently obscured by player movement, weather conditions could affect visibility, and varying camera angles made consistent detection unreliable.

Ultimately, the team settled on an audio logging and automated clip generation solution, which proved superior for several reasons:

  • More reliable player identification, because announcers consistently call out player numbers and team colors
  • Better performance in varying environmental conditions, because audio quality remains relatively consistent even in challenging weather or lighting
  • Reduced processing complexity and computational requirements compared to video-based analysis
  • More cost-effective due to lower computational demands and higher accuracy rates
  • Ability to capture additional context from announcer commentary, such as play descriptions and game situations

This solution uses several key AWS services:

  • Amazon Simple Storage Service (Amazon S3):
    • Used for storing the input and output video files
    • Provides scalable and durable storage to handle SkillShow’s large video data volume of 230 TB per year
    • Allows for straightforward access and integration with other AWS services in the processing pipeline
  • AWS Lambda:
    • Serverless compute service used to power the automated processing workflows
    • Triggers the various functions that orchestrate the video processing, such as transcription and clip generation
    • Enables event-driven, scalable, and cost-effective processing without the need to manage underlying infrastructure
  • Amazon Transcribe:
    • Automatic speech recognition (ASR) service used to convert the video audio into text transcripts
    • Provides the foundation for analyzing the video content and identifying player details
    • Allows for accurate speech-to-text conversion, even in noisy sports environments
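To make the hand-off between these services concrete, here is a minimal sketch of how a Lambda function might turn an S3 upload notification into the arguments for a Transcribe job. This is an illustration under stated assumptions, not SkillShow’s actual code: the function name build_transcription_request, the en-US language setting, and the restriction to .mp4 uploads are all hypothetical. The transcripts/ output prefix comes from the flow described in this post.

```python
import re
from urllib.parse import unquote_plus

# Assumption: only .mp4 uploads are transcribed; other files (such as the
# roster .csv) are ignored by this function.
VIDEO_EXTENSIONS = {"mp4"}


def build_transcription_request(s3_event, output_bucket):
    """Translate an S3 'object created' event into Transcribe job arguments.

    Returns None when the uploaded object is not a video.
    """
    record = s3_event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["s3"]["object"]["key"])

    stem, _, extension = key.rpartition(".")
    if extension.lower() not in VIDEO_EXTENSIONS:
        return None

    # Transcribe job names may only contain letters, digits, '.', '_' and '-'.
    job_name = re.sub(r"[^0-9a-zA-Z._-]", "-", stem)
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "MediaFormat": extension.lower(),
        "LanguageCode": "en-US",  # assumption: English-language announcers
        "OutputBucketName": output_bucket,
        "OutputKey": f"transcripts/{job_name}.json",
    }
```

Inside a Lambda handler, the returned dictionary could be passed to the AWS SDK for Python as boto3.client("transcribe").start_transcription_job(**request). Keeping the request construction in a pure function makes it testable without AWS credentials.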

The following diagram illustrates the solution architecture.

Figure: SkillShow AWS architecture diagram (workflow across Amazon S3, AWS Lambda, and Amazon Transcribe)

The architectural flow is as follows:

  1. The authorized user uploads a .csv file containing roster information (such as jersey color, number, player name, and school) and the video footage of players.
  2. A Lambda function is triggered by the upload of the video.
  3. The auto-transcript Lambda function uses Amazon Transcribe to generate a timestamped transcript of what is said in the input video.
  4. The transcript is uploaded to the output S3 bucket under transcripts/ for further use.
  5. The authorized user can invoke the auto-clipper Lambda function with an AWS Command Line Interface (AWS CLI) command.
  6. The function parses the transcript against player information from the roster.
  7. When identifying players, the function clips videos based on a specified keyword (in SkillShow’s case, it was “Next”) and uploads them to the output S3 bucket under segments/.
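Steps 6 and 7 can be sketched in a few lines of Python. The following is a simplified illustration rather than SkillShow’s implementation, and it rests on several assumptions: the roster columns are named color, number, and name; a segment starts at each spoken keyword and runs until the next one; the announcer says the jersey color and number after the keyword; and jersey numbers appear as digits in the transcript. The transcript structure (results, items, start_time, alternatives) follows the JSON that Amazon Transcribe produces.

```python
import csv
import io

KEYWORD = "next"  # the cue word from SkillShow's workflow, lowercased


def load_roster(csv_text):
    """Index roster rows by (jersey color, jersey number)."""
    roster = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["color"].strip().lower(), row["number"].strip())
        roster[key] = row["name"].strip()
    return roster


def spoken_words(transcript):
    """Yield (word, start_seconds, end_seconds) from Transcribe JSON output."""
    for item in transcript["results"]["items"]:
        if item["type"] != "pronunciation":
            continue  # punctuation items carry no timestamps
        word = item["alternatives"][0]["content"].lower()
        yield word, float(item["start_time"]), float(item["end_time"])


def split_on_keyword(transcript, roster, video_end):
    """Cut the timeline at each keyword and try to name every segment."""
    colors = {color for color, _ in roster}
    segments = []
    current = None
    for word, start, _ in spoken_words(transcript):
        if word == KEYWORD:
            if current:
                current["end"] = start  # close the previous segment
                segments.append(current)
            current = {"start": start, "end": video_end,
                       "color": None, "number": None}
        elif current:
            # Keep the first color and first number heard after the keyword.
            if word in colors and current["color"] is None:
                current["color"] = word
            elif word.isdigit() and current["number"] is None:
                current["number"] = word
    if current:
        segments.append(current)
    for segment in segments:
        # player stays None when the roster has no matching color and number
        segment["player"] = roster.get((segment["color"], segment["number"]))
    return segments
```

Segments whose player is None correspond to the clips that need manual review, which is the case the Unnamed/ folder handles in the example workflow that follows.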

By using this suite of AWS services, SkillShow was able to build a scalable, cost-effective, and highly automated video processing solution that addressed their key operational challenges. The cloud-based architecture provides the flexibility and scalability required to handle their growing data volumes and evolving business needs.

Example processing workflow

Let’s explore an example processing workflow. As shown in the following screenshots, we first upload a player roster .csv and video file to the input bucket.

Amazon S3 management console showing two files in skillshow-input-videos bucket with metadata and actions

The auto-transcribe function processes the audio.

Amazon S3 management console displaying transcripts folder contents, including JSON output and temp file

The auto-clipper function segments the video based on player information.

AWS Lambda console displaying test event configuration with S3 bucket and file path parameters
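This post does not say which tool the auto-clipper uses to cut the video. One common option, assumed here purely for illustration, is to call ffmpeg with a start offset and a duration. The helper below (build_clip_command is a hypothetical name) only builds the argument list; running it inside Lambda would also require packaging an ffmpeg binary with the function, for example in a Lambda layer.

```python
def build_clip_command(source_path, start, end, output_path):
    """Build an ffmpeg argument list that copies [start, end) into a new file.

    Times are in seconds. Stream copy avoids re-encoding, so cuts land on
    the nearest keyframes rather than on exact timestamps.
    """
    if end <= start:
        raise ValueError("clip end must be after clip start")
    return [
        "ffmpeg", "-y",               # overwrite the output file if it exists
        "-ss", f"{start:.3f}",        # seek to the clip start in the input
        "-i", source_path,
        "-t", f"{end - start:.3f}",   # clip duration in seconds
        "-c", "copy",                 # copy audio and video without re-encoding
        output_path,
    ]
```

The list can be passed directly to subprocess.run(command, check=True). Stream copy is fast but approximate. Frame-accurate cuts would require re-encoding, at a higher compute cost.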

Final clips are uploaded to the output bucket in one of two folders: a folder prefixed with the input video name, or Unnamed/ if the transcription was unclear or the player name was missing within the segment.

Amazon S3 management interface showing two empty folders within skillshow-output-videos/segments path

Named videos appear in the first folder and follow SkillShow’s current naming convention (jersey color_number_event video name), so editors can download them on demand.

S3 bucket interface showing four timestamped MP4 video segments with metadata and storage details

Unnamed videos follow a similar naming convention but lack the unique player name. Editors now only have to review and manually rename the files in this folder, instead of doing this for entire event videos.

Amazon S3 interface showing segments/Unnamed folder containing unnamed MP4 file with creation date and storage details
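The folder and naming logic described above could be expressed roughly as follows. This is a sketch under assumptions: the function name segment_key is hypothetical, the exact separators are inferred from the stated convention (jersey color_number_event video name), and using the clip’s start time to keep unnamed files unique is a guess, since the post does not spell out how unnamed files are labeled.

```python
def segment_key(video_name, start_seconds, color=None, number=None):
    """Return the S3 object key for a clip cut from video_name.

    Clips with an identified jersey color and number go under a folder
    named after the input video; everything else goes under Unnamed/.
    """
    event = video_name.rsplit(".", 1)[0]  # drop the file extension
    if color and number:
        return f"segments/{event}/{color}_{number}_{event}.mp4"
    # Assumption: the start time (in whole seconds) keeps unnamed clips unique.
    return f"segments/Unnamed/{int(start_seconds):05d}s_{event}.mp4"
```

With this layout, editors can list only the segments/Unnamed/ prefix to find the clips that still need a manual rename.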

Results and benefits

After implementing this AWS-powered solution, SkillShow transformed their video processing operations. The automated pipeline reduced video production time from 3 weeks to 24 hours per event, enabling faster delivery to athletes and scouts. A recent event in Chicago showcased the system’s effectiveness: the automated pipeline processed 69 clips, accurately cutting and naming 64 of them, a 93% success rate. This high accuracy demonstrates the solution’s ability to handle real-world scenarios effectively. The system also proved adaptable, quickly addressing initial challenges such as color naming inconsistencies.

The Northwest Indoor event further illustrated the system’s scalability and versatility. Here, the automated process handled a larger volume of approximately 270 clips, maintaining an estimated accuracy rate of over 90%. Notably, this event included batting practice footage, highlighting the solution’s adaptability to various types of sports activities.

With this streamlined workflow, SkillShow has expanded its capacity to process multiple events simultaneously, significantly enhancing its ability to serve youth sports leagues. The standardized output format and improved player identification accuracy have enhanced the viewing experience for athletes, coaches, and scouts alike. Although the time savings vary depending on specific event conditions and filming techniques, the system has demonstrated its potential to substantially reduce manual editing work. SkillShow continues to refine the process, carefully balancing automation with quality control to provide optimal results across diverse event types. These improvements have positioned SkillShow to meet the growing demand for youth sports video content while maintaining consistent quality across all events.

Conclusion

This solution demonstrates how AWS ML services can transform resource-intensive video processing workflows into efficient, automated systems. By combining the scalable storage of Amazon S3, serverless computing with Lambda, and the speech recognition capabilities of Amazon Transcribe, organizations can dramatically reduce processing times and operational costs. As a leader in automated sports video production, SkillShow has pioneered this approach for youth sports while demonstrating its adaptability to various content types, from educational videos to corporate training. They’re already exploring additional artificial intelligence and machine learning (AI/ML) capabilities for automated highlight generation, real-time processing for live events, and deeper integration with sports leagues and organizations.

For organizations looking to further enhance their video processing capabilities, Amazon Bedrock Data Automation offers additional possibilities. Amazon Bedrock Data Automation can streamline the generation of valuable insights from unstructured, multimodal content such as documents, images, audio, and videos. This fully managed capability could potentially be integrated into workflows similar to SkillShow’s, offering features such as automated video summaries, content moderation, and custom extraction of relevant information from video content. Furthermore, Amazon Bedrock Data Automation can generate custom insights from audio, including summaries and sentiment analysis, providing even deeper understanding of spoken content in sports videos.

SkillShow’s success highlights the broader potential of cloud-based video processing. As demand for video content continues to grow across industries, organizations can use AWS ML services to automate their workflows, reduce manual effort, and focus on delivering value to their customers rather than managing complex editing operations.

Are you interested in implementing similar automated video processing workflows for your organization? Contact SkillShow to learn how their pipeline built with AWS services can transform your content production process.


About the Authors

Ragib Ahsan is a Partner Solutions Architect at Amazon Web Services (AWS), where he helps organizations build and implement AI/ML solutions. Specializing in computer vision, he works with AWS partners to create practical applications using cloud technologies. Ahsan is particularly passionate about serverless architecture and its role in making solutions more accessible and efficient.

Tom Koerick is the owner and CEO of SkillShow, a sports media network company that has been filming youth sporting events nationwide since 2001. A former professional baseball player turned entrepreneur, Tom develops video solutions for event organizers and families in the youth sports industry. His focus includes college recruiting, social media sharing, and B2B services that provide added value and revenue generation opportunities in youth sports.
