Classifying Video Labels from Youtube

Abstract

Goal for my thesis: teach an AI to label the genre of a YouTube video (e.g., Makeup, Games, Art & Entertainment) using a series of deep learning models, and compare the models to understand their strengths, weaknesses, and scalability. Such a classifier could automate the repetitive work of grouping YouTube videos with similar content for a recommendation or search engine, and could help identify copyrighted material. I use Google's YouTube-8M (YT-8M) dataset [1], which was originally 0.5 petabytes of data and was compressed to 1.5 terabytes for the research release. The dataset provides two types of features: video-level (2-dimensional tensors) and frame-level (3-dimensional tensors). I coded eight deep learning models, four in Keras and the same four in PyTorch, to compare frameworks as well as models.
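To make the two feature types concrete, here is a minimal NumPy sketch of how a 3-dimensional frame-level batch relates to a 2-dimensional video-level batch. It is an illustration only, not the thesis code: the sizes (300 frames, 1152 features, 10 classes), the random stand-in data, and the helper names `mean_pool` and `predict_labels` are all assumptions, and the single sigmoid layer is a placeholder for whichever models the thesis actually compares.

```python
import numpy as np

# Assumed sizes for illustration only; they are not taken from the thesis.
BATCH, FRAMES, FEATURES, CLASSES = 4, 300, 1152, 10

rng = np.random.default_rng(0)

# Frame-level input: one feature vector per sampled frame, so the batch
# is a 3-D tensor of shape (batch, frames, features). Random stand-in data.
frame_level = rng.normal(size=(BATCH, FRAMES, FEATURES)).astype(np.float32)


def mean_pool(frames):
    """Average over the frame axis so each video becomes one feature vector."""
    return frames.mean(axis=1)


# Video-level input: one feature vector per video, so the batch is a
# 2-D tensor of shape (batch, features).
video_level = mean_pool(frame_level)


def predict_labels(features, weights, bias, threshold=0.5):
    """Hypothetical baseline: an independent sigmoid score per class."""
    logits = features @ weights + bias
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs, probs >= threshold


# Untrained placeholder parameters, just to show the shapes involved.
weights = rng.normal(scale=0.01, size=(FEATURES, CLASSES)).astype(np.float32)
bias = np.zeros(CLASSES, dtype=np.float32)

probs, labels = predict_labels(video_level, weights, bias)
print(frame_level.shape, video_level.shape, probs.shape)
```

Averaging over frames is only one way to get from frame-level to video-level input; a model that consumes the 3-D tensor directly can use the temporal order that averaging discards, at a higher memory and compute cost.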

Feature engineering pipeline (YouTube-8M)

Code and Extras

Find additional resources on GitHub, including:

Training/test code (uses TensorFlow)
Pretrained model
Live webcam demo
Dense Captioning metric evaluation code

BibTeX

@inproceedings{densecap,
  title     = {Classifying Video Labels from Youtube},
  author    = {Roberto Chavez},
  advisors  = {Ram Akella, Roberto Manduchi},
  booktitle = {UCSC Jack Baskin School of Engineering},
  year      = {2017}
}

Roberto Chavez Jr

Software Engineer

