Classifying Video Labels from YouTube
Abstract
The goal of my thesis is to teach an AI to label the genre of a YouTube video (e.g., Makeup, Games, Art & Entertainment) using several deep learning models, and to compare the models to understand their strengths, weaknesses, and scalability. Such a classifier could automate the repetitive work of organizing YouTube videos with similar content in a recommendation search engine, and could help classify copyrighted material. I use Google’s YT-8M dataset [1], which was initially 0.5 petabytes of data and is compressed to 1.5 terabytes for the research dataset. It provides two types of features: video-level (2-dimensional) and frame-level (3-dimensional) data. I coded eight deep learning models, four in Keras and the same four in PyTorch, to compare not only the models but also the two frameworks against each other.
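To make the difference between the two feature types concrete, here is a minimal NumPy sketch. It is not one of the eight thesis models. The array sizes, the `mean_pool` and `predict_genres` helpers, and the untrained sigmoid scorer are all assumptions chosen for illustration. It only shows how 3-dimensional frame-level data can be averaged over the frame axis into the same 2-dimensional layout as video-level data, and then scored per genre.

```python
import numpy as np

# Illustrative sizes only (assumptions, not values taken from the thesis).
BATCH, FRAMES, FEATURES, CLASSES = 4, 30, 1152, 10

rng = np.random.default_rng(0)

# Video-level data: one feature vector per video -> 2-D array (batch, features).
video_level = rng.normal(size=(BATCH, FEATURES))

# Frame-level data: one feature vector per frame -> 3-D array (batch, frames, features).
frame_level = rng.normal(size=(BATCH, FRAMES, FEATURES))


def mean_pool(frames):
    """Average over the frame axis so 3-D frame data matches the 2-D video layout."""
    return frames.mean(axis=1)


def predict_genres(features, weights, bias):
    """Return an independent sigmoid score per genre (hypothetical, untrained baseline)."""
    logits = features @ weights + bias
    return 1.0 / (1.0 + np.exp(-logits))


# Random, untrained parameters: the scores are meaningless, only the shapes matter.
weights = rng.normal(scale=0.01, size=(FEATURES, CLASSES))
bias = np.zeros(CLASSES)

pooled = mean_pool(frame_level)
scores = predict_genres(pooled, weights, bias)

print(video_level.shape, frame_level.shape, pooled.shape, scores.shape)
```

Mean pooling discards the order of the frames. A model that consumes the 3-dimensional input directly can use that order, which is one reason it is worth comparing models on both feature types.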


Code and Extras
Find additional resources on GitHub, including:
- Training/test code (uses TensorFlow)
- Pretrained model
- Live webcam demo
- Dense Captioning metric evaluation code
BibTeX
@inproceedings{densecap,
  title = {Classifying Video Labels from YouTube},
  author = {Roberto Chavez},
  advisors = {Ram Akella, Roberto Manduchi},
  booktitle = {UCSC Jack Baskin School of Engineering},
  year = {2017}
}