
TubeKit is a toolkit for creating YouTube crawlers. It lets you build your own crawler that crawls YouTube based on a set of seed queries and collects up to two dozen different attributes. TubeKit assists in every phase of this process, from database creation to preparing analysis reports from the collected data.

Requirements: a UNIX-based system (Linux, Solaris, Mac), a Web server with PHP and MySQL support, PHP 5 or higher, and MySQL 4 or higher. Most UNIX-based systems should already have these installed. Optionally, to download Flash videos from YouTube, you need youtube-dl (a copy is included here) and Python 2.4 or higher. To convert the videos to MPEG, you also need the ffmpeg tool.
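Before running Setup, it can help to confirm that the command-line tools named above are on your PATH. The following is a minimal sketch, not part of TubeKit: the executable names (`php`, `mysql`, `youtube-dl`, `python`, `ffmpeg`) and the helper functions `find_tools` and `missing` are assumptions for illustration, and the script itself assumes Python 3. It checks only that each tool is present. It does not verify version numbers or the Web server configuration.

```python
import shutil

# Executable names are assumptions; adjust them to match your system.
REQUIRED = ["php", "mysql"]
OPTIONAL = ["youtube-dl", "python", "ffmpeg"]


def find_tools(names, which=shutil.which):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: which(name) for name in names}


def missing(found):
    """Return the sorted names of tools that were not found."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        absent = missing(find_tools(names))
        if absent:
            print("Missing %s tools: %s" % (label, ", ".join(absent)))
        else:
            print("All %s tools found." % label)
```

Missing optional tools matter only if you plan to download or convert videos; the crawler's core requirements are PHP and MySQL.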


Steps to create your YouTube crawler with TubeKit:
  1. Provide basic information (project name, directories to store Flash and MPEG videos).
  2. Set up the database.
  3. Select up to 17 different attributes to collect for a YouTube video.
  4. Set up various schedules for crawling.
  5. Enter seed queries.
You can find detailed information in the documentation. If you still have a problem, have feedback or comments, or want to report a bug, you can contact Chirag Shah. To get started with creating a crawler, go to Setup.

TubeKit by Chirag Shah is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License and is meant to be used in a research setting. You are requested to review the license agreements of any third-party tools and services that you use with TubeKit. The author does not take any responsibility for license violations involving those products or for misuse of TubeKit.

© 2007-2012 Chirag Shah