Skip to content
This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

itslab-kyushu/youtube-comment-scraper

Repository files navigation

Youtube comment scraper

MIT License npm version Code Climate Japanese

Scraping comments from Youtube.

Installation

To install youtube-comment-scraper in your global environment,

$ npm install -g youtube-comment-scraper

after that, you can use scraper command.

Usage

Usage: scraper url [options]

        url
                URL for a Youtube video page or video ID.

        --help, -h
                Displays help information about this script

        --version
                Displays version info

Output is a JSON format text. Its schema looks like

{
  "id": "the video ID.",
  "channel":{
    "id": "ID of the channel the video belongs to.",
    "name" : "the channel name."
  },
  "comments": [
    {
      "root": "root (parent) comment body.",
      "author": "author of the root comment.",
      "author_id": "ID of the author",
      "like": "like score (summation of +1 for like and -1 for dislike).",
      "children": [
        {
          "comment": "reply comment.",
          "author": "author of the reply comment.",
          "author_id": "author ID",
          "like": "like score."
        },
        ...
      ]
    },
    ...
  ]
}

Method

var scraper = require("youtube-comment-scraper");

scraper.comments(url)

Scraping a given Youtube page and return a set of comments.

  • Args:
    • url: URL of the target page or video ID.
  • Returns: Promise object. Use "then" to receive results.

scraper.channel(url)

Scraping a Youtube channel page and return a description of the channel.

  • Args:
    • id: channel ID.
  • Returns: Promise object. Use "then" method to receive results.

scraper.close()

Cleanup this module. After all scrapings have done, this method should be called. Otherwise, some instances of PhantomJS will keep running and it prevents finishing main process.

example

scraper.comments(some_url).then(function(res) {
  // Printing the result.
  console.log(JSON.stringify({
    url: some_url,
    comments: res
  }));

  // Close scraper.
  scraper.close();
});

For developers

Build

Run the following two command.

$ npm install
$ npm run build

Run

$ ./bin/cli.js <url>

<url> is a Youtube url.

License

This software is released under the MIT License, see LICENSE.