Skip to content

Latest commit

 

History

History
57 lines (48 loc) · 2.37 KB

archive.org.md

File metadata and controls

57 lines (48 loc) · 2.37 KB

Playdrone snapshots on archive.org

We worked with the archive.org team to provide snapshots of the Google Play Android market. We crawled the market 13 days, from 2014-10-19 to 2014-10-31, and released the data on archive.org for public use.

For each day, a file https://archive.org/download/playdrone-snapshots/2014-10-{19..31}.json contains the list of all applications present on the market for that specific day.

These daily snapshots contain a list of applications as an array of json objects. The following show an example of an application object:

{
  "app_id":"com.google.android.youtube",
  "title":"YouTube",
  "developer_name":"Google Inc.",
  "category":"MEDIA_AND_VIDEO",
  "free":true,
  "version_code":51405300,
  "version_string":"5.14.5",
  "installation_size":10191835,
  "downloads":1000000000,
  "star_rating":4.08009,
  "snapshot_date":"2014-10-31",
  "metadata_url":"https://archive.org/download/playdrone-metadata-2014-10-31-c9/com.google.android.youtube.json",
  "apk_url":"https://archive.org/download/playdrone-apk-c9/com.google.android.youtube-51405300.apk"
}

Notes:

  • Applications are sorted by download counts.
  • The provided metadata in the snapshot files only contains a subset of available metadata for a given application. To fetch additional metadata, the metadata_url link can be fetched. Its content is the Google API response in JSON format when asked for the application full metadata. To interpret the raw metadata, you may use this reference.
  • The apk_url link can be fetch to retrieve the corresponding APK binary.
  • Overall, Playdrone failed to download the APK for 0.1% of the free applications. For these applications, no apk_url property is provided.

If you need a single snapshot of the market, we recommend you use the latest snapshot: https://archive.org/download/playdrone-snapshots/2014-10-31.json. It contains 1,402,894 applications.

To quickly download 1000 APKs starting from the most popular, the following can be used to download APKs in 12-way to speed things up:

wget https://archive.org/download/playdrone-snapshots/2014-10-31.json
grep apk_url 2014-10-31.json | head -n 1000 | sed -e 's/"apk_url":"\(.*\)"$/\1/' | xargs -L1 -P12 wget -q