03. 🔍Search →
The Search tab is the core of the scraper: it controls which request is made to the YouTube API.
The full reference to the YouTube API is here: https://developers.google.com/youtube/v3/docs
This will show you two things:
- The full query string that will be sent to the API, INCLUDING your API key.
- The response that was sent back from YouTube.
A search instance holds all the settings for one request type.
The Search ID does two things:
- Populates the selection box on the Scrape tab, where you choose the search you want to use.
- Names the taxonomy term that any new posts are tagged with. This lets you see exactly which posts a scrape inserted: navigate to the import's taxonomy and select the term named after the Search ID.
Purely informational. Used to describe what the search is doing. Use for notes.
There are currently four different request types:
This calls the Search: list endpoint of the API and is the default search. It does not require any 'Extra Search Parameters'.
This makes repeated calls to the Activities: list endpoint of the API and is used to collect video items from multiple channels.
You can specify the channels to query in the 'Extra Search Parameters' textarea. The value is an associative array with the key channels and a comma-separated string listing all the channels.
For example:
[ 'channels' => 'UCdPui8EYr_sX6q1xNXCRPXg, UC2lFz1NdmLCciJp5yG19oyw, UCwO-9tpNU8ThpeU0W2BXRiw' ]
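As a rough illustration of what looping over channels could look like, the Python sketch below splits the comma-separated value and builds one Activities: list query per channel. The plugin itself is PHP and its internals are not shown on this page, so the helper name `build_channel_queries` and the `part=snippet` parameter are assumptions made for the example. No request is actually sent.

```python
from urllib.parse import urlencode

# Base URL of the Activities: list endpoint of the YouTube Data API v3.
ACTIVITIES_URL = "https://www.googleapis.com/youtube/v3/activities"


def build_channel_queries(extra_params, api_key):
    """Return one request URL per channel listed under the 'channels' key.

    Hypothetical helper: it mirrors the 'Extra Search Parameters' array
    described above, not the plugin's actual code.
    """
    # Split on commas and trim the whitespace around each channel ID.
    channel_ids = [c.strip() for c in extra_params["channels"].split(",") if c.strip()]
    urls = []
    for channel_id in channel_ids:
        # part=snippet is an assumed choice for this illustration.
        query = urlencode({"part": "snippet", "channelId": channel_id, "key": api_key})
        urls.append(f"{ACTIVITIES_URL}?{query}")
    return urls


extra = {"channels": "UCdPui8EYr_sX6q1xNXCRPXg, UC2lFz1NdmLCciJp5yG19oyw, UCwO-9tpNU8ThpeU0W2BXRiw"}
for url in build_channel_queries(extra, "YOUR_API_KEY"):
    print(url)
```

Trimming each ID means the spaces after the commas in the example above are harmless in this sketch.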
This queries the PlaylistItems: list endpoint to retrieve items from a specified playlist. It does not require any 'Extra Search Parameters'.
This queries the Playlists: list endpoint. It is good for retrieving all of the playlists from a specific channel. It does not require any 'Extra Search Parameters'.
This is where you enter the full string of search parameters you wish to pass. The code makes an HTTP GET request to the resulting URL. Things to remember:
- DO NOT include the API key. It is added automatically.
- Do not add the Google domain either. It is prefixed to the string automatically.
- Only specify the parameters of the search string, that is, everything after the question mark (?) in the URL (https://www.googleapis.com/youtube/v3/playlistItems?...).
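To make the division of labour concrete, here is a minimal Python sketch of how a final request URL could be assembled from the pieces described above. The function name `build_request_url` is hypothetical, and exactly how the plugin appends the key is an assumption. The `key` query parameter and the URL prefix are the ones the YouTube Data API uses.

```python
# Prefix taken from the example URL above.
API_PREFIX = "https://www.googleapis.com/youtube/v3/"


def build_request_url(endpoint, search_string, api_key):
    """Combine the API prefix, endpoint, search string and API key."""
    # Drop a leading '?' in case it was pasted in by mistake.
    params = search_string.lstrip("?")
    return f"{API_PREFIX}{endpoint}?{params}&key={api_key}"


print(build_request_url("search", "part=snippet&maxResults=10&q=parkour", "YOUR_API_KEY"))
# → https://www.googleapis.com/youtube/v3/search?part=snippet&maxResults=10&q=parkour&key=YOUR_API_KEY
```

Only the middle part, the search string, is what you type into this field.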
The search string can contain two extra components: search tokens and search substitutions.
A search token is recognised by double-moustache brackets, {{WORD}}, and runs an extra function whose result is placed into the search string BEFORE it is sent to the API. At the moment, there is only one token available:
The date token.
{{date=-24 hours}}
The date token inserts an ATOM-formatted datetime into the search string. You control the token with inputs accepted by the PHP DateTime class.
So, for instance, if you wanted the search to return only video items published in the last 10 minutes, you could use the search string:
part=snippet&maxResults=10&q=parkour&publishedAfter={{date=-10 minutes}}
This takes the time exactly 10 minutes ago, converts it to the required ATOM datetime, and substitutes it into the query string, which becomes something like:
part=snippet&maxResults=10&q=parkour&publishedAfter=2020-04-28T16%3A39%3A45%2B00%3A00
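The expansion above can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's PHP code: the helper `expand_date_tokens` is hypothetical, and it understands only simple offsets such as `-10 minutes` or `-24 hours`, whereas the PHP DateTime class accepts many more formats.

```python
import re
from datetime import datetime, timedelta, timezone
from urllib.parse import quote

# Matches {{date=<offset>}}, e.g. {{date=-10 minutes}}.
DATE_TOKEN = re.compile(r"\{\{date=([^}]+)\}\}")
# Only simple offsets such as "-10 minutes" or "-24 hours" are handled here.
OFFSET = re.compile(r"^([+-]?\d+)\s*(minute|hour|day|week)s?$")


def expand_date_tokens(search_string, now=None):
    """Replace each date token with a URL-encoded ATOM datetime."""
    now = now or datetime.now(timezone.utc)

    def replace(match):
        offset = OFFSET.match(match.group(1).strip())
        if offset is None:
            raise ValueError(f"unsupported date offset: {match.group(1)!r}")
        amount, unit = int(offset.group(1)), offset.group(2)
        moment = now + timedelta(**{unit + "s": amount})
        # ATOM format looks like 2020-04-28T16:39:45+00:00 before encoding.
        return quote(moment.replace(microsecond=0).isoformat(), safe="")

    return DATE_TOKEN.sub(replace, search_string)


# A fixed "now" keeps the example reproducible.
fixed_now = datetime(2020, 4, 28, 16, 49, 45, tzinfo=timezone.utc)
print(expand_date_tokens(
    "part=snippet&maxResults=10&q=parkour&publishedAfter={{date=-10 minutes}}",
    fixed_now,
))
# → part=snippet&maxResults=10&q=parkour&publishedAfter=2020-04-28T16%3A39%3A45%2B00%3A00
```

The colons and the plus sign are percent-encoded because the datetime travels inside a URL query string.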
Search substitutions are recognised by double square brackets, [[WORD]].
These are simpler string substitutions that are run after any tokens are run. The idea is to create shortcuts for the search string, and a substitution can stand in for any text. To define one, use the Search Substitution section below the Search Instance area.
This is where you can define any custom substitutions you wish to make on the search string line.
Define the word you wish to use on the search string line within the double square brackets. Note: You DO NOT need to add the square brackets in this textbox. The brackets are added automatically.
Whatever text string you wish to add into the search query. For instance, you could have a blacklist word that becomes [[blacklist]] on the search string line. When processed, this is replaced with the text string:
-Roblox -Fortnite -Fortnight -Minecraft -crossfire -tank -tanks -gameplay -GTAv -GTA -Gaming -lego -streamlabs -playstation -xbox -nintendo -BlockStarPlanet
The search string line becomes:
part=snippet&maxResults=10&publishedAfter={{date=-10 minutes}}&q=parkour[[blacklist]]
You can also include a token within the search replacement. So, you could have the word last10mins become a replacement for &publishedAfter={{date=-10 minutes}}, which makes the search string even easier:
part=snippet&maxResults=10[[last10mins]]&q=parkour[[blacklist]]
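The substitution step on its own can be sketched in Python like this. The helper `apply_substitutions` is hypothetical and the blacklist is shortened. The leading space in the blacklist text is an assumption made here so that the excluded terms are separated from the query term. The sketch covers only the substitution step, so a {{date=...}} token carried in by a substitution is still present in the result and has to be expanded by the token step.

```python
import re

# Matches [[word]] placeholders.
SUBSTITUTION = re.compile(r"\[\[(\w+)\]\]")


def apply_substitutions(search_string, substitutions):
    """Replace each [[word]] with its text; unknown words are left untouched."""
    return SUBSTITUTION.sub(
        lambda m: substitutions.get(m.group(1), m.group(0)), search_string
    )


substitutions = {
    # Shortened blacklist; the leading space is an assumption (see above).
    "blacklist": " -Roblox -Fortnite",
    "last10mins": "&publishedAfter={{date=-10 minutes}}",
}
print(apply_substitutions(
    "part=snippet&maxResults=10[[last10mins]]&q=parkour[[blacklist]]",
    substitutions,
))
# → part=snippet&maxResults=10&publishedAfter={{date=-10 minutes}}&q=parkour -Roblox -Fortnite
```

Leaving unknown words untouched is a design choice for this sketch: it makes a mistyped substitution name easy to spot in the final string.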
This makes the search functionality flexible enough to build whatever query you require. It also means you can reuse search substitutions for parameters that you use frequently.