Examples
In this tutorial, we’ll assume that youtube is already installed on your system. If that’s not the case, see Installation guide.
We are going to download data for a youtube channel by Wabosha Maxine, who is a Kenyan Youtuber. This will include:
The Channel Details
The Channel Playlists
The Playlist Items
Videos in each playlist.
Comments for videos.
The tasks that we will perform include:
Obtaining data for various resources mentioned above.
Exporting the data for later use
youtube is written in Python. If you’re new to the language you might want to start by getting an idea of what the language is like, to get the most out of youtube.
For this tutorial, you will need a valid google Account and an API key. These are obtained from the Google developer console.
To get the credentials follow this tutorial https://blog.hubspot.com/website/how-to-get-youtube-api-key
Getting Channel Data
To obtain channel information, we need a channel id. To get a channel id, we will get a video from
youtube and query it for its channel id. For this example we will use this video titled
'LTMYS: Trust In Your Abilities 💌 The Dapper Brother' published on June 2023 as part of the LTMYS
playlist on Wabosha Maxine's channel.
Get Channel id
The video id is the part after the v in a youtube video url i.e for https://www.youtube.com/watch?v=pIzyo4cCGxU
the video id is pIzyo4cCGxU`. To get the channel id:
from youtube import YouTube
client_secret_file = '/home/downloads/client_secret.json'
youtube = YouTube(client_secret_file)
youtube.authenticate()
def get_channel_id():
videos = youtube.find_video_by_id('pIzyo4cCGxU')
channel_id = videos[0].channel_id
print(channel_id)
if __name__ == '__main__':
get_channel_id()
The code snippet creates an instance of youtube.YouTube
passing in a path to the downloaded client_secret.json file:
clients_secret_file: the path to the file containing your identification details downloaded from Google.authenticate(): uses the details from the client secret file to generate and store your credentials. Subsequest calls to this method retrieve the stored credentials.find_video_by_id(): returns a list of videos from youtube with the provided id. Each video is an instance ofVideowith a bunch of details including the channel_id.channel_id: the channel id for the channel to which this video belongs to.
The get_channel_id function retrieves the first video and then prints out the channel id.
How to execute
Put this in a text file, name it to something like channel_data.py
and run it using the python command:
python channel_data.py
When done executing, it will print the channel id to the terminal:
UC5WVOSvL9bc6kwCMXXeFLLw
What just happened under the hood?
First the call to youtube.authenticate generates credentials for use when querying the youtube API
.These are then stored in your computer for use in later requests. The call also checks if the credentails
are valid incase this is not the first time you’ve used the library. If the credentails are expired new
ones are generated. This call may open a browser window that requests you to authorize the application.
The call to youtube.find_video_by_id then queries the youtube api for the given video and if it exist,
returns the video details.
Get Channel Data
Next, let us get the channel details. Let us extend the channel_data.py script with a new
function to get the channel details:
from youtube import YouTube
client_secret_file = '/home/downloads/client_secret.json'
youtube = YouTube(client_secret_file)
youtube.authenticate()
def get_channel_id():
videos = youtube.find_video_by_id('pIzyo4cCGxU')
channel_id = videos[0].channel_id
return channel_id
def get_channel_details(channel_id):
channel = youtube.find_channel_by_id(channel_id)
return channel
def main():
channel_id = get_channel_id()
channel = get_channel_details(channel_id)
print(channel)
if __name__ == '__main__':
main()
find_channel_by_id(): takes in a channel id and returns a channel from Youtube with the provided id. The channel is an instance ofChannel
The get_channel_id method now returns the channel id.
The get_channel_details method uses the channel id to find the channel details.
The main method then uses the above two methods to find and print the channel details to the terminal.
How to execute
Run the script using the python command:
python channel_data.py
When done executing, it will print the channel details to the terminal:
[
Channel(channel_id='UC5WVOSvL9bc6kwCMXXeFLLw', channel_title='Wabosha Maxine',
published_at='2013-10-13T11:30:10Z', custom_url='@waboshamaxine',
channel_description='Hey there! Welcome to my channel. Subscribe to see all things beauty,
travel and lifestyle. Thanks for popping by!\n~ Wabosha \n\nProfessional inquiries:
beautybywabosha@gmail.com',
channel_thumbnail='https://yt3.ggpht.com/ytc/AGIKgqPwUCm7OLuVZeTpTxQ5QSQNA1c1K79Ne_ayzR-c3g=s240-c-k-c0x00ffffff-no-rj',
views_count='20800438', videos_count='377', subscribers_count='236000')
]
Get Channel Playlists
Now that we have a channel, as well as its details, we can get the playlists that are part of this channel.
Let us extend the channel_data.py script with a new
function to get the channel playlists:
from youtube import YouTube
client_secret_file = '/home/downloads/client_secret.json'
youtube = YouTube(client_secret_file)
youtube.authenticate()
def get_channel_id():
videos = youtube.find_video_by_id('pIzyo4cCGxU')
channel_id = videos[0].channel_id
return channel_id
def get_channel_details(channel_id):
channel = youtube.find_channel_by_id(channel_id)
return channel
def get_channel_playlists(channel_id):
channel_playlists = youtube.find_channel_playlists(channel_id)
return channel_playlists
def main():
# channel_id = get_channel_id()
# channel = get_channel_details(channel_id)
channel_playlists = get_channel_playlists('UC5WVOSvL9bc6kwCMXXeFLLw')
print(channel_playlists)
if __name__ == '__main__':
main()
find_channel_playlists(): finds the playlists for a given channel. It returns a list of instances ofPlaylist
The get_channel_playlists method now returns a list of playlist.
How to execute
Run the script using the python command:
python channel_data.py
When done executing, it will print the channel details to the terminal:
[
Playlist(playlist_id='PLouh1K1d9jkZQE0ITJH820mS6s8J5PyxH', published_at='2022-10-12T18:15:53Z',
channel_id='UC5WVOSvL9bc6kwCMXXeFLLw', playlist_title='VLOGS', playlist_description='',
playlist_thumbnail='https://i.ytimg.com/vi/EcRg4X1ftrQ/sddefault.jpg',
channel_title='Wabosha Maxine', privacy_status='public', videos_count=355),
Playlist(playlist_id='PLouh1K1d9jkbKgYLnO8csSJONqCBxM7Bj', published_at='2022-02-02T20:39:46Z',
channel_id='UC5WVOSvL9bc6kwCMXXeFLLw', playlist_title='TUMA PIN', playlist_description='',
playlist_thumbnail='https://i.ytimg.com/vi/qVHhcn_r3bs/sddefault.jpg',
channel_title='Wabosha Maxine', privacy_status='public', videos_count=5),
Playlist(playlist_id='PLouh1K1d9jkYZo8h1zPH3P1ScAWA8gxbu', published_at='2021-08-19T08:49:34Z',
channel_id='UC5WVOSvL9bc6kwCMXXeFLLw', playlist_title='LTMYS', playlist_description='',
playlist_thumbnail='https://i.ytimg.com/vi/27FnpZNmJ8M/mqdefault.jpg',
channel_title='Wabosha Maxine', privacy_status='public', videos_count=21),
Playlist(playlist_id='PLouh1K1d9jkbgO4hIHvabpyxUTqruqFq-', published_at='2018-08-11T13:28:00Z',
channel_id='UC5WVOSvL9bc6kwCMXXeFLLw', playlist_title='MAKE-UP REVIEWS', playlist_description='',
playlist_thumbnail='https://i.ytimg.com/img/no_thumbnail.jpg', channel_title='Wabosha Maxine',
privacy_status='public', videos_count=0),
Playlist(playlist_id='PLouh1K1d9jkaepF8uq2aEZt-fF4KasycG', published_at='2018-08-11T13:26:21Z',
channel_id='UC5WVOSvL9bc6kwCMXXeFLLw', playlist_title='FOOD REVIEWS', playlist_description='',
playlist_thumbnail='https://i.ytimg.com/img/no_thumbnail.jpg', channel_title='Wabosha Maxine',
privacy_status='public', videos_count=0),
Playlist(playlist_id='PLouh1K1d9jkbac3J9sOkkvTGiA-6xZ5BD', published_at='2018-05-25T16:37:00Z',
channel_id='UC5WVOSvL9bc6kwCMXXeFLLw', playlist_title='HAULS', playlist_description='',
playlist_thumbnail='https://i.ytimg.com/img/no_thumbnail.jpg', channel_title='Wabosha Maxine',
privacy_status='public', videos_count=0)
]
Get Playlist Items
Each playlist in a YouTube channel has various items, known as playlist Items. The playlist Item contains information such as when the item was added to the playlist, by who , the video as well as the channel to which the video belongs to.
Let us extend the channel_data.py script with a new
function to get a single playlist’s items:
from youtube import YouTube
client_secret_file = '/home/downloads/client_secret.json'
youtube = YouTube(client_secret_file)
youtube.authenticate()
def get_channel_id():
videos = youtube.find_video_by_id('pIzyo4cCGxU')
channel_id = videos[0].channel_id
return channel_id
def get_channel_details(channel_id):
channel = youtube.find_channel_by_id(channel_id)
return channel
def get_channel_playlists(channel_id):
channel_playlists = youtube.find_channel_playlists(channel_id)
return channel_playlists
def get_playlist_items(playlist_id):
search_iterator = youtube.find_playlist_items(playlist_id, max_results=10)
playlists = list(next(search_iterator))
return playlists
def main():
# channel_id = get_channel_id()
# channel = get_channel_details(channel_id)
# channel_playlists = get_channel_playlists('UC5WVOSvL9bc6kwCMXXeFLLw')
playlist_items = get_playlist_items('PLouh1K1d9jkYZo8h1zPH3P1ScAWA8gxbu')
print(playlist_items)
if __name__ == '__main__':
main()
find_playlist_items(): finds the playlist items for a given playlist. It returns an iterator.
The get_playlist_items method returns a list of PlaylistItem by iterating through
the results returned by the call to find_playlist_items()
How to execute
Run the script using the python command:
python channel_data.py
When done executing, it will print the channel details to the terminal:
[
PlaylistItem(
playlist_item_id='UExvdWgxSzFkOWprWVpvOGgxelBIM1AxU2NBV0E4Z3hidS41NkI0NEY2RDEwNTU3Q0M2',
date_added='2021-08-19T08:49:42Z',
channel_adder_id='UC5WVOSvL9bc6kwCMXXeFLLw',
item_title='SOMETHING IS COOKING // Wabosha Maxine',
item_description='MENTIONED IN THIS VIDEO:\n-Get yourself some of the merch in this',
item_thumbnail='https://i.ytimg.com/vi/27FnpZNmJ8M/mqdefault.jpg',
channel_title='Wabosha Maxine', video_owner_channel_title='Wabosha Maxine',
video_owner_channel_id='UC5WVOSvL9bc6kwCMXXeFLLw',
playlist_id='PLouh1K1d9jkYZo8h1zPH3P1ScAWA8gxbu', position=0, video_id='27FnpZNmJ8M',
resource_id='27FnpZNmJ8M', video_published_at='2021-08-19T08:51:27Z',
privacy_status='public')
]
Get Playlist Videos
So far we have a channel and its playlists. And for a given playlist, we can get the playlists’ items. Each playlist item contains a video id. We will use this id to load the videos for a given playlist.
Let us extend the channel_data.py script with a new
function to get a single playlist’s videos:
from youtube import YouTube
client_secret_file = '/home/downloads/client_secret.json'
youtube = YouTube(client_secret_file)
youtube.authenticate()
def get_channel_id():
videos = youtube.find_video_by_id('pIzyo4cCGxU')
channel_id = videos[0].channel_id
return channel_id
def get_channel_details(channel_id):
channel = youtube.find_channel_by_id(channel_id)
return channel
def get_channel_playlists(channel_id):
channel_playlists = youtube.find_channel_playlists(channel_id)
return channel_playlists
def get_playlist_items(playlist_id):
search_iterator = youtube.find_playlist_items(playlist_id, max_results=10)
playlists = list(next(search_iterator))
return playlists
def get_playlist_item_video_id(playlist_item):
video_id = playlist_item.video_id
return video_id
def get_videos(video_ids):
videos = youtube.find_video_by_id(video_ids)
return videos
def main():
# channel_id = get_channel_id()
# channel = get_channel_details(channel_id)
# channel_playlists = get_channel_playlists('UC5WVOSvL9bc6kwCMXXeFLLw')
playlist_items = get_playlist_items('PLouh1K1d9jkYZo8h1zPH3P1ScAWA8gxbu')
playlist_video_ids = list(map(get_playlist_item_video_id, playlist_items))
playlist_videos = get_videos(playlist_video_ids)
print(playlist_videos)
if __name__ == '__main__':
main()
The get_playlist_item_video_id method returns a list of video ids from a list of
PlaylistItem instances.
The get_playlist_item_video_id method returns a list of Video instances
How to execute
Run the script using the python command:
python channel_data.py
When done executing, it will print the channel details to the terminal:
[
Video(video_id='27FnpZNmJ8M', video_title='SOMETHING IS COOKING // Wabosha Maxine',
channel_id='UC5WVOSvL9bc6kwCMXXeFLLw', channel_title='Wabosha Maxine',
video_description='MENTIONED IN THIS VIDEO',
video_thumbnail='https://i.ytimg.com/vi/27FnpZNmJ8M/mqdefault.jpg',
video_duration='PT6M49S', views_count='45073', likes_count='2312', comments_count='194',
published_at='2021-08-19T08:51:27Z')
]
Get a Video’s Comments
For the final task, we will load a given videos’s comments.
For this example we will use this video titled
'LTMYS: Trust In Your Abilities 💌 The Dapper Brother' published on June 2023 as part of the LTMYS
playlist on Wabosha Maxine's channel.
from youtube import YouTube
client_secret_file = '/home/downloads/client_secret.json'
youtube = YouTube(client_secret_file)
youtube.authenticate()
def get_video_comments(video_id):
search_iterator = youtube.find_video_comments(video_id, max_results=20)
video_comments = list(next(search_iterator))
return video_comments
if __name__ == '__main__':
get_video_comments('pIzyo4cCGxU')
find_video_comments(): finds the comments for a given video. It returns an iterator.
The get_video_comments method returns a list of VideoComment by iterating through
the results returned by the call to find_video_comments()
How to execute
Put this in a text file, name it to something like video_comments.py
and run it using the python command:
python video_comments.py
When done executing, it will print the video’s comments to the terminal:
[
VideoCommentThread(video_id='pIzyo4cCGxU', top_level_comment=VideoComment(video_id=
'pIzyo4cCGxU', comment=Comment(comment_id='UgxgAvOCX2S3XpWnH2t4AaABAg',
comment_author=CommentAuthor(author_display_name='Sharzy Kish',
author_profile_image_url='https://yt3.ggpht.com/ytc/AGIKgqPdDBYMmEahLmqmKuWnb3Hl2Oghh4adoE7dnw=s48-c-k-c0x00ffffff-no-rj', author_channel_url='http://www.youtube.com/channel/UCT-jlAGTexG3Ts35ZhhiXew',
author_channel_id='UCT-jlAGTexG3Ts35ZhhiXew'), text_display='Love love this episode😍😍',
text_original='Love love this episode😍😍', like_count=2, published_at='2023-06-04T12:40:24Z',
updated_at='2023-06-04T12:40:24Z', parent_id='')),
comment_thread=CommentThread(comment_thread_id='UgxgAvOCX2S3XpWnH2t4AaABAg',
total_reply_count=1, is_public=True), replies=[VideoComment(video_id='pIzyo4cCGxU',
comment=Comment(comment_id='UgxgAvOCX2S3XpWnH2t4AaABAg.9qXuda6x5Xg9qXvfxe66DU',
comment_author=CommentAuthor(author_display_name='Sharzy Kish',
author_profile_image_url='https://yt3.ggpht.com/ytc/AGIKgqPdDBYMmEahLmqmKuWnb3Hl2Oghh4adoE7dnw=s48-c-k-c0x00ffffff-no-rj',
author_channel_url='http://www.youtube.com/channel/UCT-jlAGTexG3Ts35ZhhiXew',
author_channel_id='UCT-jlAGTexG3Ts35ZhhiXew'), text_display='Guys, he has a beautiful voice😭😍',
text_original='Guys, he has a beautiful voice😭😍', like_count=1,
published_at='2023-06-04T12:49:27Z', updated_at='2023-06-04T12:49:27Z', parent_id=''))]),
]
Exporting the Data
In this section we will see how to export the data. The options here include:
Saving to a file as json Object
Saving to an SQL database using an ORM or through barebones sql.
Creating a Pandas dataframe for further analysis.
Next steps
You can check out the youtube Tutorial for example project walk throughs or browse the library API.