In my last post I expressed my desire to use common statistical methods and multivariate analysis to analyse social media data. Here I am going to explore whether cluster analysis can be used to segment videos according to audience attributes. The discussion is divided into two parts. The current post explains how the data can be obtained from the YouTube platform. The second will go deeper into how the data can be cleaned and how cluster analysis can be applied.
How many of us log into our YouTube account every day and regularly watch at least one video? The number is staggering, and that is why I chose a platform like YouTube. As the viewership of YouTube videos rises, we analytics people have wondered whether it is possible to get meaningful results by analyzing this data. The first question is: where is this data available? Interestingly, the YouTube Insight feature has a ready solution. It gives you a wide range of data that can be used for detailed analysis of your audience. If you own a YouTube account, it is quite easy to get a graphical dashboard view of your audience characteristics. Just go to the link where your account name is written and click on the drop-down. A list of links will appear.
Click on any link and a toolbar appears at the top. Then click on the link called Insight on the toolbar to open your dashboard.
The dashboard is divided into six groups: Summary, Views, Discovery, Demographics, Community and Subscribers.
In the summary dashboard you can customize the time frame to see the number of views over a period of time. It also shows the demographics of the audience in terms of age and gender, and the regions in which the videos are popular. We can even see which videos got what kind of attention, and each video's views as a percentage of total views in the channel.
The views link opens another dashboard with a few more options. We can look at the daily totals, 7-day totals and 30-day totals, as well as the regional popularity.
The discovery dashboard shows the links that viewers followed to reach the video, which we call the referral statistics. These are available as absolute numbers and as a percentage of total views. It also shows the location of the player when the video was viewed, as a percentage of total views.
The demographics link opens the demographics view, which also appears as part of the summary view. It divides the audience into 7 age groups, from 13 to above 65, and reveals the percentage of males and females in each age group. It also provides a pie chart of the overall male and female percentages of video views.
The community link offers four kinds of views. One shows all community engagements, a summed-up view of all engagement activity by the audience such as Sharing, Rating, Comments and Audience. Individual views based on Sharing, Rating and so on are also available. In addition, a list of top countries and top videos is available, ranked by engagement activity across all countries.
The last link is the subscribers link for the channel, which lists the top countries by largest change in subscribers. For individual videos this is replaced by a tab called Hotspots, which gives a graphical representation of how the video has been watched along its length. It shows at which points the video was watched more and at which points most of the audience left. Each of these links can be adjusted by date as needed.
To produce these dashboards there must be some data at the back end. The good news is that YouTube provides us this data for free. This is really exciting, as it allows us to process the data according to our needs and find meaningful results from it. We just need to click on the reports to get it. There are “csv” and “csv for Excel” options to choose from.
When we click on the “download as” link, YouTube provides us with a zip file named after the video ID. The zip file contains four Excel files covering four aspects of the data, namely views, location, referrers and demographics. Each of these files contains the video ID as a unique key, which allows us to combine the data into one single file.
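As a rough illustration of that combining step, here is a minimal Python sketch using only the standard library. It assumes the files were downloaded in the “csv” format. The file labels and the column name `video_id` are my own assumptions, not the actual headers in the YouTube export, so adjust them to whatever your download contains.

```python
import csv
from collections import defaultdict


def combine_by_video_id(paths, key="video_id"):
    """Collect the rows of several CSV files under their shared video ID.

    `paths` maps a label (for example "views") to a CSV file path.
    The key column name "video_id" is an assumed header, not the
    real header used in the YouTube export.
    """
    combined = defaultdict(lambda: defaultdict(list))
    for label, path in paths.items():
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Remove the key from the row and use it to group the rest.
                video_id = row.pop(key)
                combined[video_id][label].append(row)
    return combined
```

Calling `combine_by_video_id({"views": "views.csv", "demographics": "demographics.csv"})` (hypothetical file names) returns one entry per video ID, with the rows from each file kept under their own label so nothing is lost when the files have different numbers of rows per video.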
There is one pitfall. YouTube currently gives the data for only 28 days at one click. To download the data for an entire year, we have to adjust the time period about 13 times and click each time to get the data. This may sound easy for one video, but if you have a channel with more than 600 videos it is a daunting task. One option that we can explore is programming. The documentation for the relevant API calls is at http://code.google.com/apis/youtube/2.0/developers_guide_protocol_insight.html
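To see where the "about 13 times" comes from, here is a small Python sketch that splits a date range into consecutive windows of at most 28 days. The function name and the example dates are my own, purely for illustration. It only computes the date ranges and does not call YouTube in any way.

```python
from datetime import date, timedelta


def report_windows(start, end, max_days=28):
    """Split the inclusive range [start, end] into windows of at most max_days days."""
    windows = []
    window_start = start
    while window_start <= end:
        # A window of max_days days ends max_days - 1 days after it starts.
        window_end = min(window_start + timedelta(days=max_days - 1), end)
        windows.append((window_start, window_end))
        window_start = window_end + timedelta(days=1)
    return windows


# 1 Jan 2011 to 30 Dec 2011 is 364 days, which is exactly 13 windows of 28 days.
windows = report_windows(date(2011, 1, 1), date(2011, 12, 30))
print(len(windows))  # 13
```

Each pair is a start and end date you would set before downloading one report. Note that a full 365-day year needs a short 14th window for the last day.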
All the programmers out there can try their hand at this.
Once you have downloaded the data, it needs to be processed before carrying out any multivariate analysis. What kind of multivariate analysis technique can be used? Well, it depends on the needs of the business or the person concerned, and on the question we need to answer. Our aim in carrying out the analysis is to group individuals on the basis of similar characteristics and to profile personas on the basis of these characteristics. So we considered using a common technique called clustering. By clustering the data on the basis of video IDs we will be able to group the audience characteristics. I will explain in my next post how we can go about this process of clustering.
Please feel free to leave a comment and let me know if this excites you, and share your blog or website if you have done something like this. I will be very enthusiastic to learn from it.