As many people in industries such as ours will have noticed, YouTube is currently slow to update the view count for some videos. Luckily we have our own numbers to go by, so it’s not affecting us as much as it is affecting many companies, but I thought I’d put my explanation up here so we can refer people to it.
According to YouTube, this seems to be due to an algorithm change made on the 25th of February (they have made similar comments elsewhere).
Quote:
We’ve made a change in our public-facing view counts across the site
that will enable us to consistently reflect what is considered a
‘view,’ based upon video consumption, video streaming and spam
filtering. This only affects view counts from February 25 moving
forward.
Implementing this change also caused view count updates to slow down a
bit in general; many people have noticed this and we’re aware of the
issue.
This raises some very interesting points (these are my own observations; they have not been confirmed with Google and may not reflect the opinions of Team Rubber):
First, for people who don’t deal with software like this every day (like I do for the viral ad network), I’ll explain the common way that numbers like this are updated:
- There are one or more “tracking servers”, running all over the place – these are the servers that actually record a “view”, “hit”, or “action” – and they simply record lots of information about each action, which will be looked over later.
- Every few minutes the main algorithm runs over all the data it hasn’t looked at yet and updates the numbers that are shown on the dashboards.
The important thing to notice is that the views are recorded right at the beginning, and the displayed numbers will be updated at some point. Even if the main algorithm is stopped entirely for a few days, it will pick up the recorded views once it runs again, so the counts will catch up if you’re patient.
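To make the two stages concrete, here is a minimal sketch of the "record now, count later" pattern described above. The class and method names are invented for illustration, and this is a toy model of the general approach, not a description of how YouTube actually implements it.

```python
from collections import Counter


class ViewPipeline:
    """Toy model: raw events are logged immediately and counted later in batches."""

    def __init__(self):
        self.event_log = []         # raw events written by the "tracking servers"
        self.processed_upto = 0     # index of the first event not yet counted
        self.displayed = Counter()  # the numbers shown on the "dashboard"

    def record_view(self, video_id):
        # Tracking step: only store the event; no counting happens here.
        self.event_log.append({"video_id": video_id})

    def run_batch(self):
        # Main algorithm: look only at events it has not seen yet.
        new_events = self.event_log[self.processed_upto:]
        for event in new_events:
            self.displayed[event["video_id"]] += 1
        self.processed_upto = len(self.event_log)
        return len(new_events)


pipeline = ViewPipeline()
for _ in range(3):
    pipeline.record_view("abc")
pipeline.run_batch()
shown_after_first_batch = pipeline.displayed["abc"]

# The batch job is "stopped" while more views keep arriving.
for _ in range(5):
    pipeline.record_view("abc")
shown_while_stopped = pipeline.displayed["abc"]

# Once the job runs again, the displayed number catches up.
pipeline.run_batch()
shown_after_catch_up = pipeline.displayed["abc"]
```

While the batch job is not running, the displayed number stays put even though the views are safely in the log; the next run brings it up to date.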
Prioritizing videos (“Why does this only happen once I reach 200/300 views?”)
You may have noticed that the number of views per video has always been updated more quickly for videos with few views than for videos with more views. For example, a newly uploaded video will normally update its view count within a few minutes of being watched, whereas a video that has already had several thousand views will update its view count more slowly.
This suggests that when Google run their main script, they update the numbers for videos with fewer views more often than for videos with a higher number of views, and leave the other data to be processed less often (say, every few hours).
This makes a lot of sense, because people with 50 views are more likely to be watching their numbers every few minutes to see if they have another 5 views than people who have had 200,000 views – who may only care about their views increasing by 1,000. It keeps users happier.
This explains why we (and others affected by this issue) have seen view counts rising as normal until they get above 200-300 views – at which point the numbers appear “stuck”.
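As a rough illustration of this kind of prioritisation, the sketch below refreshes low-view videos far more often than high-view ones. The 300-view threshold echoes the numbers people have reported; the 5-minute and 180-minute intervals, and the function names, are assumptions made up for this example.

```python
def refresh_interval_minutes(current_views, threshold=300):
    """Hypothetical policy: small videos refresh often, large ones rarely."""
    if current_views < threshold:
        return 5    # assumed: every few minutes
    return 180      # assumed: every few hours


def videos_due(videos, minutes_since_start):
    """Return the ids of videos whose displayed count is refreshed at this minute."""
    return [
        video_id
        for video_id, views in videos.items()
        if minutes_since_start % refresh_interval_minutes(views) == 0
    ]


videos = {"new_upload": 40, "popular": 250_000}
due_at_5 = videos_due(videos, 5)      # only the small video is refreshed
due_at_180 = videos_due(videos, 180)  # both videos are refreshed
```

Under a policy like this, a video that crosses the threshold suddenly moves from the fast schedule to the slow one, which from the outside would look like the count getting "stuck".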
Balancing the work (“Why doesn’t this affect all videos?”)
Clearly a massive site like YouTube, getting so many views, needs more than one computer running to update these numbers. I’m going to assume that Google run this over their normal MapReduce system.
They may have tens, hundreds, or even thousands of computers running their view-counting algorithm (and I don’t expect to ever find out…), but all views for a video have to be counted by the same computer, so they need some way of splitting up the millions of views they have recorded into batches of work to be done.
They almost certainly do this using some form of hash function. You can picture this as every video on YouTube being assigned to one of a number of buckets, with each bucket having its views processed on the same machine (or at the same time).
The problem comes when a hash function doesn’t split up the items equally (i.e. one “bucket” has significantly more or fewer videos in it than another). This appears to be the problem here: only some videos have been affected, and my assumption is that one of these “buckets” has ended up with far more views than the others, meaning that one set of machines (or one job) gets overloaded and ends up being incredibly slow.
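The bucketing idea, and how it can go wrong, can be sketched in a few lines. Everything here is an assumption for illustration: the bucket count, the use of MD5 as the hash, and the made-up view data. Nothing is known about the scheme YouTube really uses.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 4  # assumed; the real number of machines or jobs is unknown


def bucket_for(video_id, num_buckets=NUM_BUCKETS):
    """Map a video id to a bucket so all of its views land in the same place."""
    digest = hashlib.md5(video_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def load_per_bucket(view_counts, num_buckets=NUM_BUCKETS):
    """Total number of views each bucket has to process."""
    load = Counter({bucket: 0 for bucket in range(num_buckets)})
    for video_id, views in view_counts.items():
        load[bucket_for(video_id, num_buckets)] += views
    return load


# Made-up data: 1,000 small videos plus a single very popular one.
view_counts = {f"video-{i}": 50 for i in range(1000)}
view_counts["viral-hit"] = 500_000

load = load_per_bucket(view_counts)
heaviest = max(load, key=load.get)
```

Even though the hash spreads the videos themselves fairly evenly, the work is measured in views, so whichever bucket receives the popular video ends up with most of the load, and every other video sharing that bucket waits behind it.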
Lessons Learned
For me, working with a system similar to the one above, the number one lesson is this: for tasks that may be very sensitive to the choice of hash function, it’s not safe to assume that a hash function that is good in theory will remain good in practice.
I don’t know if they are able to, but the situation would be better if YouTube chose the hash function at the beginning of each main job. That is, each time they run the main script that updates the information on the dashboards, they would choose a different hash function. This way, if a video ends up in an overloaded bucket one time, it will most likely end up in a different bucket next time (which shouldn’t be overloaded).
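One simple way to get a "different hash function" for each run is to mix a per-job salt into the key before hashing. The sketch below uses the run number as the salt and SHA-256 as the underlying hash; both choices, along with the function names and the bucket count, are assumptions for illustration only.

```python
import hashlib


def salted_bucket(video_id, job_salt, num_buckets=16):
    """Bucket assignment that depends on a per-job salt (hypothetical scheme)."""
    key = f"{job_salt}:{video_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def buckets_over_runs(video_id, num_runs, num_buckets=16):
    """Buckets a video lands in when each run uses its run number as the salt."""
    return [salted_bucket(video_id, run, num_buckets) for run in range(num_runs)]


# Within one run the assignment is stable, so all views of a video stay together...
same_run_a = salted_bucket("some-video", job_salt=7)
same_run_b = salted_bucket("some-video", job_salt=7)

# ...but across runs the video moves between buckets.
history = buckets_over_runs("some-video", num_runs=50)
```

The salt has to stay fixed for the whole of a single run, otherwise views of the same video would be scattered across machines; it only changes between runs.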
Of course, this is all theoretical, and is based on a large number of assumptions – YouTube may perform their hashing at a far earlier stage, and they may not be able to change the hash function each time they run the job.
Tim Wintle