YouTube recommendations and “children’s” videos

TED talk: https://www.youtube.com/watch?v=v9EKV2nSU8w

Toys Trek video: https://www.youtube.com/watch?v=odd04DhsnMM

When a user browses YouTube, the site recommends videos that “go well” together. That is, it recommends videos similar to the ones the viewer has been watching, though it will trade away some of that similarity for a video that is very popular. This is reasonable: a viewer who watches one video will probably like similar ones, and a popular video is one that more viewers are likely to enjoy. Correspondingly, YouTube tends not to recommend videos that are unpopular or dissimilar to what the viewer watches. This is also reasonable, since there is no point in recommending such videos when YouTube already knows which ones will earn it views.
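To make the trade-off between similarity and popularity concrete, here is a minimal sketch. It is not YouTube's actual formula, which is not public. The function names, the 0-to-1 scales, the weight of 0.3, and the sample videos are all assumptions made up for illustration.

```python
def recommendation_score(similarity, popularity, popularity_weight=0.3):
    """Blend two signals, each assumed to lie in [0, 1].

    popularity_weight is a made-up knob: the higher it is, the more
    similarity the engine is willing to sacrifice for a popular video.
    """
    return (1 - popularity_weight) * similarity + popularity_weight * popularity


def recommend(candidates, top_n=2):
    """Return the titles of the top_n highest-scoring candidates.

    candidates is a list of (title, similarity, popularity) tuples.
    """
    ranked = sorted(
        candidates,
        key=lambda c: recommendation_score(c[1], c[2]),
        reverse=True,
    )
    return [title for title, _, _ in ranked[:top_n]]


# Hypothetical candidates for someone who has been watching carpentry videos.
candidates = [
    ("Dovetail joints for beginners", 0.9, 0.2),  # similar, not very popular
    ("Viral music video", 0.3, 1.0),              # dissimilar, very popular
    ("Obscure tax seminar", 0.1, 0.05),           # dissimilar and unpopular
]

print(recommend(candidates))
```

The similar video and the very popular video both make the cut, while the video that is neither similar nor popular is left out.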

We can then think of YouTube’s videos as a massive network, where two videos are connected if they “go well” together. We can imagine this network as having large clusters of videos on similar topics. For example, videos on carpentry will probably be clustered together, videos on gaming will probably be clustered together, and so on.
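As a rough sketch of this picture, we can store the “goes well with” links as edges and treat each connected group of videos as a cluster. Real clusters would be fuzzier than connected components, so this is a simplification, and the video names and links below are invented for the example.

```python
from collections import defaultdict, deque


def find_clusters(edges):
    """Group videos into clusters, treating each connected component
    of the "goes well with" network as one cluster."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    clusters = []
    for start in list(graph):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        seen.add(start)
        queue = deque([start])
        cluster = set()
        while queue:
            video = queue.popleft()
            cluster.add(video)
            for neighbor in graph[video]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(cluster)
    return clusters


# Hypothetical links between videos that "go well" together.
edges = [
    ("table build", "chair build"),
    ("chair build", "shelf build"),
    ("speedrun", "boss fight"),
]

for cluster in find_clusters(edges):
    print(sorted(cluster))
```

Here the three carpentry videos end up in one cluster and the two gaming videos in another, because no link joins the two groups.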

But who decides which videos “go well” together? On one hand, it is the viewers, since they determine the popularity. But when a video is first put onto YouTube, who decides? In this case, it is a complex algorithm that looks at the title, the description, the channel, and maybe even some of the video, and then says, “Yes, this goes well with carpentry,” or “Yes, this goes well with gaming.” Of course, the algorithm doesn’t actually have any idea what carpentry or video games are. It really just mashes the data it receives together and places the video “close” to videos it thinks are similar.

When I put it this way, it seems like a wonderful system. No human curation! No man-hours wasted! The recommendation engine JUST WORKS! But unfortunately, this engine had – or maybe still has – a major flaw. 

This flaw is simple. You can trick the algorithm. Consider the cuckoo bird. The cuckoo lays its eggs in other birds’ nests. The bird that owns the nest looks at the cuckoo egg and thinks, “It looks egg-shaped and is in my nest, so it must be one of my eggs.” In this way the cuckoo can have offspring without ever having to care for them. In the same way, content creators can and do mash words, tags, and video content together to trick the algorithm into thinking, “Ah, yes, this MUST belong in carpentry!” when in reality it doesn’t.

But this trick is most prevalent not in carpentry but in children’s videos, specifically videos aimed at very, very young children. So you end up with titles like “NEW 101 SURPRISE EGG OPENING PAW PATROL MOANA COCO SHOPKINS PJ MASKS MICKEY DISNEY MLP MARVEL PEPPA” (this one is courtesy of the channel Toys Trek), which serve only to steer the algorithm into planting the video firmly in the “kids’” cluster of YouTube (as illustrated by the TED talk).
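Here is a toy illustration of why a title like that works. It assumes the algorithm compares word counts using cosine similarity, which is a common textbook technique and not necessarily what YouTube does. The “kids’ cluster” word list and the plain title are invented for the example; only the stuffed title comes from the Toys Trek video.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical vocabulary standing in for the kids' cluster.
kids_cluster = bag_of_words(
    "peppa pig paw patrol surprise egg disney mickey toys"
)

# A plain, made-up description of the same kind of video.
plain_title = bag_of_words("opening some plastic eggs on my desk")

# The keyword-stuffed title quoted above.
stuffed_title = bag_of_words(
    "NEW 101 SURPRISE EGG OPENING PAW PATROL MOANA COCO SHOPKINS "
    "PJ MASKS MICKEY DISNEY MLP MARVEL PEPPA"
)

print("plain:  ", cosine_similarity(plain_title, kids_cluster))
print("stuffed:", cosine_similarity(stuffed_title, kids_cluster))
```

The plain title shares no words with the cluster, so it scores zero, while the stuffed title shares many of them and lands much “closer” to the kids’ cluster, even though the score says nothing about what is actually in the video.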

That video on its own is honestly fine. It is just someone opening eggs and other containers. The problem arises when people use these same techniques but also lace their videos with imagery that is disturbing and highly inappropriate for the target audience. These videos show things like animated gore, cartoon characters engaging in violence or being victims of it, and characters in sexual situations (also shown in the TED talk). And because the algorithm has no real intelligence, it plops these videos firmly in the kids’ section.

These content creators seem to follow a clear formula. First, they load every aspect of their video and its posting with content that will signal to the algorithm that it belongs in the kids’ section. Second, they fill their video and posting with content that suggests or explicitly depicts violent or sexual themes, which serves to “clickbait” kids into watching it. Third, they sit back and watch that sweet, sweet ad revenue roll in.

This phenomenon of disturbing “children’s” videos was termed “Elsagate”. Back when it was big, YouTube tried to clear out most of the inappropriate videos. And it seems to have worked… mostly. Still, it serves as a word of caution: in a network, be careful of what connects to what.

