Why use Captions and Subtitles for your Video

Ahmed Khalifa with Jason Barnard at BrightonSEO September 2019

Ahmed Khalifa talks with Jason Barnard about why you would use captions and subtitles for your video.

 
Firstly, accessibility. All sorts of people benefit from captioned / subtitled videos, not just deaf people, says Ahmed Khalifa: non-native speakers, people watching with the sound off, viewers faced with speakers whose accents aren’t clear (think Glaswegian 🙂) … and some people just like to read along. Automatic captions need to be corrected. Machines simply cannot get everything right (especially the scene directions and ambiance descriptions).

Apparently, professionals can write captions almost in real time, including descriptions about background noise. I personally hadn’t thought about how important that can be for context. It’s not just what we say, but the context in which we say it.

Ahmed is deaf, and relies on captions. But all sorts of people benefit from captioned videos… Of course, we get onto Glaswegian accents. Heads up Craig Campbell. Auto captioning is far from perfect. It needs to be corrected. Then we get onto pushing that to transcripts and adapted transcripts. Apparently, professionals can write captions almost in real time, including descriptions about background noise. Guess who hadn’t thought about how important that can be for context. It’s not just what we say, but the context in which we say it. I speak too much without engaging my brain. Ahmed saves the day and makes me sound intelligent. Phew !

Jason Barnard
I’m going to reread your name to make sure I get it right. #SEOisAEO, welcome to the show, Ahmed Khalifa.
Ahmed Khalifa
Actually, yeah, that’s quite an intro. People should do that more often.
Jason Barnard
Yeah, but the shame was I actually had to read it.
Ahmed Khalifa
I don’t blame you for that.
Jason Barnard
Thank you. Lovely to meet you. I believe you started your career in Brighton.
Ahmed Khalifa
Kind of. I did have an agency experience in Worthing. So I lived there for like a year and I did agency work. I did a few things building up to it, then from then on, I just kept up the momentum.
Jason Barnard
And you worked for a company called Fresh Egg? My first thought when I saw Fresh Egg was rotten egg.
Ahmed Khalifa
My first thought was some kind of farm or something. And that was about seven or eight years ago now. So it’s been a while.
Jason Barnard
Okay. Right. Brilliant stuff.
 

Video Captions

Jason Barnard
You were talking about that earlier on, and shamefully I didn’t see your talk… but you gave me a quick idea of what it was all about. Captions on videos and how rubbish they can be when they’re auto-generated on YouTube and people idiotically leave them as they are.
Ahmed Khalifa
Pretty much. And they don’t give it a second thought.
Jason Barnard
Yeah. And you were telling me why that’s particularly interesting for you?
Ahmed Khalifa
Well, I mean, if I’m going to go from my personal experience, I depend on captions because I’m deaf and I depend on captions just to access videos. But then even if it’s not for me, it can be for people for whom English is not their first language. They’re learning the language. Or it could be some kind of learning disability and they need captions to keep up. It could be attention deficit disorder. It could even be for all of us, because you may be on urban transport or maybe you’re in a library or whatever, and you just want to watch the video in silence.
Jason Barnard
Or it could be that the person speaking has a really, really thick Glaswegian accent like Craig Campbell.
Ahmed Khalifa
That is one thing that I mentioned in the talk. I had the example of Kevin Bridges, the comedian. He’s from Glasgow, and the auto captions struggled to get it right – and it was interesting what they came up with. So, of course, a strong Glaswegian accent can affect things (smiles).
Jason Barnard
That leads us really neatly onto the idea that if you make a video, do the transcript. Stick it on your site, then you get both the text and the video – remember, people prefer reading to watching videos. People are talking about this more and more in this industry. Is that a great strategy moving forward?
 

Transcripts and Video

Ahmed Khalifa
It is. I mean, it’s a way for you to target both YouTube search and web search. They’re two separate things. If you have captions and you’ve done it correctly, you’ve done all the editing and it is clean and accurate and correct, then great – YouTube will be able to understand better what the video is about. Then you can use the captions and turn them into a transcript and that can be your blog post where you can embed the same video. So then you target the web search as well. So with one video, you can target both web search and YouTube search.
Jason Barnard
I suppose if somebody exported the auto-thingy – sorry, I can’t remember what it’s called – (captions) and put it on a page, they would read it and they would think, yes, I do look stupid.
Ahmed Khalifa
You’d be surprised. I think more often than not, if you export auto-generated captions, it is not possible to read it.
Jason Barnard
Yeah. I tried doing auto texts for the podcast using Google’s language analysis thing, and it was just rubbish.
Ahmed Khalifa
It’s not there for a complete transcript. You’re not replacing people who do it manually. It gives you a head start and then you can edit manually. So it will save some time. Doing it word by word – manually stop and type in, play, listen, stop it, play it – that’s gonna take forever. Although there are experts out there who can do it quite quickly. They can do it in real time, more or less.
Jason Barnard
I’ve got a mate who does English subtitles for French films. And he says he more or less types it in real time.
Ahmed Khalifa
People can do that. It’s quite impressive.
Jason Barnard
But you must be typing something slightly after you’ve heard it and still trying to listen to the current words.
Ahmed Khalifa
And not only that… you may have to put in, for example, an explosion in the background, a car crash or whatever. So it’s not just about what the person said, it’s about the sound effects. So that person will also have to think about that – the sentence ends, new line, then car crash or explosion or whatever.
Jason Barnard
And of course YouTube auto-captioning doesn’t do that.
 

Captions and Context

Ahmed Khalifa
No, it doesn’t. So you’ve got no idea if there’s been a crash in the background. Not at all. And that was one of the arguments I gave in the talk I did today. For example, if you are in a room and there’s a background noise of some wind… that could mean anything. It could be the wind outside. A very windy day. It could be the wind noise coming from a TV. It could be wind noise coming down the chimney and out of the fireplace.
Jason Barnard
Oooh ! Context.
Ahmed Khalifa
Yeah, that’s what I talk about a lot. It’s about context. You have to provide context – we communicate not just by content, but with context as well. And that’s the whole point of re-touching, rewriting. It’s not just the words, you need to really understand the emotion, the story, the context of that video.
Jason Barnard
Oh, that’s brilliant. So it’s not just the words we’re saying, it’s the context. And when you watch subtitled films – I live in France and we’ve got quite a lot of subtitled films – you see all these stage directions and descriptions of what’s going on.
Ahmed Khalifa
One of the most common things people say is “Oh, I hate that channel, I don’t really watch it”, but for some reason it’s okay for them when it’s a foreign film. Okay. Well, it’s kind of a similar concept because if you can accept it in a foreign film, then why can’t you accept it in your own spoken language?
Jason Barnard
And I would imagine that one of the things that stops brands doing the subtitles themselves is cost.
Ahmed Khalifa
Yep. And that’s a false economy. It’s very false, because there are very cheap ways of doing it. First of all, you can do it for free. You can upload your video on YouTube, have them auto-generated and then edit them.
Jason Barnard
Yup. That’s free.
Ahmed Khalifa
And there are AI machines that do it for you for 10 cents a minute.
Jason Barnard
Can I just ask a quick question? The services with AI machines doing it for 10 cents a minute… are they better than YouTube?
Ahmed Khalifa
Very similar. I couldn’t tell the difference.
Jason Barnard
So YouTube are pretty good within their field, but just not good enough for humans.
Ahmed Khalifa
I just can’t see robots replacing humans in terms of being able to interpret emotions and context. You know, how can you expect them to do all the sound effects correctly? I just can’t see that.
Jason Barnard
I like the idea of emotions because as well as the words you’re using or the… Oh no, that’s completely the wrong idea. I was thinking about translations. I was getting very confused.
Ahmed Khalifa
The idea of interpretation IS there. There is some overlap.
Jason Barnard
Oh good! Help me sound intelligent.
Ahmed Khalifa
I can’t do that 🙂
Jason Barnard
So the idea is that there’s overlap between translations and interpreting and subtitles.
 

Interpretation and Accuracy

Ahmed Khalifa
Yeah. People interpret what one person said to another…
Jason Barnard
That was my point. When you’re doing the subtitles, you can simplify it. For example, when I did do the transcripts for this podcast, the first thing I did was cut out stuff because it was redundant and change the sentences slightly so that they read better. Is that a bad idea?
Ahmed Khalifa
I think it’s a good idea to do that for transcripts, but not for video subtitles. When you read a transcript, you don’t really want to read too much of the “ummmmmm. Um”… there’s not really any point. And even to some extent you don’t wanna leave too much of that in the subtitles. Put those hesitations in, but don’t go overboard. The point I made is that the captions must be true verbatim. So if that person started a sentence like, “I, I’m, I don’t, I don’t know”… Put that in. You need that in the caption because it lets you understand that the person hasn’t got the message out yet. If you write it cleanly, then it’s completely different to what that person said. But I do think it’s important to remember that the way people read captions and subtitles is different to the way people will read a text.
 

Repurposing Video

Jason Barnard
Yeah. So we’ve actually ended up with three levels. We’ve got the video with the audio, we’ve got the captions, and then we’ve got the transcript. It’s basically three different versions of the same thing.
Ahmed Khalifa
Kind of stacked on top of each other. And you’re asking about whether it’s a good thing or not. It’s great because you’ve given people the option either to watch your video or to read a transcript. You said you prefer to read over watching a video… so you want to do the same thing that you do for any blog post – get the headings, get the paragraphs and organise. Don’t leave one big chunk of text and just throw it on the blog. You want to make an effort to make it easy to read.
Jason Barnard
Yeah. And also easy to digest for Google. When you’ve got the transcript in your page, Google sees the video, but it digests the transcript much more easily, much more cheaply. And I think people forget that there is a question of resources. It costs Google money to analyze that video. It can, of course… and then when it gets the transcripts wrong, that implies that it’s misunderstood. And the transcript will actually set it right.
Ahmed Khalifa
Well it is true that Google has to work extra hard to understand what that content is about in that page. But even if you have everything in there, if it’s not correct, it’s still going to struggle to understand what it’s all about. Accuracy is important. And I’ve mentioned the fact that with auto-captions, on average one in three words are wrong…
Jason Barnard
One in three???
Ahmed Khalifa
One in three on average. To be more exact, between 60 and 80% of the words are correct.
Jason Barnard
Ah ! But that also depends on the very strong Glaswegian accent.
Ahmed Khalifa
Well, I mean, if you’re Glaswegian, let’s make that two out of three words incorrect or something 🙂 But yes, accents are a big point – you can’t expect everyone to understand that.
Jason Barnard
Yeah. I think I’ve got a neutral accent. But people have told me it’s not true. And in fact, Google had quite a lot of trouble understanding me when I did the transcriptions. So even a slight accent becomes a problem.
Ahmed Khalifa
I think so. I think so.
Jason Barnard
Google is also American accent centric, no?
Ahmed Khalifa
Well, they do have options. When you go to YouTube and you upload it, you can see options: English UK, English US, English Australian, English Canada and so on… I don’t know if it’s true or not, but I’m assuming that they’re able to differentiate at least the text format, but they’re not going to be able to differentiate when they hear it. They can’t do that. In the future, one day maybe, but right now they can’t. Even for me, I’ve got like various accents, so for them it’s a nightmare. And I know every time I have anything transcribed or captioned, I have to check it.
 

Video is Rather Great Content

Jason Barnard
Yeah. Now, moving onto the next topic, which is the idea of video as a source of content… and video is becoming more and more important. Listening to Gary Illyes about what kind of things he’s interested in right now – it’s images and video. He’s completely obsessed by it, and I think that’s a BIG signal to us all that blog posts are maybe dying out. I’ve been looking at the idea that basically your blog post is competing for the blue links, but video can come in and leapfrog that content.
Ahmed Khalifa
It can, yeah. I’m an advocate of videos, not only because of SEO, but it just makes my website more human because I put my face on there. So we’ve talked about captions, and the benefits of those… but there are many other benefits. Everyone is jumping on the bandwagon. And rightly so, because it’s a big thing. Just bite your tongue and do it. Obviously do all the things for SEO, like structuring your page, having the images, the text – all the things that make it a multimedia page and video plays a part. But beyond that, it’s also a part of my sales funnel because if I want to talk about a topic and my audience can see my face and I talk to them in person… I kind of invite them in and that’s more powerful than reading the text. If you can talk to them face to face like that in the video, I think that you can really connect with your audience a lot better than just depending on blog posts.
Don’t get me wrong, there are people who write amazingly well, and they can connect amazingly well with just blog posts, but there’s something about videos that you can’t replicate elsewhere.
 

Videos and Brands as Entities

Jason Barnard
You put a face to your brand, which is incredibly important. But one thing I’ve been talking to my clients about is – don’t just have one face for your brand because if that person leaves your company, your face has gone. So if you’ve got three, maybe more, if one leaves you can replace them with somebody else and the effect is a lot less.
Ahmed Khalifa
Multiple people, yes. This person does that, this person does this. Why not? If you are a solopreneur, fair enough, it’s only you. But if you have multiple businesses or experts in different fields, then why not have different people talking?
Jason Barnard
And then E-A-T. If you’ve got a site or a brand that covers several different domains, you need several experts. You need an expert who is an authority in each specific domain. And the same would be true of video.
Ahmed Khalifa
Yes, because you can’t just get a person who is an SEO to talk at the same level as a person who is a developer. Two different computer languages, two different contexts. You can’t ask someone to pretend to be a developer on video. People can see through that. People are getting smarter and smarter – you can’t fake it online as much as you could maybe 20 or 30 years ago. I think videos can add a level of honesty and transparency. They make you more trustworthy when you actually talk to people in the video. So that’s why I feel it’s becoming a big thing now.
Jason Barnard
And now also the fact that people interact with shares and so on is a signal that people are actually interested in your brand. And then if you have the video boxes on your brand SERPs, that’s a really strong signal to your audience that you have a face, a message.
Ahmed Khalifa
That you’re an entity. And that’s the big one, isn’t it? Google is more and more about who you are: whether personal or a brand, whatever. That’s been discussed today a few times at BrightonSEO – that Google looks at entities more and more now. This website is that entity and that brand… And content
Jason Barnard
And another thing about videos – I’m writing a blog post. Let’s say that takes eight hours. If I do a video, I’m just recording myself saying more or less what I would’ve written anyway. And that takes me, perhaps an hour and then a couple of hours to clean it up. So it’s much quicker.
Ahmed Khalifa
It can be. People think that it takes forever. I think it’s because they aim for perfection and say, “Oh, I’ll do it again. Oh, I’ll do it again”. Deleting, again and again… and that’s the problem. Don’t keep doing that. First of all, you can practice basic editing, and the takes can jump from one scene to another. That makes it easier. But I know some people out there who write their blog posts just by speaking into their online Dictaphones.
Jason Barnard
I tried that the other day.
Ahmed Khalifa
In some posts I think that can work; in some other types of content, I’m not sure. I’m trying to think of an example… Right, how to install Google Analytics on your site. I think it’s a bit tricky because you need to see everything, you need to see the images. But if you’re going to tell a story, you can record it to video or just audio in five minutes.
Jason Barnard
You were also saying that people are scared to put themselves on camera because they might look ugly. And then the idea that you have to be incredibly professional… but I was talking to Anders Hjorth about that and he’s using a cell phone and he just records himself two minutes every day to promote his business.
Ahmed Khalifa
That’s all you need. I started with my smartphone.
Jason Barnard
I’m sorry, I don’t even know what you’re doing in video 🙂
Ahmed Khalifa
I talk a lot about WordPress SEO, but nowadays I talk more about deaf awareness … and that’s my main topic on YouTube.
Jason Barnard
Spreading awareness.
Ahmed Khalifa
Branding is awareness. Learn how to use video. Learn how to communicate. Learn how to use video to your advantage in your business. There are so many benefits of videos.
Jason Barnard
Well, in SEO at the moment, we’re all talking about entities. We’re going to also move more to content. I’ve started saying to my clients “You want to write a blog? Why?” And the answer is “because that’s how it works”. But that’s not good enough… make a video and you’ve got some rich content that gets your message out there and you might even get one of the rich elements in the SERPs.
 

Video Quality

Ahmed Khalifa
I think people put too much emphasis on the idea that it has to be a hundred percent professional video. When you look at earlier videos from any influencer, any big name YouTuber out there, you’ll be shocked. You think that they actually did that. But they got started with a smartphone or a simple pocket camera or whatever, and they’ve got better and better and better. And that’s the thing. You have to get started. What’s my best advice? Get started. That’s the only way you can get better.
Jason Barnard
But if you’re a big brand or a decent sized brand, you’re thinking “I can’t be unprofessional. I have to have this professional image”. But then I bought a Sony reflex camera that has an absolutely amazing image, this Zoom sound recorder and microphones that give incredibly good quality. Total cost 1,500 €. And if you’re a decent size brand, you’ve got professional quality for a reasonable cost if you’re willing to put the time in to learn how to use it and to do it properly.
Ahmed Khalifa
Yes, but it’s one thing having professional quality videos, but if you don’t have the right content in the video, that isn’t going to work. And also, people underestimate audio quality… in a lot of cases audio is more important than video quality.
Jason Barnard
Brilliant. David Bain was talking about the five stages of podcasting, and you’re saying you start with just sound. You get the sound right, and then you build up to video.
Ahmed Khalifa
You can do that. I started podcasting. I do that as well, about deaf learning, and sometimes I do both at the same time, record video and podcast. And I record sound in two different ways because sometimes the quality of the audio for one use is not great for another… and also I just want to have backups. But I’ve been told so many times, audio quality in videos is more important than image quality. Once you’ve got that, then you can get the visual quality right.
Jason Barnard
That was a brilliant interview. I really enjoyed talking about that. Great. SEOisAEO, thank you, Ahmed.
Ahmed Khalifa
I need to get more people doing that. I love it. Thank you so much for having me. Brilliant.

By Jason BARNARD

Jason Barnard has over 2 decades of experience in digital marketing.

He currently teaches Brand SERP optimisation to students at Kalicube.pro and writes regularly for leading marketing publications such as Search Engine Journal, SEMrush, OnCrawl, Searchmetrics as well as appearing regularly on digital marketing webinars and speaking at major conferences around the world such as BrightonSEO, PubCon, SMX London, YoastCon.