The new article series by FabricAI

What can AI tell us about the headlines used in online media?

New year, new me. A phrase heard perhaps one too many times? January is almost over, and some New Year’s resolutions may already have been broken, if there were any to begin with. New Year’s resolutions can be seen as a big cliché, but that doesn’t stop us. This year we at FabricAI have a new goal that we want to share with you.

This year our goal is to share more information and our thoughts about artificial intelligence and about its interesting possibilities.

This topic is close to our hearts and we want to share our passion with others too.

FabricAI is publishing a new article series that focuses on artificial intelligence and on how it can be implemented successfully with the help of FabricAI solutions. The articles cover a range of interesting perspectives and are published weekly on the company’s website.

Yet before we dive into the deep end, let’s start with something less intense and go back to where it all started: the love and passion for AI and for solutions that actually work. AI and its many possibilities can seem a bit intimidating, perhaps something that only IT professionals understand.

But this isn’t necessarily the case.

We want to prove that this topic doesn’t have to be complicated. Instead, AI can be used to investigate things in our everyday lives too. That said, let’s begin with something that we all face every day: the style of headlines used by online news media.

In the first part of the new article series, FabricAI’s Head of AI, Juhani, shares how he used artificial intelligence to investigate the headlines of online news.

I bet many of us have noticed how long the headlines of online news articles have become. Lengthy headlines are used in the hope of sparking the reader’s interest, and unfortunately the essence or main idea of the news article is left on the sidelines at the same time.

However, this hasn’t always been the case.

I researched this topic for the first time five years ago. Back then I gathered 200 00 headlines from the website, along with their categories, publishing dates and click counts. Even then, my analysis made it clear that headline length correlated directly with the number of clicks, all the way up to 120 characters.

The longer the headline, the more clicks it receives.
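For readers who want to try a similar check on their own data, here is a minimal sketch of how a length-versus-clicks correlation could be computed. It is not the code used in the original analysis: the function names, the `(headline, clicks)` row layout and the choice of the Pearson coefficient are all assumptions made for illustration. Only the 120-character cutoff comes from the text above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def length_click_correlation(rows, max_chars=120):
    """Correlate headline length (in characters) with clicks.

    rows: iterable of (headline, clicks) pairs (hypothetical layout).
    Headlines longer than max_chars are ignored.
    """
    kept = [(len(headline), clicks) for headline, clicks in rows
            if len(headline) <= max_chars]
    lengths = [length for length, _ in kept]
    clicks = [c for _, c in kept]
    return pearson(lengths, clicks)
```

A value close to 1 would mean that longer headlines tend to collect more clicks; a value near 0 would mean there is no linear relationship.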

Still, I was intrigued enough to dig a little deeper into this subject. I headed to the website of Helsingin Sanomat, the biggest news outlet in Finland, investigated their headlines with the help of artificial intelligence and made some pretty interesting findings. For material, I gathered 1 355 087 of their online news articles, published between 1 January 2000 and 31 May 2020.

What kind of results were found with the help of AI?

From the collected material I found the following about the length of the headlines:

  • For the first 12 years of the material, the headlines were less than five words long.
  • The length of the headlines began to rise at the beginning of 2012.
  • From 2012, the length of the headlines grew by one word a year all the way to July 2018.

In July 2018 the average headline was as long as 13 words. Since then the headlines have shortened a little, and the current average length is a bit over 11 words.
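The averages above boil down to counting words per headline and grouping by time period. A minimal sketch of that idea follows. It assumes the headlines are available as `(period, headline)` pairs, where the period is any sortable label such as a year or a `"2018-07"` string, and it counts words by splitting on whitespace. Both the data layout and the function name are hypothetical.

```python
from collections import defaultdict


def mean_words_per_period(rows):
    """Average headline length in words for each period.

    rows: iterable of (period, headline) pairs (hypothetical layout).
    Returns a dict {period: mean word count}, ordered by period.
    """
    totals = defaultdict(lambda: [0, 0])  # period -> [word sum, headline count]
    for period, headline in rows:
        totals[period][0] += len(headline.split())
        totals[period][1] += 1
    return {period: words / count
            for period, (words, count) in sorted(totals.items())}
```

Plotting the returned values over time is one way to make a trend like the one described above visible.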

The length and especially the content of the headlines have changed significantly over the years

And not necessarily for the better…

The headlines used to be compact and told the reader the most relevant aspect of the news right away. Nowadays the main purpose of a headline seems to be sparking the reader’s interest and possibly making the reader feel some sort of way: confused, angry, happy and so on. Unfortunately, compact headlines and sparking the reader’s interest don’t seem to go hand in hand, and therefore we are faced with mile-long headlines on a daily basis.

The number of pronouns used in the headlines has also changed over the years:

  • The use of pronouns has increased significantly.
  • Headlines with the pronouns “who” or “what” are now four times more common than they were before 2012.
  • Before 2012 the share of pronouns was 0.5 percent, whereas now it is over two percent.
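A share like this can be measured with a simple word count. The sketch below computes the fraction of headlines that contain at least one word from a given set. Note the assumptions: the text does not say whether the percentages refer to headlines or to individual words, so the per-headline definition is a guess, and the English word set is only illustrative. Finnish pronouns inflect, so real Finnish headlines would need lemmatization or a list of inflected forms.

```python
import re

# Assumption: an illustrative English word set, not the list used in the analysis.
QUESTION_WORDS = frozenset({"who", "what"})


def share_with_words(headlines, words=QUESTION_WORDS):
    """Fraction of headlines containing at least one of the given words."""
    headlines = list(headlines)
    if not headlines:
        return 0.0
    hits = 0
    for headline in headlines:
        # Lowercase and split into word tokens so punctuation does not matter.
        tokens = re.findall(r"\w+", headline.lower())
        if any(token in words for token in tokens):
            hits += 1
    return hits / len(headlines)
```

Running the function separately on headlines from before and after 2012 would give the two shares to compare.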

How did the headlines succeed?

Even though the headlines are now longer, that doesn’t mean they are better or that they succeed right away. There seems to be a bit of A/B testing going on in online news headlines as well. I investigated this aspect by placing a sniffer on the website. The sniffer tracked the published articles and their headlines for 12 hours.

Based on the gathered data this is what I found out:

  • 59 articles were posted to the website and updated afterwards.
  • The most common reason for updating an article was fixing misspellings and typos.
  • In addition, a large number of the updates included changes in sentence structure, made by either a human or a computer.
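To illustrate how updates like these could be detected, here is a minimal sketch that compares two snapshots of the headlines and gives each change a rough label. It is not the sniffer used above. The snapshot layout (`{article_id: headline}`), the function names and the 0.9 similarity threshold are all assumptions, and fetching the pages is left out entirely. String similarity comes from Python’s standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def changed_headlines(before, after):
    """Find articles whose headline differs between two snapshots.

    before, after: dicts {article_id: headline} (hypothetical layout).
    Returns {article_id: (old_headline, new_headline)}.
    """
    return {
        article_id: (before[article_id], new)
        for article_id, new in after.items()
        if article_id in before and before[article_id] != new
    }


def classify_change(old, new, typo_threshold=0.9):
    """Rough label: near-identical strings look like typo fixes, others like rewrites.

    The threshold is an arbitrary assumption and would need tuning on real data.
    """
    similarity = SequenceMatcher(None, old, new).ratio()
    return "typo fix" if similarity >= typo_threshold else "rewrite"
```

A label like this only describes how much the text changed. It says nothing about who or what made the change.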

This made me wonder whether the main goal was simply to publish the news articles as fast as possible and leave the checking of grammar and spelling until later. I couldn’t find a clear answer to this, but there were definitely signs pointing that way.

Who makes the changes afterwards, a human or a machine?

Based on the changes in sentence structure, it’s hard to say whether a change was made by an actual person or by a machine. The main goal of these changes seemed to be to increase interest in the headline and to heighten the urge to click the news article open.

Is there some deeper meaning that can be found from this analysis?

Based on what was presented above and on commonly known facts, I dare to claim that everyone hoping to get more views and clicks on their site should make their headlines longer. Well… in reality maybe not everyone, since then we would all have to deal with more and more annoyingly long headlines, but you get the point.

I believe a simple automated “play-the-winner” system would easily increase article views. With such a system, each article would have two or three alternative headlines, and the one that performs best would be the one shown.

On the other hand, it could be that Helsingin Sanomat already has a similar system in use.
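As a rough illustration of the play-the-winner idea, the sketch below keeps view and click counts for each alternative headline and usually shows the one with the best click rate. The selection rule used here is a plain epsilon-greedy strategy, which is one possible way to build such a system and not necessarily what any news outlet uses. The class name, method names and the default `epsilon` are hypothetical.

```python
import random


class HeadlinePicker:
    """Epsilon-greedy choice between alternative headlines for one article."""

    def __init__(self, headlines, epsilon=0.1, rng=None):
        if not headlines:
            raise ValueError("need at least one headline")
        self.stats = {h: [0, 0] for h in headlines}  # headline -> [views, clicks]
        self.epsilon = epsilon  # share of views spent on random exploration
        self.rng = rng or random.Random()

    def click_rate(self, headline):
        views, clicks = self.stats[headline]
        return clicks / views if views else 0.0

    def best(self):
        """Headline with the highest observed click rate so far."""
        return max(self.stats, key=self.click_rate)

    def choose(self):
        """Pick the headline to show to the next reader."""
        # Show every headline at least once before trusting the click rates.
        for headline, (views, _) in self.stats.items():
            if views == 0:
                return headline
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.stats))
        return self.best()

    def record(self, headline, clicked):
        """Store the outcome of one view of the given headline."""
        self.stats[headline][0] += 1
        if clicked:
            self.stats[headline][1] += 1
```

In use, the site would call `choose()` for each reader and `record()` once it knows whether that reader clicked. The small share of random views keeps the weaker headlines from being written off after a few unlucky views.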

This kind of procedure could bring out the best of journalism and leave the tricks and clickbait behind. Journalists could focus on the content of the news first, and only if an article didn’t perform as desired would extra measures be taken.

Let’s not forget that optimization has a clear ethical dimension in journalism as well

If it were possible to create a so-called Perfect System that always knew what people want to read, news outlets could easily increase their advertising and perhaps their subscriptions as well. On the other hand, you have to remember that the media is subjective: online media would then show us only what we want to read and leave other, less popular topics behind.

If this were the case, the media and the news wouldn’t spark active conversation. A scenario like this wouldn’t benefit the media either, because the media would no longer fulfill its role as society’s watchdog.

How are the headlines in any way related to FabricAI?

Well, to be honest, there isn’t really any deeper connection. But look at it this way: whether the task is analyzing online news headlines or processing purchase invoices, both can be performed much more efficiently with the help of AI.

A similar analysis of online news headlines could be performed without AI, but let’s be real: gathering 1 355 087 headlines from a 20-year timespan would be extremely time-consuming or even unrealistic.

The same applies to purchase invoice processing.

Purchase invoice processing is one of the most time-consuming tasks of financial administration. The work can be performed manually by a person, but with the help of FabricAI the task can be done much faster and more efficiently.

With FabricAI you can process invoices up to 95 percent faster, so the accountant has more time to focus on tasks that bring more value and are more enjoyable.

Our goal is to decrease the amount of unprofitable work and unnecessary paperwork.

With the help of AI, it is possible to expedite processing, decrease costs and increase employee satisfaction. Sounds pretty good, if you ask me.

Head of AI
Juhani Tolvanen

So, what do you think? Any ideas? Thoughts? We’d love to hear them. Head on over to LinkedIn and join the conversation!

