I once did a web site audit for a major newspaper in Romania. I told them Google had some trouble reaching very old news on their web site (from 2005, for example). They had a calendar similar to their current one, and I told them Google might have problems going through the dropdown, selecting 2006, going through the dropdown again to select, say, March, then picking a day, and then, for that day, going to page 3. The path to an article on page 3 of the news from March 19, 2006, was the following: Home => Archive => 2006 => March => 19 => Page 3 => News. That is six clicks from the homepage, which is just huge. Also, the Google bots generally don't interact with dropdowns: they don't select options from them.
Nowadays, they display the archive as clickable links, which, although not a huge improvement, is still an improvement (Google can follow a clickable link much more easily).
This article shows how various news sources in Romania treat their old news.
Cotidianul.ro and EvenimentulZilei.ro (EvZ.ro)
Take this thread from May 2005. If you click on the link in the blog post, you are redirected to the homepage.
Take this blog post from 2003, whose link leads nowhere.
To give you an idea, I think both newspapers were, around 2003-2007, in the top 10 newspapers of their time. EvZ was a leader after 1989.
What have they done?
- They changed the domain name and the URL structure.
- They made no redirect from the old to the new URLs.
What should they have done?
- Set up an automatic redirect from each old URL to the new one.
- If the above step was impossible, they should have paid one person (who needn't have many qualifications, other than knowing a few things about operating a PC) to take all the articles on the old web site, search for them on the new web site, and create a big table with two columns: on the left the old URL of an article, on the right the new URL. Then they should have used this table to manually redirect old URLs to new ones.
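The two-column table described above can be turned into redirects with very little code. The following is a minimal sketch, not what either newspaper actually runs: the domains `old.example.ro` and `new.example.ro`, the CSV layout, and the function names are all assumptions made for illustration. It assumes the table holds full URLs (with `http://` or `https://`) and answers each old URL with a permanent (301) redirect when a match exists.

```python
import csv
import io
from urllib.parse import urlsplit


def url_key(url):
    """Normalize a full URL so 'www.' and trailing slashes do not matter."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    query = "?" + parts.query if parts.query else ""
    return host + path + query


def load_redirect_table(csv_text):
    """Read 'old URL,new URL' rows into a dict keyed by the normalized old URL."""
    table = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        old_url, new_url = (cell.strip() for cell in row)
        table[url_key(old_url)] = new_url
    return table


def resolve(old_url, table):
    """Return (status, location): 301 plus the new URL, or 404 and None."""
    target = table.get(url_key(old_url))
    if target is not None:
        return 301, target
    return 404, None


# Hypothetical table, as the person compiling it might export it.
SAMPLE_TABLE = """\
http://old.example.ro/stiri/123.html,https://new.example.ro/articol/123
http://old.example.ro/stire.php?id=124,https://new.example.ro/articol/124
"""

table = load_redirect_table(SAMPLE_TABLE)
print(resolve("http://www.old.example.ro/stiri/123.html", table))
print(resolve("http://old.example.ro/stiri/999.html", table))
```

In practice the same table could be exported to whatever the web server uses for redirects; the point is only that once the table exists, applying it is cheap.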
It was once one of the top 3 IT&C (Information Technology and Communications) magazines in Romania. They had a forum which was probably in the top 10 forums in Romania. They had a blogging community (made up mostly of their editors) which was quite active. They had a download section, with various software.
Now imagine all of this, and think about the following: they redirected their whole web site to a place where they sell, for a small price, some of their booklets. They did this because they closed the business.
But even from an economic point of view, they could have put some ads on the web site and kept it running (they surely had some hosting costs, but the traffic was probably earning more revenue than the hosting itself cost).
All of this is gone. Look at the above image (taken from here), and ask yourself whether this looks like a web site you would close just because of hosting costs.
“Dilema Veche” means “old dilemmas”. You would expect such a web site to put more emphasis on their archive, since you pay to view the majority of it, and since their subjects are usually evergreen (not news, but analysis and personal thoughts, that can be read years after an event, and still make sense).
They have a place in the login page, where they say:
I paid for an account and looked at pretty much every possible link in there, trying to find the place to download PDFs of their past issues (they mention PDFs in two places in the above list). I couldn't find any, so I emailed them. A person told me (as if it were the most natural thing in the world) that the PDFs are not currently available, but might be in the future. To me, it's like seeing an ad saying that the product is green, signing a contract, buying it, coming home and discovering it's blue, and then being told that this is perfectly natural (and that in the future it might turn out green, so I should keep my hopes high).
There are two (big) issues I found with the archive of Dilema Veche:
- I opened all the articles by Andrei Pleșu on Dilema Veche. He is the director of the magazine, and he probably has a huge audience, with some people (like me) subscribing just to read his articles. He has 500+ articles on the web site. Out of those articles, somewhere between 1 and 5 percent are either empty or contain only a sentence. If you open 30 of his (old) articles, you will find some that are empty. This, I think, happened due to either some poor copy & paste, or some problem when they imported their old web site to the new platform. The solution? Pay a low-qualified person a small fee, give him access to the old database, and instruct him how to open each article on the new web site and how to import the content from the old site. It takes about 3 seconds to check an article and about 1-3 minutes to make sure the new article is as it should be. Compare that to the time invested by an editor of Dilema Veche to write and then edit an article.
- Each issue of the magazine is structured around a theme, a main subject, on which many of the articles in that issue are written. While you can go through the archive and read each issue, you would expect them to mention the theme of that issue. They don't. You start reading an issue written a year ago without knowing what its main subject is. They probably have a reason for this.
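The 3-second check for empty or one-sentence articles could also be pre-filtered by a script, so that the person doing the work only opens the suspicious ones. This is a rough sketch under stated assumptions: it assumes you already have each article's URL and HTML body in a dictionary (how you obtain them is not covered here), and the function names and the two-sentence threshold are hypothetical choices, not anything Dilema Veche uses.

```python
import re


def is_suspiciously_short(body_html, min_sentences=2):
    """True if the article body is empty or has fewer than min_sentences."""
    # Crude tag stripping; good enough for flagging, not for real HTML parsing.
    text = re.sub(r"<[^>]+>", " ", body_html).strip()
    if not text:
        return True
    # Count non-empty chunks between sentence-ending punctuation marks.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return len(sentences) < min_sentences


def flag_articles(articles, min_sentences=2):
    """Return the URLs whose bodies look empty or truncated."""
    return [
        url
        for url, body in articles.items()
        if is_suspiciously_short(body, min_sentences)
    ]


# Hypothetical input: URL -> article body as stored on the new platform.
articles = {
    "/articol/1": "",
    "/articol/2": "<p>Only one sentence survived the import.</p>",
    "/articol/3": "<p>First sentence. Second sentence. Third one.</p>",
}

flagged = flag_articles(articles)
print(flagged)
```

A script like this only narrows the list; restoring the missing text from the old database remains the manual 1-3 minute job described above.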
Hanlon’s razor is an eponymous adage that allows the elimination of unlikely explanations for a phenomenon. It reads:
Never attribute to malice that which is adequately explained by stupidity.
I'm not sure which of the following applies in the above examples:
- People are not so smart, so they don't understand that they are losing money by making customers unhappy, and that the price of paying one (it just takes one!) person to go through each old article and spend anything between 10 seconds and 3 minutes (it's worth it!) making the old content more accessible to users is much lower than the overall price you pay for making customers unhappy.
- People are just lazy and they don't care. "Yes, I've seen this, thank you for noticing it, I know about this, thank you for your message, I'm not paid to care, I'm paid for a specific thing, and I just don't care about anything else".
There are a lot of reasons for not caring for your archive:
- Your views on the world change. An article written in 2002 and another one written in 2014 by the same person probably differ in quality. You may also have changed some of your opinions.
- It may not be relevant for most of your audience. You don’t want that on the first page of Google.
- You, personally, wouldn’t read an article from 2003. It’s just pointless.
Why, still, keep and treasure your archive?
- There is a big fight for short-tail keywords (Romania, Băsescu, Ponta) instead of a fight for long-tail keywords (How to prepare an eggplant salad?). People want to be #1 for "Romania", not for a long-tail query. But a lot of traffic comes from the long tail. Around 15-25% of the queries Google receives each month are totally new. Imagine this: people type every imaginable thing into Google. You have a chance to get some of this traffic by leveraging the long tail. And for the long tail you just need content, regardless of whether it is new or old, high quality or not. Just write something!
- When you're on Google.com, searching for something, you don't care that much whether the article is new or old. Google has algorithms which show you mostly new stuff, but sometimes, if the subject is itself old, you want to see something about it, whether old or new. Most people care less than journalists do about whether a piece of news is new.
- The effort to keep the archive live is much lower than the effort to create something from scratch.
- Some people also want to read old news. They don't think like a journalist, eager to know only about recent events.
- When I read something online (like a forum, see the EvZ / Cotidianul example), and I click on a link which leads to nowhere, I get frustrated.
- Sometimes, I want to read all the content someone has written (like me, wanting to read everything Andrei Pleșu wrote).
- This might shock you, but sometimes it's not your decision to make, it's the authors'. When they deleted the Chip.ro forum, they also deleted every conversation people had on the forum. You, as a forum owner, have, besides the power to control the forum, the responsibility to take good care of it, because you are not the author. I recently saw a dispute about this on a forum.
I think the reason the editors choose to ignore old news is the volume of work required to keep things up to date. Most of the news sources above had tens (some of them hundreds) of thousands of articles. It looks like a big volume, right? It looks like you wouldn't pay one person to go through every one of them, right?
But the thing is, they make the wrong comparison. The right way to compare is: "One person worked at least 30 minutes, at most 5 days, but most of the time one to two hours on an article. Am I willing to pay another person for 10 seconds to 3 minutes to make sure that content is up & running right now?".
Compare it like this, and the decision becomes simple.
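To put rough numbers on that comparison, here is a small back-of-the-envelope calculation. The per-article times (10 seconds to 3 minutes to check, about one hour to write) come from the paragraphs above; the archive size of 100,000 articles is an assumption picked for illustration, since the real counts are not known.

```python
def review_hours(article_count, seconds_per_article):
    """Total hours needed to check every article once."""
    return article_count * seconds_per_article / 3600


ARTICLES = 100_000        # assumed archive size, for illustration only
WRITING_HOURS_EACH = 1    # low end of the "one to two hours" estimate

best_case = review_hours(ARTICLES, 10)     # every article is fine
worst_case = review_hours(ARTICLES, 180)   # every article needs fixing
writing_total = ARTICLES * WRITING_HOURS_EACH

print(f"Checking the archive: {best_case:.0f} to {worst_case:.0f} hours")
print(f"Writing it originally: at least {writing_total} hours")
print(f"Worst-case check is {worst_case / writing_total:.0%} of the writing effort")
```

Even if every single article needed the full 3 minutes, under these assumptions the check costs a small fraction of what was already spent writing the content.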
Note: Also see the Yahoo! Group on which I present similar issues: IMRo. To join, email firstname.lastname@example.org and reply to the confirmation email.