WDF*IDF – The new key for top search engine ranking

In 2014, keyword density is out. That, or something similar, was the assumption when the new star of search engine optimisation appeared. It is called WDF*IDF, and it at least pushes keyword density into the background. Keywords themselves remain relevant; after all, the new type of analysis reveals which keywords matter in a given context.

What is the WDF*IDF analysis?

The analysis is actually not as innovative as SEO communities worldwide make it out to be. Its origins, including the formula and the term calculation on which it is based, go back to the 1970s, although they were naturally not devised for search engines or other rankings. This is mentioned only in passing. The main task is to understand the abbreviation and to clarify what lies behind its six letters:

  • Within Document Frequency – WDF

Put into words, this part describes nothing more than the weighting of a term within a document. The term here is the keyword, which appears in the text with a certain frequency. This principle could pass for conventional keyword density; however, WDF also applies a correction value that yields a uniform reference figure. In addition, WDF is limited by the

  • Inverse Document Frequency – IDF

This part of the analysis does not concentrate on a single document but looks at the weighting of a term across the entire site. If, for instance, only a single post on a blog deals with garden chairs, the term “garden chair” is important for the analysis; after all, it is mentioned in only one text on the site. If, however, a term runs through every text on the site, it is of little interest to the analysis, because it is most probably not a distinguishing keyword.

The entire principle is expressed in a formula that will probably confuse many people, mathematicians excepted:
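
The formula is commonly written in the following form. It is given here as an assumption, since variants exist (some omit the 1 inside the second logarithm or use a different logarithm base). The symbols are the conventional ones and are not taken from this text: Freq(i, j) is the number of times term i occurs in document j, L is the total number of words in document j, N_D is the number of documents in the collection, and f_i is the number of documents that contain term i.

```latex
\[
\mathrm{WDF{*}IDF}(i, j)
  = \underbrace{\frac{\log_2\bigl(\mathrm{Freq}(i, j) + 1\bigr)}{\log_2(L)}}_{\mathrm{WDF}(i)}
    \cdot
    \underbrace{\log\left(1 + \frac{N_D}{f_i}\right)}_{\mathrm{IDF}(i)}
\]
```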



Luckily, no one is forced to compare the term weighting in a document with that of a complete site by hand.

If a text has a DNA of its own, the WDF*IDF analysis will reveal it. That is meant only figuratively, of course, but the technique does make it possible to show how well a certain word reflects the content of a text. As the example above already suggests, a word such as “and” will hardly point towards relevant content. The conjunction is found in every text and has no value to offer other than connecting clauses and sentences. The term “garden chair”, on the other hand, carries value and makes it possible to pin down search results clearly. Those who feel like calculating the value manually can try the following formula:
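
The within-document part is commonly written as shown below. This is the conventional form, stated as an assumption rather than taken from this text: Freq(i, j) is the number of times term i occurs in document j, and L is the total number of words in that document. The numbers in the worked line are made up for illustration.

```latex
\[
\mathrm{WDF}(i) = \frac{\log_2\bigl(\mathrm{Freq}(i, j) + 1\bigr)}{\log_2(L)}
\]
% Worked example: a term occurring 7 times in a text of 1024 words
\[
\mathrm{WDF} = \frac{\log_2(7 + 1)}{\log_2(1024)} = \frac{3}{10} = 0.3
\]
```

Plain keyword density for the same case would be 7 / 1024, or roughly 0.68 %. The logarithms are the correction value: they dampen the effect of simply repeating a word more often.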


To apply the complete formula rigorously, and its IDF part in particular, the exact number of documents on the World Wide Web that contain the term would have to be known. That number is practically impossible to determine, so in practice only a reference value is used.

The WDF part of the formula is quite comparable with keyword density and is therefore no innovative achievement in the world of search engine optimisation. The actual WDF*IDF formula consists of two parts that combine into a whole. The second part, IDF, works on the principle that a term becomes more important the more its presence in one document stands out from its occurrences in the other documents on the same website. Its mathematical formula is as follows:
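
One commonly cited form of the inverse document frequency is shown below. It is an assumption here, as other variants drop the 1 or use a different logarithm base. N_D stands for the number of documents in the collection being compared and f_i for the number of those documents that contain term i; the worked lines use base 10 and made-up counts.

```latex
\[
\mathrm{IDF}(i) = \log\left(1 + \frac{N_D}{f_i}\right)
\]
% Worked example (base 10): 990 documents, 10 of them contain the term
\[
\mathrm{IDF} = \log_{10}\left(1 + \frac{990}{10}\right) = \log_{10}(100) = 2
\]
% A term found in all 990 documents scores far lower
\[
\mathrm{IDF} = \log_{10}\left(1 + \frac{990}{990}\right) = \log_{10}(2) \approx 0.301
\]
```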



If the results of the two formulas are multiplied, the product is the figure of interest: the actual term weighting.
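
The multiplication can be tried out with a few lines of Python. This is a minimal sketch, not the method of any particular SEO tool. It assumes the commonly cited variants WDF = log2(freq + 1) / log2(length) and IDF = log10(1 + documents / documents containing the term), and the function names and the three-page toy site are hypothetical.

```python
from __future__ import annotations

import math
from collections import Counter


def wdf(freq: int, doc_length: int) -> float:
    """Within Document Frequency: log2(freq + 1) / log2(doc_length)."""
    if doc_length < 2:  # log2(1) == 0 would mean dividing by zero
        return 0.0
    return math.log2(freq + 1) / math.log2(doc_length)


def idf(n_docs: int, docs_with_term: int) -> float:
    """Inverse Document Frequency, assumed variant: log10(1 + n_docs / docs_with_term)."""
    if docs_with_term == 0:
        return 0.0
    return math.log10(1 + n_docs / docs_with_term)


def wdf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: WDF*IDF} mapping per tokenised document."""
    n_docs = len(docs)
    # Count in how many documents each term occurs (not how often overall).
    doc_freq = Counter(term for doc in docs for term in set(doc))
    result = []
    for doc in docs:
        counts = Counter(doc)
        result.append({
            term: wdf(freq, len(doc)) * idf(n_docs, doc_freq[term])
            for term, freq in counts.items()
        })
    return result


# Hypothetical three-page site, already lower-cased and split into words.
site = [
    "the garden chair and the wooden garden chair".split(),
    "the table and the bench".split(),
    "the photo album and the camera".split(),
]
scores = wdf_idf(site)

# Print the terms of the first page, highest weighting first.
for term, value in sorted(scores[0].items(), key=lambda kv: kv[1], reverse=True):
    print(f"{term:8s} {value:.3f}")
```

On this toy site, “garden” and “chair” come out on top for the first page, while “the” and “and”, which occur on every page, end up at the bottom. With the assumed IDF variant their weighting becomes small but does not drop to zero.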

What benefit do I get from the analysis?

This is the question everyone asks when tackling the subject for the first time. The mathematics may interest the mathematically minded, but it is of little interest to most site operators on their way to the desired search engine ranking. Put simply, the analysis reveals at a glance which terms occur in connection with the desired search term. Thus, for the example “Garden chair of wood”, not only

  • Garden chair
  • Wood

could be of importance, but at the same time the terms

  • furniture
  • outdoor
  • garden
  • patio
  • tables
  • outdoor furniture
  • chairs
  • table
  • dining
  • benches

The text written on the topic should therefore contain not only the main keyword “Garden chair of wood” but also the additional terms. The findings from the analysis can, of course, serve as a guideline while the text is being developed. A good text on the topic will certainly contain the additional words anyway, at least if the topic is examined in detail and from its different angles.

For whom does the WDF*IDF analysis make sense?

A general problem is that the on-page analysis usually turns up numerous words and plots them as a distinct keyword curve in the diagram. The tools available usually work with a maximum and an average word frequency, and the resulting text should stay as close as possible to the average value. But what happens if the text is to have a mere 250 words while the analysis already lists a total of 100 words? In that case the text should have a clear focus and ignore some of the words shown in the diagram. If the text on wooden garden chairs is, for example, to emphasise that the manufacturer is particularly committed to sustainable timber production, it is advisable to put the focus on the FSC seal and to associate it with tropical timber.
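
The comparison against the average and maximum values can be sketched as follows. This is only an illustration of the idea, not how any specific tool works: the function names, the tolerance of 0.02 and all term values are hypothetical, and a term missing from a reference document is assumed to count as 0.

```python
from __future__ import annotations

from statistics import mean


def term_benchmarks(reference_scores: list[dict[str, float]]) -> dict[str, tuple[float, float]]:
    """Per term, return (average, maximum) over all reference documents."""
    terms = set().union(*reference_scores)
    benchmarks = {}
    for term in terms:
        # A document that does not contain the term contributes 0.0.
        values = [scores.get(term, 0.0) for scores in reference_scores]
        benchmarks[term] = (mean(values), max(values))
    return benchmarks


def compare(own_scores: dict[str, float],
            benchmarks: dict[str, tuple[float, float]],
            tolerance: float = 0.02) -> dict[str, str]:
    """Classify each benchmark term of the own text relative to average and maximum."""
    report = {}
    for term, (average, maximum) in benchmarks.items():
        own = own_scores.get(term, 0.0)
        if own > maximum:
            report[term] = "above maximum"
        elif own < average - tolerance:
            report[term] = "below average"
        else:
            report[term] = "in range"
    return report


# Hypothetical term weightings of two reference documents and of the own text.
reference = [
    {"garden chair": 0.30, "wood": 0.20, "patio": 0.10},
    {"garden chair": 0.40, "wood": 0.10},
]
own_text = {"garden chair": 0.50, "wood": 0.15}

report = compare(own_text, term_benchmarks(reference))
for term in sorted(report):
    print(f"{term:13s} {report[term]}")
```

For a short text with a clear focus, terms reported as “below average” can be ignored deliberately; the report is a guide, not a checklist.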

The WDF*IDF analysis is, however, of particular benefit for long documents, in which the entire range of words connected with the search term can be used.

Finally, some know-how

Search terms, links and keyword densities have not completely vanished into thin air with the WDF*IDF analysis. Those who want to use the new tools correctly and create their documents in line with the new findings should treat the analysis results as signposts and guidelines. If matching terms and synonyms related to the topic also occur, the document will be found high up in the search engine rankings. At least until Google & Co. change the term weighting again and a new marvel comes into being.