HITS Algorithm

Hyperlink-Induced Topic Search (HITS) is a link analysis algorithm that rates web pages. It is popularly known as "hubs and authorities" after the two scores it assigns to each page. Hubs are web pages that are not authoritative in themselves but serve as directories to other, more authoritative sites. Authorities are the pages that those hubs point to. With this algorithm, a good hub is a page that directs users to many useful pages, and a good authority is a page that many good hubs link to. In short, a page that has many hubs linking to it is rated as a strong authority and so ranks well in search engine results pages (SERPs).

The Algorithm

The algorithm begins by retrieving the list of pages most relevant to the search terms. This list is referred to as the root set. You can get the root set by collecting the top pages returned by a text-based search algorithm. The root set is then expanded with all the web pages that link to, or are linked from, its pages to form the base set. The web pages in the base set and the hyperlinks among them constitute a focused subgraph.
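The expansion from root set to base set can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the two dictionaries `out_links` and `in_links` (mapping a page to the pages it links to, and to the pages that link to it) are hypothetical stand-ins for whatever link index a real system would use.

```python
def build_base_set(root_set, out_links, in_links):
    """Expand a root set into a base set.

    out_links[p] lists the pages that p links to; in_links[p] lists the
    pages that link to p. Both mappings are assumed inputs for this sketch.
    """
    base = set(root_set)
    for page in root_set:
        base.update(out_links.get(page, ()))  # pages the root page points to
        base.update(in_links.get(page, ()))   # pages that point to the root page
    return base


def focused_subgraph(base_set, out_links):
    """Keep only the hyperlinks whose source and target are both in the base set."""
    return {
        page: {target for target in out_links.get(page, ()) if target in base_set}
        for page in base_set
    }


# Tiny hand-made link structure, for illustration only.
out_links = {"a": ["b"], "b": ["d"], "c": ["a"]}
in_links = {"a": ["c"], "b": ["a"], "d": ["b"]}

base = build_base_set({"a"}, out_links, in_links)
graph = focused_subgraph(base, out_links)
```

Here page "d" is dropped because it is two links away from the root page "a", so the link from "b" to "d" does not appear in the focused subgraph.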

The HITS computation is performed only on this focused subgraph, which is built so that it includes many, if not all, of the most influential authorities. Hub and authority values are defined in terms of each other in a mutual recursion. A page's hub value is calculated as the sum of the authority values of the web pages that it links to. Conversely, a page's authority value is calculated as the sum of the scaled hub values of the pages that link to it.

In all, the HITS algorithm undertakes several iterations, each made up of two primary steps, namely, the authority update and the hub update.
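The two update steps can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `hits`, the fixed iteration count, the starting value of 1.0 for every score, and the choice to rescale each score vector to unit Euclidean length after every update are assumptions made for this example. The graph is assumed to be a dictionary mapping each page to the pages it links to.

```python
import math


def _normalize(scores):
    """Rescale scores so the sum of their squares is 1 (assumed scaling choice)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(graph, iterations=50):
    """Return (hub, authority) score dictionaries for a link graph."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    hub = {page: 1.0 for page in nodes}
    auth = {page: 1.0 for page in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub values of the pages linking to each page.
        new_auth = {page: 0.0 for page in nodes}
        for source, targets in graph.items():
            for target in targets:
                new_auth[target] += hub[source]
        auth = _normalize(new_auth)

        # Hub update: sum the authority values of the pages each page links to.
        new_hub = {
            page: sum(auth[target] for target in graph.get(page, ()))
            for page in nodes
        }
        hub = _normalize(new_hub)

    return hub, auth


# Hand-made example: h1 links to both a1 and a2, h2 links only to a1.
example = {"h1": ["a1", "a2"], "h2": ["a1"], "a1": [], "a2": []}
hub, auth = hits(example)
```

In this example "a1" ends up with the highest authority value because both hubs link to it, and "h1" ends up with the highest hub value because it links to both authorities.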

