The Web is a massive graph (Web pages are nodes, hyperlinks between them are edges)
• Can use graph and link mining approaches
• Example: Find authoritative Web pages on a certain topic
• i.e., a page that many other pages point to
• Not that easy: for example, www.google.com does not explicitly contain the phrase “Web search engine”, so matching page text alone would miss it
• Commercial and competitive interests, such as advertisements, distort the picture, as do the many navigational links
• To find authoritative Web pages, use hub pages (pages that provide links to many authoritative Web pages)
• Example hub page: a personal home page with a list of recommended links
• HITS (Hyperlink-Induced Topic Search): start from a search query, retrieve a root set of pages, expand it with pages that link to or are linked from the root set, then iteratively propagate hub and authority weights along the links
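The weight propagation step of HITS can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: it assumes the expanded page set has already been collected and is given as a list of (source, target) hyperlinks. The function name `hits`, the fixed iteration count, and the toy page names are all hypothetical choices made for this example.

```python
from math import sqrt


def hits(edges, iterations=50):
    """Return (hub, authority) score dicts for a small link graph.

    edges: iterable of (source, target) pairs, one per hyperlink.
    Assumption: the root set has already been expanded into this edge list.
    """
    edges = list(edges)
    nodes = sorted({page for edge in edges for page in edge})
    out_links = {n: [] for n in nodes}
    in_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
        in_links[dst].append(src)

    # Every page starts with the same hub and authority weight.
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # A page's authority is the sum of the hub weights of pages linking to it.
        auth = {n: sum(hub[m] for m in in_links[n]) for n in nodes}
        # A page's hub weight is the sum of the authority weights of pages it links to.
        hub = {n: sum(auth[m] for m in out_links[n]) for n in nodes}

        # Normalize so the weights do not grow without bound.
        auth_norm = sqrt(sum(v * v for v in auth.values())) or 1.0
        hub_norm = sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {n: v / auth_norm for n, v in auth.items()}
        hub = {n: v / hub_norm for n, v in hub.items()}

    return hub, auth


# Toy graph: three link-list pages (h1..h3) pointing at two content pages (a, b).
edges = [("h1", "a"), ("h1", "b"), ("h2", "a"), ("h2", "b"), ("h3", "a")]
hub, auth = hits(edges)
best_authority = max(auth, key=auth.get)
best_hubs = sorted(n for n in hub if hub[n] == max(hub.values()))
```

In this toy graph, page `a` receives the most links from good hubs and so ends up with the highest authority weight, while `h1` and `h2`, which link to both content pages, score higher as hubs than `h3`. A fixed number of iterations is used here for simplicity; checking for convergence of the weights would be a reasonable alternative.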