What is an Orphan Page?
Definition
A page on your website that has no internal links pointing to it. Search engines can struggle to discover orphan pages, and even when discovered, they receive no internal PageRank flow.
Why it Matters for AI Search
Orphan pages are invisible to most users and tend to rank far below their potential. Common sources: old landing pages, sitemap-only URLs, and pages launched without navigation updates. Internal linking is the cheapest fix.
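One way to find orphan pages is to compare the full list of known URLs against the set of URLs that receive at least one internal link. The sketch below assumes you already have both inputs (for example from a sitemap and a crawl export); the function name and the sample URLs are hypothetical.

```python
def find_orphan_pages(known_urls, internal_links):
    """Return known URLs that no other page links to.

    known_urls: iterable of URLs (e.g. from a sitemap or CMS export).
    internal_links: dict mapping a source URL to the URLs it links to.
    """
    linked = set()
    for source, targets in internal_links.items():
        for target in targets:
            if target != source:  # a self-link does not count as an inbound link
                linked.add(target)
    return sorted(url for url in set(known_urls) if url not in linked)


# Hypothetical site with one page that nothing links to.
pages = ["/", "/pricing", "/blog", "/old-landing-page"]
links = {"/": ["/pricing", "/blog"], "/blog": ["/", "/pricing"]}
print(find_orphan_pages(pages, links))
```

Each URL the function returns is a candidate for a new internal link from a relevant page.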
More Glossary Terms
301 Redirect
An HTTP status code indicating that a URL has permanently moved to a new location. Browsers and search engines follow the redirect and update their records to the new URL.
302 Redirect
An HTTP status code indicating a temporary URL redirect. Search engines treat 302s as transient and continue to index the original URL.
404 Error
An HTTP status code indicating the requested page does not exist. The browser still loads a response (often the site's "Not Found" page) but with a 404 status header.
AI Overview (Google SGE)
Google's AI-generated summary box that appears at the top of search results, providing a synthesized answer to user queries by pulling information from multiple web sources.
Algorithm Update
A change to Google's ranking systems. Major named updates (Core Updates, Helpful Content, Spam updates) are announced; smaller changes happen daily.
AMP (Accelerated Mobile Pages)
A Google-backed framework for building mobile-fast web pages with restricted HTML and CSS. Once required for Top Stories carousel; that requirement was removed in 2021.
Anchor Text
The visible, clickable text of a hyperlink. Search engines use anchor text as a strong signal of what the linked page is about.
Answer Engine Optimization (AEO)
The practice of optimizing content specifically to appear as direct answers in AI-powered answer engines such as Google AI Overviews, Bing Copilot, and Perplexity AI.
AVIF
An image format based on the AV1 video codec, offering 50%+ smaller files than JPEG at equivalent quality. Supported in Chrome, Firefox, Safari (iOS 16+), and most modern browsers.
Backlink
A hyperlink from one website pointing to another. Search engines treat backlinks as "votes of confidence" — the more high-quality, relevant sites linking to you, the more authority your domain gains.
Bounce Rate
The percentage of single-page sessions on a site — visits where the user leaves without interacting further. Definitions vary across analytics platforms (GA4 measures "engagement" inversely instead).
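The two definitions can be compared with a little arithmetic. This is a minimal sketch with hypothetical function names; it assumes the GA4-style figure is simply the share of sessions that were not engaged.

```python
def bounce_rate(single_page_sessions, total_sessions):
    """Classic bounce rate: share of sessions that viewed only one page."""
    if total_sessions == 0:
        return 0.0
    return single_page_sessions / total_sessions


def ga4_bounce_rate(engaged_sessions, total_sessions):
    """GA4-style bounce rate: share of sessions that were not engaged."""
    if total_sessions == 0:
        return 0.0
    return 1 - engaged_sessions / total_sessions


print(bounce_rate(40, 100))
print(ga4_bounce_rate(65, 100))
```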
Canonical Tag
An HTML link element (<link rel="canonical" href="...">) that tells search engines which URL is the preferred version when duplicate or near-duplicate pages exist.
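To audit which canonical a page declares, the tag can be read with Python's standard-library HTML parser. This is an illustrative sketch; the class and function names are hypothetical, and it only inspects static HTML (not tags injected by JavaScript).

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonical(html):
    """Return the first canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None
```

If the finder collects more than one canonical, the page is sending conflicting signals and is worth reviewing.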
Canonical URL
The preferred ("master") version of a page when duplicate or similar content exists across multiple URLs. It is usually declared with a rel="canonical" link element, which search engines treat as a strong hint rather than a command.
CDN (Content Delivery Network)
A network of geographically distributed servers that cache and serve your content from the location nearest each visitor — reducing latency and origin-server load.
ChatGPT Search
OpenAI's search feature inside ChatGPT that retrieves real-time web results and synthesizes them into conversational answers with source links.
CLS (Cumulative Layout Shift)
A Core Web Vitals metric that measures visual stability by tracking how much page content unexpectedly shifts during loading. Google recommends keeping CLS under 0.1.
Content Decay
The gradual loss of organic traffic to previously well-ranking pages as competitors update content, search intent shifts, or freshness becomes a stronger ranking signal.
Content Pruning
Deliberately removing or noindexing low-quality, low-traffic, or outdated pages to improve overall site quality signals — particularly relevant after Helpful Content updates.
Cookie Consent Banner
A user interface element required by GDPR (EU), CCPA (California), and similar privacy regulations that requests consent for tracking cookies before they're set.
Core Web Vitals
A set of three Google metrics — LCP (Largest Contentful Paint), INP (Interaction to Next Paint), and CLS (Cumulative Layout Shift) — that measure real-world user experience on a website.
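A measurement can be bucketed against the thresholds for each metric. In this sketch the 2.5 s LCP and 0.1 CLS limits come from this glossary; the INP limits and the "needs improvement" boundaries are Google's published thresholds, added here as an assumption. The function name is hypothetical.

```python
# (good, poor) boundaries per metric.
THRESHOLDS = {
    "LCP": (2.5, 4.0),   # seconds
    "INP": (200, 500),   # milliseconds
    "CLS": (0.1, 0.25),  # unitless layout-shift score
}


def rate_metric(name, value):
    """Classify a Core Web Vitals measurement."""
    good, poor = THRESHOLDS[name]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"


print(rate_metric("LCP", 2.4))
print(rate_metric("CLS", 0.2))
```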
Crawl Budget
The number of URLs a search engine bot will crawl on your site within a given period. Determined by site authority, server speed, and content freshness.
Crawl-Delay
A non-standard robots.txt directive asking crawlers to wait a specified number of seconds between requests. Honored by Bing and some smaller bots; ignored by Google.
Crawlability
The ability of search engine bots (like Googlebot) to access, read, and index the pages of your website. It's determined by your robots.txt, sitemap, internal link structure, and server response codes.
Critical CSS
The minimum CSS required to render above-the-fold content, inlined directly into the HTML head to avoid blocking render on an external stylesheet request.
CrUX (Chrome User Experience Report)
Google's public dataset of real-user performance metrics aggregated from Chrome users who opt in to anonymous usage statistics. The source of "field data" in PageSpeed Insights.
Disavow Tool
A Google Search Console feature that lets site owners tell Google to ignore specific inbound links — used when toxic or spammy backlinks risk triggering a manual penalty.
Domain Authority (DA)
A third-party score (typically 1-100) that predicts how likely a website is to rank in search results. Originally created by Moz; similar metrics exist from Ahrefs (DR) and Semrush (AS). Search engines do not use these scores themselves.
Duplicate Content
Substantively identical content appearing on multiple URLs, either within one site (internal) or across different sites (external). Search engines pick one canonical version and may suppress the rest.
Dwell Time
How long a user spends on a page after clicking from search results before returning to the SERP. Often cited as a behavioral signal that may indirectly influence rankings, although Google has not confirmed it as a ranking factor.
E-A-T vs E-E-A-T
Google added "Experience" to E-A-T (Expertise, Authoritativeness, Trustworthiness) in December 2022, making it E-E-A-T. The Experience pillar specifically values first-hand, practical knowledge.
E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness)
Google's quality framework for evaluating content credibility. It assesses whether content demonstrates first-hand Experience, subject-matter Expertise, site-wide Authoritativeness, and overall Trustworthiness.
Entity SEO
Optimization for the people, places, things, and concepts (entities) that search engines and AI models use to understand content — rather than purely keyword-based optimization.
Evergreen Content
Content that remains relevant and useful for years rather than tied to a current event or trend. Examples: how-to guides, definitions, foundational explanations.
FCP (First Contentful Paint)
The time from page navigation to when the first DOM content (text, image, SVG, or non-blank canvas) is rendered. A user-perceived performance metric.
Featured Snippet
A special search result box that appears at the top of Google's organic results (position zero), displaying a direct answer extracted from a web page. Types include paragraphs, lists, tables, and videos.
FID (First Input Delay)
The time from a user's first interaction (click, tap, key press) to when the browser actually starts processing it. Replaced by INP as a Core Web Vital in March 2024.
GDPR (General Data Protection Regulation)
The European Union's comprehensive data protection law, in force since 2018. Requires explicit consent for tracking, gives users the right to access and delete their data, and applies to any site serving EU users.
Generative Engine Optimization (GEO)
The process of optimizing website content to be discovered, understood, and cited by generative AI engines (like Google AI Overviews, ChatGPT Search, and Perplexity).
Google Business Profile (GBP)
Google's free business listing service (formerly Google My Business) that controls how your business appears in Maps, local pack results, and Knowledge Panels.
Google Search Console (GSC)
Google's free webmaster tool providing search performance data, indexing reports, manual action notices, sitemap submission, and URL inspection — for verified site owners.
GTmetrix
A web performance analysis tool that builds its reports on Lighthouse data plus its own measurements. Long-running alternative to PageSpeed Insights for performance audits.
Helpful Content Update
A series of Google ranking system changes (first launched 2022, now part of the core algorithm) that demote sites publishing content created primarily for search engines rather than to genuinely help users.
Hreflang Tags
HTML attributes that tell search engines which language and regional version of a page should be shown to users in different countries. They follow the format hreflang="en-US" or hreflang="fr-FR".
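In practice hreflang values usually appear on link elements with rel="alternate" in the page head. The sketch below generates those tags from a mapping; the function name and URLs are hypothetical.

```python
from html import escape


def hreflang_links(alternates):
    """Build <link rel="alternate"> tags from a {hreflang: url} mapping."""
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}">'
        for code, url in alternates.items()
    ]


for tag in hreflang_links({
    "en-US": "https://example.com/en-us/",
    "fr-FR": "https://example.com/fr-fr/",
}):
    print(tag)
```

Each language version of a page should list the full set of alternates, including itself.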
HTTP Status Codes
Three-digit codes a server returns with every response: 2xx (success), 3xx (redirect), 4xx (client error), 5xx (server error). Crawlers use these to decide whether to index, follow, or skip a URL.
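The class of a status code is determined by its first digit, which makes log analysis straightforward. A minimal sketch with a hypothetical function name; the 1xx "informational" class is not listed above but is part of the HTTP standard.

```python
def status_class(code):
    """Map a three-digit HTTP status code to its class name."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    classes = {
        1: "informational",
        2: "success",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    return classes[code // 100]


for code in (200, 301, 404, 503):
    print(code, status_class(code))
```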
HTTP/2
A major revision of the HTTP protocol introducing multiplexing, header compression, and server push. Faster than HTTP/1.1 for sites with many resources.
HTTP/3
The third major version of HTTP, built on the QUIC protocol. Reduces connection setup time (especially over flaky mobile networks) and eliminates head-of-line blocking that affects HTTP/2.
Index Coverage Report
A Search Console report showing which of your URLs are indexed, excluded, or have errors. Categorizes pages as Valid, Excluded, or Error with specific reasons.
Indexability
Whether a web page can be stored in a search engine's index and appear in search results. Pages can be non-indexable due to noindex tags, canonical tags pointing to another URL, redirects, robots.txt blocks, or server errors.
Interaction to Next Paint (INP)
A Core Web Vitals metric that measures a page's overall responsiveness to user interactions by observing the latency of all click, tap, and keyboard interactions throughout a user's visit.
Intrusive Interstitial
A pop-up, overlay, or modal that covers most of the page content immediately on load — particularly on mobile. Penalized by Google since 2017 for harming user experience.
JSON-LD (JavaScript Object Notation for Linked Data)
A lightweight, script-based format for embedding structured data (Schema.org markup) into web pages. It uses a <script type="application/ld+json"> tag to describe page content to search engines and AI.
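As an illustration, a Schema.org object can be serialized into that script tag with the standard json module. The function name and the sample article are hypothetical; a real page would include more properties.

```python
import json


def json_ld_script(data):
    """Wrap a Schema.org dictionary in a JSON-LD script tag."""
    payload = json.dumps(data, indent=2)
    # Escape "</" so a string value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "What Is an Orphan Page?",
}
print(json_ld_script(article))
```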
Keyword Cannibalization
When multiple pages on your website compete for the same keyword or search query, confusing search engines about which page to rank and splitting your authority across multiple URLs.
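A rough way to spot cannibalization is to group ranking data by query and flag any query where more than one URL appears. This sketch assumes a list of (query, url) pairs from a rank-tracking export; the function name and data are hypothetical.

```python
from collections import defaultdict


def find_cannibalized_queries(rankings):
    """Return {query: sorted URLs} for queries where more than one URL ranks."""
    urls_by_query = defaultdict(set)
    for query, url in rankings:
        urls_by_query[query.strip().lower()].add(url)
    return {q: sorted(urls) for q, urls in urls_by_query.items() if len(urls) > 1}


rankings = [
    ("orphan page", "/glossary/orphan-page"),
    ("Orphan Page", "/blog/what-are-orphan-pages"),
    ("crawl budget", "/glossary/crawl-budget"),
]
print(find_cannibalized_queries(rankings))
```

A flagged query is only a lead: two URLs ranking for one query can be harmless when they serve different intents.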
Knowledge Graph
Google's database of entities (people, places, things, concepts) and the relationships between them. Powers Knowledge Panels, AI Overview citations, and entity-based ranking.
Lazy Loading
A technique that defers loading of off-screen images and iframes until the user scrolls near them. Reduces initial page weight and improves LCP.
LCP (Largest Contentful Paint)
A Core Web Vitals metric that measures how long it takes for the largest visible content element (image, video, or text block) to render on screen. Google recommends LCP under 2.5 seconds.
Lighthouse
Google's open-source automated tool for auditing web page quality across performance, accessibility, best practices, and SEO. Available in Chrome DevTools and as a Node.js library.
Link Equity (Link Juice)
The ranking value passed from one page to another via a hyperlink. Stronger source pages pass more equity; nofollow links, links on noindexed pages, and broken links pass little or none.
llms.txt
A proposed standard text file placed at the root of a website (like robots.txt) that provides a concise, markdown-formatted directory of information specifically designed for Large Language Models and AI Answer Engines to read and cite.
Local Pack (Map Pack)
The grouped set of three local business listings shown above organic results for queries with local intent — pulled from Google Business Profile data plus on-site signals.
Manual Action
A penalty applied by a human Google reviewer (rather than an algorithm) for violating webmaster guidelines. Visible in Search Console under Security & Manual Actions.
Meta Description
An HTML meta tag that provides a brief summary (typically 155-160 characters) of a web page's content. While not a direct ranking factor, it appears as the snippet text in search results.
Meta Robots Tag
An HTML meta tag in the page head (<meta name="robots" content="...">) that controls indexing and following behavior per page. Common values include noindex, nofollow, noarchive, and nosnippet.
Mobile Usability
A set of criteria Google uses to evaluate whether a page works well on mobile devices: viewport configuration, tap target spacing, text readability, no horizontal scrolling.
Mobile-First Indexing
Google's policy of using the mobile version of a website as the primary basis for indexing and ranking. Default for all sites since 2023.
Nofollow Link
A link with rel="nofollow" attached, telling search engines not to pass ranking signal through to the destination. Used for sponsored content, user-generated links, and untrusted sources.
Noindex
A meta robots directive telling search engines not to include a page in their index, even if they crawl it. Specified via <meta name="robots" content="noindex"> or the X-Robots-Tag HTTP header.
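Both places can be checked in one pass. The sketch below is a simplification with hypothetical names: it ignores bot-specific meta tags (such as name="googlebot"), user-agent prefixes in the header, and the "none" shorthand.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def is_noindex(html, headers=None):
    """True if the meta robots tag or the X-Robots-Tag header says noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_value = ""
    for key, value in (headers or {}).items():
        if key.lower() == "x-robots-tag":
            header_value = value
    header_directives = {d.strip().lower() for d in header_value.split(",")}
    return "noindex" in parser.directives or "noindex" in header_directives
```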
Open Graph
A protocol (originally from Facebook) using <meta property="og:..."> tags to control how URLs render when shared on social media. Standard tags include og:title, og:description, og:image, and og:url.
Page Authority (PA)
Moz's 0-100 scoring metric estimating how strongly a specific page (rather than a whole domain) ranks. Calculated from page-level link signals.
Page Experience Signal
Google's composite ranking signal combining Core Web Vitals, HTTPS, mobile-friendliness, and intrusive interstitial penalties. Confirmed as a ranking factor since 2021.
PageRank
Google's original algorithm for ranking web pages based on the quantity and quality of inbound links. The public PageRank toolbar score was discontinued in 2016, but the underlying algorithm still informs Google's ranking systems.
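The core idea can be sketched as a power iteration over a link graph. This is the simplified textbook formulation with a conventional damping factor of 0.85, not Google's production system; the function name and sample graph are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Pages with no outlinks spread their rank evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for source, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


graph = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home"],
    "orphan": ["home"],  # links out, but nothing links to it
}
ranks = pagerank(graph)
print(sorted(ranks, key=ranks.get, reverse=True))
```

In this graph the page with no inbound links ends up with the lowest score, which is the effect described in the orphan page definition at the top of this page.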
PageSpeed Insights (PSI)
Google's free web tool that combines Lighthouse lab measurements with real-user field data from the Chrome User Experience Report (CrUX) to evaluate page performance.
People Also Ask (PAA)
An expandable SERP feature that surfaces related questions users frequently ask alongside their original query. Each answer is sourced from a different page.
Perplexity AI
An AI-powered answer engine that generates direct responses to user queries with cited sources. Combines large language models with real-time web search.
Pillar Page
The central comprehensive page in a topic cluster — typically 2,000-5,000 words covering the breadth of a topic, with internal links to detailed cluster pages.
Preload
A <link rel="preload"> declaration that tells the browser to fetch a critical resource (font, hero image, key script) earlier in the page load sequence.
PWA (Progressive Web App)
A web application that uses modern APIs (service workers, manifest, push notifications) to deliver an app-like experience — installable, offline-capable, push-enabled.
Rank Tracking
The practice of monitoring your website's position in search results for specific keywords over time, typically aggregated across hundreds or thousands of queries.
Ranking Factor
Any signal a search engine uses to determine how to order results for a query. Google uses hundreds, with relevance, authority, freshness, and user experience among the most influential.
rel=prev / rel=next
HTML link elements that historically declared paginated sequences (page 1, 2, 3 of a series). Google announced in 2019 that it no longer uses them, but other search engines may still parse them.
Render-Blocking Resources
Stylesheets and scripts that prevent the browser from rendering the page until they finish downloading and parsing. Common culprits: external CSS files, synchronous scripts in the head, third-party widgets.
Rich Results
Enhanced search result formats that go beyond plain text — including review stars, FAQ accordions, recipe cards, event details, and product carousels. Powered by structured data.
robots.txt
A text file at the root of a website that instructs search engine crawlers which pages or sections they are allowed or disallowed from accessing. It also specifies sitemap locations.
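Python's standard library includes a robots.txt parser, which is handy for checking how a rule set will be interpreted. The sketch below parses an in-memory file; the rules, the bot name, and the URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 5

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Is a given URL allowed for a given user agent?
print(parser.can_fetch("ExampleBot", "https://example.com/admin/login"))
print(parser.can_fetch("ExampleBot", "https://example.com/blog/"))

# Crawl-delay and Sitemap lines are exposed as well (site_maps needs Python 3.8+).
print(parser.crawl_delay("ExampleBot"))
print(parser.site_maps())
```

Individual crawlers may interpret edge cases differently from this parser, so treat the result as a first check rather than a guarantee.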
Schema Markup (Structured Data)
A standardized vocabulary (Schema.org) used to annotate web content so that search engines and AI models can understand the meaning behind the data — such as products, reviews, FAQs, articles, and organizations.
Search Intent
The underlying goal behind a user's search query, typically classified as informational (learn), navigational (find a specific site), commercial (research before buying), or transactional (purchase or convert).
Search Quality Rater Guidelines
A 170+ page document Google uses to train human contractors who manually rate search results. Defines E-E-A-T, Page Quality (PQ), and Needs Met scoring.
Semantic HTML
Using HTML elements that describe their meaning (article, section, nav, header, main, aside, h1-h6) rather than generic divs and spans. Search engines and accessibility tools rely on semantic structure to understand page content.
SERP (Search Engine Results Page)
The page displayed by a search engine in response to a query. Modern SERPs include organic results, paid ads, featured snippets, AI Overviews, knowledge panels, People Also Ask boxes, and more.
SERP Cannibalization
When two or more pages on the same site rank for similar queries, splitting clicks and ranking signals — and confusing search engines about which page deserves the higher position.
SERP Feature
Any non-organic element on a search results page: featured snippets, AI Overviews, People Also Ask, knowledge panels, image carousels, video results, news boxes, local packs.
SERP Volatility
A measure of how much search results are shifting day-to-day, typically tracked by tools like Semrush Sensor, MozCast, or Algoroo. Spikes often indicate algorithm updates.
Site Architecture
The way pages on a website are organized, linked, and grouped — typically following a hierarchical or hub-and-spoke pattern. Affects crawlability, link equity flow, and topical clarity.
Soft 404
A page that returns HTTP status 200 (OK) but contains content suggesting it is missing or empty — like "no results found" or "product unavailable." Search engines treat them as low-quality.
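A crude heuristic can flag likely soft 404s during a crawl by looking for error-style phrases in pages that return 200. The phrase list and function name below are assumptions for illustration; search engines use their own, more sophisticated detection.

```python
# Hypothetical phrase list; tune it to the wording your own templates use.
SOFT_404_PHRASES = ("no results found", "product unavailable", "page not found")


def looks_like_soft_404(status_code, body_text):
    """Flag a 200 response whose body reads like a missing or empty page."""
    if status_code != 200:
        return False  # a real 404 or 410 is not a *soft* 404
    text = body_text.lower()
    return any(phrase in text for phrase in SOFT_404_PHRASES)


print(looks_like_soft_404(200, "Sorry, no results found for your search."))
print(looks_like_soft_404(404, "Page not found"))
```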
Speed Index
A Lighthouse metric that captures how quickly the visible content of a page is populated, measured as the average time at which visible parts are displayed.
Topic Cluster
A content organization model where one comprehensive "pillar" page covers a broad topic and links to many narrower "cluster" pages on subtopics, with reciprocal internal links.
Topical Authority
The depth and breadth of expertise a site demonstrates in a specific subject area, measured through interlinked content, citation patterns, and entity coverage rather than just backlinks.
TTFB (Time to First Byte)
The time between a browser requesting a page and receiving the first byte of the response from the server. A foundational performance metric: every later loading milestone, including FCP and LCP, waits on it.
TTI (Time to Interactive)
The point at which the page is fully rendered and can reliably respond to user input within 50 ms. Captures both visual completion and JavaScript responsiveness.
Twitter Card
A meta-tag protocol for controlling how URLs render when shared on Twitter/X. The most common type is summary_large_image, which shows a 1200×630 image with title and description.
URL Inspection Tool
A Search Console feature that shows the exact indexing status, last crawl date, canonical determination, and rendering of any URL on your verified property.
WebP
A modern image format developed by Google that produces 25-35% smaller file sizes than JPEG and PNG at equivalent visual quality. Supported in all major browsers since 2020.
WebPageTest
An open-source performance testing tool that runs tests from real browsers in real locations, with detailed waterfall charts and filmstrips of the load process.
XML Sitemap
An XML file that lists all important URLs on your website, including metadata like last modified dates and priority levels. It helps search engines discover and index your content efficiently.
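A minimal sitemap can be generated with the standard library. This sketch writes only the loc and lastmod fields; the function name and sample URLs are hypothetical, and the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for a list of (url, last_modified) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    ("https://example.com/", "2024-01-15"),
    ("https://example.com/pricing", "2024-02-01"),
]))
```

Listing a URL here helps discovery, but a page that appears only in the sitemap and has no internal links is still an orphan page.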
YMYL (Your Money or Your Life)
Google's designation for topics that can significantly affect a user's health, financial stability, safety, or wellbeing. Held to higher E-E-A-T standards than non-YMYL content.
Zero-Click Search
A search query where the user gets their answer directly from the search engine results page (SERP) — via featured snippets, AI Overviews, or knowledge panels — without clicking through to any website.