Inlinks’ Entity Analyzer
1. Executive Summary & Problem Statement
The Problem: Traditional SEO tools analyze strings of text (keywords), but Generative AI models (LLMs) analyze entities (concepts, people, places, brands) and their relationships within a vector space. A brand may rank for a keyword in Google yet remain "invisible" or be hallucinated about in ChatGPT because the LLM does not recognize the brand as a defined entity in its training data or Knowledge Graph.
The Solution: The Inlinks’ Entity Analyzer is a semantic auditing tool that reverse-engineers how an LLM “reads” a webpage. It identifies which entities are detected, calculates their "Semantic Proximity" to the target topic, and verifies whether the brand is successfully mapped to the Google Knowledge Graph or Wikidata.
Value Proposition:
• Prevent Hallucinations: Ensures the AI understands exactly who you are by validating entity associations.
• Semantic Gap Analysis: Identifies missing topics that competitors discuss, which are necessary to build "topical authority" in the eyes of an AI.
• Schema Verification: Automates the creation of "Ground Truth" through structured data.
2. Target Audience (ICPs)
1. The GEO Strategist: Needs to move beyond keyword density to "Token Density" and "Entity Salience" to influence AI summaries.
2. Digital PR Managers: Need to verify whether the brand is recognized as an authoritative entity on Wikipedia and in the Knowledge Graph.
3. Content Architects: Need to structure content so that machines (bots) can parse and understand the "Aboutness" of the page.
3. Functional Requirements
3.1 Core Analysis Engine (The "Machine Reader")
• FR-01: NLP Entity Extraction: The system must accept a URL or text block and use Natural Language Processing (similar to Google’s Natural Language API) to extract all named entities (Organization, Person, Location, Event, Product).
• FR-02: Knowledge Graph Validation: For every entity detected, the system must query Wikidata and the Google Knowledge Graph API to check for a valid ID (e.g., /m/0k8z). It must flag entities that are detected but not resolved to a Knowledge Graph ID, as these pose a hallucination risk.
• FR-03: Sentiment & Salience Scoring: The tool must score each entity based on:
◦ Salience: How central is this entity to the text? (0.0 to 1.0).
◦ Sentiment: How is the entity discussed? (Positive/Neutral/Negative).
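The salience contract in FR-03 (a 0.0 to 1.0 score summing to 1 across entities) can be sketched with a simple heuristic. Note that Google's actual salience model is proprietary; this illustrative version blends mention frequency with how early the entity first appears, and handles single-word entity names only.

```python
def salience_scores(text: str, entities: list[str]) -> dict[str, float]:
    """Assign each entity a 0.0-1.0 salience score; scores sum to 1.0."""
    # Tokenise crudely and strip trailing punctuation.
    words = [w.strip(".,;:!?") for w in text.lower().split()]
    raw = {}
    for entity in entities:
        token = entity.lower()
        mentions = words.count(token)
        if mentions == 0:
            raw[entity] = 0.0
            continue
        first = words.index(token)
        position_boost = 1.0 - first / len(words)  # earlier mention = more central
        raw[entity] = mentions * (0.5 + 0.5 * position_boost)
    total = sum(raw.values()) or 1.0
    return {e: round(s / total, 3) for e, s in raw.items()}
```

In production, these scores would come from the NLP layer itself (e.g., an entity-analysis API response), not from a frequency heuristic.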
3.2 The "Semantic Gap" Visualizer
• FR-04: Vector Space Mapping: The tool should visualize the "Semantic Distance" between the user's content and the target topic. It builds on the fact that LLMs relate concepts by their proximity in embedding space.
◦ Output: "Your content is semantically distant from 'Enterprise Software' because you are missing related entities: 'SaaS', 'Cloud Computing', 'API Integration'."
• FR-05: Competitor Entity Gap: Users can input a competitor's URL. The system will compare the entities (not keywords) present on both pages and highlight the "Entity Gap"—concepts the competitor covers that the user lacks.
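The core of FR-05 is a set comparison over extracted entities rather than keyword strings. A minimal sketch, assuming entity lists have already been produced by the FR-01 extraction step (the function name and lowercase normalisation rule are illustrative):

```python
def entity_gap(our_entities: set[str], competitor_entities: set[str]) -> dict[str, set[str]]:
    """Compare two pages' entity sets and surface the gap."""
    norm = lambda s: {e.strip().lower() for e in s}  # naive normalisation
    ours, theirs = norm(our_entities), norm(competitor_entities)
    return {
        "shared": ours & theirs,
        "gap": theirs - ours,          # concepts the competitor covers that we lack
        "unique_to_us": ours - theirs,
    }
```

For example, comparing {"SaaS", "API Integration"} against a competitor's {"SaaS", "Cloud Computing"} would report "cloud computing" as the gap to close.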
3.3 Actionable Optimization (The "Fix")
• FR-06: Schema Injection Recommendations: The tool must automatically generate JSON-LD schema recommendations to disambiguate entities.
◦ Feature: "SameAs" Generator. If the text mentions "Apple" (the fruit) but the AI might confuse it with "Apple" (the brand), the tool generates schema explicitly linking to the correct Wikidata entry.
• FR-07: Internal Linking for Topic Clusters: Suggest internal links to other pages on the domain that represent related entities, building a "Topic Cluster" that strengthens the primary entity's authority.
4. User Interface (UI/UX) Flow
1. Input: User enters a URL (e.g., a product page or blog post) and a Target Entity (e.g., "Generative AI").
2. Processing: The system crawls the page, parses the text through an NLP layer, and cross-references external Knowledge Bases.
3. Dashboard Output:
◦ Entity Score: A grade (A-F) representing how clearly the page communicates the target entity.
◦ The Knowledge Graph Visual: A node graph showing the target entity in the center and all detected entities radiating out. Red nodes indicate "Unrecognized Entities" (no Knowledge Graph ID).
◦ Optimization Checklist: List of missing related entities and schema errors.
4. Export: Download valid JSON-LD Schema code to paste into the website header.
5. Success Metrics (KPIs)
• Entity Detection Rate: The % of entities in a text that correspond to a verified Wikipedia/Wikidata ID (Goal: >80%).
• Schema Adoption: The frequency with which users copy/paste the generated JSON-LD schema (Goal: 40% of sessions).
• Ranking Correlation: Tracking if pages with higher "Entity Scores" appear more frequently in AI Overviews (SGE) or Perplexity citations.
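The Entity Detection Rate KPI falls directly out of the FR-02 validation output: the share of detected entities that resolved to a Knowledge Graph or Wikidata ID. A sketch (the "kg_id" field name is an assumption about the validation step's output shape):

```python
def entity_detection_rate(entities: list[dict]) -> float:
    """Percentage of detected entities carrying a resolved Knowledge Graph ID."""
    if not entities:
        return 0.0
    resolved = sum(1 for e in entities if e.get("kg_id"))
    return round(100 * resolved / len(entities), 1)
```

A page scoring below the 80% goal would surface its unresolved (red-node) entities in the Optimization Checklist.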
6. Future Roadmap
• v2.1 - The "Hallucination Risk" Index: A proprietary score predicting how likely an LLM is to fabricate facts about the page based on sparse entity data.
• v2.5 - Audio/Video Entity Extraction: Ability to process YouTube transcripts or podcast audio to extract entities for "Voice Search Optimization," aligning with the rise of multimodal AI.
• v3.0 - Direct LLM Injection: Integration with llms.txt generation to list these verified entities directly in the site's AI manifesto file.
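The v3.0 idea can be sketched as a generator that writes verified entities into an llms.txt file (the proposed markdown manifest that LLM crawlers can read: an H1 title, a blockquote summary, then linked sections). The section layout loosely follows the llms.txt proposal; the entity field names are illustrative assumptions.

```python
def build_llms_txt(site_name: str, summary: str, entities: list[dict]) -> str:
    """Render verified entities as an llms.txt-style markdown manifest."""
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Verified Entities", ""]
    for e in entities:
        # Each entity links back to its resolved Wikidata page (ground truth).
        lines.append(f"- [{e['name']}]({e['wikidata_url']}): {e['description']}")
    return "\n".join(lines) + "\n"
```

Only entities that passed FR-02 validation would be listed, so the manifest doubles as a machine-readable "ground truth" declaration for the site.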