Signals & Data Sources

Evidence-based analysis built on publicly observable data. Every signal is traceable to its source.

Data Sources

PrivacyFetch draws from four independent data sources to build each company assessment. No single source is relied on exclusively. When sources conflict, the analysis notes the tension.

Policy Scraping

Web scraping

Fetches privacy policies, terms of service, cookie policies, DPAs, subprocessor lists, and other legal documents from the company website. Raw HTML is converted to structured markdown for analysis.

Breach Monitoring

Public breach databases

Checks for publicly reported data breaches within the last 24 months. Breach recency and severity are factored into the overall assessment.
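
The 24-month window can be illustrated with a small date filter. This is a minimal sketch, not the production logic: the `Breach` record shape, the function names, and the choice to exclude a breach reported exactly 24 months ago are all assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Breach:
    # Hypothetical record shape; real breach databases expose different fields.
    name: str
    reported_on: date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


def recent_breaches(breaches, today: date, window_months: int = 24):
    """Keep breaches reported within the last `window_months` months.

    Assumption: the window is half-open, so a breach reported exactly
    `window_months` ago is excluded. Future-dated records are dropped.
    """
    return [
        b for b in breaches
        if 0 <= months_between(b.reported_on, today) < window_months
    ]
```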

Tracker Detection

Live page scan

Scans the company website for advertising trackers, analytics services, session recording tools, social media pixels, and cookies. Results feed directly into the Tracking dimension.

AI Analysis

17 parallel extraction tasks

Reads the actual policy text to extract structured data: what data is collected, who it is shared with, what rights are offered, retention periods, AI training practices, deletion difficulty, dark patterns, and more.

What We Extract

Signals are organized by the scoring dimension they feed into. Each signal represents a specific, verifiable claim about the company's data practices.

Data Collection

| Signal | Description |
|---|---|
| Data types collected | Biometric, health, behavioral, browsing history, location, financial, and other categories |
| Collection methods | How data is gathered (direct input, automatic tracking, third-party sources) |
| Sensitive data evidence | Direct quotes from the policy confirming collection of sensitive categories |
| Data type count | Total number of distinct data types identified, penalized when exceeding 10 |
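
As an illustration of the data type count, the sketch below de-duplicates extracted labels and reports how many fall above the threshold of 10. The normalization rule (trim and lowercase) and the function names are assumptions; the size of the resulting penalty is not stated here, so the code only reports the excess.

```python
DATA_TYPE_THRESHOLD = 10  # threshold stated in the table above


def distinct_data_types(extracted_labels):
    """De-duplicate labels so 'Location' and 'location ' count once.

    Assumption: trimming and lowercasing is enough to identify duplicates.
    """
    return {label.strip().lower() for label in extracted_labels if label.strip()}


def excess_data_types(extracted_labels, threshold: int = DATA_TYPE_THRESHOLD) -> int:
    """Number of distinct data types above the threshold (0 if at or below it)."""
    return max(0, len(distinct_data_types(extracted_labels)) - threshold)
```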

Data Sharing

| Signal | Description |
|---|---|
| Sells personal data | Whether the company sells personal information to third parties |
| Data broker indicators | Evidence of data broker relationships or broker-like practices |
| Advertiser sharing | Whether data is shared with advertising networks or ad partners |
| Partner count and names | Number and identity of data sharing partners and advertising partners |
| Subprocessor list presence | Whether a public list of data sub-processors is maintained |
| Sharing categories | Business partners, affiliates, vendors, and other sharing recipient types |
| Processing purposes | Targeted advertising, profiling, and remarketing as stated data processing purposes |

Tracking

| Signal | Description |
|---|---|
| Advertising tracker count and names | Specific ad trackers found on the website (Google Ads, Facebook Pixel, etc.) |
| Analytics tracker count | Number of analytics services detected beyond the 3-tracker threshold |
| Session recording tools | Tools like Hotjar, FullStory, or similar session replay services |
| Cookie types | Marketing, analytics, and essential cookies detected on the site |
| DNT/GPC support | Whether the site honors Do Not Track or Global Privacy Control signals |
| Social trackers | Social media tracking pixels and share button integrations |
| Cross-device tracking | Evidence of tracking users across multiple devices |
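
One way to picture how tracker signals could be derived from a page scan is to match script hosts against a lookup of known tracker domains. The lookup below is a small illustrative sample, not the detection list PrivacyFetch uses, and the function names are hypothetical.

```python
from urllib.parse import urlparse

# Illustrative sample only: a real scanner would use a much larger, maintained list.
KNOWN_TRACKERS = {
    "googleadservices.com": "advertising",
    "connect.facebook.net": "advertising",
    "google-analytics.com": "analytics",
    "static.hotjar.com": "session_recording",
    "fullstory.com": "session_recording",
}

ANALYTICS_THRESHOLD = 3  # the 3-tracker threshold from the table above


def detect_trackers(script_urls):
    """Group matched tracker domains by category, counting each domain once."""
    found = {}
    for url in script_urls:
        host = urlparse(url).hostname or ""
        for domain, category in KNOWN_TRACKERS.items():
            # Match the domain itself or any subdomain of it.
            if host == domain or host.endswith("." + domain):
                found.setdefault(category, set()).add(domain)
    return found


def analytics_over_threshold(found, threshold: int = ANALYTICS_THRESHOLD) -> int:
    """Number of analytics services detected beyond the threshold."""
    return max(0, len(found.get("analytics", ())) - threshold)
```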

Transparency

| Signal | Description |
|---|---|
| Policy presence | Whether a publicly accessible privacy policy exists |
| Sections found | Number of standard policy sections identified (data collection, sharing, retention, rights, etc.) |
| Word count | Total policy length; policies under 6,000 words count as readable, and policies over 10,000 words count as excessive |
| Retention specificity | Whether specific data retention periods are stated vs. vague language |
| Contradictions | Inconsistencies found within the policy text, counted and detailed with evidence |
| DPA published | Whether a Data Processing Agreement is publicly available |
| Purposes stated | Whether the company explicitly states its data processing purposes |
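
The word count bands can be sketched as a simple classifier. Whitespace splitting as the word-counting rule and the "moderate" label for the 6,000 to 10,000 band are assumptions; only the two outer bands are defined here.

```python
READABLE_MAX_WORDS = 6_000    # policies under this are considered readable
EXCESSIVE_MIN_WORDS = 10_000  # policies over this are considered excessive


def classify_policy_length(policy_text: str) -> str:
    """Classify a policy by word count, splitting on whitespace."""
    words = len(policy_text.split())
    if words < READABLE_MAX_WORDS:
        return "readable"
    if words > EXCESSIVE_MIN_WORDS:
        return "excessive"
    # The middle band is not named in the source; "moderate" is a placeholder.
    return "moderate"
```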

User Rights

| Signal | Description |
|---|---|
| Rights listed | 8 recognized rights: access, deletion, correction, portability, opt-out, withdraw consent, restrict processing, object to processing |
| Request channels | Number and types of channels for exercising rights (web form, email, in-app, postal) |
| Deletion difficulty | Scored 0–4 based on barriers, requirements, and friction in the deletion process |
| Data request form | Whether a structured form exists for submitting data access or deletion requests |
| Privacy email | Whether a dedicated privacy contact email address is published |
| Appeals process | Whether users can appeal denied requests |
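
A rights check against the eight recognized rights might look like the sketch below. The exact-match normalization is an assumption (a real extractor would likely map synonyms such as "erasure" to "deletion"), and both function names are hypothetical.

```python
RECOGNIZED_RIGHTS = (
    "access",
    "deletion",
    "correction",
    "portability",
    "opt-out",
    "withdraw consent",
    "restrict processing",
    "object to processing",
)


def rights_coverage(listed_rights):
    """Split the recognized rights into those a policy lists and those it omits."""
    normalized = {right.strip().lower() for right in listed_rights}
    found = [r for r in RECOGNIZED_RIGHTS if r in normalized]
    missing = [r for r in RECOGNIZED_RIGHTS if r not in normalized]
    return found, missing


def validate_deletion_difficulty(score: int) -> int:
    """Reject deletion difficulty values outside the documented 0-4 range."""
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 4:
        raise ValueError(f"deletion difficulty must be an integer from 0 to 4, got {score!r}")
    return score
```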

AI Practices

| Signal | Description |
|---|---|
| Usage disclosure | Whether the company discloses AI usage explicitly, partially, or not at all (yes/partial/no) |
| Training on user data | Whether personal user data is used to train AI models |
| Training on interactions | Whether user interactions (clicks, queries, behavior) are used for AI training |
| Training on public content | Whether publicly posted user content is used for AI training |
| Third-party AI sharing | Whether data is shared with external AI providers, and whether this is disclosed |
| Automated decisions | Whether AI is used for automated decision-making that affects users |
| Opt-out availability | Whether users can opt out of AI training on their data |
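
The AI Practices signals could be held in a record like the one below. The field names and the use of `None` for "the policy does not say" are assumptions made for illustration; only the yes/partial/no disclosure values come from the table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Disclosure(Enum):
    YES = "yes"
    PARTIAL = "partial"
    NO = "no"


@dataclass
class AIPractices:
    usage_disclosure: Disclosure
    # None means the policy is silent, which is distinct from an explicit "no".
    trains_on_user_data: Optional[bool] = None
    trains_on_interactions: Optional[bool] = None
    trains_on_public_content: Optional[bool] = None
    shares_with_third_party_ai: Optional[bool] = None
    automated_decisions: Optional[bool] = None
    opt_out_available: Optional[bool] = None

    def trains_without_opt_out(self) -> bool:
        """True when any training practice is confirmed and no opt-out is confirmed."""
        trains = any(
            flag is True
            for flag in (
                self.trains_on_user_data,
                self.trains_on_interactions,
                self.trains_on_public_content,
            )
        )
        return trains and self.opt_out_available is not True
```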

Evidence-Based

Every signal extracted during analysis includes evidence: direct quotes from the policy text that support the finding. This means users can verify each assessment themselves by reading the original source material.

When the AI extraction identifies a practice (for example, that a company sells personal data), the specific passage from the privacy policy is preserved alongside the structured signal. Evidence is displayed on company profiles so that the assessment is never a black box.

Scoring factors also include evidence. When a penalty or bonus is applied to a dimension score, the factor entry records both a human-readable description and, where available, the supporting details or policy quote that triggered it.
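
To make the idea of a verifiable signal concrete, the sketch below pairs a structured finding with its supporting quotes and checks that each quote really appears in the policy text. The `Signal` and `Evidence` shapes and the whitespace-insensitive matching are assumptions, not the stored schema.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(frozen=True)
class Evidence:
    quote: str       # passage copied from the policy
    source_url: str  # where the passage was found


@dataclass
class Signal:
    name: str
    value: Any
    evidence: List[Evidence] = field(default_factory=list)


def _collapse_whitespace(text: str) -> str:
    return " ".join(text.split())


def unverifiable_quotes(signal: Signal, policy_text: str) -> List[str]:
    """Return quotes that cannot be found verbatim in the policy text.

    Matching ignores differences in whitespace and line breaks only.
    """
    haystack = _collapse_whitespace(policy_text)
    return [
        item.quote
        for item in signal.evidence
        if _collapse_whitespace(item.quote) not in haystack
    ]
```

An empty result means every quote attached to the signal can be located in the source document, which is the check a reader would otherwise perform by hand.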

Analysis Pipeline

Each company assessment follows a five-stage pipeline. The entire process runs asynchronously via background jobs.

1. Discover URLs: Locate the company's privacy policy, terms of service, cookie policy, DPA, and subprocessor list URLs.
2. Scrape policies: Fetch each document and convert to clean, structured markdown. Cookie tables and supplementary pages are appended.
3. Detect trackers: Scan the company's live website for advertising trackers, analytics services, session recording tools, and cookies.
4. Run 17 AI tasks: Execute extraction tasks in parallel: data partners, policy claims, tracking cookies, collected data types, policy summary, broker indicators, deletion difficulty, dark patterns, data purposes, international transfers, retention policies, subprocessors, company info, AI practices, terms core, terms financial, and terms legal.
5. Calculate score: Apply the 5-dimension scoring engine with weights, penalties, and bonuses to produce the composite 0–100 privacy score.
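
The parallel extraction stage can be sketched with `asyncio`. This is an assumption-heavy illustration: the task identifiers are derived from the names listed in step 4, `run_task` is a placeholder for whatever model call the real system makes, and the actual job runner may not use `asyncio` at all.

```python
import asyncio

# Identifiers derived from the 17 task names in step 4; the real names may differ.
EXTRACTION_TASKS = [
    "data_partners", "policy_claims", "tracking_cookies", "collected_data_types",
    "policy_summary", "broker_indicators", "deletion_difficulty", "dark_patterns",
    "data_purposes", "international_transfers", "retention_policies",
    "subprocessors", "company_info", "ai_practices", "terms_core",
    "terms_financial", "terms_legal",
]


async def run_task(name: str, policy_markdown: str) -> dict:
    """Placeholder for a single extraction call (hypothetical)."""
    await asyncio.sleep(0)  # stands in for network latency
    return {"task": name, "chars_read": len(policy_markdown)}


async def run_extraction(policy_markdown: str) -> dict:
    """Run every extraction task concurrently and key the results by task name."""
    results = await asyncio.gather(
        *(run_task(name, policy_markdown) for name in EXTRACTION_TASKS),
        # Keep going if one task fails; its exception is stored as the result.
        return_exceptions=True,
    )
    return dict(zip(EXTRACTION_TASKS, results))
```

Calling `asyncio.run(run_extraction(markdown))` returns one entry per task, so a failed task can be retried or reported without discarding the other sixteen results.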

For a deeper look at each stage, see How Analysis Works.

Limitations

Signals are derived from publicly available documents and infrastructure. Internal data handling practices, employee training, and unpublished policies are not captured. Companies without a published privacy policy receive zeroed scores for the Data Collection, Data Sharing, and Tracking dimensions to avoid false positives from missing data.
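
The missing-policy rule can be expressed as a small adjustment step. The dimension keys and the function name are hypothetical, and the sketch simply sets the three affected dimensions to 0 without assuming how that value is interpreted downstream.

```python
# Dimensions zeroed when no policy is published (keys are hypothetical).
ZEROED_WITHOUT_POLICY = ("data_collection", "data_sharing", "tracking")


def apply_missing_policy_rule(dimension_scores: dict, policy_found: bool) -> dict:
    """Return a copy of the scores with the affected dimensions zeroed if no policy exists."""
    adjusted = dict(dimension_scores)  # never mutate the caller's scores
    if not policy_found:
        for dimension in ZEROED_WITHOUT_POLICY:
            adjusted[dimension] = 0
    return adjusted
```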

AI extraction is automated and conservative. When the system is uncertain about a finding, it records the uncertainty rather than making an unsupported claim. All signals should be considered one input among many when evaluating a company's privacy practices.