Confidence Scores Explained
How we calculate the 'Trust Score' for every data point.
# The Science of Trust: Understanding Folki's Confidence Scoring and Data Verification Models
Last Updated: January 26, 2026
Level: Advanced
Reading Time: 7 Minutes
In the era of Generative AI, hallucination is one of the biggest risks to e-commerce adoption. If an AI "guesses" that your $2,000 camera is waterproof and it isn't, you face returns, chargebacks, and potentially legal liability.
Folki was built on a "Zero-Trust" architecture. We do not trust the AI's output blindly. Instead, we wrap every extraction in a rigorous Confidence Scoring Model. This article explains the mathematics behind that score and how to use it to automate your workflow safely.
1. The Scoring Equation
Every attribute value (e.g., `Battery Life: 12 Hours`) extracted by Folki is assigned a Confidence Score (0-100). This score is not random; it is a weighted estimate of how likely the data is to be factual.
The formula weighs three primary vectors:
`Score = (Source_Authority * 0.5) + (Cross_Verification * 0.3) + (Semantic_Consistency * 0.2)`
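As a worked example of the weighted sum above, here is a minimal Python sketch. The function name `confidence_score` is hypothetical, and it assumes that all three sub-scores are expressed on a 0-100 scale (the article gives point values only for Source Authority, so the other two inputs here are illustrative).

```python
# Weights from the formula above, written as whole percentages so the
# arithmetic stays exact.
WEIGHTS = {
    "source_authority": 50,
    "cross_verification": 30,
    "semantic_consistency": 20,
}


def confidence_score(source_authority, cross_verification, semantic_consistency):
    """Combine three sub-scores (assumed 0-100 each) into one 0-100 score."""
    components = {
        "source_authority": source_authority,
        "cross_verification": cross_verification,
        "semantic_consistency": semantic_consistency,
    }
    for name, value in components.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return sum(WEIGHTS[name] * value for name, value in components.items()) / 100


# Illustrative inputs: source authority 70, cross-verification 80, semantic 100.
# (70 * 0.5) + (80 * 0.3) + (100 * 0.2) = 35 + 24 + 20
print(confidence_score(70, 80, 100))  # 79.0
```

Because Source Authority carries half the weight, a weak source caps the score quickly: even with perfect marks on the other two vectors, a 40-point source cannot exceed 70.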
Vector A: Source Authority (50% Weight)
Where did this information come from?
- Tier 1 (100 pts): Official Documentation.
- PDF User Manuals hosted on the manufacturer's domain (`sony.com`, `samsung.com/manuals`).
- Technical Datasheets.
- Tier 2 (85 pts): First-Party HTML.
- The official Product Detail Page (PDP) on the brand's website.
- Tier 3 (70 pts): Trusted Major Retail.
- Amazon, BestBuy, Walmart (high trust, but third-party seller errors are possible).
- Tier 4 (40 pts): Niche Retail / Blogs.
- Random e-commerce sites, review blogs, forums. (Usually discarded unless verified elsewhere).
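The tier list above can be sketched as a simple lookup. This is an illustration only: the function `source_authority`, the `MAJOR_RETAILERS` set, and the rule "a `.pdf` path on the manufacturer's domain counts as Tier 1" are assumptions made for the example, not Folki's actual classifier (which would also need to recognize technical datasheets).

```python
from urllib.parse import urlparse

# Hypothetical retailer list, taken from the Tier 3 examples above.
MAJOR_RETAILERS = {"amazon.com", "bestbuy.com", "walmart.com"}


def _matches(host, domains):
    """True if host is one of the domains or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def source_authority(url, manufacturer_domains):
    """Return the tier points (100, 85, 70, or 40) for a source URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if _matches(host, manufacturer_domains):
        # Assumption: a PDF on the brand's own domain is official documentation.
        if parsed.path.lower().endswith(".pdf"):
            return 100  # Tier 1
        return 85  # Tier 2: first-party HTML
    if _matches(host, MAJOR_RETAILERS):
        return 70  # Tier 3
    return 40  # Tier 4: everything else


print(source_authority("https://www.sony.com/manuals/a7.pdf", {"sony.com"}))  # 100
print(source_authority("https://www.amazon.com/dp/B000", {"sony.com"}))  # 70
```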
Vector B: Cross-Verification (30% Weight)
Can we find this fact in more than one place?
- Singular: Found on Site A only.
- Consensus: Found "100 Watts" on Site A, Site B, and Site C. -> Score Booster.
- Conflict: Site A says "100W", Site B says "120W". -> Massive Score Penalty. (Triggers Manual Review).
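The three outcomes above can be told apart with a few lines of Python. This sketch only classifies the situation; it does not assign points, because the article does not state the size of the booster or penalty. The function name is hypothetical, and the values are assumed to be already normalized to one canonical form (so "100 Watts" and "100W" would have been unified beforehand).

```python
def cross_verification_status(values_by_site):
    """Classify extracted values as 'singular', 'consensus', or 'conflict'.

    values_by_site maps a site name to the (already normalized) value
    found there.
    """
    if not values_by_site:
        raise ValueError("at least one source is required")
    distinct_values = set(values_by_site.values())
    if len(distinct_values) > 1:
        return "conflict"  # sources disagree: triggers manual review
    if len(values_by_site) > 1:
        return "consensus"  # several sources agree
    return "singular"  # only one source found


print(cross_verification_status({"site-a": "100 W"}))
print(cross_verification_status({"site-a": "100 W", "site-b": "100 W", "site-c": "100 W"}))
print(cross_verification_status({"site-a": "100 W", "site-b": "120 W"}))
```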
Vector C: Semantic Consistency (20% Weight)
Does the value make sense?
- Type Check: If the attribute is `Weight` and the value is `Red`, the score drops to 0.
- Range Check: If `Smartphone Battery` is extracted as `50,000mAh` (car battery size), the anomaly detector flags it as improbable.
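A rough sketch of the two checks above follows. The attribute names, the plausible range for a smartphone battery, and the flag strings are all assumptions chosen for illustration; they are not Folki's real rule set.

```python
import re

# Hypothetical configuration: which attributes must be numeric, and what
# range counts as plausible for them.
NUMERIC_ATTRIBUTES = {"Weight", "Smartphone Battery"}
PLAUSIBLE_RANGES = {"Smartphone Battery": (500.0, 10_000.0)}  # mAh, assumed


def leading_number(value):
    """Parse the number at the start of a value like '50,000mAh', else None."""
    match = re.match(r"\s*([\d,]*\.?\d+)", value)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def semantic_flags(attribute, value):
    """Return a list of problems found by the type and range checks."""
    flags = []
    if attribute in NUMERIC_ATTRIBUTES:
        number = leading_number(value)
        if number is None:
            # Type check failed, e.g. Weight = "Red".
            flags.append("type_mismatch")
        elif attribute in PLAUSIBLE_RANGES:
            low, high = PLAUSIBLE_RANGES[attribute]
            if not low <= number <= high:
                flags.append("out_of_range")
    return flags


print(semantic_flags("Weight", "Red"))
print(semantic_flags("Smartphone Battery", "50,000mAh"))
print(semantic_flags("Smartphone Battery", "4,500 mAh"))
```

In this sketch a `type_mismatch` would correspond to the score dropping to 0, while `out_of_range` corresponds to the anomaly flag.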
2. Interpreting the Visual Indicators
In the Review Dashboard, you will see these scores visualized:
🟢 High Confidence (90-100)
- Meaning: "We found this in the Official Manual."
- Action: Safe to Auto-Approve.
- Risk: Near Zero.
🟡 Medium Confidence (70-89)
- Meaning: "We found this on Amazon and BestBuy, but could not locate the original PDF."
- Action: Likely accurate. Quick glance recommended.
- Risk: Low.
🔴 Low Confidence (<70)
- Meaning: "Conflicting data found" OR "Only found on low-trust blog."
- Action: Human Review Required.
- Risk: High.
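If you consume scores outside the dashboard (for example in an export), the three bands above reduce to a small lookup. The function name and the returned labels are hypothetical; the cut-offs are the ones listed above.

```python
def confidence_band(score):
    """Map a 0-100 Confidence Score to its dashboard band."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "high"  # green: safe to auto-approve
    if score >= 70:
        return "medium"  # yellow: quick glance recommended
    return "low"  # red: human review required


print(confidence_band(95))  # high
print(confidence_band(79))  # medium
print(confidence_band(69))  # low
```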
3. The "Verification Loop" (E-E-A-T)
Google's Search Quality Rater Guidelines emphasize E-E-A-T (Experience, Expertise, Authoritativeness, and Trustworthiness). Folki's unique advantage is that it doesn't just store the *data*; it stores the proof.
When we extract a value, we save the Source URL and the Date Accessed.
- In your Schema: This is injected as a `citation` property.
- For the AI Agent: When ChatGPT accesses your site, it sees:
`"Battery: 12 Hours [Source: manufacturer-manual.pdf]"`
This citation transforms your claim from "Marketing Fluff" to "Verified Fact," dramatically increasing the likelihood of citation in AI-generated answers.
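To make the stored proof concrete, here is a small sketch of how a value, its Source URL, and its Date Accessed could be kept together and rendered in the bracketed form shown above. The function names and the dictionary keys (`url`, `dateAccessed`) are assumptions for illustration; the exact markup Folki emits may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def cited_value(attribute, value, source_url):
    """Render 'Attribute: Value [Source: file-name]' for a sourced value."""
    parsed = urlparse(source_url)
    # Use the last path segment as the label; fall back to the host name.
    label = PurePosixPath(parsed.path).name or parsed.netloc
    return f"{attribute}: {value} [Source: {label}]"


def citation_record(attribute, value, source_url, date_accessed):
    """Bundle a value with its proof (hypothetical field names)."""
    return {
        "name": attribute,
        "value": value,
        "citation": {"url": source_url, "dateAccessed": date_accessed},
    }


url = "https://example.com/docs/manufacturer-manual.pdf"
print(cited_value("Battery", "12 Hours", url))
print(citation_record("Battery", "12 Hours", url, "2026-01-26"))
```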
4. Automation Thresholds
You can configure Folki's "Auto-Pilot" based on your risk tolerance.
Configuration > Automation Rules:
- Rule: `IF Category == 'Medical Devices' THEN Auto-Approve ONLY IF Score > 98`.
- *Why:* High liability. Zero tolerance for error.
- Rule: `IF Category == 'T-Shirts' THEN Auto-Approve IF Score > 75`.
- *Why:* Low liability. If fabric blend is slightly off, it's not life-threatening.
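The two rules above amount to a per-category threshold check. The sketch below is not Folki's rule engine; the names are hypothetical, and sending unlisted categories to human review is an assumption made for the example.

```python
# Thresholds from the two example rules above.
AUTO_APPROVE_THRESHOLDS = {
    "Medical Devices": 98,
    "T-Shirts": 75,
}


def should_auto_approve(category, score):
    """Return True only if the score is strictly above the category threshold."""
    threshold = AUTO_APPROVE_THRESHOLDS.get(category)
    if threshold is None:
        # Assumption: categories without a rule are never auto-approved.
        return False
    return score > threshold


print(should_auto_approve("Medical Devices", 98))  # False: must exceed 98
print(should_auto_approve("T-Shirts", 80))  # True
```

Note that both rules use a strict `>`, so a Medical Devices value scoring exactly 98 still goes to review.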
5. Handling Hallucinations
What happens when the AI is wrong?
1. The Human Intervention: You spot an error in the dashboard (e.g., AI mistook "Package Weight" for "Product Weight").
2. The Correction: You edit the value manually.
3. The Learning: Folki's local learning model tags that specific Source Domain or Extraction Pattern as "Unreliable" for your store, reducing the score of future extractions from that pattern.
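The feedback loop in step 3 can be pictured as a per-store list of patterns tagged "Unreliable". The class below is a simplified sketch: the class name, the fixed 20-point penalty, and the use of a plain string to identify a source domain or extraction pattern are all assumptions, since the article does not state how large the reduction is.

```python
class StoreFeedback:
    """Tracks sources or patterns a store has corrected (illustrative only)."""

    UNRELIABLE_PENALTY = 20  # hypothetical penalty in score points

    def __init__(self):
        self._unreliable = set()

    def record_correction(self, pattern):
        """Tag a source domain or extraction pattern as unreliable."""
        self._unreliable.add(pattern)

    def adjusted_score(self, score, pattern):
        """Lower the score for extractions that come from a tagged pattern."""
        if pattern in self._unreliable:
            return max(0, score - self.UNRELIABLE_PENALTY)
        return score


feedback = StoreFeedback()
feedback.record_correction("some-blog.example")
print(feedback.adjusted_score(85, "some-blog.example"))  # 65
print(feedback.adjusted_score(85, "sony.com"))  # 85
```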
FAQ: Trust & Confidence
Can I manually upload a PDF to increase confidence?
Yes! If you have the manual, upload it to the product's "Files" tab. Folki will prioritize this document above all web sources, instantly raising the confidence score to 100.
Why is my confidence score dropping over time?
Data decay. If the Source URL returns a 404 (Link Rot), Folki slowly degrades the confidence score to prompt you to re-verify the data (Re-Enrichment).
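As an illustration of this gradual decay, the sketch below subtracts a fixed number of points for every full week the Source URL has been returning a 404. The linear schedule and the default of 5 points per week are assumptions; the article only says the score degrades slowly.

```python
from datetime import date


def decayed_score(score, first_404_on, today, points_per_week=5):
    """Reduce a score for each full week since the source first returned 404.

    points_per_week is a hypothetical rate chosen for this example.
    """
    full_weeks = max(0, (today - first_404_on).days // 7)
    return max(0, score - points_per_week * full_weeks)


# Three full weeks of link rot: 95 - (3 * 5)
print(decayed_score(95, date(2026, 1, 1), date(2026, 1, 22)))  # 80
```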
Does Folki use "Probabilistic" guessing?
No. Unlike ChatGPT, which predicts the next word, our extraction agents are strictly "Extractive." They must find the text on the page and are forbidden from generative guessing.
Conclusion
Confidence Scores are your shield. They allow you to scale automation without sacrificing data integrity. By understanding this scoring mechanism, you can confidently turn the keys over to the machine, knowing that it knows when to ask for help.