Downstream Evaluation
See how SDF performs in downstream agent tasks.
SDF has been validated in a production deployment across 2,335 documents spanning all 10 parent types and 74 unique type combinations. The results below summarize pipeline performance, extraction accuracy, and token efficiency.
| Metric | Value |
|---|---|
| Total documents processed | 2,335 |
| Parent types covered | 10 (all) |
| Unique type combinations | 74 |
| JSON validity rate | 100% |
| Schema validation pass rate | 100% |
| Average extraction confidence | 0.91 |
| Average processing time | ~400 ms/document |
The SDF conversion pipeline uses a two-stage model cascade:
| Stage | Model | Parameters | Purpose |
|---|---|---|---|
| Classification | sdf-classifier | 1.5B | Determine parent_type and type |
| Extraction | sdf-extractor | 3B | Extract all document fields |
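To make the division of labor concrete, here is a minimal sketch of how such a cascade could be wired. The function names, signatures, and placeholder return values are illustrative assumptions, not the production API.

```python
# Illustrative sketch of the two-stage cascade; the call signatures and
# placeholder return values are assumptions, not the production code.

def classify(markdown_text: str) -> tuple[str, str]:
    """Stage 1: the 1.5B sdf-classifier decides only parent_type and type."""
    # Placeholder for the real model call.
    return "article", "article.news"

def extract(markdown_text: str, parent_type: str, doc_type: str) -> dict:
    """Stage 2: the 3B sdf-extractor fills all SDF fields, conditioned on the type."""
    # Placeholder for the real model call.
    return {"summary": "...", "entities": [], "claims": [], "relationships": []}

def convert_to_sdf(markdown_text: str) -> dict:
    parent_type, doc_type = classify(markdown_text)
    fields = extract(markdown_text, parent_type, doc_type)
    return {"parent_type": parent_type, "type": doc_type, **fields}
```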
This specialized cascade is 4.1x faster than a monolithic 14B parameter baseline that performs both classification and extraction in a single pass.
| Approach | Parameters | Speed | Accuracy |
|---|---|---|---|
| Monolithic baseline | 14B | 1.0x | 87% |
| SDF cascade | 1.5B + 3B | 4.1x | 90% |
Splitting classification and extraction into dedicated, task-specific models lets the smaller cascade outperform the larger generalist model on both speed and accuracy.
Overall extraction accuracy: 90% exact match across all field types.
| Field Category | Exact Match | Notes |
|---|---|---|
| Type classification | 95.2% | After normalization cascade |
| Entity extraction | 91.3% | Name + type + salience |
| Claim extraction | 87.6% | Claim text + confidence |
| Relationship extraction | 85.4% | Subject + predicate + object |
| Summary generation | 92.1% | Evaluated by human raters |
| type_data fields | 89.8% | Type-specific structured fields |
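For context, "exact match" means the predicted structured value must equal the gold annotation exactly, so an entity counts as correct only if its name, type, and salience all match. A minimal sketch of such a scorer, assuming a simple list-of-dicts layout for predictions and gold data (this layout is an assumption, not the evaluation harness used in the study):

```python
# Illustrative exact-match scorer over structured fields; the list-of-dicts
# data layout is an assumption.

def exact_match_rate(predictions: list[dict], gold: list[dict], field: str) -> float:
    """Fraction of documents whose predicted `field` equals the gold value exactly."""
    matches = sum(
        1 for pred, ref in zip(predictions, gold) if pred.get(field) == ref.get(field)
    )
    return matches / len(gold)

# Example: entity extraction is scored on the whole `entities` list, so a
# prediction only counts when every name, type, and salience value matches.
```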
The 5-stage type normalization cascade corrected 63 non-standard type inventions across the 2,335-document corpus:
| Normalization Stage | Corrections | Example |
|---|---|---|
| Exact match | 0 (pass-through) | article.news → article.news |
| Alias resolution | 24 | blog_post → article.blog |
| Parent type inference | 15 | article → article.news (from domain) |
| Fuzzy matching | 18 | artcle.analyis → article.analysis |
| Fallback classification | 6 | custom.misc → reference.wiki |
| Total corrections | 63 | — |
After normalization: 100% taxonomy conformance.
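The sketch below condenses the five stages into one function to show their ordering. The taxonomy set, alias table, and fallback default are small illustrative placeholders, not the production tables, and the real parent-type inference uses the source domain rather than a fixed mapping.

```python
import difflib

# Condensed sketch of the five-stage normalization cascade. TAXONOMY, ALIASES,
# and the fallback default are small illustrative placeholders.
TAXONOMY = {"article.news", "article.blog", "article.analysis", "reference.wiki"}
ALIASES = {"blog_post": "article.blog"}
PARENT_DEFAULTS = {"article": "article.news"}  # production infers this from the source domain

def normalize_type(raw_type: str) -> str:
    raw_type = raw_type.strip().lower()
    # 1. Exact match: already a valid taxonomy type, pass through.
    if raw_type in TAXONOMY:
        return raw_type
    # 2. Alias resolution: known non-standard names.
    if raw_type in ALIASES:
        return ALIASES[raw_type]
    # 3. Parent type inference: bare parent type mapped to a child type.
    if raw_type in PARENT_DEFAULTS:
        return PARENT_DEFAULTS[raw_type]
    # 4. Fuzzy matching: close misspellings such as "artcle.analyis".
    close = difflib.get_close_matches(raw_type, TAXONOMY, n=1, cutoff=0.8)
    if close:
        return close[0]
    # 5. Fallback classification: anything else gets a safe default here
    #    (the production stage re-classifies rather than using a constant).
    return "reference.wiki"
```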
SDF achieves close to a 99% token reduction relative to raw HTML:
| Representation | Avg. Size | Avg. Tokens | Reduction from HTML |
|---|---|---|---|
| Raw HTML | 89 KB | ~73,000 | — |
| Markdown (cleaned) | 12 KB | ~4,200 | 94.2% |
| SDF JSON | 2 KB | ~750 | 98.9% |
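The reduction percentages follow directly from the average token counts in the table, as this quick calculation shows:

```python
# Reductions derived from the average token counts above.
html_tokens, md_tokens, sdf_tokens = 73_000, 4_200, 750

html_to_sdf = 1 - sdf_tokens / html_tokens   # ~0.989 -> 98.9% reduction
md_to_sdf = 1 - sdf_tokens / md_tokens       # ~0.821 -> roughly 82% reduction
```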
Even relative to cleaned markdown, SDF reduces tokens by roughly a further 82% (about 4,200 tokens down to about 750). This matters because many agent pipelines already convert HTML to markdown as a first step; SDF compresses further by extracting only semantic content. The ~750 SDF tokens break down by section as follows:
| Section | Avg. Tokens | % of Document |
|---|---|---|
| summary | 180 | 24% |
| entities | 120 | 16% |
| type_data | 150 | 20% |
| claims | 95 | 13% |
| relationships | 65 | 9% |
| source + provenance | 80 | 11% |
| Other fields | 60 | 8% |
Across all 2,335 documents, 100% produced valid JSON that passed schema validation.
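As an illustration of what that schema check can look like, here is a minimal validator using the jsonschema library. The schema fragment covers only a few top-level SDF fields and is an assumption, not the full SDF schema.

```python
from jsonschema import ValidationError, validate

# Tiny illustrative fragment of an SDF schema; only a few top-level fields
# are shown, so this is not the full specification.
SDF_SCHEMA_FRAGMENT = {
    "type": "object",
    "required": ["parent_type", "type", "summary", "entities"],
    "properties": {
        "parent_type": {"type": "string"},
        "type": {"type": "string"},
        "summary": {"type": "string"},
        "entities": {"type": "array", "items": {"type": "object"}},
    },
}

def is_valid_sdf(document: dict) -> bool:
    try:
        validate(instance=document, schema=SDF_SCHEMA_FRAGMENT)
        return True
    except ValidationError:
        return False
```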
If you reference this research, please cite:
Sarkar, P. (2026). “Convert Once, Consume Many: SDF for Cacheable, Typed Semantic Extraction from Web Pages.” Zenodo. DOI: 10.5281/zenodo.18559223