Complete documentation of how every metric is calculated, what it measures, its limitations, and the rationale behind each design decision.
What it measures: How mature and established the evidence base is for a given research topic. A high EMI indicates the field has progressed from exploratory studies to controlled trials and formal synthesis.

| Component | Weight | How Calculated | Rationale |
|---|---|---|---|
| Synthesis Presence | 35% | Log-scaled count of meta-analyses and systematic reviews in corpus | Existence of synthesis is the strongest signal of field maturity (Ioannidis, 2016) |
| Design Quality | 25% | Proportion of controlled designs (RCT, cohort, case-control) in corpus | Controlled studies indicate the field has moved beyond descriptive/exploratory phase |
| Volume Adequacy | 20% | Log2-scaled article count (diminishing returns: 50 articles ≈ 75/100) | More evidence provides broader base for conclusions, with diminishing returns |
| Temporal Breadth | 20% | Year span of publications (wider span = more mature) | Fields studied over longer periods have more opportunity for consolidation |

| Score | Level | Meaning |
|---|---|---|
| ≥ 80 | HIGH | Consolidated field with synthesis and controlled studies. Formal evidence synthesis is well-supported. |
| ≥ 60 | MODERATE | Maturing field with quality primary studies. Systematic reviews feasible with methodological caveats. |
| ≥ 40 | EMERGING | Developing field, predominantly observational. Suitable for hypothesis generation, not formal recommendations. |
| < 40 | LOW | Sparse or low-quality evidence. Any conclusions would be premature. |
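
The composite above is a weighted sum of the four components. A minimal sketch in Python: the weights come from the table, but the saturation constants (MA/SR count saturating near 10, temporal breadth capping at 20 years) and the exact log calibration are assumptions tuned to the documented anchor of 50 articles ≈ 75/100.

```python
import math

def emi_score(n_synthesis: int, n_controlled: int, n_articles: int,
              year_span: int) -> float:
    """Return an Evidence Maturity Index on a 0-100 scale (sketch)."""
    # Synthesis Presence (35%): log-scaled MA/SR count, saturating near 10.
    synthesis = min(1.0, math.log1p(n_synthesis) / math.log1p(10))
    # Design Quality (25%): proportion of controlled designs in the corpus.
    design = n_controlled / n_articles if n_articles else 0.0
    # Volume Adequacy (20%): log2 scaling so 50 articles ~ 0.75.
    volume = min(1.0, 0.75 * math.log2(1 + n_articles) / math.log2(1 + 50))
    # Temporal Breadth (20%): publication year span, capped at 20 years.
    breadth = min(1.0, year_span / 20)
    return 100 * (0.35 * synthesis + 0.25 * design
                  + 0.20 * volume + 0.20 * breadth)
```
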
What it measures: How saturated/complete the research coverage is for a topic. A high RSC indicates most research questions have been addressed and the marginal value of new primary studies is low.

| Component | Weight | How Calculated | Rationale |
|---|---|---|---|
| Synthesis Coverage | 30% | Ratio of clusters containing MA/SR to total clusters | Synthesized clusters indicate completed research cycles |
| Gap Scarcity | 30% | 1 / (1 + high_priority_gaps); equals 1.0 when no high-priority gaps remain | Fewer high-priority gaps mean more complete coverage |
| Trend Stability | 20% | Proportion of stable or declining clusters | Non-growing fields indicate saturation, not opportunity |
| Coverage Breadth | 20% | Proportion of articles captured in identified clusters | High clustering = well-organized evidence landscape |

| Score | Level | Meaning |
|---|---|---|
| ≥ 0.8 | SATURATED | Extensively studied. Prioritize synthesis and implementation over new primary studies. |
| ≥ 0.5 | MATURE | Well studied with specific gaps. Target research at identified gaps. |
| ≥ 0.25 | DEVELOPING | Significant space for new studies. Original contributions have high potential impact. |
| < 0.25 | UNEXPLORED | Scarcity of studies. Any well-designed study will contribute meaningfully. |
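
The four components combine as a weighted sum on a 0-1 scale. A minimal sketch, assuming the inputs (cluster counts, gap counts, article counts) are precomputed; only the Gap Scarcity formula is stated explicitly above, and the other components follow the stated proportions.

```python
def rsc_score(clusters_with_synthesis: int, total_clusters: int,
              high_priority_gaps: int, stable_or_declining: int,
              clustered_articles: int, total_articles: int) -> float:
    """Return a Research Saturation/Coverage score on a 0-1 scale (sketch)."""
    # Synthesis Coverage (30%): clusters containing MA/SR over all clusters.
    synthesis_cov = (clusters_with_synthesis / total_clusters
                     if total_clusters else 0.0)
    # Gap Scarcity (30%): 1 / (1 + high_priority_gaps); zero gaps -> 1.0.
    gap_scarcity = 1.0 / (1.0 + high_priority_gaps)
    # Trend Stability (20%): proportion of stable or declining clusters.
    trend = stable_or_declining / total_clusters if total_clusters else 0.0
    # Coverage Breadth (20%): proportion of articles captured in clusters.
    breadth = clustered_articles / total_articles if total_articles else 0.0
    return (0.30 * synthesis_cov + 0.30 * gap_scarcity
            + 0.20 * trend + 0.20 * breadth)
```
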
What it measures: How well-formed the evidence pyramid is — whether the distribution of study types follows the expected shape (broad base of observational studies, middle layer of experimental, narrow top of synthesis).

| Component | Weight | How Calculated | Rationale |
|---|---|---|---|
| Tier Completeness | 30% | Are all 3 tiers present? (top: MA/SR, mid: RCT/Cohort, base: Cross-sectional/Case) | A complete pyramid requires evidence at all levels |
| Distribution Shape | 40% | Distance from ideal proportions (top 10%, mid 45%, base 45%) | Core question: does this look like a proper pyramid? Highest weight. |
| Inversion Check | 30% | Penalty if synthesis count exceeds primary study count | An inverted pyramid (more MA than primary studies) indicates possible duplicative synthesis or weak foundation |

| Score | Level | Meaning |
|---|---|---|
| ≥ 0.7 | ROBUST | Well-structured pyramid. High reliability for evidence synthesis. |
| ≥ 0.4 | MODERATE | Some structural gaps. Synthesis possible with caveats. |
| < 0.4 | FRAGILE | Poorly structured or inverted pyramid. Synthesis would be methodologically problematic. |
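
A sketch of the three checks, assuming raw article counts per tier. The ideal proportions and weights are from the table; the inversion penalty slope (linear in the primary-to-synthesis ratio) is an assumption.

```python
def pyramid_score(n_top: int, n_mid: int, n_base: int) -> float:
    """Return the evidence-pyramid structure score on a 0-1 scale (sketch)."""
    total = n_top + n_mid + n_base
    if total == 0:
        return 0.0
    # Tier Completeness (30%): fraction of the three tiers that are populated.
    completeness = sum(1 for n in (n_top, n_mid, n_base) if n > 0) / 3
    # Distribution Shape (40%): 1 minus half the L1 distance from the
    # ideal proportions (top 10%, mid 45%, base 45%).
    ideal = (0.10, 0.45, 0.45)
    actual = (n_top / total, n_mid / total, n_base / total)
    shape = 1.0 - 0.5 * sum(abs(a - i) for a, i in zip(actual, ideal))
    # Inversion Check (30%): full credit unless synthesis outnumbers
    # primary studies; the linear penalty slope is an assumption.
    primary = n_mid + n_base
    inversion = 1.0 if n_top <= primary else primary / n_top
    return 0.30 * completeness + 0.40 * shape + 0.30 * inversion
```
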
What it measures: The scientific potential of an individual article for a researcher's purposes — combining evidence level, novelty, relevance, and recency.

Note: This is a research utility score, not a quality score. A novel case report (low evidence level) may score higher than an older meta-analysis if the topic is emerging.

| Component | Weight | Source |
|---|---|---|
| Evidence Hierarchy | 25% | Study design (MA=1.0, RCT=0.85, Cohort=0.70, Case Report=0.30) |
| Novelty | 20% | Emerging terms, Watchtower alignment, challenges consensus |
| Watchtower Match | 15% | Alignment with actively monitored research topics |
| Lexicon Hits | 15% | MeSH and custom terminology matches |
| Recency | 15% | ≤1yr=1.0, ≤2yr=0.85, ≤5yr=0.5, ≤10yr=0.3 |
| Quality Indicators | 10% | Keywords: multicenter, prospective, large sample, validated |
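
A sketch of the weighted combination. The design weights and recency bins are from the tables above; the fallback weight of 0.3 for unlisted designs, the 0.1 floor for articles older than 10 years, and treating novelty, Watchtower match, lexicon hits, and quality indicators as precomputed 0-1 sub-scores are all assumptions.

```python
# Design weights documented above; unlisted designs fall back to 0.3 (assumed).
DESIGN_WEIGHT = {"meta-analysis": 1.0, "rct": 0.85,
                 "cohort": 0.70, "case-report": 0.30}

def recency_weight(age_years: float) -> float:
    """Map article age to the documented recency bins."""
    if age_years <= 1: return 1.0
    if age_years <= 2: return 0.85
    if age_years <= 5: return 0.5
    if age_years <= 10: return 0.3
    return 0.1  # older than 10 years: assumed floor, not documented above

def article_score(design: str, novelty: float, watchtower: float,
                  lexicon: float, age_years: float, quality: float) -> float:
    """Return the per-article research utility score on a 0-1 scale."""
    return (0.25 * DESIGN_WEIGHT.get(design, 0.3)
            + 0.20 * novelty + 0.15 * watchtower + 0.15 * lexicon
            + 0.15 * recency_weight(age_years) + 0.10 * quality)
```
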

What it measures: The research opportunity within a thematic cluster — identifying areas where a systematic review or new study would have high impact.
A cluster with many primary studies but no meta-analysis scores highest — this is a clear synthesis opportunity.
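
A hypothetical sketch of that ranking rule. The saturation point (around 30 primary studies) and the discount applied to clusters that already contain a synthesis are assumptions, since only the ordering principle is stated above.

```python
import math

def synthesis_opportunity(n_primary: int, has_synthesis: bool) -> float:
    """Rank a cluster's synthesis opportunity on a 0-1 scale (heuristic)."""
    # More primary studies -> more synthesizable evidence, saturating ~30.
    base = min(1.0, math.log1p(n_primary) / math.log1p(30))
    # An existing MA/SR sharply discounts the opportunity (factor assumed).
    return base * (0.2 if has_synthesis else 1.0)
```
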

What it measures: The study design of each article, classified into the standard evidence hierarchy (EBM pyramid).

Classification uses keyword matching on title and abstract. Patterns searched include: "meta-analysis", "systematic review", "randomized controlled trial", "cohort", "case-control", "cross-sectional", "case report", "guideline", etc. The first match determines the classification.

| Layer | Study Types | Evidence Level |
|---|---|---|
| A | Meta-Analysis, Systematic Review | Highest — synthesized evidence |
| B | RCT, Clinical Practice Guideline | High — controlled experimental |
| C | Cohort, Case-Control, Cross-Sectional | Moderate — observational |
| D | Narrative Review | Low — expert synthesis without systematic method |
| E | Case Report, Editorial, Expert Opinion | Lowest — anecdotal or opinion-based |
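
The first-match rule can be sketched as an ordered pattern list. These regexes are illustrative, not the production lexicon, and the order matters: a title like "systematic review and meta-analysis" contains several keywords, so synthesis patterns are checked before primary-study patterns.

```python
import re

# Patterns checked in order; the first match wins.
PATTERNS = [
    (r"\bmeta-?analysis\b", "Meta-Analysis"),
    (r"\bsystematic review\b", "Systematic Review"),
    (r"\brandomi[sz]ed controlled trial\b", "RCT"),
    (r"\bguideline\b", "Clinical Practice Guideline"),
    (r"\bcohort\b", "Cohort"),
    (r"\bcase-?control\b", "Case-Control"),
    (r"\bcross-?sectional\b", "Cross-Sectional"),
    (r"\bcase report\b", "Case Report"),
]

def classify(title: str, abstract: str = "") -> str:
    """Classify an article by the first matching design pattern."""
    text = f"{title} {abstract}".lower()
    for pattern, label in PATTERNS:
        if re.search(pattern, text):
            return label
    return "Unclassified"  # fallback label is an assumption
```
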

What it measures: The reporting quality of an abstract — whether it follows structured reporting conventions (IMRAD), clearly states objectives, defines the population, and reports quantitative outcomes.

Not what it measures: This is NOT a risk of bias assessment or a judgment of study validity. A well-written abstract for a poorly designed study will score high.

| Component | Weight | What It Checks |
|---|---|---|
| IMRAD Structure | 20% | Presence of Background, Objective, Methods, Results, Conclusion sections |
| Objective Clarity | 20% | Explicit aim statement ("to determine", "to evaluate", "to compare") |
| Population Definition | 20% | Sample size, age, sex, inclusion criteria, setting mentioned |
| Outcome Reporting | 20% | Primary outcome defined, quantitative results present |
| Quantitative Data | 10% | Statistical measures (p-values, CI, effect sizes) |
| Conclusion Alignment | 10% | Balanced language, no overclaiming |
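
A sketch of the six checks as surface-pattern tests. The weights are from the table; every regex here is an illustrative stand-in for the production lexicon, and Conclusion Alignment is scored as full credit unless overclaiming language is detected (an assumption).

```python
import re

def abstract_quality(text: str) -> float:
    """Score an abstract's reporting quality on a 0-1 scale (sketch)."""
    t = text.lower()
    # IMRAD Structure (20%): fraction of section labels present.
    sections = ["background", "objective", "methods", "results", "conclusion"]
    imrad = sum(1 for s in sections if s in t) / len(sections)
    # Objective Clarity (20%): explicit aim statement.
    objective = 1.0 if re.search(r"\bto (determine|evaluate|compare|assess)\b", t) else 0.0
    # Population Definition (20%): sample size or participant counts.
    population = 1.0 if re.search(r"\b(n\s*=\s*\d+|\d+\s+(patients|participants|subjects))\b", t) else 0.0
    # Outcome Reporting (20%): named primary outcome or quantitative results.
    outcomes = 1.0 if "primary outcome" in t or re.search(r"\d+(\.\d+)?%", t) else 0.0
    # Quantitative Data (10%): p-values, CIs, effect sizes.
    quantitative = 1.0 if re.search(r"(p\s*[<=]\s*0?\.\d+|95%\s*ci|odds ratio|effect size)", t) else 0.0
    # Conclusion Alignment (10%): penalize overclaiming language.
    conclusion = 0.0 if re.search(r"\b(proves?|definitively|undoubtedly)\b", t) else 1.0
    return (0.20 * imrad + 0.20 * objective + 0.20 * population
            + 0.20 * outcomes + 0.10 * quantitative + 0.10 * conclusion)
```
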

All EvidenX metrics share these inherent limitations:

1. Oxford Centre for Evidence-Based Medicine. "Levels of Evidence" (2011).
2. GRADE Working Group. "GRADE: an emerging consensus on rating quality of evidence and strength of recommendations." BMJ 336(7650), 2008.
3. Sackett DL et al. "Evidence-based medicine: what it is and what it isn't." BMJ 312(7023), 1996.
4. Murad MH et al. "New evidence pyramid." Evidence-Based Medicine 21(4), 2016.
5. Ioannidis JPA. "The Mass Production of Redundant, Misleading, and Conflicted Systematic Reviews and Meta-Analyses." Milbank Quarterly 94(3), 2016.
6. Bornmann L, Mutz R. "Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references." JASIST 66(11), 2015.
7. PRISMA Group. "Preferred Reporting Items for Systematic Reviews and Meta-Analyses." PLoS Medicine 6(7), 2009.
8. Higgins JPT et al. "Cochrane Handbook for Systematic Reviews of Interventions." Version 6.3, 2022.