How we evaluate the evidence
Every recommendation on evidage is tagged with an explicit evidence level (Level 1–4). “Works” / “Doesn’t work” is a poor binary — the actual research lives on a spectrum. A claim like “red wine is healthy” looks very different when it comes from a test-tube study, a 10-year cohort of 50,000 people, or a double-blind randomized trial. We make that difference visible at the top of every article.
The 4-level scale — at a glance
| Level | Evidence strength | What it is based on |
|---|---|---|
| Level 1 | Strongest 🥇 | Systematic reviews / meta-analyses of large RCTs, reproduced across independent groups |
| Level 2 | Strong 🥈 | Multiple RCTs or large prospective cohorts with consistent findings |
| Level 3 | Moderate 🥉 | Observational studies or small RCTs — suggestive, not yet decisive |
| Level 4 | Limited 🔬 | Animal models, in vitro studies, case reports — no or minimal human trials |
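Important distinction: “evidence strength” and “recommendation strength” are different things. Evidence strength describes how well the research supports a claim; recommendation strength describes what we suggest doing about it. For example, the benefit of smoking cessation has Level 1 (Strongest) evidence, and that evidence leads to a strong recommendation against smoking. We tag [Level X (evidence strength)] and [recommendation] separately throughout the site.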
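One way to picture the two separate tags is as a small record with one field for evidence strength and another for the recommendation. This is only an illustrative sketch: the names `EvidenceLevel`, `Tag`, and `render`, and the exact rendered text, are assumptions made for the example and not a description of how the site is built.

```python
from dataclasses import dataclass
from enum import IntEnum


class EvidenceLevel(IntEnum):
    """The four evidence levels from the scale above."""
    STRONGEST = 1
    STRONG = 2
    MODERATE = 3
    LIMITED = 4

    @property
    def label(self) -> str:
        # "STRONGEST" -> "Strongest"
        return self.name.capitalize()


@dataclass(frozen=True)
class Tag:
    """Hypothetical record keeping evidence and recommendation apart."""
    topic: str
    evidence: EvidenceLevel   # how well the research supports the claim
    recommendation: str       # what to do about it (free text, assumed)

    def render(self) -> str:
        level = f"[Level {self.evidence.value} ({self.evidence.label})]"
        return f"{level} [{self.recommendation}]"


smoking = Tag("smoking", EvidenceLevel.STRONGEST, "strong recommendation against")
print(smoking.render())
# prints: [Level 1 (Strongest)] [strong recommendation against]
```

Keeping the two fields separate means a high evidence level can sit next to a recommendation against something, as in the smoking example.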
The 4 levels in detail
Level 1 — Strongest
Supported by systematic reviews or meta-analyses of large randomized controlled trials (RCTs), replicated across independent research groups. Examples: smoking cessation, regular aerobic exercise, the Mediterranean dietary pattern, 7–9 hours of sleep.
Level 2 — Strong
Supported by multiple RCTs or large prospective cohorts showing consistent findings. Examples: olive oil, yogurt, oily fish, coffee (moderate intake), green tea, matcha, vinegar. Most of our “Foods in Focus” articles sit here.
Level 3 — Moderate
Supported by observational studies or small RCTs — suggestive but not decisive. “Worth trying, but do not overstate” applies here.
Level 4 — Limited
Supported only by animal studies, in vitro work, or case reports. Interesting as a topic, but not yet ready to act on. Most “latest miracle supplements” and “new biohacks” live here.
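The level definitions above can be read as a lookup from the kind of evidence available to a level. The sketch below is a deliberately simplified illustration: the key names and the function `assign_level` are hypothetical, and it assumes a claim is assigned the strongest level that any of its supporting evidence reaches. Actual grading involves editorial judgment that a lookup table cannot capture.

```python
# Hypothetical mapping from evidence base to level, following the table above.
LEVEL_BY_EVIDENCE_BASE = {
    # Assumed to mean: large RCTs, replicated across independent groups.
    "meta_analysis_of_large_rcts": 1,
    "multiple_rcts": 2,
    "large_prospective_cohort": 2,
    "observational_study": 3,
    "small_rct": 3,
    "animal_study": 4,
    "in_vitro_study": 4,
    "case_report": 4,
}


def assign_level(evidence_bases):
    """Return the strongest (lowest-numbered) level among the evidence bases."""
    if not evidence_bases:
        raise ValueError("at least one evidence base is required")
    return min(LEVEL_BY_EVIDENCE_BASE[base] for base in evidence_bases)


print(assign_level(["animal_study", "small_rct"]))  # prints: 3
```

A claim backed only by animal and in vitro work stays at Level 4 under this sketch, which matches the description of Level 4 above.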
The 4 evaluation axes
1. Effect size: quantified where possible
2. Reproducibility: single group vs. multiple independent confirmations
3. Side effects / risks: benefits and harms always reported together
4. Study population: healthy adults vs. patients, age, sex, ancestry
How to read our articles
Start with the [Level X (Strongest / Strong / Moderate / Limited)] tag at the top of each article. Recommendations tagged “Strongest” or “Strong” are worth implementing now. “Moderate” is wait-and-see: worth trying, but do not overstate it. “Limited” is interesting but not ready to act on.
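The reading rule above amounts to a lookup from the tag to a piece of guidance. The following sketch assumes a tag is written literally as, for example, `[Level 2 (Strong)]`; that exact format, the regular expression, and the names `GUIDANCE` and `read_tag` are assumptions made for illustration.

```python
import re

LABELS = {1: "Strongest", 2: "Strong", 3: "Moderate", 4: "Limited"}

# Guidance wording taken from the reading rule above.
GUIDANCE = {
    1: "worth implementing now",
    2: "worth implementing now",
    3: "wait-and-see",
    4: "interesting but not ready to act on",
}

# Assumed tag format: "[Level N (Label)]".
TAG_PATTERN = re.compile(r"\[Level ([1-4]) \((Strongest|Strong|Moderate|Limited)\)\]")


def read_tag(text: str) -> str:
    """Find the first evidence tag in the text and return the matching guidance."""
    match = TAG_PATTERN.search(text)
    if match is None:
        raise ValueError("no evidence tag found")
    level, label = int(match.group(1)), match.group(2)
    if LABELS[level] != label:
        raise ValueError(f"Level {level} does not match label {label!r}")
    return GUIDANCE[level]


print(read_tag("[Level 2 (Strong)] Olive oil"))  # prints: worth implementing now
```

The consistency check rejects a mismatched tag such as `[Level 1 (Limited)]` instead of guessing which half is correct.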
Last updated: May 2026