Effect Size Decision Tree
Not sure which effect size to use? Follow this flowchart:
```text
What type of outcome data do you have?
│
├── Binary (yes/no, death/survival, response/no response)
│   │
│   ├── Case-control study?
│   │   └── → Use OR (Odds Ratio)
│   │
│   └── RCT or cohort study?
│       └── → Use RR (Risk Ratio)
│           (When event rate < 10%, OR ≈ RR, either is fine)
│
├── Time-to-event / survival data (Kaplan-Meier, Cox regression)
│   └── → Use HR (Hazard Ratio)
│       Enter the HR and 95% CI reported in the paper
│
└── Continuous (means, scores, measurements)
    │
    ├── All studies use the same scale/unit?
    │   └── → Use MD (Mean Difference)
    │       e.g., blood pressure (mmHg), weight (kg), HbA1c (%)
    │
    └── Studies use different scales?
        └── → Use SMD (Standardized Mean Difference)
            e.g., different depression scales (PHQ-9, BDI, HAM-D)
```
OR — Odds Ratio
Definition
OR compares the odds of an event between two groups. Odds = probability of event / probability of no event.
OR = (a/b) / (c/d) = (a × d) / (b × c)
Where a = intervention group events, b = intervention group non-events, c = control group events, d = control group non-events.
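As a minimal sketch, the 2×2 calculation (with the standard Woolf log-scale confidence interval) looks like this in Python; the cell names match a, b, c, d above:

```python
import math

def odds_ratio(a, b, c, d):
    """OR from a 2x2 table: a/b = intervention events/non-events,
    c/d = control events/non-events."""
    return (a * d) / (b * c)

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI computed on the log-odds scale (Woolf method)."""
    or_value = odds_ratio(a, b, c, d)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)
    return (or_value,
            math.exp(math.log(or_value) - z * se),
            math.exp(math.log(or_value) + z * se))

# 10/90 events in the intervention arm vs 5/95 in the control arm
print(odds_ratio(10, 90, 5, 95))  # (10*95)/(90*5) ≈ 2.11
```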
When to Use OR
- Case-control studies — the only design where OR is the correct measure (you cannot calculate incidence rates directly)
- Logistic regression output naturally reports adjusted ORs
- When event rate is very low (<10%), OR ≈ RR, so either is acceptable
Interpretation
| OR Value | Meaning |
| --- | --- |
| OR = 1 | No difference between groups |
| OR < 1 | Lower odds of event in intervention group (protective if the event is adverse) |
| OR > 1 | Higher odds of event in intervention group |
Common pitfall: When event rates are high (>10%), OR exaggerates the effect size. For example, a true RR of 2.0 might correspond to an OR of 3.0 or higher. Do not substitute OR for RR when event rates are substantial.
RR — Risk Ratio (Relative Risk)
Definition
RR compares the probability (risk) of an event between two groups. Risk = number of events / total number of participants.
RR = (a / (a+b)) / (c / (c+d))
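Using the same a/b/c/d cell layout as the OR formula, a minimal sketch (the CI uses the standard Katz log-scale approximation):

```python
import math

def risk_ratio(a, b, c, d):
    """RR from a 2x2 table: a/b = intervention events/non-events,
    c/d = control events/non-events."""
    return (a / (a + b)) / (c / (c + d))

def risk_ratio_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI computed on the log-risk scale (Katz method)."""
    rr = risk_ratio(a, b, c, d)
    se = math.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
    return (rr,
            math.exp(math.log(rr) - z * se),
            math.exp(math.log(rr) + z * se))

# 10/100 risk vs 5/100 risk
print(risk_ratio(10, 90, 5, 95))  # 0.10 / 0.05 = 2.0
```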
When to Use RR
- RCTs — the preferred effect measure for evaluating intervention effects
- Cohort studies — association between exposure and outcome
- More intuitive clinical interpretation: RR = 0.7 → "30% risk reduction"
OR vs RR: How Much Do They Diverge?
| Control Group Event Rate | True RR | Corresponding OR | Divergence |
| --- | --- | --- | --- |
| 5% | 2.0 | 2.1 | Minimal |
| 20% | 2.0 | 2.7 | Notable |
| 40% | 2.0 | 6.0 | Severe |
Rule of thumb: The higher the event rate, the more OR overstates the effect compared to RR. If your study design is RCT or cohort, always prefer RR over OR.
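The divergence is pure algebra: given the control-group event rate p0, a risk ratio implies a unique odds ratio. A quick sketch (the formula is a rearrangement of the standard RR = OR / (1 − p0 + p0 × OR) relationship):

```python
def or_from_rr(rr, p0):
    """Odds ratio implied by a risk ratio at control-group event rate p0.
    Derived by solving RR = OR / (1 - p0 + p0 * OR) for OR."""
    return rr * (1 - p0) / (1 - rr * p0)

print(or_from_rr(2.0, 0.05))  # ≈ 2.1 — low event rate, OR ≈ RR
print(or_from_rr(2.0, 0.20))  # ≈ 2.7 — OR already overstates RR
```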
HR — Hazard Ratio
Definition
HR compares the instantaneous event rate (hazard) between two groups. It comes from survival analysis and accounts for follow-up time and censoring.
HR = hazard(intervention) / hazard(control)
When to Use HR
- Survival analysis — cancer survival (OS, PFS), cardiovascular event time, time to relapse
- Cox proportional hazards regression — papers reporting HR + 95% CI
- Kaplan-Meier curves with log-rank tests
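Because papers usually report only the HR and its 95% CI, pooling software first recovers the log hazard ratio and its standard error from the interval. A minimal sketch, assuming the CI is symmetric on the log scale (as Cox regression output is):

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z=1.96):
    """log(HR) and its standard error, recovered from a reported HR
    and 95% CI. Assumes the CI is symmetric on the log scale."""
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_hr, se

# A paper reporting HR = 0.70 (95% CI 0.50 to 0.98)
print(log_hr_and_se(0.70, 0.50, 0.98))
```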
Interpretation
| HR Value | Meaning |
| --- | --- |
| HR = 1 | No difference between groups |
| HR < 1 | Lower event rate in intervention group (e.g., slower disease progression → protective) |
| HR > 1 | Higher event rate in intervention group (harmful or risk factor) |
HR vs RR: Key Differences
| Feature | HR (Hazard Ratio) | RR (Risk Ratio) |
| --- | --- | --- |
| Data source | Cox regression / survival analysis | 2×2 frequency table / cumulative incidence |
| Accounts for time | Yes, considers when events occur | No, only cumulative events at a fixed time point |
| Handles censoring | Yes, correctly handles loss to follow-up | No, censored patients are excluded or crudely handled |
| Typical use | Oncology OS/PFS, cardiovascular MACE | RCT binary outcomes (cure rate, mortality rate) |
Critical rule: HR and OR/RR come from fundamentally different data types (survival data vs. frequency tables). Never mix them in the same meta-analysis.
MD — Mean Difference
Definition
MD is the direct difference between group means, preserving the original measurement unit.
MD = Mean(intervention) − Mean(control)
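A sketch of the MD together with the standard error needed for pooling, assuming two independent groups:

```python
import math

def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """MD and its standard error for two independent groups
    (group 1 = intervention, group 2 = control)."""
    md = m1 - m2
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return md, se

# Systolic BP: 120 (SD 10, n=50) vs 125.2 (SD 12, n=50) mmHg
print(mean_difference(120.0, 10.0, 50, 125.2, 12.0, 50))  # MD = -5.2 mmHg
```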
When to Use MD
- All studies measure the outcome on the same scale and unit
- Examples: systolic blood pressure (mmHg), body weight (kg), HbA1c (%)
Why MD Is Preferred Over SMD When Possible
MD has direct clinical meaning. "MD = −5.2 mmHg" tells a clinician exactly how much the intervention lowered blood pressure. This is far more actionable than "SMD = −0.4 standard deviations."
Tip: If a paper reports only the mean difference and its 95% CI (without individual group means and SDs), you can still include it using the generic inverse-variance method. Enter the MD and CI directly.
SMD — Standardized Mean Difference (Cohen's d)
Definition
SMD divides the mean difference by the pooled standard deviation, removing scale differences. The most common variant is Hedges' g (bias-corrected Cohen's d).
SMD = (Mean(intervention) − Mean(control)) / SD(pooled)
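A sketch of the calculation, including the small-sample correction factor that turns Cohen's d into Hedges' g:

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """SMD as Hedges' g: Cohen's d with the small-sample bias correction."""
    df = n1 + n2 - 2
    # Pooled SD across the two groups
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sp
    # Hedges' correction factor J ≈ 1 - 3 / (4*df - 1)
    j = 1 - 3 / (4 * df - 1)
    return d * j

# Means 5 vs 0, both SD 10, n = 20 per arm -> d = 0.5, slightly shrunk by J
print(hedges_g(5.0, 10.0, 20, 0.0, 10.0, 20))
```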
When to Use SMD
- Studies measure the same construct on different scales
- Examples: different depression scales (PHQ-9, BDI-II, HAM-D), different quality-of-life instruments (SF-36, EQ-5D, WHOQOL)
Cohen's d Interpretation Benchmarks
| \|SMD\| | Effect Size | Clinical Meaning |
| --- | --- | --- |
| 0.2 | Small | Effect exists but is subtle; may not be noticeable to patients |
| 0.5 | Medium | Clinically meaningful improvement; patients can perceive the difference |
| 0.8 | Large | Substantial clinical improvement; clearly noticeable benefit |
Limitation: SMD loses the original unit, making clinical interpretation less intuitive. Reviewers may ask you to convert the SMD back to a representative scale by multiplying it by that scale's SD. For example, an SMD of 0.5 multiplied by the PHQ-9's SD (typically ~5) gives ≈ 2.5 PHQ-9 points, which is more clinically meaningful.
Complete Comparison Table
| Feature | OR | RR | HR | MD | SMD |
| --- | --- | --- | --- | --- | --- |
| Data type | Binary | Binary | Survival | Continuous | Continuous |
| Null value | 1 | 1 | 1 | 0 | 0 |
| Preserves original unit | No | No | No | Yes | No |
| Clinical interpretability | Medium | High | High | High | Low |
| Recommended study design | Case-control | RCT / Cohort | Cox / KM | Same scale | Different scales |
| Log transformation needed | Yes | Yes | Yes | No | No |
| Statistical properties | Symmetric on log scale | Upper bound depends on baseline risk | Requires proportional hazards | Good normal approx. | Good normal approx. |
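The "log transformation needed" row matters in practice: ratio measures (OR, RR, HR) are pooled on the log scale and then back-transformed. A minimal fixed-effect sketch with inverse-variance weights (a random-effects model would add a between-study variance term to each weight):

```python
import math

def pool_fixed_log(effects, ses, z=1.96):
    """Fixed-effect inverse-variance pooling of ratio measures:
    pool log-effects, then exponentiate the result and its CI."""
    logs = [math.log(e) for e in effects]
    weights = [1 / se**2 for se in ses]  # ses are SEs of the log-effects
    pooled_log = sum(w * l for w, l in zip(weights, logs)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    return (math.exp(pooled_log),
            math.exp(pooled_log - z * se_pooled),
            math.exp(pooled_log + z * se_pooled))

# Two studies, each RR = 2.0 with log-scale SE = 0.2
print(pool_fixed_log([2.0, 2.0], [0.2, 0.2]))
```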
Common Mistakes and Their Consequences
- Using OR instead of RR when event rates are high — Overstates the effect size, potentially misleading clinical judgment. This is one of the most frequently flagged issues by peer reviewers.
- Using SMD when all studies use the same scale — Discards clinically meaningful units. "Blood pressure decreased by 0.3 standard deviations" is far less useful than "blood pressure decreased by 5 mmHg."
- Using MD when studies use different scales — Pooling values in different units produces a meaningless combined estimate that cannot be interpreted.
- Mixing OR and RR in the same analysis — All studies in a single meta-analysis must use the same effect measure. You cannot have some studies contribute OR and others contribute RR.
- Mixing HR with OR/RR — HR comes from survival analysis (considers time and censoring), while OR/RR come from frequency tables. They have fundamentally different statistical bases and cannot be combined.
- Using unadjusted and adjusted estimates together — When extracting ORs or HRs, ensure consistency: either use adjusted estimates from all studies or unadjusted estimates from all studies. Mixing creates systematic bias.
How to Switch Effect Sizes in MetaReview
MetaReview supports all 5 effect sizes with one-click switching:
1. Enter your raw data on the "Data Extraction" tab
2. Switch to the "Results" tab
3. Select the effect size type from the dropdown menu (OR / RR / HR / MD / SMD)
4. All statistics, the forest plot, the funnel plot, and the narrative paragraph update in real time
Best practice: Run your primary analysis with the most appropriate effect size for your study design, then switch to an alternative as a sensitivity analysis. For example, if your primary analysis uses RR, re-run with OR as a sensitivity check. If conclusions are consistent, your findings are more robust.
Quick Reference: Which Effect Size Should I Use?
| Your Situation | Recommended Effect Size | Why |
| --- | --- | --- |
| RCT comparing drug vs. placebo, outcome is mortality | RR | Prospective design, binary outcome, clinically intuitive |
| Case-control study of risk factor and disease | OR | Cannot calculate incidence in case-control designs |
| Oncology trial, outcome is overall survival | HR | Time-to-event data from Cox regression |
| Drug trial, outcome is blood pressure reduction (mmHg) | MD | All studies use same unit; MD preserves clinical meaning |
| Depression treatment, studies use different scales | SMD | Different instruments measuring the same construct |
| Logistic regression output from observational studies | OR | Adjusted ORs from logistic regression are standard |
| Cardiovascular trial, outcome is MACE event time | HR | Survival analysis with censoring and follow-up variation |