Methodology

This page describes what Embarke does under the hood, in language a methods section can quote. It is versioned as the product evolves; the current revision is 2026-05-13.

PRISMA 2020

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses 2020 (PRISMA 2020; Page et al., BMJ 2021) is the dominant reporting standard for systematic reviews of health interventions. Embarke's PRISMA 2020 systematic review output format produces the full PRISMA section structure (Rationale, Objectives, Eligibility, Information sources, Study selection, Data items + synthesis methods, Risk of bias, Certainty of evidence, Results, Discussion, Implications, Other information).

Every PRISMA 2020 run auto-generates a flow diagram SVG from pipeline counts (sources identified at Scout time, records included at citation persistence) and embeds it under the Study Selection section. The Critic agent enforces methodology completeness — drafts missing the Methods block, the GRADE certainty rollup, or the limitations acknowledgment fail review with a critical issue.

Lighter PRISMA-aligned outputs are also available: the Evidence brief (1,500–3,000 words, Plus tier) and the Scoping review (PRISMA-ScR) per Tricco et al., Ann Intern Med 2018 (Pro tier).

GRADE evidence ratings

The Grading of Recommendations Assessment, Development and Evaluation (GRADE) working group's framework assigns a quality rating to every finding across five published domains:

  • Risk of bias in the included evidence
  • Inconsistency of results across studies
  • Indirectness of evidence to the question
  • Imprecision (width of the confidence interval around the effect estimate)
  • Publication bias risk

Embarke's Calibrator agent runs after the Writer on every PRISMA-aligned framework (configurable via FrameworkSpec.calibrator_mode). A worst-of-domains rule rolls the per-domain judgments up to a single High / Moderate / Low / Very Low rating per finding; the rating is rendered inline in the report and summarized in the Certainty section.
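
The worst-of-domains rollup described above can be sketched as follows. This is a minimal illustration, not Embarke's actual code: the Certainty enum, the GRADE_DOMAINS keys, and rollup_certainty are hypothetical names chosen for the example.

```python
from enum import IntEnum


class Certainty(IntEnum):
    """Certainty levels, ordered so that a lower value means weaker evidence."""
    VERY_LOW = 0
    LOW = 1
    MODERATE = 2
    HIGH = 3


# Hypothetical keys for the five GRADE domains listed above.
GRADE_DOMAINS = (
    "risk_of_bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
)


def rollup_certainty(domain_ratings: dict) -> Certainty:
    """Return the weakest per-domain rating as the overall rating for a finding."""
    missing = [d for d in GRADE_DOMAINS if d not in domain_ratings]
    if missing:
        # Refuse to roll up an incomplete assessment rather than guess.
        raise ValueError(f"missing GRADE domains: {missing}")
    return min(domain_ratings[d] for d in GRADE_DOMAINS)
```

For example, a finding rated HIGH on four domains and LOW on imprecision would roll up to LOW under this rule.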

Non-PRISMA frameworks (Convergent assessment, ACH) use an “analytical confidence” calibrator mode instead. It applies the same H/M/L bucketing without the 5-domain GRADE substructure, since the questions those frameworks ask do not fit GRADE's domains.

Risk-of-bias tools

Embarke ships four published risk-of-bias instruments, applicable per cited study based on its design:

  • RoB 2 (Sterne et al., BMJ 2019) for randomized trials. 5 domains. Judgments in low / some_concerns / high. Overall = worst across domains.
  • ROBINS-I (Sterne et al., BMJ 2016) for non-randomized interventional studies. 7 domains. Judgments in low / moderate / serious / critical / no_information. Per the published rule, any no_information domain short-circuits the overall to no_information.
  • AMSTAR 2 (Shea et al., BMJ 2017) for appraising systematic reviews. 16 items, 4 of which Embarke treats as critical (protocol registration, comprehensive search, RoB assessment of included studies, RoB accounting in synthesis). Any critical item answered “no” drives the overall to critically_low.
  • QUADAS-2 (Whiting et al., Ann Intern Med 2011) for diagnostic accuracy studies. 4 domains with low / high / unclear judgments, worst-wins rollup.

Per-domain judgments are LLM-proposed with a short rationale citing the source evidence; the rollup to an overall judgment is algorithmic (not LLM-guessed) so the published rules apply deterministically. Reviewers can override any per-domain judgment in the UI.
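
The algorithmic rollups described above could look roughly like the sketch below. All function and constant names are hypothetical. Two points are assumptions rather than statements from this page: that QUADAS-2 ranks unclear between low and high, and that AMSTAR 2 item answers are stored as the string "no". The AMSTAR 2 helper covers only the one rule stated above.

```python
# Severity orderings, from best to worst.
ROB2_ORDER = ["low", "some_concerns", "high"]
ROBINS_I_ORDER = ["low", "moderate", "serious", "critical"]
QUADAS2_ORDER = ["low", "unclear", "high"]  # assumption: unclear sits between low and high


def worst_of(judgments, order):
    """Return the most severe judgment; raises ValueError on an unknown label."""
    return max(judgments, key=order.index)


def rollup_rob2(domains: dict) -> str:
    """RoB 2: overall judgment is the worst across the domains."""
    return worst_of(domains.values(), ROB2_ORDER)


def rollup_robins_i(domains: dict) -> str:
    """ROBINS-I: any no_information domain short-circuits the overall judgment."""
    values = list(domains.values())
    if "no_information" in values:
        return "no_information"
    return worst_of(values, ROBINS_I_ORDER)


def rollup_quadas2(domains: dict) -> str:
    """QUADAS-2: worst-wins across the domains."""
    return worst_of(domains.values(), QUADAS2_ORDER)


def amstar2_critically_low(items: dict, critical_items) -> bool:
    """AMSTAR 2: True when any critical item was answered "no"."""
    return any(items.get(item) == "no" for item in critical_items)
```

Keeping the orderings as plain lists makes the rule auditable at a glance, and an unrecognized label fails loudly instead of being silently ranked.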

Reporting-standard checks

The Critic agent detects the dominant study type across cited findings by running regexes over each title and URL. The recognized types are RCT, observational, diagnostic accuracy, prognostic prediction, systematic review, narrative review, guideline, and case report. The Critic then applies the key checklist items of the corresponding reporting standard:

  • RCT → CONSORT 2010 (Schulz et al.)
  • Observational → STROBE (von Elm et al.)
  • Diagnostic accuracy → STARD 2015 (Bossuyt et al.)
  • Prognostic model → TRIPOD+AI (Collins et al., 2024 update of TRIPOD with AI/ML extensions)
  • Systematic review → PRISMA 2020 (Page et al.)

When the draft skips an item the dominant study design requires, the Critic raises a missing_reporting_standard_item issue naming the specific item. Narrative reviews, case reports, and guidelines don't carry an audit-grade reporting bar; the check is skipped for those.
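
As a rough sketch of the detection step, the snippet below classifies each cited finding with a regex over its title and URL, then maps the most common type to a reporting standard. The patterns, the dictionary keys, and the function names are illustrative assumptions; the page does not publish the real regexes.

```python
import re
from collections import Counter

# Hypothetical patterns. Order matters: "systematic review of randomized trials"
# should classify as a systematic review, so that pattern is checked first.
STUDY_TYPE_PATTERNS = {
    "systematic_review": re.compile(r"\bsystematic review\b|\bmeta-analysis\b", re.I),
    "diagnostic_accuracy": re.compile(r"\bdiagnostic accuracy\b", re.I),
    "prognostic_model": re.compile(r"\bprediction model\b|\bprognostic model\b", re.I),
    "rct": re.compile(r"\brandomi[sz]ed\b|\bRCT\b", re.I),
    "observational": re.compile(r"\bcohort\b|\bcase[- ]control\b|\bcross[- ]sectional\b", re.I),
}

# Mapping taken from the list above.
REPORTING_STANDARD = {
    "rct": "CONSORT 2010",
    "observational": "STROBE",
    "diagnostic_accuracy": "STARD 2015",
    "prognostic_model": "TRIPOD+AI",
    "systematic_review": "PRISMA 2020",
}


def classify(title: str, url: str):
    """Return the first matching study type, or None when nothing matches."""
    text = f"{title} {url}"
    for study_type, pattern in STUDY_TYPE_PATTERNS.items():
        if pattern.search(text):
            return study_type
    return None


def dominant_standard(findings):
    """findings: iterable of (title, url). Returns a standard name, or None to skip the check."""
    types = (classify(title, url) for title, url in findings)
    counts = Counter(t for t in types if t is not None)
    if not counts:
        return None
    study_type, _ = counts.most_common(1)[0]
    return REPORTING_STANDARD[study_type]
```

In this sketch, narrative reviews, case reports, and guidelines simply match no pattern, so they never select a standard.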

Retraction-aware citations

Every DOI cited in a synthesis is enriched against Crossref's metadata API, which carries Retraction Watch's retraction-status flags via the updated-by field. Cited papers flagged as retracted or as carrying an expression of concern surface in three places:

  • Inline badges in the rendered report (and inside the DOCX / PDF exports).
  • A top-of-report retraction banner summarizing flagged citations.
  • A critical Critic issue (cited_retracted_paper) that blocks Writer approval unless the draft explicitly acknowledges the retraction in context.

Verified live against the Wakefield 1998 retraction (DOI 10.1016/s0140-6736(97)11096-0) on 2026-05-10 — Crossref returns retracted status with date 2010-02-06.
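
A minimal sketch of reading those flags from an already-fetched Crossref work record is shown below. The shape of each updated-by entry (a type string, a DOI for the notice, and an updated.date-parts list) and the exact type labels are assumptions to confirm against Crossref's API documentation; retraction_flags and FLAGGED_TYPES are hypothetical names, and no network call is made.

```python
from datetime import date

# Assumed update-type labels; verify the exact strings against Crossref's docs.
FLAGGED_TYPES = {"retraction", "expression_of_concern"}


def retraction_flags(work: dict) -> list:
    """Extract retraction / expression-of-concern notices from a work record."""
    flags = []
    for update in work.get("updated-by", []):
        update_type = str(update.get("type", "")).lower()
        if update_type not in FLAGGED_TYPES:
            continue  # corrections, errata, etc. are not flagged here
        date_parts = update.get("updated", {}).get("date-parts") or [[]]
        parts = date_parts[0]
        # Only build a date when year, month, and day are all present.
        flagged_on = date(*parts) if len(parts) == 3 else None
        flags.append(
            {"type": update_type, "notice_doi": update.get("DOI"), "date": flagged_on}
        )
    return flags
```

A caller could treat any non-empty result as the trigger for the badge, the banner, and the Critic issue.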

Source-tier classification

Every URL Scout returns is classified into one of ten source categories (peer-reviewed, regulatory filing, preprint, primary interview, analyst report, trade publication, blog post, press release, social media, other), each of which maps to one of six quality tiers (A → F).

Each FrameworkSpec declares a minimum source tier. PRISMA 2020 enforces tier B (peer-reviewed, preprint, regulatory, and primary-interview sources only). PRISMA-light and Evidence brief frameworks also declare tier B. Lighter business-research frameworks accept tier F (everything). The subscription tier additionally caps the ladder: Free is capped at A+B regardless of framework.
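
The interaction between the framework minimum and the subscription cap can be sketched as below. The category-to-tier mapping is invented for illustration; the page only states that peer-reviewed, preprint, regulatory, and primary-interview sources fall within tier B or better. CATEGORY_TIER and is_source_allowed are hypothetical names.

```python
from typing import Optional

TIERS = "ABCDEF"  # best to worst

# Hypothetical mapping; only the four A/B categories are implied by the text above.
CATEGORY_TIER = {
    "peer_reviewed": "A",
    "regulatory_filing": "A",
    "preprint": "B",
    "primary_interview": "B",
    "analyst_report": "C",
    "trade_publication": "D",
    "press_release": "D",
    "blog_post": "E",
    "social_media": "F",
    "other": "F",
}


def is_source_allowed(
    category: str, framework_min_tier: str, plan_cap_tier: Optional[str] = None
) -> bool:
    """A source passes when its tier is at least as good as the stricter limit."""
    tier = CATEGORY_TIER.get(category, "F")  # unknown categories get the lowest tier
    limit = TIERS.index(framework_min_tier)
    if plan_cap_tier is not None:
        # The subscription cap can only tighten the framework's limit, never loosen it.
        limit = min(limit, TIERS.index(plan_cap_tier))
    return TIERS.index(tier) <= limit
```

Under these assumptions, a blog post is accepted by a tier-F framework on a paid plan but rejected on a plan capped at tier B.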

Reproducibility ZIP

Every report can be exported as a signed reproducibility package — a ZIP containing:

  • The full Markdown / DOCX / PDF rendering of the report.
  • manifest.json — model IDs per agent, temperature, prompt SHA-256 hash, corpus version, generation timestamp, framework + output format IDs.
  • The full source list with URLs, source tiers, Crossref enrichment status, retraction flags.
  • BibTeX + RIS exports of the citation set.
  • A SIGNATURE file — SHA-256 hash of the manifest, so a reviewer can confirm the package hasn't been edited after issue.

The audit-grade positioning is “drop the ZIP into your regulatory binder.” The manifest documents exactly what produced this report; the signature lets a reviewer detect tampering.
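
A reviewer-side check of the package might look like the sketch below. It assumes that manifest.json and SIGNATURE sit at the root of the ZIP and that SIGNATURE holds the hex-encoded SHA-256 digest as text; neither detail is specified on this page, and the function names are hypothetical.

```python
import hashlib
import zipfile


def manifest_matches(manifest_bytes: bytes, signature_text: str) -> bool:
    """Compare the SHA-256 hex digest of the manifest with the recorded value."""
    recorded = signature_text.strip().lower()
    return hashlib.sha256(manifest_bytes).hexdigest() == recorded


def verify_package(package_file) -> bool:
    """package_file: a path or binary file object for the reproducibility ZIP."""
    with zipfile.ZipFile(package_file) as package:
        manifest_bytes = package.read("manifest.json")  # assumed archive path
        signature_text = package.read("SIGNATURE").decode("utf-8")  # assumed hex text
    return manifest_matches(manifest_bytes, signature_text)
```

Hashing the raw bytes rather than re-serialized JSON means that even a whitespace-only edit to the manifest changes the digest.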

Methodology stamp + audit trail

Every Markdown / PDF / DOCX export opens with a methodology stamp: framework name, output format, total cited sources, retracted-paper count, generation timestamp, prompt hash, corpus version. Provenance up top, body below.

Behind the scenes, every pipeline run persists agent steps, Critic verdicts (per iteration, including revision cycles), Calibrator outputs, RoB assessments, and the raw prompt hashes — queryable via the admin dashboard for audit reconstruction.

PROSPERO pre-check

When a user starts a project on any PRISMA-aligned framework (PRISMA 2020, PRISMA-ScR, Evidence brief, PRISMA-light), the new-project form surfaces a banner linking to PROSPERO — the International Prospective Register of Systematic Reviews at the University of York's Centre for Reviews and Dissemination — pre-filled with the user's question.

The pre-check doesn't block project creation; it surfaces the link and lets the user decide. Duplicating an already-registered review is a credibility issue, not a technical error.

What we don't claim

The audit-grade label is earned with explicit limitations stated in every report:

  • Search strategy is not externally registered.
  • Per-query search logs are not preserved in the v0.1 pipeline.
  • Screening is single-reviewer (the Critic acts as second reviewer for methodology completeness, not inclusion decisions).
  • Per-study data-extraction tables are a Tier-2 add-on, not standard on every synthesis.
  • The Calibrator's GRADE assessment is LLM-proposed; reviewers should validate before regulatory use.

All five limitations are named in the Limitations section of every PRISMA 2020 output by default. The Critic enforces that they appear.

Questions on methodology not answered here? Email [email protected] — we maintain this page based on what reviewers ask.
