How to Validate Your EPUB Before Publishing


Validation catches the errors your readers and retailers will find if you do not. A single malformed metadata entry can cause Apple Books to reject your upload. A missing navigation document alone adds 40 points to Rahatt's suppression risk score. An image without alt text violates the European Accessibility Act. All of these are detectable and fixable before you publish, if you run the right validation tools.

This guide walks through a three-tool validation workflow: EPUBCheck for structural correctness, DAISY Ace for accessibility compliance, and Rahatt for commercial suppression risk. Together, they catch virtually every issue that could hurt your book's distribution or visibility.

The Three-Layer Validation Workflow

| Layer | Tool | What It Checks | Time Required |
| --- | --- | --- | --- |
| 1. Structure | EPUBCheck | Valid EPUB format, correct HTML, proper metadata, file integrity | 1-2 minutes |
| 2. Accessibility | DAISY Ace | WCAG 2.1 compliance, alt text, heading hierarchy, contrast, navigation | 2-5 minutes |
| 3. Commercial Risk | Rahatt | Amazon suppression triggers, risk scoring, auto-fixable issues | Under 1 minute |

Each layer catches different problems. EPUBCheck will pass a book that has no alt text (it is not checking accessibility). DAISY Ace will pass a book with a broken internal link (it is not checking structural integrity). Rahatt maps accessibility findings to commercial impact (what will actually cost you sales). Running all three takes under 10 minutes and can save you weeks of lost visibility.

Layer 1: EPUBCheck, Structural Validation

EPUBCheck is the official W3C validation tool for EPUB files. It verifies that your EPUB conforms to the EPUB specification: correct file structure, valid HTML, proper metadata, and consistent internal references.

How to Run EPUBCheck

Option 1: Web interface. Go to epubcheck.org and upload your file. Results appear in your browser. Best for occasional use.

Option 2: Command line. Download EPUBCheck from the W3C GitHub repository and run it locally:

java -jar epubcheck.jar your-book.epub

This requires Java to be installed on your system. The command-line version is faster and works with automated workflows.

Option 3: Integrated. Sigil includes EPUBCheck as a built-in feature (Tools > Validate EPUB). Calibre runs a lightweight validation during conversion.

The 10 Most Common EPUBCheck Errors

Based on aggregated data from EPUBCheck's public validator (which processed over 2 million checks in 2025), these are the most frequent errors indie authors encounter:

| Rank | Error | Frequency | Severity | Fix |
| --- | --- | --- | --- | --- |
| 1 | Missing dc:language | 23% | Error | Add language code to OPF metadata |
| 2 | Missing/malformed dc:identifier | 19% | Error | Add ISBN or UUID to OPF |
| 3 | Broken internal links | 17% | Error | Fix href paths in XHTML files |
| 4 | Invalid HTML in content files | 15% | Error | Fix unclosed tags, invalid attributes |
| 5 | Missing navigation document | 12% | Error | Add nav.xhtml or toc.ncx |
| 6 | Undeclared manifest items | 11% | Error | Add all files to OPF manifest |
| 7 | Incorrect mimetype file | 8% | Error | Ensure mimetype is first, uncompressed |
| 8 | Missing dc:title | 7% | Error | Add title to OPF metadata |
| 9 | Deprecated EPUB 2 constructs | 6% | Warning | Update to EPUB 3 syntax |
| 10 | Image file not in manifest | 5% | Error | Add image to OPF manifest |

How to Fix Common EPUBCheck Errors

Missing dc:language: Open your OPF file and add the language element inside <metadata>:

<dc:language>en</dc:language>

Broken internal links: EPUBCheck tells you exactly which file and line number contains the broken link. Open that XHTML file and correct the href attribute. Common causes: renamed chapter files, deleted content, case-sensitivity mismatches (Chapter01.xhtml vs chapter01.xhtml).
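If you want to triage links yourself before re-running EPUBCheck, the case-sensitivity cause in particular is easy to detect. The following is a minimal sketch, not part of EPUBCheck: `check_link` is a hypothetical helper, and it assumes you already have the list of file paths inside the EPUB (for example from Python's `zipfile` module).

```python
import posixpath
from urllib.parse import unquote, urldefrag, urlsplit


def check_link(source_file: str, href: str, archive_names: set[str]) -> str:
    """Classify an href as 'ok', 'external', 'case-mismatch', or 'missing'.

    source_file and archive_names are paths inside the EPUB ZIP,
    for example 'OEBPS/text/chapter02.xhtml'.
    """
    if urlsplit(href).scheme:
        return "external"                     # http:, https:, mailto:, ...
    path, _fragment = urldefrag(href)          # drop any '#anchor' part
    if not path:
        return "ok"                            # link to an anchor in the same file
    # Resolve the href relative to the directory of the file that contains it.
    target = posixpath.normpath(
        posixpath.join(posixpath.dirname(source_file), unquote(path))
    )
    if target in archive_names:
        return "ok"
    lowered = {name.lower() for name in archive_names}
    return "case-mismatch" if target.lower() in lowered else "missing"


# Illustrative file list, not taken from a real book.
names = {"OEBPS/text/chapter01.xhtml", "OEBPS/text/chapter02.xhtml"}
print(check_link("OEBPS/text/chapter02.xhtml", "Chapter01.xhtml", names))
```

A `case-mismatch` result usually means the link works on a case-insensitive desktop file system but breaks inside the EPUB container, where paths are case-sensitive.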

Undeclared manifest items: Every file in your EPUB must be listed in the OPF manifest with the correct media type. If you added an image or stylesheet without updating the manifest, EPUBCheck catches it:

<item id="img-castle" href="images/castle.jpg" media-type="image/jpeg"/>

Incorrect mimetype file: The mimetype file must be the first file in the ZIP archive, must not be compressed, and must contain exactly the text application/epub+zip with no trailing newline. Most formatting tools handle this correctly, but manual ZIP operations can break it.
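The three mimetype rules above can be checked with Python's standard `zipfile` module. This is a rough sketch rather than a substitute for EPUBCheck; `mimetype_problems` and `build_sample` are hypothetical names, and the check relies on `infolist()` order, which normally matches the order in which entries were written.

```python
import io
import zipfile

EXPECTED = b"application/epub+zip"


def mimetype_problems(epub) -> list[str]:
    """Return problems with the mimetype entry (an empty list means none found).

    `epub` may be a file path or a binary file object.
    """
    problems = []
    with zipfile.ZipFile(epub) as zf:
        names = [entry.filename for entry in zf.infolist()]
        if "mimetype" not in names:
            return ["mimetype file is missing"]
        if names[0] != "mimetype":
            problems.append("mimetype is not the first entry")
        if zf.getinfo("mimetype").compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed")
        if zf.read("mimetype") != EXPECTED:
            problems.append("mimetype content is not exactly 'application/epub+zip'")
    return problems


def build_sample(mimetype_first: bool, trailing_newline: bool = False) -> io.BytesIO:
    """Build a tiny in-memory archive for demonstration (not a complete EPUB)."""
    buf = io.BytesIO()
    content = EXPECTED + (b"\n" if trailing_newline else b"")
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        if not mimetype_first:
            zf.writestr("META-INF/container.xml", "<container/>")
        # Store the mimetype entry uncompressed, as the rule requires.
        zf.writestr("mimetype", content, compress_type=zipfile.ZIP_STORED)
        if mimetype_first:
            zf.writestr("META-INF/container.xml", "<container/>")
    buf.seek(0)
    return buf


print(mimetype_problems(build_sample(mimetype_first=False)))
```

To check a real book, pass its path instead, for example `mimetype_problems("your-book.epub")`.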

For a broader understanding of EPUB structure and how these components fit together, see our complete ebook formatting guide.

Layer 2: DAISY Ace, Accessibility Validation

DAISY Ace is the Accessibility Checker for EPUB, maintained by the DAISY Consortium. It evaluates your ebook against WCAG 2.1 (Web Content Accessibility Guidelines) and EPUB accessibility best practices.

What Ace Checks That EPUBCheck Does Not

EPUBCheck validates format compliance. Ace validates accessibility compliance. These are different concerns:

| Check | EPUBCheck | DAISY Ace |
| --- | --- | --- |
| Valid HTML structure | Yes | No |
| Image alt text present | No | Yes |
| Heading hierarchy correct | No | Yes |
| Color contrast sufficient | No | Yes |
| Navigation document functional | Yes (exists) | Yes (accessible) |
| Language attribute set | Yes | Yes |
| Accessibility metadata present | Partial | Yes |
| Table structure accessible | No | Yes |
| ARIA attributes valid | No | Yes |
| Reading order logical | No | Yes |

How to Run DAISY Ace

Ace requires Node.js 18 or later. Install it globally:

npm install -g @daisy/ace

Then run it on your EPUB:

ace your-book.epub -o ace-report

This generates an HTML report in the ace-report directory. Open report.html in your browser to see a detailed breakdown of every accessibility issue found, organized by severity (violation, warning, suggestion).

For complete setup instructions and tips for interpreting Ace results, see our DAISY Ace guide.

The Most Common Ace Violations

According to the DAISY Consortium's 2025 annual report, 73% of independently published EPUBs fail at least one Ace check. The most common violations:

1. Missing alt text (68% of scanned books)

Every <img> element must have an alt attribute. Decorative images should have alt="". Images conveying information need descriptive alt text. This is the most frequent violation because most formatting tools do not prompt authors to add alt text.

Fix: Add alt attributes in Sigil, or use Rahatt's AI alt text feature to generate and inject descriptions automatically. For alt text writing best practices, see our alt text guide.
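To get a quick list of images that need attention before opening Sigil, you can scan a content file for `<img>` elements without an alt attribute. This is a minimal sketch using Python's built-in HTML parser; `AltTextAuditor` and `audit_alt_text` are hypothetical names, and the scan does not judge whether existing alt text is any good.

```python
from html.parser import HTMLParser


class AltTextAuditor(HTMLParser):
    """Sort <img> elements into 'missing alt' and 'empty alt' (decorative)."""

    def __init__(self):
        super().__init__()
        self.missing = []      # src of images with no alt attribute at all
        self.decorative = []   # src of images with alt=""

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        if "alt" not in attributes:
            self.missing.append(src)
        elif not (attributes["alt"] or "").strip():
            self.decorative.append(src)


def audit_alt_text(xhtml: str) -> AltTextAuditor:
    auditor = AltTextAuditor()
    auditor.feed(xhtml)
    auditor.close()
    return auditor


# Illustrative markup, not taken from a real book.
sample = (
    '<html><body>'
    '<img src="castle.jpg"/>'
    '<img src="rule.png" alt=""/>'
    '<img src="map.png" alt="Map of the northern kingdom"/>'
    '</body></html>'
)
result = audit_alt_text(sample)
print("Missing alt:", result.missing)
print("Marked decorative:", result.decorative)
```

Review the decorative list as well: an empty alt is only correct when the image really conveys no information.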

2. Incorrect heading hierarchy (41%)

Headings must not skip levels. An <h1> followed by an <h3> (skipping <h2>) is a violation. Screen readers use heading hierarchy to build a navigable outline of the book, and skipped levels create confusing gaps in that outline.

Fix: Review heading levels in each XHTML file and correct any gaps. Rahatt's auto-fix feature corrects heading hierarchy automatically by downgrading improperly leveled headings.
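The skipped-level rule is simple enough to check by hand. Here is a small sketch of that check, assuming a regular-expression scan is good enough for a first pass (a real parser would be more robust). `skipped_heading_levels` is a hypothetical helper, not something Ace or Rahatt exposes.

```python
import re

# Matches the opening of <h1> through <h6>, with or without attributes.
HEADING = re.compile(r"<h([1-6])\b", re.IGNORECASE)


def skipped_heading_levels(xhtml: str) -> list[tuple[int, int]]:
    """Return (previous_level, level) pairs where a heading jumps down more than one level."""
    levels = [int(match.group(1)) for match in HEADING.finditer(xhtml)]
    skips = []
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            skips.append((previous, current))
    return skips


# Illustrative chapter: the h1 is followed directly by an h3.
chapter = "<h1>Chapter One</h1><p>...</p><h3>A Scene</h3>"
print(skipped_heading_levels(chapter))  # [(1, 3)]
```

Moving back up the hierarchy (an h3 followed by an h2) is fine and is not reported; only downward jumps of more than one level count as skips.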

3. Insufficient link contrast (35%)

Links must be visually distinguishable from surrounding text with a contrast ratio of at least 4.5:1 against the background. Many ebooks style links with colors that look different on a computer screen but fail the mathematical contrast ratio test.

Fix: Set link color to #0066CC or darker, with text-decoration: underline. Rahatt's auto-fix injects WCAG-compliant link styles into your CSS.
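The "mathematical contrast ratio test" is the WCAG relative-luminance formula, which you can run yourself on any color pair. The sketch below follows the formula as published in WCAG 2.1; the function names are my own, and it assumes six-digit hex colors.

```python
def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an sRGB color given as '#RRGGBB' (WCAG 2.1 definition)."""
    value = hex_color.lstrip("#")
    channels = [int(value[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Convert each gamma-encoded channel to linear light.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


ratio = contrast_ratio("#0066CC", "#FFFFFF")
print(f"#0066CC on white: {ratio:.2f}:1, passes AA: {ratio >= 4.5}")
```

On a white background, #0066CC clears the 4.5:1 threshold. Remember that e-readers with sepia or dark themes change the background, so the same link color can produce a different ratio there.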

4. Missing accessibility metadata (31%)

EPUB 3 supports schema.org accessibility metadata (accessibilityFeature, accessibilityHazard, accessibilitySummary). Missing these properties is a violation under the European Accessibility Act and a suppression risk factor on Amazon.

Fix: Add the appropriate metadata to your OPF file. See our metadata guide for copy-paste templates, or use Rahatt's auto-fix to inject metadata automatically.
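Before editing the OPF, it helps to know which of the three properties are actually missing. The following sketch checks only the properties named above, and it assumes they are written as EPUB 3 `<meta property="schema:...">` elements. `missing_accessibility_metadata` is a hypothetical helper, and the sample OPF is illustrative.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
EXPECTED_PROPERTIES = (
    "schema:accessibilityFeature",
    "schema:accessibilityHazard",
    "schema:accessibilitySummary",
)


def missing_accessibility_metadata(opf_xml: str) -> list[str]:
    """Return the expected schema.org properties that have no non-empty <meta> in the OPF."""
    root = ET.fromstring(opf_xml)
    present = {
        meta.get("property")
        for meta in root.iter(f"{{{OPF_NS}}}meta")
        if (meta.text or "").strip()
    }
    return [name for name in EXPECTED_PROPERTIES if name not in present]


SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata>
    <meta property="schema:accessibilityFeature">alternativeText</meta>
    <meta property="schema:accessibilityHazard">none</meta>
  </metadata>
</package>"""

print(missing_accessibility_metadata(SAMPLE_OPF))  # ['schema:accessibilitySummary']
```

Presence is all this checks. Whether the values are accurate for your book (for example, claiming alternativeText when images lack alt text) still needs a human review.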

5. Missing language attribute (18%)

The xml:lang attribute must be set on the root <html> element of every XHTML file. Screen readers use this to select the correct pronunciation engine.

Fix: Add xml:lang="en" (or the appropriate language code) to the <html> element in each content file.
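With many content files, it is easier to check the root element programmatically than to open each file. This is a minimal sketch using the standard XML parser; `root_language` is a hypothetical helper, and it assumes the file is well-formed XHTML without undeclared named entities such as `&nbsp;`, which this parser rejects.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def root_language(xhtml: str) -> Optional[str]:
    """Return the xml:lang value on the root element, or None if it is absent or empty."""
    root = ET.fromstring(xhtml)
    value = (root.get(XML_LANG) or "").strip()
    return value or None


with_lang = '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><body/></html>'
without_lang = '<html xmlns="http://www.w3.org/1999/xhtml"><body/></html>'

print(root_language(with_lang))     # en
print(root_language(without_lang))  # None
```

Running this over every XHTML file in the book gives you the list of files that still need the attribute.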

Layer 3: Rahatt, Suppression Risk Assessment

EPUBCheck and Ace tell you what is technically wrong with your EPUB. Rahatt tells you what is commercially dangerous: which issues will actually affect your book's visibility on Amazon and other retailers.

How Rahatt Scoring Works

Rahatt maps accessibility violations to Amazon's known suppression triggers and calculates a risk score from 0 to 100:

| Issue | Risk Points | Cap |
| --- | --- | --- |
| Missing alt text | -10 per image | -40 total |
| Missing/broken navigation | -40 | -40 |
| Missing accessibility metadata | -20 | -20 |

Risk levels:

  • 0-19 (Low): Your ebook is in good shape. No action needed.
  • 20-49 (Medium): Issues present that may reduce visibility. Fix recommended.
  • 50-79 (High): Significant suppression risk. Fix before publishing.
  • 80-100 (Critical): Near-certain ranking suppression. Immediate fixes required.
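To see how the points and caps combine into a risk level, here is the arithmetic as a short sketch. It is reconstructed from the table and risk levels above, not taken from Rahatt's implementation. It assumes the table's negative values count as points added to the risk total, and the function names are my own.

```python
def risk_score(images_missing_alt: int, navigation_broken: bool, metadata_missing: bool) -> int:
    """Combine the three issue categories into a 0-100 risk score (higher is riskier)."""
    alt_points = min(10 * images_missing_alt, 40)   # 10 per image, capped at 40
    navigation_points = 40 if navigation_broken else 0
    metadata_points = 20 if metadata_missing else 0
    return alt_points + navigation_points + metadata_points


def risk_level(score: int) -> str:
    """Map a score onto the four risk bands listed above."""
    if score >= 80:
        return "Critical"
    if score >= 50:
        return "High"
    if score >= 20:
        return "Medium"
    return "Low"


# Two images without alt text plus missing accessibility metadata.
score = risk_score(images_missing_alt=2, navigation_broken=False, metadata_missing=True)
print(score, risk_level(score))  # 40 Medium
```

Because of the cap, a book with seven undescribed images scores the same 40 alt-text points as one with four. Navigation and metadata are all-or-nothing.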

Running a Rahatt Scan

  1. Go to rahatt.co
  2. Drop your EPUB file onto the upload area (or click to browse)
  3. The scan runs automatically and returns results in under 30 seconds
  4. Review your risk score, identified issues, and recommendations

Rahatt Auto-Fix

For issues scoring Medium or above, Rahatt can automatically fix many common problems:

  • Accessibility metadata injection: Adds missing schema:accessibilityFeature, schema:accessibilityHazard, and schema:accessibilitySummary properties
  • Heading hierarchy correction: Automatically downgrades improperly leveled headings (e.g., h4 → h3 when h2 is the parent)
  • Link contrast injection: Appends WCAG 2.1 AA compliant CSS for link styling
  • Alt text generation: Uses AI to generate context-aware alt text suggestions, which you review and approve before injection

A typical auto-fix session takes an ebook from a Medium Risk score (40/100) to Low Risk (0/100) in under 5 minutes. See our guide to fixing EPUB accessibility issues for detailed walkthroughs.

The Complete Validation Checklist

Run through these steps before every ebook upload:

Pre-Validation (Before Running Tools)

  • All chapter files are present and in the correct reading order
  • All images are optimized and under the size budget
  • Every image has an alt attribute (descriptive or empty for decorative)
  • Cover image meets retailer specifications
  • Table of contents matches actual chapter structure

EPUBCheck

  • Run EPUBCheck: zero errors
  • Review warnings and fix any that indicate real problems
  • All files are declared in the manifest
  • dc:language, dc:title, dc:identifier are present
  • Navigation document exists and is declared correctly

DAISY Ace

  • Run Ace and review all violations
  • Alt text present on all meaningful images
  • Heading hierarchy follows correct order (no skipped levels)
  • Link contrast meets 4.5:1 ratio
  • Accessibility metadata is present and accurate
  • Language attribute set on all content files

Rahatt

  • Risk score is Low (0-19)
  • No critical findings remain
  • If issues found, apply auto-fix and re-scan to verify

Platform Preview

  • Kindle Previewer 3: navigation works, images display, formatting correct
  • Apple Books (if available): open and read through key sections
  • Calibre Viewer or Thorium: spot-check formatting

Validation Across the Publishing Lifecycle

Validation is not a one-time event. Run the checks that match each event:

| Event | What to Validate | Priority Tools |
| --- | --- | --- |
| After initial formatting | Full validation (all three layers) | EPUBCheck + Ace + Rahatt |
| After content edits | Structure + risk | EPUBCheck + Rahatt |
| After cover change | Manifest + cover reference | EPUBCheck |
| After metadata updates | Metadata + risk | EPUBCheck + Rahatt |
| Before each retailer upload | Full validation | All three |
| Annual backlist review | Risk score check | Rahatt |
| After accessibility standards update | Compliance + risk | Ace + Rahatt |

The annual backlist review is particularly important. Amazon's suppression algorithms evolve, and a book that passed risk assessment in 2025 may accumulate risk as standards tighten. A quick Rahatt scan (under a minute per book) identifies any backlist titles that need attention.

Interpreting Validation Results

When You Can Ignore Warnings

Not every warning requires action. Here are common warnings you can safely acknowledge without fixing:

  • "OPF referenced resource not in spine": This is normal for images, stylesheets, and other non-content files. They should be in the manifest but not necessarily in the spine.
  • "Insufficient contrast for decorative text": If the low-contrast text is purely decorative (e.g., a watermark-style background element), this is acceptable.
  • "Missing page-list navigation": Only required if your ebook maps to a specific print edition page numbering.

When Warnings Are Actually Critical

Some warnings should be treated as errors:

  • "Navigation document has no entries": A navigation document with zero entries is worse than no document at all. E-readers may display an empty menu.
  • "Image referenced but not found": This means readers will see a broken image icon. Always fix.
  • "Multiple dc:identifier elements": Can confuse retailer ingestion systems. Keep one primary identifier.

Automating Validation

If you publish frequently (more than 5 books per year), consider automating your validation workflow:

Command-Line Workflow

# Validate structure
java -jar epubcheck.jar book.epub 2> epubcheck-errors.txt

# Check accessibility
ace book.epub -o ace-report

# Quick error summary
grep "ERROR" epubcheck-errors.txt | wc -l
grep "violation" ace-report/report.json | wc -l
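A plain line count tells you how many problems there are but not which kinds. If you want a breakdown by message code, a few lines of Python can summarize the saved EPUBCheck output. This sketch assumes messages begin with `SEVERITY(CODE):`, as in `ERROR(RSC-005): ...`. The sample lines are made up for illustration, and `summarize` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumed message shape: SEVERITY(CODE): location: message
MESSAGE = re.compile(r"^(FATAL|ERROR|WARNING|USAGE|INFO)\(([A-Z]+-\d+[a-z]?)\):")


def summarize(report_text: str) -> tuple[Counter, Counter]:
    """Count EPUBCheck messages by severity and by message code."""
    by_severity, by_code = Counter(), Counter()
    for line in report_text.splitlines():
        match = MESSAGE.match(line.strip())
        if match:
            by_severity[match.group(1)] += 1
            by_code[match.group(2)] += 1
    return by_severity, by_code


# Made-up sample in the assumed format; real output comes from epubcheck-errors.txt.
SAMPLE = """Validating using EPUB version 3.3 rules.
ERROR(RSC-005): book.epub/OEBPS/ch01.xhtml(12,5): sample parsing error
ERROR(RSC-007): book.epub/OEBPS/ch02.xhtml(40,18): sample missing resource
WARNING(OPF-003): book.epub: sample undeclared item
ERROR(RSC-005): book.epub/OEBPS/ch03.xhtml(7,9): sample parsing error
"""

severities, codes = summarize(SAMPLE)
print(dict(severities))
print(codes.most_common(3))
```

For a real run, read the file written by the EPUBCheck command above, for example `summarize(open("epubcheck-errors.txt", encoding="utf-8").read())`.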

CI/CD Integration

Authors using version control for their books (yes, some do) can add EPUBCheck and Ace to their build pipeline. Both tools exit with non-zero status codes on errors, making them compatible with CI/CD systems.
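A pipeline gate built on exit codes can be as small as the sketch below. It runs each check as a subprocess and fails the build if any of them returns a non-zero status. The demo uses stand-in commands so it runs without Java or Node installed; in a real pipeline you would substitute the EPUBCheck and Ace invocations from the command-line workflow. `run_checks` and `gate` are hypothetical names.

```python
import subprocess
import sys


def run_checks(commands: dict[str, list[str]]) -> dict[str, int]:
    """Run each command and return its exit status, keyed by check name."""
    results = {}
    for name, argv in commands.items():
        try:
            completed = subprocess.run(argv, capture_output=True, text=True)
            results[name] = completed.returncode
        except FileNotFoundError:
            results[name] = 127   # tool is not installed or not on PATH
    return results


def gate(results: dict[str, int]) -> int:
    """Return 0 if every check passed, 1 otherwise, reporting failures on stderr."""
    failed = [name for name, code in results.items() if code != 0]
    for name in failed:
        print(f"{name} failed with exit status {results[name]}", file=sys.stderr)
    return 1 if failed else 0


# Stand-in commands that simply exit with a chosen status.
demo_commands = {
    "passing-check": [sys.executable, "-c", "raise SystemExit(0)"],
    "failing-check": [sys.executable, "-c", "raise SystemExit(1)"],
}
status = gate(run_checks(demo_commands))
print("pipeline status:", status)
```

In a CI job, you would finish with `sys.exit(status)` so the job itself fails when a check does.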

This level of automation is overkill for most indie authors, but publishing companies and serial publishers producing 20+ titles per year find it valuable.

Frequently Asked Questions

How often should I validate my ebooks?

At minimum, validate every ebook before its first upload and after any significant content changes. For backlist titles, run a Rahatt scan annually to check for suppression risk: Amazon's standards evolve, and a book that was fine in 2024 may need updates. The scan takes under a minute per book and costs nothing.

My EPUB passes EPUBCheck but fails DAISY Ace. Is that normal?

Yes, this is very common. EPUBCheck validates format compliance (is this a valid EPUB?), while Ace validates accessibility compliance (is this an accessible EPUB?). A structurally perfect EPUB can still have missing alt text, broken heading hierarchy, and insufficient contrast. These are different concerns addressed by different tools. You need both to pass.

Can I skip validation if I used a professional formatting tool like Vellum?

No. Vellum produces well-structured EPUBs, but it does not add alt text to images, does not include complete accessibility metadata, and may produce heading hierarchies that depend on your input structure. A Rahatt scan of a Vellum-formatted book with images typically returns a Medium Risk score (20-40) due to missing alt text and incomplete accessibility metadata. Even Vellum output benefits from the validation and auto-fix workflow.

What if I cannot fix all the issues before my deadline?

Prioritize by commercial impact. Fix navigation issues first (worth up to 40 risk points), then accessibility metadata (20 points), then alt text (10 per image, up to 40). These three categories together make up the full 100 points of the Rahatt risk score. Heading hierarchy and contrast issues affect Ace compliance but have a smaller direct impact on Amazon ranking. If time is extremely tight, a single Rahatt auto-fix pass addresses metadata, headings, and contrast in under a minute.

Are there any validation tools for Amazon's KF8 format specifically?

Amazon's free Kindle Previewer 3 is the closest equivalent. It converts your EPUB to KF8 internally and flags quality issues during conversion. However, it does not check accessibility as thoroughly as Ace or Rahatt. The recommended approach is: validate the EPUB thoroughly (EPUBCheck + Ace + Rahatt), then preview in Kindle Previewer as a final layout check before uploading to KDP. For more on the complete formatting and validation process, see our ebook formatting guide and our formatting tools comparison.
