Hey Hugo community!
I wanted to share what we’ve built at SermonIndex.net: a Bible study, sermon, and Christian encyclopedia platform now serving over 1.08 million static pages generated with Hugo. We believe this is one of the largest Hugo sites in production today, so this post covers how we pulled it off and the Hugo-specific patterns that made it possible.
About SermonIndex
SermonIndex was founded in 2002 with the mission to preserve classical Biblical preaching and promote revival. In more than 20 years it has grown to reach practically every country in the world, distributing over 100 million sermon resources from more than 2,300 speakers, including A.W. Tozer, Leonard Ravenhill, C.H. Spurgeon, David Wilkerson, and many more. The entire library is free.
In 2025 we began a ground-up rebuild of the platform on Hugo, and the scope grew into something far beyond a typical static site.
Live site: https://www.sermonindex.net
What 1.08 Million Pages Looks Like
Here’s the full breakdown of every content section:
| Section | Pages |
|---|---|
| Bible (1,265 translations) | 573,829 |
| Books (classic Christian library) | 186,875 |
| Commentary (173 commentaries) | 88,539 |
| Encyclopedia | 75,843 |
| Speakers (2,315) + Sermons (59,886) | 57,233 |
| Parallel Bible | 31,086 |
| Devotionals (44 collections) | 15,839 |
| Strong’s Concordance | 14,200 |
| Topics (14,090) | 14,091 |
| Hymns (2,020 authors, 10,461 hymns, 550 categories) | 13,058 |
| Bible Topics (4,807) | 9,666 |
| Interlinear Bible (66 books) | 1,256 |
| Total | ~1,080,268 |
The Bible section alone — with 1,265 translations spanning dozens of languages — accounts for over half the pages. Each chapter page includes study notes, key verse analysis, cross-references, keyword highlights, audio playback, and auto-linked scripture references.
Infrastructure
- Hugo for static site generation
- Bunny CDN for hosting (storage zone based)
- DigitalOcean App Platform running a NestJS API for dynamic features (Bible API, search)
- Cloudflare DNS in front
- PostgreSQL on DigitalOcean for the Bible verse database (42 translations, 885K verses, 33,741 chapters)
- Mac Minis as build machines
How We Built It with Hugo
The Core Problem: You Can’t Build 1M Pages in One Shot
Early on we realized that a single Hugo build of 1 million+ pages wasn’t practical. So we developed a content generation + sectional build approach. Each content section (Bible, Commentary, Books, Hymns, etc.) has its own generation pipeline:
- Source data lives in generated_* folders: structured JSON produced by Python scripts from various sources (database exports, audio transcriptions via Whisper AI, commentary archives, linguistic databases, etc.)
- Python build scripts transform JSON into Hugo-ready Markdown files with YAML frontmatter, placed into content-dev/, which is the source of truth for builds.
- Hugo builds are run per-section or in targeted batches, with output uploaded to Bunny CDN.
This means we can rebuild just the commentary section, or just the hymns, without touching the other 900K+ pages.
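As a rough illustration of the JSON-to-Markdown step (not the actual SermonIndex scripts, which aren't shown here), a generator could look something like the sketch below. The `body` field name, the one-record-per-file layout, and the function names are all assumptions made for the example. Values are serialized with `json.dumps`, relying on JSON being accepted as YAML flow syntax in the common cases.

```python
import json
from pathlib import Path


def render_page(front_matter: dict, body: str) -> str:
    """Return a Markdown document with a YAML front matter block.

    Each value is written with json.dumps, so strings come out
    double-quoted and lists/maps use YAML's JSON-like flow syntax.
    Keys are assumed to be plain identifiers such as "title".
    """
    lines = ["---"]
    for key, value in front_matter.items():
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body.strip() + "\n"


def build_section(source_dir: Path, content_root: Path, section: str) -> int:
    """Convert every JSON record in source_dir into one Markdown page."""
    written = 0
    for json_path in sorted(source_dir.glob("*.json")):
        record = json.loads(json_path.read_text(encoding="utf-8"))
        body = record.pop("body", "")  # hypothetical field name
        out_path = content_root / section / f"{json_path.stem}.md"
        out_path.parent.mkdir(parents=True, exist_ok=True)
        out_path.write_text(render_page(record, body), encoding="utf-8")
        written += 1
    return written
```

Calling `build_section(Path("generated_hymns"), Path("content-dev"), "hymns")` would then regenerate only the hymn pages, which is the property that makes per-section Hugo builds possible.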
Hugo Patterns That Made This Possible
1. Frontmatter-heavy content architecture
Our Bible chapter pages are a good example. Rather than cramming everything into the Markdown body, the bulk of the structured data lives in YAML frontmatter — study themes, key verses, Christ-centered notes, preaching outlines, keyword highlights, cross-references, FAQ sections, and more. The Hugo template then renders all of this into a rich study page with sidebar panels, highlight toggles, and interactive features.
This keeps templates in full control of layout and lets us regenerate or revise the content data without touching templates.
2. Auto-linking scripture references with a Hugo partial
We built a linkscripture.html partial that auto-links every Bible reference found in any text on the site. It uses a placeholder strategy to avoid re-matching:
- Pass 1: Match full references like “1 Corinthians 15:22” → replace with placeholder XSCRIPTREF0001
- Pass 2: Match book + chapter like “Genesis 3” → placeholder
- Pass 3: Match short verse refs like “v.3” → placeholder
- Final: Swap all placeholders with <a class="green-link"> links
This partial is used everywhere — study notes, commentary text, devotionals, topic descriptions — giving the entire site consistent cross-referencing without any manual linking.
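The real partial is a Hugo template, but the placeholder idea is easy to see in a few lines of Python. The sketch below is only an illustration under assumptions: the three-book list, the `/bible/...` URL scheme, and the function names are made up, and pass 3 (short refs like “v.3”) is left out because it needs the surrounding chapter context.

```python
import re

BOOKS = ["Genesis", "John", "1 Corinthians"]  # tiny illustrative subset
# Longest names first so "1 Corinthians" wins over any shorter overlap.
BOOK_RE = "|".join(re.escape(b) for b in sorted(BOOKS, key=len, reverse=True))

FULL_REF = re.compile(rf"\b({BOOK_RE}) (\d+):(\d+)\b")  # pass 1
CHAPTER_REF = re.compile(rf"\b({BOOK_RE}) (\d+)\b")      # pass 2
PLACEHOLDER = re.compile(r"XSCRIPTREF(\d{4})")


def slug(book: str) -> str:
    return book.lower().replace(" ", "-")


def link_scripture(text: str) -> str:
    links: list[str] = []

    def stash(html: str) -> str:
        # Store the finished link and return an opaque token in its place.
        links.append(html)
        return f"XSCRIPTREF{len(links):04d}"

    def full(m: re.Match) -> str:
        book, chapter, verse = m.groups()
        href = f"/bible/{slug(book)}/{chapter}/#v{verse}"  # hypothetical URL scheme
        return stash(f'<a class="green-link" href="{href}">{m.group(0)}</a>')

    def chapter_only(m: re.Match) -> str:
        book, chapter = m.groups()
        href = f"/bible/{slug(book)}/{chapter}/"  # hypothetical URL scheme
        return stash(f'<a class="green-link" href="{href}">{m.group(0)}</a>')

    text = FULL_REF.sub(full, text)             # pass 1: full references
    text = CHAPTER_REF.sub(chapter_only, text)  # pass 2: book + chapter
    # Final pass: swap each token back for its stored link.
    return PLACEHOLDER.sub(lambda m: links[int(m.group(1)) - 1], text)
```

Without the placeholders, pass 2 would find “1 Corinthians 15” inside the link that pass 1 had just produced and wrap it a second time. Hiding finished links behind tokens until the end avoids that.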
3. The data/ directory for large lookup tables
For sections that need shared reference data across templates (book code lookups, translation metadata, speaker indexes), Hugo’s data/ directory is invaluable. We load JSON files and use index in templates to pull what we need without creating content pages for reference data.
4. renderSegments for development
With over a million pages worth of content sitting in content-dev/, iterating on a single template would be painful without this. Hugo’s --renderSegments flag lets us target just one section:
```bash
hugo server --renderSegments commentary
```
This was essential for the development cycle. We could tweak the commentary chapter template and see results in seconds rather than waiting for a million-page build.
5. Section-specific SEO in baseof.html
With 12+ content sections, each needing different title formats and schema markup, we built section-aware logic in baseof.html:
- Bible chapters: Genesis 1 (BSB) | SermonIndex
- Commentary: Zechariah 5 — Matthew Henry's Commentary | SermonIndex
- Speakers: F.J. Huegel — Sermons | SermonIndex
- Sermons: Title by Speaker Name | SermonIndex
- Topics: 83 sermons on Abiding in Christ | SermonIndex
- Hymn categories: 111 Hymns on Praise & Thanksgiving | SermonIndex
Each section also gets appropriate JSON-LD schema — Person for speakers, Article for Bible/commentary, BreadcrumbList everywhere, and WebSite+Organization on the homepage.
6. CDN-editable config without rebuilds
Some things shouldn’t require a Hugo rebuild to change. Our cookie consent and donate popups are controlled by a popups.json file that lives on the CDN. Edit the JSON, the behavior changes instantly — no build, no deploy. The JS reads the config at runtime. We also use head-inject.js and body-inject.js files for any post-deploy code injection needs.
The Content Pipelines
Each section has its own generation story:
- Bible (573K pages): 1,265 translations pulled from a PostgreSQL database, each chapter getting its own Markdown file with study notes generated via AI. We had to fix book code mismatches across ALL translations (like NAM→NAH, SNG→SOS), which meant a global find-and-replace across hundreds of thousands of files.
- Books (186K pages): 3,000+ classic Christian books (Pilgrim’s Progress, Pursuit of God, etc.) broken into chapter-level pages for reading online.
- Commentary (88K pages): 173 Bible commentaries from authors like Matthew Henry, Darby, Whedon, Spurgeon, and many more. Each chapter gets a rich page with verse-level commentary entries, author bio sidebar, and quick-jump navigation across Old and New Testament books.
- Speakers + Sermons (57K pages): 2,315 speaker profiles with 59,886 sermons. We imported 4,994 video sermons from CSV files with transcriptions, resolved 308 title conflicts, and merged 10 speaker name misspellings. Each speaker’s _index.md has a verified sermon_count matching the actual content.
- Hymns (13K pages): 10,461 hymns across 2,020 authors and 550 categories, with category pages like “111 Hymns on Praise & Thanksgiving.”
- Strong’s Concordance (14K pages): Hebrew and Greek word entries with definitions and cross-references.
- Interlinear Bible (1,256 pages): Word-by-word original language analysis for 66 books.
Challenges and What We Learned
Book code consistency is brutal at scale. When you have 1,265 Bible translations, a single book code mismatch (NAM vs NAH for Nahum) breaks cross-references across tens of thousands of pages. We wrote scripts to audit and fix these across both the generated JSON and the content-dev Markdown — touching hundreds of thousands of files.
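An audit-and-fix pass of this kind might be sketched as follows. This is not the actual script: the mapping contains only the two codes mentioned in this post, and the function names and the dry-run default are assumptions. A real pass would probably restrict the replacement to known fields or URL segments, since a bare whole-word match could also hit ordinary text.

```python
import re
from pathlib import Path

# Nonstandard code -> canonical code (only the two examples from this post).
CODE_FIXES = {"NAM": "NAH", "SNG": "SOS"}
CODE_RE = re.compile(r"\b(" + "|".join(CODE_FIXES) + r")\b")


def fix_text(text: str) -> tuple[str, int]:
    """Return the corrected text and the number of replacements made."""
    return CODE_RE.subn(lambda m: CODE_FIXES[m.group(1)], text)


def fix_tree(root: Path, patterns=("*.md", "*.json"), dry_run=True) -> dict[Path, int]:
    """Scan root recursively and report (or apply) book code fixes per file."""
    changed: dict[Path, int] = {}
    for pattern in patterns:
        for path in root.rglob(pattern):
            original = path.read_text(encoding="utf-8")
            fixed, count = fix_text(original)
            if count:
                changed[path] = count
                if not dry_run:
                    path.write_text(fixed, encoding="utf-8")
    return changed
```

Running it first with `dry_run=True` gives a per-file count to review before anything is rewritten, which matters when the change touches hundreds of thousands of files.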
Frontmatter escaping will haunt you. When content is generated by scripts processing messy source data, special characters in YAML frontmatter cause silent Hugo build failures. We built escape_html_tags() early and it saved thousands of broken pages.
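The body of escape_html_tags() isn't shown in this post, so the following is only a guess at the general shape of such a helper, under a different, hypothetical name. It assumes the value should reach the template as plain text: angle brackets and ampersands become entities, whitespace is collapsed to single spaces, and the result is emitted as a double-quoted scalar so colons and quotes can't confuse the YAML parser.

```python
import html
import json


def escape_frontmatter_value(value: str) -> str:
    """Return value as a double-quoted YAML scalar with HTML neutralized."""
    # Turn <, >, and & into entities so stray tags cannot leak into the page.
    cleaned = html.escape(value, quote=False)
    # Collapse newlines, tabs, and repeated spaces into single spaces.
    cleaned = " ".join(cleaned.split())
    # json.dumps adds the surrounding quotes and escapes " and \ characters.
    return json.dumps(cleaned, ensure_ascii=False)


if __name__ == "__main__":
    raw = 'Faith: "the <b>substance</b>" of things'
    print("title: " + escape_frontmatter_value(raw))
```

Whether entity-escaping is the right choice depends on how the template outputs the field. If the template is meant to render trusted HTML from frontmatter, only the quoting step would apply.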
Verify everything programmatically. With 57 Bible translations in the dropdown and 97+ commentaries listed in the header, we wrote verification scripts to ensure every entry in the UI actually has corresponding content in content-dev/. At this scale, manual checking is impossible.
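A check like this can be quite small. The sketch below assumes, hypothetically, that each UI entry corresponds to a directory slug under a section folder in content-dev/. The real layout and the source of the UI list may differ.

```python
from pathlib import Path


def missing_content(entries: list[str], section_root: Path) -> list[str]:
    """Return UI entries with no Markdown pages under section_root/<entry>/."""
    missing = []
    for slug in entries:
        entry_dir = section_root / slug
        # An entry counts as present only if at least one .md file exists.
        has_pages = entry_dir.is_dir() and any(entry_dir.rglob("*.md"))
        if not has_pages:
            missing.append(slug)
    return missing


if __name__ == "__main__":
    dropdown = ["bsb", "kjv"]  # hypothetical slugs taken from the UI config
    gaps = missing_content(dropdown, Path("content-dev") / "bible")
    if gaps:
        raise SystemExit(f"UI entries without content: {', '.join(gaps)}")
```

Exiting non-zero on any gap lets the check run as a gate before a build or upload.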
Thanks to Hugo
Hugo made this feasible. The build speed, Go templates, data/ directory, renderSegments, and the overall architecture gave us the foundation to build something at a scale that most people associate with database-driven platforms. The fact that we can serve over a million pages as pure static files — fast, secure, and cheap to host — is remarkable.
I’d also note that we saw the V&A Explore the Collections post on this forum when we were planning this build, and it gave us confidence that Hugo could handle what we were attempting.
Happy to answer any questions about the architecture, the content pipelines, or any of the Hugo patterns we used.
