Is It Possible to Build a Static Business Directory with 5 Million Pages and a Huge Taxonomy in Hugo?

Website Plan

I am embarking on an ambitious project to rebuild a vast business directory website in Hugo. The site functions as a lookup service where each business record is its own page. The directory is structured as follows: States → Cities → Business Specialties → Businesses in that niche in that city. I aim to create a website that displays links to all the states plus a list of the top 50 businesses by rating. When someone clicks on a state, a new page opens with a list of the cities in that state and the top 50 businesses in that state. Each state page will also have a specialty listing leading to the top 50 businesses in that niche in that state.
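(For reference, here is roughly how I imagine building such a top-50 list in a Hugo list template, assuming a "rating" front matter field and a "businesses" content section, neither of which Hugo prescribes:)

    {{/* illustrative: top 50 businesses by a hypothetical rating field */}}
    {{ $businesses := where site.RegularPages "Section" "businesses" }}
    {{ range first 50 (sort $businesses "Params.rating" "desc") }}
      <a href="{{ .RelPermalink }}">{{ .Title }}</a>
    {{ end }}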

The ideal website structure will look something like this: State (with top 50 list) → Cities → Business Specialty → List of businesses with that specialty in that specific state and city.
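In Hugo terms, I imagine that hierarchy mapping onto taxonomies, with each business page declaring its state, city, and specialty as terms. The taxonomy names and front matter below are my own sketch, not something Hugo prescribes. First the site config (TOML):

    [taxonomies]
      state = "states"
      city = "cities"
      specialty = "specialties"

and then each business page's front matter (TOML):

    +++
    title = "Acme Plumbing"
    states = ["texas"]
    cities = ["austin"]
    specialties = ["drain-cleaning"]
    rating = 4.8
    +++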

My Main Concern

My main concern is the sheer scale of the project, as it involves five million businesses. Will Hugo be able to generate this website with all its taxonomies and hierarchical relationships? Can Hugo scale to handle this project? Time is not a problem: I’m willing to wait for the complete site generation on my local machine, even if it takes 10-12 days (I use a normal 16 GB PC with 4 Intel cores).

In conclusion, I want to know whether it is possible to generate a big static directory website with Hugo. Can Hugo handle this project, or is it better suited to smaller projects of only 10-20 thousand pages?

Note

I already have this website up and running in WordPress. It is essentially a static site, but it consumes too many resources because every request hits the database.

Take a look at V&A Explore The Collections: over 1 million pages generated by Hugo.


Hugo can run this, but the build time might be several minutes to several hours (I have a site with around 50k pages, no images, that builds in 3-10 minutes). There are several tips to reduce memory use and build times, for example:

  1. Cache partials that don’t change with every build (see the partialCached sketch after this list).
  2. Cache remote files and generated assets so they are not rebuilt on every run (see the config sketch after this list).
  3. Avoid pagination on a site this large: it generates millions of extra pages on top of the ones you already have, which can significantly increase your build time.
  4. Run hugo --templateMetrics --templateMetricsHints to find the templates slowing down your build and get hints on how to improve them.
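For tip 1, Hugo's partialCached function renders a partial once and reuses the result. The partial names and the variant key below are only illustrations:

    {{/* render the header once per build instead of once per page */}}
    {{ partialCached "site-header.html" . }}

    {{/* cache one render per distinct section value */}}
    {{ partialCached "section-sidebar.html" . .Section }}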
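For tip 2, Hugo's file caches can be configured in config.toml; maxAge = "-1" keeps cached entries indefinitely. The cache names (getjson, images) are real Hugo cache buckets, but the settings are only a sketch:

    [caches]
      [caches.getjson]
        dir = ":cacheDir/:project"
        maxAge = "-1"
      [caches.images]
        dir = ":resourceDir/_gen"
        maxAge = "-1"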

Due to memory consumption, I cannot build a simple[1] site with 1M pages on a 12-core machine with 16 GB of memory.

Note that there’s plenty of CPU, but memory and swap are maxed out. Ultimately the machine froze.
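(If you want to reproduce this and watch memory behaviour during a test build, Hugo can report its own memory use; both flags below exist in current Hugo, though how much --gc reclaims will vary:)

    # print memory usage at intervals during the build, then
    # clean up unused cache files afterwards
    hugo --printMemoryUsage --gc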

I would stick with a dynamic CMS backed by memcached and fronted by Varnish.
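For what that stack can look like in practice, here is a minimal, illustrative Varnish config that caches rendered pages in front of the CMS; the backend host, port, and TTL are assumptions, not a recommendation:

    # default.vcl: illustrative cache-in-front-of-the-CMS setup
    vcl 4.1;

    backend cms {
        .host = "127.0.0.1";
        .port = "8080";
    }

    sub vcl_backend_response {
        # directory pages change rarely, so cache them for a day
        set beresp.ttl = 24h;
    }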


  1. No taxonomies, no pagination, no partials.
