Best Way to Handle Filters in Hugo (Product Listing Website)

I am building a product marketplace using Hugo. Users can filter products by:

  • Location (always selected, comes from session)
  • Category and Subcategory (subcategory only works if category is selected)
  • Price Range (4 fixed options)
  • Other filters like: product age, usage, condition, availability, and color

For these dynamic filters (age, usage, condition, etc.), I generate a JSON file during build time and store it in the public folder. These filters are applied on the frontend using URL query parameters.

What I’m Doing Right Now:

I am pre-generating pages for many filter combinations during build time. For example:

  • /location/mumbai/
  • /location/mumbai/electronics/
  • /location/mumbai/electronics/phones/
  • /location/mumbai/electronics/upto-5k/

Each page has:

  • an _index.md file
  • a JSON file with filter data

Dynamic filters (color, condition, etc.) are handled on the frontend using URL query parameters and JSON.

My Questions:

  1. Is it a good approach to generate so many pre-built _index.md files for all filter combinations?

  2. I checked that JSON is not very heavy for the system. So instead of generating many markdown files, is it better to use only JSON and handle everything on the frontend?

  3. If I use only JSON + query parameters, how should I handle SEO for filtered pages?
    Do I need canonical URLs or meta tags?

  4. Is there any better way to make this system faster, more reliable, and scalable for handling large data in the future?


Spraying out a page for every possible intersection of a set of parameters is probably not a good idea. Most of those autogenerated pages will be too specific to be useful, and they will clutter the sitemap.

Hugo’s taxonomy system is probably a better fit for this. I’m assuming your data is modeled as a flat table of products; start by creating Site taxonomies for each parameter you want to filter by:

[taxonomies]
location = "locations"
category = "categories" #'phones', 'laptops' etc
(...)

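Then assign terms to each product in its front matter. Note that the front matter key is the plural taxonomy name (the right-hand side in the config above):

---
title: Samsung Galaxy S99-whatever
locations: [mumbai]
categories: [phones]
(...)
---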

Now that Hugo has this data, you can pare the list down to the products that share two or more taxonomy terms. AFAIK Hugo doesn’t provide a method specifically for taxonomies, but collections.Intersect probably works.
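
For a fixed pair of terms, a minimal untested sketch might look like the following. It assumes the `locations` and `categories` taxonomies from the config above, and that `mumbai` and `phones` exist as terms with products assigned to them; the link markup is only a placeholder.

```go-html-template
{{/* Pages assigned to each term; .Pages unwraps the weighted page list. */}}
{{ $inMumbai := (index site.Taxonomies.locations "mumbai").Pages }}
{{ $phones := (index site.Taxonomies.categories "phones").Pages }}

{{/* Keep only the products that appear in both collections. */}}
{{ range intersect $inMumbai $phones }}
  <a href="{{ .RelPermalink }}">{{ .LinkTitle }}</a>
{{ end }}
```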

Create a special page type that displays a filtered view of these products, call it ‘productfilter’ or whatever you want (disclaimer: I have not tested this):

{{/* Each filterBy entry is a [taxonomy, term] pair. */}}
{{ $pages := slice }}
{{ range $i, $pair := .Params.filterBy }}
  {{/* .Site.Taxonomies is keyed by the plural taxonomy name. */}}
  {{ $term := index (index $.Site.Taxonomies (index $pair 0)) (index $pair 1) }}
  {{ if eq $i 0 }}
    {{/* Start from the pages of the first term. */}}
    {{ $pages = $term.Pages }}
  {{ else }}
    {{/* Narrow the result down with each further term. */}}
    {{ $pages = $pages | intersect $term.Pages }}
  {{ end }}
{{ end }}

{{ range $pages }}
  (...)
{{ end }}

These are regular pages, so they can have a canonical, SERP-friendly path. Instead of /location/mumbai/electronics/phones/ you can name it something saner like /phones-mumbai/ and list the terms to filter by in its front matter:

---
title: (...)

filterBy:
  - [ locations, mumbai ]
  - [ categories, phones ]
---
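
On the canonical URL question: a minimal sketch of the tag, assuming it goes in the `<head>` of a base template such as layouts/_default/baseof.html (the file location is an assumption about your layout setup). `.Permalink` is the absolute URL of the built page and carries no query string, so any client-side variants like ?color=... would all point back to the one pre-built page.

```go-html-template
{{/* Inside <head>; file location is an assumption about your layouts. */}}
<link rel="canonical" href="{{ .Permalink }}">
```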

Note that Hugo will render a full page for every unique term you define, regardless of whether it has an index file. I don’t think there’s any way to disable this globally. If you want to prevent it, you’ll have to create an _index.md for each term (e.g. content/locations/mumbai/_index.md) and put this in its front matter:

build:
  list: always
  render: never

sitemap: { disable: true }

Is there any better way to make this system faster, more reliable, and scalable for handling large data in the future?

Depends on what you consider large. I manage an ecommerce site with ~1200 products that builds on 15-year-old hardware, and it’s never been a problem. If you fetch data from outside services (content adapters), the APIs will probably bottleneck you long before Hugo does.