Load data from YAML/XML/JSON files?

matrixise · September 23, 2014, 6:26pm

Hi all,

I would like to know whether we can use external files as a data source, since I don’t think we can use a database to generate some pages.

Thank you

spf13 · September 23, 2014, 6:46pm

Currently you can add data to the sitewide config file. I’m exploring extending this to any data file in /data/. I believe there is already a ticket about this.
It shouldn’t be very hard to do if someone wants to pick it up and contribute it. All the logic is already there; it just needs to be extended to read from more than one file.

matrixise · September 23, 2014, 7:08pm

Could you explain? I don’t know the structure or the code of Hugo, but why not. How do you see the implementation?

spf13 · September 23, 2014, 7:36pm

Sure… Here is the ticket: https://github.com/spf13/hugo/issues/476

The implementation is straightforward. Any file can be placed into /data/ … say /data/authors.yaml

This is then accessible under Site.Data.authors

Data files can also be placed in sub-folders of the data folder. Each folder level will be added to a variable’s namespace.

I think Jekyll’s implementation gets this right. They document it at http://jekyllrb.com/docs/datafiles/

I would make two changes. 1. Only support the formats we already support (JSON, YAML, TOML). 2. Call the folder data/ and also expose it as a flag: --datadir=/a/different/path.
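
As a rough illustration of the namespacing idea above (each folder level under data/ becomes one level of the variable’s namespace), here is a minimal Go sketch. It is not Hugo’s implementation: the insert helper, the in-memory file map and the sample file names are hypothetical, and it only decodes JSON because YAML and TOML need third-party packages.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path/filepath"
	"strings"
)

// insert stores value in tree under the namespace derived from relPath,
// a path relative to the data folder (for example "team/editors.json").
// Each folder level becomes one nested map; the file extension is dropped.
// This helper is hypothetical and only illustrates the idea.
func insert(tree map[string]interface{}, relPath string, value interface{}) error {
	trimmed := strings.TrimSuffix(relPath, filepath.Ext(relPath))
	keys := strings.Split(filepath.ToSlash(trimmed), "/")
	node := tree
	for _, k := range keys[:len(keys)-1] {
		child, ok := node[k]
		if !ok {
			next := map[string]interface{}{}
			node[k] = next
			node = next
			continue
		}
		next, ok := child.(map[string]interface{})
		if !ok {
			return fmt.Errorf("namespace conflict at %q", k)
		}
		node = next
	}
	node[keys[len(keys)-1]] = value
	return nil
}

func main() {
	// Assumed sample contents, keyed by path relative to data/.
	files := map[string]string{
		"authors.json":      `[{"name": "alice"}]`,
		"team/editors.json": `{"lead": "bob"}`,
	}
	data := map[string]interface{}{}
	for path, raw := range files {
		var v interface{}
		if err := json.Unmarshal([]byte(raw), &v); err != nil {
			panic(err)
		}
		if err := insert(data, path, v); err != nil {
			panic(err)
		}
	}
	// The equivalent of Site.Data.team.editors.lead in a template.
	team := data["team"].(map[string]interface{})
	editors := team["editors"].(map[string]interface{})
	fmt.Println(editors["lead"]) // prints "bob"
}
```

A real loader would walk the data directory on disk and pick a decoder by file extension; the nested map is simply one convenient shape for dotted lookups.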

spf13 · September 23, 2014, 7:42pm

Flags are added here

github.com

gohugoio/hugo/blob/master/commands/hugo.go

The Params are extracted from the config file here.

github.com

gohugoio/hugo/blob/master/hugolib/site.go

spf13 · September 23, 2014, 7:44pm

Ohh… One thing.

In Hugo we already call these Params. Jekyll calls them data.

I’m not sure whether we should just inject this into Params or create a new accessor called .Data … I’m leaning towards the .Data approach, but I’m not sure it makes sense.

What do others think?

matrixise · September 23, 2014, 7:50pm

Great. Another point: how would you generate a page/archetype for each item of a collection, and how would you use it?

If you have ./data/authors.yml, do we need to add ./layouts/_default/authors.html?

spf13 · September 23, 2014, 8:14pm

There currently isn’t a way to add an HTML page without a content file (or metadata in a content file) dictating its creation.

I’ve been thinking about how best to do this.

My current thinking is that you would actually add a go.html content file where you want the file to be created…

For example if I wanted to create http://domain.com/authors I would create

/content/authors.go.html

This would be a node and would have all of the site data available to it, but no page specific data.

Of course today you can do this, but it requires a bit more… Since all pages require a content page you would need to create a content file and a template file.

One way to do this would be to create a markdown file wherever you want it. Set the url and layout in the metadata. Then make sure you create the layout file you specified. It will use this to render a file in the url location you specify.

Another way, more in keeping with Hugo’s normal way of operating, would be to create /content/authors/index.md (really any name works if it’s in that directory). Then I would create a section template for authors that reads from the data source and lists it.

The only disadvantage of this approach is that the content file will appear in the RSS feeds and Pages listings.

Another idea I had, which would work well, is to add a hidden flag to content. Hidden content would be left out of every listing but would still be rendered.

I think I may like this approach the best. It gives a pretty straightforward way of adding literally any page anywhere without much work.
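
To make the proposed hidden flag concrete, here is a minimal Go sketch of the behaviour described above: every page is rendered, but hidden pages are left out of listings. The page type, its Hidden field and the listed helper are all hypothetical and do not correspond to existing Hugo code.

```go
package main

import "fmt"

// page is a hypothetical, minimal stand-in for a content page.
type page struct {
	URL    string
	Hidden bool // the proposed flag, set in the content file's metadata
}

// listed returns the pages that should appear in listings and feeds,
// that is, every page whose Hidden flag is not set.
func listed(pages []page) []page {
	var out []page
	for _, p := range pages {
		if !p.Hidden {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pages := []page{
		{URL: "/post/hello/"},
		{URL: "/authors/", Hidden: true},
	}
	// Rendering ignores the flag: every page produces output.
	for _, p := range pages {
		fmt.Println("render", p.URL)
	}
	// Listings (RSS, section pages) only see non-hidden pages.
	for _, p := range listed(pages) {
		fmt.Println("list", p.URL)
	}
}
```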

natefinch · September 25, 2014, 1:08pm

+1 on calling this stuff data, separate from params. It’s nice to have different names for things that live in different places, so you can differentiate. Note that there are already some variables called “Data” in Hugo, so we should be careful about confusion (maybe rename the old variables).

NkS · January 20, 2015, 11:53am

+1 for loading data from yaml/xml/json

anthonyfok · February 2, 2015, 10:35am

See also https://github.com/spf13/hugo/pull/748 (Feature: GetJson and GetCsv in short codes or other layout files) by @SchumacherFM

bep · February 2, 2015, 10:51am

And

SchumacherFM · February 5, 2015, 12:10am

I have nearly finished a solution which transparently integrates into the current process of reading files from the HDD.

My solution downloads a JSON stream (or a stream in some other format) such as http://cyrillschumacher.com/sourceStream.json and adds those “virtual files” to the files slice. Even the watcher is fully supported, although the build process re-downloads the JSON only when a real local file changes.

The JSON stream looks like:

{"Path":"path/to/content1.md","Content":"the usual hugo content goes here"}
{"Path":"path/to/content2.md","Content":"the usual hugo content goes here"}
{"Path":"path/to/content3.md","Content":"the usual hugo content goes here"}

The objects in a JSON stream are separated by line breaks, so the Content value must not contain any line breaks.

Why a stream of JSON?

Because my main goal is to create ~100k pages generated from a Magento store. This Magento store produces a JSON stream of product and category pages. Stream parsing is much more efficient than parsing a whole JSON blob at once.

This feature comes in addition to PR https://github.com/spf13/hugo/pull/748, which means the content in your JSON stream can also use the feature from PR 748 to download other JSON.

I will send a PR when PR 748 is merged.

JamesMcMahon · April 2, 2015, 3:18pm

https://github.com/spf13/hugo/pull/748 is now merged (even though GitHub marks it as closed).

Any updates to this thread? This sounds like a really good idea.

bep · April 2, 2015, 5:52pm

It’s closed as in added/implemented.

This and the new Data Files support should be in the doc.
