Performance loading (lots) of data


#1

I have a bunch (20K+) of comments stored as YAML in the data directory.
Just loading them takes 10s out of the roughly 12s the overall build requires. That feels a bit disproportionate.
Any suggestions? Is TOML parsing more efficient?


#2

What are you comparing it to? That doesn’t seem strange to me. :slight_smile:


#3

Compared to JSON it is. Not so sure about TOML. I'm not sure those numbers look all that bad, though. We have had some reported issues (especially with JSON) where /data consumes lots of memory for not-so-big data sets.

That said, the "disproportionate" feeling you get may come from the fact that the data loading is serialized (1 thread/Go routine), while the rest of the build is, shall we say, not serialized. Assuming you have these comments spread across multiple files, we could do something about that, of course, but your use case isn't typical.


#4

That indeed explains it.
Once I learn Go (one of my 2018 goals) I might decide to look into it.
And I'm not sure it's that atypical: it works out to an average of 20 comments per post, which is normal for niche, seasoned (>10 years old) blogs. :grin:
Overall, though, the running time on my host machine is not bad. It went from 2+ seconds to 5+, which is still nothing compared to what Netlify takes to post-process.