Export whole site for NotebookLM AI

Hello,

I would like to export the whole site to test NotebookLM's AI abilities. Only .txt, .pdf, and .md content is supported, and the limit is fewer than 50 files.

What would be a smart way to compile ~300 pages (with many shortcodes) into a compatible format?

Thank you very much for your help and for sharing ideas.

Create a custom output format that runs on the home page only and exports EVERYTHING into one single file.

I have some example code that you might be able to adapt to your requirements:

The main thing is to NOT load paged content, but to loop over EVERYTHING in your template (range site.Pages) and then pass each page through whatever renders your content, so the shortcodes are resolved. Add the title and content, and maybe some data fields (dates, keywords). Go from there.

Now, for your case, what you have not told us is the exact format you need. txt, pdf, and md can be anything. Can it have links? Then do a Markdown export with proper formatting. Can it have text only? Then plain text, obviously. PDF is overkill, but I assume (it's a black box for me) they don’t care about formatting and only want text and links.

Edit: my sample code creates a JSON file with all URLs in the resulting website. For testing. It’s minimal.

Thank you very much.
I'm going to dig into it.