Automate generation of Posts

  • #9911
    [anonymous]

    I have about 500 posts I need to add to my site (Catalogue).  Is there any way I can automate/script the generation of 500+ post pages from my existing files?

    I was using the Taste theme but I could try something else if it is helpful.

    Thanks in advance


    [anonymous]


    One obvious way comes to mind. If your 500 posts are / were in a WordPress website format, you can export your WordPress site as a .wxr file, then in Publii, under Tools & Plugins, use the WP Importer tool to import all 500 posts into Publii — using whatever Publii theme you’ve chosen. You may still need to do some additional work depending on which Publii theme you go for.

    If your site isn’t in WordPress format but in, say, Joomla format, you can export the Joomla site into a temporary WordPress site, then repeat the exercise above.

    If your site is not in WordPress, Joomla, or in any standard web content management system format, but rather as separate HTML web pages with images, etc., you might find you can set up a temporary WordPress site and use a WP plugin to import your existing HTML web pages into WordPress, then repeat the exercise in the first paragraph above.

    [anonymous]

    Right now I don’t have a site. I have a bunch of individual catalogue pages in Microsoft Word. The pages are all based on a standard template, so I was thinking of writing a script to reformat them into HTML or Markdown, but I got stuck on the problem of importing the pages into Publii as posts. All I can come up with is cutting and pasting them into the Markdown editor.

    [anonymous]


    Okay. Maybe you could import your Microsoft Word stuff into a temporary WordPress site. This link sort of outlines the idea: . Or there are probably other WordPress plugins that do the same. Once your Microsoft Word content is in WordPress, you can export the site as a WXR file that Publii can then read (import into Publii).

    [anonymous]

    Yes, I see your point.

    I’ll look at WordPress.


    [anonymous]

    I don’t know how much manual work you want to do, but I just converted my DokuWiki-based blog to Publii.  I used Python and Pandoc for the conversion.  I think Pandoc supports Word files, but I don’t know how well (or if) it supports images and tables in Word.  My Python/Jupyter Notebook is here:

    Since Publii uses SQLite, I could use Python’s SQLite support to write queries that insert each post into Publii’s “posts” table.

    I did a quick test since my Jupyter notebook was still running, and it seems like Pandoc could handle the format conversion part for you. You’d just need to get the info into Publii’s database.

    Here’s an example of the conversion: I made a basic Word DOCX file and converted it with Pandoc to HTML.
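
    For anyone wanting to script that step, a minimal sketch of calling Pandoc from Python might look like the following. It assumes the pandoc executable is installed and on your PATH; the function names are made up for illustration and are not taken from the notebook mentioned above.

```python
import shutil
import subprocess


def pandoc_command(docx_path, output_format="html"):
    """Build the Pandoc command line that converts one DOCX file to stdout."""
    return ["pandoc", "--from", "docx", "--to", output_format, str(docx_path)]


def convert_docx(docx_path, output_format="html"):
    """Run Pandoc on one file and return the converted text."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    result = subprocess.run(
        pandoc_command(docx_path, output_format),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError if Pandoc fails
    )
    return result.stdout
```

    Calling convert_docx() on each of the Word files in a loop would give you one HTML string per catalogue page. Passing "markdown" as output_format should give Markdown instead. How well images and tables survive is something you would need to check on your own files.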

    [anonymous]

    Ah, this sounds like something I might be able to work with.  Is there some documentation of the location and structure of the database?

    [anonymous]
    [anonymous] wrote:

    Ah, this sounds like something I might be able to work with. Is there some documentation of the location and structure of the database?

    Well, the source code I suppose, but I didn’t look at it too much.  The database is inside the Publii “sites” folder in the “input” subfolder and is called db.sqlite.  I used SQLite Browser and examined the “posts” table, which has a pretty simple schema.  There’s also the “posts_additional_data” table which is where all the options are for a post (like all the stuff in the little gear menu at the top right of Publii’s post editor).
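
    If you would rather not install SQLite Browser, the same inspection can probably be done from Python. This is only a sketch: describe_database is a made-up helper name, and it opens the file read-only so it cannot change the site. Point it at a copy of db.sqlite to be safe.

```python
import sqlite3
from pathlib import Path


def describe_database(db_path):
    """Return {table_name: [(column_name, declared_type), ...]} for a SQLite file."""
    # Open read-only through a file: URI so nothing can be modified or created.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        schema = {}
        for table in tables:
            # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
            columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [(col[1], col[2]) for col in columns]
        return schema
    finally:
        conn.close()
```

    Printing the returned dictionary lists every table with its column names and types, which is roughly what SQLite Browser shows in its structure view.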

    I think the simple schema is another good sign of Publii’s architecture.  I always worry when a big dynamic CMS uses like a single table that’s just a bunch of key-value pairs, because why even bother with a SQL database at that point?

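    “posts” table (the start of the statement is not shown):

        ...
        'title' TEXT,
        'authors' TEXT,
        'slug' TEXT,
        'text' TEXT,
        'featured_image_id' INTEGER,
        'created_at' DATETIME,
        'modified_at' DATETIME,
        'status' TEXT,
        'template' TEXT)

    “posts_additional_data” table (the start of the statement is not shown):

        ...
        'post_id' INTEGER, 'key' TEXT, 'value' TEXT)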
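
    Based on the columns above, inserting a post with Python’s sqlite3 module could look roughly like this. Treat it as a sketch under several assumptions: the 'id' column, the millisecond timestamps, the author value "1", and the "draft" status are guesses, not confirmed Publii behaviour. The example runs against an in-memory stand-in table instead of a real db.sqlite, so check the values in existing rows of your own database (and work on a backup) before trying it for real.

```python
import sqlite3
import time

# Stand-in table built from the columns quoted in the post. The 'id' column
# is an ASSUMPTION: the start of the CREATE TABLE statement was not shown.
POSTS_DDL = """
CREATE TABLE IF NOT EXISTS posts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    title TEXT, authors TEXT, slug TEXT, text TEXT,
    featured_image_id INTEGER,
    created_at DATETIME, modified_at DATETIME,
    status TEXT, template TEXT)
"""


def insert_post(conn, title, slug, html, author="1", status="draft"):
    """Insert one row into 'posts' and return the new row id."""
    now = int(time.time() * 1000)  # ASSUMED timestamp format (milliseconds)
    cur = conn.execute(
        "INSERT INTO posts (title, authors, slug, text, featured_image_id,"
        " created_at, modified_at, status, template)"
        " VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (title, author, slug, html, 0, now, now, status, ""),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")  # stand-in for a copy of db.sqlite
conn.execute(POSTS_DDL)
post_id = insert_post(conn, "Item 001", "item-001", "<p>Catalogue entry</p>")
```

    The parameterised query (the ? placeholders) means quotes and other special characters in the converted HTML do not need manual escaping. Any per-post options would presumably go into “posts_additional_data” as rows keyed by the returned post id.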