Importing XML or CSV to Grav as separate posts

Ok, so I’m new to Grav and I’m trying to find a way to import all my posts from a previous blog. I have been using b2evolution, but it’s pretty much dead, with no one taking over the development. However, I have a LOT of posts in the system that I really do want to bring over. The easiest way to pull them is as XML, and I can take that and convert it to text or HTML, but I’m not seeing a way to import them into Grav as actual posts. There’s one old plugin that’s supposed to do it, but it’s not working. I’ve also got the posts in Excel; I don’t know if that would work better.

I did see one post on importing from an XML feed, and while I can parse it, I don’t know how to separate the posts. Any help would be appreciated!

@dubird,

  • Without any information about the content of the XML, there is not much to say.
  • “There’s one old plugin that’s supposed to do it”
    Which plugin?
  • “I did see one post on importing from an XML feed”
    Which post?
  • “but I’m not seeing a way to import them into Grav as actual posts”
    Grav is a flat-file CMS, which means that each post is a folder containing a plain text file. “Importing” is a matter of creating those files.
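
To make "creating files" concrete, here is a minimal sketch that writes a single post by hand. It assumes a Quark-style blog where each post is an `item.md` inside a slug folder under `01.blog`. The helper name `createPost()` and the scratch path are made up for illustration, and writing the title via `json_encode()` is just a shortcut that yields a double-quoted YAML string for ordinary titles.

```php
<?php
// Hypothetical helper: writes one Grav post as <pagesDir>/01.blog/<slug>/item.md.
// The folder name "01.blog" and the file name "item.md" assume a Quark-style blog.
function createPost(string $pagesDir, string $slug, string $title, string $date, string $body): string
{
    $dir = "$pagesDir/01.blog/$slug";
    if (!is_dir($dir) && !mkdir($dir, 0755, true)) {
        throw new RuntimeException("Cannot create $dir");
    }

    // Frontmatter is YAML between two "---" lines, followed by the page body.
    $page = "---\n"
        . 'title: ' . json_encode($title, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES) . "\n"
        . "date: '$date'\n"
        . "---\n"
        . $body . "\n";

    $file = "$dir/item.md";
    file_put_contents($file, $page);

    return $file;
}

// Usage: writes into a scratch folder so a real site is not touched.
$file = createPost(sys_get_temp_dir() . '/grav-demo/pages', 'hello-world', 'Hello World', '2024-01-01', 'First imported post.');
echo $file, "\n";
```

Copying the resulting folder into `user/pages/01.blog` of a real site should be enough for Grav to pick the post up, as long as the theme has a matching `item` template.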

Feed Us
I installed it, and from what I can tell, it’s simply a plugin where you add the code to the template and it will display them on a page. Handy for some things, but not what I’m looking for.

The response there is to create a plugin, which I’m OK with if I have to do it, but I’ve never done one like this, and obviously not for Grav.

I understand that, and that’s the problem. If it were a database, I could build something to import what I have. It’s pulling the info and separating the posts out into individual files that I’m lost on. I don’t actually think you can do it with a plugin, but if anyone knows a script or method that can do it externally, that would at least give me the individual posts to work with. Given that I have almost 200 posts to import, anything to cut down on the manual work would be very helpful!

@dubird, Taking the Feed Us plugin as inspiration, you could try something roughly like the following:

Note: Not sure what your XML/CSV contains. Markdown or HTML?

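  • Create a PHP script:
    <?php
    // Needs the SimpleXML extension and the PECL yaml extension (for yaml_emit()).
    
    $xmlstr = file_get_contents('https://getgrav.org/blog.rss');
    if ($xmlstr === false) {
        exit("Could not read the feed\n");
    }
    
    $xmldoc = new \SimpleXMLElement($xmlstr);
    
    $items = [];
    
    // RSS: items live under <channel>
    if (!empty($xmldoc->channel->item)) {
        $items = $xmldoc->channel->item;
    }
    
    // Atom: entries live directly under the root element
    if (!empty($xmldoc->entry)) {
        $items = $xmldoc->entry;
    }
    
    if (empty($items)) {
        exit("No items\n");
    }
    
    foreach ($items as $item) {
        // Get frontmatter values for Quark blog item
        $frontmatter = [];
        
        // Extract data from RSS/Atom (placeholders, replace with values from $item)
        $frontmatter['title'] = 'my title';    // e.g. (string) $item->title
        $frontmatter['date'] = '2024-01-01';   // e.g. date('Y-m-d', strtotime((string) $item->pubDate))
        $frontmatter['author'] = 'pamtbaau';
    
        // etc.
        
        // Use the last path segment of the item's link as the folder name
        $slug = '';
        if (!empty($item->link)) {
            $path = parse_url((string) $item->link, PHP_URL_PATH) ?: '';
            $slug = basename($path); // also copes with a trailing slash
        }
        if ($slug === '') {
            echo "Skipping item without a usable link\n";
            continue;
        }
    
        $summary = 'my markdown/html summary'; // Get summary of your blog page
        $content = 'my markdown/html content'; // Get content of your blog page
        
        $page = \yaml_emit($frontmatter);
        // Remove only the trailing "...\n" end marker that yaml_emit() adds
        $page = preg_replace('/\n\.\.\.\n?$/', '', $page);
        $page .= "\n---\n";
        $page .= $summary . "\n";
        $page .= "\n===\n\n";
        $page .= $content;
    
        // Create the folder only if it does not exist yet, so the script can be re-run
        $dir = "./pages/01.blog/$slug";
        if (!is_dir($dir)) {
            mkdir($dir, 0755, true);
        }
    
        \file_put_contents("$dir/item.md", $page);
    }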
    
  • Generated tree will be:
    pages
    └── 01.blog
        ├── cms-critic-award-nominations-2024
        │   └── item.md
        ├── cms-critic-award-vote-2024
        │   └── item.md
        ├── grav-18-beta-released
        │   └── item.md
        ├── instagram-feed-deprecated
        │   └── item.md
        ├── macos-sequoia-apache-multiple-php-versions
        │   └── item.md
        ├── macos-sequoia-apache-mysql-vhost-apc
        │   └── item.md
        ├── macos-sequoia-apache-ssl
        │   └── item.md
        ├── macos-sequoia-apache-upgrade-homebrew
        │   └── item.md
        ├── new-email-plugin
        │   └── item.md
        └── tailwindcss4-upgrade-for-typhoon-premium-theme
            └── item.md
    
  • Content of item.md:
    ---
    title: my title
    date: "2024-01-01"
    author: pamtbaau
    ---
    my markdown summary
    
    ===
    
    my markdown content
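
For the summary and content of each item: in many RSS feeds the short text sits in `<description>` while the full post body sits in a namespaced `<content:encoded>` element, which plain property access on a `SimpleXMLElement` does not reach. The sketch below shows one way to read both. Whether the b2evolution export uses these elements at all is an assumption, and `extractBody()` and the inline sample feed are made up for illustration.

```php
<?php
// A tiny inline feed so the sketch runs on its own; a real export will differ.
$xmlstr = <<<'XML'
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>First post</title>
      <description>Short summary</description>
      <content:encoded><![CDATA[<p>Full <b>HTML</b> body</p>]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

$xmldoc = new SimpleXMLElement($xmlstr);

// Hypothetical helper: returns [summary, content] for one RSS <item>.
function extractBody(SimpleXMLElement $item): array
{
    $summary = trim((string) $item->description);

    // Namespaced children are reached through children() with the namespace URI.
    $ns = $item->children('http://purl.org/rss/1.0/modules/content/');
    $content = trim((string) $ns->encoded);

    // Fall back to the summary when the item carries no full body.
    return [$summary, $content !== '' ? $content : $summary];
}

foreach ($xmldoc->channel->item as $item) {
    [$summary, $content] = extractBody($item);
    echo $summary, "\n", $content, "\n";
}
```

Grav renders HTML inside a Markdown page, so an HTML body can usually go into item.md as it is.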
    

Alter the pseudocode to fit your needs.
I’ll leave the creation of the blog.md, the debugging and filling in the pseudocode for you… Have fun!
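
If the Excel copy turns out to be easier to work with, the same idea applies to a CSV export: one row becomes one folder with an item.md. A rough sketch, assuming the sheet is saved as CSV with a header row containing the columns `title`, `date` and `content` (those column names, `slugify()` and `importCsv()` are all assumptions made for illustration):

```php
<?php
// Hypothetical helper: turns a title into a folder-friendly slug.
// Only handles ASCII; other characters are collapsed into "-".
function slugify(string $title): string
{
    $slug = strtolower(trim($title));
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);

    return trim($slug, '-');
}

// Hypothetical importer: every CSV row becomes <blogDir>/<slug>/item.md.
// Assumes a header row with the columns "title", "date" and "content".
function importCsv(string $csvPath, string $blogDir): int
{
    $handle = fopen($csvPath, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $csvPath");
    }

    $header = fgetcsv($handle, 0, ',', '"', '\\');
    if ($header === false) {
        fclose($handle);
        return 0;
    }
    // Excel may prepend a UTF-8 byte order mark to the first column name.
    $header[0] = preg_replace('/^\xEF\xBB\xBF/', '', (string) $header[0]);

    $count = 0;
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if (count($row) !== count($header)) {
            continue; // skip blank or malformed lines
        }
        $post = array_combine($header, $row);
        $slug = slugify($post['title']);
        if ($slug === '') {
            continue; // no usable folder name
        }

        $dir = "$blogDir/$slug";
        if (!is_dir($dir)) {
            mkdir($dir, 0755, true);
        }

        $page = "---\n"
            . 'title: ' . json_encode($post['title'], JSON_UNESCAPED_UNICODE) . "\n"
            . "date: '{$post['date']}'\n"
            . "---\n"
            . $post['content'] . "\n";

        file_put_contents("$dir/item.md", $page);
        $count++;
    }

    fclose($handle);

    return $count;
}

// Usage (paths are examples):
// echo importCsv('posts.csv', './pages/01.blog'), " posts written\n";
```

Two posts with the same title would end up in the same folder and overwrite each other, so a date prefix or counter in the slug may be needed for a real import.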
