Complete WordPress Atom feed: an XSLT transformation

Previously, I tried obtaining a full Atom feed without pagination from WordPress. I didn’t succeed, so I ended up writing an XSL transformation which merges all the pages of this Atom feed into one valid Atom XML stream.

The transformation: wordpress-full-atom-feed.xsl

<?xml version="1.0" encoding="UTF-8"?>
 
<xsl:transform version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:thr="http://purl.org/syndication/thread/1.0"
xmlns="http://www.w3.org/2005/Atom">
 
<xsl:param name="base_url" select="atom:feed/atom:link[@rel='self']/@href" />
 
 
<xsl:template match="/">
  <xsl:apply-templates select="node()" />
</xsl:template>
 
<!-- Copy everything that no more specific template handles, unchanged.
     (A separate match="*" rule used to sit here; it had the same priority
     as this one, so it was ambiguous and never took effect.) -->
<xsl:template match="@*|node()">
  <xsl:copy-of select="." />
</xsl:template>
 
<xsl:template match="atom:feed">
  <feed>
    <xsl:apply-templates select="@*|node()" />
 
    <!-- Page 1 is the input document itself, so merging starts at page 2. -->
    <xsl:call-template name="process_feed_page">
      <xsl:with-param name="page" select="2" />
    </xsl:call-template>
  </feed>
</xsl:template>
 
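<!-- Appends the entries of feed page $page, then recurses to page $page + 1.
     Recursion ends at the first page that yields no atom:entry elements. -->
<xsl:template name="process_feed_page">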
  <xsl:param name="page" />
 
  <xsl:variable name="page_doc" select="document(concat($base_url, '?paged=', $page))" />
 
  <xsl:if test="$page_doc/atom:feed/atom:entry">
    <xsl:apply-templates select="$page_doc/atom:feed/atom:entry" />
  
    <xsl:call-template name="process_feed_page">
      <xsl:with-param name="page" select="$page + 1" />
    </xsl:call-template>
  </xsl:if>
</xsl:template>
 
</xsl:transform>

The transformation works by processing the atom:feed element of the first page. Just before that element is closed, the process_feed_page template is called for page 2. This template tries to open the given page and processes all atom:entry elements in it. It then recurses to the next page, and stops at the first page that contains no entries.

The next page’s URL can be guessed: it is the normal feed URL with ?paged=[pagenum] appended to it. The feed URL itself can be read from the feed, because WordPress includes it as a rel='self' link:

<link rel="self" type="application/atom+xml" href="http://blog.bigsmoke.us/author/halfgaar/feed/atom" />
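
To make the page walk and its stopping condition explicit, here is a rough shell sketch of the same logic. It is not part of the stylesheet: fetch_page and list_feed_pages are made-up helper names, curl is assumed to be installed, and the grep for <entry is only a crude stand-in for the stylesheet's XPath test.

```shell
# fetch_page (hypothetical helper): print the document at a URL on stdout.
# Assumes curl; with -f, an HTTP error yields no output at all.
fetch_page() {
  curl -fsS "$1" 2>/dev/null
}

# list_feed_pages (hypothetical helper): print the URL of every feed page
# that would end up in the merged feed.
# Page 1 is the feed URL itself; later pages are <feed URL>?paged=N.
list_feed_pages() {
  base_url=$1
  printf '%s\n' "$base_url"
  page=2
  # Crude approximation of the test $page_doc/atom:feed/atom:entry:
  # keep going while the page contains at least one "<entry".
  while fetch_page "${base_url}?paged=${page}" | grep -q '<entry'; do
    printf '%s\n' "${base_url}?paged=${page}"
    page=$((page + 1))
  done
}
```

Calling list_feed_pages with the feed URL from the rel='self' link would print one URL per page, ending just before the first page without entries.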

For very old WordPress versions this doesn’t work, because the paged parameter isn’t supported there. Also, older versions might require you to supply the $base_url param to the XSLT processor, because the rel='self' link is incorrectly set to the URL of the default feed within all other feeds (such as the author feed or the tag feed).
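
With xsltproc, a stylesheet parameter can be set from the command line using --stringparam. The wrapper below is a small sketch of that; full_feed is a made-up name, and it assumes the stylesheet is in the current directory.

```shell
# full_feed (hypothetical wrapper around the xsltproc call from this post).
# $1: URL of the first feed page (the XSLT input document)
# $2: optional base URL for the "?paged=N" requests; defaults to $1
full_feed() {
  feed_url=$1
  base_url=${2:-$feed_url}
  # --stringparam overrides the stylesheet's top-level base_url parameter,
  # so the rel='self' link in the feed is no longer consulted.
  xsltproc --stringparam base_url "$base_url" \
    wordpress-full-atom-feed.xsl "$feed_url"
}
```

For an author feed on an older WordPress, full_feed called with just the author feed URL would force the paged requests to use that same URL instead of whatever the rel='self' link claims.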

Invocation

Invocation is simple. I use libxslt’s xsltproc:

xsltproc wordpress-full-atom-feed.xsl http://blog.bigsmoke.us/author/halfgaar/feed/atom

That’s it. You end up with a full feed as if there never was any pagination to begin with; it almost looks as if WordPress does support the nopaging option for feeds.
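
A quick way to check that the merge picked up more than the first page is to count the entries in the result. The helper below is only a sketch: count_entries is a made-up name, it assumes the output was redirected to a file, and it does a plain text match rather than a real XML parse.

```shell
# count_entries (hypothetical helper): count "<entry" start tags in a file.
# This is a text match, not an XML parse, so treat the number as a rough check
# (an "<entry" inside a CDATA section would be counted too).
count_entries() {
  grep -o '<entry[ >]' "$1" | wc -l | tr -d ' '
}
```

After redirecting the xsltproc output to a file, count_entries on that file should report roughly the total number of posts in the feed rather than one page's worth.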


2 Comments

  1. Comment by Toby
    On November 19, 2009 at 01:12

    Hi, could you give me some more basic step-by-step instructions on how to get this working on my WP site, as my feeds stop at the next page mark. Many thanks.

  2. Comment by Rowan Rodrik
    On November 14, 2010 at 18:48

    Hi Toby,

    This trick is really only useful when you want to download the whole feed. It’s not so useful if you want your feed to include more posts per “page”.
