Smokes your problems, coughs fresh air.

Complete WordPress Atom feed: an XSLT transformation

Previously, I tried obtaining a full Atom feed without pagination from WordPress. I didn’t succeed, so I ended up writing an XSL transformation which merges all the pages of this Atom feed into one valid Atom XML stream.

The transformation: wordpress-full-atom-feed.xsl

<?xml version="1.0" encoding="UTF-8"?>
 
<xsl:transform version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:thr="http://purl.org/syndication/thread/1.0"
xmlns="http://www.w3.org/2005/Atom">
 
<xsl:param name="base_url" select="atom:feed/atom:link[@rel='self']/@href" />
 
 
<xsl:template match="/">
  <xsl:apply-templates select="node()" />
</xsl:template>
 
<!-- Copy attributes and any node without a more specific template verbatim (deep copy). -->
<xsl:template match="@*|node()">
  <xsl:copy-of select="." />
</xsl:template>
 
<xsl:template match="atom:feed">
  <feed>
    <xsl:apply-templates select="@*|node()" />
 
    <xsl:call-template name="process_feed_page">
      <xsl:with-param name="page" select="2" />
    </xsl:call-template>
  </feed>
</xsl:template>
 
<xsl:template name="process_feed_page">
  <xsl:param name="page" />
 
  <xsl:variable name="page_doc" select="document(concat($base_url, '?paged=', $page))" />
 
  <xsl:if test="$page_doc/atom:feed/atom:entry">
    <xsl:apply-templates select="$page_doc/atom:feed/atom:entry" />
  
    <xsl:call-template name="process_feed_page">
      <xsl:with-param name="page" select="$page + 1" />
    </xsl:call-template>
  </xsl:if>
</xsl:template>
 
</xsl:transform>

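The transformation works by processing the atom:feed element. It first copies everything from the first page and then, before closing the feed element, calls the process_feed_page template. This template tries to load the next page and process all atom:entry elements in it, after which it recurses to the page after that. The recursion stops at the first page that cannot be loaded or that contains no entries.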

The next page’s URL can be guessed because this is the normal feed URL with ?paged=[pagenum] appended to it. The feed URL can be found because WordPress adds it to the feed:

<link rel="self" type="application/atom+xml" href="http://blog.bigsmoke.us/author/halfgaar/feed/atom" />
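
To make the URL guessing concrete, here is a minimal shell sketch of the URLs involved. The page_url helper is hypothetical and not part of the stylesheet. Note one assumption that goes beyond the stylesheet: the helper switches to & when the feed URL already carries a query string, whereas the stylesheet always appends ?paged=.

```shell
# Hypothetical helper: print the URL of page $2 of the feed at $1.
# Unlike the stylesheet, this uses '&' if the feed URL already has a query string.
page_url() {
  case $1 in
    *\?*) printf '%s&paged=%s\n' "$1" "$2" ;;
    *)    printf '%s?paged=%s\n' "$1" "$2" ;;
  esac
}

page_url 'http://blog.bigsmoke.us/author/halfgaar/feed/atom' 2
```

The call above prints the second page's URL for the author feed, which is the first URL the stylesheet requests through document().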

For very old WordPress versions this doesn’t work, because the paged parameter isn’t supported there. Also, older versions might require you to supply the $base_url param to the XSLT processor, because the rel='self' link is incorrectly set to the URL of the default feed within all other feeds (such as the author feed or the tag feed).
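
Supplying the parameter yourself could look roughly like the sketch below. It relies on xsltproc's --stringparam option to override the stylesheet's base_url parameter. The merge_feed function name is made up for illustration, and the sketch assumes that wordpress-full-atom-feed.xsl sits in the current directory.

```shell
# Hypothetical wrapper: merge all pages of the feed at $1, passing the feed
# URL explicitly as base_url instead of relying on the rel="self" link.
# Assumes wordpress-full-atom-feed.xsl is in the current directory.
merge_feed() {
  feed_url=$1
  xsltproc --stringparam base_url "$feed_url" \
    wordpress-full-atom-feed.xsl "$feed_url"
}
```

It would then be called as merge_feed http://blog.bigsmoke.us/author/halfgaar/feed/atom, with the merged feed written to standard output.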

Invocation

Invocation is simple. I use libxslt’s xsltproc:

xsltproc wordpress-full-atom-feed.xsl http://blog.bigsmoke.us/author/halfgaar/feed/atom

That’s it. You end up with a full feed as if there never was any pagination to begin with; it almost looks as if WordPress does support the nopaging option for feeds.
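
If you want to keep the result, something along these lines should do. This is only a sketch: save_full_feed is a hypothetical name, the stylesheet is assumed to be in the current directory, and the xmllint call (from libxml2) merely checks that the merged file is well-formed XML, not that it is a valid Atom feed.

```shell
# Hypothetical wrapper: write the merged feed for URL $1 to file $2,
# then check that the result is well-formed XML.
# Assumes wordpress-full-atom-feed.xsl is in the current directory.
save_full_feed() {
  feed_url=$1
  out_file=$2
  xsltproc -o "$out_file" wordpress-full-atom-feed.xsl "$feed_url" &&
    xmllint --noout "$out_file"
}
```

For example, save_full_feed http://blog.bigsmoke.us/author/halfgaar/feed/atom full-feed.xml would leave the complete feed in full-feed.xml and return a non-zero status if either step fails.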

3 Comments

  1. Toby

    Hi, could you give me some more basic step-by-step instructions on how to get this working on my WP site, as my feeds stop at the next page mark. Many thanks.

  2. Rowan Rodrik

    Hi Toby,

    This trick is really only useful when you want to download the whole feed. It’s not so useful if you want your feed to include more posts per “page”.

  3. ric

    Hi Rowan,

    Is there a way I could adapt the code above to the sample XML Atom feed below?

    <?xml version="1.0" encoding="utf-8" ?>
      <feed xmlns:s="http://feed.example.com/services" xmlns="http://www.w3.org/2005/Atom">
      <title type="text">org select - with 'Bush' in name</title>
      <id>http://feed.example.com/orga/srvic/name/Bush</id>
      <rights type="text">Copyright 2015</rights>
      <updated>2010-05-09</updated>
      <category term="Search" />
      <logo>http://www.org.uk/orgcservices/docs/logo.jpg</logo>
      <author>
      <name>org select</name>
      <uri>http://www.sampleorg.uk</uri>
      <email>wbsrvcs@example.com</email>
      </author>
     <link rel="self" type="application/atom+xml" title="org select - Practices with 'Bush' in name" href="http://feed.example.com/organisations/services/name/Bush?apikey=12345" /> 
      <link rel="first" type="application/atom+xml" title="first" length="1000" href="http://feed.example.com/organisations/services/name/Bush?apikey=12345&amp;page=1" /> 
      <link rel="next" type="application/atom+xml" title="next" length="1000" href="http://feed.example.com/organisations/services/name/Bush?apikey=12345&amp;page=2" /> 
      <link rel="last" type="application/atom+xml" title="last" length="1000" href="http://feed.example.com/organisations/services/name/Bush?apikey=12345&amp;page=9" />
      
      <tracking xmlns="http://feed.example.com/services"><img style="border: 0; width: 1px; height: 1px;" alt="" src="http://webtrendsample.com/dfgfgh56dgd5/ddds.gif?fffuri=/organisations%2fservices%2fname%2fBush&amp;wt.js=no&amp;wt.cg_n=feed"/></tracking> 
      <entry>
      <id>http://feed.example.com/organisations/services/245645634</id>
      <title type="text">Bush Tree House</title> 
      <updated>2015-05-12T08:08:08Z</updated> 
      <link rel="self" title="Bush Tree House Surgery" href="http://feed.example.com/organisations/services/24308?apikey=12345" /> 
      <link rel="alternate" title="Bush Tree House Surgery" href="http://www.org.uk/Srvcs/DR/Default.aspx?id=34343434" /> 
      <content type="application/xml">
      <s:organisationSummary>
      <s:name>Bush Tree House</s:name> 
      <s:odsCode>JD12345</s:odsCode> 
      <s:address>
      <s:addressLine>Bush Tree House</s:addressLine> 
      <s:addressLine>John Street</s:addressLine> 
      <s:addressLine>Lirkam</s:addressLine> 
      <s:addressLine>Stockion</s:addressLine> 
      <s:postcode>DWQ4 22GW</s:postcode> 
      </s:address>
      <s:contact type="General">
      <s:telephone>111 2223333444</s:telephone>
      </s:contact>
      </s:organisationSummary>
      </content>
      </entry>
     <entry>
      .
      .
      .
      .

    Thanks very much for any help.

    Regards

© 2024 BigSmoke
