Using PHP 5's SimpleXML

by Adam Trachtenberg, coauthor of PHP Cookbook
01/15/2004

XML is great, but I've constantly wondered why it's so difficult to parse. Most languages provide you with three options: SAX, DOM, and XSLT. Each has its own problems:

  • SAX's event-based design forces you to track elements manually, by pushing and popping them on and off of a stack.
  • DOM is bulky and cumbersome. While comprehensive, it takes seven lines just to read <hello>.
  • XSLT? If I wanted to program in a functional language, I'd use Lisp instead of PHP.

SimpleXML is a new and unique feature of PHP 5 that solves these problems by turning an XML document into a data structure you can iterate through like a collection of arrays and objects. It excels when you're only interested in an element's attributes and text and you know the document's layout ahead of time. SimpleXML is easy to use because it handles only the most common XML tasks, leaving the rest for other extensions.

This article shows how to use SimpleXML to read an XML file, parse the results into a useful form, and query the document with XPath. I use RSS for the examples, since some versions of RSS are nice and easy. Then there's RSS 1.0. It uses RDF, multiple namespaces, and defines a default namespace for its elements. (Not so nice and easy.)

Along the way, there's a brief discussion of XML namespaces and XPath, since they're necessary to process XML documents that go beyond the basics. In particular, to handle RSS 1.0, you need to work with these XML specifications.

To try SimpleXML, you need a copy of PHP 5 Beta 3, as not everything described here works in earlier versions. SimpleXML also requires libxml2, an open source XML parsing library that all of PHP 5's XML extensions now use. SimpleXML support is enabled by default, so it's automatically installed when you build PHP 5.

Like PHP 5, SimpleXML is beta quality. There are still a few bugs, memory leaks, and unimplemented features, but overall it's coming together nicely.

Reading XML

The first set of examples uses the following chunk of RSS, which is stored in rss-0.91.xml:

<?xml version="1.0" encoding="utf-8" ?>
<rss version="0.91">
<channel>
    <title>PHP: Hypertext Preprocessor</title>
    <link>http://www.php.net/</link>
    <description>The PHP scripting language web site</description>
</channel>

<item>
    <title>PHP 5.0.0 Beta 3 Released</title>
    <link>http://www.php.net/downloads.php</link>
    <description>PHP 5.0 Beta 3 has been released. The third beta
    of PHP is also scheduled to be the last one (barring unexpected
    surprises).</description>
</item>
<item>
    <title>PHP Community Site Project Announced</title>
    <link>http://shiflett.org/archive/19</link>
    <description>
    Members of the PHP community are seeking volunteers to help
    develop the first web site that is created both by the community and for
    the community.</description>
</item>
</rss>

To begin, create a new SimpleXML object. For XML on disk, use simplexml_load_file('/path/to/file.xml'). If it's stored in a PHP variable, use simplexml_load_string($xml). So, to load the RSS, do:

$s = simplexml_load_file('rss-0.91.xml');
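For XML that is already in a variable, the same pattern applies with simplexml_load_string(). The following is a minimal sketch, assuming a shortened copy of the feed in a hypothetical $xml variable. Both loader functions return false when the document can't be parsed, so the sketch checks for that before using the result. The getName() call is only there to confirm that something was loaded.

```php
<?php
// Hypothetical, shortened copy of the feed held in a string.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8" ?>
<rss version="0.91">
<channel>
    <title>PHP: Hypertext Preprocessor</title>
    <link>http://www.php.net/</link>
</channel>
</rss>
XML;

// simplexml_load_string() returns false if the XML is not well formed.
$s = simplexml_load_string($xml);
if ($s === false) {
    die("Could not parse the XML\n");
}

// Confirm the load by printing the name of the root element.
print "Loaded root element: " . $s->getName() . "\n";
```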

Element text is accessed like object properties:

print $s->channel->title . "\n";

PHP: Hypertext Preprocessor

If there's more than one element with the same name at the same level of the document, they're placed inside an array. In this example, there's only one <channel>, but two <item>s. To access an <item>, use its position in the array, counting from zero:

print $s->item[0]->title . "\n";

PHP 5.0.0 Beta 3 Released

To print all titles, use a foreach loop:

foreach ($s->item as $item) {
    print $item->title . "\n";
}

PHP 5.0.0 Beta 3 Released
PHP Community Site Project Announced

Use array notation to read element attributes:

print $s['version'] . "\n";

0.91
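One detail worth keeping in mind: each property or attribute you read is itself a SimpleXML object, not a plain string. print converts it for you, but when you want to store or compare values it is safer to cast explicitly. Here is a minimal sketch, assuming a trimmed-down copy of the feed in a hypothetical $xml string, that turns the items into an ordinary PHP array:

```php
<?php
// Hypothetical, trimmed copy of the feed used in this article.
$xml = <<<XML
<rss version="0.91">
<item>
    <title>PHP 5.0.0 Beta 3 Released</title>
    <link>http://www.php.net/downloads.php</link>
</item>
<item>
    <title>PHP Community Site Project Announced</title>
    <link>http://shiflett.org/archive/19</link>
</item>
</rss>
XML;

$s = simplexml_load_string($xml);

// Attributes and element text are objects until cast to strings.
$version = (string) $s['version'];

// Build a plain array of title => link pairs.
$links = array();
foreach ($s->item as $item) {
    $links[(string) $item->title] = (string) $item->link;
}

print "RSS version: $version\n";
print_r($links);
```

Once the values are plain strings, they can be used as array keys, compared with ===, or passed to other functions without surprises.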

Other XML features, like comments and processing instructions, are unsupported. You can't (yet) access these nodes. However, since most XML documents don't place vital information in comments or use processing instructions, this isn't a big drawback.

Querying with XPath

SimpleXML uses XPath to allow you to gather information from a document. Find and print all the text inside title elements with:

foreach ($s->xsearch('//title') as $title) {
    print "$title\n";
}

PHP: Hypertext Preprocessor
PHP 5.0.0 Beta 3 Released
PHP Community Site Project Announced

The xsearch() method searches a SimpleXML object and returns an array of matching nodes. (In the released versions of PHP 5, this method is named xpath().) Pass your XPath query as the argument. In this case, //title finds all title elements regardless of their location in the tree. To restrict the search to only the <title>s inside <item>s, use //item/title.
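The narrower //item/title query can be sketched as follows. Two assumptions apply: the $xml string is a hypothetical trimmed copy of the feed, and the code calls xpath(), which is the method's name in released versions of PHP 5 and later. Under Beta 3, substitute xsearch() as in the example above.

```php
<?php
// Hypothetical, trimmed copy of the feed used in this article.
$xml = <<<XML
<rss version="0.91">
<channel>
    <title>PHP: Hypertext Preprocessor</title>
</channel>
<item>
    <title>PHP 5.0.0 Beta 3 Released</title>
</item>
<item>
    <title>PHP Community Site Project Announced</title>
</item>
</rss>
XML;

$s = simplexml_load_string($xml);

// Match only <title> elements whose parent is an <item>;
// the channel's own <title> is skipped.
$titles = array();
foreach ($s->xpath('//item/title') as $title) {
    $titles[] = (string) $title;
    print "$title\n";
}
```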

If you've used XSLT, you're familiar with XPath. XSLT templates use XPath expressions to determine when to process a node. For more on XPath, read John E. Simpson's XPath and XPointer (O'Reilly) or John's XML.com article, Top Ten Tips to Using XPath and XPointer. Additionally, Chapter 9 of XML in a Nutshell, by Elliotte Rusty Harold and W. Scott Means (O'Reilly), covers XPath and is available free online.

While these examples are somewhat trivial, XPath is quite useful with complex documents, as you can create sophisticated queries to return finely tuned results.
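As one illustration of a more selective query, the sketch below uses an XPath predicate to return only the links of items whose title contains the word "Beta". It assumes a hypothetical trimmed copy of the feed in $xml and calls xpath(), the method name in released versions of PHP 5; with Beta 3 you would call xsearch() instead.

```php
<?php
// Hypothetical, trimmed copy of the feed used in this article.
$xml = <<<XML
<rss version="0.91">
<item>
    <title>PHP 5.0.0 Beta 3 Released</title>
    <link>http://www.php.net/downloads.php</link>
</item>
<item>
    <title>PHP Community Site Project Announced</title>
    <link>http://shiflett.org/archive/19</link>
</item>
</rss>
XML;

$s = simplexml_load_string($xml);

// [contains(title, "Beta")] keeps only the <item>s whose <title> text
// includes "Beta"; /link then selects the <link> child of each match.
$matches = $s->xpath('//item[contains(title, "Beta")]/link');

foreach ($matches as $link) {
    print "$link\n";
}
```

The filtering happens inside the query, so the PHP code doesn't need its own loop-and-compare logic to pick out the matching items.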
