The LibXML2 Discovery Project

Lion Kimbro : Projects : The LibXML2 Discovery Project

Just some notes playing around with libxml2.

Installation

  1. ftp ftp.gnome.org, go to /pub/GNOME/stable/redhat/i386/libxml, mget libxml2*
  2. rpm -i libxml2*

All ready to play!

Makefile

CFLAGS=-g -Wall `gnome-config --cflags gnome gnomeui xml2`
LDFLAGS=`gnome-config --libs gnome gnomeui xml2`

clean:
	rm -f test *.o core

You might not need "gnome" or "gnomeui" in the CFLAGS and LDFLAGS lists.

API to Create XML Documents

Commands necessary to create an XML document:

Example Use (Create XML Document)

The following example is ripped from somewhere on the Internet:

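#include <stdio.h>
#include <libxml/tree.h>

int main(void)
{
  xmlDocPtr doc;
  xmlNodePtr root, tree, subtree;

  /* libxml2 strings are xmlChar*; BAD_CAST casts plain C string literals. */
  doc = xmlNewDoc(BAD_CAST "1.0");
  root = xmlNewDocNode(doc, NULL, BAD_CAST "EXAMPLE", NULL);
  xmlDocSetRootElement(doc, root);
  xmlSetProp(root, BAD_CAST "prop1", BAD_CAST "gnome is great");
  xmlSetProp(root, BAD_CAST "prop2", BAD_CAST "& linux too");
  tree = xmlNewChild(root, NULL, BAD_CAST "head", NULL);
  subtree = xmlNewChild(tree, NULL, BAD_CAST "title", BAD_CAST "Welcome to Gnome");
  tree = xmlNewChild(root, NULL, BAD_CAST "chapter", NULL);
  subtree = xmlNewChild(tree, NULL, BAD_CAST "title", BAD_CAST "The Linux adventure");
  subtree = xmlNewChild(tree, NULL, BAD_CAST "p", BAD_CAST "bla bla bla ...");
  subtree = xmlNewChild(tree, NULL, BAD_CAST "image", NULL);
  xmlSetProp(subtree, BAD_CAST "href", BAD_CAST "linus.gif");

  /* Write the document to stdout, then release the whole tree. */
  xmlDocDump(stdout, doc);
  xmlFreeDoc(doc);

  return 0;
}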

XML output:

<?xml version="1.0"?>
<EXAMPLE prop1="gnome is great" prop2="&amp; linux too"><head><title>Welcome to Gnome</title></head><chapter><title>The Linux adventure</title><p>bla bla bla ...</p><image href="linus.gif"/></chapter></EXAMPLE>

Internal memory structures:

Image of tree produced by code.

API to Read XML Documents

To explore the XML document, use the following relationships from xmlDocPtr:

Learn more about these elements in the actual structure declaration.

If you notice a lot of empty, whitespace-only "text" nodes, that's because there is white space between the XML tags. I wish there were an easy way to skip whitespace text nodes automatically in the API.

How does the API distinguish an element like <text>(stuff)</text> from actual text? A real text node's content variable points to its characters, while an element's content variable is NULL. Thus, <text></text> would parse to an xmlNodePtr with name "text", but content NULL. (Text nodes are also named "text", so the name alone won't tell you.) But there's an even better way: call int xmlNodeIsText( xmlNodePtr ), which returns 1 for a text node and 0 for anything else.

Example Usage (Read XML Document)

This program requires an XML file named osd.xml; you can use the XML file created in the previous example. When I wrote it, I had the OpenSource Directory in mind.

#include <stdio.h>
#include <libxml/tree.h>
#include <libxml/parser.h>


void print_spaces( int num )
{
  while( num-- > 0 )
    printf( " " );
}

/* Print the name of p and of each following sibling, recursing into children. */
void print_recursive( xmlDocPtr doc, xmlNodePtr p, int depth )
{
  while( p != NULL ) {
    print_spaces( depth );
    printf( "%s\n", p->name ? (const char*) p->name : "(unnamed)" );
    /* Show the content of text nodes found at depth 3. */
    if( depth == 3 && xmlNodeIsText( p ) && p->content != NULL ) {
      print_spaces( depth );
      printf( "content: %s\n", (const char*) p->content );
    }
    print_recursive( doc, p->children, depth+1 );
    p = p->next;
  }
}

int main( void )
{
  xmlDocPtr  doc;

  doc = xmlParseFile( "osd.xml" );
  if( doc == NULL ) {
    fprintf( stderr, "could not parse osd.xml\n" );
    return 1;
  }

  print_recursive( doc, doc->children, 0 );

  xmlFreeDoc( doc );
  return 0;
}

A Tree with Text Nodes

Here's an example XML file, and a graph of it.

<?xml version="1.0"?>
<A>Hello, World!</A>
<B>Sub Elements:
   <NUM1></NUM1>
   <NUM2>some words</NUM2>
   <NUM3></NUM3>
</B>

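You may be wondering why there is a text node after <A> and before <B>. That's because of the white space: there's a newline character between <A>Hello, World!</A> and <B>Sub Elements:.... That newline shows up as a text node. (A well-formed file needs a single root element around <A> and <B>; the newline is then a text node among that root's children.)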

Another Example Usage (Read XML Document)

This program requires the OpenSource Directory XML file, stack space, and a sense of humor.


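#include <stdio.h>
#include <libxml/tree.h>
#include <libxml/parser.h>


/* Depth-first search of p, its descendants, and its following siblings. */
xmlNodePtr find_node( xmlNodePtr p, const char* name )
{
  xmlNodePtr result;

  if( p == NULL )                               return NULL;
  if( xmlStrEqual( p->name, BAD_CAST name ) )   return p;

  result = find_node( p->children, name );
  return result?result:find_node( p->next, name );
}

void list_groups( xmlNodePtr p )
{
  xmlNodePtr text;

  if( p == NULL ) return;

  /* Guard against a group with no group_name or no text inside it. */
  text = find_node( find_node( p, "group_name" ), "text" );
  if( text != NULL && text->content != NULL )
    printf( "%s\n", (const char*) text->content );

  list_groups( find_node( p->next, "group" ) );
}

int main( void )
{
  xmlDocPtr doc = xmlParseFile( "osd.xml" );

  if( doc == NULL ) return 1;

  list_groups( find_node( doc->children, "group" ) );
  xmlFreeDoc( doc );
  return 0;
}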

Projects

A nice tool would be one that would use an XML file from the OpenSource Directory and allow you to quickly browse through projects on your local client.

Give the user the following options:

Look up "Computers Programming Practice CommandLineOptions" in my weird file to get the table layout.

Relevant Web Pages