Simplifying XML navigation in Ruby
So I have been developing a bit of Ruby on Rails code that queries web services and maps the results to rhtml. The REXML library is very nice: it provides a clean navigation API and keeps the code that extracts the content tidy. However, I really wanted an even simpler API for navigating across XML elements, similar to the way you can navigate through a network of database objects using ActiveRecord.
Since REXML provided most of what I wanted, it was pretty straightforward to get the rest. All I really needed was a class that wraps a REXML::Element and allows navigation through basic properties.
Here is a quick example of its use with the weather.com web service:
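require 'net/http'
require 'rexml/document'

# Fetch the current conditions for @zip and wrap the root element of the reply.
response = Net::HTTP.get_response("xoap.weather.com",
  "/weather/local/#{@zip}?cc=*&link=xoap&prod=xoap" +
  "&par=#{@@par_id}&key=#{@@weather_key}")
@weather = XMLElementWrapper.new(REXML::Document.new(response.body).root)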
Then you can simply access the network of objects as:
@weather.cc.wind.s # city wind speed
@weather.cc.hmid # city humidity
@weather.loc.sunr # time of sunrise
This allows me to think of the XML graph as a network of objects instead of raw XML content. The weather and Yahoo demos on openrico.org both use this approach.
Notice that each property is either an attribute or another XML wrapper that we can continue navigating through. If the XML tags are too cryptic ('cc' and 'loc' are good examples), we could add a mechanism to define mappings to friendlier names. This would be trivial in Ruby. However, I have not implemented user-defined mappings in this example.
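One way such a mapping could look is sketched below. This is not part of the wrapper described in this post: the class name MappedXMLWrapper, the ALIASES table, the tag_for helper, and the sample XML are all made up for illustration. The idea is simply to translate a readable method name into the real tag name before asking REXML for it.

```ruby
require 'rexml/document'

# Hypothetical wrapper variant that translates readable names to terse tags.
class MappedXMLWrapper
  def initialize(element, aliases = {})
    @element = element
    @aliases = aliases # e.g. { current: 'cc' }
  end

  # Look for a child element first, then fall back to an attribute.
  def method_missing(method_id, *args)
    name = tag_for(method_id)
    child = @element.elements[name]
    # Child wrappers share the same alias table.
    return MappedXMLWrapper.new(child, @aliases) if child
    @element.attributes[name] # nil when nothing matches
  end

  def respond_to_missing?(method_id, include_private = false)
    name = tag_for(method_id)
    !@element.elements[name].nil? || !@element.attributes[name].nil? || super
  end

  def to_s
    @element.text.to_s
  end

  private

  # Use the alias when one is defined, otherwise the method name itself.
  def tag_for(method_id)
    @aliases.fetch(method_id, method_id).to_s
  end
end

# Made-up alias table and sample document, for illustration only.
ALIASES = { current: 'cc', location: 'loc', humidity: 'hmid', sunrise: 'sunr' }.freeze

xml = '<weather><cc><hmid>54</hmid></cc><loc code="02134"><sunr>6:12 AM</sunr></loc></weather>'
weather = MappedXMLWrapper.new(REXML::Document.new(xml).root, ALIASES)

puts weather.current.humidity  # reaches the same node as weather.cc.hmid
puts weather.location.sunrise
puts weather.location.code     # attribute lookup still works
```

Because unmapped names fall through unchanged, the terse tags keep working alongside the friendly ones.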
The wrapper also has an 'each' method that supports iteration through the children with a specific tag.
Now the code for the wrapper is very simple due to the nice features provided in REXML:
class XMLElementWrapper
  def initialize(element)
    @element = element
    @cache = {} # wrappers created so far, keyed by REXML element
  end

  # Resolve an unknown method name as a child element first, then as an attribute.
  def method_missing(method_id)
    name = method_id.to_s
    elem = @element.elements[name]
    return @element.attributes[name] if elem.nil?
    # Treat a completely empty element as missing.
    return nil if elem.attributes.empty? && elem.elements.empty? && elem.text.nil?
    wrap(elem)
  end

  def name() @element.name end

  # Yield a wrapper for every child element, or only for the children with
  # the given tag name when one is supplied.
  def each(name = nil)
    name = name.to_s unless name.nil?
    @element.elements.each { |e| yield wrap(e) if name.nil? || e.name == name }
  end

  # Return the cached wrapper for an element, creating it on first use.
  def wrap(element)
    @cache[element] ||= create(element)
  end

  # Factory method: subclasses override this to change the class of sub-nodes.
  def create(element) XMLElementWrapper.new(element) end

  def to_xml() @element end
  def to_s() @element.text end
  def to_f() @element.text.to_f end
  def to_i() @element.text.to_i end
end
The class uses 'method_missing' to handle every accessor, whether it reads an attribute or navigates to another object. It also caches each wrapper it creates and builds sub-nodes through a factory method ('create'), so a subclass can easily customize the creation of sub-nodes.
The following example might be used to provide an XML wrapper that sanitizes or textifies all contents, for displaying something like Yahoo search results in your page.
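class SanitizedXMLObject < XMLElementWrapper
  # Wrap sub-nodes in the sanitizing class as well.
  def create(element)
    SanitizedXMLObject.new(element)
  end

  # 'sanitize' is assumed to be available here (for example, a Rails text helper mixed into the class).
  def to_s() sanitize(super) end
end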
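Outside of Rails there is no 'sanitize' helper, so here is a rough standalone sketch of the same idea. Everything in it is an assumption made for illustration: MiniWrapper is a cut-down stand-in for the wrapper (child-element navigation only), EscapedWrapper plays the role of the sanitizing subclass, and plain HTML escaping stands in for whatever sanitizing you actually need.

```ruby
require 'rexml/document'

# Cut-down stand-in for the wrapper: child-element navigation only.
class MiniWrapper
  def initialize(element)
    @element = element
  end

  def method_missing(method_id, *args)
    child = @element.elements[method_id.to_s]
    child ? create(child) : super
  end

  def respond_to_missing?(method_id, include_private = false)
    !@element.elements[method_id.to_s].nil? || super
  end

  # Factory method: subclasses override this to control the sub-node class.
  def create(element)
    MiniWrapper.new(element)
  end

  def to_s
    @element.text.to_s
  end
end

# Hypothetical subclass that escapes text instead of calling Rails' sanitize.
class EscapedWrapper < MiniWrapper
  HTML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

  # Keep every sub-node in the escaping class.
  def create(element)
    EscapedWrapper.new(element)
  end

  # Escape the HTML-significant characters in the element text.
  def to_s
    super.gsub(/[&<>"]/, HTML_ESCAPES)
  end
end

# Made-up search result whose title contains markup.
xml = '<result><title>Tom &amp; &lt;b&gt;Jerry&lt;/b&gt;</title></result>'
result = EscapedWrapper.new(REXML::Document.new(xml).root)
escaped_title = result.title.to_s
puts escaped_title
```

The point is that overriding 'create' is enough to make the whole tree of sub-nodes use the subclass, so the escaping in 'to_s' applies no matter how deep you navigate.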
As I said, this is very simplistic, and I am sure it does not serve every XML navigation need (XPath queries, for example). However, you can retrieve the REXML::Element through 'to_xml' whenever you need to break out of the object navigation approach.
The cool thing about Ruby and the current libraries is that I can do this with such simple code.
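As a small illustration of breaking out, the sketch below runs XPath queries on a plain REXML::Element, which is what the wrapper hands back. The sample XML and its tag names are made up, and the element is built directly so the snippet runs on its own.

```ruby
require 'rexml/document'

# Made-up forecast data, for illustration only.
xml = <<~XML
  <weather>
    <dayf>
      <day d="0" t="Monday"><hi>71</hi></day>
      <day d="1" t="Tuesday"><hi>68</hi></day>
    </dayf>
  </weather>
XML

# Stand-in for the REXML::Element returned by the wrapper.
element = REXML::Document.new(xml).root

# A predicate query that plain property navigation cannot express.
tuesday_high = REXML::XPath.first(element, "dayf/day[@t='Tuesday']/hi").text

# Collect an attribute from every matching node.
day_names = REXML::XPath.match(element, "dayf/day").map { |day| day.attributes['t'] }

puts tuesday_high
puts day_names.join(', ')
```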