Download 1.0.pre3: | prdownloads.sourceforge.net/clxmlserial/clxmlserial.1.0.pre3.zip |
REXML (>=1.2.5)*: | www.germane-software.com/~ser/Software/rexml |
Home Page: | clabs.org/clxmlserial.htm |
ViewCVS: | cvs.sourceforge.net/cgi-bin/viewcvs.cgi/clxmlserial/xmls/ |
Anon CVS: | sourceforge.net/cvs/?group_id=51071 |
* Not tested with any version > 1.2.7.
Please review the Security Issues section before using.
Xml Serialization allows classes to be marshalled to and from XML.
It consists of a module (XmlSerialization) and modified standard classes which add to_xml and from_xml methods. to_xml is an instance method which returns an XML element containing the data from each instance variable in the including class. from_xml is a singleton/class method which accepts an XML element and creates an instance of the class with the data in the element.
Currently, REXML is used for XML parsing. Later versions may allow other XML processors to be plugged in.
This project is still in a pre-release state, though functional. Feel free to give me feedback (code contributions are of course always welcome).
Copyright (c) 2002, Chris Morris (clxmlserial@clabs.org). BSD license.
% ruby install.rb
See the examples directory for a sample. Unit tests are also included in SITE/1.6/cl/xmlserial/xmlserialtest.rb. Here's a quick sample:
require 'cl/xmlserial'

class MyClass
  include XmlSerialization

  attr_accessor :attr

  def initialize
    # initialize the instance variable so its type is known when reading XML
    @attr = 0
  end
end

# read the object in, change it, and write it back out
doc = REXML::Document.new(File.open("class.xml"))
c = MyClass.from_xml(doc.root)
c.attr = 60
f = File.new("class.xml", File::CREAT|File::TRUNC|File::RDWR)
c.to_xml.write(f, -1)
f.close
yields either:
<MyClass>
  <attr>
    <Fixnum>60</Fixnum>
  </attr>
</MyClass>
or:
<MyClass>
  <attr>60</attr>
</MyClass>
The XmlSerialization module includes a singleton configuration class (XmlSerialConf.instance, aliased as XSConf) with an outputTypeElements setting. Setting this to false gives more concise XML (the latter example above). To ensure the data is read back in with the correct types, the instance variables should be initialized in the class's initialize method.
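As a rough illustration of how a singleton setting can switch between the two output forms, here is a minimal sketch. SketchConf, output_type_elements and attr_to_xml are hypothetical names made up for this example, not the library's API, and the default of true is an assumption.

```ruby
require 'singleton'

# Hypothetical stand-in for the configuration singleton; the real
# XmlSerialConf class may look quite different.
class SketchConf
  include Singleton

  attr_accessor :output_type_elements

  def initialize
    @output_type_elements = true # assumed default, not confirmed by the docs
  end
end

# Wrap a value in an element named after its attribute, optionally adding
# a nested element named after the value's class.
def attr_to_xml(name, value)
  inner =
    if SketchConf.instance.output_type_elements
      "<#{value.class}>#{value}</#{value.class}>"
    else
      value.to_s
    end
  "<#{name}>#{inner}</#{name}>"
end

puts attr_to_xml('attr', 60) # type element is Integer on Ruby 2.4+, Fixnum before that
SketchConf.instance.output_type_elements = false
puts attr_to_xml('attr', 60) # <attr>60</attr>
```

The sketch builds strings for brevity; the library itself returns REXML elements.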
For uninitialized instance vars, an attempt is made to correctly grok Strings and Numerics, so the latter example above will read in 60 as a Fixnum, even if @attr is not initialized. If the value is neither a valid Integer nor a valid Float, then it's read in as a String.
All forms of Ruby Numeric notation are supported as well. So this:
<Array>-5.4,5.a,4e5,0xaabb,123_456</Array>
is read in as:
[-5.4, "5.a", 400000.0, 43707, 123456]
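One way to get this behaviour is to lean on Ruby's own Kernel#Integer and Kernel#Float conversions, which accept the same notations as Ruby literals. The sketch below is not the library's code, and coerce_scalar is a hypothetical name; it only shows that the approach reproduces the result above.

```ruby
# Try Integer first, then Float, and fall back to the original String.
def coerce_scalar(text)
  Integer(text)
rescue ArgumentError
  begin
    Float(text)
  rescue ArgumentError
    text
  end
end

values = '-5.4,5.a,4e5,0xaabb,123_456'.split(',').map { |item| coerce_scalar(item) }
p values # [-5.4, "5.a", 400000.0, 43707, 123456]
```

Note that this approach also treats a leading zero as octal, so "010" would be read as 8.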
Arrays and Hashes also work with outputTypeElements set to false, assuming the items/keys/values are all of type String or Numeric. In that case, a CSV string is output. For example:
c = MyClass.new
c.attr = ['a', 5]
becomes
<MyClass>
  <attr>a,5</attr>
</MyClass>
and
c = MyClass.new
c.attr = { 'a' => 5, 'b' => 6 }
becomes
<MyClass>
  <attr>a=5,b=6</attr>
</MyClass>
If any of the items/keys/values is neither a String nor a Numeric, then type elements are automagically used:
c = MyClass.new
c.attr = ['a', ['b', 'c']]
becomes
<MyClass>
  <attr>
    <Array>
      <String>a</String>
      <Array>
        <String>b</String>
        <String>c</String>
      </Array>
    </Array>
  </attr>
</MyClass>
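The choice between CSV text and type elements can be pictured roughly as follows. This is a sketch only: simple? and to_csv_text are hypothetical helpers, not part of the library, and the sketch does no escaping of commas or equals signs inside values.

```ruby
# True when a value can be written inline as text.
def simple?(value)
  value.is_a?(String) || value.is_a?(Numeric)
end

# Returns the CSV form for flat Arrays and Hashes, or nil when type
# elements would be needed instead.
def to_csv_text(collection)
  case collection
  when Array
    collection.join(',') if collection.all? { |item| simple?(item) }
  when Hash
    if collection.all? { |key, value| simple?(key) && simple?(value) }
      collection.map { |key, value| "#{key}=#{value}" }.join(',')
    end
  end
end

p to_csv_text(['a', 5])               # "a,5"
p to_csv_text({ 'a' => 5, 'b' => 6 }) # "a=5,b=6"
p to_csv_text(['a', ['b', 'c']])      # nil, so type elements are required
```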
As of 1.0.pre3, Xml Serialization can be used with classes that do not have a default/parameterless constructor. Set the XSConf.bypassInitialize attribute to true to have from_xml ignore the initialize method of the class. False is the default setting.
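For readers wondering how a class can be instantiated without running initialize, current Ruby versions offer Class#allocate. Whether the library does it this way is not stated here; the sketch below only demonstrates the idea, and NeedsArgs is a made-up class.

```ruby
# A class with no parameterless constructor.
class NeedsArgs
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# allocate creates the object without calling initialize, so no
# constructor arguments are needed; state is then filled in directly.
obj = NeedsArgs.allocate
obj.instance_variable_set(:@name, 'from xml')
puts obj.name # from xml
```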
Also as of 1.0.pre3, attribute accessors are no longer required; instance_eval is used to set the instance variables directly.
Currently, the following standard classes are supported:
1.0.pre3 switched from requiring attribute accessors for deserialization to calling instance_eval. This is more convenient, but has a potential security hole.
If the $SAFE level is set to 1, all strings read in from a file are marked tainted, and cannot be passed to instance_eval. However, because REXML passes all strings through Array#pack and String#unpack calls to support various XML encodings, the string's taintedness is lost, and the instance_eval calls are allowed.
Beyond that, a $SAFE level of 3 or more will simply not allow calls to instance_eval, so the current release won't work under those conditions.
In 1.0.pre4, I plan to re-add the original code that uses send and requires writer accessor methods, in addition to the instance_eval code, and add a XSConf switch to control this. The default setting will be required accessor methods to play it safe with the potential security hole.
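The accessor-based approach can be sketched like this. It is an illustration of the idea, not the planned 1.0.pre4 code: Account and assign_via_writer are hypothetical, and public_send is used here in place of send so that private writers are never called.

```ruby
class Account
  attr_accessor :owner
  attr_reader :balance # deliberately has no writer

  def initialize
    @owner = ''
    @balance = 0
  end
end

# Assign only through a public writer method; refuse anything without one.
# No string from the XML is ever evaluated as code.
def assign_via_writer(obj, name, value)
  writer = "#{name}="
  return false unless obj.respond_to?(writer)

  obj.public_send(writer, value)
  true
end

account = Account.new
p assign_via_writer(account, 'owner', 'chris') # true
p assign_via_writer(account, 'balance', 1_000) # false, no writer defined
```

The trade-off is the one described above: the class author has to expose a writer for every attribute that should be deserialized.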
I've been discussing this issue with Sean Russell, author of REXML, and it's possible that REXML will be changed to retain the string's taintedness through the encoding process. In that case, the security hole should be closed, and the option to not use instance_eval will be necessary at any $SAFE level of 1 or higher.
The Object class has a few methods appended to it, the main one being to_xml. Its primary role is to set up the base XML element node, including a type element if required. Then it calls instance_data_to_xml, which must be overridden in descendant classes.
The supported standard classes all have instance_data_to_xml methods appended to them. For custom classes, the module XmlSerialization has an instance_data_to_xml method that loops through each instance variable in the including class, calling to_xml on each of them.
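The split between to_xml and instance_data_to_xml can be pictured with a small mixin. This is a simplified sketch under several assumptions: SketchSerialization and Point are made-up names, strings are built instead of REXML elements, values are written with to_s rather than a recursive to_xml call, and nothing is escaped.

```ruby
module SketchSerialization
  # Sets up the outer element, then delegates the contents.
  def to_xml
    "<#{self.class.name}>#{instance_data_to_xml}</#{self.class.name}>"
  end

  # Emits one child element per instance variable of the including class.
  def instance_data_to_xml
    instance_variables.map do |ivar|
      name = ivar.to_s.sub(/\A@/, '')
      value = instance_variable_get(ivar)
      "<#{name}>#{value}</#{name}>"
    end.join
  end
end

class Point
  include SketchSerialization

  def initialize(x, y)
    @x = x
    @y = y
  end
end

puts Point.new(3, 4).to_xml # <Point><x>3</x><y>4</y></Point>
```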
from_xml is a singleton method (class method) appended to each supported standard class as well as a singleton method in the XmlSerialization module. It creates a new instance of the class based on the XML element passed to it.
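A stripped-down class-level from_xml for the concise format might look like the following. It is a sketch, not the library's implementation: Pixel is a made-up class, only Integer and String values are handled, and the type of each value is taken from whatever initialize assigned. It uses REXML, which has to be installed separately on Ruby versions where it is no longer bundled.

```ruby
require 'rexml/document'

class Pixel
  attr_reader :x, :y

  def initialize
    @x = 0
    @y = 0
  end

  # Builds an instance from an element with one child per instance variable.
  def self.from_xml(element)
    obj = new
    element.elements.each do |child|
      ivar = "@#{child.name}"
      text = child.text.to_s
      # Use the initialized value's type to decide how to convert the text.
      current = obj.instance_variable_get(ivar)
      value = current.is_a?(Integer) ? Integer(text) : text
      obj.instance_variable_set(ivar, value)
    end
    obj
  end
end

doc = REXML::Document.new('<Pixel><x>3</x><y>4</y></Pixel>')
pixel = Pixel.from_xml(doc.root)
p [pixel.x, pixel.y] # [3, 4]
```

Setting the variables with instance_variable_set avoids evaluating any text from the XML as code.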
Hmmm ... this gets complicated fast. Basically, the xml will have to have an id system, so that a child instance can simply refer to an already serialized instance's id. Then, during deserialization this id system will tie back to Object#id.
The problem is that the XML then gets cluttered, and I want to keep an option for uncluttered XML, so how to handle this properly is still an open question.
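To make the id idea concrete, here is one possible shape for it. Nothing below exists in the library; RefRegistry, to_xml_with_refs, the id attribute and the ref element are all invented for illustration, and only Arrays and scalar values are handled.

```ruby
# Hands out a serial id the first time an object is seen.
class RefRegistry
  def initialize
    @ids = {}.compare_by_identity
  end

  # Returns [id, first_time_seen].
  def register(obj)
    return [@ids[obj], false] if @ids.key?(obj)

    @ids[obj] = @ids.size + 1
    [@ids[obj], true]
  end
end

# Writes an object once; later occurrences become <ref id="..."/>.
def to_xml_with_refs(obj, registry = RefRegistry.new)
  id, first = registry.register(obj)
  return %(<ref id="#{id}"/>) unless first

  body =
    if obj.is_a?(Array)
      obj.map { |item| to_xml_with_refs(item, registry) }.join
    else
      obj.to_s
    end
  %(<#{obj.class.name} id="#{id}">#{body}</#{obj.class.name}>)
end

shared = ['b']
puts to_xml_with_refs([shared, shared])
# <Array id="1"><Array id="2"><String id="3">b</String></Array><ref id="2"/></Array>
```

Because each object is registered before its contents are visited, a structure that contains itself also terminates. The id attributes are exactly the clutter mentioned above, which is why an option to switch them off would still be wanted.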