README.md
  1  node-xml2js
  2  ===========
  3  
  4  Ever had the urge to parse XML? And wanted to access the data in some sane,
  5  easy way? Don't want to compile a C parser, for whatever reason? Then xml2js is
  6  what you're looking for!
  7  
  8  Description
  9  ===========
 10  
 11  Simple XML to JavaScript object converter. It supports bi-directional conversion.
 12  Uses [sax-js](https://github.com/isaacs/sax-js/) and
 13  [xmlbuilder-js](https://github.com/oozcitak/xmlbuilder-js/).
 14  
 15  Note: If you're looking for a full DOM parser, you probably want
 16  [JSDom](https://github.com/tmpvar/jsdom).
 17  
 18  Installation
 19  ============
 20  
The simplest way to install `xml2js` is with [npm](http://npmjs.org): run
`npm install xml2js`, which downloads xml2js and all its dependencies.

xml2js is also available via [Bower](http://bower.io/): run
`bower install xml2js`, which downloads xml2js and all its dependencies.
 26  
 27  Usage
 28  =====
 29  
No extensive tutorials required because you are a smart developer! The task of
parsing XML should be an easy one, so let's make it so! Here are some examples.
 32  
 33  Shoot-and-forget usage
 34  ----------------------
 35  
 36  You want to parse XML as simple and easy as possible? It's dangerous to go
 37  alone, take this:
 38  
```javascript
var parseString = require('xml2js').parseString;
var xml = "<root>Hello xml2js!</root>";
parseString(xml, function (err, result) {
    // `err` is set if the XML could not be parsed
    console.dir(result);
});
```
 46  
 47  Can't get easier than this, right? This works starting with `xml2js` 0.2.3.
 48  With CoffeeScript it looks like this:
 49  
 50  ```coffeescript
 51  {parseString} = require 'xml2js'
 52  xml = "<root>Hello xml2js!</root>"
 53  parseString xml, (err, result) ->
 54      console.dir result
 55  ```
 56  
If you need some special options, fear not: `xml2js` supports a number of
options (see below), which you can specify as the second argument:
 59  
 60  ```javascript
 61  parseString(xml, {trim: true}, function (err, result) {
 62  });
 63  ```
 64  
 65  Simple as pie usage
 66  -------------------
 67  
 68  That's right, if you have been using xml-simple or a home-grown
 69  wrapper, this was added in 0.1.11 just for you:
 70  
```javascript
var fs = require('fs'),
    xml2js = require('xml2js');

var parser = new xml2js.Parser();
fs.readFile(__dirname + '/foo.xml', function(err, data) {
    if (err) throw err; // reading the file failed
    parser.parseString(data, function (err, result) {
        if (err) throw err; // the XML could not be parsed
        console.dir(result);
        console.log('Done');
    });
});
```
 83  
 84  Look ma, no event listeners!
 85  
 86  You can also use `xml2js` from
 87  [CoffeeScript](https://github.com/jashkenas/coffeescript), further reducing
 88  the clutter:
 89  
```coffeescript
fs = require 'fs'
xml2js = require 'xml2js'

parser = new xml2js.Parser()
fs.readFile __dirname + '/foo.xml', (err, data) ->
  parser.parseString data, (err, result) ->
    console.dir result
    console.log 'Done.'
```
100  
But what happens if you forget the `new` keyword to create a new `Parser`? In
the middle of a nightly coding session, it might get lost, after all. Worry
not, we've got you covered! Starting with 0.2.8 you can also leave it out, in
which case `xml2js` will helpfully add it for you: no bad surprises and no
inexplicable bugs!
106  
107  Parsing multiple files
108  ----------------------
109  
If you want to parse multiple files, you have several options:

  * You can create one `xml2js.Parser` per file. That's the recommended one
    and is promised to always *just work*.
  * You can call `reset()` on your parser object.
  * You can hope everything goes well anyway. This behaviour is not
    guaranteed to work always, if ever. Use option #1 if possible. Thanks!
117  
So you want some JSON?
----------------------
120  
Just wrap the `result` object in a call to `JSON.stringify`, like this:
`JSON.stringify(result)`. You get a string containing the JSON representation
of the parsed object that you can feed to JSON-hungry consumers.
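
As a minimal sketch, here is what that looks like in practice. The `result`
object below is a hand-written stand-in for a parsed document (an assumption
made so the snippet runs without the parser); the variable names are ours.

```javascript
// Hand-written stand-in for a parsed result, so no parser is needed here.
var result = { root: { item: ['first', 'second'] } };

// Compact form, suitable for sending over the wire.
var json = JSON.stringify(result);

// Pretty-printed form with two-space indentation, easier to read in logs.
var prettyJson = JSON.stringify(result, null, 2);

console.log(json);
console.log(prettyJson);
```

Both strings parse back to the same structure with `JSON.parse`.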
124  
125  Displaying results
126  ------------------
127  
You might wonder why, when using `console.dir` or `console.log`, the output at
some level is only `[Object]`. Don't worry, this is not because `xml2js` got
lazy. Node uses `util.inspect` to convert the object into a string, and that
function stops after `depth=2`, which is a bit low for most XML.

To display the whole result, you can use
`console.log(util.inspect(result, false, null))`.
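
The difference is easy to see on a small nested object. This is only a sketch:
`result` is a hand-written stand-in for a parsed document, not real parser
output.

```javascript
var util = require('util');

// Hand-written stand-in for a deeply nested parse result.
var result = {
    catalog: {
        book: [
            { title: ['XML Basics'], author: [ { name: ['Ann'] } ] }
        ]
    }
};

// Default depth: the objects inside the `book` array collapse to [Object].
var shallow = util.inspect(result);

// depth: null removes the limit; same effect as util.inspect(result, false, null).
var full = util.inspect(result, { depth: null });

console.log(shallow);
console.log(full);
```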
135  
So much for that, but what if you use
[eyes](https://github.com/cloudhead/eyes.js) for nice colored output and it
truncates the output with `…`? Don't fear, there's also a solution for that:
increase the `maxLength` limit by creating a custom inspector with
`var inspect = require('eyes').inspector({maxLength: false})`, and then you can
easily `inspect(result)`.
142  
143  XML builder usage
144  -----------------
145  
Since 0.4.0, objects can also be used to build XML:
147  
```javascript
var xml2js = require('xml2js');

var obj = {name: "Super", Surname: "Man", age: 23};

// build an XML string from the plain object
var builder = new xml2js.Builder();
var xml = builder.buildObject(obj);
```
157  
At the moment, a one-to-one bi-directional conversion is guaranteed only for
the default configuration, with the exception of the `attrkey`, `charkey` and
`explicitArray` options, which you can redefine to your taste. Writing CDATA is
supported by setting the `cdata` option to `true`.
162  
163  Processing attribute, tag names and values
164  ------------------------------------------
165  
166  Since 0.4.1 you can optionally provide the parser with attribute name and tag name processors as well as element value processors (Since 0.4.14, you can also optionally provide the parser with attribute value processors):
167  
168  ```javascript
169  
170  function nameToUpperCase(name){
171      return name.toUpperCase();
172  }
173  
174  //transform all attribute and tag names and values to uppercase
175  parseString(xml, {
176    tagNameProcessors: [nameToUpperCase],
177    attrNameProcessors: [nameToUpperCase],
178    valueProcessors: [nameToUpperCase],
179    attrValueProcessors: [nameToUpperCase]},
180    function (err, result) {
181      // processed data
182  });
183  ```
184  
185  The `tagNameProcessors` and `attrNameProcessors` options
186  accept an `Array` of functions with the following signature:
187  
188  ```javascript
189  function (name){
190    //do something with `name`
191    return name
192  }
193  ```
194  
195  The `attrValueProcessors` and `valueProcessors` options
196  accept an `Array` of functions with the following signature:
197  
198  ```javascript
199  function (value, name) {
200    //`name` will be the node name or attribute name
201    //do something with `value`, (optionally) dependent on the node/attr name
202    return value
203  }
204  ```
205  
206  Some processors are provided out-of-the-box and can be found in `lib/processors.js`:
207  
- `normalize`: transforms the name to lowercase.
(Automatically used when `options.normalizeTags` is set to `true`)
210  
211  - `firstCharLowerCase`: transforms the first character to lower case.
212  E.g. 'MyTagName' becomes 'myTagName'
213  
- `stripPrefix`: strips the XML namespace prefix. E.g. `<foo:Bar/>` will become 'Bar'.
(N.B.: the `xmlns` prefix is NOT stripped.)

- `parseNumbers`: parses integer-like strings as integers and float-like strings as floats.
E.g. "0" becomes 0 and "15.56" becomes 15.56
219  
220  - `parseBooleans`: parses boolean-like strings to booleans
221  E.g. "true" becomes true and "False" becomes false
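
If none of the bundled processors fit, writing your own is straightforward.
The sketch below follows the two signatures described above. Note that
`stripDashes`, `trimValue`, `priceToNumber` and the `applyProcessors` helper
are hypothetical names invented for this example; the helper only imitates
running an array of processors in order (an assumption about chaining) so that
the snippet runs without the parser.

```javascript
// Name processor: receives only the tag or attribute name.
function stripDashes(name) {
    return name.replace(/-/g, '');
}

// Value processors: receive the value and the node/attribute name,
// so they can convert only selected fields.
function trimValue(value) {
    return value.trim();
}

function priceToNumber(value, name) {
    return name === 'price' ? parseFloat(value) : value;
}

// Hypothetical helper (not part of xml2js): feeds a value through each
// processor in array order, passing the previous result along.
function applyProcessors(processors, value, name) {
    return processors.reduce(function (current, processor) {
        return processor(current, name);
    }, value);
}

var price = applyProcessors([trimValue, priceToNumber], ' 15.56 ', 'price');
var title = applyProcessors([trimValue, priceToNumber], ' 15.56 ', 'title');
var tag = applyProcessors([stripDashes], 'first-name');

console.log(typeof price, price); // a number for the `price` node
console.log(typeof title, title); // still a string for any other node
console.log(tag);

// With xml2js, the same arrays would be passed as options, e.g.
// parseString(xml, {tagNameProcessors: [stripDashes],
//                   valueProcessors: [trimValue, priceToNumber]}, callback);
```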
222  
223  Options
224  =======
225  
226  Apart from the default settings, there are a number of options that can be
227  specified for the parser. Options are specified by ``new Parser({optionName:
228  value})``. Possible options are:
229  
230    * `attrkey` (default: `$`): Prefix that is used to access the attributes.
231      Version 0.1 default was `@`.
232    * `charkey` (default: `_`): Prefix that is used to access the character
233      content. Version 0.1 default was `#`.
  * `explicitCharkey` (default: `false`): Always store character content under
    `charkey`, even for elements that have no attributes.
235    * `trim` (default: `false`): Trim the whitespace at the beginning and end of
236      text nodes.
237    * `normalizeTags` (default: `false`): Normalize all tag names to lowercase.
  * `normalize` (default: `false`): Trim whitespace inside text nodes.
239    * `explicitRoot` (default: `true`): Set this if you want to get the root
240      node in the resulting object.
  * `emptyTag` (default: `''`): The value to use for empty nodes.
242    * `explicitArray` (default: `true`): Always put child nodes in an array if
243      true; otherwise an array is created only if there is more than one.
244    * `ignoreAttrs` (default: `false`): Ignore all XML attributes and only create
245      text nodes.
  * `mergeAttrs` (default: `false`): Merge attributes and child elements as
    properties of the parent, instead of keying attributes off a child
    attribute object. This option is ignored if `ignoreAttrs` is `true`.
249    * `validator` (default `null`): You can specify a callable that validates
250      the resulting structure somehow, however you want. See unit tests
251      for an example.
252    * `xmlns` (default `false`): Give each element a field usually called '$ns'
253      (the first character is the same as attrkey) that contains its local name
254      and namespace URI.
  * `explicitChildren` (default `false`): Put child elements into a separate
    property. Doesn't work with `mergeAttrs = true`. If an element has no
    children then "children" won't be created. Added in 0.2.5.
258    * `childkey` (default `$$`): Prefix that is used to access child elements if
259      `explicitChildren` is set to `true`. Added in 0.2.5.
260    * `preserveChildrenOrder` (default `false`): Modifies the behavior of
261      `explicitChildren` so that the value of the "children" property becomes an
262      ordered array. When this is `true`, every node will also get a `#name` field
263      whose value will correspond to the XML nodeName, so that you may iterate
264      the "children" array and still be able to determine node names. The named
265      (and potentially unordered) properties are also retained in this
266      configuration at the same level as the ordered "children" array. Added in
267      0.4.9.
268    * `charsAsChildren` (default `false`): Determines whether chars should be
269      considered children if `explicitChildren` is on. Added in 0.2.5.
  * `includeWhiteChars` (default `false`): Determines whether whitespace-only
    text nodes should be included. Added in 0.4.17.
272    * `async` (default `false`): Should the callbacks be async? This *might* be
273      an incompatible change if your code depends on sync execution of callbacks.
274      Future versions of `xml2js` might change this default, so the recommendation
275      is to not depend on sync execution anyway. Added in 0.2.6.
276    * `strict` (default `true`): Set sax-js to strict or non-strict parsing mode.
277      Defaults to `true` which is *highly* recommended, since parsing HTML which
278      is not well-formed XML might yield just about anything. Added in 0.2.7.
279    * `attrNameProcessors` (default: `null`): Allows the addition of attribute
280      name processing functions. Accepts an `Array` of functions with following
281      signature:
282      ```javascript
283      function (name){
284          //do something with `name`
285          return name
286      }
287      ```
    Added in 0.4.1
289    * `attrValueProcessors` (default: `null`): Allows the addition of attribute
290      value processing functions. Accepts an `Array` of functions with following
291      signature:
292      ```javascript
    function (value, name){
      //do something with `value`, (optionally) dependent on the attr name
      return value
296      }
297      ```
    Added in 0.4.14
299    * `tagNameProcessors` (default: `null`): Allows the addition of tag name
300      processing functions. Accepts an `Array` of functions with following
301      signature:
302      ```javascript
303      function (name){
304        //do something with `name`
305        return name
306      }
307      ```
308      Added in 0.4.1
309    * `valueProcessors` (default: `null`): Allows the addition of element value
310      processing functions. Accepts an `Array` of functions with following
311      signature:
312      ```javascript
    function (value, name){
      //do something with `value`, (optionally) dependent on the node name
      return value
316      }
317      ```
318      Added in 0.4.6
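
To make `attrkey` and `charkey` a bit more concrete, here is a small sketch of
reading attributes and text from a result. The `result` object is written by
hand to match the default layout we would expect for the XML in the comment
(an assumption, so the snippet runs without the parser), and `readText` and
`readAttr` are hypothetical helpers, not part of `xml2js`.

```javascript
var ATTRKEY = '$'; // default `attrkey`
var CHARKEY = '_'; // default `charkey`

// Assumed shape for:
// <catalog><book id="42">XML Basics</book><book>Plain</book></catalog>
var result = {
    catalog: {
        book: [
            { _: 'XML Basics', $: { id: '42' } }, // text plus attributes
            'Plain'                               // text only
        ]
    }
};

// Hypothetical helper: text content, whether or not the node has attributes.
function readText(node) {
    return typeof node === 'string' ? node : node[CHARKEY];
}

// Hypothetical helper: attribute value, or undefined if absent.
function readAttr(node, name) {
    var attrs = typeof node === 'object' && node !== null ? node[ATTRKEY] : undefined;
    return attrs ? attrs[name] : undefined;
}

result.catalog.book.forEach(function (book) {
    console.log(readText(book), readAttr(book, 'id'));
});
```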
319  
320  Options for the `Builder` class
321  -------------------------------
322  These options are specified by ``new Builder({optionName: value})``.
323  Possible options are:
324  
325    * `rootName` (default `root` or the root key name): root element name to be used in case
326       `explicitRoot` is `false` or to override the root element name.
327    * `renderOpts` (default `{ 'pretty': true, 'indent': '  ', 'newline': '\n' }`):
328      Rendering options for xmlbuilder-js.
329      * pretty: prettify generated XML
330      * indent: whitespace for indentation (only when pretty)
331      * newline: newline char (only when pretty)
  * `xmldec` (default `{ 'version': '1.0', 'encoding': 'UTF-8', 'standalone': true }`):
333      XML declaration attributes.
334      * `xmldec.version` A version number string, e.g. 1.0
335      * `xmldec.encoding` Encoding declaration, e.g. UTF-8
336      * `xmldec.standalone` standalone document declaration: true or false
  * `doctype` (default `null`): optional DTD. E.g. `{'ext': 'hello.dtd'}`
338    * `headless` (default: `false`): omit the XML header. Added in 0.4.3.
339    * `allowSurrogateChars` (default: `false`): allows using characters from the Unicode
340      surrogate blocks.
341    * `cdata` (default: `false`): wrap text nodes in `<![CDATA[ ... ]]>` instead of
342      escaping when necessary. Does not add `<![CDATA[ ... ]]>` if it is not required.
343      Added in 0.4.5.
344  
`renderOpts`, `xmldec`, `doctype` and `headless` are passed through to
[xmlbuilder-js](https://github.com/oozcitak/xmlbuilder-js).
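
For illustration, an options object combining a few of these might look like
the sketch below. Only the option names come from the list above; the values
(including the `catalog` root name) are made up for the example.

```javascript
// Illustrative values; only the option names come from the list above.
var builderOptions = {
    rootName: 'catalog', // name of the root element
    headless: true,      // omit the XML header
    cdata: true,         // use CDATA where escaping would otherwise be needed
    renderOpts: { pretty: true, indent: '    ', newline: '\n' }
};

// Would be used as: var builder = new xml2js.Builder(builderOptions);
console.log(builderOptions);
```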
347  
348  Updating to new version
349  =======================
350  
351  Version 0.2 changed the default parsing settings, but version 0.1.14 introduced
352  the default settings for version 0.2, so these settings can be tried before the
353  migration.
354  
355  ```javascript
356  var xml2js = require('xml2js');
357  var parser = new xml2js.Parser(xml2js.defaults["0.2"]);
358  ```
359  
360  To get the 0.1 defaults in version 0.2 you can just use
361  `xml2js.defaults["0.1"]` in the same place. This provides you with enough time
362  to migrate to the saner way of parsing in `xml2js` 0.2. We try to make the
363  migration as simple and gentle as possible, but some breakage cannot be
364  avoided.
365  
So, what exactly did change and why? In 0.2 we changed some defaults to parse
the XML in a more universal and sane way. We disabled `normalize` and `trim`
so that `xml2js` does not cut out any text content. You can re-enable them at
will, of course. A more important change is that we return the root tag in the
resulting JavaScript structure via the `explicitRoot` setting, so you need to
access the first element. This is useful for anybody who wants to know what the
root node is, and it preserves more information. The last major change was to
enable `explicitArray`, so every time it is possible that one might embed more
than one sub-tag into a tag, xml2js >= 0.2 returns an array even if the array
just includes one element. This is useful when dealing with APIs that return
variable amounts of subtags.
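
If your code has to cope with results produced under either `explicitArray`
setting during a migration, a tiny normalizing helper can smooth things over.
This is just a sketch: `asArray` is a hypothetical helper, and the two result
objects are hand-written to show the shapes we would expect for
`<list><entry>a</entry></list>` under each setting.

```javascript
// Hypothetical helper: always returns an array, whatever the setting was.
function asArray(value) {
    if (value === undefined || value === null) {
        return [];
    }
    return Array.isArray(value) ? value : [value];
}

// Assumed shapes for <list><entry>a</entry></list> under both settings.
var withExplicitArray = { list: { entry: ['a'] } };
var withoutExplicitArray = { list: { entry: 'a' } };

// The same loop now works for both shapes.
[withExplicitArray, withoutExplicitArray].forEach(function (result) {
    asArray(result.list.entry).forEach(function (entry) {
        console.log(entry);
    });
});
```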
377  
378  Running tests, development
379  ==========================
380  
381  [![Build Status](https://travis-ci.org/Leonidas-from-XIV/node-xml2js.svg?branch=master)](https://travis-ci.org/Leonidas-from-XIV/node-xml2js)
382  [![Coverage Status](https://coveralls.io/repos/Leonidas-from-XIV/node-xml2js/badge.svg?branch=)](https://coveralls.io/r/Leonidas-from-XIV/node-xml2js?branch=master)
383  [![Dependency Status](https://david-dm.org/Leonidas-from-XIV/node-xml2js.svg)](https://david-dm.org/Leonidas-from-XIV/node-xml2js)
384  
The development requirements are handled by npm; you just need to install them.
We also have a number of unit tests, which can be run using `npm test` directly
from the project root. This runs zap to discover all the tests and execute
them.
389  
If you'd like to contribute, keep in mind that `xml2js` is written in
CoffeeScript, so don't develop on the JavaScript files that are checked into
the repository for convenience reasons. Also, please write some unit tests to
check your behaviour, and if it is some user-facing thing, add some
documentation to this README, so people will know it exists. Thanks in advance!
395  
396  Getting support
397  ===============
398  
Please, if you have a problem with the library, first make sure you read this
README. If you read this far, thanks, you're good. Then, please make sure your
problem really is with `xml2js`. It is? Okay, then I'll look at it. Send me a
mail and we can talk. Please don't open issues, as I don't think that is the
proper forum for support problems. Some problems might well turn out to be bugs
in `xml2js`; if so, I'll let you know to open an issue instead :)
405  
406  But if you know you really found a bug, feel free to open an issue instead.