node-xml2js
===========

Ever had the urge to parse XML? And wanted to access the data in some sane,
easy way? Don't want to compile a C parser, for whatever reason? Then xml2js is
what you're looking for!

Description
===========

Simple XML to JavaScript object converter. It supports bi-directional conversion.
Uses [sax-js](https://github.com/isaacs/sax-js/) and
[xmlbuilder-js](https://github.com/oozcitak/xmlbuilder-js/).

Note: If you're looking for a full DOM parser, you probably want
[JSDom](https://github.com/tmpvar/jsdom).

Installation
============

The simplest way to install `xml2js` is to use [npm](http://npmjs.org): just run
`npm install xml2js`, which will download xml2js and all dependencies.

xml2js is also available via [Bower](http://bower.io/): just run `bower install
xml2js`, which will download xml2js and all dependencies.

Usage
=====

No extensive tutorials required because you are a smart developer! The task of
parsing XML should be an easy one, so let's make it so! Here are some examples.

Shoot-and-forget usage
----------------------

You want to parse XML as simply and easily as possible? It's dangerous to go
alone, take this:

```javascript
var parseString = require('xml2js').parseString;
var xml = "<root>Hello xml2js!</root>";
parseString(xml, function (err, result) {
    console.dir(result);
});
```

Can't get easier than this, right? This works starting with `xml2js` 0.2.3.
With CoffeeScript it looks like this:

```coffeescript
{parseString} = require 'xml2js'
xml = "<root>Hello xml2js!</root>"
parseString xml, (err, result) ->
    console.dir result
```

If you need some special options, fear not, `xml2js` supports a number of
options (see below), which you can specify as the second argument:

```javascript
parseString(xml, {trim: true}, function (err, result) {
    // use `result` here
});
```

Simple as pie usage
-------------------

That's right, if you have been using xml-simple or a home-grown
wrapper, this was added in 0.1.11 just for you:

```javascript
var fs = require('fs'),
    xml2js = require('xml2js');

var parser = new xml2js.Parser();
fs.readFile(__dirname + '/foo.xml', function(err, data) {
    parser.parseString(data, function (err, result) {
        console.dir(result);
        console.log('Done');
    });
});
```

Look ma, no event listeners!

You can also use `xml2js` from
[CoffeeScript](https://github.com/jashkenas/coffeescript), further reducing
the clutter:

```coffeescript
fs = require 'fs'
xml2js = require 'xml2js'

parser = new xml2js.Parser()
fs.readFile __dirname + '/foo.xml', (err, data) ->
  parser.parseString data, (err, result) ->
    console.dir result
    console.log 'Done.'
```

But what happens if you forget the `new` keyword to create a new `Parser`? In
the middle of a nightly coding session, it might get lost, after all. Worry
not, we've got you covered! Starting with 0.2.8 you can also leave it out, in
which case `xml2js` will helpfully add it for you: no bad surprises and no
inexplicable bugs!

Parsing multiple files
----------------------

If you want to parse multiple files, you have multiple possibilities:

  * You can create one `xml2js.Parser` per file. That's the recommended one
    and is promised to always *just work*.
  * You can call `reset()` on your parser object.
  * You can hope everything goes well anyway. This behaviour is not
    guaranteed to work always, if ever. Use option #1 if possible. Thanks!

So you want some JSON?
----------------------

Just wrap the `result` object in a call to `JSON.stringify`, like this:
`JSON.stringify(result)`. You get a string containing the JSON representation
of the parsed object that you can feed to JSON-hungry consumers.

Displaying results
------------------

You might wonder why, using `console.dir` or `console.log`, the output at some
level is only `[Object]`. Don't worry, this is not because `xml2js` got lazy.
That's because Node uses `util.inspect` to convert the object into strings and
that function stops after `depth=2`, which is a bit low for most XML.

To display the whole deal, you can use `console.log(util.inspect(result, false,
null))`, which displays the whole result.

So much for that, but what if you use
[eyes](https://github.com/cloudhead/eyes.js) for nice colored output and it
truncates the output with `…`? Don't fear, there's also a solution for that:
you just need to increase the `maxLength` limit by creating a custom inspector
`var inspect = require('eyes').inspector({maxLength: false})` and then you can
easily `inspect(result)`.

XML builder usage
-----------------

Since 0.4.0, objects can also be used to build XML:

```javascript
var xml2js = require('xml2js');

var obj = {name: "Super", Surname: "Man", age: 23};

var builder = new xml2js.Builder();
var xml = builder.buildObject(obj);
```

At the moment, a one-to-one bi-directional conversion is guaranteed only for the
default configuration, except for the `attrkey`, `charkey` and `explicitArray`
options, which you can redefine to your taste. Writing CDATA is supported via
setting the `cdata` option to `true`.

Processing attribute, tag names and values
------------------------------------------

Since 0.4.1 you can optionally provide the parser with attribute name and tag
name processors as well as element value processors (since 0.4.14, you can also
optionally provide the parser with attribute value processors):

```javascript
function nameToUpperCase(name){
    return name.toUpperCase();
}

//transform all attribute and tag names and values to uppercase
parseString(xml, {
  tagNameProcessors: [nameToUpperCase],
  attrNameProcessors: [nameToUpperCase],
  valueProcessors: [nameToUpperCase],
  attrValueProcessors: [nameToUpperCase]},
  function (err, result) {
    // processed data
});
```

The `tagNameProcessors` and `attrNameProcessors` options
accept an `Array` of functions with the following signature:

```javascript
function (name){
  //do something with `name`
  return name
}
```

The `attrValueProcessors` and `valueProcessors` options
accept an `Array` of functions with the following signature:

```javascript
function (value, name) {
  //`name` will be the node name or attribute name
  //do something with `value`, (optionally) dependent on the node/attr name
  return value
}
```

Some processors are provided out-of-the-box and can be found in `lib/processors.js`:

- `normalize`: transforms the name to lowercase.
  (Automatically used when `options.normalize` is set to `true`)

- `firstCharLowerCase`: transforms the first character to lower case.
  E.g. 'MyTagName' becomes 'myTagName'

- `stripPrefix`: strips the XML namespace prefix. E.g. `<foo:Bar/>` will become 'Bar'.
  (N.B.: the `xmlns` prefix is NOT stripped.)

- `parseNumbers`: parses integer-like strings as integers and float-like strings as floats.
  E.g. "0" becomes 0 and "15.56" becomes 15.56

- `parseBooleans`: parses boolean-like strings to booleans.
  E.g. "true" becomes true and "False" becomes false

Options
=======

Apart from the default settings, there are a number of options that can be
specified for the parser. Options are specified by ``new Parser({optionName:
value})``. Possible options are:

  * `attrkey` (default: `$`): Prefix that is used to access the attributes.
    Version 0.1 default was `@`.
  * `charkey` (default: `_`): Prefix that is used to access the character
    content. Version 0.1 default was `#`.
  * `explicitCharkey` (default: `false`)
  * `trim` (default: `false`): Trim the whitespace at the beginning and end of
    text nodes.
  * `normalizeTags` (default: `false`): Normalize all tag names to lowercase.
  * `normalize` (default: `false`): Trim whitespace inside text nodes.
  * `explicitRoot` (default: `true`): Set this if you want to get the root
    node in the resulting object.
  * `emptyTag` (default: `''`): The value that empty nodes will have.
  * `explicitArray` (default: `true`): Always put child nodes in an array if
    true; otherwise an array is created only if there is more than one.
  * `ignoreAttrs` (default: `false`): Ignore all XML attributes and only create
    text nodes.
  * `mergeAttrs` (default: `false`): Merge attributes and child elements as
    properties of the parent, instead of keying attributes off a child
    attribute object. This option is ignored if `ignoreAttrs` is `true`.
  * `validator` (default `null`): You can specify a callable that validates
    the resulting structure somehow, however you want. See unit tests
    for an example.
  * `xmlns` (default `false`): Give each element a field usually called '$ns'
    (the first character is the same as attrkey) that contains its local name
    and namespace URI.
  * `explicitChildren` (default `false`): Put child elements into a separate
    property. Doesn't work with `mergeAttrs = true`. If an element has no
    children, then "children" won't be created. Added in 0.2.5.
  * `childkey` (default `$$`): Prefix that is used to access child elements if
    `explicitChildren` is set to `true`. Added in 0.2.5.
  * `preserveChildrenOrder` (default `false`): Modifies the behavior of
    `explicitChildren` so that the value of the "children" property becomes an
    ordered array. When this is `true`, every node will also get a `#name` field
    whose value will correspond to the XML nodeName, so that you may iterate
    the "children" array and still be able to determine node names. The named
    (and potentially unordered) properties are also retained in this
    configuration at the same level as the ordered "children" array. Added in
    0.4.9.
  * `charsAsChildren` (default `false`): Determines whether chars should be
    considered children if `explicitChildren` is on. Added in 0.2.5.
  * `includeWhiteChars` (default `false`): Determines whether whitespace-only
    text nodes should be included. Added in 0.4.17.
  * `async` (default `false`): Should the callbacks be async? This *might* be
    an incompatible change if your code depends on sync execution of callbacks.
    Future versions of `xml2js` might change this default, so the recommendation
    is to not depend on sync execution anyway. Added in 0.2.6.
  * `strict` (default `true`): Set sax-js to strict or non-strict parsing mode.
    Defaults to `true`, which is *highly* recommended, since parsing HTML which
    is not well-formed XML might yield just about anything. Added in 0.2.7.
  * `attrNameProcessors` (default: `null`): Allows the addition of attribute
    name processing functions.
    Accepts an `Array` of functions with the following signature:
    ```javascript
    function (name){
      //do something with `name`
      return name
    }
    ```
    Added in 0.4.1
  * `attrValueProcessors` (default: `null`): Allows the addition of attribute
    value processing functions. Accepts an `Array` of functions with the
    following signature:
    ```javascript
    function (value, name){
      //do something with `value`, (optionally) dependent on the attribute name
      return value
    }
    ```
    Added in 0.4.14
  * `tagNameProcessors` (default: `null`): Allows the addition of tag name
    processing functions. Accepts an `Array` of functions with the following
    signature:
    ```javascript
    function (name){
      //do something with `name`
      return name
    }
    ```
    Added in 0.4.1
  * `valueProcessors` (default: `null`): Allows the addition of element value
    processing functions. Accepts an `Array` of functions with the following
    signature:
    ```javascript
    function (value, name){
      //do something with `value`, (optionally) dependent on the node name
      return value
    }
    ```
    Added in 0.4.6

Options for the `Builder` class
-------------------------------
These options are specified by ``new Builder({optionName: value})``.
Possible options are:

  * `rootName` (default `root` or the root key name): root element name to be
    used in case `explicitRoot` is `false` or to override the root element name.
  * `renderOpts` (default `{ 'pretty': true, 'indent': '  ', 'newline': '\n' }`):
    Rendering options for xmlbuilder-js.
    * pretty: prettify generated XML
    * indent: whitespace for indentation (only when pretty)
    * newline: newline char (only when pretty)
  * `xmldec` (default `{ 'version': '1.0', 'encoding': 'UTF-8', 'standalone': true }`):
    XML declaration attributes.
    * `xmldec.version` A version number string, e.g. 1.0
    * `xmldec.encoding` Encoding declaration, e.g.
      UTF-8
    * `xmldec.standalone` standalone document declaration: true or false
  * `doctype` (default `null`): optional DTD. E.g. `{'ext': 'hello.dtd'}`
  * `headless` (default: `false`): omit the XML header. Added in 0.4.3.
  * `allowSurrogateChars` (default: `false`): allows using characters from the Unicode
    surrogate blocks.
  * `cdata` (default: `false`): wrap text nodes in `<![CDATA[ ... ]]>` instead of
    escaping when necessary. Does not add `<![CDATA[ ... ]]>` if it is not required.
    Added in 0.4.5.

`renderOpts`, `xmldec`, `doctype` and `headless` pass through to
[xmlbuilder-js](https://github.com/oozcitak/xmlbuilder-js).

Updating to new version
=======================

Version 0.2 changed the default parsing settings, but version 0.1.14 introduced
the default settings for version 0.2, so these settings can be tried before the
migration.

```javascript
var xml2js = require('xml2js');
var parser = new xml2js.Parser(xml2js.defaults["0.2"]);
```

To get the 0.1 defaults in version 0.2 you can just use
`xml2js.defaults["0.1"]` in the same place. This provides you with enough time
to migrate to the saner way of parsing in `xml2js` 0.2. We try to make the
migration as simple and gentle as possible, but some breakage cannot be
avoided.

So, what exactly did change and why? In 0.2 we changed some defaults to parse
the XML in a more universal and sane way. We disabled `normalize` and `trim`
so that `xml2js` does not cut out any text content. You can re-enable this at
will, of course. A more important change is that we return the root tag in the
resulting JavaScript structure via the `explicitRoot` setting, so you need to
access the first element. This is useful for anybody who wants to know what the
root node is, and it preserves more information.
The last major change was to enable `explicitArray`, so every time it is
possible that one might embed more than one sub-tag into a tag, xml2js >= 0.2
returns an array even if the array just includes one element. This is useful
when dealing with APIs that return variable amounts of subtags.

Running tests, development
==========================

[Build status](https://travis-ci.org/Leonidas-from-XIV/node-xml2js)
[Coverage status](https://coveralls.io/r/Leonidas-from-XIV/node-xml2js?branch=master)
[Dependency status](https://david-dm.org/Leonidas-from-XIV/node-xml2js)

The development requirements are handled by npm, you just need to install them.
We also have a number of unit tests, which can be run using `npm test` directly
from the project root. This runs zap to discover all the tests and execute
them.

If you'd like to contribute, keep in mind that `xml2js` is written in
CoffeeScript, so don't develop on the JavaScript files that are checked into
the repository for convenience reasons. Also, please write some unit tests to
check your new behaviour, and if it is some user-facing thing, add some
documentation to this README, so people will know it exists. Thanks in advance!

Getting support
===============

Please, if you have a problem with the library, first make sure you read this
README. If you read this far, thanks, you're good. Then, please make sure your
problem really is with `xml2js`. It is? Okay, then I'll look at it. Send me a
mail and we can talk. Please don't open issues, as I don't think that is the
proper forum for support problems. Some problems might well turn out to be bugs
in `xml2js`; if so, I'll let you know to open an issue instead :)

But if you know you really found a bug, feel free to open an issue instead.