======================
Serialization Overview
======================

.. importdoc::
  api/yaml/loading.nim, api/yaml/dumping.nim, api/yaml/native.nim,
  api/yaml/annotations.nim, api/yaml/taglib.nim, api/yaml/style.nim,
  api/yaml/dom.nim, api/yaml/tojson.nim,
  api/yaml/parser.nim, api/yaml/presenter.nim, api/yaml/data.nim,
  api/yaml/stream.nim

Introduction
============

NimYAML tries hard to make transforming YAML character streams to native Nim
types and vice versa as easy as possible. In simple scenarios, you might not
need anything other than the two procs `dump`_ and `load`_. On the other hand,
the process is designed to be as customizable as possible, so that you can
tightly control how the generated YAML character stream looks and how a YAML
character stream is interpreted.

An important thing to remember in NimYAML is that, unlike in interpreted
languages like Ruby, Nim cannot load a YAML character stream without knowing the
resulting type beforehand. For example, if you want to load this piece of YAML:

.. code-block:: yaml

    %YAML 1.2
    --- !nim:system:seq(nim:system:int8)
    - 1
    - 2
    - 3

You would need to know *at compile time* that it will load a ``seq[int8]``. This
is not really a problem because without knowing which type you will load, you
cannot do anything useful with the result afterwards in the code. But it may be
unfamiliar for programmers who are used to the YAML libraries of Python or Ruby.

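As a minimal sketch of what this means in practice, the following loads an
untagged version of the sequence above. It assumes that ``load`` accepts a
string as input and that the target variable's type alone decides how the
untagged document is interpreted:

.. code-block:: nim

    import yaml

    # The target type is fixed here, at compile time; the YAML input does not
    # choose it.
    var numbers: seq[int8]
    load("- 1\n- 2\n- 3", numbers)
    echo numbers

Changing the declaration to ``seq[string]`` would load the same document as
strings, which is why the type has to be known before the input is read.
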
Supported Types
===============

NimYAML supports a growing number of types of Nim's ``system`` module and
standard library, and it also supports user-defined object, tuple and enum types
out of the box. A complete list of explicitly supported types is available in
`Schema <schema.html>`_.

**Important**: NimYAML currently does not support polymorphism. This may be
added in the future.

Because user-defined types are supported out of the box, NimYAML is generally
able to work with object, tuple and enum types defined in the standard library
or a third-party library without further configuration, provided that all fields
of the object are accessible at the point in the code where NimYAML's
facilities are invoked.

Scalar Types
------------

The following integer types are supported by NimYAML: ``int``, ``int8``,
``int16``, ``int32``, ``int64``, ``uint8``, ``uint16``, ``uint32``, ``uint64``.
Note that the size of the ``int`` type depends on the target operating system.
To make sure that it round-trips properly between 32-bit and 64-bit operating
systems, it will be converted to an ``int32`` during loading and dumping. This
will raise an exception for values outside of the range
``int32.low .. int32.high``! If you define the types you serialize yourself,
always consider using an integer type with an explicit size. The same goes for
``uint``.

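A short sketch of that advice, using a hypothetical ``Stats`` type that exists
only for this illustration. Fields with an explicit size keep the same range on
every platform, so a value above ``int32.high`` can be loaded into an ``int64``
field:

.. code-block:: nim

    import yaml

    type
      Stats = object       # hypothetical type, for illustration only
        count: int32       # same size on 32-bit and 64-bit targets
        totalBytes: int64  # room for values beyond int32.high

    var stats: Stats
    load("count: 3\ntotalBytes: 5000000000", stats)
    echo stats.totalBytes
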
The floating point types ``float``, ``float32`` and ``float64`` are also
supported. There is currently no problem with ``float``, because it is always a
``float64``.

``string`` is supported and is one of the few Nim types that map directly to a
standard YAML type. NimYAML is able to handle strings that are ``nil``; they
will be serialized with the special tag ``!nim:nil:string``. ``char`` is also
supported.

To support new scalar types, you must implement the ``constructObject()`` and
``representObject()`` procs on that type (see below).

Container Types
---------------

NimYAML supports Nim's ``array``, ``set``, ``seq``, ``Table``, ``OrderedTable``
and ``Option`` types out of the box. While YAML's standard types ``!!seq`` and
``!!map`` allow arbitrarily typed content, in Nim the contained type must be
known at compile time. Therefore, NimYAML cannot load ``!!seq`` and ``!!map``
as such.

However, it doesn't need to. For example, if you have a YAML file like this:

.. code-block:: yaml

    %YAML 1.2
    ---
    - 1
    - 2

You can simply load it into a ``seq[int]``. If your YAML file contains differently
typed values in the same collection, you can use an implicit variant object (see
below).

A special case is ``Option[T]``: this type either contains a value or it does
not. NimYAML maps ``!!null`` YAML scalars to the option's ``none(T)`` value.
This also works for ``ref`` types because ``Option`` for those types will use
``nil`` as its ``none(T)`` value.

By default, ``Option`` fields must be given even if they are ``none(T)``.
You can circumvent this by putting the annotation ``{.sparse.}`` on the type
containing the ``Option`` field.

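The following sketch illustrates the ``{.sparse.}`` annotation with a
hypothetical ``Config`` type. It assumes that an ``Option`` field missing from
the input of a sparse type is set to ``none(T)``, as described above:

.. code-block:: nim

    import std/options
    import yaml

    type
      Config {.sparse.} = object  # hypothetical type, for illustration only
        name: string
        port: Option[int32]       # may be left out of the YAML input

    var config: Config
    load("name: demo", config)    # no ``port`` key in the input
    echo config.port.isNone

Without ``{.sparse.}``, the same input would be expected to contain a ``port``
key, for example with a ``!!null`` value.
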
Reference Types
---------------

A reference to any supported non-reference type (including user-defined types,
see below) is supported by NimYAML. A reference type will be treated like its
base type, but NimYAML is able to detect multiple references to the same object
and dump the structure properly with anchors and aliases in place. It is
possible to dump and load cyclic data structures without further configuration.
Reference types may hold a ``nil`` value, which will be mapped to the ``!!null``
YAML scalar type.

``ptr`` types are not supported because it seems dangerous to automatically
allocate memory which the user must then manually deallocate.

Anchors and aliases are not supported when calling NimYAML at compile time.

User Defined Types
------------------

For an object or tuple type to be directly usable with NimYAML, the following
conditions must be met:

- Every type contained in the object/tuple must be supported.
- All fields of an object type must be accessible from the code position where
  you call NimYAML. If an object has non-public member fields, it can only be
  processed in the module where it is defined.
- The object must not have a generic parameter.

NimYAML will present enum types as YAML scalars, and tuple and object types as
YAML mappings. Some of the conditions above may be loosened in future releases.

Variant Object Types
....................

A *variant object type* is an object type that contains one or more ``case``
clauses. NimYAML supports variant object types. Only the currently accessible
fields of a variant object type are dumped, and only those may be present when
loading.

The value of a discriminator field must be loaded before any value of a field
that depends on it. Therefore, a YAML mapping cannot be used to serialize
variant object types: the YAML specification explicitly states that the order
of key-value pairs in a mapping must not be used to convey content information.
Instead, any variant object type is serialized as a list of key-value pairs.

For example, this type:

.. code-block:: nim

  type
    AnimalKind = enum
      akCat, akDog

    Animal = object
      name: string
      case kind: AnimalKind
      of akCat:
        purringIntensity: int
      of akDog:
        barkometer: int

will be serialized as:

.. code-block:: yaml

  %YAML 1.2
  --- !nim:custom:Animal
  - name: Bastet
  - kind: akCat
  - purringIntensity: 7

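Loading works the same way in reverse. The sketch below feeds an untagged
version of that list to ``load``; it assumes that the root tag may be omitted
because the target type is already known, and that ``load`` accepts a string:

.. code-block:: nim

    import yaml

    type
      AnimalKind = enum
        akCat, akDog

      Animal = object
        name: string
        case kind: AnimalKind
        of akCat:
          purringIntensity: int
        of akDog:
          barkometer: int

    var cat: Animal
    # Each list item holds one key-value pair; ``kind`` comes before the
    # field that depends on it.
    load("- name: Bastet\n- kind: akCat\n- purringIntensity: 7", cat)
    echo cat.name, " ", cat.purringIntensity
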
You can also use variant object types for processing heterogeneous data sets.
For example, if you have a YAML document which contains differently typed values
in the same list like this:

.. code-block:: yaml

  %YAML 1.2
  ---
  - 42
  - this is a string
  - !!null

You can define a variant object type that can hold all types that occur in this
list in order to load it:

.. code-block:: nim

  import std/streams  # provides newFileStream
  import yaml

  type
    ContainerKind = enum
      ckInt, ckString, ckNone
    Container {.implicit.} = object
      case kind: ContainerKind
      of ckInt:
        intVal: int
      of ckString:
        strVal: string
      of ckNone:
        discard

  var
    list: seq[Container]
    s = newFileStream("in.yaml")
  load(s, list)
  s.close()

``{.implicit.}`` tells NimYAML that you want to use the type ``Container``
implicitly, i.e. its fields are not visible in YAML, and are set depending on
the type of the value that gets loaded into it. The type ``Container`` must
fulfill the following requirements:

- It must contain exactly one ``case`` clause, and nothing else.
- Each branch of the ``case`` clause must contain exactly one field, with one
  exception: There may be at most one branch that contains no field at all.
- It must not be a derived object type (this is currently not enforced).

When loading the sequence, NimYAML writes the value into the first field that
can hold the value's type. All complex values (i.e. non-scalar values) *must*
have a tag in the YAML source, because NimYAML would otherwise be unable to
determine their type. The type of scalar values will be guessed if no tag is
available, but be aware that ``42`` can fit in both ``int8`` and ``int16``, so
if you have fields for both types, you should tag the value.

When dumping the sequence, NimYAML will always attach a tag to each value it
outputs. This is to avoid possible ambiguity when loading. If a branch without
a field exists, it is represented as a ``!!null`` value.

Tags
====

NimYAML uses local tags to represent Nim types that do not map directly to a
YAML type. For example, ``int8`` is presented with the tag ``!nim:system:int8``.
Tags are mostly unnecessary when loading YAML data because the caller already
defines the target Nim type, which usually determines all types of the
structure. However, there is one case where a tag is necessary: A reference
type with the value ``nil`` is represented in YAML as a ``!!null`` scalar. This
will be automatically detected by type guessing, but if it is, for example, a
reference to a string with the value ``"~"``, it must be tagged with
``!!string``, because otherwise it would be loaded as ``nil``.

As you might have noticed in the example above, the YAML tag of a ``seq``
depends on its generic type parameter. The same applies to ``Table``. So, a
table that maps ``int8`` to string sequences would be presented with the tag
``!n!tables:Table(tag:nimyaml.org,2016:int8,tag:nimyaml.org,2016:system:seq(tag:yaml.org,2002:string))``.
These tags are generated on the fly based on the types you instantiate
``Table`` or ``seq`` with.

You may customize the tags used for your types by using the template
`setTagUri`_. It may not be applied to scalar and collection types implemented
by NimYAML, but you can, for example, use it on a certain ``seq`` type:

.. code-block:: nim

    setTagUri(seq[string], "!nim:my:seq")

Customizing Field Handling
==========================

NimYAML allows the user to specify special handling of certain object fields via
annotation pragmas.

Transient Fields
----------------

It may happen that certain fields of an object type are transient, i.e. they are
used in a way that makes (de)serializing them unnecessary. Such fields can be
marked as transient. This will cause them not to be serialized to YAML. They
will also not be accepted when loading the object.

Example:

.. code-block:: nim

  type MyObject = object
    storable: string
    temporary {.transient.}: string

Default Values
--------------

When you load YAML, you might want to allow for the omission of certain fields,
which should then be filled with a default value. You can do that like this:

.. code-block:: nim

  type MyObject = object
    required: string
    optional {.defaultVal: "default value".}: string

Whenever a value of type ``MyObject`` is now loaded and the input stream does
not contain the field ``optional``, that field will be set to the value
``"default value"``.

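A minimal sketch of this behavior, assuming that ``load`` accepts a string as
input; the YAML below deliberately leaves out the ``optional`` field:

.. code-block:: nim

    import yaml

    type MyObject = object
      required: string
      optional {.defaultVal: "default value".}: string

    var obj: MyObject
    load("required: some text", obj)  # ``optional`` is absent from the input
    echo obj.optional
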
Customize Serialization
=======================

It is possible to customize the serialization of a type. For this, you need to
implement two procs, ``constructObject`` and ``representObject``. If you only
need to process the type in one direction (loading or dumping), you can omit
the other proc.

constructObject
---------------

.. code-block:: nim

    proc constructObject*(
      ctx   : var ConstructionContext,
      result: var MyObject,
    ) {.raises: [YamlConstructionError, YamlStreamError].}

This proc should construct a value of the type from the ``YamlStream`` in
``ctx.input``. Follow these guidelines when implementing a custom
``constructObject`` proc:

- For constructing a value from a YAML scalar, consider using the
  ``constructScalarItem`` template, which automatically catches exceptions
  and wraps them in a ``YamlConstructionError``, and also ensures that the
  item you use for construction is a ``yamlScalar``. See below for an example.
- For constructing a value from a YAML sequence or map, you **must** use the
  ``constructChild`` proc for child values if you want to use their
  ``constructObject`` implementation. This will check their tag and anchor.
  Always try to construct child values that way.
- For non-scalars, make sure that the last value you remove from the stream is
  the object's ending event (``yamlEndMap`` or ``yamlEndSeq``).
- Use `peek <yaml.html#peek,YamlStream>`_ for inspecting the next event in
  the ``YamlStream`` without removing it.
- Never write a ``constructObject`` proc for a ``ref`` type. ``ref`` types are
  always handled by NimYAML itself. You can only customize the construction of
  the underlying object.

The following example for constructing from a YAML scalar value is the actual
implementation of constructing ``bool`` types:

.. code-block:: nim

    proc constructObject*(
      ctx   : var ConstructionContext,
      result: var bool,
    ) {.raises: [YamlConstructionError, YamlStreamError].} =
      ## constructs a bool value from a YAML scalar
      ctx.input.constructScalarItem(item, bool):
        case guessType(item.scalarContent)
        of yTypeBoolTrue: result = true
        of yTypeBoolFalse: result = false
        else:
          raise ctx.input.constructionError(
            item.startPos,
            "Cannot construct to bool: " & escape(item.scalarContent)
          )

The following example for constructing from a YAML non-scalar is the actual
implementation of constructing ``seq`` types:

.. code-block:: nim

    proc constructObject*[T](
      ctx   : var ConstructionContext,
      result: var seq[T],
    ) {.raises: [YamlConstructionError, YamlStreamError].} =
      ## constructs a Nim seq from a YAML sequence
      let event = ctx.input.next()
      if event.kind != yamlStartSeq:
        raise ctx.input.constructionError(event.startPos, "Expected sequence start")
      result = newSeq[T]()
      while ctx.input.peek().kind != yamlEndSeq:
        var item: T
        ctx.constructChild(item)
        result.add(move(item))
      discard ctx.input.next()

representObject
---------------

.. code-block:: nim

    proc representObject*(
      ctx  : var SerializationContext,
      value: MyObject,
      tag  : Tag,
    ) {.raises: [YamlSerializationError].}

This proc should push a sequence of events that represent the value into the
serialization context via ``ctx.put``. Follow these guidelines when
implementing a custom ``representObject`` proc:

- Always output the first event with a ``yAnchorNone``. Anchors will be set
  automatically by ``ref`` type handling.
- When outputting non-scalar types, you should use ``representChild`` for
  contained values.
- Always use the ``tag`` parameter as tag for the first event you generate.
- Never write a ``representObject`` proc for ``ref`` types; instead, write the
  proc for the ref'd type.

The following example for representing to a YAML scalar is the actual
implementation of representing the explicitly sized signed integer types:

.. code-block:: nim

    proc representObject*[T: int8|int16|int32|int64](
      ctx  : var SerializationContext,
      value: T,
      tag  : Tag,
    ) {.raises: [].} =
      ## represents an integer value as YAML scalar
      ctx.put(scalarEvent($value, tag, yAnchorNone))

The following example for representing to a YAML non-scalar is the actual
implementation of representing ``seq`` and ``set`` types:

.. code-block:: nim

    proc representObject*[T](
      ctx  : var SerializationContext,
      value: seq[T]|set[T],
      tag  : Tag,
    ) {.raises: [YamlSerializationError].} =
      ## represents a Nim seq or set as YAML sequence
      ctx.put(startSeqEvent(tag = tag))
      for item in value: ctx.representChild(item)
      ctx.put(endSeqEvent())
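
Both procs can be combined for a user-defined type. The following is only a
sketch under stated assumptions: ``Point`` and its ``"x,y"`` scalar format are
hypothetical, the two procs follow the signatures shown above, and the last
line assumes that a default-constructed ``Dumper`` offers a ``dump`` proc that
returns a string:

.. code-block:: nim

    import std/strutils
    import yaml

    type
      Point = object  # hypothetical type, for illustration only
        x, y: int32

    proc constructObject*(
      ctx   : var ConstructionContext,
      result: var Point,
    ) {.raises: [YamlConstructionError, YamlStreamError].} =
      ## constructs a Point from a YAML scalar of the form "x,y"
      ctx.input.constructScalarItem(item, Point):
        let parts = item.scalarContent.split(',')
        if parts.len != 2:
          # wrapped in a YamlConstructionError by constructScalarItem
          raise newException(ValueError, "expected two comma-separated integers")
        result = Point(
          x: int32(parseInt(parts[0].strip())),
          y: int32(parseInt(parts[1].strip())),
        )

    proc representObject*(
      ctx  : var SerializationContext,
      value: Point,
      tag  : Tag,
    ) {.raises: [].} =
      ## represents a Point as a single YAML scalar of the form "x,y"
      ctx.put(scalarEvent($(value.x) & "," & $(value.y), tag, yAnchorNone))

    var p: Point
    load("3,4", p)          # goes through the custom constructObject
    echo Dumper().dump(p)   # goes through the custom representObject

Representing the object as a scalar keeps the YAML compact; if the fields
should stay individually addressable in the YAML, the default mapping
representation is the better fit and no custom procs are needed.
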