YAML Ain’t Markup Language (YAML™) Version 1.2
3^rd Edition, Patched at 2009-10-01
Oren Ben-Kiki
Clark Evans
Ingy döt Net
Latest (patched) version:
HTML: [4]http://yaml.org/spec/1.2/spec.html
PDF: [5]http://yaml.org/spec/1.2/spec.pdf
PS: [6]http://yaml.org/spec/1.2/spec.ps
Errata: [7]http://yaml.org/spec/1.2/errata.html
Previous (original) version:
[8]http://yaml.org/spec/1.2/2009-07-21/spec.html
Copyright © 2001-2009 Oren Ben-Kiki, Clark Evans, Ingy döt Net
This document may be freely copied, provided it is not modified.
Status of this Document
This document reflects the third version of the YAML data serialization
language. The content of the specification was arrived at by consensus
of its authors and through user feedback on the [9]yaml-core mailing
list. We encourage implementers to update their software with
support for this version.
The primary objective of this revision is to bring YAML into compliance
with JSON as an official subset. YAML 1.2 is compatible with 1.1 for
most practical applications - this is a minor revision. An expected
source of incompatibility with prior versions of YAML, especially the
syck implementation, is the change in implicit typing rules. We have
removed unique implicit typing rules and have updated these rules to
align them with JSON's productions. In this version of YAML, boolean
values may be serialized as “true” or “false”; the empty scalar as
“null”. Unquoted numeric values are a superset of JSON's numeric
production. Other changes in the specification were the removal of the
Unicode line breaks and production bug fixes. We also define 3 built-in
implicit typing rule sets: untyped, strict JSON, and a more flexible
YAML rule set that extends JSON typing.
The difference between late 1.0 drafts which syck 0.55 implements and
the 1.1 revision of this specification is much more extensive. We fixed
usability issues with the tagging syntax. In particular, the single
exclamation was re-defined for private types and a simple prefixing
mechanism was introduced. This revision also fixed many production edge
cases and introduced a type repository. Therefore, there are several
incompatibilities between syck and this revision as well.
The list of known errors in this specification is available at
[10]http://yaml.org/spec/1.2/errata.html. Please report errors in this
document to the [11]yaml-core mailing list. This revision contains
fixes for all errors known as of 2009-10-01.
We wish to thank implementers who have tirelessly tracked earlier
versions of this specification, and our fabulous user community whose
feedback has both validated and clarified our direction.
Abstract
YAML™ (rhymes with “camel”) is a human-friendly, cross language,
Unicode based data serialization language designed around the common
native data types of agile programming languages. It is broadly useful
for programming needs ranging from configuration files to Internet
messaging to object persistence to data auditing. Together with the
[12]Unicode standard for characters, this specification provides all
the information necessary to understand YAML Version 1.2 and to create
programs that process YAML information.
__________________________________________________________________
Table of Contents
[13]1. Introduction
[14]1.1. Goals
[15]1.2. Prior Art
[16]1.3. Relation to JSON
[17]1.4. Relation to XML
[18]1.5. Terminology
[19]2. Preview
[20]2.1. Collections
[21]2.2. Structures
[22]2.3. Scalars
[23]2.4. Tags
[24]2.5. Full Length Example
[25]3. Processing YAML Information
[26]3.1. Processes
[27]3.1.1. Dump
[28]3.1.2. Load
[29]3.2. Information Models
[30]3.2.1. Representation Graph
[31]3.2.1.1. Nodes
[32]3.2.1.2. Tags
[33]3.2.1.3. Node Comparison
[34]3.2.2. Serialization Tree
[35]3.2.2.1. Keys Order
[36]3.2.2.2. Anchors and Aliases
[37]3.2.3. Presentation Stream
[38]3.2.3.1. Node Styles
[39]3.2.3.2. Scalar Formats
[40]3.2.3.3. Comments
[41]3.2.3.4. Directives
[42]3.3. Loading Failure Points
[43]3.3.1. Well-Formed Streams and Identified Aliases
[44]3.3.2. Resolved Tags
[45]3.3.3. Recognized and Valid Tags
[46]3.3.4. Available Tags
[47]4. Syntax Conventions
[48]4.1. Production Parameters
[49]4.2. Production Naming Conventions
[50]5. Characters
[51]5.1. Character Set
[52]5.2. Character Encodings
[53]5.3. Indicator Characters
[54]5.4. Line Break Characters
[55]5.5. White Space Characters
[56]5.6. Miscellaneous Characters
[57]5.7. Escaped Characters
[58]6. Basic Structures
[59]6.1. Indentation Spaces
[60]6.2. Separation Spaces
[61]6.3. Line Prefixes
[62]6.4. Empty Lines
[63]6.5. Line Folding
[64]6.6. Comments
[65]6.7. Separation Lines
[66]6.8. Directives
[67]6.8.1. “YAML” Directives
[68]6.8.2. “TAG” Directives
[69]6.8.2.1. Tag Handles
[70]6.8.2.2. Tag Prefixes
[71]6.9. Node Properties
[72]6.9.1. Node Tags
[73]6.9.2. Node Anchors
[74]7. Flow Styles
[75]7.1. Alias Nodes
[76]7.2. Empty Nodes
[77]7.3. Flow Scalar Styles
[78]7.3.1. Double-Quoted Style
[79]7.3.2. Single-Quoted Style
[80]7.3.3. Plain Style
[81]7.4. Flow Collection Styles
[82]7.4.1. Flow Sequences
[83]7.4.2. Flow Mappings
[84]7.5. Flow Nodes
[85]8. Block Styles
[86]8.1. Block Scalar Styles
[87]8.1.1. Block Scalar Headers
[88]8.1.1.1. Block Indentation Indicator
[89]8.1.1.2. Block Chomping Indicator
[90]8.1.2. Literal Style
[91]8.1.3. Folded Style
[92]8.2. Block Collection Styles
[93]8.2.1. Block Sequences
[94]8.2.2. Block Mappings
[95]8.2.3. Block Nodes
[96]9. YAML Character Stream
[97]9.1. Documents
[98]9.1.1. Document Prefix
[99]9.1.2. Document Markers
[100]9.1.3. Bare Documents
[101]9.1.4. Explicit Documents
[102]9.1.5. Directives Documents
[103]9.2. Streams
[104]10. Recommended Schemas
[105]10.1. Failsafe Schema
[106]10.1.1. Tags
[107]10.1.1.1. Generic Mapping
[108]10.1.1.2. Generic Sequence
[109]10.1.1.3. Generic String
[110]10.1.2. Tag Resolution
[111]10.2. JSON Schema
[112]10.2.1. Tags
[113]10.2.1.1. Null
[114]10.2.1.2. Boolean
[115]10.2.1.3. Integer
[116]10.2.1.4. Floating Point
[117]10.2.2. Tag Resolution
[118]10.3. Core Schema
[119]10.3.1. Tags
[120]10.3.2. Tag Resolution
[121]10.4. Other Schemas
[122]Index
Chapter 1. Introduction
“YAML Ain’t Markup Language” (abbreviated YAML) is a data serialization
language designed to be human-friendly and work well with modern
programming languages for common everyday tasks. This specification is
both an introduction to the YAML language and the concepts supporting
it, and also a complete specification of the information needed to
develop [123]applications for processing YAML.
Open, interoperable and readily understandable tools have advanced
computing immensely. YAML was designed from the start to be useful and
friendly to people working with data. It uses Unicode [124]printable
characters, [125]some of which provide structural information and the
rest containing the data itself. YAML achieves a unique cleanness by
minimizing the amount of structural characters and allowing the data to
show itself in a natural and meaningful way. For example,
[126]indentation may be used for structure, [127]colons separate
[128]key: value pairs, and [129]dashes are used to create “bullet”
[130]lists.
There are myriad flavors of [131]data structures, but they can all be
adequately [132]represented with three basic primitives: [133]mappings
(hashes/dictionaries), [134]sequences (arrays/lists) and [135]scalars
(strings/numbers). YAML leverages these primitives, and adds a simple
typing system and [136]aliasing mechanism to form a complete language
for [137]serializing any [138]native data structure. While most
programming languages can use YAML for data serialization, YAML excels
in working with those languages that are fundamentally built around the
three basic primitives. These include the new wave of agile languages
such as Perl, Python, PHP, Ruby, and JavaScript.
There are hundreds of different languages for programming, but only a
handful of languages for storing and transferring data. Even though its
potential is virtually boundless, YAML was specifically created to work
well for common use cases such as: configuration files, log files,
interprocess messaging, cross-language data sharing, object
persistence, and debugging of complex data structures. When data is
easy to view and understand, programming becomes a simpler task.
1.1. Goals
The design goals for YAML are, in decreasing priority:
1. YAML is easily readable by humans.
2. YAML data is portable between programming languages.
3. YAML matches the [139]native data structures of agile languages.
4. YAML has a consistent model to support generic tools.
5. YAML supports one-pass processing.
6. YAML is expressive and extensible.
7. YAML is easy to implement and use.
1.2. Prior Art
YAML’s initial direction was set by the data serialization and markup
language discussions among [140]SML-DEV members. Later on, it directly
incorporated experience from Ingy döt Net’s Perl module
[141]Data::Denter. Since then, YAML has matured through ideas and
support from its user community.
YAML integrates and builds upon concepts described by [142]C,
[143]Java, [144]Perl, [145]Python, [146]Ruby, [147]RFC0822 (MAIL),
[148]RFC1866 (HTML), [149]RFC2045 (MIME), [150]RFC2396 (URI), [151]XML,
[152]SAX, [153]SOAP, and [154]JSON.
The syntax of YAML was motivated by Internet Mail (RFC0822) and remains
partially compatible with that standard. Further, borrowing from MIME
(RFC2045), YAML’s top-level production is a [155]stream of independent
[156]documents, ideal for message-based distributed processing systems.
YAML’s [157]indentation-based scoping is similar to Python’s (without
the ambiguities caused by [158]tabs). [159]Indented blocks facilitate
easy inspection of the data’s structure. YAML’s [160]literal style
leverages this by enabling formatted text to be cleanly mixed within an
[161]indented structure without troublesome [162]escaping. YAML also
allows the use of traditional [163]indicator-based scoping similar to
JSON’s and Perl’s. Such [164]flow content can be freely nested inside
[165]indented blocks.
YAML’s [166]double-quoted style uses familiar C-style [167]escape
sequences. This enables ASCII encoding of non-[168]printable or 8-bit
(ISO 8859-1) characters such as [169]“\x3B”. Non-[170]printable 16-bit
Unicode and 32-bit (ISO/IEC 10646) characters are supported with
[171]escape sequences such as [172]“\u003B” and [173]“\U0000003B”.
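As an illustration only (not part of the specification), the three
numeric escape forms mentioned above could be decoded along these
lines. The function name decode_hex_escapes is hypothetical, and only
the \x, \u and \U forms are handled; the full escape table is defined
in section 5.7.

```python
import re

# Matches \xHH, \uHHHH and \UHHHHHHHH, where H is a hexadecimal digit.
_ESCAPE = re.compile(
    r"\\(?:x([0-9A-Fa-f]{2})|u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8}))"
)


def decode_hex_escapes(text):
    """Replace each numeric escape with the character it denotes."""
    def substitute(match):
        # Exactly one of the three groups is set for any match.
        digits = match.group(1) or match.group(2) or match.group(3)
        return chr(int(digits, 16))
    return _ESCAPE.sub(substitute, text)


# All three forms name the same character, the semicolon.
print(decode_hex_escapes(r"\x3B \u003B \U0000003B"))
```

The 8-bit, 16-bit and 32-bit forms differ only in how many hexadecimal
digits they carry, so a single pattern with three alternatives is
enough for this sketch.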
Motivated by HTML’s end-of-line normalization, YAML’s [174]line folding
employs an intuitive method of handling [175]line breaks. A single
[176]line break is [177]folded into a single [178]space, while
[179]empty lines are interpreted as [180]line break characters. This
technique allows for paragraphs to be word-wrapped without affecting
the [181]canonical form of the [182]scalar content.
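The folding rule described above can be sketched in a few lines. This
is a simplified illustration under stated assumptions: the name
fold_lines is hypothetical, leading and trailing empty lines are
simply dropped, and the special treatment of more-indented lines in
the folded block style is ignored.

```python
def fold_lines(text):
    """Fold line breaks: a lone break becomes a space, and each empty
    line between two text lines becomes one line feed."""
    result = []
    pending_empty = 0          # empty lines seen since the last text line
    for line in text.split("\n"):
        stripped = line.strip()
        if not stripped:
            pending_empty += 1
            continue
        if result:
            # Join this line to the previous text line.
            result.append("\n" * pending_empty if pending_empty else " ")
        pending_empty = 0
        result.append(stripped)
    return "".join(result)


print(fold_lines("Mark McGwire's\nyear was crippled\nby a knee injury."))
```

Because a paragraph break survives as a line feed while a wrapped line
does not, re-wrapping the text at a different width leaves the folded
result unchanged.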
YAML’s core type system is based on the requirements of agile languages
such as Perl, Python, and Ruby. YAML directly supports both
[183]collections ([184]mappings, [185]sequences) and [186]scalars.
Support for these common types enables programmers to use their
language’s [187]native data structures for YAML manipulation, instead
of requiring a special document object model (DOM).
Like XML’s SOAP, YAML supports [188]serializing a graph of [189]native
data structures through an [190]aliasing mechanism. Also like SOAP,
YAML provides for [191]application-defined [192]types. This allows YAML
to [193]represent rich data structures required for modern distributed
computing. YAML provides globally unique [194]type names using a
namespace mechanism inspired by Java’s DNS-based package naming
convention and XML’s URI-based namespaces. In addition, YAML allows for
private [195]types specific to a single [196]application.
YAML was designed to support incremental interfaces that include both
input (“getNextEvent()”) and output (“sendNextEvent()”) one-pass
interfaces. Together, these enable YAML to support the processing of
large [197]documents (e.g. transaction logs) or continuous [198]streams
(e.g. feeds from a production machine).
1.3. Relation to JSON
Both JSON and YAML aim to be human readable data interchange formats.
However, JSON and YAML have different priorities. JSON’s foremost
design goal is simplicity and universality. Thus, JSON is trivial to
generate and parse, at the cost of reduced human readability. It also
uses a lowest common denominator information model, ensuring any JSON
data can be easily processed by every modern programming environment.
In contrast, YAML’s foremost design goals are human readability and
support for [199]serializing arbitrary [200]native data structures.
Thus, YAML allows for extremely readable files, but is more complex to
generate and parse. In addition, YAML ventures beyond the lowest common
denominator data types, requiring more complex processing when crossing
between different programming environments.
YAML can therefore be viewed as a natural superset of JSON, offering
improved human readability and a more complete information model. This
is also the case in practice; every JSON file is also a valid YAML
file. This makes it easy to migrate from JSON to YAML if/when the
additional features are required.
JSON's [201]RFC4627 requires that [202]mapping [203]keys merely
“SHOULD” be [204]unique, while YAML insists they “MUST” be.
Technically, YAML therefore complies with the JSON spec, choosing to
treat duplicates as an error. In practice, since JSON is silent on the
semantics of such duplicates, the only portable JSON files are those
with unique keys, which are therefore valid YAML files.
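To make the difference concrete, here is a minimal sketch using
Python's standard json module. By default that module accepts a
duplicate key and keeps the last value; the hook below (the names
reject_duplicate_keys and load_unique are hypothetical) enforces the
stricter "MUST be unique" reading that YAML adopts.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises when a key is repeated."""
    mapping = {}
    for key, value in pairs:
        if key in mapping:
            raise ValueError(f"duplicate mapping key: {key!r}")
        mapping[key] = value
    return mapping


def load_unique(text):
    """Parse JSON text, treating duplicate keys as an error."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


print(load_unique('{"hr": 65, "avg": 0.278}'))
```

Input that passes this check is the portable subset of JSON described
above: files whose keys are unique within each object.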
It may be useful to define an intermediate format between YAML and JSON.
Such a format would be trivial to parse (but not very human readable),
like JSON. At the same time, it would allow for [205]serializing
arbitrary [206]native data structures, like YAML. Such a format might
also serve as YAML’s "canonical format". Defining such a “YSON” format
(YSON is a Serialized Object Notation) can be done either by enhancing
the JSON specification or by restricting the YAML specification. Such a
definition is beyond the scope of this specification.
1.4. Relation to XML
Newcomers to YAML often search for its correlation to the eXtensible
Markup Language (XML). Although the two languages may actually compete
in several application domains, there is no direct correlation between
them.
YAML is primarily a data serialization language. XML was designed to be
backwards compatible with the Standard Generalized Markup Language
(SGML), which was designed to support structured documentation. XML
therefore had many design constraints placed on it that YAML does not
share. XML is a pioneer in many domains; YAML is the result of lessons
learned from XML and other technologies.
It should be mentioned that there are ongoing efforts to define
standard XML/YAML mappings. This generally requires that a subset of
each language be used. For more information on using both XML and YAML,
please visit [207]http://yaml.org/xml.
1.5. Terminology
This specification uses key words based on [208]RFC2119 to indicate
requirement level. In particular, the following words are used to
describe the actions of a YAML [209]processor:
May
The word may, or the adjective optional, mean that conforming
YAML [210]processors are permitted to, but need not, behave as
described.
Should
The word should, or the adjective recommended, mean that there
could be reasons for a YAML [211]processor to deviate from the
behavior described, but that such deviation could hurt
interoperability and should therefore be advertised with
appropriate notice.
Must
The word must, or the term required or shall, mean that the
behavior described is an absolute requirement of the
specification.
The rest of this document is arranged as follows. Chapter [212]2
provides a short preview of the main YAML features. Chapter [213]3
describes the YAML information model, and the processes for converting
from and to this model and the YAML text format. The bulk of the
document, chapters [214]4 through [215]9, formally defines this text
format. Finally, chapter [216]10 recommends basic YAML schemas.
Chapter 2. Preview
This section provides a quick glimpse into the expressive power of
YAML. It is not expected that the first-time reader grok all of the
examples. Rather, these selections are used as motivation for the
remainder of the specification.
2.1. Collections
YAML’s [217]block collections use [218]indentation for scope and begin
each entry on its own line. [219]Block sequences indicate each entry
with a dash and space ( [220]“- ”). [221]Mappings use a colon and
space ([222]“: ”) to mark each [223]key: value pair. [224]Comments
begin with an octothorpe (also called a “hash”, “sharp”, “pound”, or
“number sign” - [225]“#”).
Example 2.1. Sequence of Scalars
(ball players)
- Mark McGwire
- Sammy Sosa
- Ken Griffey
Example 2.2. Mapping Scalars to Scalars
(player statistics)
hr: 65 # Home runs
avg: 0.278 # Batting average
rbi: 147 # Runs Batted In
Example 2.3. Mapping Scalars to Sequences
(ball clubs in each league)
american:
  - Boston Red Sox
  - Detroit Tigers
  - New York Yankees
national:
  - New York Mets
  - Chicago Cubs
  - Atlanta Braves
Example 2.4. Sequence of Mappings
(players’ statistics)
-
  name: Mark McGwire
  hr:   65
  avg:  0.278
-
  name: Sammy Sosa
  hr:   63
  avg:  0.288
YAML also has [226]flow styles, using explicit [227]indicators rather
than [228]indentation to denote scope. The [229]flow sequence is
written as a [230]comma separated list within [231]square
[232]brackets. In a similar manner, the [233]flow mapping uses
[234]curly [235]braces.
Example 2.5. Sequence of Sequences
- [name , hr, avg ]
- [Mark McGwire, 65, 0.278]
- [Sammy Sosa , 63, 0.288]
Example 2.6. Mapping of Mappings
Mark McGwire: {hr: 65, avg: 0.278}
Sammy Sosa: {
    hr: 63,
    avg: 0.288
  }
2.2. Structures
YAML uses three dashes ([236]“---”) to separate [237]directives from
[238]document [239]content. This also serves to signal the start of a
document if no [240]directives are present. Three dots ( [241]“...”)
indicate the end of a document without starting a new one, for use in
communication channels.
Example 2.7. Two Documents in a Stream
(each with a leading comment)
# Ranking of 1998 home runs
---
- Mark McGwire
- Sammy Sosa
- Ken Griffey
# Team ranking
---
- Chicago Cubs
- St Louis Cardinals
Example 2.8. Play by Play Feed
from a Game
---
time: 20:03:20
player: Sammy Sosa
action: strike (miss)
...
---
time: 20:03:47
player: Sammy Sosa
action: grand slam
...
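The way the markers delimit documents in a feed such as Example 2.8
can be sketched as a one-pass generator. This is a deliberately
simplified illustration: the name split_documents is hypothetical, a
line must be exactly "---" or "..." to count as a marker, and
directives, bare documents and content placed on the marker line are
not handled.

```python
def split_documents(lines):
    """Yield the lines of each document found in a stream of lines."""
    current = None                 # None means "not inside a document"
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "---":
            if current is not None:
                yield current      # a new start marker closes the previous one
            current = []
        elif line == "...":
            if current is not None:
                yield current
            current = None         # wait for the next start marker
        elif current is not None:
            current.append(line)
    if current is not None:
        yield current              # stream ended without an end marker


feed = """\
---
time: 20:03:20
player: Sammy Sosa
action: strike (miss)
...
---
time: 20:03:47
player: Sammy Sosa
action: grand slam
...
"""
for document in split_documents(feed.splitlines()):
    print(document)
```

Because the generator yields a document as soon as its end marker
arrives, a consumer on a communication channel does not need to wait
for the next document to start.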
Repeated [242]nodes (objects) are first [243]identified by an
[244]anchor (marked with the ampersand - [245]“&”), and are then
[246]aliased (referenced with an asterisk - [247]“*”) thereafter.
Example 2.9. Single Document with
Two Comments
---
hr: # 1998 hr ranking
- Mark McGwire
- Sammy Sosa
rbi:
# 1998 rbi ranking
- Sammy Sosa
- Ken Griffey
Example 2.10. Node for “Sammy Sosa”
appears twice in this document
---
hr:
- Mark McGwire
# Following node labeled SS
- &SS Sammy Sosa
rbi:
- *SS # Subsequent occurrence
- Ken Griffey
A question mark and space ([248]“? ”) indicate a complex [249]mapping
[250]key. Within a [251]block collection, [252]key: value pairs can
start immediately following the [253]dash, [254]colon, or [255]question
mark.
Example 2.11. Mapping between Sequences
? - Detroit Tigers
  - Chicago Cubs
:
  - 2001-07-23
? [ New York Yankees,
    Atlanta Braves ]
: [ 2001-07-02, 2001-08-12,
    2001-08-14 ]
Example 2.12. Compact Nested Mapping
---
# Products purchased
- item    : Super Hoop
  quantity: 1
- item    : Basketball
  quantity: 4
- item    : Big Shoes
  quantity: 1
2.3. Scalars
[256]Scalar content can be written in [257]block notation, using a
[258]literal style (indicated by [259]“|”) where all [260]line breaks
are significant. Alternatively, they can be written with the
[261]folded style [262](denoted by “>”) where each [263]line break is
[264]folded to a [265]space unless it ends an [266]empty or a
[267]more-indented line.
Example 2.13. In literals,
newlines are preserved
# ASCII Art
--- |
  \//||\/||
  // ||  ||__
Example 2.14. In the folded scalars,
newlines become spaces
--- >
  Mark McGwire's
  year was crippled
  by a knee injury.
Example 2.15. Folded newlines are preserved
for "more indented" and blank lines
>
 Sammy Sosa completed another
 fine season with great stats.

   63 Home Runs
   0.288 Batting Average

 What a year!
Example 2.16. Indentation determines scope
name: Mark McGwire
accomplishment: >
  Mark set a major league
  home run record in 1998.
stats: |
  65 Home Runs
  0.278 Batting Average
YAML’s [268]flow scalars include the [269]plain style (most examples
thus far) and two quoted styles. The [270]double-quoted style provides
[271]escape sequences. The [272]single-quoted style is useful when
[273]escaping is not needed. All [274]flow scalars can span multiple
lines; [275]line breaks are always [276]folded.
Example 2.17. Quoted Scalars
unicode: "Sosa did fine.\u263A"
control: "\b1998\t1999\t2000\n"
hex esc: "\x0d\x0a is \r\n"
single: '"Howdy!" he cried.'
quoted: ' # Not a ''comment''.'
tie-fighter: '|\-*-/|'
Example 2.18. Multi-line Flow Scalars
plain:
  This unquoted scalar
  spans many lines.
quoted: "So does this
  quoted scalar.\n"
2.4. Tags
In YAML, [277]untagged nodes are given a type depending on the
[278]application. The examples in this specification generally use the
[279]seq, [280]map and [281]str types from the [282]failsafe schema. A
few examples also use the [283]int, [284]float, and [285]null types
from the [286]JSON schema. The [287]repository includes additional
types such as [288]binary, [289]omap, [290]set and others.
Example 2.19. Integers
canonical: 12345
decimal: +12345
octal: 0o14
hexadecimal: 0xC
Example 2.20. Floating Point
canonical: 1.23015e+3
exponential: 12.3015e+02
fixed: 1230.15
negative infinity: -.inf
not a number: .NaN
Example 2.21. Miscellaneous
null:
booleans: [ true, false ]
string: '012345'
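How an untagged plain scalar might be given one of these types can be
sketched with a handful of regular expressions. The patterns below are
a paraphrase of the core schema's tag resolution (section 10.3.2) and
should be read as an approximation; the function name resolve_plain is
hypothetical. Timestamps are not covered, so they remain strings here.

```python
import re

_NULL = re.compile(r"null|Null|NULL|~|")
_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")
_INT10 = re.compile(r"[-+]?[0-9]+")
_INT8 = re.compile(r"0o[0-7]+")
_INT16 = re.compile(r"0x[0-9a-fA-F]+")
_FLOAT = re.compile(r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?")
_INF = re.compile(r"[-+]?\.(inf|Inf|INF)")
_NAN = re.compile(r"\.(nan|NaN|NAN)")


def resolve_plain(text):
    """Map the text of a plain scalar to a native Python value."""
    if _NULL.fullmatch(text):
        return None
    if _BOOL.fullmatch(text):
        return text.lower() == "true"
    if _INT10.fullmatch(text):
        return int(text)
    if _INT8.fullmatch(text):
        return int(text[2:], 8)        # strip the "0o" prefix
    if _INT16.fullmatch(text):
        return int(text[2:], 16)       # strip the "0x" prefix
    if _FLOAT.fullmatch(text):
        return float(text)
    if _INF.fullmatch(text):
        return float("-inf") if text.startswith("-") else float("inf")
    if _NAN.fullmatch(text):
        return float("nan")
    return text                        # anything else stays a string


for sample in ["12345", "+12345", "0o14", "0xC", "1230.15", "-.inf", "true", ""]:
    print(repr(sample), "->", repr(resolve_plain(sample)))
```

A quoted scalar such as '012345' never reaches this function in a real
processor, which is why it stays a string in Example 2.21.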
Example 2.22. Timestamps
canonical: 2001-12-15T02:59:43.1Z
iso8601: 2001-12-14t21:59:43.10-05:00
spaced: 2001-12-14 21:59:43.10 -5
date: 2002-12-14
Explicit typing is denoted with a [291]tag using the exclamation point
([292]“!”) symbol. [293]Global tags are URIs and may be specified in a
[294]tag shorthand notation using a [295]handle.
[296]Application-specific [297]local tags may also be used.
Example 2.23. Various Explicit Tags
---
not-date: !!str 2002-04-28
picture: !!binary |
 R0lGODlhDAAMAIQAAP//9/X
 17unp5WZmZgAAAOfn515eXv
 Pz7Y6OjuDg4J+fn5OTk6enp
 56enmleECcgggoBADs=
application specific tag: !something |
 The semantics of the tag
 above may be different for
 different documents.
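As a small illustration of what an application might do with the
!!binary scalar above, the following sketch decodes its base64 content
with Python's standard library. The variable names are hypothetical;
the only assumption about the data is that it is base64 text, as the
binary type in the repository suggests.

```python
import base64

# Content of the "picture" scalar from Example 2.23, line breaks included.
picture = """
R0lGODlhDAAMAIQAAP//9/X
17unp5WZmZgAAAOfn515eXv
Pz7Y6OjuDg4J+fn5OTk6enp
56enmleECcgggoBADs=
"""

# With validation off (the default), b64decode discards the line breaks.
data = base64.b64decode(picture)

# The first six bytes identify the image format.
print(data[:6])
```

The literal block style keeps the line breaks in the scalar, so it is
up to the decoder, not the YAML processor, to ignore them.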
Example 2.24. Global Tags
%TAG ! tag:clarkevans.com,2002:
--- !shape
  # Use the ! handle for presenting
  # tag:clarkevans.com,2002:circle
- !circle
  center: &ORIGIN {x: 73, y: 129}
  radius: 7
- !line
  start: *ORIGIN
  finish: { x: 89, y: 102 }
- !label
  start: *ORIGIN
  color: 0xFFEEBB
  text: Pretty vector drawing.
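The shorthand expansion used in Example 2.24 can be sketched as a
lookup from handle to prefix. This is an illustrative simplification:
the names DEFAULT_HANDLES and expand_tag are hypothetical, verbatim
tags written as !<...> and the bare "!" non-specific tag are not
handled, and no escape decoding is applied to the suffix.

```python
import re

# Handles available when no %TAG directive overrides them.
DEFAULT_HANDLES = {"!": "!", "!!": "tag:yaml.org,2002:"}

# A handle is "!", "!!" or "!name!"; the rest of the shorthand is the suffix.
_SHORTHAND = re.compile(r"(![0-9A-Za-z-]*!|!)(.+)")


def expand_tag(shorthand, handles=None):
    """Expand a tag shorthand to a full tag using a handle table."""
    table = dict(DEFAULT_HANDLES)
    table.update(handles or {})
    match = _SHORTHAND.fullmatch(shorthand)
    if match is None:
        raise ValueError(f"not a tag shorthand: {shorthand!r}")
    handle, suffix = match.groups()
    if handle not in table:
        raise ValueError(f"undeclared tag handle: {handle!r}")
    return table[handle] + suffix


# Mirrors "%TAG ! tag:clarkevans.com,2002:" from Example 2.24.
directive = {"!": "tag:clarkevans.com,2002:"}
print(expand_tag("!circle", directive))
print(expand_tag("!!str"))
```

Without a directive the "!" handle expands to itself, so a tag such as
!something stays a local tag.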
Example 2.25. Unordered Sets
# Sets are represented as a
# Mapping where each key is
# associated with a null value
--- !!set
? Mark McGwire
? Sammy Sosa
? Ken Griffey
Example 2.26. Ordered Mappings
# Ordered maps are represented as
# A sequence of mappings, with
# each mapping having one key
--- !!omap
- Mark McGwire: 65
- Sammy Sosa: 63
- Ken Griffey: 58
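The two representations described in the comments of Examples 2.25 and
2.26 map naturally onto native structures. The sketch below shows one
possible conversion in Python; the variable names and the helper
omap_to_pairs are hypothetical, and the input values are written by
hand rather than produced by a YAML processor.

```python
# A !!set is a mapping in which every key has a null value.
players_as_mapping = {
    "Mark McGwire": None,
    "Sammy Sosa": None,
    "Ken Griffey": None,
}
players = set(players_as_mapping)      # keep only the keys

# An !!omap is a sequence of mappings, each holding exactly one key.
home_runs_as_sequence = [
    {"Mark McGwire": 65},
    {"Sammy Sosa": 63},
    {"Ken Griffey": 58},
]


def omap_to_pairs(sequence):
    """Flatten a sequence of single-key mappings into ordered pairs."""
    pairs = []
    for entry in sequence:
        if len(entry) != 1:
            raise ValueError("each omap entry must hold exactly one key")
        pairs.extend(entry.items())
    return pairs


print(sorted(players))
print(omap_to_pairs(home_runs_as_sequence))
```

Keeping the ordered mapping as a list of pairs preserves the order
that the sequence provides, which a plain mapping does not guarantee.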
2.5. Full Length Example
Below are two full-length examples of YAML. The first is a sample
invoice; the second is a sample log file.
Example 2.27. Invoice
--- !<tag:clarkevans.com,2002:invoice>
invoice: 34843
date   : 2001-01-23
bill-to: &id001
    given  : Chris
    family : Dumars
    address:
        lines: |
            458 Walkman Dr.
            Suite #292
        city    : Royal Oak
        state   : MI
        postal  : 48046
ship-to: *id001
product:
    - sku         : BL394D
      quantity    : 4
      description : Basketball
      price       : 450.00
    - sku         : BL4438H
      quantity    : 1
      description : Super Hoop
      price       : 2392.00
tax  : 251.42
total: 4443.52
comments:
    Late afternoon is best.
    Backup contact is Nancy
    Billsmer @ 338-4338.
Example 2.28. Log File
---
Time: 2001-11-23 15:01:42 -5
User: ed
Warning:
  This is an error message
  for the log file
---
Time: 2001-11-23 15:02:31 -5
User: ed
Warning:
  A slightly different error
  message.
---
Date: 2001-11-23 15:03:17 -5
User: ed
Fatal:
  Unknown variable "bar"
Stack:
  - file: TopClass.py
    line: 23
    code: |
      x = MoreObject("345\n")
  - file: MoreClass.py
    line: 58
    code: |-
      foo = bar
Chapter 3. Processing YAML Information
YAML is both a text format and a method for [298]presenting any
[299]native data structure in this format. Therefore, this
specification defines two concepts: a class of data objects called YAML
[300]representations, and a syntax for [301]presenting YAML
[302]representations as a series of characters, called a YAML
[303]stream. A YAML processor is a tool for converting information
between these complementary views. It is assumed that a YAML processor
does its work on behalf of another module, called an application. This
chapter describes the information structures a YAML processor must
provide to or obtain from the application.
YAML information is used in two ways: for machine processing, and for
human consumption. The challenge of reconciling these two perspectives
is best done in three distinct translation stages: [304]representation,
[305]serialization, and [306]presentation. [307]Representation
addresses how YAML views [308]native data structures to achieve
portability between programming environments. [309]Serialization
concerns itself with turning a YAML [310]representation into a serial
form, that is, a form with sequential access constraints.
[311]Presentation deals with the formatting of a YAML
[312]serialization as a series of characters in a human-friendly
manner.
3.1. Processes
Translating between [313]native data structures and a character
[314]stream is done in several logically distinct stages, each with a
well defined input and output data model, as shown in the following
diagram:
Figure 3.1. Processing Overview
A YAML processor need not expose the [315]serialization or
[316]representation stages. It may translate directly between
[317]native data structures and a character [318]stream ([319]dump and
[320]load in the diagram above). However, such a direct translation
should take place so that the [321]native data structures are
[322]constructed only from information available in the
[323]representation. In particular, [324]mapping key order,
[325]comments, and [326]tag handles should not be referenced during
[327]composition.
3.1.1. Dump
Dumping native data structures to a character [328]stream is done using
the following three stages:
Representing Native Data Structures
YAML represents any native data structure using three [329]node
kinds: [330]sequence - an ordered series of entries;
[331]mapping - an unordered association of [332]unique [333]keys
to [334]values; and [335]scalar - any datum with opaque
structure presentable as a series of Unicode characters.
Combined, these primitives generate directed graph structures.
These primitives were chosen because they are both powerful and
familiar: the [336]sequence corresponds to a Perl array and a
Python list, the [337]mapping corresponds to a Perl hash table
and a Python dictionary. The [338]scalar represents strings,
integers, dates, and other atomic data types.
Each YAML [339]node requires, in addition to its [340]kind and
[341]content, a [342]tag specifying its data type. Type
specifiers are either [343]global URIs, or are [344]local in
scope to a single [345]application. For example, an integer is
represented in YAML with a [346]scalar plus the [347]global tag
“tag:yaml.org,2002:int”. Similarly, an invoice object,
particular to a given organization, could be represented as a
[348]mapping together with the [349]local tag “!invoice”. This
simple model can represent any data structure independent of
programming language.
Serializing the Representation Graph
For sequential access mediums, such as an event callback API, a
YAML [350]representation must be serialized to an ordered tree.
Since in a YAML [351]representation, [352]mapping keys are
unordered and [353]nodes may be referenced more than once (have
more than one incoming “arrow”), the serialization process is
required to impose an [354]ordering on the [355]mapping keys and
to replace the second and subsequent references to a given
[356]node with place holders called [357]aliases. YAML does not
specify how these serialization details are chosen. It is up to
the YAML [358]processor to come up with human-friendly [359]key
order and [360]anchor names, possibly with the help of the
[361]application. The result of this process, a YAML
[362]serialization tree, can then be traversed to produce a
series of event calls for one-pass processing of YAML data.
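The serialization step described above can be sketched as follows. This is one possible approach under stated assumptions, not the behavior of any particular processor: the `Node` class, the event tuples and the `a1`, `a2`, ... anchor names are all invented for the example, and mapping keys are assumed to be scalars so that they can be ordered by their content.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(eq=False)
class Node:
    kind: str     # "scalar", "sequence" or "mapping"
    tag: str
    content: Any  # str, list of nodes, or list of (key, value) pairs


def children(node):
    if node.kind == "sequence":
        return list(node.content)
    if node.kind == "mapping":
        return [n for pair in node.content for n in pair]
    return []


def count_refs(root):
    """Count incoming references per node, descending into each node once."""
    counts, stack = {}, [root]
    while stack:
        node = stack.pop()
        counts[id(node)] = counts.get(id(node), 0) + 1
        if counts[id(node)] == 1:
            stack.extend(children(node))
    return counts


def serialize(root):
    """Flatten a node graph into an ordered list of events."""
    counts, anchors, events = count_refs(root), {}, []

    def visit(node):
        if id(node) in anchors:
            # Second or later reference: emit a placeholder instead.
            events.append(("alias", anchors[id(node)]))
            return
        anchor = None
        if counts[id(node)] > 1:
            # Register the anchor before descending so cycles terminate.
            anchor = anchors[id(node)] = f"a{len(anchors) + 1}"
        if node.kind == "scalar":
            events.append(("scalar", anchor, node.tag, node.content))
        elif node.kind == "sequence":
            events.append(("sequence-start", anchor, node.tag))
            for item in node.content:
                visit(item)
            events.append(("sequence-end",))
        else:
            events.append(("mapping-start", anchor, node.tag))
            # Key order is this sketch's own choice: sort by key content.
            for key, value in sorted(node.content, key=lambda kv: kv[0].content):
                visit(key)
                visit(value)
            events.append(("mapping-end",))

    visit(root)
    return events
```

Sorting keys by content is only one way to impose an order; a processor could equally preserve insertion order or ask the application.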
Presenting the Serialization Tree
The final output process is presenting the YAML
[363]serializations as a character [364]stream in a
human-friendly manner. To maximize human readability, YAML
offers a rich set of stylistic options which go far beyond the
minimal functional needs of simple data storage. Therefore the
YAML [365]processor is required to introduce various
presentation details when creating the [366]stream, such as the
choice of [367]node styles, how to [368]format scalar content,
the amount of [369]indentation, which [370]tag handles to use,
the [371]node tags to leave [372]unspecified, the set of
[373]directives to provide and possibly even what [374]comments
to add. While some of this can be done with the help of the
[375]application, in general this process should be guided by
the preferences of the user.
3.1.2. Load
Loading [376]native data structures from a character [377]stream is
done using the following three stages:
Parsing the Presentation Stream
Parsing is the inverse process of [378]presentation; it takes a
[379]stream of characters and produces a series of events.
Parsing discards all the [380]details introduced in the
[381]presentation process, reporting only the [382]serialization
events. Parsing can fail due to [383]ill-formed input.
Composing the Representation Graph
Composing takes a series of [384]serialization events and
produces a [385]representation graph. Composing discards all the
[386]details introduced in the [387]serialization process,
producing only the [388]representation graph. Composing can fail
due to any of several reasons, detailed [389]below.
Constructing Native Data Structures
The final input process is constructing [390]native data
structures from the YAML [391]representation. Construction must
be based only on the information available in the
[392]representation, and not on additional [393]serialization or
[394]presentation details such as [395]comments,
[396]directives, [397]mapping key order, [398]node styles,
[399]scalar content format, [400]indentation levels, etc.
Construction can fail due to the [401]unavailability of the
required [402]native data types.
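The last two load stages can be sketched in Python as shown below. The sketch assumes parsing has already produced a flat list of event tuples; that event format, the `Node` class and the `compose` and `construct` functions are hypothetical names for this example. Construction here supports only four tags and decimal integers, and it raises an error for any other tag to mimic the unavailability of a native data type.

```python
from dataclasses import dataclass
from typing import Any

INT = "tag:yaml.org,2002:int"
STR = "tag:yaml.org,2002:str"
SEQ = "tag:yaml.org,2002:seq"
MAP = "tag:yaml.org,2002:map"


@dataclass(eq=False)
class Node:
    kind: str
    tag: str
    content: Any


def compose(events):
    """Build a node graph from events, resolving aliases to shared nodes."""
    anchors, pos = {}, 0

    def next_node():
        nonlocal pos
        event = events[pos]
        pos += 1
        name = event[0]
        if name == "alias":
            if event[1] not in anchors:
                raise ValueError(f"undefined anchor: {event[1]}")
            return anchors[event[1]]
        anchor, tag = event[1], event[2]
        kind = "scalar" if name == "scalar" else name.split("-")[0]
        node = Node(kind, tag, event[3] if kind == "scalar" else [])
        if anchor is not None:
            anchors[anchor] = node  # registered early so cycles resolve
        if kind == "sequence":
            while events[pos][0] != "sequence-end":
                node.content.append(next_node())
            pos += 1  # skip the end event
        elif kind == "mapping":
            while events[pos][0] != "mapping-end":
                key = next_node()
                node.content.append((key, next_node()))
            pos += 1  # skip the end event
        return node

    return next_node()


def construct(node, memo=None):
    """Build native values using only kind, tag and content."""
    memo = {} if memo is None else memo
    if id(node) in memo:
        return memo[id(node)]
    if node.tag == INT:
        result = int(node.content)
    elif node.tag == STR:
        result = node.content
    elif node.tag == SEQ:
        result = memo[id(node)] = []
        for item in node.content:
            result.append(construct(item, memo))
    elif node.tag == MAP:
        result = memo[id(node)] = {}
        for key, value in node.content:
            result[construct(key, memo)] = construct(value, memo)
    else:
        raise LookupError(f"no native type available for tag {node.tag}")
    memo[id(node)] = result
    return result
```

Note that `construct` never sees anchors, event order or styles: once `compose` has returned, those details are gone, which is the separation the text above requires.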
3.2. Information Models
This section specifies the formal details of the results of the above
processes. To maximize data portability between programming languages
and implementations, users of YAML should be mindful of the distinction
between [403]serialization or [404]presentation properties and those
which are part of the YAML [405]representation. Thus, while imposing an
[406]order on [407]mapping keys is necessary for flattening YAML
[408]representations to a sequential access medium, this
[409]serialization detail must not be used to convey [410]application
level information. In a similar manner, while the [411]indentation
technique and the choice of [412]node style are needed for human
readability, these [413]presentation details are neither part of the
YAML [414]serialization nor the YAML [415]representation. By carefully
separating properties needed for [416]serialization and
[417]presentation, YAML [418]representations of [419]application
information will be consistent and portable between various programming
environments.
The following diagram summarizes the three information models. Full
arrows denote composition, hollow arrows denote inheritance, “1” and
“*” denote “one” and “many” relationships. A single “+” denotes
[420]serialization details; a double “++” denotes [421]presentation
details.
Figure 3.2. Information Models
3.2.1. Representation Graph
YAML’s representation of a [422]native data structure is a rooted,
connected, directed graph of [423]tagged [424]nodes. By “directed
graph” we mean a set of [425]nodes and directed edges (“arrows”), where
each edge connects one [426]node to another (see [427]a formal
definition). All the [428]nodes must be reachable from the root node
via such edges. Note that the YAML graph may include cycles, and a
[429]node may have more than one incoming edge.
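To make the graph properties concrete, the following Python sketch builds two small graphs, one with a node that has two incoming edges and one containing a cycle, and checks each for cycles with a depth-first search. The `Node` class and the `children` and `has_cycle` helpers are hypothetical names for this illustration only.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(eq=False)
class Node:
    kind: str     # "scalar", "sequence" or "mapping"
    tag: str
    content: Any  # str, list of nodes, or list of (key, value) pairs


def children(node):
    """Nodes at the far end of this node's outgoing edges."""
    if node.kind == "sequence":
        return list(node.content)
    if node.kind == "mapping":
        return [n for pair in node.content for n in pair]
    return []


def has_cycle(root):
    """Depth-first search that reports an edge back to a node on the current path."""
    on_path, done = set(), set()

    def visit(node):
        if id(node) in on_path:
            return True   # reached an ancestor again: a cycle
        if id(node) in done:
            return False  # fully explored earlier, e.g. a shared node
        on_path.add(id(node))
        found = any(visit(child) for child in children(node))
        on_path.discard(id(node))
        done.add(id(node))
        return found

    return visit(root)


STR = "tag:yaml.org,2002:str"
SEQ = "tag:yaml.org,2002:seq"

# One scalar with two incoming edges: a shared node, but no cycle.
shared = Node("scalar", STR, "x")
dag = Node("sequence", SEQ, [shared, shared])

# A sequence that contains itself: a cycle through the root.
loop = Node("sequence", SEQ, [])
loop.content.append(loop)
```

Here `has_cycle(dag)` is false and `has_cycle(loop)` is true, and both graphs are valid representations under the definition above.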