Formats
From DKPLP Doc
Formats are part of pattern sets and decide how the captured groups from matched rows are rearranged.
BNF
The following BNF describes the syntax.
format ::= <group> <moreGroups>
group ::= "$" <number> <defaultValue>
| <moreCharacters>
| empty
moreGroups ::= "|" <group> <moreGroups>
| empty
defaultValue ::= "!" <character> <moreCharacters>
| empty
character ::= [^#|!]
| "#" [.]
moreCharacters ::= <character> <moreCharacters>
| empty
number ::= <digit> <moreDigits>
digit ::= [0-9]
moreDigits ::= <digit> <moreDigits>
| empty
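To make the grammar concrete, here is a minimal Python sketch of a parser for it. It is not the DKPLP implementation: the names `Part`, `split_unescaped`, `unescape` and `parse_format` are hypothetical, and it assumes that `#` followed by any character stands for that character taken literally.

```python
from typing import List, NamedTuple, Optional


class Part(NamedTuple):
    """One "|"-separated group of a format (hypothetical representation)."""
    group: Optional[int]  # capture group number, or None for literal text
    text: Optional[str]   # literal text, or the default value after "!"


def split_unescaped(s: str, delim: str) -> List[str]:
    """Split s on delim, keeping "#"-escaped characters untouched."""
    parts, current, i = [], [], 0
    while i < len(s):
        if s[i] == "#" and i + 1 < len(s):
            current.append(s[i:i + 2])  # keep the escape pair together
            i += 2
        elif s[i] == delim:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(s[i])
            i += 1
    parts.append("".join(current))
    return parts


def unescape(s: str) -> str:
    """Replace every "#x" pair with the plain character x."""
    out, i = [], 0
    while i < len(s):
        if s[i] == "#" and i + 1 < len(s):
            out.append(s[i + 1])
            i += 2
        else:
            out.append(s[i])
            i += 1
    return "".join(out)


def parse_format(fmt: str) -> List[Part]:
    """Parse a format string into its groups, following the BNF above."""
    parts = []
    for raw in split_unescaped(fmt, "|"):
        number, bang, default = raw[1:].partition("!")
        if raw.startswith("$") and number.isascii() and number.isdigit():
            if bang and not default:
                # The grammar requires at least one character after "!".
                raise ValueError(f"empty default value in {raw!r}")
            parts.append(Part(int(number), unescape(default) if bang else None))
        else:
            # Anything that is not "$<number>..." is treated as literal text.
            parts.append(Part(None, unescape(raw)))
    return parts


print(parse_format("$1|$2!foo"))
print(parse_format("a#|b|$2"))
```

The sketch is lenient about literal text that contains an unescaped `!`, which the grammar itself does not allow.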
Special characters
- $ - Signifies the start of a group.
- # - Escape character.
- | - Group delimiter.
- ! - Construct for optional values. The value that follows is used in case the group was not captured.
Examples
Text
Suppose the following regular expression is applied to the following string, and the captured groups are then processed through a format.
Regular expression
(.*?)\s(.*?)
String
Hello world
Captured groups
1: Hello 2: world
Formats and results
$1|$2     1: Hello 2: world
$2|$1     1: world 2: Hello
$1|foo    1: Hello 2: foo
$2        1: world
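The table above can be reproduced with a short Python sketch. It is an illustration only, not the DKPLP code: `apply_format` is a hypothetical helper that ignores `#` escapes and `!` defaults, and the sketch assumes the regular expression is matched against the whole string (hence `re.fullmatch`). With `re.search`, the lazy second group would capture an empty string instead of `world`.

```python
import re


def apply_format(fmt, groups):
    """Rearrange captured groups (keyed from 1) according to fmt.

    Simplified: no "#" escapes and no "!" default values.
    """
    result = {}
    for position, part in enumerate(fmt.split("|"), start=1):
        if part.startswith("$") and part[1:].isdigit():
            result[position] = groups[int(part[1:])]  # group reference
        else:
            result[position] = part  # literal text is passed through
    return result


match = re.fullmatch(r"(.*?)\s(.*?)", "Hello world")
groups = {i: g for i, g in enumerate(match.groups(), start=1)}
print(groups)  # {1: 'Hello', 2: 'world'}

for fmt in ["$1|$2", "$2|$1", "$1|foo", "$2"]:
    print(fmt, apply_format(fmt, groups))
```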
XML
Formats can also use optional constructs, which define a value to use in case an optional capture fails. At the moment they are only used for XML patterns. Otherwise the formats work in the same way as the text formats.
XML contents
<message>
<word1>Hello</word1>
</message>
XML pattern
message:word1<(.+)>"word2<(.+)>"?
Captured groups
1: Hello 2: null
Formats and results
$1|$2        Causes an error because $2 is not captured
$1|$2!foo    1: Hello 2: foo
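The fallback behaviour in these two rows can be sketched in Python as follows. This is a sketch under stated assumptions rather than the actual implementation: `resolve` is a hypothetical helper, a group that was not captured is represented as `None`, every part of the format is assumed to be a `$` group reference, and `#` escapes are not handled.

```python
def resolve(part, groups):
    """Resolve one format group such as "$2" or "$2!foo" against the captures."""
    ref, bang, default = part[1:].partition("!")
    value = groups.get(int(ref))
    if value is not None:
        return value  # the group was captured, so the default is ignored
    if bang:
        return default  # not captured: fall back to the value after "!"
    raise ValueError(f"group ${ref} was not captured and has no default")


groups = {1: "Hello", 2: None}  # word2 is missing from the XML

print([resolve(p, groups) for p in "$1|$2!foo".split("|")])  # ['Hello', 'foo']

try:
    [resolve(p, groups) for p in "$1|$2".split("|")]
except ValueError as err:
    print(err)  # group $2 was not captured and has no default
```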
