From 3abbe4b5c683d247407050eaecaa70a2410c6b46 Mon Sep 17 00:00:00 2001
From: Kevin Day
Date: Sun, 5 Mar 2023 21:48:22 -0600
Subject: [PATCH] Update: Further clarify quoting in FSS specifications.

I looked at the code and realized I should instead favor the "typo" behavior.
Process until the end of the line rather than break up the Object due to the quotes.
This means that if the quote is unterminated, then the rest of the line is considered the Object.

Also document where Content utilizes the same rules.

This invalidates some of the description in the commit 6b1720990df42b0024373776f41037b9331cc3cc.
The two examples are now as follows.

Example Object and Content pair following FSS-0001:
  Object "Content".

The Content would be:
  "Content".

This behavior effectively preserves the period and also retains the quotes.

Another example (FSS-0001):
  "Object 1" "Content 1" Content2 and_3
  "Object 2": Content.

This second row shows the "typo is favored" concept.
The second row has no Content and instead has the following as the Object:
  "Object 2": Content.

Note how the quotes are kept when this situation happens.

I also updated the word "newline", replacing it with the words "new" and "line".

I noticed and fixed a mistake in the logic due to this refactor and a similar previous refactor.
--- specifications/fss-0000.txt | 20 +++++++++++--------- specifications/fss-0001.txt | 15 +++++++++------ specifications/fss-0002.txt | 14 +++++++------- specifications/fss-0003.txt | 26 +++++++++++++------------- specifications/fss-0004.txt | 4 ++-- specifications/fss-0005.txt | 20 +++++++++++--------- specifications/fss-0006.txt | 4 ++-- specifications/fss-0007.txt | 20 +++++++++++--------- specifications/fss-0009.txt | 18 ++++++++++-------- specifications/fss-000a.txt | 11 +++++++---- 10 files changed, 83 insertions(+), 69 deletions(-) diff --git a/specifications/fss-0000.txt b/specifications/fss-0000.txt index e00e36d..2f81518 100644 --- a/specifications/fss-0000.txt +++ b/specifications/fss-0000.txt @@ -12,24 +12,26 @@ Featureless Settings Specification: 0000 - Basic: Each Object starts at the beginning of a line and white space to the left of the Object is not treated as part of the object. White space separates an Object from the Content. - An Object may be preceded by a newline, in which case means that the Object has no Content. + An Object may be preceded by a new line character, in which case means that the Object has no Content. If only printing white spaces or non-printable characters follow a valid Object, then that Object is considered to have no Content. An Object may be quoted to include white space where a single quote character:"'" (unicode:"U+0027"), a double quote character:'"' (unicode:"U+0022"), or a backtick character:'`' (unicode:"U+0060") are used to quote. An Object is only considered quoted if the first and last character of the Object are the same quote. Any quote characters in a non-quoted Object are treated as part of the Object rather than as a quote. + An Object that properly starts with a quote character but is not properly terminated before the new line is reached is considered to be an Object terminating at the end of the line. 
+ A quoted Objected terminating at the new line in this way preserves the quotes as part of the Object. Content exists on the same line as the Object. - Content is represented as a single Content column terminated by a newline. - Content column consists of everything following the first non-white space character until the newline. - Content column includes trailing white space before newline is reached. + Content is represented as a single Content column terminated by a new line. + Content column consists of everything following the first non-white space character until the new line. + Content column includes trailing white space before new line is reached. Content column does not include any of the leading white space. No delimits are supported in the Content. Key\: - code:"\s" = White space, except newline. - code:"\b" = Either white space or printable, except newline. + code:"\s" = White space, except new line. + code:"\b" = Either white space or printable, except new line. code:"\q" = Non-white space or quoted white space (and non-white space) with no white space outside of the quotes. - code:"\n" = Newline. + code:"\n" = New line. code:"*" = Zero or more occurrences. code:"+" = One or more occurrences. @@ -45,7 +47,7 @@ Featureless Settings Specification: 0000 - Basic: Example\: # fss-0000 # valid comments are ignored. - "The Object" Content until newline. + "The Object" Content until new line. Second object set. Object would be\: @@ -53,5 +55,5 @@ Featureless Settings Specification: 0000 - Basic: 2) Second Content would be\: - 1.1) Content until newline. + 1.1) Content until new line. 2.1) object set. diff --git a/specifications/fss-0001.txt b/specifications/fss-0001.txt index a366605..3046f47 100644 --- a/specifications/fss-0001.txt +++ b/specifications/fss-0001.txt @@ -12,22 +12,25 @@ Featureless Settings Specification: 0001 - Extended: Each Object starts at the beginning of a line and white space to the left of the Object is not treated as an object. 
White space separates an Object from the Content. - An Object may be followed by a newline, in which case means that the Object has no Content. + An Object may be followed by a new line, in which case means that the Object has no Content. If only printing white spaces or non-printable characters follow a valid Object, then that Object is considered to have no Content. An Object may be quoted to include white space where a single quote character:"'" (unicode:"U+0027"), a double quote character:'"' (unicode:"U+0022"), or a backtick character:'`' (unicode:"U+0060") are used to quote. An Object is only considered quoted if the first and last character of the Object are the same quote. Any quote characters in a non-quoted Object are treated as part of the Object rather than as a quote. + An Object that properly starts with a quote character but is not properly terminated before the new line is reached is considered to be an Object terminating at the end of the line. + A quoted Objected terminating at the new line in this way preserves the quotes as part of the Object. Content exists on the same line as the Object. Content is represented as multiple Content columns. - Content columns are white space separated parts within the Content and terminated by a newline. - Any number of Content columns may exist in the Content until the newline is reached. + Content columns are white space separated parts within the Content and terminated by a new line. + Any number of Content columns may exist in the Content until the new line is reached. + Content follows the same quoting rules as an Object. Key\: - code:"\s" = White space, except newline. - code:"\b" = Either white space or printable, except newline. + code:"\s" = White space, except new line. + code:"\b" = Either white space or printable, except new line. code:"\q" = Non-white space or quoted white space (and non-white space) with no white space outside of the quotes. - code:"\n" = Newline. + code:"\n" = New line. 
code:"*" = Zero or more occurrences. code:"+" = One or more occurrences. code:"()*" = Grouping that repeats zero or more times. diff --git a/specifications/fss-0002.txt b/specifications/fss-0002.txt index 550e9f5..79a55dc 100644 --- a/specifications/fss-0002.txt +++ b/specifications/fss-0002.txt @@ -11,23 +11,23 @@ Featureless Settings Specification: 0002 - Basic List: Each Object starts at the beginning of a line and white space to the left of the Object is not treated as an object. - A colon character:":" (unicode:"U+003A") followed by any white space until a newline terminates a valid Object. + A colon character:":" (unicode:"U+003A") followed by any white space until a new line terminates a valid Object. Non-white space printable characters may not follow the colon of a valid Object. Content is represented as a single Content column of every line following a valid object until the end of file (or string) or until the next valid Object is found. Any Content that could be interpreted as a valid Object must have the colon delimited. There is no single-quote, double-quote, or backtick delimitation in this specification. - Only the colon that would result in a valid Object can be delimited. + Only a colon character:":" (unicode:"U+003A") that would result in a valid Object can be delimited. Empty Objects are allowed, that is, the length of the object may be zero. Key\: - code:"\s" = White space, except newline. + code:"\s" = White space, except new line. code:"\o" = Any printable character, except unescaped character:":" (unicode:"U+003A"). code:"\l" = Any printable character or white space, except unescaped character:":" (unicode:"U+003A"). - code:"\c" = Either white space or printable, including newline, that not interpretable as an Object. - code:"\n" = Newline. + code:"\c" = Either white space or printable, including new line, that not interpretable as an Object. + code:"\n" = New line. code:"*" = Zero or more occurrences. 
Before Structure\: @@ -48,7 +48,7 @@ Featureless Settings Specification: 0002 - Basic List: This Does\\\: Second\: Continues until EOS/EOF. - All white space, including newline (and leading white space) is "part of content." + All white space, including new line (and leading white space) is "part of content." # Valid comments are still ignored. Objects would be\: @@ -60,4 +60,4 @@ Featureless Settings Specification: 0002 - Basic List: This: does not need to be delimited. This Does\: 2.1) Continues until EOS/EOF. - All white space, including newline (and leading white space) is "part of content." + All white space, including new line (and leading white space) is "part of content." diff --git a/specifications/fss-0003.txt b/specifications/fss-0003.txt index f8a436c..175e203 100644 --- a/specifications/fss-0003.txt +++ b/specifications/fss-0003.txt @@ -11,30 +11,30 @@ Featureless Settings Specification: 0003 - Extended List: Each Object starts at the beginning of a line and white space to the left of the Object is not treated as an object. - An open-brace character:"{" (unicode:"U+007B") followed by any white space until a newline terminates a possible valid Object. + An open-brace character:"{" (unicode:"U+007B") followed by any white space until a new line terminates a possible valid Object. An Object is not considered fully valid until a valid close-brace character:"}" (unicode:"U+007D") is found, designating the end of the Content. - Non-white space printable characters may not follow the open-brace of a valid Object. + Non-white space printable characters may not follow the open-brace character:"{" (unicode:"U+007B") of a valid Object. - Content is represented as a single Content column of every line following a valid object until the end of file (or string) or until a non-delimited close-brace character:"}". 
+ Content is represented as a single Content column of every line following a valid object until the end of file (or string) or until a non-delimited close-brace character:"}" (unicode:"U+007D"). Any Content column that could be interpreted as an end of Content must be delimited if it should be part of the Content. - White space may follow a valid close-brace but a terminating newline must be present to designate a valid end of Content. + White space may follow a valid close-brace character:"}" (unicode:"U+007D") but a terminating new line must be present to designate a valid end of Content. There is no single-quote, double-quote, or backtick delimitation in this specification. - Only the open-brace that would result in a valid Object or the close-brace that would terminate valid Content can be delimited. - When inside potentially valid Content (which follows a valid Object) the open-brace cannot be delimited because this standard is not-recursive. + Only an open-brace character:"{" (unicode:"U+007B") that would result in a valid Object or the close-brace character:"}" (unicode:"U+007D") that would terminate valid Content can be delimited. + When inside potentially valid Content (which follows a valid Object) the open-brace character:"{" (unicode:"U+007B") cannot be delimited because this standard is not-recursive. When not inside any potentially valid Content (that is, there is no previous unclosed Object), then the Object may be delimited. - Likewise, the close-brace may only be delimited if it is within any potentially valid Content. + Likewise, the close-brace character:"}" (unicode:"U+007D") may only be delimited if it is within any potentially valid Content. - Each delimit slash in a delimitable open-brace is treated as a potential delimit such that two slashes represents a single delimited slash (code:"\\{" would represent code:"\{"). 
- Only the first delimit slash in a delimitable close-brace is treated as a potential delimit (code:"\\\}" would represent code:"\\}"). + Each delimit slash in a delimitable open-brace character:"{" (unicode:"U+007B") is treated as a potential delimit such that two slashes represents a single delimited slash (code:"\\{" would represent code:"\{"). + Only the first delimit slash in a delimitable close-brace character:"}" (unicode:"U+007D") is treated as a potential delimit (code:"\\\}" would represent code:"\\}"). Empty Objects are allowed, that is, the length of the object may be zero. Key\: - code:"\s" = White space, except newline. + code:"\s" = White space, except new line. code:"\o" = Any printable character, except unescaped character:"{" (unicode:"U+007B"). code:"\l" = Any printable character or white space, except unescaped character:"}" (unicode:"U+007D"). - code:"\c" = Either white space or printable, including newline, that is not interpretable as an Object. + code:"\c" = Either white space or printable, including new line, that is not interpretable as an Object. code:"\n" = Newline. code:"*" = Zero or more occurrences. @@ -58,7 +58,7 @@ Featureless Settings Specification: 0003 - Extended List: Second { Continues until EOS/EOF. - All white space, including newline (and leading white space) is "part of content." + All white space, including new line (and leading white space) is "part of content." # Valid comments are still ignored. } @@ -71,4 +71,4 @@ Featureless Settings Specification: 0003 - Extended List: This: does not need to be delimited. } 2.1) Continues until EOS/EOF. - All white space, including newline (and leading white space) is "part of content." + All white space, including new line (and leading white space) is "part of content." 
diff --git a/specifications/fss-0004.txt b/specifications/fss-0004.txt index 4fbc452..7a95921 100644 --- a/specifications/fss-0004.txt +++ b/specifications/fss-0004.txt @@ -25,7 +25,7 @@ Featureless Settings Specification: 0004 - Very Basic List: This Does\\\: Second\: Continues until EOS/EOF. - All white space, including newline (and leading white space) is "part of content." + All white space, including new line (and leading white space) is "part of content." # Valid comments are still ignored. Outer Objects would be\: @@ -46,4 +46,4 @@ Featureless Settings Specification: 0004 - Very Basic List: 1.3.1) Does\: 2.1.1) until EOS/EOF. - 2.2.1) white space, including newline (and leading white space) is "part of content." + 2.2.1) white space, including new line (and leading white space) is "part of content." diff --git a/specifications/fss-0005.txt b/specifications/fss-0005.txt index 0a82867..ccf253e 100644 --- a/specifications/fss-0005.txt +++ b/specifications/fss-0005.txt @@ -25,7 +25,7 @@ Featureless Settings Specification: 0005 - Somewhat Basic List: This Does\\\: Second\: Continues until EOS/EOF. - All white space, including newline (and leading white space) is "part of content." + All white space, including new line (and leading white space) is "part of content." # Valid comments are still ignored. Outer Objects would be\: @@ -59,11 +59,13 @@ Featureless Settings Specification: 0005 - Somewhat Basic List: 2.1.1) until 2.1.2) EOS/EOF. - 2.2.1) white space, - 2.2.2) including - 2.2.3) newline - 2.2.4) (and - 2.2.5) leading - 2.2.6) white space) - 2.2.7) is - 2.2.8) part of content. + 2.2.1) white + 2.2.2) space, + 2.2.3) including + 2.2.4) new + 2.2.5) line + 2.2.6) (and + 2.2.7) leading + 2.2.8) white space) + 2.2.9) is + 2.2.10) part of content. 
diff --git a/specifications/fss-0006.txt b/specifications/fss-0006.txt index 5716431..e14caad 100644 --- a/specifications/fss-0006.txt +++ b/specifications/fss-0006.txt @@ -27,7 +27,7 @@ Featureless Settings Specification: 0006 - Somewhat Extended List: Second { Continues until EOS/EOF. - All white space, including newline (and leading white space) is "part of content." + All white space, including new line (and leading white space) is "part of content." # Valid comments are still ignored. } @@ -49,4 +49,4 @@ Featureless Settings Specification: 0006 - Somewhat Extended List: 1.3.1) 2.1.1) until EOS/EOF. - 2.2.1) white space, including newline (and leading white space) is "part of content." + 2.2.1) white space, including new line (and leading white space) is "part of content." diff --git a/specifications/fss-0007.txt b/specifications/fss-0007.txt index fd2dd0f..3feb558 100644 --- a/specifications/fss-0007.txt +++ b/specifications/fss-0007.txt @@ -27,7 +27,7 @@ Featureless Settings Specification: 0007 - Very Extended List: Second { Continues until EOS/EOF. - All white space, including newline (and leading white space) is "part of content." + All white space, including new line (and leading white space) is "part of content." # Valid comments are still ignored. } @@ -62,11 +62,13 @@ Featureless Settings Specification: 0007 - Very Extended List: 2.1.1) until 2.1.2) EOS/EOF. - 2.2.1) white space, - 2.2.2) including - 2.2.3) newline - 2.2.4) (and - 2.2.5) leading - 2.2.6) white space) - 2.2.7) is - 2.2.8) part of content. + 2.2.1) white + 2.2.2) space, + 2.2.3) including + 2.2.4) new + 2.2.5) line + 2.2.6) (and + 2.2.7) leading + 2.2.8) white space) + 2.2.9) is + 2.2.10) part of content. 
diff --git a/specifications/fss-0009.txt b/specifications/fss-0009.txt index d447578..8472fa1 100644 --- a/specifications/fss-0009.txt +++ b/specifications/fss-0009.txt @@ -14,24 +14,26 @@ Featureless Settings Specification: 0009 - Reverse Mapping: Each Object starts at the end of a line and white space to the left of the Object is not treated as part of the object. White space separates an Object from the Content. - An Object may be preceded by a newline, in which case means that the Object has no Content. + An Object may be preceded by a new line, in which case means that the Object has no Content. If only printing white spaces or non-printable characters precedes a valid Object, then that Object is considered to have no Content. An Object may be quoted to include white space where a single quote character:"'" (unicode:"U+0027"), a double quote character:'"' (unicode:"U+0022"), or a backtick character:'`' (unicode:"U+0060") are used to quote. An Object is only considered quoted if the first and last character of the Object are the same quote. Any quote characters in a non-quoted Object are treated as part of the Object rather than as a quote. + An Object that properly starts with a quote character but is not properly terminated before the new line is reached is considered to be an Object terminating at the end of the line. + A quoted Objected terminating at the new line in this way preserves the quotes as part of the Object. Content exists on the same line as the Object. - Content is represented as a single Content column that begins at a newline. + Content is represented as a single Content column that begins at a new line. Content column consists of everything following the first non-white space character at the start of the line until the Object is reached. - Content column includes trailing white space before newline is reached. + Content column includes trailing white space before new line is reached. 
Content column does not include any of the white space between the last non-white space character and the start of the Object. No delimits are supported in the Content. Key\: - code:"\s" = White space, except newline. - code:"\b" = Either white space or printable, except newline. + code:"\s" = White space, except new line. + code:"\b" = Either white space or printable, except new line. code:"\q" = Non-white space or quoted white space (and non-white space) with no white space outside of the quotes. - code:"\n" = Newline. + code:"\n" = New line. code:"*" = Zero or more occurrences. code:"+" = One or more occurrences. @@ -47,7 +49,7 @@ Featureless Settings Specification: 0009 - Reverse Mapping: Example\: # fss-0009 # valid comments are ignored. - Content from newline. "The Object" + Content from new line. "The Object" object set. Second Object would be\: @@ -55,5 +57,5 @@ Featureless Settings Specification: 0009 - Reverse Mapping: 2) Second Content would be\: - 1.1) Content from newline. + 1.1) Content from new line. 2.1) object set. diff --git a/specifications/fss-000a.txt b/specifications/fss-000a.txt index 2a30d66..c5984ac 100644 --- a/specifications/fss-000a.txt +++ b/specifications/fss-000a.txt @@ -14,22 +14,25 @@ Featureless Settings Specification: 000A - Extended Reverse Mapping: Each Object starts at the end of a line and white space to the left of the Object is not treated as an object. White space separates an Object from the Content. - An Object may be followed by a newline, in which case means that the Object has no Content. + An Object may be followed by a new line, in which case means that the Object has no Content. If only printing white spaces or non-printable characters follow a valid Object, then that Object is considered to have no Content. 
An Object may be quoted to include white space where a single quote character:"'" (unicode:"U+0027"), a double quote character:'"' (unicode:"U+0022"), or a backtick character:'`' (unicode:"U+0060") are used to quote. An Object is only considered quoted if the first and last character of the Object are the same quote. Any quote characters in a non-quoted Object are treated as part of the Object rather than as a quote. + An Object that properly starts with a quote character but is not properly terminated before the new line is reached is considered to be an Object terminating at the end of the line. + A quoted Objected terminating at the new line in this way preserves the quotes as part of the Object. Content exists on the same line as the Object. Content is represented as multiple Content columns. Content columns are white space separated parts within the Content is terminated by the start of the Object. Any number of Content columns may exist in the Content until the Object is reached. + Content follows the same quoting rules as an Object. Key\: - code:"\s" = White space, except newline. - code:"\b" = Either white space or printable, except newline. + code:"\s" = White space, except new line. + code:"\b" = Either white space or printable, except new line. code:"\q" = Non-white space or quoted white space (and non-white space) with no white space outside of the quotes. - code:"\n" = Newline. + code:"\n" = New line. code:"*" = Zero or more occurrences. code:"+" = One or more occurrences. code:"()*" = Grouping that repeats zero or more times. -- 1.8.3.1
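
The quoting rule this patch adds to FSS-0001 and FSS-000A can be sketched in Python as follows. This is an illustrative sketch only and is not taken from the project's code. The function name `split_columns` is hypothetical, backslash delimits are ignored, and the sketch assumes that a closing quote followed by a non-white space character leaves the column unterminated, which is inferred from the two examples in the commit message.

```python
# Hypothetical sketch of the "typo is favored" quoting rule (not the project's parser).
QUOTES = ("'", '"', "`")


def split_columns(line: str) -> list[str]:
    """Split one FSS-0001 style line into columns (the Object first, then Content)."""
    line = line.rstrip("\n")
    columns = []
    i, n = 0, len(line)

    while i < n:
        # Skip the white space that separates columns.
        if line[i].isspace():
            i += 1
            continue

        if line[i] in QUOTES:
            close = line.find(line[i], i + 1)

            # Assumption: a quote is properly terminated only when the matching
            # quote is followed by white space or by the end of the line.
            if close != -1 and (close + 1 == n or line[close + 1].isspace()):
                columns.append(line[i + 1:close])  # quotes are dropped
                i = close + 1
                continue

            # Unterminated quote: the rest of the line is one column and the
            # quotes are preserved as part of it.
            columns.append(line[i:])
            break

        # Unquoted column: runs until the next white space; any quote
        # characters inside it are ordinary characters.
        j = i
        while j < n and not line[j].isspace():
            j += 1
        columns.append(line[i:j])
        i = j

    return columns


if __name__ == "__main__":
    for row in ('Object "Content".',
                '"Object 1" "Content 1" Content2 and_3',
                '"Object 2": Content.'):
        print(split_columns(row))
```

Run against the rows from the commit message, the first row yields the Object `Object` with the single Content column `"Content".`, and the last row yields a single Object with no Content.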