The "or" (|) in BNF Grammar

Question:

I can’t seem to fully understand the application of the "or" in BNF Grammar which is denoted by the vertical bar symbol (|). A good example of what gets me confused is the description of string literals in The Python Language Reference. (I’ve deleted part of the description which is irrelevant to the question):

stringliteral   ::=  [stringprefix](shortstring | longstring)
shortstring     ::=  "'" shortstringitem* "'" | '"' shortstringitem* '"'
shortstringitem ::=  shortstringchar | stringescapeseq
shortstringchar ::=  <any source character except "\" or newline or the quote>
stringescapeseq ::=  "\" <any source character>

So, the way I understand the description of <shortstringitem> is that it can be <shortstringchar> OR <stringescapeseq>. Does this mean it cannot be both at the same time? If I am not mistaken, a single string may contain both at the same time… (For clarity, <shortstringchar>, as I understand it, is the text of my string.)

Thank you.

I searched the web, including Stack Overflow, and watched explanatory videos, but all of them seem to describe the "or" with something like:

<letter> ::= A|B|C|D|E...Y|Z.

They do not go any deeper than examples of this kind, so unfortunately this does not answer my question.

Asked By: user21488634


Answers:

One shortstringitem can only be one or the other. But a shortstring can consist of multiple shortstringitems, each of which is "expanded" independently.

Consider 'x\n', for example, which you could parse as

'x\n' -> stringliteral
       -> shortstring
       -> "'" shortstringitem shortstringitem "'"
       -> "'" shortstringchar stringescapeseq "'"
       -> "'" 'x' '\' 'n' "'"

The first shortstringitem is recognized as a shortstringchar, the second as a stringescapeseq.
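
To make the "each item is expanded independently" point concrete, here is a minimal Python sketch. It is not how CPython's tokenizer works; classify_items is a hypothetical helper written only for illustration. It handles just the shortstring rule quoted in the question (no stringprefix, no longstring) and labels each shortstringitem with the single alternative it matched.

```python
def classify_items(literal: str) -> list[tuple[str, str]]:
    """Split a short string literal into shortstringitems.

    Hypothetical illustration of the quoted grammar only: each item is
    labelled with exactly one alternative, shortstringchar OR stringescapeseq.
    """
    if len(literal) < 2 or literal[0] not in ("'", '"') or literal[-1] != literal[0]:
        raise ValueError("not a shortstring: missing matching quotes")
    quote = literal[0]
    body = literal[1:-1]  # this is the shortstringitem* part of the rule

    items = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # stringescapeseq ::= "\" <any source character>
            if i + 1 >= len(body):
                raise ValueError("backslash with nothing after it")
            items.append(("stringescapeseq", body[i:i + 2]))
            i += 2
        elif ch == "\n" or ch == quote:
            # shortstringchar excludes newline and the enclosing quote
            raise ValueError(f"character {ch!r} not allowed here")
        else:
            # shortstringchar: any other single source character
            items.append(("shortstringchar", ch))
            i += 1
    return items


# The raw string keeps the backslash, so the literal is the 5 characters ' x \ n '
for kind, text in classify_items(r"'x\n'"):
    print(kind, repr(text))
```

Every entry in the returned list carries one label, never both, yet the list as a whole mixes the two kinds. That is the same distinction as in the grammar: the | chooses per shortstringitem, and the * lets the choice be made again for the next item.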

Answered By: chepner