|
| 1 | +Gherkin Compatibility Mode |
| 2 | +========================== |
| 3 | + |
| 4 | +Behat uses the `behat/gherkin`_ library to parse your feature files into the data structures that |
| 5 | +Behat will use to execute them. |
| 6 | + |
| 7 | +In most cases, it parses feature files identically to `the official parsers provided by the Cucumber project`_. |
| 8 | +However, our parser has traditionally treated some specific syntax slightly differently from |
| 9 | +the official parsers. |
| 10 | + |
| 11 | +To resolve this, we have added a ``GherkinCompatibilityMode`` setting to the parser. This setting |
| 12 | +has two possible options: |
| 13 | + |
| 14 | +* ``GherkinCompatibilityMode::LEGACY`` - match our previous behaviour. This is the default in Behat 3.x. |
| 15 | +* ``GherkinCompatibilityMode::GHERKIN_32`` - match the official parsers. This will become the default in Behat 4.0. |
| 16 | + |
| 17 | +.. caution:: |
| 18 | + ``GherkinCompatibilityMode::GHERKIN_32`` is currently considered experimental. We expect that |
| 19 | + there will be more changes to how the parser behaves in this mode before we mark it as stable. |
| 20 | + |
| 21 | +Configuring the parser mode |
| 22 | +--------------------------- |
| 23 | + |
| 24 | +In Behat >= 3.30, you can specify the parser compatibility mode for your project in |
| 25 | +your :doc:`/user_guide/configuration`: |
| 26 | + |
| 27 | +.. code-block:: php |
| 28 | +
|
| 29 | +    <?php |
|  | +    use Behat\Config\Config; |
| 30 | +    use Behat\Config\GherkinOptions; |
| 31 | +    use Behat\Config\Profile; |
| 32 | +    use Behat\Gherkin\GherkinCompatibilityMode; |
| 33 | +
|
| 34 | +    return (new Config()) |
| 35 | +        ->withProfile((new Profile('default')) |
| 36 | +            ->withGherkinOptions((new GherkinOptions()) |
| 37 | +                ->withCompatibilityMode(GherkinCompatibilityMode::GHERKIN_32) |
| 38 | +            ) |
| 39 | +        ) |
| 40 | +    ; |
| 41 | +
|
| 42 | +Differences between parser modes |
| 43 | +-------------------------------- |
| 44 | + |
| 45 | +Tables |
| 46 | +~~~~~~ |
| 47 | + |
| 48 | +In ``GHERKIN_32`` mode, table cells can include escaped newlines (``\n``), which will be unescaped during parsing. Note that |
| 49 | +newlines are unescaped **after** we remove the cell padding. |
| 50 | + |
| 51 | +For example, with the following table: |
| 52 | + |
| 53 | +.. code-block:: gherkin |
| 54 | +
|
| 55 | + Given 3 lines of poetry on 5 lines: |
| 56 | + | \nraindrops--\nher last kiss\ngoodbye.\n | |
| 57 | +
|
| 58 | +In ``GHERKIN_32`` mode, the table will parse as: |
| 59 | + |
| 60 | +.. code-block:: php |
| 61 | +
|
| 62 | + [ |
| 63 | + [ |
| 64 | + <<<TEXT |
| 65 | +
|
| 66 | + raindrops-- |
| 67 | + her last kiss |
| 68 | + goodbye. |
| 69 | +
|
| 70 | + TEXT |
| 71 | + ] |
| 72 | + ] |
| 73 | +
|
| 74 | +In legacy mode, the escape sequences are left as literal text, so this would be parsed as ``'\nraindrops--\nher last kiss\ngoodbye.\n'``. |
| 75 | +
|
| 76 | +The other difference is in how the parser trims padding of table cells: |
| 77 | +
|
| 78 | +* In ``GHERKIN_32`` mode, all leading and trailing whitespace, including tabs and Unicode whitespace, is removed. |
| 79 | +* In ``LEGACY`` mode, only literal space characters are removed. |
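|  | + |
|  | +As a rough illustration of the two trimming rules, the sketch below approximates them with plain PHP string |
|  | +functions. It is not the parser's actual implementation: ``trimCellLegacy()`` and ``trimCellGherkin32()`` are |
|  | +hypothetical names, and the regular expressions are only an assumption about what counts as whitespace. |
|  | + |
|  | +.. code-block:: php |
|  | + |
|  | +    <?php |
|  | +    // Hypothetical helpers, for illustration only. |
|  | + |
|  | +    // LEGACY: strip only literal space characters. |
|  | +    function trimCellLegacy(string $cell): string |
|  | +    { |
|  | +        return trim($cell, ' '); |
|  | +    } |
|  | + |
|  | +    // GHERKIN_32: strip all leading and trailing whitespace, including tabs and |
|  | +    // Unicode separators (approximated here with \s and \p{Z} in UTF-8 mode). |
|  | +    function trimCellGherkin32(string $cell): string |
|  | +    { |
|  | +        return preg_replace(['/^[\s\p{Z}]+/u', '/[\s\p{Z}]+$/u'], '', $cell) ?? $cell; |
|  | +    } |
|  | + |
|  | +    $cell = "\t value\u{00A0} "; // tab, space, text, no-break space, space |
|  | + |
|  | +    var_dump(trimCellLegacy($cell) === "\t value\u{00A0}"); // bool(true) |
|  | +    var_dump(trimCellGherkin32($cell) === 'value');         // bool(true) |
|  | + |
|  | +With this input, the legacy-style trim removes only the final space, because the tab and the no-break space |
|  | +are not literal space characters. |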
| 80 | +
|
| 81 | +
|
| 82 | +Docstrings |
| 83 | +~~~~~~~~~~ |
| 84 | +
|
| 85 | +Docstrings (which Behat has historically referred to as PyStrings) in feature files can contain escaped delimiters - |
| 86 | +for example: |
| 87 | +
|
| 88 | +.. code-block:: gherkin |
| 89 | +
|
| 90 | + And a DocString with escaped separator inside |
| 91 | + """ |
| 92 | + first line |
| 93 | + \"\"\" |
| 94 | + third line |
| 95 | + """ |
| 96 | +
|
| 97 | +In ``GHERKIN_32`` mode, the parser will unescape the delimiters - e.g. this will be parsed as: |
| 98 | +
|
| 99 | +.. code-block:: text |
| 100 | +
|
| 101 | + first line |
| 102 | + """ |
| 103 | + third line |
| 104 | +
|
| 105 | +In legacy mode, the parsed string is not unescaped - e.g. it includes the literal ``\"\"\"`` text. |
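|  | + |
|  | +If your step definitions currently unescape the delimiters themselves, a helper along the following lines may |
|  | +show the transformation that ``GHERKIN_32`` mode now performs for you. This is a minimal sketch, not the |
|  | +parser's own code, and ``unescapeDocstringDelimiters()`` is a hypothetical name: |
|  | + |
|  | +.. code-block:: php |
|  | + |
|  | +    <?php |
|  | +    // Hypothetical helper: approximates the GHERKIN_32 behaviour for a |
|  | +    // docstring delimited by three double quotes. |
|  | +    function unescapeDocstringDelimiters(string $content): string |
|  | +    { |
|  | +        // Replace each escaped delimiter (\"\"\") with a plain one ("""). |
|  | +        return str_replace('\\"\\"\\"', '"""', $content); |
|  | +    } |
|  | + |
|  | +    // The docstring text as legacy mode would return it. |
|  | +    $legacy = "first line\n\\\"\\\"\\\"\nthird line"; |
|  | + |
|  | +    var_dump(unescapeDocstringDelimiters($legacy) === "first line\n\"\"\"\nthird line"); // bool(true) |
|  | + |
|  | +Applying such a helper to text parsed in ``GHERKIN_32`` mode is harmless, because the escaped sequence no |
|  | +longer appears in it. |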
| 106 | +
|
| 107 | +Tags |
| 108 | +~~~~ |
| 109 | +
|
| 110 | +In ``GHERKIN_32`` mode: |
| 111 | +
|
| 112 | +* Parsing fails if any tags contain whitespace (e.g. ``@some tag``). In legacy mode, these have triggered |
| 113 | + an ``E_USER_DEPRECATED`` since behat/gherkin v4.9.0. |
| 114 | +* The values returned by ``$node->getTags()`` will **include** the ``@`` prefix. In legacy mode, |
| 115 | + this was removed. This may affect custom hooks / event listeners that inspect the tag values at |
| 116 | + runtime. |
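|  | + |
|  | +Hooks or listeners that compare tag values could normalise them first, so that the same code works in both |
|  | +modes. The following is a minimal sketch under that assumption; ``normaliseTag()`` and ``hasTag()`` are |
|  | +hypothetical helpers, and the arrays stand in for the values returned by ``$node->getTags()``: |
|  | + |
|  | +.. code-block:: php |
|  | + |
|  | +    <?php |
|  | +    // Hypothetical helper: return the tag name without any leading "@". |
|  | +    function normaliseTag(string $tag): string |
|  | +    { |
|  | +        return ltrim($tag, '@'); |
|  | +    } |
|  | + |
|  | +    // Hypothetical helper: check for a tag regardless of the "@" prefix. |
|  | +    function hasTag(array $tags, string $wanted): bool |
|  | +    { |
|  | +        return in_array(normaliseTag($wanted), array_map('normaliseTag', $tags), true); |
|  | +    } |
|  | + |
|  | +    var_dump(hasTag(['@javascript', '@slow'], 'javascript')); // GHERKIN_32-style values: bool(true) |
|  | +    var_dump(hasTag(['javascript', 'slow'], 'javascript'));   // LEGACY-style values: bool(true) |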
| 117 | +
|
| 118 | +
|
| 119 | +File language |
| 120 | +~~~~~~~~~~~~~ |
| 121 | +
|
| 122 | +In ``GHERKIN_32`` mode, if a file includes a ``#language`` annotation: |
| 123 | +
|
| 124 | +* Any whitespace in / around the annotation will be ignored - so ``# language : fr`` will be |
| 125 | + recognised as a valid language annotation. In legacy mode, this would have been treated as a comment. |
| 126 | +* Parsing fails if the language is not recognised - so ``#language: no-such`` will cause an error. |
| 127 | + In legacy mode, this would have been ignored and parsing would continue in the default language. |
| 128 | +
|
| 129 | +Whitespace following step keywords |
| 130 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 131 | +
|
| 132 | +In ``GHERKIN_32`` mode, a space between a step keyword and the rest of the text is treated as part of the keyword. This |
| 133 | +is because in a small number of languages there is no space after the keyword. |
| 134 | +
|
| 135 | +With a step in English like ``Then something should happen``, if you call ``StepNode::getKeyword()`` then: |
| 136 | +
|
| 137 | +* In ``GHERKIN_32`` mode the return value will be ``'Then '`` |
| 138 | +* In ``LEGACY`` mode the return value will be ``'Then'`` |
| 139 | +
|
| 140 | +In a language that does not place spaces after the keyword (e.g. Japanese), the return value will be the same in both |
| 141 | +modes. |
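|  | + |
|  | +Code that compares or prints the keyword (for example a custom formatter) may want to strip the trailing |
|  | +space. A minimal sketch, where ``normaliseKeyword()`` is a hypothetical helper and the string arguments stand |
|  | +in for the value returned by ``StepNode::getKeyword()``: |
|  | + |
|  | +.. code-block:: php |
|  | + |
|  | +    <?php |
|  | +    // Hypothetical helper: remove the trailing space that GHERKIN_32 mode |
|  | +    // keeps as part of the keyword. |
|  | +    function normaliseKeyword(string $keyword): string |
|  | +    { |
|  | +        return rtrim($keyword, ' '); |
|  | +    } |
|  | + |
|  | +    var_dump(normaliseKeyword('Then ') === 'Then'); // GHERKIN_32-style value: bool(true) |
|  | +    var_dump(normaliseKeyword('Then') === 'Then');  // LEGACY-style value: bool(true) |
|  | + |
|  | +Note that stripping the space discards the information about whether the language separates the keyword from |
|  | +the step text, so this is only suitable where that distinction does not matter. |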
| 142 | +
|
| 143 | +Elements with descriptions |
| 144 | +~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 145 | +
|
| 146 | +Gherkin syntax allows multi-line descriptions on ``Feature:``, ``Background:``, ``Scenario:``, ``Scenario Outline:``, |
| 147 | +and ``Examples:`` elements. |
| 148 | +
|
| 149 | +Historically, we only parsed the description separately for a ``Feature`` node. For other nodes, we parsed the full |
| 150 | +text as a multi-line title. |
| 151 | +
|
| 152 | +In ``GHERKIN_32`` mode, if one of the elements listed above has multi-line text, then: |
| 153 | +
|
| 154 | +* The first line (containing the keyword) will be parsed as the title. |
| 155 | +* Following lines will be parsed as the description. |
| 156 | +* Any blank lines between the title and description will be ignored (in legacy mode, these were included at the start of |
| 157 | + the description). |
| 158 | +* Any left padding will be removed from the first line of the description, but subsequent lines will have the same |
| 159 | + left padding / indentation as the feature file. In legacy mode, we attempted to left-trim all lines to match the |
| 160 | + indentation of the keyword. |
| 161 | +
|
| 162 | +
|
| 163 | +.. _`behat/gherkin`: https://github.com/Behat/Gherkin |
| 164 | +.. _`the official parsers provided by the Cucumber project`: https://github.com/cucumber/gherkin |