# General test format

This document defines the YAML format and structure used for Eth 2.0 testing.

## ToC

* [About](#about)
* [Glossary](#glossary)
* [Test format philosophy](#test-format-philosophy)
* [Test suite](#test-suite)
* [Config](#config)
* [Fork-timeline](#fork-timeline)
* [Config sourcing](#config-sourcing)
* [Test structure](#test-structure)
* [Common test-case properties](#common-test-case-properties)
* [Note for implementers](#note-for-implementers)

## About

Ethereum 2.0 uses YAML as the format for all cross-client tests. This document describes at a high level the general format to which all test files should conform.

### Test-case formats

The particular formats of specific types of tests (test suites) are defined in separate documents.

Test formats:
- [`bls`](./bls/README.md)
- [`operations`](./operations/README.md)
- [`shuffling`](./shuffling/README.md)
- [`ssz`](./ssz/README.md)
- More formats are planned; see the tracking issues for CI/testing.

## Glossary

- `generator`: a program that outputs one or more `suite` files.
  - A generator should only output one `type` of test.
  - A generator is free to output multiple `suite` files, optionally with different `handler`s.
- `type`: the specialization of one single `generator`.
- `suite`: a YAML file with:
  - a header: describes the `suite` and defines what the `suite` is for
  - a list of test cases
- `runner`: where a generator is a *"producer"*, this is the *"consumer"*.
  - A `runner` focuses on *only one* `type`, and each type has *only one* `runner`.
- `handler`: a `runner` may sometimes be too limited: you may have a `suite` with a specific focus that requires a different format.
  To facilitate this, you specify a `handler`, and the runner can deal with the format by using the specified handler.
  Using a `handler` in a `runner` is optional.
- `case`: a test case, an entry in the `test_cases` list of a `suite`. A case can be anything in general,
  but its format should be well-defined in the documentation corresponding to the `type` (and `handler`).
  A case has the exact same configuration and fork context as the other entries in the `test_cases` list of its `suite`.
- `forks_timeline`: a fork timeline definition, a YAML file containing a key for each fork name, with an epoch number as value.

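As a sketch, a parsed `forks_timeline` and the lookup of the fork active at a given epoch might look like the following; the fork names and epoch values here are hypothetical, not taken from any real preset:

```python
# Hypothetical forks_timeline, as it would be parsed from a timeline YAML file:
# a key per fork name, with an epoch number as value.
forks_timeline = {
    "phase0": 0,
    "phase1": 600,
}

def active_fork(timeline, epoch):
    """Return the name of the latest fork activated at or before `epoch`."""
    started = [(e, name) for name, e in timeline.items() if e <= epoch]
    # The fork with the highest activation epoch not exceeding `epoch` wins.
    return max(started)[1]

print(active_fork(forks_timeline, 5))    # phase0
print(active_fork(forks_timeline, 600))  # phase1
```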
## Test format philosophy

### Config design

After long discussion, the following types of configured constants were identified:
- Never changing: genesis data.
- Changing, but reliant on the old value: e.g. an epoch time may change, but if you want to do the conversion
  `(genesis data, timestamp) -> epoch number` you end up needing both constants.
- Changing, but kept around during the fork transition: finalization may take a while;
  e.g. a client has to deal with new deposits and old deposits at the same time. Another example may be economic constants.
- Additional, backwards compatible: new constants are introduced for later phases.
- Changing: there is a very small chance some constant may really be *replaced*.
  In this off-chance, it is likely better to include it as an additional variable,
  and some clients may simply stop supporting the old one if they do not want to sync from genesis.

Based on these types of changes, we model the config as a list of key-value pairs
 that only grows with every fork (they may change in development versions of forks, however; git manages this).
With this approach, configurations are backwards compatible (older clients ignore unknown variables) and easy to maintain.

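The grow-only, ignore-unknown behavior can be sketched as follows; `KNOWN_KEYS` and the constant names are illustrative, not part of any real preset:

```python
# Constants this (hypothetical) client version knows about.
KNOWN_KEYS = {"SHARD_COUNT", "SLOTS_PER_EPOCH"}

def load_config(pairs):
    """Keep known constants, ignore unknown ones added by later forks."""
    config = {}
    for key, value in pairs:
        if key in KNOWN_KEYS:
            config[key] = value
        # Unknown keys are ignored: an older client stays compatible
        # with a config that grew in a later fork.
    return config

parsed = [("SHARD_COUNT", 8), ("SLOTS_PER_EPOCH", 8), ("NEW_PHASE1_CONSTANT", 42)]
print(load_config(parsed))  # {'SHARD_COUNT': 8, 'SLOTS_PER_EPOCH': 8}
```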
### Fork config design

There are two types of fork data:
1) Timeline: when does a fork take place?
2) Coverage: which forks are covered by a test?

The first is useful to have as a separate definition: we prevent duplication, and can run with different presets
 (e.g. a fork timeline for a minimal local test, for a public testnet, or for mainnet).

The second does not affect the result of the tests; it just states what is covered by the tests,
 so that the right suites can be executed to see coverage for a certain fork.
For some types of tests, it may be beneficial to ensure a test runs exactly the same with any given fork "active".
Test formats can be explicit on the need to repeat a test with different forks being "active",
 but generally tests run only once.

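Selecting suites by coverage could be sketched like this; the suite headers and fork names are hypothetical:

```python
# Minimal suite headers, showing only the `forks` coverage field.
suites = [
    {"title": "A", "forks": ["phase0"]},
    {"title": "B", "forks": ["phase0", "phase1"]},
    {"title": "C", "forks": ["phase1"]},
]

def suites_covering(suites, fork):
    """Select the suites whose declared coverage includes `fork`."""
    return [s["title"] for s in suites if fork in s["forks"]]

print(suites_covering(suites, "phase1"))  # ['B', 'C']
```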
### Test completeness

Tests should be independent of any sync data. If one wants to run a test, the input data should be available from the YAML.
The aim is to provide clients with a well-defined scope of work to run a particular set of test suites.

- Clients that are complete are expected to contribute to testing, aiming for conformance with the spec and with other clients.
- Clients that are not complete in functionality can choose to ignore suites that use certain test runners, or specific handlers of these test runners.
- Clients that are on older versions can test their work based on older releases of the generated tests, and catch up with newer releases when possible.

## Test Suite

```
title: <string, short, one line> -- Display name for the test suite
summary: <string, average, 1-3 lines> -- Summarizes the test suite
forks_timeline: <string, reference to a fork definition file, without extension> -- Used to determine the forking timeline
forks: <list of strings> -- Defines the coverage. Test-runner code may decide to re-run with the different forks "activated", when applicable.
config: <string, reference to a config file, without extension> -- Used to determine which set of constants to run (possibly compile time) with
runner: <string, no spaces, python-like naming format> *MUST be consistent with folder structure*
handler: <string, no spaces, python-like naming format> *MUST be consistent with folder structure*

test_cases: <list, values being maps defining a test case each>
   ...
```

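For illustration, a parsed suite header for a `bls` runner might look like the following dict; all concrete values here are invented, and the consistency check mirrors the *MUST be consistent with folder structure* requirement:

```python
# A hypothetical parsed suite header; the field values are invented for illustration.
suite_header = {
    "title": "BLS signature verification",
    "summary": "Verify message signatures against given pubkeys and domains.",
    "forks_timeline": "testing",
    "forks": ["phase0"],
    "config": "minimal",
    "runner": "bls",          # MUST match the top-level runner folder name
    "handler": "verify_msg",  # MUST match the handler folder name
}

# A quick consistency check of the header against a suite file path:
path = "bls/verify_msg/verify_valid.yml"
runner_dir, handler_dir, _ = path.split("/")
assert runner_dir == suite_header["runner"]
assert handler_dir == suite_header["handler"]
```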
## Config

A configuration is a separate YAML file.
Separating configuration and tests aims to:
- Prevent duplication of configuration
- Make all tests easy to upgrade (e.g. when a new config constant is introduced)
- Clearly define which constants to use
- Make configurations shareable between clients, for cross-client short- or long-lived testnets
- Minimize the number of different constant permutations to compile as a client.
  Note: some clients prefer compile-time constants and optimizations.
  They should compile once for each configuration, and run the corresponding tests per build target.

The format is described in [`configs/constant_presets`](../../configs/constant_presets/README.md#format).

## Fork-timeline

A fork timeline is (preferably) loaded into a client as a configuration object, as opposed to the constants configuration:
 - we do not allocate or optimize any code based on epoch numbers;
 - when we transition from one fork to the other, it is preferred to stay online;
 - we may decide on an epoch number for a fork based on external events (e.g. an Eth1 log event),
    so a client should be able to activate a fork dynamically.

The format is described in [`configs/fork_timelines`](../../configs/fork_timelines/README.md#format).

## Config sourcing

The constants configurations are located in:

```
<specs repo root>/configs/constant_presets/<config name>.yaml
```

And copied by CI for testing purposes to:

```
<tests repo root>/configs/constant_presets/<config name>.yaml
```

The fork timelines are located in:

```
<specs repo root>/configs/fork_timelines/<timeline name>.yaml
```

And copied by CI for testing purposes to:

```
<tests repo root>/configs/fork_timelines/<timeline name>.yaml
```

## Test structure

To prevent parsing of hundreds of different YAML files to test a specific test type,
 or, even more specifically, just a handler, tests should be structured in the following nested form:

```
.                             <--- root of eth2.0 tests repository
├── bls                       <--- collection of handlers for a specific test-runner, example runner: "bls"
│   ├── verify_msg            <--- collection of test suites for a specific handler, example handler: "verify_msg". If there are no multiple handlers, use a dummy folder (e.g. "core"), and specify that in the YAML.
│   │   ├── verify_valid.yml    .
│   │   ├── special_cases.yml   . a list of test suites
│   │   ├── domains.yml         .
│   │   ├── invalid.yml         .
│   │   ...                   <--- more suite files (optional)
│   ...                       <--- more handlers
...                           <--- more test types
```

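Collecting the suite files for one runner/handler pair from this structure could be sketched with `pathlib`; the demonstration builds a throwaway copy of the example tree:

```python
import pathlib
import tempfile

def collect_suites(root, runner, handler):
    """Return the names of all suite YAML files for a runner/handler pair."""
    return sorted(p.name for p in pathlib.Path(root, runner, handler).glob("*.yml"))

# Demonstrate on a temporary copy of the structure shown above.
with tempfile.TemporaryDirectory() as root:
    handler_dir = pathlib.Path(root, "bls", "verify_msg")
    handler_dir.mkdir(parents=True)
    for name in ("verify_valid.yml", "invalid.yml"):
        (handler_dir / name).touch()
    print(collect_suites(root, "bls", "verify_msg"))  # ['invalid.yml', 'verify_valid.yml']
```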
## Common test-case properties

Some test-case formats share some common key-value pair patterns, which are documented here:

```
bls_setting: int     -- optional; can have 3 different values:
                            0: (default, applies if the key-value pair is absent) free to choose either BLS ON or OFF.
                                 Tests are generated with valid BLS data in this case,
                                 but there is no change of outcome when running the test with BLS ON or OFF.
                            1: known as "BLS required" -- if the test validity is strictly dependent on BLS being ON
                            2: known as "BLS ignored"  -- if the test validity is strictly dependent on BLS being OFF
```

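A runner's handling of `bls_setting` could be sketched as follows; the helper name is invented:

```python
# Values as defined above; 0 is the default when the key is absent.
BLS_OPTIONAL, BLS_REQUIRED, BLS_IGNORED = 0, 1, 2

def should_run(test_case, bls_active):
    """Decide whether a case applies, given whether BLS is ON in this run."""
    setting = test_case.get("bls_setting", BLS_OPTIONAL)
    if setting == BLS_REQUIRED:
        return bls_active        # only valid with BLS ON
    if setting == BLS_IGNORED:
        return not bls_active    # only valid with BLS OFF
    return True                  # free to choose either

print(should_run({"bls_setting": 1}, bls_active=False))  # False
print(should_run({}, bls_active=False))                  # True
```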
## Note for implementers

The basic pattern for test-suite loading and running is:

Iterate suites for a given test type, or sub-type (e.g. `operations > deposits`):
1. Filter the test suite; options:
    - Config: load the first few lines, parse them as YAML, and check `config`; either:
        - Pass the suite to the correct compiled target
        - Ignore the suite if running tests as part of a compiled target with a different configuration
        - Load the correct configuration for the suite dynamically before running the suite
    - Select by file name
    - Filter for specific suites (e.g. for a specific fork)
2. Load the YAML
    - Optionally translate the data into applicable naming, e.g. `snake_case` to `PascalCase`
3. Iterate through the `test_cases`
4. Ask the test runner to allocate a new test case (i.e. objectify the test case, generalize it with a `TestCase` interface).
    Optionally pass raw test-case data to enable dynamic test-case allocation.
    1. Load the test-case data into it.
    2. Make the test case run.
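
The steps above can be sketched as a minimal loop; the suite dicts and the `TestCase` class are placeholders for a client's own types, not a prescribed implementation:

```python
class TestCase:
    """Placeholder for a client's generalized test-case interface (step 4)."""
    def __init__(self, data):
        self.data = data                               # step 4.1: load the data into it
    def run(self):
        return self.data.get("expected_ok", True)      # step 4.2: make the test case run

def run_suites(suites, config_name):
    results = []
    for suite in suites:
        # Step 1: filter, e.g. by the `config` field in the suite header.
        if suite["config"] != config_name:
            continue
        # Step 2 (loading/translating the YAML) is assumed done; step 3: iterate cases.
        for case in suite["test_cases"]:
            # Step 4: allocate a test case and run it.
            results.append(TestCase(case).run())
    return results

suites = [
    {"config": "minimal", "test_cases": [{"expected_ok": True}, {"expected_ok": False}]},
    {"config": "mainnet", "test_cases": [{"expected_ok": True}]},
]
print(run_suites(suites, "minimal"))  # [True, False]
```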