Commit 4a94f6c

Refinements based on read-thru
1 parent 8cb35de commit 4a94f6c

1 file changed

Lines changed: 91 additions & 78 deletions

File tree

peps/pep-0829.rst

Original file line number | Diff line number | Diff line change
@@ -1,6 +1,7 @@
11
PEP: 829
22
Title: Structured Startup Configuration via .site.toml Files
33
Author: Barry Warsaw <barry@python.org>
4+
Discussions-To: Pending
45
Status: Draft
56
Type: Standards Track
67
Topic: Packaging
@@ -27,40 +28,40 @@ Motivation
2728
Python's ``.pth`` files (processed by ``Lib/site.py`` at startup)
2829
support two functions:
2930

30-
#. **Extending** ``sys.path`` -- Lines in this file (excluding
31-
comments and lines that start with ``import``) name directories to
32-
be appended to ``sys.path``. Relative paths are implicitly
33-
anchored at the site-packages directory.
31+
* **Extending** ``sys.path`` -- Lines in this file (excluding
32+
comments and lines that start with ``import``) name directories to
33+
be appended to ``sys.path``. Relative paths are implicitly
34+
anchored at the site-packages directory.
3435

35-
#. **Executing code** -- lines starting with ``import`` (or
36-
``import\\t``) are executed immediately by passing the source string
37-
to ``exec()``.
36+
* **Executing code** -- lines starting with ``import`` (or
37+
``import\\t``) are executed immediately by passing the source string
38+
to ``exec()``.
3839

3940
This design has several problems:
4041

41-
#. Code execution is a side effect of the implementation. Lines that
42-
start with ``import`` can be extended by separating multiple
43-
statements with a semicolon. As long as all the code to be
44-
executed appears on the same line, it all gets executed when the
45-
``.pth`` file is processed.
42+
* Code execution is a side effect of the implementation. Lines that
43+
start with ``import`` can be extended by separating multiple
44+
statements with a semicolon. As long as all the code to be
45+
executed appears on the same line, it all gets executed when the
46+
``.pth`` file is processed.
4647

47-
#. ``.pth`` files are essentially unstructured, leading to contents
48-
which are difficult to reason about or verify, and often even
49-
difficult to read. It mixes two potentially useful features with
50-
different security constraints, and no way to separate out these
51-
concerns.
48+
* ``.pth`` files are essentially unstructured, leading to contents
49+
which are difficult to reason about or validate, and are often even
50+
difficult to read. It mixes two potentially useful features with
51+
different security constraints, and no way to separate out these
52+
concerns.
5253

53-
#. The lack of ``.pth`` file structure also means there's no way to
54-
express metadata, no future-proofing of the format, and no defined
55-
execution or processing order of the contents.
54+
* The lack of ``.pth`` file structure also means there's no way to
55+
express metadata, no future-proofing of the format, and no defined
56+
execution or processing order of the contents.
5657

57-
#. Using ``exec()`` on the file contents during interpreter startup is
58-
a broad attack surface.
58+
* Using ``exec()`` on the file contents during interpreter startup is
59+
a broad attack surface.
5960

60-
#. There is no explicit concept of an entry point, which is an
61-
established pattern in Python packaging. Packages that require
62-
code execution and initialization at startup abuse ``import`` lines
63-
rather than explicitly declaring entry points.
61+
* There is no explicit concept of an entry point, which is an
62+
established pattern in Python packaging. Packages that require
63+
code execution and initialization at startup abuse ``import`` lines
64+
rather than explicitly declaring entry points.
6465

6566

6667
Specification
@@ -76,11 +77,14 @@ option, which disables ``site.py`` also disables
7677
The standard library ``tomllib`` package is used to read and process
7778
``<package>.site.toml`` files.
7879

79-
Any parsing errors cause the entire ``<package>.site.toml`` file to be
80-
ignored and not processed (but it still supersedes any parallel
81-
``<package>.pth`` file). Any errors that occur when importing entry
82-
point modules or calling entry point functions are reported but do no
83-
abort the Python executable.
80+
The presence of a ``<package>.site.toml`` file supersedes a parallel
81+
``<package>.pth`` file. This allows for both an easy migration path and
82+
continued support for older Pythons in parallel.
83+
84+
Any parsing errors cause the entire ``<package>.site.toml`` file to be ignored
85+
and not processed (but it still supersedes any parallel ``<package>.pth``
86+
file). Any errors that occur when importing entry point modules or calling
87+
entry point functions are reported but do not abort the Python executable.
8488

8589

8690
File Naming and Discovery
@@ -96,18 +100,20 @@ File Naming and Discovery
96100
``site.py``).
97101

98102
* ``<package>.site.toml`` files live in the same site-packages directories
99-
where ``.pth`` files are found today.
103+
where ``<package>.pth`` files are found today.
100104

101105
* The discovery rules for ``<package>.site.toml`` files are the same as for
102-
``.pth`` files today. File names that start with a single ``.``
106+
``<package>.pth`` files today. File names that start with a single ``.``
103107
(e.g. ``.site.toml``) and files with OS-level hidden attributes (``UF_HIDDEN``,
104108
``FILE_ATTRIBUTE_HIDDEN``) are excluded.
105109

106110
* The processing order is alphabetical by filename, matching ``.pth``
107111
behavior.
108112

109113
* If both ``<package>.site.toml`` and ``<package>.pth`` exist in the same
110-
directory, only the ``<package>.site.toml`` file is processed.
114+
directory, only the ``<package>.site.toml`` file is processed. In other
115+
words, the presence of a ``<package>.site.toml`` file supersedes a parallel
116+
``<package>.pth`` file, even if the format of the TOML file is invalid.
111117

112118

113119
Processing Model
@@ -135,9 +141,9 @@ This two-phase approach (read then process) enables:
135141

136142
Within each site-packages directory, the processing order is:
137143

138-
#. Discover and parse all ``<package>.site.toml`` files (alphabetically).
139-
#. Process all ``[paths]`` entries from the parsed data.
140-
#. Execute all ``[entrypoints]`` entries from the parsed data.
144+
#. Discover and parse all ``<package>.site.toml`` files, sorted alphabetically.
145+
#. Process all ``[paths]`` entries from the parsed TOML files.
146+
#. Execute all ``[entrypoints]`` entries from the parsed TOML files.
141147
#. Process any remaining ``<package>.pth`` files that are not superseded by a
142148
``<package>.site.toml`` file.
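The discovery and supersession steps above can be sketched as a pure function over the file names in one site-packages directory. The name ``plan_processing`` is hypothetical, and the sketch checks only the leading-dot rule, not OS-level hidden attributes.

```python
def plan_processing(filenames):
    """Return (toml_files, pth_files) in the order they would be handled."""
    toml_suffix, pth_suffix = ".site.toml", ".pth"
    # File names starting with "." are excluded from discovery.
    visible = [name for name in filenames if not name.startswith(".")]
    # .site.toml files are handled first, sorted alphabetically.
    toml_files = sorted(n for n in visible if n.endswith(toml_suffix))
    # A .site.toml file supersedes the .pth file for the same package,
    # even if the TOML later turns out to be invalid.
    superseded = {n[: -len(toml_suffix)] for n in toml_files}
    pth_files = sorted(
        n
        for n in visible
        if n.endswith(pth_suffix) and n[: -len(pth_suffix)] not in superseded
    )
    return toml_files, pth_files
```

For example, given ``a.site.toml``, ``a.pth``, and ``b.pth``, only ``a.site.toml`` and ``b.pth`` are selected.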
143149

@@ -146,8 +152,8 @@ runs, and that ``<package>.site.toml``-declared paths are available to both
146152
entry point imports and ``<package>.pth`` import lines.
147153

148154

149-
<package>.site.toml file schema
150-
-------------------------------
155+
TOML file schema
156+
----------------
151157

152158
A ``<package>.site.toml`` file is defined to have three sections, all of which
153159
are optional:
@@ -167,8 +173,8 @@ are optional:
167173
init = ["foo.startup:initialize", "foo.plugins"]
168174
169175
170-
``[metadata]``
171-
''''''''''''''
176+
The ``[metadata]`` section
177+
''''''''''''''''''''''''''
172178

173179
This section contains package and/or file metadata. There are no required
174180
keys, and no semantics are assigned to any keys in this section *except* for
@@ -178,13 +184,14 @@ permitted and preserved.
178184
Defined keys:
179185

180186
``schema_version`` (integer, recommended)
181-
The TOML file schema version number. Must be the integer ``1`` for this
182-
specification. If present, Python guarantees forward-compatible handling:
183-
future versions will either process the file according to the declared
184-
schema or skip it with a clear diagnostic. If the
185-
``schema_version`` is present but has an unsupported value, the
186-
entire file is skipped. If ``schema_version`` is omitted, the file is processed
187-
on a best-effort basis with no forward-compatibility guarantees.
187+
The TOML file schema version number. Must be the integer ``1``
188+
for this specification. If present, Python guarantees
189+
forward-compatible handling: future versions will either process
190+
the file according to the declared schema or skip it with clear
191+
diagnostics. If the ``schema_version`` is present but has an
192+
unsupported value, the entire file is skipped. If
193+
``schema_version`` is omitted, the file is processed on a
194+
best-effort basis with no forward-compatibility guarantees.
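The ``schema_version`` rule can be sketched as a small predicate over the parsed TOML data. The names ``SUPPORTED_SCHEMA_VERSIONS`` and ``should_process`` are hypothetical, and treating a non-table ``metadata`` value as if the key were omitted is an assumption of this sketch.

```python
SUPPORTED_SCHEMA_VERSIONS = frozenset({1})


def should_process(data):
    """Decide whether a parsed .site.toml file should be processed."""
    metadata = data.get("metadata")
    if not isinstance(metadata, dict) or "schema_version" not in metadata:
        # Omitted schema_version: process on a best-effort basis.
        return True
    version = metadata["schema_version"]
    # Require a real integer; bool is a subclass of int, so exclude it.
    return type(version) is int and version in SUPPORTED_SCHEMA_VERSIONS
```

A file declaring ``schema_version = 2`` would be skipped entirely under this check, while a file with no ``[metadata]`` section would still be processed.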
188195

189196
Recommended keys:
190197

@@ -201,8 +208,8 @@ Recommended keys:
201208
``"aperson@example.com"``.
202209

203210

204-
``[paths]``
205-
'''''''''''
211+
The ``[paths]`` section
212+
'''''''''''''''''''''''
206213

207214
Defined keys:
208215

@@ -222,8 +229,8 @@ Path entries use a hybrid resolution scheme:
222229
* **Placeholder variables** are supported using ``{name}`` syntax. The
223230
placeholder ``{sitedir}`` expands to the site-packages directory where the
224231
``<package>.site.toml`` file was found. Thus ``{sitedir}/relpath`` and
225-
``relpath`` resolve to the same path and this is the explicit form
226-
of the relative path form.
232+
``relpath`` resolve to the same path with the placeholder version being the
233+
explicit (and recommended) form of the relative path.
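The equivalence of ``{sitedir}/relpath`` and ``relpath`` can be illustrated with a short sketch. The function ``resolve_dir`` is hypothetical; it handles only the ``{sitedir}`` placeholder, and how unknown placeholders are treated is not modeled here.

```python
import os


def resolve_dir(entry, sitedir):
    """Resolve one ``dirs`` entry against the site-packages directory."""
    # Expand the only placeholder assumed by this sketch.
    expanded = entry.replace("{sitedir}", sitedir)
    # os.path.join() discards sitedir when `expanded` is already absolute,
    # so relative entries are anchored and absolute ones pass through.
    return os.path.normpath(os.path.join(sitedir, expanded))
```

With this sketch, ``resolve_dir("{sitedir}/relpath", sitedir)`` and ``resolve_dir("relpath", sitedir)`` return the same path for any absolute ``sitedir``.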
227234

228235
While only ``{sitedir}`` is defined in this PEP, additional
229236
placeholder variables (e.g., ``{prefix}``, ``{exec_prefix}``,
@@ -232,13 +239,13 @@ placeholder variables (e.g., ``{prefix}``, ``{exec_prefix}``,
232239
If ``dirs`` is not a list of strings, a warning is emitted (visible
233240
with ``-v``) and the section is skipped.
234241

235-
Directories that do not exist on the filesystem are silently skipped,
236-
matching ``<package>.pth`` behavior. Duplicate paths are
237-
de-duplicated, also matching ``<package>.pth`` behavior.
242+
Directories that do not exist on the filesystem are silently skipped, matching
243+
``<package>.pth`` behavior. Paths are de-duplicated, also matching
244+
``<package>.pth`` behavior.
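The validation, existence check, and de-duplication described above can be sketched together. The function ``usable_dirs`` is hypothetical; it compares paths as plain strings and omits the warning, whereas a real implementation would likely normalize paths before comparing them.

```python
import os


def usable_dirs(dirs, existing_paths):
    """Return the entries of ``dirs`` that would be appended to sys.path."""
    # If dirs is not a list of strings, the whole section is skipped.
    if not isinstance(dirs, list) or not all(isinstance(d, str) for d in dirs):
        return []
    seen = set(existing_paths)
    result = []
    for d in dirs:
        # Skip duplicates and directories that do not exist.
        if d in seen or not os.path.isdir(d):
            continue
        seen.add(d)
        result.append(d)
    return result
```

Passing ``sys.path`` as ``existing_paths`` would prevent re-adding directories that are already on the path.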
238245

239246

240-
``[entrypoints]``
241-
'''''''''''''''''
247+
The ``[entrypoints]`` section
248+
'''''''''''''''''''''''''''''
242249

243250
``init`` -- a list of strings specifying entry point references to
244251
execute at startup. Each item uses the standard Python entry point
@@ -358,11 +365,11 @@ Continue on error rather than abort
358365
Backwards Compatibility
359366
=======================
360367

361-
* ``<package>.pth`` file processing is **not** removed. Both
362-
``<package>.pth`` and ``<package>.site.toml`` files are discovered
363-
in parallel within each site-packages directory. This preserves
364-
backward compatibility for all existing (pre-migration) packages.
365-
Deprecation of ``<package>.pth`` files is out-of-scope for this PEP.
368+
* ``<package>.pth`` file processing is **not** deprecated or removed. Both
369+
``<package>.pth`` and ``<package>.site.toml`` files are discovered in
370+
parallel within each site-packages directory. This preserves backward
371+
compatibility for all existing (pre-migration) packages. Deprecation of
372+
``<package>.pth`` files is out-of-scope for this PEP.
366373

367374
* When ``<package>.site.toml`` exists alongside ``<package>.pth``, the
368375
``<package>.site.toml`` takes precedence and the ``<package>.pth`` file is
@@ -396,10 +403,10 @@ This PEP improves the security posture of interpreter startup:
396403
importable modules and their attributes, unlike ``exec()`` which can
397404
run arbitrary code.
398405

399-
The overall attack surface is not eliminated -- a malicious
400-
``<package>.site.toml`` file can still cause arbitrary code execution via
401-
``init`` entrypoints, but the mechanism proposed in this PEP is more
402-
structured, auditable, and amenable to policy controls.
406+
The overall attack surface is not eliminated -- a malicious package
407+
can still cause arbitrary code execution via ``init`` entrypoints, but
408+
the mechanism proposed in this PEP is more structured, auditable, and
409+
amenable to future policy controls.
403410

404411

405412
How to Teach This
@@ -437,25 +444,26 @@ If your ``<package>.pth`` file includes arbitrary code, put that code in a
437444
startup function and use the ``module:callable`` syntax.
438445

439446
Both ``<package>.pth`` and ``<package>.site.toml`` can coexist during
440-
migration. If both exist for the same package, only the ``<package>.site.toml`` is
441-
processed. Thus, it is recommended that packages compatible with older
442-
Pythons ship both files.
447+
migration. If both exist for the same package, only the
448+
``<package>.site.toml`` is processed. Thus it is recommended that
449+
packages compatible with older Pythons ship both files.
450+
443451

444452
For tool authors
445453
----------------
446454

447-
Build backends and installers should generate ``<package>.site.toml`` files
448-
alongside or instead of ``<package>.pth`` files, depending on the package's
449-
Python support matrix. The TOML format is easy to generate programmatically
450-
using ``tomllib`` (for reading) or string formatting (for writing, since the
451-
schema is simple).
455+
Build backends and installers should generate ``<package>.site.toml``
456+
files alongside or instead of ``<package>.pth`` files, depending on
457+
the package's Python support matrix. The TOML format is easy to
458+
generate programmatically using ``tomllib`` (for reading) or string
459+
formatting (for writing, since the schema is simple).
452460

453461

454462
Reference Implementation
455463
=========================
456464

457-
A reference implementation is provided as modifications to ``Lib/site.py``,
458-
adding the following:
465+
A `reference implementation <https://github.com/python/cpython/pull/147955>`_
466+
is provided as modifications to ``Lib/site.py``, adding the following:
459467

460468
* ``_SiteTOMLData`` -- a ``__slots__`` class holding parsed data from
461469
a single ``<package>.site.toml`` file (metadata, dirs, init).
@@ -494,8 +502,7 @@ JSON instead of TOML
494502
``pyproject.toml``.
495503

496504
YAML instead of TOML
497-
YAML is not in the standard library and has well-documented
498-
parsing pitfalls.
505+
There is no YAML parser in the standard library.
499506

500507
Python instead of TOML
501508
Python is imperative, TOML is declarative. Thus TOML files are
@@ -548,6 +555,12 @@ Open Issues
548555
``{userbase}``.
549556

550557

558+
Change History
559+
==============
560+
561+
None at this time.
562+
563+
551564
Copyright
552565
=========
553566
