Commit 145832e

Add notes on security
1 parent bd2cf5a commit 145832e

2 files changed

Lines changed: 28 additions & 0 deletions

NEWS

Lines changed: 2 additions & 0 deletions
@@ -9,6 +9,8 @@

 * Ignore deleted table rows.

+* Add notes on security.
+
 # 1.9.1

 * Ignore AlternateContent elements when there is no Fallback element.

README.md

Lines changed: 26 additions & 0 deletions
@@ -73,6 +73,10 @@ The output is an HTML fragment, rather than a full HTML document, encoded with U
 Since the encoding is not explicitly set in the fragment,
 opening the output file in a web browser may cause Unicode characters to be rendered incorrectly if the browser doesn't default to UTF-8.

+**Mammoth performs no sanitisation of the source document,
+and should therefore be used extremely carefully with untrusted user input.**
+See the [Security](#security) section for more information.
+
 #### Images

 By default, images are included inline in the output HTML.
@@ -111,6 +115,10 @@ For instance:

 ### Library

+**Mammoth performs no sanitisation of the source document,
+and should therefore be used extremely carefully with untrusted user input.**
+See the [Security](#security) section for more information.
+
 #### Basic conversion

 To convert an existing .docx file to HTML,
@@ -389,6 +397,24 @@ although the fidelity of the conversion depends entirely on LibreOffice.

 [wmf-libreoffice-recipe]: https://github.com/mwilliamson/python-mammoth/blob/master/recipes/wmf_images.py

+### Security
+
+Mammoth performs no sanitisation of the source document,
+and should therefore be used extremely carefully with untrusted user input.
+For instance:
+
+* Source documents can contain links with `javascript:` targets.
+If, for instance, you allow users to upload source documents,
+automatically convert the document into HTML,
+and embed the HTML into your website without sanitisation,
+this may create links that can execute arbitrary JavaScript when clicked.
+
+* Source documents may reference files outside of the source document.
+If, for instance, you allow users to upload source documents to a server,
+automatically convert the document into HTML on the server,
+and embed the HTML into your website,
+this may allow arbitrary files on the server to be read and exfiltrated.
+
 ### Document transforms

 **The API for document transforms should be considered unstable,
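
The `javascript:` risk described in the new Security section can be illustrated with a small check on the converted HTML. The sketch below is not part of Mammoth: `find_unsafe_links`, `url_scheme`, and `ALLOWED_SCHEMES` are hypothetical names chosen for this example, and the allow-list is an assumption about what an application might accept. It only reports suspicious `href` values on `<a>` tags and is not a sanitiser, so a maintained HTML sanitiser is likely still the right tool before embedding converted output in a page.

```python
from html.parser import HTMLParser

# Assumption: the schemes this hypothetical application is willing to link to.
ALLOWED_SCHEMES = {"http", "https", "mailto"}

# Browsers ignore tabs and newlines inside a URL, as well as leading control
# characters and spaces, so remove them before looking for the scheme.
_IGNORED_IN_URL = {ord(ch): None for ch in "\t\n\r"}
_LEADING_JUNK = "".join(chr(code) for code in range(0x21))


def url_scheme(url):
    """Return the lower-cased scheme of url, or "" if it looks relative."""
    cleaned = url.translate(_IGNORED_IN_URL).lstrip(_LEADING_JUNK)
    head, colon, _rest = cleaned.partition(":")
    # A "/", "?" or "#" before the first colon means the colon is not a
    # scheme separator, so the URL is treated as relative.
    if not colon or any(ch in head for ch in "/?#"):
        return ""
    return head.lower()


class _LinkCollector(HTMLParser):
    """Collects href values on <a> tags whose scheme is not allow-listed."""

    def __init__(self):
        super().__init__()
        self.unsafe = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value is not None:
                scheme = url_scheme(value)
                if scheme and scheme not in ALLOWED_SCHEMES:
                    self.unsafe.append(value)


def find_unsafe_links(html_fragment):
    """Return the href values in html_fragment that use a disallowed scheme."""
    collector = _LinkCollector()
    collector.feed(html_fragment)
    collector.close()
    return collector.unsafe


if __name__ == "__main__":
    fragment = (
        '<p><a href="https://example.com">ok</a> '
        '<a href="javascript:alert(1)">click me</a></p>'
    )
    print(find_unsafe_links(fragment))
```

In practice the fragment would be the `value` of the result returned by `mammoth.convert_to_html`. An allow-list is used rather than a block-list of known-bad schemes, because any scheme not explicitly expected is then rejected by default.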
