@@ -73,6 +73,10 @@ The output is an HTML fragment, rather than a full HTML document, encoded with U
 Since the encoding is not explicitly set in the fragment,
 opening the output file in a web browser may cause Unicode characters to be rendered incorrectly if the browser doesn't default to UTF-8.
 
+**Mammoth performs no sanitisation of the source document,
+and should therefore be used extremely carefully with untrusted user input.**
+See the [Security](#security) section for more information.
+
 #### Images
 
 By default, images are included inline in the output HTML.
@@ -111,6 +115,10 @@ For instance:
 
 ### Library
 
+**Mammoth performs no sanitisation of the source document,
+and should therefore be used extremely carefully with untrusted user input.**
+See the [Security](#security) section for more information.
+
 #### Basic conversion
 
 To convert an existing .docx file to HTML,
@@ -389,6 +397,24 @@ although the fidelity of the conversion depends entirely on LibreOffice.
 
 [wmf-libreoffice-recipe]: https://github.com/mwilliamson/python-mammoth/blob/master/recipes/wmf_images.py
 
+### Security
+
+Mammoth performs no sanitisation of the source document,
+and should therefore be used extremely carefully with untrusted user input.
+For instance:
+
+* Source documents can contain links with `javascript:` targets.
+  If, for instance, you allow users to upload source documents,
+  automatically convert the document into HTML,
+  and embed the HTML into your website without sanitisation,
+  this may create links that can execute arbitrary JavaScript when clicked.
+
+* Source documents may reference files outside of the source document.
+  If, for instance, you allow users to upload source documents to a server,
+  automatically convert the document into HTML on the server,
+  and embed the HTML into your website,
+  this may allow arbitrary files on the server to be read and exfiltrated.
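
As an illustration of the first risk above, the following is a minimal sketch of checking converted HTML for links with unexpected URL schemes before embedding it. It uses only the Python standard library. The names `ALLOWED_SCHEMES`, `is_safe_url` and `find_unsafe_links` are hypothetical and not part of Mammoth, and the allowlist is an assumption that should be adjusted to your site. It only inspects `href` attributes, so it is not a substitute for a dedicated HTML sanitiser.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumption: the only link schemes this (hypothetical) site wants to allow.
ALLOWED_SCHEMES = {"http", "https", "mailto"}

# Characters browsers strip from the ends of a URL (C0 controls and space).
_STRIP_CHARS = "".join(chr(code) for code in range(0x21))


def is_safe_url(url):
    """Return True if url is relative or uses an allowlisted scheme."""
    # Browsers also drop tabs and newlines anywhere in a URL, so
    # "java\nscript:..." must be treated the same as "javascript:...".
    cleaned = url.strip(_STRIP_CHARS)
    cleaned = cleaned.replace("\t", "").replace("\r", "").replace("\n", "")
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        # Treat anything that cannot be parsed as unsafe.
        return False
    # An empty scheme means a relative URL such as "#security" or "/page".
    return scheme == "" or scheme in ALLOWED_SCHEMES


class _UnsafeLinkFinder(HTMLParser):
    """Collects (tag, href) pairs whose href fails is_safe_url."""

    def __init__(self):
        super().__init__()
        self.unsafe = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already decoded character references in values.
        for name, value in attrs:
            if name == "href" and value is not None and not is_safe_url(value):
                self.unsafe.append((tag, value))


def find_unsafe_links(html):
    """Return a list of (tag, href) pairs with disallowed URL schemes."""
    finder = _UnsafeLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.unsafe
```

A caller could, for example, pass the HTML string produced by the conversion to `find_unsafe_links` and refuse to publish the document if the returned list is not empty.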
+
 ### Document transforms
 
 **The API for document transforms should be considered unstable,