Commit 1ae31e2

Prepare release 2.2
1 parent 67ea1e4 commit 1ae31e2

5 files changed

Lines changed: 59 additions & 69 deletions

CHANGES.md

Lines changed: 10 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,10 @@
1+
## 2.2.0
2+
3+
### Build
4+
- Install the dynamic library (PR #12)
5+
- Build the library with Dune (PR #13)
6+
- Prevent installation of the opam package on Windows (PR #13, #15)
7+
- Add ocaml-option-nnp as dependency (PR #16)
8+
9+
### Features
10+
- Support for OCaml 5 (PR #7, #10)
Lines changed: 46 additions & 67 deletions
@@ -1,8 +1,6 @@
1-
'Ancient' module for OCaml
2-
----------------------------------------------------------------------
1+
# Ancient library for OCaml
32

4-
What does this module do?
5-
----------------------------------------------------------------------
3+
## What does this module do?
64

75
This module allows you to use in-memory data structures which are
86
larger than available memory and so are kept in swap. If you try this
@@ -36,8 +34,8 @@ the ancient heap may need to be manually deallocated. The ancient
3634
heap may either exist as ordinary memory, or may be backed by a file,
3735
which is how shared structures are possible.
3836

39-
Structures which are moved into ancient must be treated as STRICTLY
40-
NON-MUTABLE. If an ancient structure is changed in any way then it
37+
Structures which are moved into ancient must be treated as **STRICTLY
38+
NON-MUTABLE**. If an ancient structure is changed in any way then it
4139
may cause a crash.
4240

4341
There are some limitations which apply to ancient data structures.
@@ -47,57 +45,56 @@ This module is most useful on 64 bit architectures where large address
4745
spaces are the norm. We have successfully used it with a 38 GB
4846
address space backed by a file and shared between processes.
4947

50-
API
51-
----------------------------------------------------------------------
48+
## API
5249

53-
Please see file ancient.mli .
50+
Please see file `ancient.mli`.
5451

55-
Compiling
56-
----------------------------------------------------------------------
52+
## Compiling
5753

58-
make
54+
```console
55+
make
56+
```
5957

6058
Make sure you run this command before running any program which
6159
uses the Ancient module:
6260

63-
ulimit -s unlimited
64-
65-
Example
66-
----------------------------------------------------------------------
61+
```console
62+
ulimit -s unlimited
63+
```
6764

68-
XXX Note the example code is really stupid, and fails for large
69-
dictionaries. See bug (10) below.
65+
## Example
7066

71-
Run:
72-
73-
ulimit -s unlimited
74-
wordsfile=/usr/share/dict/words
75-
baseaddr=0x440000000000 # System specific - see below.
76-
./test_ancient_dict_write.opt $wordsfile dictionary.data $baseaddr
77-
./test_ancient_dict_verify.opt $wordsfile dictionary.data
78-
./test_ancient_dict_read.opt dictionary.data
67+
Note the example code is really stupid, and fails for large
68+
dictionaries. See bug (10) below. Run:
69+
```console
70+
ulimit -s unlimited
71+
wordsfile=/usr/share/dict/words
72+
baseaddr=0x440000000000 # System specific - see below.
73+
./test_ancient_dict_write.opt $wordsfile dictionary.data $baseaddr
74+
./test_ancient_dict_verify.opt $wordsfile dictionary.data
75+
./test_ancient_dict_read.opt dictionary.data
76+
```
7977

8078
(You can run several instances of test_ancient_dict_read.opt on the
8179
same machine to demonstrate sharing).
8280

83-
Shortcomings & bugs
84-
----------------------------------------------------------------------
81+
## Shortcomings & bugs
8582

86-
(0) Stack overflows are common when marking/sharing large structures
83+
1. Stack overflows are common when marking/sharing large structures
8784
because we use a recursive algorithm to visit the structures. If you
8885
get random segfaults during marking/sharing, then try this before
8986
running your program:
90-
87+
```console
9188
ulimit -s unlimited
89+
```
9290

93-
(1) Ad-hoc polymorphic primitives (structural equality, marshalling
91+
2. Ad-hoc polymorphic primitives (structural equality, marshalling
9492
and hashing) do not work on ancient data structures, meaning that you
9593
will need to provide your own comparison and hashing functions. For
9694
more details see Xavier Leroy's response here:
97-
9895
http://caml.inria.fr/pub/ml-archives/caml-list/2006/09/977818689f4ceb2178c592453df7a343.en.html
9996

100-
(2) Ancient.attach suggests setting a baseaddr parameter for newly
97+
3. Ancient.attach suggests setting a baseaddr parameter for newly
10198
created files (it has no effect when attaching existing files). We
10299
strongly recommend this because in our tests we found that mmap would
103100
locate the memory segment inappropriately -- the basic problem is that
@@ -110,19 +107,17 @@ programmers to guess at a good base address which will be valid in the
110107
future. There are no other good solutions we have found --
111108
preallocating the file is tricky with the current mmalloc code.
112109

113-
(3) The current code requires you to first of all create the large
110+
4. The current code requires you to first of all create the large
114111
data structures on the regular OCaml heap, then mark them as ancient,
115112
effectively copying them out of the OCaml heap, then garbage collect
116113
the (hopefully unreferenced) structures on the OCaml heap. In other
117114
words, you need to have all the memory available as physical memory.
118115
The way to avoid this is to mark structures as ancient incrementally
119116
as they are created, or in chunks, whatever works for you.
120-
121117
We typically use Ancient to deal with web server logfiles, and in this
122118
case loading one file of data into memory and marking it as ancient
123119
before moving on to the next file works for us.
124-
125-
(4) Why do ancient structures need to be read-only / not mutated? The
120+
5. Why do ancient structures need to be read-only / not mutated? The
126121
reason is that you might create a new OCaml heap structure and point
127122
the ancient structure at this heap structure. The heap structure has
128123
no apparent incoming pointers (the GC will not by its very nature
@@ -134,14 +129,12 @@ data to OCaml heap data. In theory it should be possible to modify
134129
ancient data to point to other ancient data, but we have not tried
135130
this.
136131

137-
(5) [Limit on number of keys -- issue fixed]
138-
139-
(6) [Advanced topic] The _mark function in ancient_c.c makes no
132+
6. **Advanced topic:** the `_mark` function in `ancient_c.c` makes no
140133
attempt to arrange the data structures in memory / on disk in a way
141134
which optimises them for access. The worst example is when you have
142135
an array of large structures, where only a few fields in the structure
143136
will be accessed. Typically these will end up on disk as:
144-
137+
```
145138
array of N pointers
146139
structure 1
147140
field A
@@ -166,10 +159,10 @@ will be accessed. Typically these will end up on disk as:
166159
field B
167160
...
168161
field Z
169-
162+
```
170163
If you then iterate accessing only fields A, you end up swapping the
171164
whole lot back into memory. A better arrangement would have been:
172-
165+
```
173166
array of N pointers
174167
structure 1
175168
structure 2
@@ -184,23 +177,21 @@ whole lot back into memory. A better arrangement would have been:
184177
field B from structure 1
185178
field B from structure 2
186179
etc.
187-
188-
which avoids loading unused fields at all. In some circumstances we
180+
```
181+
which avoids loading unused fields at all. In some circumstances we
189182
have shown that this could make a huge difference to performance, but
190183
we are not sure how to implement this cleanly in the current library.
191-
192184
[Update: I have fixed issue 6 manually for my Weblogs example and
193185
confirmed that it does make a huge difference to performance, although
194186
at considerable extra code complexity. Interested people can see the
195187
weblogs library, file import_weblogs_ancient.ml.in].
196188

197-
(7) [Advanced topic] Certain techniques such as Address Space
189+
7. **Advanced topic:** certain techniques such as Address Space
198190
Randomisation (http://lwn.net/Articles/121845/) are probably not
199191
compatible with the Ancient module and shared files. Because the
200192
ancient data structures contain real pointers, these pointers would be
201193
invalidated if the shared file was not mapped in at precisely the same
202194
base address in all processes which are sharing the file.
203-
204195
One solution might be to use private mappings and a list of fixups.
205196
In fact, the code actually builds a list of fixups currently while
206197
marking, because it needs to deal with precisely this issue (during
@@ -210,7 +201,6 @@ to be fixed up afterwards). The list of fixups would need to be
210201
stored alongside the memory segment (currently it is discarded after
211202
marking), and the file would need to be mapped in using MAP_PRIVATE
212203
(see below).
213-
214204
A possible problem with this is that because OCaml objects tend to be
215205
small and contain a lot of pointers, it is likely that fixing up the
216206
pointers would result in every page in the memory segment becoming
@@ -219,34 +209,30 @@ mappings in the first place. However it is likely that some users of
219209
this module have large amounts of opaque data and few pointers, and
220210
for them this would be worthwhile.
221211

222-
(8) Currently mmalloc is implemented so that the file is mapped in
212+
8. Currently mmalloc is implemented so that the file is mapped in
223213
PROT_READ|PROT_WRITE and MAP_SHARED. Ancient data structures are
224214
supposed to be immutable so strictly speaking write access shouldn't
225215
be required. It may be worthwhile modifying mmalloc to allow
226216
read-only mappings, and private mappings.
227217

228-
(9) The library assumes that every OCaml object is at least one word
218+
9. The library assumes that every OCaml object is at least one word
229219
long. This seemed like a good assumption up until I found that
230220
zero-length arrays are valid zero word objects. At the moment you
231221
cannot mark structures which contain zero-length arrays -- you will
232222
get an assert-failure in the _mark function.
233-
234223
Possibly there are other types of OCaml structure which are zero word
235224
objects and also cannot be marked. I'm not sure what these will be:
236225
for example empty strings are stored as one word OCaml objects, so
237-
they are OK.
238-
239-
The solution to this bug is non-trivial.
226+
they are OK. The solution to this bug is non-trivial.
240227

241-
(10) Example code is very stupid. It fails with large dictionaries,
228+
11. Example code is very stupid. It fails with large dictionaries,
242229
eg. the one with nearly 500,000 words found in Fedora.
243230

244-
(11) In function 'mark', the "// Ran out of memory. Recover and throw
231+
12. In function 'mark', the "// Ran out of memory. Recover and throw
245232
an exception." codepath actually fails if you use it - segfaulting
246233
inside do_restore.
247234

248-
Authors
249-
----------------------------------------------------------------------
235+
## Authors
250236

251237
Primary code was written by Richard W.M. Jones <rich at annexia.org>
252238
with help from Markus Mottl, Martin Jambon, and invaluable advice from
@@ -257,16 +243,9 @@ mmalloc was written by Mike Haertel and Fred Fish.
257243
Port to no-naked-pointers and OCaml 5+ by Fabrice Le Fessant at
258244
OCamlPro.
259245

260-
License
261-
----------------------------------------------------------------------
246+
## License
262247

263248
The module is licensed under the LGPL + OCaml linking exception. This
264249
module includes mmalloc which was originally distributed with gdb
265250
(although it has since been removed), and that code was distributed
266251
under the plain LGPL.
267-
268-
Latest version
269-
----------------------------------------------------------------------
270-
271-
The latest version can be found on the website:
272-
http://merjis.com/developers/ancient
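
Reader's note on item 2 of the README list above (ad-hoc polymorphic primitives): a minimal sketch, in plain OCaml, of supplying a type-specific equality and hash through `Hashtbl.Make` instead of relying on polymorphic `compare` and `Hashtbl.hash`. `Word_key`, `Word_table` and `table` are hypothetical names, nothing here calls the Ancient API, and whether a given type-specific function is safe on ancient data is an assumption to check against `ancient.mli`.

```ocaml
(* Hypothetical key module: equality and hashing written for one concrete
   type, so the polymorphic primitives are never invoked on the keys. *)
module Word_key = struct
  type t = string

  (* Byte-wise string equality from the standard library. *)
  let equal (a : t) (b : t) : bool = String.equal a b

  (* djb2-style hash, masked with max_int to stay non-negative. *)
  let hash (s : t) : int =
    let h = ref 5381 in
    String.iter (fun c -> h := ((!h * 33) + Char.code c) land max_int) s;
    !h
end

(* A hash table specialised to the hand-written equal/hash above. *)
module Word_table = Hashtbl.Make (Word_key)

let table : int Word_table.t = Word_table.create 16

let () =
  Word_table.replace table "ancient" 1;
  Word_table.replace table "heap" 2
```

The same pattern applies to `Map.Make` and `Set.Make` with a hand-written `compare`.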

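Reader's note on item 9 of the README list above (zero-length arrays): the claim that an empty array is a zero-word object while an empty string occupies one word can be checked with the standard library's `Obj` module. This is a small inspection sketch only; `words_of` is a hypothetical helper, and `Obj` is an unsafe module that should not be used in ordinary code.

```ocaml
(* Report the size, in words, of the heap block backing a value.
   Only meaningful for boxed values; used here purely for inspection. *)
let words_of (v : 'a) : int = Obj.size (Obj.repr v)

(* An empty array is represented by a block of size zero. *)
let empty_array_words = words_of ([||] : int array)

(* An empty string still occupies one word (padding and length byte). *)
let empty_string_words = words_of ""

let () =
  Printf.printf "empty array:  %d word(s)\n" empty_array_words;
  Printf.printf "empty string: %d word(s)\n" empty_string_words
```

A structure containing `[||]` anywhere inside it would therefore hit the assertion described in item 9, whereas one containing `""` would not.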
ancient.opam

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ Secondly, this module allows you to share those structures between
1616
processes. In this mode, the structures are backed by a disk file,
1717
and any process that has read/write access that disk file can map that
1818
file in and see the structures."""
19-
maintainer: ["Fabrice Le Fessant <fabrice.le_fessant@ocamlpro.com>"]
19+
maintainer: ["Pierre Villemot <pierre.villemot@ocamlpro.com>"]
2020
authors: ["Richard Jones et.al."]
2121
license: "LGPL-2.1-or-later WITH OCaml-LGPL-linking-exception"
2222
homepage: "https://github.com/OCamlPro/ocaml-ancient"

dune-project

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
33

44
(name ancient)
55
(authors "Richard Jones et.al.")
6-
(maintainers "Fabrice Le Fessant <fabrice.le_fessant@ocamlpro.com>")
6+
(maintainers "Pierre Villemot <pierre.villemot@ocamlpro.com>")
77
(homepage "https://github.com/OCamlPro/ocaml-ancient")
88
(bug_reports "https://github.com/OCamlPro/ocaml-ancient/issues")
99

flake.nix

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@
3333
odoc
3434
ocaml-lsp
3535
patdiff
36+
dune-release
3637
];
3738

3839
inputsFrom = [ ancient ];
