Commit b9dd9d6

optimize sentence counting
1 parent 4793a7e commit b9dd9d6

2 files changed

Lines changed: 1 addition & 1 deletion

File tree

data/Edge_IoT/Dataset.xlsx

106 Bytes
Binary file not shown.

tlparser/text_utils.py

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@
     re.IGNORECASE,
 )
 _NUMERIC_DOTTED_RE = re.compile(r"\b\d+(?:\.\d+)+\.?")
-_SENTENCE_BOUNDARY_RE = re.compile(r"[.!?]+(?=(?:\s|$|[\"'\)\]]))")
+_SENTENCE_BOUNDARY_RE = re.compile(r"[.!?;:]+(?=(?:\s|$|[\"'\)\]]))")
 
 
 def _replace_dots(match: Match[str]) -> str:
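
The one-line change adds `;` and `:` to the character class of `_SENTENCE_BOUNDARY_RE`, so it changes which positions count as a sentence boundary, not only how fast they are found. The sketch below compares the two patterns from the diff on a sample string. The names `OLD_BOUNDARY_RE`, `NEW_BOUNDARY_RE`, and `count_boundaries` are hypothetical and used only for illustration; the diff does not show how `text_utils.py` actually consumes the pattern, so counting matches with `findall` is an assumption.

```python
import re

# Both patterns are copied verbatim from the diff; the names are illustrative.
OLD_BOUNDARY_RE = re.compile(r"[.!?]+(?=(?:\s|$|[\"'\)\]]))")
NEW_BOUNDARY_RE = re.compile(r"[.!?;:]+(?=(?:\s|$|[\"'\)\]]))")


def count_boundaries(pattern: re.Pattern, text: str) -> int:
    """Count matches of a boundary pattern (assumed usage, not the repo's)."""
    return len(pattern.findall(text))


sample = "The sensor wakes up; it samples once: then it sleeps. Done!"

# The old pattern only sees the final "." and "!".
old_count = count_boundaries(OLD_BOUNDARY_RE, sample)
# The new pattern also treats ";" and ":" as boundaries.
new_count = count_boundaries(NEW_BOUNDARY_RE, sample)

# A colon followed by a digit is not a boundary under either pattern,
# because the lookahead requires whitespace, end of string, or a closer.
clock_count = count_boundaries(NEW_BOUNDARY_RE, "At 10:30 the node rebooted.")

print(old_count, new_count, clock_count)
```

Under these assumptions the new pattern reports more boundaries for text that uses semicolons or colons as clause separators, which is worth keeping in mind when comparing sentence counts produced before and after this commit.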
