Explode a paragraph into sentences in PHP

You can use preg_split() combined with a PCRE lookahead condition to split the string after each occurance of ., ;, :, ?, !, .. while keeping the actual punctuation intact:

Code:

$subject="abc sdfs.    def ghi; this is [email protected]! asdasdasd? abc xyz";
// split on whitespace between sentences preceded by a punctuation mark
$result = preg_split('/(?<=[.?!;:])\s+/', $subject, -1, PREG_SPLIT_NO_EMPTY);
print_r($result);

Result:

Array
(
    [0] => abc sdfs.
    [1] => def ghi;
    [2] => this is [email protected]!
    [3] => asdasdasd?
    [4] => abc xyz
)

You can also add a blacklist for abbreviations (Mr., Mrs., Dr., ..) that should not be split into own sentences by inserting a negative lookbehind assertion:

$subject="abc sdfs.   Dr. Foo said he is not a sentence; asdasdasd? abc xyz";
// split on whitespace between sentences preceded by a punctuation mark
$result = preg_split('/(?<!Mr.|Mrs.|Dr.)(?<=[.?!;:])\s+/', $subject, -1, PREG_SPLIT_NO_EMPTY);
print_r($result);

Result:

Array
(
    [0] => abc sdfs.
    [1] => Dr. Foo said he is not a sentence;
    [2] => asdasdasd?
    [3] => abc xyz
)

Leave a Comment