Truncate Text Preserving HTML Tags With PHP
<a href="http://jsfromhell.com">Truncate/limit text preserving the HTML tags (the code auto-closes the tags).</a>
Example
echo String::truncate('jo<i><b>n</b>as</i>', 3, '...'); //jo<...
echo String::truncate('jo<i><b>n</b>as</i>', 3, '...', true); //jo<i><b>n</b></i>...
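echo String::truncate('jo<i><b>n</b>as</i>', 3, '...', true, false); //jo<i><b>n... (expects a fifth "close tags" parameter; the four-parameter version below ignores it and prints jo<i><b>n</b></i>...)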
Code
//+ Jonas Raoni Soares Silva
//@ http://jsfromhell.com
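// Note: "String" is a reserved class name as of PHP 7; rename the class to run this on current PHP.
class String{
    public static function truncate($s, $l, $e = '...', $isHTML = false){
        $i = 0;          // number of bytes taken up by the tags seen so far
        $tags = array(); // stack of tags opened but not yet closed
        if($isHTML){
            // Each match holds one tag plus the text that follows it, with byte offsets
            preg_match_all('/<[^>]+>([^<]*)/', $s, $m, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
            foreach($m as $o){
                // Stop once the text before this tag already reaches the limit
                if($o[0][1] - $i >= $l)
                    break;
                // Tag name without "<" and attributes; closing tags keep their "/"
                $t = substr(strtok($o[0][0], " \t\n\r\0\x0B>"), 1);
                if($t[0] != '/')
                    $tags[] = $t;
                elseif(end($tags) == substr($t, 1))
                    array_pop($tags);
                $i += $o[1][1] - $o[0][1];
            }
        }
        // Cut the string, close the tags still open (innermost first), then add the suffix if text was dropped
        return substr($s, 0, $l = min(strlen($s), $l + $i)) . (count($tags = array_reverse($tags)) ? '</' . implode('></', $tags) . '>' : '') . (strlen($s) > $l ? $e : '');
    }
}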

Comments
Snippets Manager replied on Mon, 2010/03/22 - 8:53am
class String {
    public static function truncate($text, $length, $suffix = '…', $isHTML = true){
        $i = 0;
        $tags = array();
        if($isHTML){
            preg_match_all('/<[^>]+>([^<]*)/', $text, $m, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
            foreach($m as $o){
                if($o[0][1] - $i >= $length)
                    break;
                $t = substr(strtok($o[0][0], " \t\n\r\0\x0B>"), 1);
                if($t[0] != '/')
                    $tags[] = $t;
                elseif(end($tags) == substr($t, 1))
                    array_pop($tags);
                $i += $o[1][1] - $o[0][1];
            }
        }
        // The closing-tag markup was stripped from the posted comment; it is restored here from the original snippet
        $output = substr($text, 0, $length = min(strlen($text), $length + $i)) . (count($tags = array_reverse($tags)) ? '</' . implode('></', $tags) . '>' : '');
        // Get everything until last space
        $one = substr($output, 0, strrpos($output, " "));
        // Get the rest
        $two = substr($output, strrpos($output, " "), (strlen($output) - strrpos($output, " ")));
        // Extract all tags from the last bit
        preg_match_all('/<(.*?)>/s', $two, $tags);
        // Add suffix if needed
        if (strlen($text) > $length) { $one .= $suffix; }
        // Re-attach tags
        $output = $one . implode($tags[0]);
        return $output;
    }
}
Also, I forgot to mention: I changed '...' to the '…' character.
Snippets Manager replied on Sat, 2010/08/07 - 4:16pm
//+ Jonas Raoni Soares Silva
//@ http://jsfromhell.com
class String {
    public static function truncate($text, $length, $suffix = '…', $isHTML = true){
        $i = 0;
        // 'img' added here; the posted list only contained 'image'
        $simpleTags = array('br'=>true,'hr'=>true,'input'=>true,'image'=>true,'img'=>true,'link'=>true,'meta'=>true);
        $tags = array();
        if($isHTML){
            preg_match_all('/<[^>]+>([^<]*)/', $text, $m, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
            foreach($m as $o){
                if($o[0][1] - $i >= $length)
                    break;
                $t = substr(strtok($o[0][0], " \t\n\r\0\x0B>"), 1);
                // test if the tag is unpaired, then we mustn't save them
                if($t[0] != '/' && (!isset($simpleTags[$t])))
                    $tags[] = $t;
                elseif(end($tags) == substr($t, 1))
                    array_pop($tags);
                $i += $o[1][1] - $o[0][1];
            }
        }
        // output without closing tags
        $output = substr($text, 0, $length = min(strlen($text), $length + $i));
        // closing tags (markup stripped from the posted comment, restored from the original snippet)
        $output2 = (count($tags = array_reverse($tags)) ? '</' . implode('></', $tags) . '>' : '');
        // Find last space or HTML tag (solving the problem of the last space falling inside an HTML tag)
        // end() needs a variable, so the nested end(end(...)) call is split into steps
        $pieces = preg_split('/<.*>| /', $output, -1, PREG_SPLIT_OFFSET_CAPTURE);
        $lastPiece = end($pieces);
        $pos = (int)$lastPiece[1];
        // Append closing tags to output
        $output .= $output2;
        // Get everything until last space
        $one = substr($output, 0, $pos);
        // Get the rest
        $two = substr($output, $pos, (strlen($output) - $pos));
        // Extract all tags from the last bit
        preg_match_all('/<(.*?)>/s', $two, $tags);
        // Add suffix if needed
        if (strlen($text) > $length) { $one .= $suffix; }
        // Re-attach tags
        $output = $one . implode($tags[0]);
        //added to remove unnecessary closure (the search and replacement strings were stripped from the posted comment)
        $output = str_replace('','',$output);
        return $output;
    }
}
Snippets Manager replied on Mon, 2010/03/22 - 8:53am
class String {
    public static function truncate($text, $length, $suffix = '…', $isHTML = true){
        $i = 0;
        $tags = array();
        if($isHTML){
            preg_match_all('/<[^>]+>([^<]*)/', $text, $m, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
            foreach($m as $o){
                if($o[0][1] - $i >= $length)
                    break;
                $t = substr(strtok($o[0][0], " \t\n\r\0\x0B>"), 1);
                if($t[0] != '/')
                    $tags[] = $t;
                elseif(end($tags) == substr($t, 1))
                    array_pop($tags);
                $i += $o[1][1] - $o[0][1];
            }
        }
        // The closing-tag expression and the '</p>' literals were stripped from the posted comment; they are restored here from context
        $output = substr($text, 0, $length = min(strlen($text), $length + $i)) . (count($tags = array_reverse($tags)) ? '</' . implode('></', $tags) . '>' : '');
        if (strlen($text) > $length) {
            // If the output ends with </p>, place the suffix inside the paragraph
            if (substr($output, -4, 4) == '</p>')
                $output = substr($output, 0, strlen($output) - 4) . $suffix . '</p>';
            else
                $output .= $suffix;
        }
        return $output;
    }
}
There are a few extra lines at the end to make sure the suffix falls within a P tag if that is the last tag being closed. Also, I've expanded some of the variable names ;)
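The "suffix inside the P tag" idea above could be generalized to any run of closing tags at the end of the output. What follows is only a sketch under that assumption, not code from the thread: the function name appendSuffixInsideClosingTags is made up for illustration, and it places the suffix before all trailing closing tags rather than only before a closing P tag.

```php
<?php
/**
 * Hypothetical helper (not part of the original snippet): put $suffix in front
 * of the run of closing tags that ends $html, or append it if there is none.
 */
function appendSuffixInsideClosingTags($html, $suffix)
{
    // One or more closing tags anchored at the very end of the string
    $pattern = '~((?:</[a-zA-Z][^>]*>)+)\z~';
    $result = preg_replace_callback(
        $pattern,
        function ($match) use ($suffix) {
            // A callback is used so the suffix is never parsed as a replacement pattern
            return $suffix . $match[1];
        },
        $html,
        1,
        $count
    );
    // No trailing closing tags: fall back to plain appending
    return $count > 0 ? $result : $html . $suffix;
}

echo appendSuffixInsideClosingTags('<p>Hello <b>wor</b></p>', '…'), "\n";
// prints: <p>Hello <b>wor…</b></p>
```

It would be called on the truncated, auto-closed string instead of concatenating the suffix at the very end.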
Snippets Manager replied on Wed, 2010/04/21 - 6:41am
class String {
    public static function truncate($text, $length, $suffix = '…', $isHTML = true){
        $i = 0;
        // 'img' added here; the posted list only contained 'image'
        $simpleTags = array('br'=>true,'hr'=>true,'input'=>true,'image'=>true,'img'=>true,'link'=>true,'meta'=>true);
        $tags = array();
        if($isHTML){
            preg_match_all('/<[^>]+>([^<]*)/', $text, $m, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
            foreach($m as $o){
                if($o[0][1] - $i >= $length)
                    break;
                $t = substr(strtok($o[0][0], " \t\n\r\0\x0B>"), 1);
                // test if the tag is unpaired, then we mustn't save them
                if($t[0] != '/' && (!isset($simpleTags[$t])))
                    $tags[] = $t;
                elseif(end($tags) == substr($t, 1))
                    array_pop($tags);
                $i += $o[1][1] - $o[0][1];
            }
        }
        // output without closing tags
        $output = substr($text, 0, $length = min(strlen($text), $length + $i));
        // closing tags (markup stripped from the posted comment, restored from the original snippet)
        $output2 = (count($tags = array_reverse($tags)) ? '</' . implode('></', $tags) . '>' : '');
        // Find last space or HTML tag (solving the problem of the last space falling inside an HTML tag)
        // end() needs a variable, so the nested end(end(...)) call is split into steps
        $pieces = preg_split('/<.*>| /', $output, -1, PREG_SPLIT_OFFSET_CAPTURE);
        $lastPiece = end($pieces);
        $pos = (int)$lastPiece[1];
        // Append closing tags to output
        $output .= $output2;
        // Get everything until last space
        $one = substr($output, 0, $pos);
        // Get the rest
        $two = substr($output, $pos, (strlen($output) - $pos));
        // Extract all tags from the last bit
        preg_match_all('/<(.*?)>/s', $two, $tags);
        // Add suffix if needed
        if (strlen($text) > $length) { $one .= $suffix; }
        // Re-attach tags
        $output = $one . implode($tags[0]);
        return $output;
    }
}
Snippets Manager replied on Mon, 2012/05/07 - 3:09pm
Michael Pearson replied on Sat, 2013/02/16 - 1:35pm
I know I am necroing this, but I had to comment. If you plan on being a paid developer, your preference for short variable names, such as single letters, needs to change. Your variables need to describe what they hold. It doesn't take long to type $username instead of $u, but it takes other developers a significant amount of time to figure out what $u refers to 300 lines of code later.
You are making a classic mistake, one made mostly by beginners. If you write code like this all day, every day, then in 3 months you will not be able to read your own code, let alone have other developers work with it.
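To make the naming point concrete, here is a minimal sketch of the original truncate() with descriptive names. The class name HtmlTruncator and every variable name are assumptions made for illustration ("String" is a reserved class name as of PHP 7, so it could not be kept). The logic is meant to mirror the original snippet, not to improve on it.

```php
<?php
// Hypothetical class name; the original snippet called it String.
final class HtmlTruncator
{
    public static function truncate($text, $maxLength, $suffix = '...', $isHtml = false)
    {
        $tagBytesSeen = 0;   // bytes taken up by tags so far (not counted as text)
        $openTags = array(); // stack of tag names still waiting for a closing tag

        if ($isHtml) {
            // Each match is one tag plus the text that follows it, with byte offsets
            preg_match_all('/<[^>]+>([^<]*)/', $text, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
            foreach ($matches as $match) {
                $tagOffset = $match[0][1];
                $textOffset = $match[1][1];
                // Stop once the text before this tag already reaches the limit
                if ($tagOffset - $tagBytesSeen >= $maxLength) {
                    break;
                }
                // "<b class=x>" gives "b", "</b>" gives "/b"
                $tagName = substr(strtok($match[0][0], " \t\n\r\0\x0B>"), 1);
                if ($tagName[0] !== '/') {
                    $openTags[] = $tagName;
                } elseif (end($openTags) === substr($tagName, 1)) {
                    array_pop($openTags);
                }
                $tagBytesSeen += $textOffset - $tagOffset;
            }
        }

        $cutOffset = min(strlen($text), $maxLength + $tagBytesSeen);
        $closingTags = '';
        foreach (array_reverse($openTags) as $tagName) {
            $closingTags .= '</' . $tagName . '>'; // innermost tag is closed first
        }
        $wasTruncated = strlen($text) > $cutOffset;

        return substr($text, 0, $cutOffset) . $closingTags . ($wasTruncated ? $suffix : '');
    }
}

echo HtmlTruncator::truncate('jo<i><b>n</b>as</i>', 3, '...', true), "\n";
// prints: jo<i><b>n</b></i>...
```

Like the original, it counts bytes rather than characters and does not treat void elements such as br specially.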
Snippets Manager replied on Sun, 2009/04/12 - 9:56am