MarcosBL

Jack of all trades, master of none

Syntax Highlighting with HTML Purifier and GeSHi

In "Syntax Highlighting and Allowing HTML in Comments", Jay Pipes explains how they used two PHP libraries, HTML Purifier by Edward Yang and GeSHi by Nigel McNie, to build the syntax highlighting feature of the new MySQL Forge.

They also share the two elegant functions (clean_and_codify and syntax_highlight) that they ended up using in production:

[php]
/**
 * Highlights the text as code in the supplied language.
 *
 * @param string $subject  The text to mark up
 * @param string $language The language to use for highlighting
 * @return string The marked-up code
 */
public static function syntax_highlight($subject, $language) {
    /* Format the code with GeSHi */
    include_once(APP_DIR . '/opt/geshi/geshi.php');
    $geshi = new GeSHi($subject, $language);
    $geshi->enable_classes();
    $geshi->enable_line_numbers(GESHI_NORMAL_LINE_NUMBERS);
    return $geshi->parse_code();
}

/**
 * Returns a cleaned and syntax-highlighted string of HTML.
 *
 * @param string $subject The text to cut into code pieces
 * @return string Cleaned and codified text
 */
public static function clean_and_codify($subject) {
    $code_pieces = array();
    $code_regex = '/[\[\<]code\s*(lang|language)\=[\"\'](\w+)[\"\'][\]\>]([\D\S]+?)[\[\<]\/code[\]\>]/';
    $code_delimiter = "CODECODECODE";

    /* First split the text into code and non-code blocks */
    while (preg_match($code_regex, $subject, $code_matches) == 1) {
        $language = trim(strtolower($code_matches[2])); // 0-index is the full match
        $code_sample = $code_matches[3];
        $code_sample = str_replace("\t", " ", $code_sample); /* Replace tabs with spaces */
        $code_pieces[] = array('lang' => $language, 'text' => $code_sample);
        /* Swap only this occurrence for the placeholder, so that two identical
           code blocks each keep their own entry in $code_pieces */
        $subject = preg_replace($code_regex, $code_delimiter, $subject, 1);
    }

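    /*
     * Assume two consecutive newlines are a paragraph.
     */
    /* Normalize newlines */
    $subject = str_replace("\r\n", "\n", $subject);
    $subject = str_replace("\r", "\n", $subject);
    /* Assumption: the replacement string did not survive in this listing;
       a paragraph break is used here, in line with the comment above */
    $subject = preg_replace("/[\n]{2}/", "</p><p>", $subject);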

    /*
     * Next, do the same thing with markup sections.
     * We use HTMLPurifier here for safe checks with some allowed
     * tags for ease of use
     */
    include_once(APP_DIR . '/opt/htmlpurifier/library/HTMLPurifier.auto.php');
    $config = HTMLPurifier_Config::createDefault();
    $config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
    $config->set('HTML.AllowedElements', 'a,em,blockquote,p,code,pre,strong,b');
    $config->set('HTML.AllowedAttributes', 'a.href,a.title');
    /* 'light' should be enough since we don't allow many elements;
       this really just cleans up dangling elements */
    $config->set('HTML.TidyLevel', 'light');
    $purifier = new HTMLPurifier();
    $subject = $purifier->purify($subject, $config);

    /*
     * Now $subject should contain CleanMarkup\nCODECODECODE\nCleanMarkup...
     * We now replace each placeholder with its highlighted code, using
     * the syntax_highlight function to do the grunt work. A callback is
     * used instead of the /e modifier, which was removed in PHP 7.
     */
    $num_code_pieces = count($code_pieces);
    $i = 0;
    if ($num_code_pieces > 0) {
        $subject = preg_replace_callback(
            '/' . preg_quote($code_delimiter, '/') . '/',
            function ($match) use ($code_pieces, &$i) {
                if (!isset($code_pieces[$i])) {
                    return $match[0]; // stray delimiter text: leave it untouched
                }
                $piece = $code_pieces[$i++];
                return TextDecorator::syntax_highlight(trim($piece['text'], "\r\n "), $piece['lang']);
            },
            $subject
        );
    }
    return $subject;
}
[/php]
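
To make the placeholder trick easier to follow, here is a minimal sketch that runs without either library. It is only an illustration under a few assumptions: the helper names extract_code_blocks and restore_code_blocks are hypothetical (they are not part of the MySQL Forge code), htmlspecialchars() stands in for GeSHi, and the HTML Purifier pass is reduced to a comment. It also extracts the code blocks with a single preg_replace_callback call rather than the while loop used above.

[php]
<?php
// Same pattern and placeholder as in the listing above.
const CODE_REGEX = '/[\[\<]code\s*(lang|language)\=[\"\'](\w+)[\"\'][\]\>]([\D\S]+?)[\[\<]\/code[\]\>]/';
const CODE_DELIMITER = 'CODECODECODE';

// Hypothetical helper: pull every code block out of the text and leave a
// placeholder behind. Returns array(stripped text, list of code pieces).
function extract_code_blocks($text) {
    $pieces = array();
    $stripped = preg_replace_callback(
        CODE_REGEX,
        function ($m) use (&$pieces) {
            // $m[2] is the language, $m[3] is the code itself
            $pieces[] = array('lang' => strtolower($m[2]), 'text' => $m[3]);
            return CODE_DELIMITER;
        },
        $text
    );
    return array($stripped, $pieces);
}

// Hypothetical helper: put the code back, one piece per placeholder.
// htmlspecialchars() is only a stand-in for the GeSHi highlighter.
function restore_code_blocks($text, $pieces) {
    $i = 0;
    return preg_replace_callback(
        '/' . preg_quote(CODE_DELIMITER, '/') . '/',
        function ($m) use ($pieces, &$i) {
            if (!isset($pieces[$i])) {
                return $m[0]; // more placeholders than pieces: leave as is
            }
            $piece = $pieces[$i++];
            return '<pre class="' . $piece['lang'] . '">'
                . htmlspecialchars(trim($piece['text'], "\r\n "))
                . '</pre>';
        },
        $text
    );
}

$comment = 'Try this: [code lang="php"]echo 1 < 2;[/code] Done.';
list($stripped, $pieces) = extract_code_blocks($comment);
// $stripped is now 'Try this: CODECODECODE Done.'
// ... the HTML Purifier pass over $stripped would go here ...
$html = restore_code_blocks($stripped, $pieces);
echo $html, "\n";
// prints: Try this: <pre class="php">echo 1 &lt; 2;</pre> Done.
[/php]

The point of the detour is that the purifier never sees the code: it only cleans the surrounding markup, so a snippet such as `1 < 2` is not mistaken for a broken tag, and the highlighter is the only thing that touches the code itself.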
