Difference between revisions of "Mediawiki RawFile"
Revision as of 23:48, 3 April 2008
Introduction
Originally the idea was to be able to download a portion of code directly as a file.
I have numerous code examples in my wiki and I wanted an easy way to download them, easier than copy/paste!
From there it was rather easy to get something very close to literate programming, just by allowing multiple blocks to refer to the same file; they are concatenated together at download time.
- It must work with pre, nowiki, js, css, code, source, so let's make it general: take the tag that comes after the parser function we'll create and select data up to the closing tag.
- There are two distinct functionalities provided by the extension:
- the parser that will convert a magic word into a link to the download URL
- an extended ?action=raw that will strip the raw output to keep the desired code
Documentation
- http://www.mediawiki.org/wiki/Manual:Extensions
- http://www.mediawiki.org/wiki/Manual:Magic_words
- http://www.mediawiki.org/wiki/Manual:Parser_functions
- http://meta.wikimedia.org/wiki/Help:Parser_function
- http://www.mediawiki.org/wiki/Manual:Hooks/RawPageViewBeforeOutput
- http://en.wikipedia.org/wiki/Literate_programming
Syntax
There are 2 kinds of elements to add to the wiki language:
- anchors that will flag which code blocks belong to a specific file
{{#rawsnippetAnchor: myscript.sh}}
- Not visible in the regular wiki display
- links that allow the file to be downloaded
{{#rawsnippetLink: myscript.sh}}
- Transformed into real URLs:
{{fullurl:{{PAGENAME}}|action=raw&rawsnippet=myscript.sh}}
For regular use, when a single code block is used and when the download link can be at the same position as the anchor, there is a shortcut notation mixing both anchor & link properties:
{{#rawsnippet: myscript.sh}}
Short example
Let's save the following code [{{#rawsnippet: myscript.sh}} as myscript.sh]
<source lang=bash>
#!/bin/bash
echo 'Hello world!'
exit 0
</source>
will give:
Let's save the following code [{{#rawsnippet: myscript.sh}} as myscript.sh]
#!/bin/bash
echo 'Hello world!'
exit 0
Complete example
And a full example with anchors & link:
Let's start with the Bash usual header:
{{#rawsnippetanchor: myotherscript.sh}}
<source lang=bash>
#!/bin/bash
</source>
Then we'll display a welcome message:
{{#rawsnippetanchor: myotherscript.sh}}
<source lang=bash>
echo 'Welcome on earth!'
exit 0
</source>
[{{#rawsnippetlink: myotherscript.sh}} myotherscript.sh is now available for download below the code]
will give:
Let's start with the Bash usual header: {{#rawsnippetanchor: myotherscript.sh}}
#!/bin/bash
Then we'll display a welcome message: {{#rawsnippetanchor: myotherscript.sh}}
echo 'Welcome on earth!'
exit 0
[{{#rawsnippetlink: myotherscript.sh}} myotherscript.sh is now available for download below the code]
The code
Which you can of course download just by following [{{#rawsnippetlink: RawSnippet.php}} this link :-)]
So let's explain the code a bit, in a literate programming way...
Hooks
First some hooks for our functions... {{#rawsnippetanchor: RawSnippet.php}}
<?php
if (defined('MEDIAWIKI')) {
# Define a setup function
$wgExtensionFunctions[] = 'efRawSnippet_Setup';
# Add a hook to initialise the magic words
$wgHooks['LanguageGetMagic'][] = 'efRawSnippet_Magic';
# Add a hook to intercept the raw output
$wgHooks['RawPageViewBeforeOutput'][] = 'fnRawSnippet_Strip';
Setup function
{{#rawsnippetanchor: RawSnippet.php}}
function efRawSnippet_Setup() {
global $wgParser;
# Set a function hook associating the "rawsnippet" magic word with our function
$wgParser->setFunctionHook( 'rawsnippet', 'efRawSnippet_Render' );
$wgParser->setFunctionHook( 'rawsnippetlink', 'efRawSnippet_Render' );
$wgParser->setFunctionHook( 'rawsnippetanchor', 'efRawSnippet_Empty' );
}
Hook to initialize the magic words
{{#rawsnippetanchor: RawSnippet.php}}
function efRawSnippet_Magic( &$magicWords, $langCode ) {
# Add the magic word
# The first array element is the case-sensitivity flag; 0 means the magic word is not case sensitive
# All remaining elements are synonyms for our parser function
$magicWords['rawsnippet'] = array( 0, 'rawsnippet', 'downloadAs' );
$magicWords['rawsnippetlink'] = array( 0, 'rawsnippetlink', 'downloadLink' );
$magicWords['rawsnippetanchor'] = array( 0, 'rawsnippetanchor', 'downloadAnchor' );
# unless we return true, other parser functions extensions will not get loaded.
return true;
}
Parser functions of the magic words
The transformation rule that replaces link shortcuts with actual download links {{#rawsnippetanchor: RawSnippet.php}}
function efRawSnippet_Render( &$parser, $filename = '') {
# The parser function itself
# The input parameters are wikitext with templates expanded
# The output should be wikitext too
return '{{fullurl:{{PAGENAME}}|action=raw&rawsnippet='.$filename.'}}';
//TODO+support for other pages
}
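The render function above inserts the file name into the query string as-is. As a minimal sketch, assuming we want names containing spaces or `&` to survive the round trip, the query part could be percent-encoded first. The helper name `exampleRawSnippetQuery` is hypothetical and not part of RawSnippet.php:

```php
<?php
// Hypothetical helper, not part of RawSnippet.php: builds the query string
// used in the download link, percent-encoding the file name.
function exampleRawSnippetQuery($filename) {
    // rawurlencode() encodes spaces as %20 and '&' as %26
    return 'action=raw&rawsnippet=' . rawurlencode($filename);
}

echo exampleRawSnippetQuery('my script.sh'), "\n";
```

Whether the encoded value interacts well with the fullurl parser function has not been checked here, so treat this as an illustration of the encoding step only.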
And the other one, just removing the anchors from the rendered wiki page {{#rawsnippetanchor: RawSnippet.php}}
function efRawSnippet_Empty( &$parser, $filename = '') {
return '';
}
Hook to intercept the raw output
This part of the code doesn't look that nice because we have to parse the raw wiki page ourselves to retrieve the code sections we want.
First let's see if ?action=raw was used in the context of this extension: in that case we receive the filename as a GET parameter. Otherwise we simply return true from our hook, which means we authorize the raw display (the hook was originally created to add an authentication point).
{{#rawsnippetanchor: RawSnippet.php}}
function fnRawSnippet_Strip(&$rawPage, &$text) {
// if our ext wasn't used, just exit
if (!isset($_GET['rawsnippet']))
return true;
$filename=$_GET['rawsnippet'];
The raw action has already set headers with some client cache pragmas, because its output is meant to be displayed in the browser. In our case we want to make this "page" a downloadable file, so we overwrite those headers and add a few more: to ensure there is no caching on the client (it is very hard for the client to force a refresh of a file download, unlike a web page) and to provide the proper filename. {{#rawsnippetanchor: RawSnippet.php}}
header("Content-Disposition: attachment; filename=".$filename);
header("Content-Type: application/octet-stream");
header("Content-Transfer-Encoding: binary");
header("Expires: 0");
header("Pragma: no-cache");
header("Cache-Control: no-store");
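Since the file name comes straight from the request, it may be worth cleaning it before it reaches a header. The following is only a sketch under that assumption: `exampleDownloadHeaders` is a hypothetical helper that returns the header lines instead of sending them, keeps only the last path component, and drops quotes, backslashes and control characters.

```php
<?php
// Hypothetical helper, not part of RawSnippet.php: returns the header lines
// as strings so that they can be inspected before being passed to header().
function exampleDownloadHeaders($filename) {
    // Keep the last path component, then remove control characters,
    // double quotes and backslashes.
    $safe = preg_replace('/[\x00-\x1F\x7F"\\\\]/', '', basename($filename));
    return array(
        'Content-Disposition: attachment; filename="' . $safe . '"',
        'Content-Type: application/octet-stream',
        'Expires: 0',
        'Pragma: no-cache',
        'Cache-Control: no-store',
    );
}

foreach (exampleDownloadHeaders("../../etc/passwd") as $line) {
    echo $line, "\n";
}
```

In the hook itself each returned line would be passed to header(); the choice of which characters to strip is an assumption and may be too strict or too lax for a given wiki.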
Then we'll strip the output. First we have to locate the anchors, but some anchors could be protected inside literal blocks like nowiki.
So we mask the literal blocks before searching for the anchors (we mask with a string of the same length, because the offsets we retrieve are used on the original string and must match).
TODO: should we also care about source, js, css, pre,... blocks?
{{#rawsnippetanchor: RawSnippet.php}}
$maskedtext=preg_replace_callback('/<nowiki>(.*?)<\/nowiki>/',
create_function(
'$matches',
'return ereg_replace(".","X",$matches[0]);'
),
$text);
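Note that create_function() and ereg_replace() are no longer available in current PHP releases (ereg_replace() was removed in PHP 7.0 and create_function() in PHP 8.0). A minimal sketch of the same masking step with a closure and str_repeat() could look like this; the function name `exampleMaskNowiki` is hypothetical, and the `s` modifier is an added assumption so that nowiki sections spanning several lines are masked too:

```php
<?php
// Hypothetical helper: masks every <nowiki>...</nowiki> section with 'X'
// characters of the same byte length, so that offsets found in the masked
// text are still valid in the original text.
function exampleMaskNowiki($text) {
    return preg_replace_callback(
        '/<nowiki>.*?<\/nowiki>/s',   // "s": let "." match line breaks as well
        function ($matches) {
            return str_repeat('X', strlen($matches[0]));
        },
        $text
    );
}

$text = "a <nowiki>{{#rawsnippetanchor: f.sh}}</nowiki> b";
$masked = exampleMaskNowiki($text);
echo $masked, "\n";
```

Masking with the same byte length is what keeps the later substr() calls on the original text aligned with the offsets found in the masked copy.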
Now we can search for the anchors (or the short version, in which case we only keep the first hit, no multiple blocks support)
And we free the memory used for the masked version
TODO: instead of cowardly returning if we don't find our anchors, we should cancel the headers and return a proper error page
{{#rawsnippetanchor: RawSnippet.php}}
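// preg_quote() keeps characters such as '.' in the file name from acting as regex operators
if (preg_match_all('/{{#rawsnippetanchor: +'.preg_quote($filename, '/').' *}}/i', $maskedtext, $matches, PREG_OFFSET_CAPTURE))
$offsets=$matches[0];
else if (preg_match_all('/{{#rawsnippet: +'.preg_quote($filename, '/').' *}}/i', $maskedtext, $matches, PREG_OFFSET_CAPTURE))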
$offsets=array($matches[0][0]);
else
// We didn't find our anchor, let's output all the raw...
return true;
unset($maskedtext);
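To illustrate how the offsets come out of preg_match_all() with PREG_OFFSET_CAPTURE, and why the file name is better escaped before it goes into the pattern, here is a small standalone sketch. The helper `exampleFindAnchors` is hypothetical and only handles the rawsnippetanchor form:

```php
<?php
// Hypothetical helper: returns the byte offsets of all anchors for $filename.
function exampleFindAnchors($text, $filename) {
    // preg_quote() escapes regex metacharacters such as the '.' in "myscript.sh"
    $pattern = '/{{#rawsnippetanchor: +' . preg_quote($filename, '/') . ' *}}/i';
    $offsets = array();
    if (preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE)) {
        foreach ($matches[0] as $match) {
            $offsets[] = $match[1];   // [0] is the matched text, [1] its offset
        }
    }
    return $offsets;
}

$page = "x {{#rawsnippetanchor: myscript.sh}} y {{#rawsnippetanchor: myscriptXsh}}";
print_r(exampleFindAnchors($page, 'myscript.sh'));
```

Without the escaping, the unescaped dot would also match the second anchor in this sample page.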
$text is both input & output so we copy it and start with an empty output. {{#rawsnippetanchor: RawSnippet.php}}
$textorig=$text;
$text='';
For each anchor found we've to isolate the content of the next block. {{#rawsnippetanchor: RawSnippet.php}}
foreach ($offsets as $offset) {
Let's remove the text up to the tag following the anchor
TODO: the next tag could be a < br >, which we should skip
{{#rawsnippetanchor: RawSnippet.php}}
$out = substr($textorig, $offset[1]);
$out = substr($out, strpos($out, '<'));
What type of tag do we have?
Note that we're looking at the word directly following '<', up to '>' or a space (e.g. when the tag has arguments).
TODO: once again, better handling of errors than just returning.
{{#rawsnippetanchor: RawSnippet.php}}
if (!preg_match('/^<([^> ]+)/', $out, $matches))
return true;
$key = $matches[1];
OK, let's extract the text up to the closing tag
We skip the first newline (line feed, ASCII 10) after the opening tag, if any
We look for the closing tag and we take what's in between.
TODO: once again, better handling of errors than just returning.
{{#rawsnippetanchor: RawSnippet.php}}
$begin = strpos($out, '>')+1;
if (ord(substr($out,$begin,1))==10)
$begin++;
if (preg_match_all('/<\/'.preg_quote($key, '/').'>/', $out, $matches, PREG_OFFSET_CAPTURE))
$text .= substr($out, $begin, $matches[0][0][1]-$begin);
else
// error, we could not find the end of the block: take everything up to the end of the page
$text .= substr($out, $begin);
}
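The loop body above can be hard to follow when read in pieces, so here is a condensed, standalone sketch of the same idea: find the tag after an anchor, read its name, skip one newline, and take everything up to the closing tag. The helper `exampleExtractBlock`, the sample page and the file name `s.sh` are all hypothetical, and the closing tag is located with strpos() instead of a regular expression.

```php
<?php
// Hypothetical helper: returns the content of the first tag found at or after
// $offset in $text, or null when no tag can be found.
function exampleExtractBlock($text, $offset) {
    $out = substr($text, $offset);
    $tagPos = strpos($out, '<');
    if ($tagPos === false) {
        return null;
    }
    $out = substr($out, $tagPos);
    // Tag name: the word after '<', up to '>' or a space
    if (!preg_match('/^<([^> ]+)/', $out, $matches)) {
        return null;
    }
    $key = $matches[1];
    $gt = strpos($out, '>');
    if ($gt === false) {
        return null;
    }
    $begin = $gt + 1;
    if (substr($out, $begin, 1) === "\n") {
        $begin++;                      // skip the newline after the opening tag
    }
    $end = strpos($out, '</' . $key . '>', $begin);
    if ($end === false) {
        return substr($out, $begin);   // no closing tag: take the rest
    }
    return substr($out, $begin, $end - $begin);
}

$page = "Header: {{#rawsnippetanchor: s.sh}}\n<source lang=bash>\n#!/bin/bash\n</source>\n"
      . "Body: {{#rawsnippetanchor: s.sh}}\n<pre>\necho hi\n</pre>\n";
$anchor = '{{#rawsnippetanchor: s.sh}}';
$file = '';
$pos = 0;
while (($pos = strpos($page, $anchor, $pos)) !== false) {
    $file .= exampleExtractBlock($page, $pos);
    $pos += strlen($anchor);
}
echo $file;
```

The two blocks use different tags (source and pre) to show that the tag name is picked up from the page rather than hard-coded.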
Be nice to the browser and tell it how much data we'll send.
And that's it, $text contains our file!
{{#rawsnippetanchor: RawSnippet.php}}
header("Content-Length: ".strlen($text));
return true;
}
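A short aside on why strlen() is the right call for this header: Content-Length is counted in bytes, and strlen() returns the number of bytes, not the number of characters. A tiny sketch with a UTF-8 sample string (the sample text is made up for illustration):

```php
<?php
// strlen() counts bytes, which is what Content-Length expects.
$text = "caf\xC3\xA9\n";   // "café" encoded as UTF-8, followed by a newline
$header = 'Content-Length: ' . strlen($text);
echo $header, "\n";
```

Here the five visible characters plus the newline take six bytes, because the accented letter is encoded on two bytes.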
Credits
{{#rawsnippetanchor: RawSnippet.php}}
$wgExtensionCredits['parserhook'][] = array('name' => 'RawSnippet',
'version' => '0.1',
'author' => 'Philippe Teuwen',
// 'url' => 'http://www.mediawiki.org/wiki/Extension:LocalServer',
'url' => 'http://wiki.yobi.be/wiki/Mediawiki_RawSnippet',
'description' => 'Downloads a RAW copy of <nowiki><tag>data</tag></nowiki> in a file<br>'.
'Useful e.g. to download an example code or a patch<br>'.
'It also opens the path to [http://en.wikipedia.org/wiki/Literate_programming Literate Programming]');
}
?>
Installation
Download [{{#rawsnippetlink: RawSnippet.php}} RawSnippet.php] and save it under the MediaWiki directory as extensions/RawSnippet/RawSnippet.php
Add at the end of LocalSettings.php:
require_once("$IP/extensions/RawSnippet/RawSnippet.php");