Difference between revisions of "Mediawiki RawFile"
m (→Credits) |
(→ChangeLog: Say have a work-around for 0.5.1) |
||
(68 intermediate revisions by 4 users not shown) | |||
Line 12: | Line 12: | ||
** an extended ?action=raw that will strip the raw output to keep the desired code |
** an extended ?action=raw that will strip the raw output to keep the desired code |
||
+ | |||
− | ==Documentation used to create the extension== |
||
− | * http://www.mediawiki.org/wiki/Manual:Extensions |
||
− | * http://www.mediawiki.org/wiki/Manual:Magic_words |
||
− | * http://www.mediawiki.org/wiki/Manual:Parser_functions |
||
− | * http://meta.wikimedia.org/wiki/Help:Parser_function |
||
− | * http://www.mediawiki.org/wiki/Manual:Hooks/RawPageViewBeforeOutput |
||
− | * http://en.wikipedia.org/wiki/Literate_programming |
||
==Syntax== |
==Syntax== |
||
+ | The extension introduces 3 elements: |
||
− | There are 2 kinds of elements to add to the wiki language: |
||
+ | ;Anchor |
||
− | * anchors that will flag which code blocks belong to a specific file |
||
+ | : Used to flag that the next code block in the wiki text belongs to a specific file. The code block can be any wiki block (such as '''<code><pre></code>''', '''<code><code></code>''', '''<code><tt></code>''', '''<code><source></code>'''...). '''<code><br></code>''' tags are ignored. Note that anchors are invisible in the wiki display. |
||
− | ** <code><nowiki>{{#fileAnchor: myscript.sh}}</nowiki></code> |
||
+ | ;Link |
||
− | ** Not visible in the regular wiki display |
||
+ | : Transformed by the extension into a link that allows downloading all blocks attached to a given anchor name. |
||
− | * links that will allow to download the file |
||
+ | ;Anchor-link |
||
− | ** <code><nowiki>{{#fileLink: myscript.sh}}</nowiki></code> |
||
+ | : A shortcut notation mixing both an anchor and download link, handy for regular use, when a single code block is used and when the download link can be at the same position as the anchor. |
||
− | ** Transformed into new regular wikicode that will be eventually transformed to real URLs: <br><code><nowiki>{{fullurl:{{PAGENAME}}|action=raw&file=myscript.sh}}</nowiki></code><br><code><nowiki>http://wiki.yobi.be/index.php?title=Mediawiki_RawFile&action=raw&file=myscript.sh</nowiki></code> |
||
+ | The syntax is as follows. The syntax using the <code><file></code> tag and the <code>class</code> tag attribute is new since v0.4. Note that elements of both syntaxes can be mixed on the same page. |
||
− | For regular use, when a single code block is used and when the download link can be at the same position as the anchor, there is a shortcut notation mixing both anchor & link properties: |
||
+ | {| border="2" cellspacing="4" cellpadding="3" style="margin: 1em 1em 1em 0; background: #f9f9f9; border: 1px #aaaaaa solid; border-collapse: collapse; empty-cells:show;" |
||
− | <code><nowiki>{{#file: myscript.sh}}</nowiki></code> |
||
+ | !width="80em" style="background: #8da7d6;"|Element!!style="background: #8da7d6;"|Syntax and description |
||
+ | |- |
||
+ | |'''Anchor''' |
||
+ | |<pre><nowiki> |
||
+ | {{#fileAnchor: anchorname}} |
||
+ | <pre class='anchorname'>...</pre> |
||
+ | <code class="anchorname">...</code> |
||
+ | <code class="cssclass anchorname">...</code> |
||
+ | ... |
||
+ | </nowiki></pre> |
||
+ | Indicates that the next wiki block is attached to an anchor ''anchorname''. The content of that block will be downloaded (possibly appended with other blocks if there are several blocks attached to the same ''anchorname'') when a file link is clicked on.<br/> |
||
+ | '''(since v0.4)''' To attach an anchor ''anchorname'' to a wiki block, simply add an attribute <code>class="anchorname"</code> to it. The extension supports multi-class specification, meaning that the same block can be associated with different files, and that the <code>class</code> attribute can still be used to specify custom CSS properties as in standard wiki text. |
||
+ | ; ''anchorname'' |
||
+ | ; class="''anchorname''" |
||
+ | : The name of the anchor to which the wiki block is attached |
||
+ | |- |
||
+ | |'''Link''' |
||
+ | |<pre><nowiki> |
||
+ | [{{#fileLink: anchorname}} link text] |
||
+ | [{{#fileLink: anchorname|pagetitle}} link text] |
||
+ | <file anchor="anchorname" [name="filename"] [title="pagetitle"]>link text</file> |
||
+ | </nowiki></pre> |
||
+ | Creates a link to download all blocks that are attached to an anchor ''anchorname''. |
||
+ | ;''anchorname'' |
||
+ | ;anchor="''anchorname''" |
||
+ | : The name of the anchor to look for. All blocks attached to an anchor ''anchorname'' will be downloaded. |
||
+ | ;name="''filename''" |
||
+ | :''Optional'' - Specifies the name of the file to download. If absent, ''anchorname'' is then used as the name of the downloaded file. |
||
+ | ; ''pagetitle'' |
||
+ | ;title="''pagetitle''" |
||
+ | : ''Optional'' - Indicates that the blocks to download are on the wiki page titled ''pagetitle''. If absent, blocks are looked for on the current page. |
||
+ | ; ''link text'' |
||
+ | : The text of the link to display. |
||
+ | |- |
||
+ | |'''Anchor-link''' |
||
+ | |<pre><nowiki> |
||
+ | [{{#file: filename}} link text] |
||
+ | <file name="filename" [tag="''tagname''"]>link text</file> |
||
+ | </nowiki></pre> |
||
+ | Creates a link to download the next wiki block as a file named ''filename''.<br/> |
||
+ | '''(since v0.4)''' The attribute <code>tag</code> can be used to specify the ''tagname'' of the block to download.<br> |
||
+ | ; ''filename'' |
||
+ | ; name="''filename''" |
||
+ | : The name of the file to download. |
||
+ | ;tag="''tagname''" |
||
+ | :''Optional'' - When set, the extension only looks for blocks whose name matches the given ''tagname''. This attribute is particularly useful when there are some irrelevant blocks between the '''anchor-link''' and the block you want to download. If absent, the first encountered block following the anchor is downloaded. |
||
+ | ; ''link text'' |
||
+ | : The text of the link to display. |
||
+ | |} |
||
===Short example=== |
===Short example=== |
||
+ | The extension works with any block such as pre, nowiki, js, css, code, source,... |
||
− | <pre>Let's save the following code [{{#file: myscript.sh}} as myscript.sh] |
||
+ | <br>This example is using the syntax highlighting <nowiki><source></nowiki> tag provided by [http://www.mediawiki.org/wiki/Extension:SyntaxHighlight_GeSHi SyntaxHighlight extension] (using [http://qbnz.com/highlighter/ GeSHi Highlighter]) |
||
+ | <br>If you didn't install that extension on your MediaWiki, you can try the example by using <nowiki><pre></nowiki> instead of <nowiki><source></nowiki>. |
||
+ | |||
+ | <pre><nowiki>Let's save the following code [{{#file: myscript.sh}} as myscript.sh] |
||
<source lang=bash> |
<source lang=bash> |
||
#!/bin/bash |
#!/bin/bash |
||
Line 39: | Line 86: | ||
echo 'Hello world!' |
echo 'Hello world!' |
||
exit 0 |
exit 0 |
||
− | </source> |
+ | </source></nowiki> |
</pre> |
</pre> |
||
will give: |
will give: |
||
Line 54: | Line 101: | ||
And a full example with anchors & link: |
And a full example with anchors & link: |
||
− | <pre> |
+ | <pre><nowiki> |
Let's start with the Bash usual header: |
Let's start with the Bash usual header: |
||
{{#fileanchor: myotherscript.sh}} |
{{#fileanchor: myotherscript.sh}} |
||
Line 63: | Line 110: | ||
{{#fileanchor: myotherscript.sh}} |
{{#fileanchor: myotherscript.sh}} |
||
<source lang=bash> |
<source lang=bash> |
||
− | echo 'Welcome on |
+ | echo 'Welcome on Earth!' |
</source> |
</source> |
||
And we finally exit cleanly: |
And we finally exit cleanly: |
||
Line 71: | Line 118: | ||
</source> |
</source> |
||
[{{#filelink: myotherscript.sh}} myotherscript.sh is now available for download below the code] |
[{{#filelink: myotherscript.sh}} myotherscript.sh is now available for download below the code] |
||
− | </pre> |
+ | </nowiki></pre> |
will give: |
will give: |
||
---- |
---- |
||
Line 82: | Line 129: | ||
{{#fileanchor: myotherscript.sh}} |
{{#fileanchor: myotherscript.sh}} |
||
<source lang=bash> |
<source lang=bash> |
||
− | echo 'Welcome on |
+ | echo 'Welcome on Earth!' |
</source> |
</source> |
||
And we finally exit cleanly: |
And we finally exit cleanly: |
||
Line 91: | Line 138: | ||
[{{#filelink: myotherscript.sh}} myotherscript.sh is now available for download below the code] |
[{{#filelink: myotherscript.sh}} myotherscript.sh is now available for download below the code] |
||
+ | === Example using templates === |
||
− | ==The code== |
||
+ | |||
+ | The new syntax <code><file name="..." [tag="..."]>...</file></code> allows for using the RawFile extension in templates as well. |
||
+ | |||
+ | The example below uses the template [[Template:Rawfiledownloadexample]] to avoid duplication between the file name in the text and as a parameter in the <code><file></code> tag. |
||
+ | |||
+ | See the template source for instructions on how to create the template. |
||
+ | |||
+ | <pre> |
||
+ | {{Rawfiledownloadexample|name=myfile.txt|content=Once upon a time |
||
+ | There was a Tag |
||
+ | Tag was clickable |
||
+ | And clicked it was}} |
||
+ | </pre> |
||
+ | |||
+ | The code above gives |
||
+ | <hr/> |
||
+ | {{Rawfiledownloadexample|name=myfile.txt|content=Once upon a time |
||
+ | There was a Tag |
||
+ | Tag was clickable |
||
+ | And clicked it was}} |
||
+ | |||
+ | ==The code (the ultimate example)== |
||
Which you can of course download just by following [{{#filelink: RawFile.php}} this link :-)] |
Which you can of course download just by following [{{#filelink: RawFile.php}} this link :-)] |
||
Line 97: | Line 166: | ||
===Hooks=== |
===Hooks=== |
||
First some hooks for our functions... |
First some hooks for our functions... |
||
+ | |||
+ | We will create: |
||
+ | * a [http://www.mediawiki.org/wiki/Manual:Parser_functions Parser Function] (see also [http://meta.wikimedia.org/wiki/Help:Parser_function here]), with help of |
||
+ | ** [http://www.mediawiki.org/wiki/Manual:%24wgExtensionFunctions $wgExtensionFunctions] or [http://www.mediawiki.org/wiki/Manual:Hooks/ParserFirstCallInit ParserFirstCallInit global hook] to define the setup function |
||
+ | ** [http://www.mediawiki.org/wiki/Manual:Magic_words Magic Words] |
||
+ | ** [http://www.mediawiki.org/wiki/Manual:Tag_extensions Tag extensions] |
||
+ | ** [http://www.mediawiki.org/wiki/Manual:Hooks/LanguageGetMagic LanguageGetMagic] hook to initialize the magic words |
||
+ | * a [http://www.mediawiki.org/wiki/Manual:Hooks/RawPageViewBeforeOutput RawPageViewBeforeOutput] hook to intercept the raw output |
||
+ | |||
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
Line 103: | Line 181: | ||
if (defined('MEDIAWIKI')) { |
if (defined('MEDIAWIKI')) { |
||
+ | //Avoid unstubbing $wgParser on setHook() too early on modern (1.12+) MW versions, as per r35980 |
||
− | # Define a setup function |
||
+ | if ( defined( 'MW_SUPPORTS_PARSERFIRSTCALLINIT' ) ) { |
||
− | $wgExtensionFunctions[] = 'efRawFile_Setup'; |
||
+ | $wgHooks['ParserFirstCallInit'][] = 'efRawFile_Setup'; |
||
− | # Add a hook to initialise the magic words |
||
+ | } else { // Otherwise do things the old fashioned way |
||
+ | $wgExtensionFunctions[] = 'efRawFile_Setup'; |
||
+ | } |
||
$wgHooks['LanguageGetMagic'][] = 'efRawFile_Magic'; |
$wgHooks['LanguageGetMagic'][] = 'efRawFile_Magic'; |
||
− | # Add a hook to intercept the raw output |
||
$wgHooks['RawPageViewBeforeOutput'][] = 'fnRawFile_Strip'; |
$wgHooks['RawPageViewBeforeOutput'][] = 'fnRawFile_Strip'; |
||
</source> |
</source> |
||
+ | |||
===Setup function=== |
===Setup function=== |
||
− | For the wiki parsing to create download links, file and fileLink are equally treated, while fileAnchor will be simply left out. |
+ | For the wiki parser to create download links, the parser functions '''file''' and '''fileLink''' are handled by the same function, while '''fileAnchor''' simply renders as nothing. We also create a new tag '''file''' as explained [http://www.mediawiki.org/wiki/Manual:Tag_extensions here]. |
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
function efRawFile_Setup() { |
function efRawFile_Setup() { |
||
global $wgParser; |
global $wgParser; |
||
− | # Set a function hook associating the "file" magic word with our function |
||
$wgParser->setFunctionHook( 'file', 'efRawFile_Render' ); |
$wgParser->setFunctionHook( 'file', 'efRawFile_Render' ); |
||
$wgParser->setFunctionHook( 'filelink', 'efRawFile_Render' ); |
$wgParser->setFunctionHook( 'filelink', 'efRawFile_Render' ); |
||
$wgParser->setFunctionHook( 'fileanchor', 'efRawFile_Empty' ); |
$wgParser->setFunctionHook( 'fileanchor', 'efRawFile_Empty' ); |
||
+ | $wgParser->setHook( 'file', 'efRawFile_FileTagRender' ); |
||
+ | return true; |
||
} |
} |
||
+ | |||
</source> |
</source> |
||
===Hook to initialize the magic words=== |
===Hook to initialize the magic words=== |
||
+ | We add the magic words here. The first array element indicates whether the word is case sensitive (here 0, so it is not). We could add extra elements to create synonyms for our parser function. |
||
+ | <br>Unless we return true, other parser function extensions will not get loaded. |
||
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
function efRawFile_Magic( &$magicWords, $langCode ) { |
function efRawFile_Magic( &$magicWords, $langCode ) { |
||
− | # Add the magic word |
||
− | # The first array element is case sensitive, in this case it is not case sensitive |
||
− | # All remaining elements are synonyms for our parser function |
||
$magicWords['file'] = array( 0, 'file' ); |
$magicWords['file'] = array( 0, 'file' ); |
||
$magicWords['filelink'] = array( 0, 'filelink' ); |
$magicWords['filelink'] = array( 0, 'filelink' ); |
||
$magicWords['fileanchor'] = array( 0, 'fileanchor' ); |
$magicWords['fileanchor'] = array( 0, 'fileanchor' ); |
||
− | # unless we return true, other parser functions extensions will not get loaded. |
||
return true; |
return true; |
||
} |
} |
||
+ | |||
</source> |
</source> |
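As a rough illustration of the synonym mechanism mentioned above, here is a minimal standalone sketch. The function name <code>exampleMagic</code> and the synonym <code>download</code> are made up for this example and are not part of RawFile.php.

```php
<?php
// Sketch of a LanguageGetMagic-style handler that registers one synonym.
// 'exampleMagic' and the synonym 'download' are hypothetical names.
function exampleMagic(array &$magicWords, string $langCode): bool {
    // First element: 0 = case insensitive, 1 = case sensitive.
    // Remaining elements: every name accepted for this parser function.
    $magicWords['file'] = array(0, 'file', 'download');
    // Returning true lets the other registered handlers run as well.
    return true;
}

$words = array();
$ok = exampleMagic($words, 'en');
```

With such a registration, <code><nowiki>{{#download: myscript.sh}}</nowiki></code> would presumably behave like <code><nowiki>{{#file: myscript.sh}}</nowiki></code>.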
||
===Parser functions of the magic words=== |
===Parser functions of the magic words=== |
||
− | The transformation rule to replace link shortcuts to actual links for download |
+ | The transformation rule that replaces link shortcuts with actual download links, handling an optional local wiki page title if present. |
+ | <br>The input parameters are wikitext with templates expanded, the output should be wikitext too |
||
<br>'''TODO''': what error to send out if there is no filename given? |
<br>'''TODO''': what error to send out if there is no filename given? |
||
+ | <br>'''EDIT''': It seems that [http://svn.wikimedia.org/viewvc/mediawiki?view=rev&revision=27667 commit 27667] (1.11 -> 1.12) changed the default parser, which breaks the recursive parsing. Thanks to Tim Starling for helping me to get around the problem! |
||
− | <br>'''TODO''': supports links to files located in other local wiki pages, sth like 2nd arg default to $pagename='{{PAGENAME}}' |
||
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
− | function efRawFile_Render( &$parser, $filename = '') { |
+ | function efRawFile_Render( &$parser, $filename = '', $titleText = '') { |
+ | if( $titleText == '' ) |
||
− | # The parser function itself |
||
+ | $title = $parser->mTitle; |
||
− | # The input parameters are wikitext with templates expanded |
||
+ | else |
||
− | # The output should be wikitext too |
||
+ | $title = Title::newFromText( $titleText ); |
||
− | return '{{fullurl:{{PAGENAME}}|action=raw&file='.$filename.'}}'; |
||
+ | //Don't expand templates or we'll lose our anchors {{#...}} |
||
+ | return $title->getFullURL( 'action=raw&anchor='.urlencode( $filename ) ); |
||
} |
} |
||
+ | |||
</source> |
</source> |
||
And the other one, just removing the anchors from the rendered wiki page. |
And the other one, just removing the anchors from the rendered wiki page. |
||
Line 160: | Line 246: | ||
return ''; |
return ''; |
||
} |
} |
||
+ | |||
+ | </source> |
||
+ | |||
+ | ===Parser functions of the new tag <tt><file></tt>=== |
||
+ | The transformation rule that replaces the <code><file></code> tag with an actual download link. The same parser function is used for both '''links''' and '''anchor-links'''. Since the link text may contain wiki text, we generate the link as wiki text and ask the parser to parse it again. |
||
+ | {{#fileanchor: RawFile.php}} |
||
+ | <source lang=php> |
||
+ | function efRawFile_FileTagRender( $input, $args, $parser, $frame ) { |
||
+ | if( !isset( $args['title'] ) || $args['title'] == '' ) |
||
+ | $title = $parser->mTitle; |
||
+ | else |
||
+ | $title = Title::newFromText($parser->recursiveTagParse( $args['title'], $frame )); |
||
+ | |||
+ | //We expand templates, so <file> tag cannot be mixed with {{#fileanchor}} anchors |
||
+ | $link=$title->getFullURL( 'action=raw&templates=expand' ); |
||
+ | // Optional attributes may be absent, so test with isset() first |
||
+ | if( isset( $args['name'] ) && $args['name'] != '' ) |
||
+ | $link.='&name='.urlencode( $parser->recursiveTagParse( $args['name'], $frame ) ); |
||
+ | if( isset( $args['anchor'] ) && $args['anchor'] != '' ) |
||
+ | $link.='&anchor='.urlencode( $parser->recursiveTagParse( $args['anchor'], $frame ) ); |
||
+ | if( isset( $args['tag'] ) && $args['tag'] != '' ) |
||
+ | $link.='&tag='.urlencode( $parser->recursiveTagParse( $args['tag'], $frame ) ); |
||
+ | |||
+ | return $parser->recursiveTagParse( "[$link $input]", $frame ); |
||
+ | } |
||
+ | |||
</source> |
</source> |
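The query-string part of the tag handler can be tried outside MediaWiki. The sketch below is only an illustration: the helper name <code>buildRawQuery</code> is hypothetical, and it uses PHP's <code>http_build_query</code> instead of the manual concatenation shown above.

```php
<?php
// Hypothetical helper (not part of RawFile.php): build the raw-download
// query string from the optional <file> attributes, skipping absent ones.
function buildRawQuery(array $args): string {
    $query = array('action' => 'raw', 'templates' => 'expand');
    foreach (array('name', 'anchor', 'tag') as $key) {
        if (isset($args[$key]) && $args[$key] !== '') {
            $query[$key] = $args[$key];
        }
    }
    // Explicit '&' separator so the result does not depend on php.ini.
    return http_build_query($query, '', '&');
}

$q = buildRawQuery(array('name' => 'my script.sh', 'tag' => 'pre'));
```

Each value is URL-encoded by <code>http_build_query</code>, which plays the role of the <code>urlencode</code> calls in the extension code.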
||
Line 165: | Line 276: | ||
This part of the code doesn't look that nice because we've to parse the raw wiki page ourselves to retrieve the code sections we want. |
This part of the code doesn't look that nice because we've to parse the raw wiki page ourselves to retrieve the code sections we want. |
||
+ | First we define a helper function that we will use to report error messages. This is simply done by replacing the content of the downloaded file with the error message and when necessary a copy of the raw text relevant to the error. |
||
− | First let's see if <code>?action=raw</code> was used in the context of this extension: in that case we receive the filename as GET parameter, otherwise we simply return from our extension with return value=true which means we authorize the raw display (originally the hook was created to add an authentication point) |
||
+ | <br>'''TODO''': Cancel the file download header and return a proper error page |
||
+ | {{#fileanchor: RawFile.php}} |
||
+ | <source lang=php> |
||
+ | function fnRawFile_Strip_Error($msg,$out,&$text) { |
||
+ | $text=$msg; |
||
+ | if($out != '') |
||
+ | $text.="\nCandidate match: $out"; |
||
+ | return true; |
||
+ | } |
||
+ | |||
+ | </source> |
||
+ | |||
+ | Next, we check whether <code>?action=raw</code> was used in the context of this extension. In that case we receive the anchor, the file name and optionally the tag name as GET parameters. Otherwise we simply return true, which authorizes the normal raw display (the hook was originally created as an authentication point). |
||
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
function fnRawFile_Strip(&$rawPage, &$text) { |
function fnRawFile_Strip(&$rawPage, &$text) { |
||
− | + | $filename=$_GET['name']; |
|
+ | $anchor=$_GET['anchor']; |
||
+ | // for backward compatibility, accept also URLs with parameter 'file' |
||
+ | if( $anchor=='' ) |
||
+ | $anchor=$_GET['file']; |
||
+ | $tag=$_GET['tag']; |
||
+ | // Either anchor or name must be specified |
||
+ | if( $filename=='' ) |
||
+ | $filename=$anchor; |
||
+ | if ( $filename=='' ) |
||
return true; |
return true; |
||
+ | </source> |
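The parameter fallbacks above can be sketched as a pure function, which makes them easy to test without a web request. The name <code>resolveRequest</code> is an assumption made for this example, and the <code>??</code> operator requires PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of RawFile.php): resolve anchor, file name
// and tag from a query array such as $_GET, with the same fallbacks.
function resolveRequest(array $get): array {
    $anchor = $get['anchor'] ?? '';
    if ($anchor === '') {
        $anchor = $get['file'] ?? '';   // legacy parameter name
    }
    $filename = $get['name'] ?? '';
    if ($filename === '') {
        $filename = $anchor;            // default: name the file after the anchor
    }
    return array(
        'anchor'   => $anchor,
        'filename' => $filename,
        'tag'      => $get['tag'] ?? '',
    );
}

$legacy = resolveRequest(array('file' => 'old.sh'));
$full   = resolveRequest(array('anchor' => 'a', 'name' => 'n.txt', 'tag' => 'pre'));
```

An empty <code>filename</code> in the result corresponds to the case where the hook returns early and leaves the raw output untouched.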
||
− | $filename=$_GET['file']; |
||
+ | By default the downloadable file will still be handled by the ob_gzhandler session made by Mediawiki. To avoid output buffering and gzipping, one can uncomment the following line: |
||
+ | {{#fileanchor: RawFile.php}} |
||
+ | <source lang=php> |
||
+ | // Uncomment the following line to avoid output buffering and gzipping: |
||
+ | // wfResetOutputBuffers(); |
||
</source> |
</source> |
||
Raw action already set the headers with some client cache pragmas and is supposed to be displayed in the browser but in our case we want to make this "page" a downloadable file so we overwrite the headers which were defined and we add a few more, to ensure there is no caching on the client (it's very hard for the client to force a refresh on a file download, contrary to a web page) and to provide the adequate filename. |
Raw action already set the headers with some client cache pragmas and is supposed to be displayed in the browser but in our case we want to make this "page" a downloadable file so we overwrite the headers which were defined and we add a few more, to ensure there is no caching on the client (it's very hard for the client to force a refresh on a file download, contrary to a web page) and to provide the adequate filename. |
||
− | <br>At the end once we know the size of the data we'll transfer, we'll add a Content-Length header |
||
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
− | header("Content-disposition:filename= |
+ | header("Content-disposition: attachment;filename={$filename}"); |
− | header("Content-type:application/ |
+ | header("Content-type: application/octet-stream"); |
header("Content-Transfer-Encoding: binary"); |
header("Content-Transfer-Encoding: binary"); |
||
header("Expires: 0"); |
header("Expires: 0"); |
||
Line 184: | Line 321: | ||
header("Cache-Control: no-store"); |
header("Cache-Control: no-store"); |
||
</source> |
</source> |
||
− | Then we'll strip the output, first we've to locate the anchors but there are anchors that could be protected in literal blocks like nowiki. |
+ | Then we'll strip the output, first we've to locate the anchors but there are anchors that could be protected in literal blocks like <code>nowiki</code>. |
− | <br>So we'll mask the literal blocks before searching for the anchors (we mask with the same string length because we'll retrieve an offset that we will use on the initial string and offsets must match) |
+ | <br>So we'll mask the literal blocks before searching for the anchors (we mask with the same string length because we'll retrieve an offset that we will use on the initial string and offsets must match). This is done with the scary regex below: |
+ | * we use <code>!</code> instead of <code>/</code> as pattern indicator so that the pattern string is self-matching. This is necessary since we will apply the extension on this page as well. |
||
− | <br>'''TODO''': should we care also of source, js, css, pre,... blocks? |
||
+ | * we use option <code>s</code> (the dot also matches newlines) and <code>e</code> (evaluate the replacement expression) |
||
+ | * The evaluated expression replaces all characters in the matched string with X's. However, if there are single quotes (<code>'</code>) in the matched string, they will be escaped with <code>\</code>, so we need to search for <code>\'|.</code>. The many backslashes are needed because the expression is evaluated several times. |
||
+ | '''TODO''': should we care also of source, js, css, pre,... blocks? |
||
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
− | $maskedtext=preg_replace_callback(' |
+ | $maskedtext=preg_replace_callback('!<nowiki>.*?</nowiki>!s', |
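+ | // Replace the whole match with X's, keeping its length unchanged (ereg_replace() was removed in PHP 7) |
+ | function($m) { return str_repeat("X",strlen($m[0])); }, |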
||
− | create_function( |
||
− | '$matches', |
||
− | 'return ereg_replace(".","X",$matches[0]);' |
||
− | ), |
||
$text); |
$text); |
||
</source> |
</source> |
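The masking step can be tried in isolation. The sketch below is illustrative only: the helper name <code>maskNowiki</code> and the sample text are assumptions, and the replacement is done with <code>str_repeat</code> so that the masked string always has the same byte length as the original.

```php
<?php
// Hypothetical helper (not part of RawFile.php): replace every nowiki block
// with X's of the same byte length, so that offsets found in the masked
// text remain valid in the original text.
function maskNowiki(string $text): string {
    return preg_replace_callback(
        '!<nowiki>.*?</nowiki>!s',
        function (array $m): string {
            return str_repeat('X', strlen($m[0]));
        },
        $text
    );
}

$sample = "a <nowiki>{{#fileanchor: x}}</nowiki> b {{#fileanchor: x}}";
$masked = maskNowiki($sample);
```

Only the second anchor survives in <code>$masked</code>, at the same offset it has in <code>$sample</code>.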
||
− | Now we can search for the anchors |
+ | Now we can search for the anchors: |
+ | * If an anchor name is specified, we look for '''all''' magic words <code><nowiki>{{#fileanchor:...}}</nowiki></code> or blocks with attribute <code>class="[someclass ]anchorname"</code> |
||
− | <br>And we free the memory used for the masked version |
||
+ | * Otherwise we look for the first magic word <code><nowiki>{{#file:...}}</nowiki></code> with specified file name, |
||
− | <br>'''TODO''': instead of cowardly returning if we don't find our anchors, we should cancel the headers and return a proper error page |
||
+ | * And finally for the first <code><file></code> tag with the specified file name (no multiple blocks support) |
||
+ | And we free the memory used for the masked version |
||
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
− | if (preg_match_all('/{{#fileanchor: |
+ | if (($anchor!='') && preg_match_all('/({{#fileanchor: *'.preg_quote($anchor,'/').' *}})|(<[^>]+ class *= *"([^"]*\s)?'.preg_quote($anchor,'/').'(\s[^"]*)?"[^>]*>)/i', $maskedtext, $matches, PREG_OFFSET_CAPTURE))
$offsets=$matches[0]; |
$offsets=$matches[0]; |
||
− | else if (preg_match_all('/{{#file: |
+ | else if (preg_match_all('/{{#file: *'.preg_quote($anchor,'/').' *}}/i', $maskedtext, $matches, PREG_OFFSET_CAPTURE))
$offsets=array($matches[0][0]); |
$offsets=array($matches[0][0]); |
||
+ | else if (preg_match_all('/<file( [^>]*)? name *= *"'.preg_quote($filename,'/').'"[^>]*>/i', $maskedtext, $matches, PREG_OFFSET_CAPTURE)) |
||
− | else |
||
+ | $offsets=array($matches[0][0]); |
||
− | // We didn't find our anchor, let's output all the raw... |
||
− | + | else { |
|
+ | // We didn't find our anchor |
||
+ | return fnRawFile_Strip_Error("ERROR - RawFile: anchor not found (anchor=$anchor, name=$filename, tag=$tag)","",$text); |
||
+ | } |
||
unset($maskedtext); |
unset($maskedtext); |
||
</source> |
</source> |
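To see what <code>PREG_OFFSET_CAPTURE</code> returns, the anchor search can be reduced to the following sketch. The helper name <code>findAnchorOffsets</code> and the sample page are hypothetical. The anchor name is passed through <code>preg_quote</code> so that a dot in a file name is matched literally.

```php
<?php
// Hypothetical helper (not part of RawFile.php): return the byte offsets of
// every {{#fileanchor: NAME}} occurrence in $text.
function findAnchorOffsets(string $text, string $anchor): array {
    // preg_quote() keeps characters such as "." in the name literal.
    $pattern = '/{{#fileanchor: *' . preg_quote($anchor, '/') . ' *}}/i';
    preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);
    // Each entry of $matches[0] is array(matched string, byte offset).
    return array_column($matches[0], 1);
}

$page  = "intro {{#fileanchor: a.sh}} one {{#fileanchor: axsh}} two {{#FileAnchor: a.sh}}";
$found = findAnchorOffsets($page, 'a.sh');
```

The anchor named <code>axsh</code> is skipped, while the differently capitalized third anchor is found thanks to the <code>i</code> option.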
||
Line 221: | Line 363: | ||
foreach ($offsets as $offset) { |
foreach ($offsets as $offset) { |
||
</source> |
</source> |
||
+ | We start from the position of the current anchor. If the tag name of the block attached to the anchor is not specified, we look for the first block that follows the anchor, excluding <code><br></code> and <code><file></code> blocks. The search can easily be done with a regular expression, using the ''negative lookahead assertion'' <code>(?!br\b|file\b)</code> to exclude the tags to ignore. Note that we need to ignore the anchor-link block <code><file></code> since the anchor starts right before that tag, and so the regular expression would match the anchor-link block if that tag were not specifically excluded. |
||
− | Let's remove the text up to the tag following the anchor |
||
− | <br>'''TODO''': the next tag could be a < br >, which we should skip |
||
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
$out = substr($textorig, $offset[1]); |
$out = substr($textorig, $offset[1]); |
||
+ | // If no tag specified, we take the first one |
||
− | $out = substr($out, strpos($out, '<')); |
||
+ | if ($tag == '') |
||
+ | { |
||
+ | // With a regex assertion, we can easily ignore 'br' and 'file' tags |
||
+ | if (!preg_match('/<((?!br\b|file\b)\w+\b)/', $out, $matches)) |
||
+ | return fnRawFile_Strip_Error ("ERROR - RawFile: Can't find opening tag after anchor '$offset[0]' (anchor=$anchor, name=$filename, tag=$tag)",$out,$text); |
||
+ | $tag=$matches[1]; |
||
+ | } |
||
</source> |
</source> |
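The effect of the negative lookahead can be checked with a small standalone sketch. The helper name <code>firstBlockTag</code> and the sample input are made up for this example. The regular expression is the one used above.

```php
<?php
// Hypothetical helper (not part of RawFile.php): name of the first tag in
// $text that is neither <br> nor <file>, or null when there is none.
function firstBlockTag(string $text): ?string {
    if (preg_match('/<((?!br\b|file\b)\w+\b)/', $text, $m)) {
        return $m[1];
    }
    return null;
}

$after   = '<file name="x.sh">get it</file><br/><pre>echo hi</pre>';
$tagName = firstBlockTag($after);
```

The <code>\b</code> inside the assertion matters: a tag whose name merely starts with <code>br</code> is not excluded.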
||
+ | Now we know the tag name of the block to download, either because it was specified as a GET parameter in the URL or because we found it in the search above. Again using a regular expression, we look for the first block with that tag name after the current anchor and extract its content. Note the regex option <code>/.../<u>s</u></code>, which lets the matched text span multiple lines (with that option, <code>.</code> matches any character, including a newline). We also skip the first newline after the opening tag, if any (with <code>\n?</code>). |
||
− | What type of tag do we have? |
||
− | <br>Note that we're looking to the word directly following '<' up to '>' or a space, e.g. if there are arguments to the tag. |
||
− | <br>'''TODO''': once again, better handling of errors than just returning. |
||
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
+ | // Find the first tag matching $tag, and return enclosed text |
||
− | if (!preg_match('/^<([^> ]+)/', $out, $matches)) |
||
+ | if (!preg_match('/<'.$tag.'( [^>]*)?>\n?(.*?)<\/'.$tag.'>/s', $out, $matches)) |
||
− | return true; |
||
+ | return fnRawFile_Strip_Error ("ERROR - RawFile: no closing '$tag' found after anchor '$offset[0]' (anchor=$anchor, name=$filename, tag=$tag)",$out,$text); |
||
− | $key = $matches[1]; |
||
+ | $text .= $matches[2]; |
||
− | </source> |
||
− | OK, let's extract the text up to the closing tag |
||
− | <br>We skip the first carriage return after the opening tag, if any |
||
− | <br>We look for the closing tag and we take what's in between. |
||
− | <br>'''TODO''': once again, better handling of errors than just returning. |
||
− | {{#fileanchor: RawFile.php}} |
||
− | <source lang=php> |
||
− | $begin = strpos($out, '>')+1; |
||
− | if (ord(substr($out,$begin,1))==10) |
||
− | $begin++; |
||
− | if (preg_match_all('/<\/'.$key.'>/', $out, $matches, PREG_OFFSET_CAPTURE)) |
||
− | $text .= substr($out, $begin, $matches[0][0][1]-$begin); |
||
− | else |
||
− | // error, we could not find end of bloc |
||
− | $text .= substr($out, $begin); |
||
} |
} |
||
</source> |
</source> |
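The extraction step can also be exercised on its own. In the sketch below, the helper name <code>extractBlock</code> and the sample wikitext are assumptions. Unlike the code above, it passes the tag name through <code>preg_quote</code>.

```php
<?php
// Hypothetical helper (not part of RawFile.php): content of the first
// <$tag ...>...</$tag> block in $text, or null when no such block exists.
function extractBlock(string $text, string $tag): ?string {
    $t = preg_quote($tag, '/');
    // /s lets "." match newlines; \n? drops one newline after the opening tag.
    if (preg_match('/<' . $t . '( [^>]*)?>\n?(.*?)<\/' . $t . '>/s', $text, $m)) {
        return $m[2];
    }
    return null;
}

$wikitext = "<source lang=bash>\n#!/bin/bash\necho hi\n</source> trailing";
$body     = extractBlock($wikitext, 'source');
```

The lazy quantifier <code>.*?</code> stops at the first closing tag, so text after the block is never included.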
||
+ | There is no need to set a Content-Length header ourselves: MediaWiki does it for us, and does it correctly even when the output is sent gzipped, which is the default. |
||
− | Be nice with the browser and tell it how much data we'll send. |
||
− | <br> |
+ | <br>So that's it, $text contains our file! |
{{#fileanchor: RawFile.php}} |
{{#fileanchor: RawFile.php}} |
||
<source lang=php> |
<source lang=php> |
||
− | header("Content-Length: ".strlen($text)); |
||
return true; |
return true; |
||
} |
} |
||
+ | |||
</source> |
</source> |
||
Line 268: | Line 400: | ||
<source lang=php> |
<source lang=php> |
||
$wgExtensionCredits['parserhook'][] = array('name' => 'RawFile', |
$wgExtensionCredits['parserhook'][] = array('name' => 'RawFile', |
||
− | 'version' => '0.1', |
+ | 'version' => '0.5.1', |
− | 'author' => 'Philippe Teuwen', |
+ | 'author' => 'Philippe Teuwen, Michael Peeters', |
− | + | 'url' => 'http://www.mediawiki.org/wiki/Extension:RawFile', |
|
− | + | // 'url' => 'http://wiki.yobi.be/wiki/Mediawiki_RawFile', |
|
'description' => 'Downloads a RAW copy of <nowiki><tag>data</tag></nowiki> in a file<br>'. |
'description' => 'Downloads a RAW copy of <nowiki><tag>data</tag></nowiki> in a file<br>'. |
||
'Useful e.g. to download a script or a patch<br>'. |
'Useful e.g. to download a script or a patch<br>'. |
||
Line 279: | Line 411: | ||
?> |
?> |
||
</source> |
</source> |
||
+ | And finally registration of the extension at the Mediawiki website according to the [http://www.mediawiki.org/wiki/Manual:Extensions Extensions Manual]. |
||
+ | |||
+ | So this extension has now [http://www.mediawiki.org/wiki/Extension:RawFile its own page on the official Mediawiki site]. |
||
==Installation== |
==Installation== |
||
Line 287: | Line 422: | ||
require_once("$IP/extensions/RawFile/RawFile.php"); |
require_once("$IP/extensions/RawFile/RawFile.php"); |
||
</source> |
</source> |
||
+ | ==Status== |
||
+ | If you use the extension properly the code is fully functional but it's rather raw on error handling. |
||
+ | ==ChangeLog== |
||
+ | '''0.5.1''' |
||
+ | * Integrate patch from Jani Uusitalo to recursively parse tag attribute values in case they contain [https://www.mediawiki.org/wiki/Help:Magic_words magic words] such as wiki templates or parameters. This is useful when using <code><file>...</file></code> in wiki templates. |
||
+ | * Prepare fix for <code>anchor not found</code> bug when using short notation <code><file name="filename" [tag="''tagname''"]>link text</file></code> in wiki templates. This still doesn't work because of the way MediaWiki (v1.22.1) expands templates in raw output (see [https://bugzilla.wikimedia.org/show_bug.cgi?id=61341 bugzilla 61341]). |
||
+ | ** Work-around available. See bugzilla bug report, or example section above. |
||
+ | '''0.5''' |
||
+ | * Fix since PHP 5.5.0: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead. Thanks to Stephen Kent for reporting. |
||
+ | * <span style="color:red">'''WARNING''' you should upgrade ASAP as the previous versions 0.4 and 0.4.1 are vulnerable to remote PHP code injection!!!</span><br>See [[Talk:Mediawiki_RawFile]] for more info. |
||
+ | * Fix for MediaWiki 1.22.1. |
||
+ | '''0.4.1''' |
||
+ | * Fix octet-stream MIME type bug which was affecting Epiphany & Opera 11. Thanks to Jani Uusitalo for reporting & [[User:Sicvolo|Sicvolo]] for finding the solution |
||
+ | '''0.4''' |
||
+ | * '''Anchors''' can be specified using html '''class''' attribute |
||
+ | * New syntax for '''Links''' and '''Anchor-links''': |
||
+ | :<code><nowiki><file [name="..."] [anchor="..."] [tag="..."] [title="..."] >Link text</file></nowiki></code> |
||
+ | * Support multiple files on the same page with same name (differentiated by their anchor name) or even common blocks in multiple files. |
||
+ | * Can specify the tag name of the block to download (to skip some irrelevant blocks when using an '''anchor-link'''). |
||
+ | * Ignore '''<code><br></code>''' tag. |
||
+ | * Some error reporting. |
||
+ | '''0.3''' |
||
+ | * Added optional parameter to <code>#fileLink</code> to indicate that the file is on another local wiki page |
||
+ | '''0.2''' |
||
+ | * Fix problem with Content-Length mismatch when transport is gzipped (default for Mediawiki if client supports it) |
||
+ | '''0.1''' |
||
+ | * Initial version |
||
+ | |||
+ | ==Known bugs== |
||
+ | ==Questions and feedback== |
||
+ | If you've any trouble, questions or suggestions, you can [[User:PhilippeTeuwen|contact me]]. |
||
+ | ==Known sites using the extension== |
||
+ | <!-- * [http://tech.ivkin.net/wiki/Main_Page Tech Knowledge Base Wiki], website of [[User:Sicvolo|Alex Ivkin]] --> |
||
+ | <!-- * [http://nginx.asia NGINX Asia] --> |
||
+ | <!-- * [http://www.gnutelephony.org GNU Telephony] runs a fork of the extension: Fram --> |
||
+ | * [http://wiki.hpc.ufl.edu University of Florida Research Computing Wiki] |
||
+ | * [http://wiki.scribus.net Scribus wiki] |
||
+ | * [http://nfc-tools.org/ NFC-Tools] |
||
+ | * [https://smkent.net/wiki/ smkent.net], website of Stephen Kent |
||
+ | * [http://mummila.net/wiki/ mummila.net], website of Jani Uusitalo |
||
+ | * [http://www.richud.com/wiki/ richud.com], website of Richard Moore |
||
+ | <!-- * [http://risca.eu risca.eu], website of Riccardo Scartozzi |
||
+ | * Well, this site of course! where I update the actual file on the server with a simple <br><code>wget -N --content-disposition "http://wiki.yobi.be/index.php?title=Mediawiki_RawFile&action=raw&file=RawFile.php"</code> |
||
+ | <!-- * [http://far.no/fram/index.php?title=Fram Far.no/Fram], website of Haakon Meland Eriksen --> |
||
+ | |||
+ | See also wikiapiary to follow usage of extensions [https://wikiapiary.com/wiki/Extension:RawFile RawFile] and [https://wikiapiary.com/wiki/Extension:Fram Fram] |
Latest revision as of 11:14, 17 February 2014
Very short introduction
Just have a look at the two examples to see how to use the extension
and at the Installation section to see how to install it on your MediaWiki server
Introduction
Originally the idea was to be able to download directly a portion of code as a file.
I've numerous code examples in my wiki and I wanted an easy way to download them, easier than a copy/paste!
But from there it was rather easy to get something very close to literate programming just by allowing multiple blocks referring to the same file, which will be concatenated together at download time.
- It must work with pre, nowiki, js, css, code, source, so let's make it general: take the tag that comes after the parser function we'll create and select data up to the closing tag.
- There are two distinct functionalities provided by the extension:
- the parser that will convert a magic word into a link to the download URL
- an extended ?action=raw that will strip the raw output to keep the desired code
Syntax
The extension introduces 3 elements:
- Anchor
- Used to flag that the next code block in the wiki text belongs to a specific file. The code block can be any wiki block (such as <pre>, <code>, <tt>, <source>...). <br> tags are ignored. Note that anchors are invisible in the wiki display.
- Link
- Transformed by the extension into a link that allows downloading all blocks attached to a given anchor name.
- Anchor-link
- A shortcut notation mixing both an anchor and download link, handy for regular use, when a single code block is used and when the download link can be at the same position as the anchor.
The syntax is as follows. The syntax using the <file> tag and the class tag attribute is new since v0.4. Note that elements of both syntaxes can be mixed on the same page.
Element | Syntax and description |
---|---|
Anchor | {{#fileAnchor: anchorname}} <pre class='anchorname'>...</pre> <code class="anchorname">...</code> <code class="cssclass anchorname">...</code> ... Indicates that the next wiki block is attached to an anchor anchorname. The content of that block will be downloaded (possibly appended with other blocks if there are several blocks attached to the same anchorname) when a file link is clicked on.
|
Link | [{{#fileLink: anchorname}} link text] [{{#fileLink: anchorname|pagetitle}} link text] <file anchor="anchorname" [name="filename"] [title="pagetitle"]>link text</file> Creates a link to download all blocks that are attached to an anchor anchorname.
|
Anchor-link | [{{#file: filename}} link text] <file name="filename" [tag="tagname"]>link text</file> Creates a link to download the next wiki block as a file named filename.
|
Short example
The extension works with any block such as pre, nowiki, js, css, code, source,...
This example is using the syntax highlighting <source> tag provided by SyntaxHighlight extension (using GeSHi Highlighter)
If you didn't install that extension on your MediaWiki, you can try the example by using <pre> instead of <source>.
Let's save the following code [{{#file: myscript.sh}} as myscript.sh]
<source lang=bash>
#!/bin/bash
echo 'Hello world!'
exit 0
</source>
will give:
Let's save the following code [{{#file: myscript.sh}} as myscript.sh]
#!/bin/bash
echo 'Hello world!'
exit 0
Complete example
And a full example with anchors & link:
Let's start with the Bash usual header: {{#fileanchor: myotherscript.sh}}
<source lang=bash>
#!/bin/bash
</source>
Then we'll display a welcome message: {{#fileanchor: myotherscript.sh}}
<source lang=bash>
echo 'Welcome on Earth!'
</source>
And we finally exit cleanly: {{#fileanchor: myotherscript.sh}}
<source lang=bash>
exit 0
</source>
[{{#filelink: myotherscript.sh}} myotherscript.sh is now available for download below the code]
will give:
Let's start with the Bash usual header: {{#fileanchor: myotherscript.sh}}
#!/bin/bash
Then we'll display a welcome message: {{#fileanchor: myotherscript.sh}}
echo 'Welcome on Earth!'
And we finally exit cleanly: {{#fileanchor: myotherscript.sh}}
exit 0
[{{#filelink: myotherscript.sh}} myotherscript.sh is now available for download below the code]
Example using templates
The new syntax <file name="..." [tag="..."]>...</file> allows using the RawFile extension in templates as well.
The example below uses the template Template:Rawfiledownloadexample to avoid duplicating the file name between the text and the parameter of the <file> tag.
See the template source for instructions on how to create the template.
{{Rawfiledownloadexample|name=myfile.txt|content=Once upon a time There was a Tag Tag was clickable And clicked it was}}
The code above gives
You can download file "myfile.txt" below, just click this link: <file name="myfile.txt" tag="pre">myfile.txt</file>.
Once upon a time There was a Tag Tag was clickable And clicked it was
The code (the ultimate example)
Which you can of course download just by following [{{#filelink: RawFile.php}} this link :-)]
So let's walk through the code in a Literate Programming way...
Hooks
First some hooks for our functions...
We will create:
- a Parser Function (see also here), with help of
- $wgExtensionFunctions or ParserFirstCallInit global hook to define the setup function
- Magic Words
- Tag extensions
- LanguageGetMagic hook to initialize the magic words
- a RawPageViewBeforeOutput hook to intercept the raw output
{{#fileanchor: RawFile.php}}
<?php
if (defined('MEDIAWIKI')) {
//Avoid unstubbing $wgParser on setHook() too early on modern (1.12+) MW versions, as per r35980
if ( defined( 'MW_SUPPORTS_PARSERFIRSTCALLINIT' ) ) {
    $wgHooks['ParserFirstCallInit'][] = 'efRawFile_Setup';
} else { // Otherwise do things the old fashioned way
    $wgExtensionFunctions[] = 'efRawFile_Setup';
}
$wgHooks['LanguageGetMagic'][] = 'efRawFile_Magic';
$wgHooks['RawPageViewBeforeOutput'][] = 'fnRawFile_Strip';
Setup function
For the wiki parsing to create download links, the parser functions file and fileLink are treated identically, while fileAnchor simply produces no output. We also create a new tag file as explained here. {{#fileanchor: RawFile.php}}
function efRawFile_Setup() {
    global $wgParser;
    $wgParser->setFunctionHook( 'file', 'efRawFile_Render' );
    $wgParser->setFunctionHook( 'filelink', 'efRawFile_Render' );
    $wgParser->setFunctionHook( 'fileanchor', 'efRawFile_Empty' );
    $wgParser->setHook( 'file', 'efRawFile_FileTagRender' );
    return true;
}
Hook to initialize the magic words
We add the magic words here: the first array element indicates if it is case sensitive, in this case it is not case sensitive. We could add extra elements to create synonyms for our parser function.
Unless we return true, other parser function extensions will not get loaded.
{{#fileanchor: RawFile.php}}
function efRawFile_Magic( &$magicWords, $langCode ) {
    $magicWords['file'] = array( 0, 'file' );
    $magicWords['filelink'] = array( 0, 'filelink' );
    $magicWords['fileanchor'] = array( 0, 'fileanchor' );
    return true;
}
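The remark about synonyms can be illustrated with a small stand-alone sketch. The function name rawfile_example_magic and the synonym download are hypothetical and are not part of RawFile.php; only the array layout (case-sensitivity flag first, then the word and its synonyms) follows the description above.

```php
<?php
// Hypothetical variant of the magic word hook, for illustration only.
function rawfile_example_magic( &$magicWords, $langCode ) {
    // 0 = not case sensitive; 'download' is an assumed extra synonym,
    // so {{#file: ...}} and {{#download: ...}} would be equivalent
    $magicWords['file'] = array( 0, 'file', 'download' );
    return true;
}

$words = array();
rawfile_example_magic( $words, 'en' );
print_r( $words['file'] );
```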
Parser functions of the magic words
This is the transformation rule that replaces link shortcuts with actual download links, handling an optional local wiki page title if present.
The input parameters are wikitext with templates expanded; the output should be wikitext too.
TODO: what error to send out if there is no filename given?
EDIT: It seems that commit 27667 (1.11 -> 1.12) changed the default parser, which breaks the recursive parsing. Thanks to Tim Starling for helping me to get around the problem!
{{#fileanchor: RawFile.php}}
function efRawFile_Render( &$parser, $filename = '', $titleText = '') {
    if( $titleText == '' )
        $title = $parser->mTitle;
    else
        $title = Title::newFromText( $titleText );
    //Don't expand templates or we'll lose our anchors {{#...}}
    return $title->getFullURL( 'action=raw&anchor='.urlencode( $filename ) );
}
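To make the generated link more concrete, here is a minimal sketch of the query string built by the function above. The helper rawfile_example_query and the base URL are assumptions made for this illustration; in the extension the URL is produced by the Title object.

```php
<?php
// Hypothetical helper (not part of RawFile.php): builds the same kind of
// query string that efRawFile_Render() passes to getFullURL().
function rawfile_example_query( $anchor ) {
    // urlencode() protects characters such as spaces or '&' in the anchor name
    return 'action=raw&anchor=' . urlencode( $anchor );
}

// Assumed base URL, for illustration only
$base = 'http://wiki.example.org/index.php?title=My_Page';
echo $base . '&' . rawfile_example_query( 'my script.sh' ), "\n";
```

Clicking such a link triggers the raw action, which is then intercepted by the RawPageViewBeforeOutput hook described further down.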
And the other one, just removing the anchors from the rendered wiki page.
Curiously enough if the function doesn't exist at all the effect is exactly the same, MW doesn't throw any error.
But let's keep things clean...
{{#fileanchor: RawFile.php}}
function efRawFile_Empty( &$parser, $filename = '') {
    return '';
}
Parser functions of the new tag <file>
The transformation rule that replaces the <file> tag with an actual download link. The same parser function is used for both links and anchor-links. Since the link text may contain wiki text, we generate the link as wiki text that we ask the parser to parse again.
{{#fileanchor: RawFile.php}}
function efRawFile_FileTagRender( $input, $args, $parser, $frame ) {
    if( $args['title'] == '' )
        $title = $parser->mTitle;
    else
        $title = Title::newFromText($parser->recursiveTagParse( $args['title'], $frame ));
    //We expand templates, so <file> tag cannot be mixed with {{#fileanchor}} anchors
    $link=$title->getFullURL( 'action=raw&templates=expand' );
    if( $args['name'] != '' )
        $link.='&name='.urlencode( $parser->recursiveTagParse( $args['name'], $frame ) );
    if( $args['anchor'] != '' )
        $link.='&anchor='.urlencode( $parser->recursiveTagParse( $args['anchor'], $frame ) );
    if( $args['tag'] != '' )
        $link.='&tag='.urlencode( $parser->recursiveTagParse( $args['tag'], $frame ) );
    return $parser->recursiveTagParse( "[$link $input]", $frame );
}
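The function above reads the optional attributes title, name, anchor and tag directly from $args, so an omitted attribute makes PHP report an undefined index notice. A minimal sketch of a guard follows; the helper rawfile_example_arg is hypothetical and is not part of the extension.

```php
<?php
// Hypothetical helper: returns the attribute value, or '' when it is absent,
// so that comparisons with '' never touch an undefined array index.
function rawfile_example_arg( array $args, $key ) {
    return isset( $args[$key] ) ? $args[$key] : '';
}

// Attributes as the parser would pass them for
// <file name="myfile.txt" tag="pre">...</file>
$args = array( 'name' => 'myfile.txt', 'tag' => 'pre' );
var_dump( rawfile_example_arg( $args, 'title' ) === '' );
var_dump( rawfile_example_arg( $args, 'name' ) );
```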
Hook to intercept the raw output
This part of the code doesn't look that nice because we've to parse the raw wiki page ourselves to retrieve the code sections we want.
First we define a helper function that we will use to report error messages. This is simply done by replacing the content of the downloaded file with the error message and when necessary a copy of the raw text relevant to the error.
TODO: Cancel the file download header and return a proper error page
{{#fileanchor: RawFile.php}}
function fnRawFile_Strip_Error($msg,$out,&$text) {
    $text=$msg;
    if($out != '')
        $text.="\nCandidate match: $out";
    return true;
}
Next let's see if ?action=raw was used in the context of this extension: in that case we receive the file name and/or the anchor name as GET parameters. Otherwise we simply return true from our hook, which means we authorize the raw display (originally the hook was created to add an authentication point).
{{#fileanchor: RawFile.php}}
function fnRawFile_Strip(&$rawPage, &$text) {
    $filename=$_GET['name'];
    $anchor=$_GET['anchor'];
    // for backward compatibility, accept also URLs with parameter 'file'
    if( $anchor=='' )
        $anchor=$_GET['file'];
    $tag=$_GET['tag'];
    // Either anchor or name must be specified
    if( $filename=='' )
        $filename=$anchor;
    if ( $filename=='' )
        return true;
By default the downloadable file will still be handled by the ob_gzhandler session made by Mediawiki. To avoid output buffering and gzipping, one can uncomment the following line: {{#fileanchor: RawFile.php}}
    // Uncomment the following line to avoid output buffering and gzipping:
    // wfResetOutputBuffers();
The raw action has already set the headers with some client cache pragmas, as its output is supposed to be displayed in the browser. In our case we want to make this "page" a downloadable file, so we overwrite the headers that were defined and add a few more: to ensure there is no caching on the client (it's very hard for the client to force a refresh on a file download, contrary to a web page) and to provide the adequate filename. {{#fileanchor: RawFile.php}}
    header("Content-disposition: attachment;filename={$filename}");
    header("Content-type: application/octet-stream");
    header("Content-Transfer-Encoding: binary");
    header("Expires: 0");
    header("Pragma: no-cache");
    header("Cache-Control: no-store");
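The file name placed in the Content-disposition header comes straight from the request. As a sketch only, one possible way to restrict it to a plain file name before sending the header is shown here. The helper rawfile_example_safe_filename, the allowed character set and the fallback name download are assumptions of this example, not something the extension does.

```php
<?php
// Hypothetical helper, not part of RawFile.php: keeps only the base name and
// replaces every character outside a conservative set with '_'.
function rawfile_example_safe_filename( $filename ) {
    $name = basename( $filename );
    $name = preg_replace( '/[^A-Za-z0-9._-]/', '_', $name );
    // Assumed fallback when nothing usable is left
    return $name === '' ? 'download' : $name;
}

echo 'Content-disposition: attachment;filename="'
    . rawfile_example_safe_filename( 'my script.sh' ) . '"', "\n";
```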
Then we'll strip the output. First we have to locate the anchors, but some anchors could be protected in literal blocks like nowiki.
So we'll mask the literal blocks before searching for the anchors (we mask with a string of the same length because the offsets we retrieve will be used on the initial string, so offsets must match). This is done with the scary regex below:
- we use ! instead of / as pattern delimiter so that the pattern string is self-matching. This is necessary since we will apply the extension on this page as well.
- we use option s (so that . also matches newlines and a match can span multiple lines). Versions before 0.5 also used option e (evaluate replace expression), which is deprecated since PHP 5.5.0, hence the preg_replace_callback call below.
- With option e, the evaluated expression replaced all characters in the matched string with X's. However if there were single quotes (') in the matched string, they were escaped with \, so the expression had to search for \'|. and the many back-slashes were needed because the expression was evaluated several times.
TODO: should we care also of source, js, css, pre,... blocks? {{#fileanchor: RawFile.php}}
    $maskedtext=preg_replace_callback('!<nowiki>.*?</nowiki>!s',
        // str_repeat() replaces ereg_replace(), which was removed in PHP 7
        function($m) { return str_repeat("X", strlen($m[0])); },
        $text);
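The point of masking with a string of the same length can be checked with a small stand-alone sketch. The function rawfile_example_mask and the sample text are made up for this illustration, and str_repeat() is used here as one simple way to produce the X's.

```php
<?php
// Hypothetical stand-alone version of the masking step
function rawfile_example_mask( $text ) {
    return preg_replace_callback(
        '!<nowiki>.*?</nowiki>!s',
        // Replace the whole literal block with as many X's as it has bytes
        function ( $m ) { return str_repeat( 'X', strlen( $m[0] ) ); },
        $text
    );
}

$text   = "a <nowiki>{{#fileanchor: x}}</nowiki> b {{#fileanchor: x}}";
$masked = rawfile_example_mask( $text );

// Same length, so offsets found in $masked are valid in $text
var_dump( strlen( $masked ) === strlen( $text ) );
// Only the unprotected (second) anchor is still visible after masking
var_dump( strpos( $masked, '{{#fileanchor: x}}' ) === strrpos( $text, '{{#fileanchor: x}}' ) );
```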
Now we can search for the anchors:
- If an anchor name is specified, we look for all magic words {{#fileanchor:...}} or blocks with attribute class="[someclass ]anchorname"
- Otherwise we look for the first magic word {{#file:...}} with the specified file name,
- And finally for the first <file> tag with the specified file name (no multiple blocks support)
And we free the memory used for the masked version {{#fileanchor: RawFile.php}}
    if (($anchor!='') && preg_match_all('/({{#fileanchor: *'.$anchor.' *}})|(<[^>]+ class *= *"([^"]*\w)?'.$anchor.'(\w[^"]*)?"[^>]*>)/i', $maskedtext, $matches, PREG_OFFSET_CAPTURE))
        $offsets=$matches[0];
    else if (preg_match_all('/{{#file: *'.$anchor.' *}}/i', $maskedtext, $matches, PREG_OFFSET_CAPTURE))
        $offsets=array($matches[0][0]);
    else if (preg_match_all('/<file( [^>]*)? name *= *"'.$filename.'"[^>]*>/i', $maskedtext, $matches, PREG_OFFSET_CAPTURE))
        $offsets=array($matches[0][0]);
    else {
        // We didn't find our anchor
        return fnRawFile_Strip_Error("ERROR - RawFile: anchor not found (anchor=$anchor, name=$filename, tag=$tag)","",$text);
    }
    unset($maskedtext);
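The patterns above insert the anchor name into the regular expression as is, so a name like myscript.sh contains a dot that matches any character. A possible refinement, shown only as a sketch, is to escape the name with preg_quote(). The helper rawfile_example_anchor_pattern is hypothetical and covers only the {{#fileanchor: ...}} form.

```php
<?php
// Hypothetical helper: builds the {{#fileanchor: ...}} pattern with the
// anchor name escaped, so that '.' in "myscript.sh" only matches a dot.
function rawfile_example_anchor_pattern( $anchor ) {
    return '/{{#fileanchor: *' . preg_quote( $anchor, '/' ) . ' *}}/i';
}

$text = "{{#fileanchor: myscriptXsh}} ... {{#fileanchor: myscript.sh}}";
preg_match_all( rawfile_example_anchor_pattern( 'myscript.sh' ),
                $text, $matches, PREG_OFFSET_CAPTURE );

// $matches[0] holds array(matched text, offset) pairs, as in the code above;
// only the second anchor is reported here
print_r( $matches[0] );
```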
$text is both input & output so we copy it and start with an empty output. {{#fileanchor: RawFile.php}}
    $textorig=$text;
    $text='';
For each anchor found we've to isolate the content of the next block. {{#fileanchor: RawFile.php}}
    foreach ($offsets as $offset) {
We start from the position of the current anchor. If the tag name of the block attached to the anchor is not specified, we look for the first block that follows the anchor, excluding <br> and <file> blocks. The search can be easily done with a regular expression, using the negative lookahead assertion (?!br\b|file\b) to exclude the tags to ignore. Note that we need to ignore the anchor-link block <file> since the anchor starts right before that tag, and so the regular expression would match the anchor-link block if that tag were not specifically excluded.
{{#fileanchor: RawFile.php}}
        $out = substr($textorig, $offset[1]);
        // If no tag specified, we take the first one
        if ($tag == '')
        {
            // With a regex assertion, we can easily ignore 'br' and 'file' tags
            if (!preg_match('/<((?!br\b|file\b)\w+\b)/', $out, $matches))
                return fnRawFile_Strip_Error ("ERROR - RawFile: Can't find opening tag after anchor '$offset[0]' (anchor=$anchor, name=$filename, tag=$tag)",$out,$text);
            $tag=$matches[1];
        }
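The effect of the negative lookahead can be tried in isolation. In this sketch the function rawfile_example_first_tag and the sample strings are invented for the illustration; the regular expression is the one used above.

```php
<?php
// Hypothetical stand-alone check of the lookahead used above:
// returns the name of the first tag that is neither <br> nor <file>.
function rawfile_example_first_tag( $wikitext ) {
    if ( !preg_match( '/<((?!br\b|file\b)\w+\b)/', $wikitext, $m ) )
        return null;
    return $m[1];
}

// The <file> anchor-link and the <br> are skipped, <pre> is selected
echo rawfile_example_first_tag( '<file name="a.sh">a.sh</file><br><pre>echo hi</pre>' ), "\n";
```

Thanks to the \b word boundary inside the assertion, only the exact tag names br and file are excluded; a tag whose name merely starts with br is still accepted.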
Now we know the tag name of the block to download, either because it was already specified as a GET parameter in the URL, or because we've found it in the search above. Again using a regular expression, we look for the first block matching the specified tag name that follows the current anchor, and extract the content of the block. Note the use of the regex option /.../s to tell the regex engine that the matched text can span multiple lines (with that option, . matches any character, including a newline). Also, we skip the first newline after the opening tag, if any (with \n?).
{{#fileanchor: RawFile.php}}
        // Find the first tag matching $tag, and return enclosed text
        if (!preg_match('/<'.$tag.'( [^>]*)?>\n?(.*?)<\/'.$tag.'>/s', $out, $matches))
            return fnRawFile_Strip_Error ("ERROR - RawFile: no closing '$tag' found after anchor '$offset[0]' (anchor=$anchor, name=$filename, tag=$tag)",$out,$text);
        $text .= $matches[2];
    }
No need to deal with a Content-Length header because Mediawiki will do it for us, moreover more properly than we could if the output is sent gzipped, which is the default.
So that's it, $text contains our file!
{{#fileanchor: RawFile.php}}
    return true;
}
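To see the whole stripping logic at work outside MediaWiki, here is a condensed, self-contained sketch. It is not the extension's code: the function rawfile_example_extract is hypothetical, it handles only the {{#fileanchor: ...}} syntax, it skips the nowiki masking step, and it looks up the tag name for every anchor instead of reusing the first one found.

```php
<?php
// Hypothetical condensed version of the stripping logic, for one anchor name
function rawfile_example_extract( $wikitext, $anchor ) {
    $pattern = '/{{#fileanchor: *' . preg_quote( $anchor, '/' ) . ' *}}/i';
    if ( !preg_match_all( $pattern, $wikitext, $found, PREG_OFFSET_CAPTURE ) )
        return null; // anchor not found
    $file = '';
    foreach ( $found[0] as $match ) {
        // Text starting at the position of the current anchor
        $rest = substr( $wikitext, $match[1] );
        // First tag after the anchor, ignoring <br> and <file>
        if ( !preg_match( '/<((?!br\b|file\b)\w+\b)/', $rest, $m ) )
            return null;
        $tag = preg_quote( $m[1], '/' );
        // Content of that block, skipping one newline after the opening tag
        if ( !preg_match( '/<' . $tag . '( [^>]*)?>\n?(.*?)<\/' . $tag . '>/s', $rest, $m ) )
            return null;
        // Blocks attached to the same anchor are concatenated
        $file .= $m[2];
    }
    return $file;
}

$page = "Header:\n{{#fileanchor: hello.sh}}\n<source lang=bash>\n#!/bin/bash\n</source>\n"
      . "Body:\n{{#fileanchor: hello.sh}}\n<source lang=bash>\necho 'Hello world!'\n</source>\n";
echo rawfile_example_extract( $page, 'hello.sh' );
```

The two blocks attached to hello.sh come out as a single two-line script, which is the literate programming behaviour described in the introduction.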
Credits
There is an official way to register the extension in a Mediawiki installation, so that it will be visible on the Special:Version page.
Let's say the extension is in the category of parser hooks even if there is also a hook on Raw action.
{{#fileanchor: RawFile.php}}
$wgExtensionCredits['parserhook'][] = array('name' => 'RawFile',
    'version' => '0.5.1',
    'author' => 'Philippe Teuwen, Michael Peeters',
    'url' => 'http://www.mediawiki.org/wiki/Extension:RawFile',
    // 'url' => 'http://wiki.yobi.be/wiki/Mediawiki_RawFile',
    'description' => 'Downloads a RAW copy of <nowiki><tag>data</tag></nowiki> in a file<br>'.
        'Useful e.g. to download a script or a patch<br>'.
        'It also allows what is called [http://en.wikipedia.org/wiki/Literate_programming Literate Programming]');
}
?>
And finally, the extension was registered on the MediaWiki website according to the Extensions Manual.
So this extension now has its own page on the official MediaWiki site.
Installation
Download [{{#filelink: RawFile.php}} RawFile.php] and save it under the MediaWiki directory as extensions/RawFile/RawFile.php
Add at the end of LocalSettings.php:
require_once("$IP/extensions/RawFile/RawFile.php");
Status
If you use the extension properly the code is fully functional but it's rather raw on error handling.
ChangeLog
0.5.1
- Integrate patch from Jani Uusitalo to recursively parse tag attribute values in case they contain magic words such as wiki templates or parameters. This is useful when using <file>...</file> in wiki templates.
- Prepare fix for the "anchor not found" bug when using the short notation <file name="filename" [tag="tagname"]>link text</file> in wiki templates. This still doesn't work because of the way MediaWiki (v1.22.1) expands templates in raw output (see bugzilla 61341).
  - Work-around available. See bugzilla bug report, or example section above.
0.5
- Fix since PHP 5.5.0: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead. Thanks to Stephen Kent for reporting.
- WARNING you should upgrade ASAP as the previous versions 0.4 and 0.4.1 are vulnerable to remote PHP code injection!!! See Talk:Mediawiki_RawFile for more info.
- Fix for MediaWiki 1.22.1.
0.4.1
- Fix octet-stream MIME type bug which was affecting Epiphany & Opera 11. Thanks to Jani Uusitalo for reporting & Sicvolo for finding the solution
0.4
- Anchors can be specified using the html class attribute
- New syntax for Links and Anchor-links:
  <file [name="..."] [anchor="..."] [tag="..."] [title="..."] >Link text</file>
- Support multiple files on the same page with same name (differentiated by their anchor name) or even common blocks in multiple files.
- Can specify the tag name of the block to download (to skip some irrelevant blocks when using an anchor-link).
- Ignore <br> tag.
- Some error reporting.
0.3
- Added optional parameter to #fileLink to indicate that the file is on another local wiki page
0.2
- Fix problem with Content-Length mismatch when transport is gzipped (default for Mediawiki if client supports it)
0.1
- Initial version
Known bugs
Questions and feedback
If you've any trouble, questions or suggestions, you can contact me.
Known sites using the extension
- University of Florida Research Computing Wiki
- Scribus wiki
- NFC-Tools
- smkent.net, website of Stephen Kent
- mummila.net, website of Jani Uusitalo
- richud.com, website of Richard Moore
See also wikiapiary to follow usage of extensions RawFile and Fram