wyz.code.rdoc Tutorial

Fabien GELINEAU

Last update 2019.08.27

offensive programming - R documentation

Package wyz.code.rdoc aims to ease manual page creation in a very flexible way. It frees you from learning the R documentation language and its arcana, while providing good and reliable results quickly and reproducibly.

1 Understanding manual page generation

Manual pages describe either processing functions or data. Both are important, and each comes with its own set of specifications.

From a practical point of view, there are several ways to produce manual pages. Two are currently the most common: the first relies on handcrafted manual pages, the second on generated manual pages.

1.1 Handcrafted manual page

The standard R documentation tools belong to the first approach. The overall manual page generation process looks like this.

[Figure: handcrafted manual page generation process]

In this two-step process, you first generate a manual page template once and only once, and then fill in the blanks with the desired content.

In practice, you end up repeating the fill-in phase a variable number of times for each manual page. Moreover, it requires you to learn the arcana of R documentation, which is quite complex because of syntax issues, character escaping, and other requirements that are not so simple to fulfil.
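The two steps can be tried with base R alone. The sketch below uses utils::prompt() to create the template; the function add_numbers and the temporary file location are illustrative assumptions, not part of the original tutorial.

```r
# A function to document (hypothetical example).
add_numbers <- function(x, y) {
  x + y
}

# Step 1: generate the manual page template once, in a temporary folder.
target <- file.path(tempdir(), "add_numbers.Rd")
utils::prompt(add_numbers, filename = target, name = "add_numbers")

# Step 2 is manual: open the file and replace each placeholder by hand.
# Here we only look at the first lines of the generated template.
rd_lines <- readLines(target)
head(rd_lines, 10)
```

The template already contains the \name, \alias and \usage entries; every other section holds placeholder text that must be rewritten in Rd syntax.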

1.2 roxygen2 generated manual page

Package roxygen2 follows the second approach. Its manual page generation process looks like this.

[Figure: roxygen2 manual page generation process]

Theoretically, this is a two-step process: you first fill in the code comments according to the roxygen2 specification, and then generate the related manual pages on demand. This is a much more industrial approach.

In practice, complying with the code comment specification is not so easy and still requires a deep understanding of the R documentation scheme. Manual page generation, although robust and fast, may sometimes be cumbersome.
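As an illustration, here is a minimal sketch of roxygen2 code comments. The function add_numbers is a hypothetical example; the tags shown (@param, @return, @examples, @export) are standard roxygen2 tags.

```r
#' Add two numbers
#'
#' @param x A numeric vector.
#' @param y A numeric vector, recycled against \code{x}.
#' @return The element-wise sum of \code{x} and \code{y}.
#' @examples
#' add_numbers(1, 2)
#' @export
add_numbers <- function(x, y) {
  x + y
}

# Inside a package source tree, the manual pages are then regenerated
# on demand, typically with roxygen2::roxygenise().
```

Note that the comments still embed Rd markup such as \code{}, which is why some knowledge of the R documentation format remains necessary.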

1.3 wyz.code.rdoc generated manual page

Package wyz.code.rdoc also follows the second approach presented above. Its manual page generation process looks like this.

[Figure: wyz.code.rdoc manual page generation process]

Theoretically, this is a three-step process:

  1. create your manual page customization
  2. generate the related manual page
  3. edit the resulting manual page

In practice, this is often a two-step process, as editing the resulting manual page is an optional step, only required when a modification is more costly to achieve by code than by hand.
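The idea behind these three steps can be sketched in base R. This is not the wyz.code.rdoc API: build_rd is a hypothetical helper written for this illustration, and the customization layout (title, description, arguments, value) is an assumption.

```r
# Hypothetical helper, NOT part of wyz.code.rdoc: turns a named list of
# section contents into Rd text for the given function.
build_rd <- function(fun_name, fun, sections) {
  usage <- sprintf("%s(%s)", fun_name,
                   paste(names(formals(fun)), collapse = ", "))
  args <- sprintf("  \\item{%s}{%s}",
                  names(sections$arguments), unlist(sections$arguments))
  c(
    sprintf("\\name{%s}", fun_name),
    sprintf("\\alias{%s}", fun_name),
    sprintf("\\title{%s}", sections$title),
    sprintf("\\description{%s}", sections$description),
    sprintf("\\usage{%s}", usage),
    "\\arguments{", args, "}",
    sprintf("\\value{%s}", sections$value)
  )
}

add_numbers <- function(x, y) x + y

# Step 1: the customization is plain R data, so it can be stored and reused.
customization <- list(
  title = "Add two numbers",
  description = "Computes the element-wise sum of two numeric vectors.",
  arguments = list(x = "A numeric vector.", y = "A numeric vector."),
  value = "A numeric vector."
)

# Step 2: generate the related manual page.
target <- file.path(tempdir(), "add_numbers.Rd")
writeLines(build_rd("add_numbers", add_numbers, customization), target)

# Step 3 (optional): edit the file by hand. Here we only check that it parses.
rd <- tools::parse_Rd(target)
```

Because the customization is ordinary R data, replaying the generation after a change is a single call, and parts of it can be shared between manual pages.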

1.4 Comparison of the three approaches

approach: handcrafted manual page
  pros
    ▶ easy to understand
    ▶ straightforward process
  cons
    ▶ hyper repetitive task
    ▶ difficult to master
    ▶ time consuming activity
    ▶ great variability of the result from person to person

approach: roxygen2 generated manual page
  pros
    ▶ fast
    ▶ robust
    ▶ high quality of result
    ▶ Hadley powered
  cons
    ▶ code commenting
    ▶ sometimes tricky
    ▶ repetitive task
    ▶ time consuming activity

approach: wyz.code.rdoc generated manual page
  pros
    ▶ pure code, only code
    ▶ highly customizable result
    ▶ time saving activity
    ▶ highly reproducible results
    ▶ high reuse of customization as code
  cons
    ▶ requires some experimentation to feel at ease with
    ▶ still repetitive task, although less

2 Package wyz.code.rdoc approach

2.1 Why another manual page generation tool?

Mainly for the following reasons.

First, producing documentation consumes too much time. We should reduce the time spent on documentation generation while guaranteeing a high level of quality in the documentation produced.

Second, documentation is mandatory. So we need very powerful tools to alleviate the burden and to reduce variability of documentation quality.

Third, I do not believe that the standard R or roxygen2 ways are the right ones. They are certainly helpful but, in my opinion, clearly not enough. I wish I could generate documentation from code instead of writing it by hand. I should be able to do so through a high-level interface that does not require me to know much about the final R documentation format. This will allow me to produce better documentation, as I will only have to focus on the content and the style, not on the format.

Fourth, the level of industrialization of documentation generation provided by the two approaches presented above is insufficient to me. I wish to be able to reuse a part already generated for one manual page in another one. This is possible and quite easy to achieve with code, difficult otherwise. I also wish to be able to produce a complete manual page, whatever its format and content, in a fully reproducible and replayable way.

2.2 What can actually be generated?

Manual pages can be generated

  1. from a single R function
  2. from an R object instantiated from an R class
  3. for a package
  4. from a data set

Currently, version 1.1.8 of wyz.code.rdoc can generate manual pages for each of these cases. See the use cases to know more.

2.3 Code organization

Package wyz.code.rdoc provides core, low, medium and high level functions to generate manual pages. You can discover them using the following R sequence.

dt <- wyz.code.rdoc::opRdocInformation()

Core level tools deal mainly with deep package internals.

sort(dt[stratum == 'CORE' & nature == 'EXPORTED']$name)
[1] "GenerationContext" "InputContext"      "ManualPageBuilder"
[4] "ProcessingContext" "rdocKeywords"     

Low level tools deal with R documentation format.

sort(dt[stratum == 'LAYER_1' & nature == 'EXPORTED']$name)
 [1] "auditDocumentationFiles"   "escapeContent"            
 [3] "generateEnc"               "generateEnumeration"      
 [5] "generateMarkup"            "generateOptionLink"       
 [7] "generateOptionSexpr"       "generateParagraph"        
 [9] "generateParagraph2NL"      "generateParagraphCR"      
[11] "generateReference"         "generateS3MethodSignature"
[13] "generateSection"           "generateTable"            
[15] "getStandardSectionNames"   "produceDocumentationFile" 
[17] "sentensize"                "verifyDocumentationFile"  

Medium level tools deal essentially with presentation and beautifying.

sort(dt[stratum == 'LAYER_2' & nature == 'EXPORTED']$name)
[1] "beautify"

High level tools deal with end-user facilities to ease manual page generation, manage end-user customizations, and increase productivity.

sort(dt[stratum == 'LAYER_3' & nature == 'EXPORTED']$name)
 [1] "completeManualPage"              "computeDocumentationStatistics" 
 [3] "convertExamples"                 "dummy"                          
 [5] "family"                          "identifyReplacementVariables"   
 [7] "interpretResults"                "opRdocInforamtion"              
 [9] "produceAllManualPagesFromObject" "produceManualPage"              
[11] "producePackageLink"
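To get an overview of all strata at once, the same table can be summarized in one pass. The sketch below works on a small stand-in data.frame holding a few rows copied from the listings above; it assumes only the column names name, stratum and nature used in the queries above.

```r
# Stand-in for the table returned by wyz.code.rdoc::opRdocInformation(),
# restricted to a few rows copied from the listings above.
dt <- data.frame(
  name = c("InputContext", "ManualPageBuilder", "generateTable",
           "escapeContent", "beautify", "produceManualPage"),
  stratum = c("CORE", "CORE", "LAYER_1", "LAYER_1", "LAYER_2", "LAYER_3"),
  nature = "EXPORTED",
  stringsAsFactors = FALSE
)

# Keep exported functions only, then count them per stratum.
exported <- dt[dt$nature == "EXPORTED", ]
counts <- table(exported$stratum)

# List the function names of each stratum, sorted alphabetically.
by_stratum <- lapply(split(exported$name, exported$stratum), sort)
by_stratum
```

With the package installed, replacing the stand-in by as.data.frame(wyz.code.rdoc::opRdocInformation()) should give the full picture, provided the returned table exposes these three columns.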