R help page as object

Edited with suggestion of Hadley

You can do this a bit easier by:

getHTMLhelp <- function(...){
    thefile <- help(...)
    capture.output(
      tools:::Rd2HTML(utils:::.getHelpFile(thefile))
    )
}

Using tools:::Rd2txt instead of tools:::Rd2HTML will give you plain text. Just getting the file (without any parsing) gives you the original Rd format, so you can write your custom parsing function to parse it into an object (see the solution of @Jeroen, which does a good job in extracting all info into a list).

This function takes exactly the same arguments as help() and returns a vector with every element being a line in the file, eg:

> head(HelpAnova)
[1] "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">"      
[2] "<html><head><title>R: Anova Tables</title>"                             
[3] "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">"
[4] "<link rel=\"stylesheet\" type=\"text/css\" href=\"R.css\">"             
[5] "</head><body>"                                                          
[6] ""           

Or :

> HelpGam <- getHTMLhelp(gamm,package=mgcv)
> head(HelpGam)
[1] "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">"      
[2] "<html><head><title>R: Generalized Additive Mixed Models</title>"        
[3] "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">"
[4] "<link rel=\"stylesheet\" type=\"text/css\" href=\"R.css\">"             
[5] "</head><body>"                                                          
[6] ""           

Leave a Comment