Author: Yihui Xie
Translator: 郑宝童
Date: 2021.05.21


3.1 HTML

The main difference between rendering a book (using bookdown) and rendering a single R Markdown document (using rmarkdown) to HTML is that a book will generate multiple HTML pages by default, normally one HTML file per chapter. This makes it easier to bookmark a certain chapter or share its URL with others as you read the book, and faster to load a book into the web browser. Currently we have provided a number of different styles for HTML output: the GitBook style, the Bootstrap style, and the Tufte style.

3.1.1 GitBook style

The GitBook style was borrowed from GitBook, a project launched by Friendcode, Inc. (https://www.gitbook.com) and dedicated to helping authors write books with Markdown. It provides a beautiful style, with a layout consisting of a sidebar showing the table of contents on the left, and the main body of a book on the right. The design is responsive to the window size, e.g., the navigation buttons are displayed on the left/right of the book body when the window is wide enough, and collapsed into the bottom when the window is narrow to give readers more horizontal space to read the book body.

We have made several improvements over the original GitBook project. The most significant one is that we replaced the Markdown engine with R Markdown v2 based on Pandoc, so that there are a lot more features for you to use when writing a book:

  • You can embed R code chunks and inline R expressions in Markdown, and this makes it easy to create reproducible documents and frees you from synchronizing your computation with its actual output (knitr will take care of it automatically).
  • The Markdown syntax is much richer: you can write anything that Pandoc’s Markdown supports, such as LaTeX math expressions and citations.
  • You can embed interactive content in the book (for HTML output only), such as HTML widgets and Shiny apps.

We have also added some useful features in the user interface that we will introduce in detail soon. The output format function for the GitBook style in bookdown is gitbook(). Here are its arguments:

    gitbook(fig_caption = TRUE, number_sections = TRUE,
      self_contained = FALSE, anchor_sections = TRUE,
      lib_dir = "libs", pandoc_args = NULL, ..., template = "default",
      split_by = c("chapter", "chapter+number", "section",
        "section+number", "rmd", "none"), split_bib = TRUE,
      config = list(), table_css = TRUE)

Most arguments are passed to rmarkdown::html_document(), including fig_caption, lib_dir, and .... You can check out the help page of rmarkdown::html_document() for the full list of possible options. We strongly recommend that you use fig_caption = TRUE for two reasons: 1) it is important to explain your figures with captions; 2) enabling figure captions means figures will be placed in floating environments when the output is LaTeX, otherwise you may end up with a lot of white space on certain pages. The format of figure/table numbers depends on whether sections are numbered: if number_sections = TRUE, these numbers will be of the format X.i, where X is the chapter number and i is an incremental number; if sections are not numbered, all figures/tables will be numbered sequentially through the book from 1, 2, …, N. Note that in either case, figures and tables will be numbered independently.
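
To make the two numbering schemes concrete, here is a minimal sketch in base R. The helper figure_labels() is hypothetical and is not part of bookdown; it only mimics the rule described above for a vector that records which chapter each figure appears in.

```r
# Hypothetical helper (not part of bookdown): label figures given the
# chapter number that each figure appears in.
figure_labels <- function(chapter_of_figure, number_sections = TRUE) {
  if (!number_sections) {
    # unnumbered sections: one running counter across the whole book
    return(as.character(seq_along(chapter_of_figure)))
  }
  # numbered sections: restart the counter i within each chapter X
  i <- ave(seq_along(chapter_of_figure), chapter_of_figure, FUN = seq_along)
  paste(chapter_of_figure, i, sep = ".")
}

figs <- c(1, 1, 2, 2, 2)  # five figures: two in chapter 1, three in chapter 2
figure_labels(figs)                           # "1.1" "1.2" "2.1" "2.2" "2.3"
figure_labels(figs, number_sections = FALSE)  # "1" "2" "3" "4" "5"
```

Tables would get their own counter under the same rule, since figures and tables are numbered independently.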

Among all possible arguments in ..., you are most likely to use the css argument to provide one or more custom CSS files to tweak the default CSS style. There are a few arguments of html_document() that have been hard-coded in gitbook() and you cannot change them: toc = TRUE (there must be a table of contents), theme = NULL (not using any Bootstrap themes), and template (there exists an internal GitBook template).

Please note that if you change self_contained = TRUE to make self-contained HTML pages, the total size of all HTML files can be significantly increased since there are many JS and CSS files that have to be embedded in every single HTML file.

Besides these html_document() options, gitbook() has three other arguments: split_by, split_bib, and config. The split_by argument specifies how you want to split the HTML output into multiple pages, and its possible values are:

  • rmd: use the base filenames of the input Rmd files to create the HTML filenames, e.g., generate chapter3.html for chapter3.Rmd.
  • none: do not split the HTML file (the book will be a single HTML file).
  • chapter: split the file by the first-level headers.
  • section: split the file by the second-level headers.
  • chapter+number and section+number: similar to chapter and section, but the files will be numbered.

For chapter and section, the HTML filenames will be determined by the header identifiers, e.g., the filename for the first chapter with a chapter title # Introduction will be introduction.html by default. For chapter+number and section+number, the chapter/section numbers will be prepended to the HTML filenames, e.g., 1-introduction.html and 2-1-literature.html. The header identifier is automatically generated from the header text by default [9], and you can manually specify an identifier using the syntax {#your-custom-id} after the header text, e.g.,

    # An Introduction {#introduction}

    The default identifier is `an-introduction` but we changed
    it to `introduction`.
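
The mapping from header to filename can be sketched in a few lines of base R. Both functions below are hypothetical and only approximate the behavior described above: header_id() is a rough imitation of Pandoc's automatic identifiers (it ignores non-ASCII letters and duplicate identifiers, which Pandoc handles), and html_filename() is not the function bookdown uses internally.

```r
# Rough approximation of Pandoc's automatic header identifiers.
header_id <- function(text) {
  id <- tolower(text)
  id <- gsub("[^a-z0-9_. -]", "", id)   # drop punctuation except _ . -
  id <- gsub("\\s+", "-", trimws(id))   # spaces become hyphens
  id <- sub("^[^a-z]+", "", id)         # drop everything before the first letter
  if (nzchar(id)) id else "section"
}

# Hypothetical helper: derive the HTML filename for a Markdown header line.
html_filename <- function(header, number = NULL) {
  # an explicit {#id} after the header text overrides the generated one
  m <- regmatches(header, regexec("\\{#([^}]+)\\}\\s*$", header))[[1]]
  id <- if (length(m) == 2) m[2] else header_id(sub("^#+\\s*", "", header))
  # for chapter+number / section+number, the number is prepended
  paste0(if (!is.null(number)) paste0(number, "-"), id, ".html")
}

html_filename("# An Introduction")                  # "an-introduction.html"
html_filename("# An Introduction {#introduction}")  # "introduction.html"
html_filename("## Literature", number = "2-1")      # "2-1-literature.html"
```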

By default, the bibliography is split and relevant citation items are put at the bottom of each page, so that readers do not have to navigate to a different bibliography page to see the details of citations. This feature can be disabled using split_bib = FALSE, in which case all citations are put on a separate page.
There are several sub-options in the config option for you to tweak some details in the user interface. Recall that all output format options (not only for bookdown::gitbook) can be either passed to the format function if you use the command-line interface bookdown::render_book(), or written in the YAML metadata. We display the default sub-options of config in the gitbook format as YAML metadata below (note that they are indented under the config option):

    bookdown::gitbook:
      config:
        toc:
          collapse: subsection
          scroll_highlight: yes
          before: null
          after: null
        toolbar:
          position: fixed
        edit: null
        download: null
        search: yes
        fontsettings:
          theme: white
          family: sans
          size: 2
        sharing:
          facebook: yes
          github: no
          twitter: yes
          linkedin: no
          weibo: no
          instapaper: no
          vk: no
          whatsapp: no
          all: ['facebook', 'twitter', 'linkedin', 'weibo', 'instapaper']
        info: yes
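
If you prefer the command-line interface, the same sub-options can be written as a nested R list and passed to the format function. The sketch below is only an illustration: the particular values are arbitrary, and it assumes that bookdown is installed and that the book's first file is index.Rmd in the working directory (the call is skipped otherwise).

```r
# Sub-options of config as a nested list; the names mirror the YAML fields.
gitbook_config <- list(
  toc = list(collapse = "section", scroll_highlight = TRUE),
  toolbar = list(position = "static"),
  search = FALSE,
  sharing = list(facebook = FALSE, twitter = TRUE)
)

# Render only when bookdown and index.Rmd are available (assumptions
# about your setup, not requirements stated in this section).
if (requireNamespace("bookdown", quietly = TRUE) && file.exists("index.Rmd")) {
  bookdown::render_book(
    "index.Rmd",
    bookdown::gitbook(split_by = "section", config = gitbook_config)
  )
}
```

Sub-options that you leave out of the list should fall back to the defaults shown above.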

The toc option controls the behavior of the table of contents (TOC). You can collapse some items initially when a page is loaded via the collapse option. Its possible values are subsection, section, none (or null). This option can be helpful if your TOC is very long and has more than three levels of headings: subsection means collapsing all TOC items for subsections (X.X.X), section means those items for sections (X.X) so only the top-level headings are displayed initially, and none means not collapsing any items in the TOC. For those collapsed TOC items, you can toggle their visibility by clicking their parent TOC items. For example, you can click a chapter title in the TOC to show/hide its sections.
The scroll_highlight option in toc indicates whether to enable highlighting of TOC items as you scroll the book body (by default this feature is enabled). Whenever a new header comes into the current viewport as you scroll down/up, the corresponding item in TOC on the left will be highlighted.
Since the sidebar has a fixed width, when an item in the TOC is truncated because the heading text is too wide, you can hover the cursor over it to see a tooltip showing the full text.
You may add more items before and after the TOC using the HTML tag <li>. These items will be separated from the TOC using a horizontal divider. You can use the pipe character | so that you do not need to escape any characters in these items following the YAML syntax, e.g.,

    toc:
      before: |
        <li><a href="...">My Awesome Book</a></li>
        <li><a href="...">John Smith</a></li>
      after: |
        <li><a href="https://github.com/rstudio/bookdown">
        Proudly published with bookdown</a></li>

As you navigate through different HTML pages, we will try to preserve the scroll position of the TOC. Normally you will see the scrollbar in the TOC at a fixed position even if you navigate to the next page. However, if the TOC item for the current chapter/section is not visible when the page is loaded, we will automatically scroll the TOC to make it visible to you.
FIGURE 3.1: The GitBook toolbar.
The GitBook style has a toolbar (Figure 3.1) at the top of each page that allows you to dynamically change the book settings. The toolbar option has a sub-option position, which can take values fixed or static. By default, the toolbar is fixed at the top of the page, so it remains visible even if you scroll down the page. If it is static, the toolbar stays at the top of the document and does not follow you as you scroll, i.e., once you scroll away, you will no longer see it.
The first button on the toolbar can toggle the visibility of the sidebar. You can also hit the S key on your keyboard to do the same thing. The GitBook style can remember the visibility status of the sidebar, e.g., if you closed the sidebar, it will remain closed the next time you open the book. In fact, the GitBook style remembers many other settings as well, such as the search keyword and the font settings.
The second button on the toolbar is the search button. Its keyboard shortcut is F (Find). When the button is clicked, you will see a search box at the top of the sidebar. As you type in the box, the TOC will be filtered to display the sections that match the search keyword. Now you can use the arrow keys Up/Down to highlight the previous/next match in the search results. When you click the search button again (or hit F outside the search box), the search keyword will be emptied and the search box will be hidden. To disable searching, set the option search: no in config.
The third button is for font/theme settings. The reader can change the font size (bigger or smaller), the font family (serif or sans serif), and the theme (White, Sepia, or Night). You can set the initial value of these settings via the fontsettings option. Font size is measured on a scale of 0-4; the initial value can be set to 1, 2 (default), 3, or 4. The button can be removed from the toolbar by setting fontsettings: null (or no).

    # changing the default
    fontsettings:
      theme: night
      family: serif
      size: 3

The edit option is the same as the option mentioned in Section 4.4. If it is not empty, an edit button will be added to the toolbar. This was designed for potential contributors to the book to contribute by editing the book on GitHub after clicking the button and sending pull requests. The history and view options work the same way.
If your book has other output formats for readers to download, you may provide the download option so that a download button can be added to the toolbar. This option takes either a character vector, or a list of character vectors with the length of each vector being 2. When it is a character vector, it should be either a vector of filenames, or filename extensions, e.g., both of the following settings are okay:

    download: ["book.pdf", "book.epub"]
    download: ["pdf", "epub", "mobi"]

When you only provide the filename extensions, the filename is derived from the book filename of the configuration file _bookdown.yml (Section 4.4). When download is null, gitbook() will look for PDF, EPUB, and MOBI files in the book output directory, and automatically add them to the download option. If you just want to suppress the download button, use download: no. All files for readers to download will be displayed in a drop-down menu, and the filename extensions are used as the menu text. When the only available format for readers to download is PDF, the download button will be a single PDF button instead of a drop-down menu.
An alternative form for the value of the download option is a list of length-2 vectors, e.g.,

    download: [["book.pdf", "PDF"], ["book.epub", "EPUB"]]

You can also write it as:

    download:
      - ["book.pdf", "PDF"]
      - ["book.epub", "EPUB"]

Each vector in the list consists of the filename and the text to be displayed in the menu. Compared to the first form, this form allows you to customize the menu text, e.g., you may have two different copies of the PDF for readers to download and you will need to make the menu items different.
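
To see how the two forms of the download option relate, the following base R sketch reduces either form to a list of (filename, menu text) pairs. The function normalize_download() and its book_filename argument are hypothetical and are not bookdown's internal code; the sketch only follows the rules described above.

```r
# Hypothetical helper: turn either form of the `download` option into a
# list of c(filename, menu_text) pairs.
normalize_download <- function(download, book_filename = "book") {
  lapply(download, function(x) {
    if (length(x) == 2) return(x)   # already c(filename, menu text)
    ext <- tools::file_ext(x)
    if (ext == "") {                # a bare extension such as "pdf"
      ext <- x
      x <- paste0(book_filename, ".", ext)
    }
    c(x, ext)                       # the extension doubles as the menu text
  })
}

normalize_download(c("pdf", "epub"))
normalize_download(list(c("print.pdf", "PDF (print)"), c("screen.pdf", "PDF (screen)")))
```

The second call shows why the list form is useful: two PDF files would otherwise both appear as "pdf" in the menu.
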
On the right of the toolbar, there are some buttons to share the link on social network websites such as Twitter, Facebook, and LinkedIn. You can use the sharing option to decide which buttons to enable. If you want to get rid of these buttons entirely, use sharing: null (or no).
Another button shown on the toolbar is the information (‘i’) button that lists keyboard shortcuts available to navigate the document. This button can be hidden by setting info: no.
Finally, there are a few more top-level options in the YAML metadata that can be passed to the GitBook HTML template via Pandoc. They may not have clear visible effects on the HTML output, but they may be useful when you deploy the HTML output as a website. These options include:

  • description: A character string to be written to the content attribute of the tag <meta name="description" content=""> in the HTML head (if missing, the title of the book will be used). This can be useful for search engine optimization (SEO). Note that it should be plain text without any Markdown formatting such as _italic_ or **bold**.
  • url: The URL of the book’s website, e.g., https\://bookdown.org/yihui/bookdown/ [10]
  • github-repo: The GitHub repository of the book of the form user/repo.
  • cover-image: The path to the cover image of the book.
  • apple-touch-icon: A path to an icon (e.g., a PNG image). This is for iOS only: when the website is added to the Home screen, the link is represented by this icon.
  • apple-touch-icon-size: The size of the icon (by default, 152 x 152 pixels).
  • favicon: A path to the “favorite icon”. Typically this icon is displayed in the browser’s address bar, or in front of the page title on the tab if the browser supports tabs.

Below we show some sample YAML metadata (again, please note that these are top-level options):

    ---
    title: "An Awesome Book"
    author: "John Smith"
    description: "This book introduces the ABC theory, and ..."
    url: 'https\://bookdown.org/john/awesome/'
    github-repo: "john/awesome"
    cover-image: "images/cover.png"
    apple-touch-icon: "touch-icon.png"
    apple-touch-icon-size: 120
    favicon: "favicon.ico"
    ---

A nice effect of setting description and cover-image is that when you share the link of your book on some social network websites such as Twitter, the link can be automatically expanded to a card with the cover image and description of the book.

3.1.2 Bootstrap style

If you have used R Markdown before, you should be familiar with the Bootstrap style (http://getbootstrap.com), which is the default style of the HTML output of R Markdown. The output format function in rmarkdown is html_document(), and we have a corresponding format html_book() in bookdown using html_document() as the base format. In fact, there is a more general format html_chapters() in bookdown and html_book() is just its special case:

    html_chapters(toc = TRUE, number_sections = TRUE, fig_caption = TRUE,
      lib_dir = "libs", template = bookdown_file("templates/default.html"),
      pandoc_args = NULL, ..., base_format = rmarkdown::html_document,
      split_bib = TRUE, page_builder = build_chapter, split_by = c("section+number",
        "section", "chapter+number", "chapter", "rmd", "none"))

Note that it has a base_format argument that takes a base output format function, and html_book() is basically html_chapters(base_format = rmarkdown::html_document). All arguments of html_book() are passed to html_chapters():

    html_book(...)

That means that you can use most arguments of rmarkdown::html_document, such as toc (whether to show the table of contents), number_sections (whether to number section headings), and so on. Again, check the help page of rmarkdown::html_document to see the full list of possible options. Note that the argument self_contained is hard-coded to FALSE internally, so you cannot change the value of this argument. We have explained the argument split_by in the previous section.
The arguments template and page_builder are for advanced users, and you do not need to understand them unless you have a strong need to customize the HTML output and the many options provided by rmarkdown::html_document() still do not give you what you want.
If you want to pass a different HTML template to the template argument, the template must contain three pairs of HTML comments, and each comment must be on a separate line:

  • <!--bookdown:title:start--> and <!--bookdown:title:end--> to mark the title section of the book. This section will be placed only on the first page of the rendered book;
  • <!--bookdown:toc:start--> and <!--bookdown:toc:end--> to mark the table of contents section, which will be placed on all HTML pages;
  • <!--bookdown:body:start--> and <!--bookdown:body:end--> to mark the HTML body of the book, and the HTML body will be split into multiple separate pages. Recall that we merge all R Markdown or Markdown files, render them into a single HTML file, and split it.

You may open the default HTML template to see where these comments were inserted:

    bookdown:::bookdown_file("templates/default.html")
    # you may use file.edit() to open this file
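
The role of the three comment pairs can be illustrated with a small base R sketch. The helper extract_fragment() and the toy page below are hypothetical; bookdown's own splitting code is more involved, but the idea of cutting the rendered HTML at these tokens is the same.

```r
# Hypothetical helper: return the lines strictly between a pair of
# bookdown comment tokens (each token must sit on its own line).
extract_fragment <- function(html, name) {
  start <- which(html == sprintf("<!--bookdown:%s:start-->", name))
  end   <- which(html == sprintf("<!--bookdown:%s:end-->", name))
  stopifnot(length(start) == 1, length(end) == 1, start < end)
  html[seq_len(end - start - 1) + start]
}

# A toy rendered page, one element per line.
page <- c(
  "<html><body>",
  "<!--bookdown:title:start-->", "<h1>A Nice Book</h1>", "<!--bookdown:title:end-->",
  "<!--bookdown:toc:start-->", "<ul><li>Intro</li></ul>", "<!--bookdown:toc:end-->",
  "<!--bookdown:body:start-->", "<p>Chapter text</p>", "<!--bookdown:body:end-->",
  "</body></html>"
)

extract_fragment(page, "toc")   # "<ul><li>Intro</li></ul>"
extract_fragment(page, "body")  # "<p>Chapter text</p>"
```

This is also why each comment must be on a separate line: the sketch (like any line-based approach) matches whole lines against the tokens.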

Once you know how bookdown works internally to generate multiple-page HTML output, it will be easier to understand the argument page_builder, which is a function to compose each individual HTML page using the HTML fragments extracted from the above comment tokens. The default value of page_builder is a function build_chapter in bookdown, and its source code is relatively simple (ignore those internal functions like button_link()):

    build_chapter = function(
      head, toc, chapter, link_prev, link_next, rmd_cur, html_cur, foot
    ) {
      # add a has-sub class to the <li> items that have sub lists
      toc = gsub('^(<li>)(.+<ul>)$', '<li class="has-sub">\\2', toc)
      paste(c(
        head,
        '<div class="row">',
        '<div class="col-sm-12">',
        toc,
        '</div>',
        '</div>',
        '<div class="row">',
        '<div class="col-sm-12">',
        chapter,
        '<p style="text-align: center;">',
        button_link(link_prev, 'Previous'),
        source_link(rmd_cur, type = 'edit'),
        source_link(rmd_cur, type = 'history'),
        source_link(rmd_cur, type = 'view'),
        button_link(link_next, 'Next'),
        '</p>',
        '</div>',
        '</div>',
        foot
      ), collapse = '\n')
    }

Basically, this function takes a number of components like the HTML head, the table of contents, the chapter body, and so on, and it is expected to return a character string which is the HTML source of a complete HTML page. You may manipulate all components in this function using text-processing functions like gsub() and paste().
What the default page builder does is to put TOC in the first row, the body in the second row, navigation buttons at the bottom of the body, and concatenate them with the HTML head and foot. Here is a sketch of the HTML source code that may help you understand the output of build_chapter():

    <html>
      <head>
        <title>A Nice Book</title>
      </head>
      <body>
        <div class="row">TOC</div>
        <div class="row">
          CHAPTER BODY
          <p>
            <button>PREVIOUS</button>
            <button>NEXT</button>
          </p>
        </div>
      </body>
    </html>
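
If you want a different layout, you may write your own page builder with the same arguments. The function below is only a sketch under our own assumptions: the name sidebar_builder, the two-column layout (Bootstrap grid classes col-sm-3 and col-sm-9), and the decision to omit the navigation buttons are all illustrative choices, since button_link() and source_link() are internal to bookdown.

```r
# A sketch of a custom page builder: TOC in a narrow left column and the
# chapter body in a wide right column, both inside a single row.
sidebar_builder <- function(
  head, toc, chapter, link_prev, link_next, rmd_cur, html_cur, foot
) {
  paste(c(
    head,
    '<div class="row">',
    '<div class="col-sm-3">', toc, '</div>',
    '<div class="col-sm-9">', chapter, '</div>',
    '</div>',
    foot
  ), collapse = '\n')
}

# Try it on toy fragments; the result is one character string.
page <- sidebar_builder(
  head = "<html><body>", toc = "<ul><li>Intro</li></ul>",
  chapter = "<p>Chapter text</p>", link_prev = NULL, link_next = "next.html",
  rmd_cur = "index.Rmd", html_cur = "index.html", foot = "</body></html>"
)
cat(page)
```

Such a function would be supplied through the page_builder argument, e.g., html_book(page_builder = sidebar_builder).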

For all HTML pages, the main difference is the chapter body, and most of the rest of the elements are the same. The default output from html_book() will include the Bootstrap CSS and JavaScript files in the <head> tag.
The TOC is often used for navigation purposes. In the GitBook style, the TOC is displayed in the sidebar. For the Bootstrap style, we did not apply a special style to it, so it is shown as a plain unordered list (in the HTML tag <ul>). It is easy to turn this list into a navigation bar with some CSS techniques. We have provided a CSS file toc.css in this package that you can use, and you can find it here: https://github.com/rstudio/bookdown/blob/master/inst/examples/css/toc.css
You may copy this file to the root directory of your book, and apply it to the HTML output via the css option, e.g.,

    ---
    output:
      bookdown::html_book:
        toc: yes
        css: toc.css
    ---

There are many possible ways to turn <ul> lists into navigation menus if you do a little bit of searching on the web, and you can choose a menu style that you like. The toc.css we just mentioned is a style with white menu text on a black background, and it supports sub-menus (e.g., section titles are displayed as drop-down menus under chapter titles).
As a matter of fact, you can get rid of the Bootstrap style in html_document() if you set the theme option to null, and you are free to apply arbitrary styles to the HTML output using the css option (and possibly the includes option if you want to include arbitrary content in the HTML head/foot).

3.1.3 Tufte style

Like the Bootstrap style, the Tufte style is provided by an output format tufte_html_book(), which is also a special case of html_chapters() using tufte::tufte_html() as the base format. Please see the tufte package (Xie and Allaire 2020) if you are not familiar with the Tufte style. Basically, it is a layout with a main column on the left and a margin column on the right. The main body is in the main column, and the margin column is used to place footnotes, margin notes, references, and margin figures, and so on.
All arguments of tufte_html_book() have exactly the same meanings as html_book(), e.g., you can also customize the CSS via the css option. There are a few elements that are specific to the Tufte style, though, such as margin notes, margin figures, and full-width figures. These elements require special syntax to generate; please see the documentation of the tufte package. Note that you do not need to do anything special to footnotes and references (just use the normal Markdown syntax ^[footnote] and [@citation]), since they will be automatically put in the margin. A brief YAML example of the tufte_html_book format:

    ---
    output:
      bookdown::tufte_html_book:
        toc: yes
        css: toc.css
    ---

References

Xie, Yihui, and JJ Allaire. 2020. Tufte: Tufte’s Styles for R Markdown Documents. https://github.com/rstudio/tufte.


  [9] To see more details on how an identifier is automatically generated, see the auto_identifiers extension in Pandoc’s documentation at http://pandoc.org/MANUAL.html#header-identifiers
  [10] The backslash before : is due to a technical issue: we want to prevent Pandoc from translating the link to HTML code <a href="..."></a>. More details at https://github.com/jgm/pandoc/issues/2139.