---
aastexopts: [singlecolumn]
preambleinput: ["macros.tex"]
texpackages:
- name: xspace
#  opts:
- name: amsmath
#  opts:
bibstyle: aasjournal
bibliography: demo.bib
received: "\\today"
#revised: "January 7, 2018"
#accepted: "\\today"
#submitjournal: ApJ
title: "Preparation of Articles using Markdown and Pandoc: General Description and Templates"
shorttitle: "Articles in Markdown and Pandoc"
shortauthors: Privon
author:
- name: George C. Privon
  ORCID: 0000-0003-3474-1125
  affiliation: ["Department of Astronomy, University of Florida, 211 Bryant Space Sciences Center, Gainesville, 32611 FL, USA"]
#  nocollaboration: 1
#- name: Josiah Carberry
#  ORCID: 0000-0002-1825-0097
#  affiliation: ["Department of Psychoceramics, Wesleyan University, Middletown, CT", "Department of Psychoceramics, Brown University: Providence, RI"]
#  collaboration: "(ORCID Demonstration)"
keywords: [miscellaneous]
software: ["pandoc"]
facility: []
abstract: |
  The Markdown format can be used to create articles with easily readable plain-text source while making it easy to translate to other formats (e.g., \TeX\, HTML, docx, and PDF [via \TeX]). This article announces and briefly describes templates and code which can use the pandoc software to convert Markdown into journal-compatible \TeX. The advantage of this approach is ease of readability of the source files and flexibility in output formats (e.g., for output to HTML). This article describes and demonstrates this technique for \aastex\ output, however the source repository also includes barebones examples for MNRAS and A&A. I am releasing the code and templates under free software / open culture licenses.

  Other journals or output formats only require the creation of new template files and/or modifications of the YAML header for the Markdown source.
---

Introduction {#sec:intro}

Manuscript preparation is an integral part of disseminating research. Currently, papers are predominantly prepared in \latex\ or, sometimes, in WYSIWYG editors such as Microsoft Word or Apple Pages. While powerful in their own ways, each of these has its own drawbacks. In particular, \latex\ often suffers from a steep learning curve and cryptic error messages, while WYSIWYG editors have historically had sub-par mathematics rendering and have struggled with robust internal referencing.

Here I describe and demonstrate a method of preparing manuscripts by writing them in Markdown ([Section @sec:markdown]) and using pandoc ([Section @sec:pandoc]) to convert the Markdown file into a format suitable for submission to journals (e.g., \TeX, Microsoft Word's .docx). This approach simplifies the writing process while retaining the power of \latex.

Markdown {#sec:markdown}

The Markdown specification was released by John Gruber in 2004^[https://daringfireball.net/projects/markdown/]. Markdown was originally intended to specify a plain text format which could be converted to HTML, with the motivation that:

A Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions. – John Gruber

Since its release, Markdown (along with its various flavors) has been extended and has become widely used. Describing Markdown is beyond the scope of this document. I assume the reader is familiar with the syntax and refer the reader to the pandoc Markdown description^[https://pandoc.org/MANUAL.html#pandocs-markdown] and Gruber's original specification.

Pandoc {#sec:pandoc}

pandoc is "a universal document converter"^[https://pandoc.org], originally written by John MacFarlane^[http://johnmacfarlane.net/]. At present it supports 25 input formats and 47 output formats (including variations of several standards such as Markdown). Additional formats can be supported by providing user-defined writers, written in the Lua language. pandoc is written in the Haskell programming language and supports extensions written as filters. This template

Note that the author can write \TeX\ directly into the Markdown file and pandoc will pass it through to the finished product unchanged. However, this may compromise alternate (non-\TeX) output formats. For example, the \aastex\ deluxetable environment can be used, but it will not render properly in non-\TeX\ formats. Pandoc filters^[https://pandoc.org/filters.html] may be crafted to convert simple pandoc tables into deluxetables on the fly, if desired.

Paper Organization

The bulk of this article demonstrates how to prepare a manuscript in Markdown such that it generates nearly-submittable \TeX\ ([Section @sec:prep]). This includes how to specify the article style via the YAML header of the Markdown file ([Section @sec:style]). I then demonstrate how to include images ([Section @sec:images]), tables ([Section @sec:tables]), citations ([Section @sec:citations]), and equations ([Section @sec:equations]). I conclude by discussing some practical considerations for this paper-writing process ([Section @sec:notes]).

Throughout I assume the reader is familiar with Markdown and do not discuss Markdown's text formatting. Instead I discuss the general behavior of the template file and actions which are necessary for generating \aastex-compatible output.

The Markdown file, pandoc invocation, and associated filters used to create the \TeX\ for this document are available at: https://github.com/privong/papers-in-markdown. I remind the reader that this approach can be extended to the templates of other journals by modifying the YAML header in the Markdown file and the \TeX\ template file.

Manuscript Preparation in Markdown {#sec:prep}

Manuscript Metadata and Styles {#sec:style}

The Markdown file can be prefixed with a header in the YAML ("YAML Ain't Markup Language") format. Article data such as the title, relevant dates, author list, keywords, etc. is specified here. This header information is extracted via a \TeX\ template file and passed through to the desired output file. The pandoc template also derives the \aastex\ style information from this YAML header, via the aastexopts entry. The YAML header given below is that used for the preparation of this document:

---
aastexopts: [singlecolumn]
preambleinput: ["macros.tex"]
texpackages:
- name: xspace
#  opts:
- name: amsmath
#  opts:
bibstyle: aasjournal
bibliography: demo.bib
received: "January 1, 2018"
#revised: "January 7, 2018"
#accepted: "\\today"
#submitjournal: ApJ
title: "Preparation of Articles using Markdown and Pandoc"
shorttitle: "Articles in Markdown and Pandoc"
shortauthors: Privon
author:
- name: George C. Privon
  ORCID: 0000-0003-3474-1125
  affiliation: ["Department of Astronomy, University of Florida, 211 Bryant Space Sciences Center, \
Gainesville, 32611 FL, USA"]
#  nocollaboration: 1
#- name: Josiah Carberry
#  ORCID: 0000-0002-1825-0097
#  affiliation: ["Wesleyan University, Middletown, CT", "Brown University: Providence, RI"]
#  collaboration: "(ORCID Demonstration)"
keywords: [miscellaneous]
software: ["[`pandoc`](http://pandoc.org)"]
facility: []
abstract: |
  A short abstract.
---

Unwanted entries can be commented out with a # or safely deleted (here they have been commented so they appear for reference purposes). If a different style (e.g., twocolumn) is desired, this can be changed in aastexopts. YAML header entries and corresponding \TeX\ template code have been created to correspond to most (if not all) of the \aastex\ metadata options.
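To make the role of the aastexopts and texpackages entries concrete, the following Python sketch mimics what a template could do with them. It is not the template's actual code: the function name render_preamble, the class name aastex62 (inferred only from the template filename), and the exact layout of the generated lines are illustrative assumptions.

```python
def render_preamble(meta, documentclass="aastex62"):
    """Build illustrative LaTeX preamble lines from header metadata.

    `meta` is a plain dict mirroring the YAML header. `documentclass`
    is an assumed class name, not read from the real template.
    """
    # Style options such as "singlecolumn" or "twocolumn" become class options.
    options = ",".join(meta.get("aastexopts", []))
    lines = [f"\\documentclass[{options}]{{{documentclass}}}"]

    # Each texpackages entry becomes a \usepackage line, with optional opts.
    for package in meta.get("texpackages", []):
        opts = package.get("opts")
        if opts:
            lines.append(f"\\usepackage[{opts}]{{{package['name']}}}")
        else:
            lines.append(f"\\usepackage{{{package['name']}}}")

    # Files listed in preambleinput are pulled in with \input.
    for filename in meta.get("preambleinput", []):
        lines.append(f"\\input{{{filename}}}")
    return lines


# Same header as above, but with the style switched to twocolumn.
header = {
    "aastexopts": ["twocolumn"],
    "preambleinput": ["macros.tex"],
    "texpackages": [{"name": "xspace"}, {"name": "amsmath"}],
}
preamble = render_preamble(header)
print("\n".join(preamble))
```

The point of the sketch is only that a one-word change in the YAML header is enough to change the generated document class options; the Markdown body is untouched.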

Images {#sec:images}

Images can be included, captioned, and labeled. A demonstration is [Figure @fig:dm1647], which was included as:

![An r-band image of dm1647+21 in grayscale with a lookalike N-body simulation overlaid as \
the colored points. From @Privon2017b.](images/dm1647.png){#fig:dm1647 width=3in height=3in}

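![An r-band image of dm1647+21 in grayscale with a lookalike N-body simulation overlaid as the colored points. From @Privon2017b.](images/dm1647.png){#fig:dm1647 width=3in height=3in}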

Tables {#sec:tables}

This tool will pass \latex\ tables through pandoc, unchanged, to the chosen \latex\ engine. Thus, any table environments provided by \aastex\ will work when producing PDFs. However, those tables will not propagate to the other output formats that pandoc supports.

[Table @tbl:storms] is an example of a "simple table"^[https://pandoc.org/MANUAL.html#tables]:

Date         Day         Number of storms
-----------  ----------  ----------------
2018-05-21   Monday      ...
2018-05-22   Tuesday     2
2018-05-23   Wednesday   1
2018-05-24   Thursday    3
2018-05-25   Friday      0
2018-05-26   Saturday    0
2018-05-27   Sunday      0

Table: Number of imaginary thunderstorms in Gainesville, FL during the 21st week of 2018. {#tbl:storms}

Citations {#sec:citations}

Citations can be incorporated using the pandoc-citeproc filter^[https://pandoc.org/MANUAL.html#citations]. A citation is written in the source as `[@Astropy2018]`, which is rendered as [@Astropy2018]. pandoc-citeproc uses the Citation Style Language^[https://citationstyles.org/] to format citations.

Equations {#sec:equations}

Equations can be specified and labeled in the text. Writing `$$e^{i\pi} + 1 = 0$$ {#eq:euler}` in the source results in this output:

$$e^{i\pi} + 1 = 0$$ {#eq:euler}

And [equation @eq:euler] can subsequently be referenced. This method of specifying math and equations can be coupled with pandoc's support for a variety of methods to render math in HTML^[https://pandoc.org/MANUAL.html#math-rendering-in-html].

Notes on Preparation for Submission {#sec:notes}

Available Templates

I have created templates and demonstration files for the AAS Journals, Monthly Notices of the Royal Astronomical Society, and Astronomy & Astrophysics. The AAS Journals example (which you are reading now) is the most complete, while the others are bare-bones and intended to constitute a minimal starting point. However, the Markdown used to generate this article should work in the demonstration files for the other journals.

The templates are different because each journal has different options and handles author lists, abstracts, and metadata differently. For example, collaboration information can be provided to AAS Journals, but not to MNRAS or A&A. Hence it makes some sense to keep the Markdown templates separate for these journals. However, if a manuscript is prepared using the Markdown template for A&A but the authors later decide to submit to MNRAS, the only major changes needed will be to the YAML header in the Markdown file. This method of manuscript preparation may therefore be somewhat more flexible than writing the document directly in \TeX.

Internal References

Natively, pandoc does not presently support internal references to figures or equations, nor numbered section references. However, the pandoc-crossref^[https://github.com/lierdakil/pandoc-crossref] filter adds support for this (and has been used in the preparation of this document). Because pandoc-crossref uses the same syntax as pandoc-citeproc, the former filter must be invoked before the latter. For example, the \TeX\ for this document was generated with:

pandoc demo.md -s --template aastex62_template.tex -o demo.tex \
               -F pandoc-crossref -F pandoc-citeproc
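When the conversion is scripted, the ordering constraint can be enforced in code rather than remembered. The Python sketch below assembles the same argument list as the command above and always places pandoc-crossref first. The function name build_pandoc_command and its parameters are hypothetical helpers for illustration; only the pandoc flags themselves (-s, --template, -o, -F, --lua-filter=) come from pandoc.

```python
import shlex


def build_pandoc_command(source, template, output, filters=(), lua_filters=()):
    """Assemble a pandoc argument list with pandoc-crossref ahead of other filters."""
    # sorted() is stable, so pandoc-crossref (key False) moves to the front
    # while all other filters keep their relative order.
    ordered = sorted(filters, key=lambda name: name != "pandoc-crossref")
    command = ["pandoc", source, "-s", "--template", template, "-o", output]
    for name in ordered:
        command += ["-F", name]
    # Lua filters are appended after the executable filters in this sketch.
    for path in lua_filters:
        command.append(f"--lua-filter={path}")
    return command


# The filters are deliberately given in the wrong order here.
command = build_pandoc_command(
    "demo.md",
    "aastex62_template.tex",
    "demo.tex",
    filters=["pandoc-citeproc", "pandoc-crossref"],
)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` on a machine where pandoc and both filters are installed; the sketch itself only builds and prints the command.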

Document Filters

pandoc supports user-written filters. We have already seen two filters, pandoc-citeproc and pandoc-crossref. These filters enable customized processing of documents during conversion. Commonly used languages for this include Haskell, Lua, and Python^[Using either the panflute or pandocfilters modules.]. Note that a Lua interpreter is included with pandoc versions 2.0 and newer, and Lua filters run faster than the other options.

With output formats besides \aastex\ in mind, the acknowledgments portion of the document has been delineated in the Markdown file with a macro: {{acknowledgments}}. However, it is desirable to automatically convert this to an \acknowledgments macro when creating a \TeX\ file. As a filter demonstration, the following Lua code performs this translation:

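-- Replace the {{acknowledgments}} placeholder with the AASTeX macro
-- when the output format is LaTeX; leave it untouched otherwise.
return {
  {
    Str = function (elem)
      if elem.text == "{{acknowledgments}}" then
        -- FORMAT is a global set by pandoc to the name of the output format.
        if string.find(FORMAT, "latex") then
          return pandoc.RawInline("tex", "\\acknowledgments")
        else
          return elem
        end
      end
    end,
  }
}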

This filter is included as aastex62/filters/acknowledgments.lua in the template distribution. It can be used with the --lua-filter= command-line argument. This filter can easily be extended to other output formats, including HTML.
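For readers who prefer Python, the same translation can be sketched against pandoc's JSON representation of the document, in which every element is an object with a type key "t" and a contents key "c". The sketch below works on plain dicts and does not use the panflute or pandocfilters modules; the function name replace_acknowledgments is hypothetical, and the emitted macro is the \acknowledgments macro named in the text above.

```python
import json


def replace_acknowledgments(node, output_format):
    """Recursively walk a pandoc JSON AST, swapping the placeholder Str node."""
    if isinstance(node, list):
        return [replace_acknowledgments(item, output_format) for item in node]
    if isinstance(node, dict):
        is_placeholder = (
            node.get("t") == "Str" and node.get("c") == "{{acknowledgments}}"
        )
        if is_placeholder and "latex" in output_format:
            # A RawInline node stores its format and raw text as a two-item list.
            return {"t": "RawInline", "c": ["tex", "\\acknowledgments"]}
        return {
            key: replace_acknowledgments(value, output_format)
            for key, value in node.items()
        }
    return node  # strings, numbers, and None pass through unchanged


# A hand-written fragment shaped like the JSON pandoc emits for one paragraph.
paragraph = {"t": "Para", "c": [{"t": "Str", "c": "{{acknowledgments}}"}]}
converted = replace_acknowledgments(paragraph, "latex")
print(json.dumps(converted))
```

A complete JSON filter would read the whole document from standard input, take the output format from its first command-line argument, and write the transformed JSON to standard output. For any format whose name does not contain "latex", the function returns the tree unchanged, matching the behavior of the Lua version.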

More generally, additional filters would be useful in automating the conversion of Markdown files into journal-compatible \TeX. One opportunity is to write a filter that takes the Markdown "simple table" format and converts it into an \aastex\ deluxetable.
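As a rough starting point for such a filter, the Python sketch below covers only the final step: rendering already-extracted column names and rows as deluxetable markup. It is not part of the template distribution. The function name simple_table_to_deluxetable, the choice to left-align every column, and the placement of the label inside the caption are assumptions, and cell contents are not escaped for \TeX.

```python
def simple_table_to_deluxetable(header, rows, caption, label):
    """Render column names and rows as an AASTeX deluxetable string."""
    column_spec = "l" * len(header)  # assumption: left-align every column
    column_heads = " & ".join(f"\\colhead{{{name}}}" for name in header)
    # Cells are joined with "&"; rows are separated by a LaTeX line break.
    body = " \\\\\n".join(
        " & ".join(str(cell) for cell in row) for row in rows
    )
    return "\n".join([
        f"\\begin{{deluxetable}}{{{column_spec}}}",
        f"\\tablecaption{{{caption}\\label{{{label}}}}}",
        f"\\tablehead{{{column_heads}}}",
        "\\startdata",
        body,
        "\\enddata",
        "\\end{deluxetable}",
    ])


# Two rows taken from the thunderstorm table; the caption is abbreviated.
rows = [("2018-05-22", "Tuesday", 2), ("2018-05-23", "Wednesday", 1)]
table = simple_table_to_deluxetable(
    ["Date", "Day", "Number of storms"],
    rows,
    "Number of imaginary thunderstorms.",
    "tbl:storms",
)
print(table)
```

A working filter would still need to pull the header, rows, and caption out of pandoc's Table element and return the rendered string as a raw \TeX\ block, and it would have to decide how to map pandoc's column alignments onto the deluxetable column specification.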

Summary

I have provided a brief demonstration of a method for writing research articles in Markdown and converting them to an \aastex-compatible format for submission to AAS Journals. This method is easily extended to other research journals. The advantages of this approach are improved readability of the source material and added flexibility in output formats. The template and demonstration text are made publicly available for use and enhancement by the community: https://github.com/privong/papers-in-markdown.

{{acknowledgments}}

G.C.P. acknowledges support from the University of Florida.