How to use Pandoc to convert files in Linux

Updated 2025-01-18. Source: SLTechnology News&Howtos (Shulou.com), Development section.

This article shows how to use Pandoc to convert files on Linux.

Pandoc is a command-line tool for converting files from one markup language to another. Markup languages use tags to annotate sections of a document. Commonly used formats include Markdown, reStructuredText, HTML, LaTeX, ePub, and Microsoft Word DOCX.

Pandoc installation and requirements

Pandoc is available in the package repositories of most Linux distributions. This tutorial uses pandoc-2.2.3.2 and pandoc-citeproc-0.14.3. If you don't plan to generate PDFs, these two packages are enough. However, I recommend installing texlive as well, so that you have the option to generate PDFs.

Install these programs on a Debian-based distribution such as Ubuntu with the following command:

    sudo apt-get install pandoc pandoc-citeproc texlive

You can find installation instructions for other platforms on the Pandoc website.

I strongly recommend installing pandoc-crossref, a "filter for numbering figures, equations, tables, and cross-references to them". The easiest way to install it is to download a prebuilt executable, but you can also install it from Haskell's package manager, cabal, with the following commands:

    cabal update
    cabal install pandoc-crossref

If you need additional Haskell installation information, refer to pandoc-crossref's GitHub repository.
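Before continuing, it can be useful to confirm which of the tools are actually on your PATH. The following is a minimal sketch, not part of the original tutorial: check_tool is a hypothetical helper name, and the assumption that texlive provides the pdflatex command holds for typical installations but may differ on yours.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Tools used in this tutorial (pdflatex is assumed to come from texlive).
for tool in pandoc pandoc-crossref pdflatex; do
    check_tool "$tool"
done

# Print the first line of the version banner if Pandoc is installed.
if command -v pandoc >/dev/null 2>&1; then
    pandoc --version | head -n 1
fi
```

A "missing" line for pdflatex only matters if you want PDF output; the HTML, DOCX, and ODT examples do not need it.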

A few examples

I'll demonstrate how Pandoc works by explaining how to generate three types of documents:

* A web page from a LaTeX file containing mathematical formulas
* A Reveal.js slideshow from a Markdown file
* A contract document that mixes Markdown and LaTeX

Create a website containing mathematical formulas

One of Pandoc's strengths is that it can render mathematical formulas in different output formats. For example, we can generate a web page from a LaTeX document called math.tex that contains some mathematical symbols (written in LaTeX).

The math.tex document is as follows:

    % Pandoc math demos

    $a^2 + b^2 = c^2$

    $v(t) = v_0 + \frac{1}{2}at^2$

    $\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}$

    $\exists x \forall y (Rxy \equiv Ryx)$

    $p \wedge q \models p$

    $\Box\diamond p\equiv\diamond p$

    $\int_{0}^{1} x dx = \left[ \frac{1}{2}x^2 \right]_{0}^{1} = \frac{1}{2}$

    $e^x = \sum_{n=0}^\infty \frac{x^n}{n!} = \lim_{n\rightarrow\infty} (1+x/n)^n$

Convert the LaTeX document to a web page named mathMathML.html by entering the following command:

    pandoc math.tex -s --mathml -o mathMathML.html

The -s flag tells Pandoc to generate a standalone web page (rather than a fragment, so the HTML includes head and body tags), and the --mathml flag makes Pandoc convert the LaTeX formulas into MathML, which modern browsers can render.
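To see what the -s flag changes, the following sketch converts a one-formula file twice, once without and once with -s. The file names demo.tex, fragment.html, and standalone.html are hypothetical and chosen only for this illustration; the conversion step is skipped if Pandoc is not installed.

```shell
#!/bin/sh
# Write a one-formula LaTeX file (hypothetical name: demo.tex).
cat > demo.tex <<'EOF'
$a^2 + b^2 = c^2$
EOF

if command -v pandoc >/dev/null 2>&1; then
    # Without -s: only an HTML fragment is written.
    pandoc demo.tex --mathml -o fragment.html
    # With -s: a complete page, including head and body tags.
    pandoc demo.tex -s --mathml -o standalone.html
    # Count lines containing a body tag in each file; only the
    # standalone page should contain one.
    grep -c '<body' fragment.html standalone.html
else
    echo "pandoc not found; skipping conversion"
fi
```

Recent Pandoc versions may warn that a standalone HTML page has no title; setting one in the document's meta-information avoids that.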

Take a look at the resulting web page and its code. The Makefile in the code repository makes the conversion even simpler to run.

Make a Reveal.js slideshow

It is easy to generate a simple presentation from a Markdown file using Pandoc. The presentation consists of top-level slides with nested slides beneath them. You can control it from the keyboard, jumping from one top-level slide to the next or showing the nested slides below a top-level slide. This structure is typical of HTML-based presentation frameworks.

Create a slide document called SLIDES (see the code repository). First, add the slides' meta-information (title, author, and date), each on a line prefixed with the % symbol:

    % Case Study
    % Kiko Fernandez Reyes
    % Sept 27, 2018

This meta-information also creates the first slide. To add more slides, declare top-level slides with Markdown level-one headings (line 5 in the following example).

For example, you can create a presentation titled "Case Study" with a top-level slide named "Wine Management System" using the following document:

    % Case Study
    % Kiko Fernandez Reyes
    % Sept 27, 2018

    # Wine Management System

Use Markdown level-two headings to put content (such as slides describing a new management system and its implementation) inside the top-level slide you just created. Let's add two more slides (the level-two headings on lines 7 and 14 of the finished example further below).

The first second-level slide is titled "Idea" and displays an image of the Swiss flag. The second is titled "Implementation".

    % Case Study
    % Kiko Fernandez Reyes
    % Sept 27, 2018

    # Wine Management System

    ## Idea

    ## Implementation

We now have a top-level slide (# Wine Management System) containing two slides (## Idea and ## Implementation).

Add content to these two slides with incremental lists, that is, Markdown list items prefixed with the > symbol. Building on the code above, add two items to the first slide (lines 9-10) and five items to the second (lines 16-20):

    % Case Study
    % Kiko Fernandez Reyes
    % Sept 27, 2018

    # Wine Management System

    ## Idea

    >- Swiss love their **wine** and cheese
    >- Create a *simple* wine tracker system

    ![](img/matterhorn.jpg)

    ## Implementation

    >- Bottles have a RFID tag
    >- RFID reader (emits and read signal)
    >- **Raspberry Pi**
    >- **Server (online shop)**
    >- Mobile app

The code above also adds an image of the Matterhorn. You can refine the slides further with plain Markdown syntax or by adding HTML tags.

To generate the slides, Pandoc needs to reference the Reveal.js library, so it must be in the same folder as the SLIDES file. The command to generate the slides is as follows:

    pandoc -t revealjs -s --self-contained SLIDES \
    -V theme=white -V slideNumber=true -o index.html

The Pandoc command above takes the following parameters:

* -t revealjs means that the output is a revealjs presentation
* -s tells Pandoc to generate a standalone document
* --self-contained generates an HTML file with no external dependencies
* -V sets the following variables:
  * theme=white sets the theme of the slides to white
  * slideNumber=true displays the slide number
* -o index.html writes the slides to a file named index.html

To simplify the process and avoid typing such a long command, create the following Makefile:

    all: generate

    generate:
    	pandoc -t revealjs -s --self-contained SLIDES \
    	-V theme=white -V slideNumber=true -o index.html

    clean: index.html
    	rm index.html

    .PHONY: all clean generate

All the code can be found in this repository.

Make a contract in multiple formats

Suppose you are preparing a document and (as is common these days) some people want it in Microsoft Word format, others use free software and want ODT, and still others need PDF. Instead of using OpenOffice or LibreOffice to generate the DOCX or PDF files, you can write the document in Markdown (with some LaTeX syntax where you need advanced formatting) and generate any of these file types from it.

As before, first declare the meta-information (title, author, and date) of the document:

    % Contract Agreement for Software X
    % Kiko Fernandez-Reyes
    % August 28th, 2018

Then write the document in Markdown, adding LaTeX where advanced formatting is required. For example, create a table with fixed spacing between its columns (declared with \hspace{3cm} in LaTeX) and lines for the customer and the contractor to fill in (declared with \hrulefill in LaTeX). After that, add a table written in Markdown.

The code to create this document is as follows:

    % Contract Agreement for Software X
    % Kiko Fernandez-Reyes
    % August 28th, 2018

    ...

    # Work Order

    \begin{table}[h]
    \begin{tabular}{ccc}
    The Contractor & \hspace{3cm} & The Customer \\
    & & \\
    \hrulefill & \hspace{3cm} & \hrulefill \\
    %
    Name & \hspace{3cm} & Name \\
    & & \\
    \hrulefill & \hspace{3cm} & \hrulefill \\
    ...
    \end{tabular}
    \end{table}

    \vspace{1cm}

    +--------------------------------------------+----------+-------------+
    | Type of Service                            | Cost     | Total       |
    +:===========================================+=========:+:===========:+
    | Game Engine                                | 70.0     | 70.0        |
    +--------------------------------------------+----------+-------------+
    | Extra: Comply with defined API functions   | 10.0     | 10.0        |
    | and expected returned format               |          |             |
    +--------------------------------------------+----------+-------------+
    | **Total Cost**                             |          | **80.0**    |
    +--------------------------------------------+----------+-------------+

To generate the three different output formats required for this document, write the following Makefile:

    DOCS=contract-agreement.md

    all: $(DOCS)
    	pandoc -s $(DOCS) -o $(DOCS:md=pdf)
    	pandoc -s $(DOCS) -o $(DOCS:md=docx)
    	pandoc -s $(DOCS) -o $(DOCS:md=odt)

    clean:
    	rm *.pdf *.docx *.odt

    .PHONY: all clean

Lines 4 through 6 are the commands that generate the three output formats.
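If make is not available, the same three conversions can be written as a shell loop. This is a minimal sketch under stated assumptions: output_name is a hypothetical helper, and the script falls back to printing the commands (a dry run) when Pandoc or contract-agreement.md is missing.

```shell
#!/bin/sh
# Source document, named as in the Makefile above.
doc=contract-agreement.md

# Print the output file name for a source file and a target extension,
# mirroring the $(DOCS:md=pdf) substitution used in the Makefile.
output_name() {
    echo "${1%.md}.$2"
}

for ext in pdf docx odt; do
    out=$(output_name "$doc" "$ext")
    if command -v pandoc >/dev/null 2>&1 && [ -f "$doc" ]; then
        pandoc -s "$doc" -o "$out"
    else
        # Dry run: show the command that would be executed.
        echo "pandoc -s $doc -o $out"
    fi
done
```

Pandoc infers each output format from the file extension, which is why no -t option is needed here. The PDF step additionally requires a LaTeX installation such as texlive.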

If you have several Markdown files and want to merge them into one document, list them on the command line in the order in which they should appear. For example, while writing this article, I created three documents: an introduction, three examples, and some advanced uses. The following command tells Pandoc to merge these files in the given order and produce a PDF named document.pdf:

    pandoc -s introduction.md examples.md advanced-uses.md -o document.pdf

Templates and meta-information

Writing a complex document is not easy, and you need to follow a set of rules that are independent of the content, such as using a specific template, writing an abstract, embedding specific fonts, and perhaps even declaring keywords. None of this has anything to do with the content: simply put, it is meta-information.

Pandoc uses templates to generate the different output formats. For example, there is a template for LaTeX, another for ePub, and so on. These templates contain unset variables for the meta-information. Use the following command to see the meta-information available in a Pandoc template:

    pandoc -D FORMAT

For example, the template for LaTeX is printed with:

    pandoc -D latex

The template is long; the part that declares the title-page fields looks like this excerpt:

    $if(title)$
    \title{$title$$if(thanks)$\thanks{$thanks$}$endif$}
    $endif$
    $if(subtitle)$
    \providecommand{\subtitle}[1]{}
    \subtitle{$subtitle$}
    $endif$
    $if(author)$
    \author{$for(author)$$author$$sep$ \and $endfor$}
    $endif$
    $if(institute)$
    \providecommand{\institute}[1]{}
    \institute{$for(institute)$$institute$$sep$ \and $endfor$}
    $endif$
    \date{$date$}
    $if(beamer)$
    $if(titlegraphic)$
    \titlegraphic{\includegraphics{$titlegraphic$}}
    $endif$
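Because the full template is long, it can help to list only the variable names it tests. The following is a minimal sketch, assuming a grep that supports -o and -E and a sed that supports -E (as GNU and BSD versions do); template_vars is a hypothetical helper name, not part of Pandoc, and it only catches names that appear inside $if(...)$ or $for(...)$.

```shell
#!/bin/sh
# Read a Pandoc template on standard input and print the names used in
# $if(...)$ and $for(...)$ constructs, one per line, without duplicates.
template_vars() {
    grep -oE '\$(if|for)\([A-Za-z0-9_.-]+\)\$' |
        sed -E 's/^\$(if|for)\(//; s/\)\$$//' |
        sort -u
}

# List the variables tested by the default LaTeX template, if Pandoc is installed.
if command -v pandoc >/dev/null 2>&1; then
    pandoc -D latex | template_vars
fi
```

Replace latex with any other output format name to inspect that format's template.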

As you can see, the output includes the title, thanks, author, subtitle, and institute template variables (and many others are available). You can set these easily with a YAML metadata block. In lines 1-5 of the following example, we declare a YAML metadata block and set some of the variables (using the contract agreement example from above):

    ---
    title: Contract Agreement for Software X
    author: Kiko Fernandez-Reyes
    date: August 28th, 2018
    ---

    (continue writing document as in the previous example)

This works well and is equivalent to the previous code:

    % Contract Agreement for Software X
    % Kiko Fernandez-Reyes
    % August 28th, 2018

However, doing so ties the meta-information to the content; that is, Pandoc will always use this information when producing output in a new format. If you plan to generate several file formats, be careful. For example, what if you need to produce the contract in ePub and in HTML, and the two need different style rules?

Consider the options:

* If you simply embed the YAML variable css: style-epub.css, you exclude the stylesheet for the HTML version. That does not work.
* Duplicating the document is obviously not a good solution either, because changes to one version would not be synchronized with the other.
* You can add the variables to the Pandoc command as follows:

    pandoc -s -V css=style-epub.css document.md -o document.epub
    pandoc -s -V css=style-html.css document.md -o document.html

In my opinion, it is easy to overlook variables set on the command line, especially when you need to set dozens of them (which can happen in complex documents). If you instead put them all in one file (a meta.yaml file), you only need to update or create a meta-information file to produce the desired output format. You would then write commands like these:

    pandoc -s meta-pub.yaml document.md -o document.epub
    pandoc -s meta-html.yaml document.md -o document.html

This is a cleaner approach: you can update all the meta-information in a single file without touching the document's content.
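As an illustration, the two meta-information files could be created as follows. This is a sketch under stated assumptions: the field values reuse the contract example, the stylesheets style-epub.css and style-html.css are assumed to exist, and the css field follows the commands above (which fields a given output format honors can depend on the Pandoc version). Because Pandoc reads these files as ordinary input ahead of document.md, each must be a complete YAML block that opens with --- and closes with ... or ---.

```shell
#!/bin/sh
# Meta-information for the ePub build.
cat > meta-pub.yaml <<'EOF'
---
title: Contract Agreement for Software X
author: Kiko Fernandez-Reyes
css: style-epub.css
...
EOF

# Meta-information for the HTML build: same fields, different stylesheet.
cat > meta-html.yaml <<'EOF'
---
title: Contract Agreement for Software X
author: Kiko Fernandez-Reyes
css: style-html.css
...
EOF

# Run the conversions only if Pandoc and the source document are present.
if command -v pandoc >/dev/null 2>&1 && [ -f document.md ]; then
    pandoc -s meta-pub.yaml document.md -o document.epub
    pandoc -s meta-html.yaml document.md -o document.html
fi
```

Supporting another output format then means adding one more small YAML file rather than editing document.md.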
