I wanted to read a long text published as a webpage, but reading it on a computer screen is painful, so I converted it to an ebook for my e-reader.
First, download the webpage.
wget https://urbigenous.net/library/how_to_build.html
Convert it with pandoc.
pandoc -o book.epub how_to_build.html
And open book.epub.
It's that simple.
With no edits the result looks OK, and pandoc was able to extract the title and author from the HTML headers.
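The two commands above could be wrapped in a small shell function for repeated use. This is a minimal sketch, not a tested tool: the function names are made up for illustration, it assumes the URL ends in a file name with an .html extension, and it names the output after the page rather than book.epub.

```shell
#!/bin/sh
# Sketch: download a page and convert it to EPUB in one step.
# page_name, epub_name and url_to_epub are illustrative names,
# not part of wget or pandoc.

# Derive the local file name from the URL,
# e.g. ".../how_to_build.html" -> "how_to_build.html".
page_name() {
    basename "$1"
}

# Swap the .html extension for .epub.
epub_name() {
    printf '%s.epub\n' "${1%.html}"
}

# Download the page, then convert it; stop if the download fails.
url_to_epub() {
    page=$(page_name "$1")
    wget -O "$page" "$1" || return 1
    pandoc -o "$(epub_name "$page")" "$page"
}

# Usage: url_to_epub https://urbigenous.net/library/how_to_build.html
```

Nothing runs until url_to_epub is called, so the file can be sourced from a shell profile without side effects.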
You can also edit it before converting it into an epub.
Convert to markdown first.
pandoc -o book.md how_to_build.html
Edit book.md as you wish; a metadata block like this at the top of the file will do. You can find more details in pandoc’s documentation.
---
title:
- type: main
  text: How to Build a Universe That Doesn’t Fall Apart Two Days Later
creator:
- role: author
  text: Philip K. Dick
date: 1978
ibooks:
  version: 1.3.4
...
And convert it to epub.
pandoc -o book2.epub book.md
Both approaches work. The second takes more time, but it lets you remove unwanted content and modify the metadata.
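If you would rather not paste the metadata into book.md by hand, one option is to keep it in a separate file and pass both files to pandoc, which concatenates its inputs. The sketch below assumes that approach: metadata.yaml and the function names are illustrative choices, the metadata is a trimmed copy of the block shown earlier, and the input format is given explicitly with -f markdown so it does not depend on how pandoc guesses from file extensions.

```shell
#!/bin/sh
# Sketch: keep the metadata in its own file instead of editing book.md.
# write_metadata, build_epub and metadata.yaml are illustrative names.

# Write a trimmed copy of the metadata block to the file given as $1.
write_metadata() {
    cat > "$1" <<'EOF'
---
title:
- type: main
  text: How to Build a Universe That Doesn’t Fall Apart Two Days Later
creator:
- role: author
  text: Philip K. Dick
date: 1978
...
EOF
}

# Pandoc joins its input files in order, so the metadata file
# is listed before the text.
build_epub() {
    write_metadata metadata.yaml
    pandoc -f markdown -o book2.epub metadata.yaml book.md
}

# Usage: build_epub
```

Removing unwanted content such as the breadcrumb line still means editing book.md; this only covers the metadata part.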
Results
Original:
[urbigenous.net](https://urbigenous.net/) \> [Etext](/library/) \> How
to Build a Universe That Doesn't Fall Apart Two Days Later
::: {#text role="main"}
# How to Build a Universe That Doesn't Fall Apart Two Days Later
## Philip K. Dick, 1978
First, before I begin to bore you with the usual sort of things science
fiction writers say in speeches, let me bring you official greetings
After edit:
---
title:
- type: main
  text: How to Build a Universe That Doesn’t Fall Apart Two Days Later
creator:
- role: author
  text: Philip K. Dick
date: 1978
ibooks:
  version: 1.3.4
...
::: {#text role="main"}
First, before I begin to bore you with the usual sort of things science
fiction writers say in speeches, let me bring you official greetings