A plaintext tool suite
So, you’re curious about the man behind the curtain?
Well, that man is definitely not as secretive as the wizard of Oz. Come, and I’ll tell you everything.
First off, know that I’m one of those weirdoes that run Linux. OK, I know, I know, I'm a hopeless geek. However, all the tools I shall describe work under Windows and Mac OS X should you be so inclined.
Second, I use plain text files for all my source files.
Plain text, really?
Yes, plain text.
Oh, of course, in a novel you want some formatting: italics, chapter breaks, chapter headings, footnotes (yes, sorry, I like putting comments and clarifications in footnotes) and so on. However, just because the original file is plain text doesn’t mean you cannot add such things.
There are special ways to enter plain text, known as “lightweight markup languages”, which can be used as a source file to create ePub and MOBI files for ereaders, as well as PDF for printing. The process is automated in theory (however, a later post will explain why the process isn’t 100% automated for me; I hope by the next novel I’ll manage to fully automate the transformation to ePub). You do need a bit of savvy, but it’s still less effort than doing everything by hand.
There are several lightweight markup languages available. Two have retained my interest: Markdown and reStructuredText (also known as reST).
This website is actually written using reStructuredText, and the initial versions of “I, Cunningham” also used that markup language. However, reStructuredText has less mind share than Markdown, so in the end, I converted the novel to Markdown using Pandoc.
There are several advantages to using such plain text markup:
- Basic revision control systems work. This means you can use tools such as Git to keep all versions of your work as time goes by. If you find that you don’t like the direction you’re taking, you can revert to an older version and start from there.
- Plain text editors don’t distract you with toolbars or menus. You can use anything, really. I use GNU Emacs but that’s just an old habit from my university days. There are many “distraction free” editors available, such as FocusWriter.
- As a corollary, transferring files to different text editors and operating systems will never cause problems. Contrast that with moving a Microsoft Word DOCX file to Google Docs, then to OpenOffice. You will eventually lose some formatting, or even some data such as annotations.
- Since you’re just editing text files, any computer will do, even a lowly Chromebook or the old EeePC netbooks of yore. Heck, you could use a PineBook if you’d like! (Since the latter runs Linux rather than ChromeOS, you might actually be able to use it for generating the ePub and PDF files, as it will have access to the tools I’m using on my own PC.)
The basic process
So what does the writing process look like?
Basically, you write something that's similar to plain text. Here's a sample from my novel:
Norman's own ears and features were cast from the same mould, although his brown hair was worn short. Which, in his opinion, made a lot more sense than his boss's hair style in a zero-*g* environment. However, he was just the co-pilot; he didn't feel it was his place to ask the flight captain to tie down her darn hair. She'd either remember to do it herself, or not.

"Been trying for years, Rub', but I can't stand plain water," Norman pointed out good-naturedly. "Doesn't taste anything, and it's somebody's recycled piss... Ugh!"
OK, so hopefully that didn’t gross you out :)
Here's the rendered version:
Norman’s own ears and features were cast from the same mould, although his brown hair was worn short. Which, in his opinion, made a lot more sense than his boss’s hair style in a zero-g environment. However, he was just the co-pilot; he didn’t feel it was his place to ask the flight captain to tie down her darn hair. She’d either remember to do it herself, or not.
“Been trying for years, Rub’, but I can’t stand plain water,” Norman pointed out good-naturedly. “Doesn’t taste anything, and it’s somebody’s recycled piss… Ugh!”
Notice how the typography was handled (smart quotes, a proper ellipsis, etc.). Also, this HTML can now be embedded in an epub. The actual markup:
<p>Norman’s own ears and features were cast from the same mould, although his brown hair was worn short. Which, in his opinion, made a lot more sense than his boss’s hair style in a zero-<em>g</em> environment. However, he was just the co-pilot; he didn’t feel it was his place to ask the flight captain to tie down her darn hair. She’d either remember to do it herself, or not.</p>
<p>“Been trying for years, Rub’, but I can’t stand plain water,” Norman pointed out good-naturedly. “Doesn’t taste anything, and it’s somebody’s recycled piss… Ugh!”</p>
Big deal, you say. Why bother with that, it’s just a few HTML tags, right? Well, when you’re writing and trying to stay in a flow state, it matters a lot.
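To make the transformation concrete, here is a toy Python sketch of the kind of work a converter does on the sample above. It is emphatically not how Pandoc or Calibre work internally: `smarten` and `to_html` are hypothetical names, every single quote is assumed to be an apostrophe, and no HTML escaping is done.

```python
import re


def smarten(text):
    """Toy typography pass: curly quotes and a real ellipsis character."""
    text = text.replace("...", "\u2026")
    # A double quote at the start of the text or after whitespace opens a quotation.
    text = re.sub(r'(?<!\S)"', "\u201c", text)
    # Any double quote left over closes one.
    text = text.replace('"', "\u201d")
    # Simplification: treat every single quote as an apostrophe.
    return text.replace("'", "\u2019")


def to_html(markdown):
    """Wrap blank-line-separated paragraphs in <p> and turn *x* into <em>x</em>."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", markdown) if p.strip()]
    html = []
    for paragraph in paragraphs:
        paragraph = smarten(paragraph)
        paragraph = re.sub(r"\*(.+?)\*", r"<em>\1</em>", paragraph)
        html.append(f"<p>{paragraph}</p>")
    return "\n".join(html)
```

Calling `to_html` on two short paragraphs separated by a blank line yields two `<p>` elements with the quotes and ellipsis already typeset. A real converter handles far more (nested quotes, escaping, headings, footnotes), which is exactly why you want a tool to do it for you.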
So conversion can usually be done with a program such as Pandoc, but I wasn’t 100% happy with the result. For one, my book has footnotes even though it’s a novel. Yes, I know, this goes against a bunch of style manuals. Still, I felt such footnotes were useful to the readers less versed in common science-fiction tropes that were not so important to the plot as to warrant an in-text explanation. Save, maybe, for one exception, which was footnoted to let the reader exercise their cleverness in figuring out a slightly unobvious pun.
All that to say, Pandoc puts the footnotes at the end of each chapter in the epub, which did not suit me at all. I could’ve modified the source code... but Pandoc is written in this odd language called Haskell which I don’t know very well.
I looked for a couple of other solutions, but in the end opted to just run the Markdown file through Calibre. Calibre, for the uninitiated, is mainly an ebook conversion tool. It is mostly written in Python, which I do know very well, so even though I didn’t like its footnote treatment 100% (it was much closer to what I wanted than Pandoc’s), it was easy for me to tweak.
There are still quite a few manual steps to the process, namely:
- Make the ISBN the primary identifier for the epub file. By default (and I haven’t found a good way to override this), Calibre generates a GUID (a random number which is pretty much guaranteed to be unique) and sets that as the primary identifier for the file, even if you specify the ISBN. Now, there’s nothing wrong with that and it’s actually a good idea for conversions between ebook formats, but for a book meant for publication, it’s best if the ISBN is the primary identifier, so catalogs can index it properly and find the marketing cover image.
- Mark footnotes as nonlinear. My modified footnote script creates a nice set of divs at the end of the file, one per footnote, so each can have a separate page. This is very important on old Kindles which don’t display a pop-up for footnotes (yes, Kindles don’t read epub; more on that later). However, those files must not be seen as part of the book’s body. There’s no way to request that in Calibre’s page split configuration.
- Change epub_type attributes to epub:type. For some reason, Calibre’s processing pipeline doesn’t let me specify namespaces properly. I never found the root cause so I ended up emitting the attributes in namespace epub as epub_attributeName, then replacing through a global search-and-replace. Thankfully, Calibre’s ebook editor lets you do that easily.
- Change file extensions inside the epub from .html to .xhtml. Now, Calibre’s author didn’t want to make this automatic because he feels the file extension does not matter. And by and large, the epub spec agrees. Unfortunately, the Kobo Writing Life platform uses an old version of the epub validator, which does complain needlessly about this. So, to shut it up, I rename the internal files. Thankfully, Calibre’s ebook editor can do that. Unfortunately, it cannot do it from the command line...
- Move footnotes after the TOC. This is a simple structural change, just a copy-paste of a specific block. Unfortunately, Calibre can only generate the TOC at the start or end of the epub; there is no way to ask it to move it to a specific location.
- Remove special characters from TOC. Some ereaders don’t like ellipsis or smart quotes in the TOC although they’re displayed properly in the body text. There are only two headings that need this so it is not a huge deal.
- Fix the CSS stylesheet. I have a few tweaks to make to the existing Calibre styles, and I prefer to tweak them after conversion rather than try to figure out how to override them with additional custom CSS.
- Run kepubify on the epub, to get a nice Kobo-specific epub. Send it to Kobo Writing Life after checking on my Kobo.
- Finally, create a copy of the original epub, change the ISBN primary identifier in it, and run kindlegen on it. This generates a .mobi file, which I can then check on my ancient Kindle 4g non-touch, and finally send to KDP. This is needed for Kindle because you need a different ISBN for different ebook formats. Silly rule.
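Two of the steps above (marking footnotes as nonlinear and cleaning up the TOC headings) boil down to text substitutions, so they are easy to sketch in Python. This is only an illustration under assumptions: `mark_nonlinear` and `plain_heading` are hypothetical helper names, the footnote `idref` values are assumed to be known in advance, and the regex assumes each spine entry is written as a plain `<itemref idref="...">` tag.

```python
import re

# Smart punctuation that some ereaders mishandle in the TOC, mapped to ASCII.
ASCII_PUNCTUATION = str.maketrans({
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2026": "...",  # ellipsis
})


def plain_heading(title):
    """Return a TOC heading with smart quotes and ellipses replaced by ASCII."""
    return title.translate(ASCII_PUNCTUATION)


def mark_nonlinear(opf_text, footnote_ids):
    """Add linear="no" to the spine <itemref> of every listed footnote file."""

    def patch(match):
        tag = match.group(0)
        idref = re.search(r'idref="([^"]+)"', tag)
        # Leave the tag alone if it is not a footnote or already has a linear attribute.
        if idref and idref.group(1) in footnote_ids and "linear=" not in tag:
            return tag.replace("<itemref", '<itemref linear="no"', 1)
        return tag

    return re.sub(r"<itemref\b[^>]*>", patch, opf_text)
```

The `linear="no"` attribute on a spine `itemref` is how the epub format flags content that sits outside the main reading order. A sturdier version would parse the OPF file as XML rather than lean on a regex.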
So that’s a lot of annoying steps, and it took me a while to figure out.
Eventually I will automate most of this through some Python script. The worst is the file extension rename. I wish Kobo would just update their epubcheck version...
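As a rough idea of what the rename and the epub_type fix could look like in such a script, here is a minimal sketch using only Python's standard `zipfile` module. It is an assumption-laden illustration, not a finished tool: `fix_epub` is a hypothetical name, all text files are assumed to be UTF-8, and the substitution would also rewrite links to external pages ending in `.html`, so those would need extra care.

```python
import re
import zipfile

# Files inside the epub whose text may reference other content files.
TEXT_SUFFIXES = (".html", ".xhtml", ".opf", ".ncx")


def fix_epub(src_path, dst_path):
    """Copy an epub, renaming .html files to .xhtml and fixing epub_type attributes."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            if info.is_dir():
                continue
            name = info.filename
            data = src.read(name)
            if name == "mimetype":
                # The epub container format wants this entry stored uncompressed.
                dst.writestr("mimetype", data, compress_type=zipfile.ZIP_STORED)
                continue
            if name.endswith(TEXT_SUFFIXES):
                text = data.decode("utf-8")
                text = text.replace("epub_type=", "epub:type=")
                # Update references such as href="ch1.html" or src="ch1.html#note".
                text = re.sub(r"\.html(?=[\"'#])", ".xhtml", text)
                data = text.encode("utf-8")
            if name.endswith(".html"):
                name = name[: -len(".html")] + ".xhtml"
            dst.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
```

Something like `fix_epub("book.epub", "book-fixed.epub")` (file names made up for the example) would then produce a copy to run through the validator. The sketch relies on the input already having `mimetype` as its first entry, which is the case for any valid epub.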
To generate a PDF for actual print, I haven’t worked out all the details yet. However, I had good luck running the Markdown file through Pandoc to get a nice LaTeX file, then running pdflatex on it to get a very, very nice-looking PDF. The main problems were related to odd-even page margins and cover handling; I think most Print On Demand places ask you to provide the cover separately.
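The two-step PDF route can be sketched as a small Python wrapper around the two command-line tools. This is only a sketch under assumptions: `pdf_commands` and `build_pdf` are hypothetical names, and the `documentclass=book` and `classoption=twoside` variables are one guess at how to get distinct odd and even page margins, not settings confirmed to solve the margin problem mentioned above.

```python
import subprocess
from pathlib import Path


def pdf_commands(markdown_path):
    """Return the pandoc and pdflatex command lines for one Markdown file."""
    md = Path(markdown_path)
    tex_name = md.with_suffix(".tex").name
    pandoc = [
        "pandoc", md.name,
        "--standalone",                          # emit a complete LaTeX document
        "--variable", "documentclass=book",      # assumption: book layout
        "--variable", "classoption=twoside",     # assumption: odd/even margins
        "--output", tex_name,
    ]
    pdflatex = ["pdflatex", "-interaction=nonstopmode", tex_name]
    return [pandoc, pdflatex]


def build_pdf(markdown_path):
    """Run both commands in the directory that holds the Markdown file."""
    workdir = Path(markdown_path).resolve().parent
    for command in pdf_commands(markdown_path):
        subprocess.run(command, cwd=workdir, check=True)
```

Keeping the command construction separate from the execution makes it easy to print and inspect the commands before running anything. A table of contents or cross-references would typically need a second pdflatex pass.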
Unfortunately, because of Covid-19, Amazon is not shipping proof copies to authors in Canada anymore... Early in the process I did a single copy print on Lulu Xpress, but for actual sales I need to see what the POD book would look like from the place which would sell it. I’m still debating at this point. Possibly I’ll have to go with IngramSpark but I’m a bit wary of having to pay some money up front. Maybe they’ll have some sort of promo I can use. That would likely be just a stopgap until KDP ships author proofs to Canada again.
If you’ve seen my book, like the formatting, and are having some issues with the process I’ve outlined above, feel free to drop me an email.