tonybaldwin | blog

non compos mentis

DjVu: Free alternative to PDF

Screenshot: this article, as .djvu in djview4

First, a bit of ranting about open standards and free file formats:
Okay, you know I’m always harping about using Open Document Formats.
So, on the LibreOffice user list today there was a discussion of a viable Free/Open alternative to .pdf files. After all, PDF is a proprietary format, owned by Adobe, and it is ubiquitous, so there really should (must, perhaps) be a free, open alternative. Someone on the list mentioned DjVu, which, frankly, I’d never looked at before (I had heard of it, but knew not what it was). It’s a free/open file format that was initially created for scanned documents, from what I gather. It has been around since the late 1990s, is still maintained by the original authors, and is now used for all kinds of groovy stuff.
I did a bit of research: googling, apt-cache searching, and poking around. Eventually, I installed djview4 and djvulibre with aptitude and experimented a little. My conclusion is that, yes, DjVu would be an excellent candidate, and for many reasons a better option, for the purposes .pdf currently serves (essentially, a portable document format that preserves formatting). Works great.

But there IS a rather glaring drawback…
The one big drawback is that conversion tools are lacking.
One cannot, for instance, simply write a DjVu file from any kind of document editor, the way you can write a pdf from many different editors, web browsers, most office software, LaTeX editors, basic text editors such as tcltext, and, frankly, even from the command line.
To create DjVu, you can only convert other files to DjVu.
Then, in general, and this is what most irritates me, it seems you have to convert from non-free formats. There are no tools, for instance, to convert directly from plain text, LaTeX (.tex), ODF (.odt), .png, or even html files to a .djvu file. What’s worse, all of your Free and/or open source browsers, document editors, etc., will export or print a file to .pdf, but not to .djvu. OpenOffice.org will write a .pdf. LibreOffice and Abiword will write a .pdf. LaTeX editors will write a .pdf. Everybody will write a .pdf, but nobody has written code to write a file directly to .djvu. In my opinion, that needs changing. We need to use open standards and free/open file formats (all kinds of reasons for that discussed in this entry to this blog).

That said, today I wrote a script to convert a plain text file to DjVu (but, yes, I had to round-trip it through .pdf, darn it).
This script was written on a Debian/Stable (lenny at the time of this writing) system, on AMD64 arch, using all tools available in the lenny repos.
It requires (obvious when you read the script) enscript, ps2pdf (part of Ghostscript), and pdf2djvu (a separate package built on djvulibre).
The script first converts your text file to postscript with enscript, then from postscript to pdf with, surprise, ps2pdf, and then takes the final step of converting to .djvu with pdf2djvu.
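
Since the whole chain falls over if any one of those three tools is absent, a quick check up front can save some head scratching. This is only a rough sketch, not part of my script: `missing_tools` is a name I made up for illustration, and it relies on nothing more than the shell's own `command -v`.

```shell
# Sketch: report which of the named tools are not on the PATH.
# missing_tools is a hypothetical helper name, not part of the original script.
missing_tools() {
    missing=""
    for tool in "$@"; do
        # command -v fails quietly when the tool cannot be found
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # Drop the leading space left over from the concatenation
    missing="${missing# }"
    [ -z "$missing" ] && return 0
    # Print the missing names on one line and signal failure
    printf '%s\n' "$missing"
    return 1
}

# Usage (commented out so that sourcing this sketch has no side effects):
# missing_tools enscript ps2pdf pdf2djvu || echo "install the tools listed above first" >&2
```

It prints nothing and succeeds when everything is installed, so it can sit at the top of a script in front of the real work.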

The script looks like this:
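#!/bin/bash
# Convert one plain text file to DjVu by way of PostScript and PDF.
# Requires enscript, ps2pdf and pdf2djvu.

# Insist on exactly one file name.
if [[ $# -ne 1 ]]; then
    echo "try again, and include a file name, and ONLY 1 file name at a time. Thank you." >&2
    exit 1
fi
text="$1"

echo "converting $text to $text.ps"

# -q quiet, -B no page headers, -p output file
enscript -q -B -p "$text.ps" "$text" || exit 1

echo "converting $text.ps to $text.pdf"

ps2pdf "$text.ps" "$text.pdf" || exit 1

echo "converting $text.pdf to $text.djvu"

pdf2djvu -o "$text.djvu" "$text.pdf" || exit 1

echo "renaming ..."

# foo.txt.djvu becomes foo.djvu; any other name is left alone.
if [[ $text == *.txt ]]; then
    mv -- "$text.djvu" "${text%.txt}.djvu"
fi

echo "cleaning up ..."

rm -- "$text.ps" "$text.pdf"

echo done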
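
The script takes only one file name at a time, so for a pile of files a small wrapper loop can do the repeating. A rough sketch follows, with loud assumptions: `./txt2djvu` is a hypothetical name for wherever you saved the script above, and `CONVERTER` and `convert_each` are names I invented for this illustration.

```shell
# Sketch: run a one-file converter over several files in turn.
# CONVERTER is an assumed variable; ./txt2djvu is a hypothetical name
# for the saved conversion script.
CONVERTER="${CONVERTER:-./txt2djvu}"

convert_each() {
    failed=0
    for file in "$@"; do
        if [ ! -f "$file" ]; then
            echo "skipping $file: not a regular file" >&2
            failed=$((failed + 1))
            continue
        fi
        # One file name per call, just as the script insists
        "$CONVERTER" "$file" || failed=$((failed + 1))
    done
    # Non-zero status if any file was skipped or failed to convert
    [ "$failed" -eq 0 ]
}

# Usage (commented out so that sourcing this sketch runs nothing):
# convert_each notes.txt poem.txt
```

Keeping the converter in a variable means the loop can be tried out with something harmless, such as `CONVERTER=echo`, before pointing it at real files.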

I actually turned the script on itself, and created a DjVu file of this text, available here.
With this, I may very well add the capacity to export a .djvu file to tcltext. Why not? It’s just a shame, imho, that such an export is not direct, and has to cross into proprietary territory via .pdf in order to be accomplished.

Also, as a gift to my fellow freedom fighters, foss hackers, and open standards supporters, I have created a DjVu of my poetry here which contains all the poems published in my recent book (but not the paintings and photographs).

And, this full article in djvu format here. This last was fun, because I ended up having to change the text encoding first. Apparently enscript doesn’t like utf8. I had copy/pasted the article into tcltext, which generates utf8 here (system default). The first .djvu I made had all these weird character substitutions (like /200a#blahblah for a quotation mark?). Here’s how to handle the conversion:

iconv -f utf8 --to-code=ascii//TRANSLIT yourfile > newfile
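
To avoid running that by hand every time, the check and the conversion can be folded into one small function. This is a sketch under assumptions: `to_ascii` is a made-up helper name, the input is assumed to be UTF-8 whenever it is not plain ASCII, and the iconv call is the same conversion as above written with the short `-f`/`-t` options.

```shell
# Sketch: write an ASCII copy of a text file, transliterating only when needed.
# to_ascii is a hypothetical helper name; input is assumed to be UTF-8.
to_ascii() {
    src="$1"
    dst="$2"
    # Count the bytes that fall outside the 7-bit ASCII range
    extra="$(LC_ALL=C tr -d '\000-\177' < "$src" | wc -c)"
    if [ "$extra" -eq 0 ]; then
        # Already plain ASCII, so a straight copy is enough
        cp "$src" "$dst"
    else
        # Same conversion as in the article, with short options
        iconv -f UTF-8 -t ASCII//TRANSLIT "$src" > "$dst"
    fi
}

# Usage (commented out so that sourcing this sketch touches no files):
# to_ascii article.txt article-ascii.txt
```

What the transliteration turns each character into depends on the iconv implementation, so it is worth eyeballing the result before feeding it to enscript.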

Now, if you use firefox or some other mozilla derivative, there’s actually a plugin for viewing such files in your browser, included in the djvulibre packages. Otherwise, you’ll need a djvu viewer, such as djview or evince.

Anyway,
Enjoy.

./tony


This article in Spanish: http://www.gnewbook.org/pg/blog/tonybaldwin/read/83736/djvu-excelente-substituto-al-formato-nolibre-de-pdf-pero
This article in Portuguese: http://softwarelivre.org/tonybaldwin/blog/djvu-otimo-substituto-ao-formato-nao-livre-de-pdf-mas…

Written by tonybaldwin

January 27, 2011 at 2:06 pm
