tonybaldwin | blog

non compos mentis

Archive for the ‘literature’ Category

DjVu: Free alternative to PDF (and a script to convert plain text to DjVu)

with 14 comments

this article, as .djvu in djview4

First, a bit of ranting about open standards and free file formats:
Okay, you know I’m always harping about using Open Document Formats.
So, on the LibreOffice user list today there was a discussion of a viable Free/Open alternative to .pdf files. After all, PDF is, indeed, a proprietary format, owned by Adobe, and it is ubiquitous, and there really should (must, perhaps) be a free, open alternative. Someone on the list mentioned DjVu, which, frankly, I’d never looked at before (I had heard of it, but knew not what it was). It’s a free/open file format that was initially created for scanned documents, from what I gather, and has been around since the late 1990s, still maintained by the original authors, and is now used for all kinds of gro0vy stuff.
I did a bit of research, googling, apt-cache searching, and poking around. Eventually, I installed djview4 and djvulibre with aptitude and experimented a little. I have concluded that, yes, in my opinion, DjVu would be an excellent candidate, in fact a better option for many reasons, for the purposes .pdf currently serves (a portable document format that preserves formatting, essentially). Works great.

But there IS a rather glaring drawback…
The one big drawback is, conversion tools are lacking.
One cannot, for instance, simply write a DjVu file in any kind of document editor, the way you can write a pdf with many different editors, web browsers, most office software, LaTeX editors, and basic text editors, such as tcltext, and, frankly, even from the command line.
But to create DjVu, you can only convert other files to DjVu.
Then, in general, and this is what most irritates me, it seems you have to convert from non-free formats. I found no tools, for instance, to convert directly from plain text, LaTeX (.tex), .odf (.odt), .png, or even html files to a .djvu file. What’s worse is that all of your Free and/or open source browsers, document editors, etc., will export or print a file to .pdf, but not to .djvu. OpenOffice.org will write a .pdf. LibreOffice and Abiword will write a .pdf. LaTeX editors will write a .pdf… Everybody will write a .pdf, but nobody has written code to write a file directly to .djvu. In my opinion, that needs changing. We need to use open standards and free/open file formats (all kinds of reasons for that discussed in this entry to this blog).

That said, today I wrote a script to convert a plain text file to DjVu (but, yes, I had to round-trip it through .pdf, darn it).
This script was written on a Debian/Stable (lenny at the time of this writing) system, on AMD64 arch, using all tools available in the lenny repos.
It requires (obvious when you read the script) enscript, ps2pdf (part of Ghostscript), and pdf2djvu (packaged separately, but built on djvulibre), plus file and iconv for the encoding check and djview4 to view the result.
The script first converts your text file to postscript with enscript, then from postscript to pdf, with, surprise, ps2pdf, and, then, the final step of converting to .djvu.
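
Since the script leans on several external tools, a quick pre-flight check can save a confusing failure halfway through. This is only a minimal sketch, not part of the original script: check_deps is a hypothetical helper name, and the tool list in the usage line is simply the set of commands the script calls.

```shell
#!/bin/bash
# check_deps is a hypothetical helper (an assumption, not from the original script).
# It reports every named command that cannot be found on the PATH
# and returns non-zero if at least one is missing.
check_deps() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage at the top of the conversion script:
#   check_deps file iconv enscript ps2pdf pdf2djvu djview4 || exit 1
```

Reporting all missing tools at once, rather than stopping at the first, means one run tells you everything you need to install.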

The script looks like this:
#!/bin/bash

# Converting a text file to a DjVu file
# copyright © tony baldwin / tony@baldwinsoftware.com
# released according to the terms of the GNU General Public License, v. 3 or later

# first, make sure you named a file. duh.

if [[ -n "$1" ]]; then
text="$1"
else
echo "try again, and include the file name..hello!" >&2
exit 1
fi

# okay, enscript likes ASCII best, so let's test our file encoding
# if we have anything other than ASCII, we will convert with iconv
# (note: this overwrites the original file with the ASCII version)

enc="$(file --brief --mime-encoding "$text")"
echo "This file is encoded as $enc"

if [ "$enc" != us-ascii ] ; then
echo "We need to convert to ascii first."
echo "Converting text encoding now ..."
# only replace the original if iconv succeeded
if iconv -f "$enc" --to-code=ascii//TRANSLIT "$text" > tempy ; then
mv tempy "$text"
else
echo "iconv could not convert $text, giving up." >&2
rm -f tempy
exit 1
fi
newenc="$(file --brief --mime-encoding "$text")"
echo "Ok, now we have $newenc encoding and can proceed with conversion to djvu ..."
fi

# from here, things are fairly self-explanatory

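echo "converting $text to $text.ps"

enscript "$text" -q -B -p "$text.ps"

echo "converting $text.ps to $text.pdf"

# name the output explicitly; otherwise ps2pdf writes into the current directory
ps2pdf "$text.ps" "$text.pdf"

echo "converting $text.pdf to $text.djvu"

pdf2djvu "$text.pdf" -o "$text.djvu"

echo "renaming ..."

# foo.txt.djvu becomes foo.djvu (plain mv instead of the Debian-specific rename.ul)
mv "$text.djvu" "${text%.*}.djvu"

echo "cleaning up ..."

rm "$text.ps" "$text.pdf"

echo "all done"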

# here, we are using the variable $text, which is $filename.txt, and changing it to $filename
# so we can append .djvu and open the resulting file in djview4

ntx="${text%.*}"

djview4 "$ntx.djvu" &

exit

# This program was written by anthony baldwin - tony@baldwinsoftware.com
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

What is it doing? After making sure you named a file, the first thing the script does is check the encoding of the file in question. Enscript seems to play nice with ASCII, but not utf8 or some other encodings, so we’re converting to ASCII before doing anything else. Then, the script converts your text file to postscript with enscript, then from postscript to pdf, with, surprise, ps2pdf, and, then, the final step of converting to .djvu. At the end, the script cleans up the directory, removing the .ps and .pdf files. Then, it opens your file in Djview4. I have commented the script accordingly.
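
One variation worth considering: the same chain can be run in a scratch directory, so the input file is never modified and no intermediate files are left behind if a step fails. What follows is a rough sketch under that assumption, not the script described above. The names djvu_name and txt2djvu_safe are hypothetical, and the output lands in the current directory.

```shell
#!/bin/bash
# Hypothetical helpers (assumptions, not part of the original script).

# Derive the output name: drop the directory, strip the last
# extension (if there is one), and append .djvu
djvu_name() {
    local base="${1##*/}"
    case "$base" in
        ?*.*) base="${base%.*}" ;;   # leaves dotfiles and extensionless names alone
    esac
    printf '%s.djvu\n' "$base"
}

# text -> ASCII -> PostScript -> PDF -> DjVu, all inside a temp directory
txt2djvu_safe() {
    local src="$1" out tmp enc status
    [ -r "$src" ] || { echo "cannot read: $src" >&2; return 1; }
    out="$(djvu_name "$src")"
    tmp="$(mktemp -d)" || return 1

    enc="$(file --brief --mime-encoding "$src")"
    if [ "$enc" = us-ascii ]; then
        cp -- "$src" "$tmp/in.txt"
    elif ! iconv -f "$enc" -t ascii//TRANSLIT "$src" > "$tmp/in.txt"; then
        rm -rf -- "$tmp"
        return 1
    fi

    # each step runs only if the previous one succeeded
    enscript "$tmp/in.txt" -q -B -p "$tmp/in.ps" &&
    ps2pdf "$tmp/in.ps" "$tmp/in.pdf" &&
    pdf2djvu "$tmp/in.pdf" -o "$out"
    status=$?

    rm -rf -- "$tmp"
    return "$status"
}
```

Called as txt2djvu_safe notes.txt, this would write notes.djvu next to wherever you ran it and leave notes.txt exactly as it was.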

I actually turned the script on itself, and created a DjVu file of this text, available here.
With this, I may very well add the capacity to export a .djvu file to tcltext. Why not? It’s just a shame, imho, that such an export is not direct, and that one has to cross into proprietary territory via .pdf to accomplish it.

Also, as a gift to my fellow freedom fighters, foss hackers, and open standards supporters, I have created a DjVu of my poetry here, which contains all the poems published in my recent book (but not the paintings and photographs).

And, this full article in djvu format here. This last was fun, because I ended up having to change the text encoding first. Apparently enscript doesn’t like utf8. I had copy/pasted the article into tcltext, which generates utf8 here (system default). I made a .djvu that had all these weird character substitutions (like /200a#blahblah for a quotation mark?). This is why I updated the script with the text encoding conversion step before enscript.

Now, if you use firefox or some other mozilla derivative, there’s actually a plugin for viewing such files in your browser, included in the djvulibre packages. Otherwise, you’ll need a djvu viewer, such as djview or evince.

Anyway,
Enjoy.

./tony


This article in Spanish: http://www.gnewbook.org/pg/blog/tonybaldwin/read/83736/djvu-excelente-substituto-al-formato-nolibre-de-pdf-pero
This article in Portuguese: http://softwarelivre.org/tonybaldwin/blog/djvu-otimo-substituto-ao-formato-nao-livre-de-pdf-mas…

Written by tonybaldwin

January 19, 2011 at 3:43 pm

The Firewater by Anthony Baldwin

Written by tonybaldwin

October 7, 2010 at 5:34 am

Posted in literature, poetry

art of tony baldwin

with one comment

I have published a book of my poetry, art, and photos.

Find it here:
art of tony baldwin
art of tony baldwin 2010

It can be read online, or purchased either as a pdf download or as a bound print copy.

Written by tonybaldwin

August 22, 2010 at 11:44 am

Voice Recordings: audio literature

These recordings were made in my office, not a professional studio, using a cheap microphone plugged into my soundcard.

All software used to make these recordings was Free Open Source Software.
I used Audacity to record and export files to .wav and .ogg, and ffmpeg to create mp3 files.
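
For anyone wanting to repeat the mp3 step, something like the following would do it. This is a sketch, not the exact command I ran: mp3_name and wav_to_mp3 are hypothetical helper names, and it assumes an ffmpeg build that includes the libmp3lame encoder.

```shell
#!/bin/bash
# Hypothetical helpers (assumptions, not the commands actually used for these recordings).

# poem.wav -> poem.mp3
mp3_name() {
    printf '%s.mp3\n' "${1%.wav}"
}

# Encode each .wav given on the command line to an mp3 beside it.
wav_to_mp3() {
    local wav
    for wav in "$@"; do
        # -qscale:a 2 asks the LAME encoder for high-quality variable bitrate
        ffmpeg -i "$wav" -codec:a libmp3lame -qscale:a 2 "$(mp3_name "$wav")" || return 1
    done
}

# Possible usage:
#   wav_to_mp3 *.wav
```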

The computer used runs Debian GNU/Linux (stable/lenny).

To hear music I’ve recorded in similar fashion (I play guitar and sing): click here.


English Literature:

Chaucer

  • Prologue to the Canterbury Tales (in Middle English): mp3 / ogg

Edgar Allan Poe

  • The Tell-Tale Heart: mp3 / ogg

Spanish Literature / Literatura en Español

Pablo Neruda

  • Cuerpo de Mujer: mp3 / ogg

French Literature / Littérature en Français

Charles Baudelaire

  • L’Homme et la Mer: mp3 / ogg

Brazilian and Portuguese Literature / Literatura Brasileira e Portuguesa

Brasileira:

  • Soneto de Fidelidade by Vinícius de Moraes: mp3 / ogg

    Written by tonybaldwin

    May 12, 2010 at 12:09 pm

    Old English text found in New Haven basement

    with one comment

    I found these papers in my basement.


    page 2 / page 3.

    PDF version, all 3 pages

    Now, we weren’t hit so bad with all of the flooding going on in the state, but we did have a slight issue in the basement. It wasn’t so bad, since, about a month ago, with snow melt and rain, we’d had a problem and the landlord sent over the maintenance guy to clean out the drain and repair the pump, etc., but it seems the foundation has cracked, and there was water coming in.
    I hadn’t been down to the basement, since we hadn’t had any issues (last time, with the flooding, the pilot light went out in the water heater), until I went down this morning to bring up some bicycling gear, what with the weather warming up.
    When I got down there, I saw a couple of cinder blocks out of place back in one corner of the basement. There was mud and water seeping in, slowly, but, more important, there was a space large enough to push my hands back in there, and even see back into the gap. I saw a cylinder, which looked like one of those tube thingies you see drafting students carrying around, for transporting rolled up drafting drawings, or whatever. I pulled it out. It appeared to be made of wood (mostly rotted) covered in some form of leather (also rotted, but not entirely).
    I opened it up, and, well, within I found these pages.

    Now, I studied Beowulf in college, being an English major, and even read portions thereof in Old English. I even still recall a very few lines (hwear eart þu nu, ge-fera? = where are you now, friend?).
    I’m pretty sure the text on these pages is Old English.

    Yeah. I find that quite odd, too. Nonetheless, I recall reading speculation that the stone dwellings in the Gungywamp Forest area of Groton may have been built by Europeans (Vikings?) long before the arrival of our Puritan forefathers, since they are decidedly not like any type of dwelling constructed by the local, native Algonquian tribes.
    I don’t know. It’s all kind of weird, to me…
    But, there you have it folks.

    I have what appears to be some Old English manuscript in my hands.


    posted with Xpostulate

    Written by tonybaldwin

    April 1, 2010 at 7:02 am

    Sláinte chugat : Happy St. Patty's!

    Great stuff from Ireland:
    First a poem from Irishman, W.B. Yeats:
    The Stolen Child

    WHERE dips the rocky highland
    Of Sleuth Wood in the lake,
    There lies a leafy island
    Where flapping herons wake
    The drowsy water rats;
    There we’ve hid our faery vats,
    Full of berries
    And of reddest stolen cherries.
    Come away, O human child!
    To the waters and the wild
    With a faery, hand in hand,
    For the world’s more full of weeping than you can understand.

    Where the wave of moonlight glosses
    The dim gray sands with light,
    Far off by furthest Rosses
    We foot it all the night,
    Weaving olden dances
    Mingling hands and mingling glances
    Till the moon has taken flight;
    To and fro we leap
    And chase the frothy bubbles,
    While the world is full of troubles
    And anxious in its sleep.

    Come away, O human child!
    To the waters and the wild
    With a faery, hand in hand,
    For the world’s more full of weeping than you can understand.

    Where the wandering water gushes
    From the hills above Glen-Car,
    In pools among the rushes
    That scarce could bathe a star,
    We seek for slumbering trout
    And whispering in their ears
    Give them unquiet dreams;
    Leaning softly out
    From ferns that drop their tears
    Over the young streams.
    Come away, O human child!
    To the waters and the wild
    With a faery, hand in hand,
    For the world’s more full of weeping than you can understand.

    Away with us he’s going,
    The solemn-eyed:
    He’ll hear no more the lowing
    Of the calves on the warm hillside
    Or the kettle on the hob
    Sing peace into his breast,
    Or see the brown mice bob
    Round and round the oatmeal chest.
    For he comes, the human child,
    To the waters and the wild
    With a faery, hand in hand,
    For the world’s more full of weeping than he can understand.

    The Collected Poems of William Butler Yeats, one of my favorite poets of all time…


    Now some great music:


    Waterboys – Fisherman’s Blues

    Great music! I saw the Waterboys in some tiny club in Chicago, on tour for this album back in about 1989, I believe, with my good friend, Harry, while I was out in Indiana, attending Purdue University. The show was phenomenal. This is truly amazing music! This particular album includes the musical rendering of Yeats’s Stolen Child, which I posted earlier from youtube.


    And now a couple of great books:

    How the Irish Saved Civilization: The Untold Story of Ireland’s Heroic Role from the Fall of Rome to the Rise of Medieval Europe

    Beginner’s Gaelic (Hippocrene Beginners Language Series)

    Saol fada chugat

    ./tony

    Written by tonybaldwin

    March 17, 2010 at 10:34 am

    Top o' the morning to all ye lads & lassies!

    Written by tonybaldwin

    March 17, 2010 at 5:43 am