tonybaldwin | blog

non compos mentis

Archive for the ‘gnu/linux’ Category

Preview of Xpostulate Improvements


A preview of what’s to come…

Thinking of UI enhancements, I added the little Xpostulate icon thingy right into the GUI.

What do you think?

Other items on their way:

  • Posterous support. I have interacted with the posterous api via bash with curl (a rough sketch of that call follows this list), so I just need to translate that scripting to tcl with http. Cake, but it requires time. I thought I would have that done this past week, but no joy…too much work (somebody’s gotta pay the rent around here).
  • Blogger support. The great and benevolent Google® has granted me an API key, and I have looked at the API but not yet played with it. This is likely to come this season…soon, me dro0gies.
  • Read your statusnet public timeline or updates from a specific person. This I have, again, done in bash, so it’s just a matter of coding it into tcl. I do question whether this is appropriate for Xpostulate, though, or whether it would be better left to iDenTickles, since iDenTickles is a microblogging client and Xpostulate is intended for crossposting to blogs, not reading others’ updates.
  • Download, edit, & republish older entries. This is on my TODO list, but for each blogging service I have to look at how their API handles it, then code the stuff in, develop new GUI elements for housing the various functions, and blah, blah, blah. It will be work…heavy lifting…but it’s on the list.
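Since I mentioned driving the posterous api from bash with curl, here is a rough sketch of the kind of call I mean. Fair warning: I am writing the endpoint and field names from memory, so treat them as assumptions and check the API docs before leaning on this.

#!/bin/bash
# rough sketch: post to posterous with curl and basic auth.
# endpoint and field names are from memory -- verify against the API docs!
read -p "Title: " title
read -p "Body: " body
curl -s -u "you@example.com:yourpassword" \
	-d "title=$title" \
	-d "body=$body" \
	http://posterous.com/api/newpost

The tcl translation is then mostly a matter of swapping curl for the http package.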

posted with Xpostulate


Written by tonybaldwin

September 24, 2011 at 10:16 pm

Web Word Count – count the words on a website with bash, lynx, curl, wget, sed, and wc


Web Word Count: Get the word count for a list of webpages on a website.

A colleague asked what the easiest way was to get the word count for a list of pages on a website (for estimation purposes for a translation project).

This is what I came up with:

#!/bin/bash

# get word counts and generate estimated price for localization of a website
# by tony baldwin / baldwinsoftware.com
# with help from the linuxfortranslators group on yahoo!
# released according to the terms of the GNU General Public License, v. 3 or later

# collecting necessary data:
read -p "Please enter the per word rate (only numbers, like 0.12): " rate
read -p "Enter currency (letters only, EU, USD, etc.): " cur
read -p "Enter domain (do not include http://www, just, for example, somedomain.com): " url

# if we've run this script in this dir, old files will mess us up
for i in pagelist.txt wordcount.txt plist-wcount.txt; do
	if [[ -f $i ]]; then
		echo removing old $i
		rm $i
	fi
done

echo "getting pages ...  this could take a bit ... "

wget -m -q -E -R jpg,tar,gz,png,gif,mpg,mp3,iso,wav,ogg,ogv,css,zip,djvu,js,rar,mov,3gp,tiff,mng $url
find . -type f | grep html > pagelist.txt

echo "okay, counting words...yeah...we're counting words..."

while read -r file; do
	lynx -dump -nolist "$file" | wc -w >> wordcount.txt
done < pagelist.txt
paste pagelist.txt wordcount.txt > plist-wcount.txt

echo "adding up totals...almost there..."
total=0
while read -r t; do
	total=$((total + t))
done < wordcount.txt

echo "calculating price ... "
price=$(echo "$total * $rate" | bc)

echo -e "\n-------------------------------\nTOTAL WORD COUNT = $total" >> plist-wcount.txt
echo -e "at $rate, the estimated price is $cur $price
------------------------------" >> plist-wcount.txt

echo "Okay, that should just about do it!"
echo  -------------------------------
sed 's/\.\///g' plist-wcount.txt > $url.estimate.txt
rm plist-wcount.txt
cat $url.estimate.txt
echo This information is saved in $url.estimate.txt
exit

So, then I ran the script on my site, tonybaldwin.net, with a rate of US$0.12/word, and this is the final output:

-------------------------------
tonybaldwin.net/log/archives/environment/index.html 38
tonybaldwin.net/log/archives/cuisine/index.html 38
tonybaldwin.net/log/archives/music/index.html 52
tonybaldwin.net/log/archives/philosophy/index.html 38
tonybaldwin.net/log/archives/nanoblogger-help/index.html 52
tonybaldwin.net/log/archives/2011/09/11/911/index.html 322
tonybaldwin.net/log/archives/2011/09/index.html 774
tonybaldwin.net/log/archives/2011/09/01/mit_intro_to_cs_and_programming_assignment_1/index.html 494
tonybaldwin.net/log/archives/2011/08/26/come_on_irene/index.html 382
tonybaldwin.net/log/archives/2011/08/26/welcome_to_nanoblogger_3_4_2/index.html 289
tonybaldwin.net/log/archives/2011/08/26/here_we_roll_again/index.html 618
tonybaldwin.net/log/archives/2011/08/27/couldnt_stand_the_weather/index.html 93
tonybaldwin.net/log/archives/2011/08/index.html 1205
tonybaldwin.net/log/archives/2011/index.html 133
tonybaldwin.net/log/archives/technology/index.html 56
tonybaldwin.net/log/archives/politic/index.html 38
tonybaldwin.net/log/archives/religion/index.html 38
tonybaldwin.net/log/archives/art/index.html 38
tonybaldwin.net/log/archives/index.html 85
tonybaldwin.net/log/archives/personal/index.html 65
tonybaldwin.net/log/archives/health/index.html 38
tonybaldwin.net/log/articles/about/index.html 671
tonybaldwin.net/log/index.html 2027
tonybaldwin.net/log.1.html 2027
tonybaldwin.net/index.html 96
tonybaldwin.net/social.html 82

-----------------------------------------------
TOTAL WORD COUNT = 9789
at 0.12, the estimated price is USD 1174.68
-----------------------------------------------

Now, this is simple, of course, for a simple website like tonybaldwin.net, which is largely static html pages. Sites with dynamic content are going to be an entirely different story.

The comments explain what’s going on here, but I explain in greater detail here on the baldwinsoftware wiki.

Now, if you just want the wordcount for one page, try this:

#!/bin/bash

# get the word count for a single webpage

if [[ ! $* ]]; then
	read -p "Please enter a webpage url: " url
else
	url=$*
fi

read -p "How much do you charge per word? " rate
count=$(lynx -dump -nolist "$url" | wc -w)
price=$(echo "$count * $rate" | bc)
echo -e "$url has $count words. At $rate, the price would be US\$$price."
exit

Special thanks go out to the Linux 4 Translators list for some assistance with this script.

Enjoy!

./tony

Written by tonybaldwin

September 20, 2011 at 10:31 pm

New Xpostulate release in the works


Okay, I just pushed new code for Xpostulate to github with the following changes:

  • removed iziblog, scribbld, inksome (spam SEO havens anyway)
  • removed twitter until I can get oauth working
  • added support for custom wordpress installations
  • added support for posting to friendika with bbcode insertions
  • changed identi.ca feature to support any status.net installation.
  • also, various pertinent alterations to gui, of course

all in ONE DAY! because I F–KING ROCK!

I have NOT updated the win/lin installers on the main Xpostulate page yet.
I have to play with installjammer and get those worked up again, and I’ll probably give this new code a day or two to be tested,
since, it seems, I now have a contributor on the project who is willing to test and prod it.

WELCOME ABOARD, Charles Roth!

Still to do:

  • I really, really want a button to click to automagically translate bbcode to html or vice-versa. That I can do, but I need time (a rough sed sketch of the idea follows this list).
  • Get oauth working for twitter…maybe
  • add support for blogger
  • change the LJ, IJ, DJ, DW support to be simple moveabletype-style entries with multiple options, rather than hardwired for 4 different sites. So, say, if you only use LJ and DW, you don’t have DJ and IJ cluttering your interface; or, if you have multiple LJ accts (I do, one for my art, the other for hackery), you can do that, etc.
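About that bbcode-to-html button: here is a quick sed sketch of the idea, covering only a handful of common tags. Xpostulate itself is tcl, so the real button would use regsub, but the substitutions are the same; the script name bb2html is just my placeholder.

#!/bin/bash
# quick sketch: translate a few common bbcode tags to html
sed -e 's|\[b\]|<b>|g' -e 's|\[/b\]|</b>|g' \
	-e 's|\[i\]|<i>|g' -e 's|\[/i\]|</i>|g' \
	-e 's|\[u\]|<u>|g' -e 's|\[/u\]|</u>|g' \
	-e 's|\[url=\([^]]*\)\]|<a href="\1">|g' \
	-e 's|\[/url\]|</a>|g' "$1"

Usage would be bb2html entry.txt > entry.html; going the other direction is the same trick with the patterns flipped.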

Now, I really must get back to translating these Brazilian pharma regulations.

Written by tonybaldwin

September 18, 2011 at 1:47 pm

Image UP



a quick-n-dirty script to copy an image (or other file) to your server. (wiki page for this script)

I basically use this to upload screenshots for display here on this wiki and my blog, etc., so I have the images directory “hardwired” in the script, but it could easily be customized to choose a different directory and work with any manner of files.

#!/bin/bash

# script to upload images to my server
# by tony baldwin

if [ ! $1 ]; then
	# If you didn't tell it which file, it asks here
	read -p "Which image, pal? " img
else
	img=$1
fi

# using scp to copy the file to the server
scp "$img" username@server_url_or_IP_address:/path/to/directory/html/images/
# you will be asked for your password, of course.  This is a function of scp, so not written into the script.

echo "Your image is now at http://www.yoursite.com/images/$img."
read -p "Would you like to open it in your a browser now? (y/n): " op

if [ "$op" = "y" ]; then
	# you can replace xdg-open with your favorite browser; as is, this opens your default browser.
	xdg-open "http://www.yoursite.com/images/$img"
        # if you chose yes, the browser will open the image.
        # Otherwise, it won't, but you have the url, so you can copy/paste to a browser or html document, blog entry, tweet, etc., at will.
fi

exit

This image was uploaded with the above script:

(editing website with Tcltext)

This script, of course, assumes you are in the same directory as your image file, too.

Enjoy!

./tony


EDIT: What would be cool is if I could make my file manager allow this as a right-click action. Like, I use PCManFM. If I could just right-click an image and choose this, then pop up the url with zenity, or perhaps even just automatically run the xdg-open…Hmmmm…One could probably work this out with some file managers more easily than others.

With some work, I could rewrite the script so that it takes the clicked image and auto-opens it in the browser, and then just hook it up with “right-click > open with …”, perhaps…
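There is, in fact, a fairly standard way into the “open with” menu on freedesktop-style file managers (PCManFM included): drop a .desktop file pointing at the script into ~/.local/share/applications. A sketch, where the paths and the script name imageup are my assumptions:

#!/bin/bash
# sketch: register the upload script for "open with..." menus.
# adjust the Exec path to wherever the script actually lives.
cat > ~/.local/share/applications/imageup.desktop << 'EOF'
[Desktop Entry]
Type=Application
Name=Image Up
Exec=/home/you/bin/imageup %f
MimeType=image/png;image/jpeg;image/gif;
EOF
update-desktop-database ~/.local/share/applications

The %f hands the clicked file to the script as $1, which is exactly the branch the script already has.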

Of course, I can just F4 (open dir in terminal), then bang off the script.

Written by tonybaldwin

September 4, 2011 at 1:31 pm

search wiktionary from the bash cli


Last week I posted several scripts for searching google, google translate, google dictionary, reverso, and wikipedia from the bash command line.

Today I wrote another script, this time for searching wiktionary.org, the multilingual, user-edited online dictionary:

#!/bin/bash

# get definitions from wiktionary

if [ ! $1 ]; then
	read -p "Enter 2 letter language code: " lang
	read -p "Enter search term: " sterm
	lynx -dump "$lang.wiktionary.org/wiki/$sterm" | less
else
	lynx -dump "$1.wiktionary.org/wiki/$2" | less
fi

I tucked this into my PATH as simply “wikt”, and usage is thus:
you@machine:~$ wikt en cows
or, if you neglect to use the language code and search term, of course, the script asks for them:
you@machine:~$ wikt
Enter 2 letter language code: en
Enter search term: cows
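In case “tucked into my PATH” is new to you: a script becomes a command once it is executable and sits in a directory your shell searches. Something like:

chmod +x ~/bin/wikt
# make sure ~/bin is on your PATH (many distros set this up by default):
echo 'export PATH="$HOME/bin:$PATH"' >> ~/.bashrc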

Enjoy!

./tony

Written by tonybaldwin

May 16, 2011 at 7:11 am

search google, wikipedia, reverso from the bash terminal


(screenshot: searching in bash)

Okay, so, I like to use my bash terminal. Call me a geek all you like; it matters not to me. I wear that badge with pride.

The bash terminal is quick and efficient for doing a lot of stuff that one might otherwise use some bloated, cpu-sucking, eye-candied, gui monstrosity to do. So, when I find ways to use it for more stuff, more stuff I do with it.

Now, for my work (recall, I am professionally a translator) I must often do research, some of which entails heavy lifting, but much of which means simply searching for word definitions and translations. I frequently use TclDict, which I wrote, but I also use a lot of online resources that I never programmed TclDict to access, and would generally use a browser for that stuff. Unless, of course, I can do it in my terminal!

For precisely such purposes, here are a couple of handy scripts I use while working.

First, let’s look up terms at Dict.org:

#!/bin/bash
# search dict.org; default db=all

if [[ $* ]]; then
	searchterm="$*"
else
	read -p "Enter your search term: " searchterm
fi

read -p "choose database (enter 'list' to list all options, leave blank for first match): " db

if [[ $db = list ]]; then
	curl dict://dict.org/show:db
	read -p "choose database, again: " db
fi

curl "dict://dict.org/d:$searchterm:$db" | less
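Incidentally, curl speaks the DICT protocol natively, so you can poke dict.org directly to see what the script is doing under the hood:

curl dict://dict.org/show:db    # list the available databases
curl dict://dict.org/d:cow      # define "cow", first matching database
curl dict://dict.org/d:cow:wn   # define "cow" in WordNet specifically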

Now, let’s search google from the command line:

#!/bin/bash
if [[ $* ]]; then
	searchterm="$*"
else
	read -p "Enter your search term: " searchterm
fi
# replace spaces with +, so multi-word searches survive the url
lynx -accept_all_cookies "http://www.google.com/search?q=${searchterm// /+}"
# I accept all cookies to go direct to search results without having to approve each cookie.
# you can disable that, of course.

I saved that in ~/bin/goose # for GOOgle SEarch
and just do
goose $searchterm
Or, search the google dictionary to translate a term:

#!/bin/bash
echo -e "Search google dictionary.\n"
read -p "Source language (two letters): " slang
read -p "Target language (two letters): " tlang
read -p "Search term: " sterm
lynx -dump "http://www.google.com/dictionary?langpair=$slang|$tlang&q=$sterm" | less

Note: For a monolingual search, just use the same language for source and target. Don’t leave either blank.

Or:

#!/bin/bash
if [ ! $3 ]; then
	echo -e "usage requires 3 parameters: source language, target language, search term. \n
Thus, I have this as ~/bin/googdict, and do \n
googdict en es cows \n
to translate \"cows\" to Spanish. \n
For monolingual search, enter the language twice. \n
As indicated, use the two letter code: \n
\"en\" for English, \"fr\" for French, etc."
	exit
fi
lynx -dump "http://www.google.com/dictionary?langpair=$1|$2&q=$3" | less

For the above, I have it in ~/bin/gd (the help text above says googdict; name it whatever you like), usage being simply “gd $sourcelanguage $targetlanguage $searchterm”.
Example:
me@machine:~$ gd en es cow
Searches the English-to-Spanish dictionary for “cow”.

We can use similar principles to search reverso:

#!/bin/bash
# search reverso

if [ ! $1 ]; then
	read -p "Enter the source language: " slang
	read -p "Enter target language: " tlang
	read -p "Enter your search term: " searchterm
	lynx -dump "dictionary.reverso.net/$slang-$tlang/$searchterm" | less
else
	lynx -dump "dictionary.reverso.net/$1-$2/$3" | less
fi

With the google dictionary, you use the two-letter language code (i.e., “en” for English, “fr” for French, etc.). With reverso, you have to spell out the language (“english” for English, etc.).
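Side by side, that looks like this (assuming you saved the reverso script as reverso; that name is mine, pick your own):

gd en es cow                  # google dictionary: two-letter codes
reverso english spanish cow   # reverso: spelled-out language names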

With all of the above, I’ve used the program less to display the results, rather than spitting it all out to the terminal at once. Click here to learn how to use less, if needed.

Additionally, most of the above require Lynx Browser, which is generally available for any gnu/linux distribution via your favorite package manager (apt, synaptic, aptitude, yum, portage, pacman, etc.). For the dict.org script, I used cURL (also part of most gnu/linux distributions and installable with your favorite package manager).
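On a debian-based system, for instance, that boils down to one line (package names per debian; other distros may differ slightly):

sudo aptitude install lynx curl wikipedia2text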

Google Translate can also be accessed, but for this, we’ll use a bit of python magic (I know, I pick on google translate, a lot, but it can be useful):

#!/usr/bin/env python
# python 2 (uses urllib2)
from urllib2 import urlopen
from urllib import urlencode
import sys

# The google translate API can be found here:
# http://code.google.com/apis/ajaxlanguage/documentation/#Examples

lang1=sys.argv[1]
lang2=sys.argv[2]
langpair='%s|%s'%(lang1,lang2)
# everything after the two language codes is the text to translate
text=' '.join(sys.argv[3:])
base_url='http://ajax.googleapis.com/ajax/services/language/translate?'
params=urlencode( (('v',1.0),
('q',text),
('langpair',langpair),) )
url=base_url+params
content=urlopen(url).read()
# pull the translated text out of the json response by hand:
# grab everything after '"translatedText":"' (18 chars) up to the closing quote
start_idx=content.find('"translatedText":"')+18
translation=content[start_idx:]
end_idx=translation.find('"}, "')
translation=translation[:end_idx]
print translation

Originally found that here, on the ubuntuforums.
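Usage follows the same pattern as the other scripts; I would save it as, say, ~/bin/gtrans (that name is mine) and make it executable:

gtrans en fr the cows have come home
# prints google's french rendering of everything after the two language codes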

And now for Wikipedia we have a couple of options.
First, we have this awesome little handy script, tucked into my $PATH as “define”:

#!/bin/bash
dig +short txt $1.wp.dg.cx
exit

I use it simply with “define $searchterm”, and it gives a short definition from wikipedia.  I originally found it here.

Another extremely handy tool is Wikipedia2Text, which I simply installed from the debian repos via aptitude. When I use this, I also pipe it to less:

#!/bin/bash
if [[ $* ]]; then
	searchterm="$*"
else
	read -p "Enter your search term: " searchterm
fi

wikipedia2text "$searchterm" | less

I have that tucked into ~/bin/wikit, and thus simply do

wikit $searchterm

to get my results.

Enjoy!

All code here that I have written is free and released according to the GPL v. 3. Check the links for code I borrowed for licensing information (pretty sure it’s all GPL-ed, too).

./tony

Written by tonybaldwin

April 25, 2011 at 7:44 pm

Oggify – convert all your wma, wav and mp3 to ogg


A script to convert .mp3 files to .ogg

Requires mpg123 and oggenc, and uses perl rename; I could make a version with the old rename (now rename.ul in ubuntu and debian) if anyone wants it.
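For the curious, perl rename takes a regex, while the old util-linux rename.ul does plain string replacement (on the first occurrence per filename, as I recall), so the equivalents look like:

# perl rename, as used in the script below:
rename 's/ /_/g' *.mp3
# rename.ul equivalent, replacing the first space only:
rename.ul " " "_" *.mp3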

Why should we use ogg?

cd into the dir fullo tunes, and:

#!/bin/bash

# convert mp3 and wav to ogg
# tony baldwin  http://www.BaldwinSoftware.com
# cleaning up file names

echo cleaning file names...

rename 's/ /_/g' *
rename y/A-Z/a-z/ *
rename 's/_-_/-/g' *
rename 's/\,//g' *

# converting all mp3 files to wav,
#so there will be nothing but wavs

echo Converting mp3 to wav...

for i in *.mp3
do
	mpg123 -w "$i.wav" "$i"
done

# and, now, converting those wav files to ogg

echo Converting .wav to .ogg

for i in *.wav
do
	oggenc "$i"
done


# Clean up, clean up, everybody everywhere
# Clean up, clean up, everybody do your share...

# cleaning some file names
# removing ".mp3" from $filename.mp3.ogg
# for result of $filename.ogg

rename 's/\.mp3//g' *.ogg

# removing all those big, fat wav files.

rm -f *.wav
rm -f *.mp3

# cleaning up after ourselves...

echo -e "Your files are ready, friend.\nHappy listening!"

exit

# This program was written by tony baldwin - tony @ baldwinsoftware.com
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

wiki page for this script.

I have a version with a little gui-ness via zenity (i.e., with a graphical user interface), if anyone wants it.
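For a taste, the zenity version mostly just swaps the shell prompts for dialogs; a minimal sketch:

#!/bin/bash
# sketch: pick the music directory with a dialog instead of cd-ing there first
dir=$(zenity --file-selection --directory --title="Where are the tunes?") || exit 1
cd "$dir" || exit 1
# ...then the same rename/mpg123/oggenc steps as above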

./tony


UPDATE: 2011.02.24

I wrote a separate script for those darned .wma files. Requires FFMpeg:

#!/bin/bash

# tony baldwin http://www.BaldwinSoftware.com
# cleaning up file names

echo cleaning file names...

rename 's/ /_/g' *
rename y/A-Z/a-z/ *
rename 's/_-_/-/g' *
rename 's/\,//g' *

echo converting wma files to ogg...

for i in *.wma
do
	ffmpeg -i "$i" -acodec libvorbis -aq 100 "$i.ogg"
	if [ -f "$i.ogg" ]; then
		rename 's/\.wma//g' "$i.ogg"
		rm "$i"
	fi
done

ls *.ogg
echo 'all done'

Written by tonybaldwin

February 22, 2011 at 11:25 am