tonybaldwin | blog

non compos mentis

Archive for the ‘hacking’ Category

The Free Web

A FREE, uncensored, neutral internet is ESSENTIAL to Free Expression and Freedom of Information.

¡Viva la FREE WEB!

In recent times, we have seen governments restrict access to the internet. We’ve seen the huge, corporate-owned social networking and microblogging sites censor content their investors do not like, and even remove accounts belonging to protesting entities, such as Twitter’s recent removal of the account for OccupyWallStreet, and Facebook’s censorship of protest-related photos.

But, we need the internet to communicate, not just locally, but nation-wide, and world-wide, to express our views, to make our voices heard, and to share what it is we are doing, and how our oppressors react, EVERYWHERE…

To that end:

The decentralized, federated, FREE (as in freedom, as well as price), social networks on which I currently play are:

Diaspora*

Diaspora* is software that anyone with the know-how can install on a server. That person can then let people register for accounts on what is called their “pod”. There are many of these pods already established across the internet (list here: podupti.me), with many users. You register for a free account on a pod and can seamlessly connect with users on other pods, just as if you were making someone a friend on any other social networking site. No matter which pod you are on, you are all using Diaspora*. If you have the technical skills, you can even set up your own pod for your family and/or friends. They can in turn connect to family and friends on your pod, or on other pods, with ease.

Diaspora* has many of the features of other popular social networks, including groupings of friends (like G+ circles, but called “aspects”. Oh, and Diaspora* had this feature over a year before G+ was even launched!), and sharing of photos, links, videos, etc. Diaspora* will allow you to cross-post materials to twitter, facebook, and tumblr, and to connect to friends on Friendika, as well. The aspects give you great control over who can view your content, so you have complete control over your privacy. Also, YOU own all content that you post. Diaspora* has no advertisements, and nobody on Diaspora* is tracking you, either on the site or across the internet. Diaspora* will not censor your communications with others. Also, on Diaspora* you can use any name or pseudonym you like.

There are numerous Diaspora sites, but they are all connected, so contacts on any Diaspora site can be connected to folks on another Diaspora site.

Here is my Diaspora profile:

tonybaldwin@poddery.com

I recommend joining diaspora at poddery.com or diasp.org.

StatusNet

StatusNet is for microblogging (like twitter, and it can forward updates to twitter), built on free/open source software. StatusNet is uncensored, free, and you can roll your own. StatusNet has features that twitter lacks, including posting of longer “blog” entries, sharing of events, uploading photos and music files, creation of polls and questions, and cross-connections with folks on any other StatusNet site. Also, one can have StatusNet updates forwarded to Twitter, thus sharing with twitter contacts and StatusNet contacts simultaneously. One more great feature of StatusNet is groups. By posting updates with a certain tag, the messages are grouped, and one can choose to be a member of that group and follow conversations on that topic. For instance, on the statusnet installation at Free-Haven.org/status/, there is a group for Occupy New Haven, and any update with !occupynewhaven or !onh is posted to that group. So, statusnet is kind of like twitter on steroids: much more powerful, with many more features. It is also more configurable. Our statusnet installation, for instance, is set to accept updates of up to 200 characters, as opposed to twitter’s 140 (one can raise this as far as 500 characters).

There is a statusnet installation on free-haven.org at http://free-haven.org/status/ Check it out!
My profile is tonybaldwin@free-haven.org
From there, I am following friends from all around the world on http://identi.ca, http://parlementum.net, and a few other smaller, private StatusNet installations, who are also following me from those sites, and I have my updates forwarded to twitter, from where they are forwarded on to Google Buzz, Tumblr, and Facebook. If any of those proprietary networks cut me off or censored me, my friends all around the world on http://identi.ca and http://parlementum.net would still see my updates, as would, of course, anyone on our installation, or anyone on any other StatusNet installation who chose to follow me.

One can even export updates from any statusnet site, group, or individual to an rss feed, or, one can follow an rss feed. I have my free-haven updates embedded on my free-haven wiki profile here. Also, I have all public updates to our statusnet installation embedded on the front page of this wiki here.

Friendika

But, best of all, in my opinion, is Friendika.

Friendika is decentralized and federated, but also allows you to connect to contacts on twitter, identi.ca, diaspora, facebook, and other sites, from friendika. I recommend Friendika most highly of all (although a combination of statusnet for microblogging and friendika is a good idea). Friendika has photo galleries, an event calendar, friend groups, and all the other functions you already use on other social networks.

Learn more about friendika at http://project.friendika.com/

The creator, Mike Macgirvin, is a friend (he was part of the team that developed Netscape Browser for AOL!). I have developed software to interact with Friendika’s API, and may be developing some plugins.

My current friendika profile is http://frndk.de/profile/tony

Comparison of Social Networks

In Diaspora, StatusNet, and Friendika, unlike FB, G+, and other sites, you own your own data, and completely control your own privacy. The sites are not corporate owned, and, in fact, if you have access to a server and the know-how, you can install and run a site yourself (kind of like you can with wordpress, joomla, etc.), and still connect to all the other friendika and/or diaspora sites. In this way, a truly FREE, open, neutral internet is forming, uncensored and unfettered by corporate interests.

Here is an excellent breakdown of the differences and similarities in social networks.
You will see that Friendika is richer in features than any other.

./tony


Creative Commons License
The Free Web by tony baldwin is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
Based on a work at www.tonybaldwin.info.

xposted with: Xpostulate | original article

Web Word Count

Web Word Count: Get the word count for a list of webpages on a website.

A colleague asked what the easiest way was to get the word count for a list of pages on a website (for estimation purposes for a translation project).

This is what I came up with:

#!/bin/bash
# add up wordcounts for website

total=0 # initialize variable for total

# scan through a list of pages
# strip the html elements and count the words
# append the count to wordcount.txt

: > wordcount.txt   # start with an empty file so counts from an earlier run don't pile up
while read -r i; do
     curl -s "$i" | sed -ne '{s/<[^>]*>//g;s/^[ \t]*//;p}' | wc -w >> wordcount.txt
done < pagelist.txt

# this is for purely aesthetic purposes, 
# but we're merging the list of pages with the wordcount file:
paste pagelist.txt wordcount.txt > pagewordcount.list

# for each number in the wordcount.txt file, add it to the previous number (get a total)
for t in $(cat wordcount.txt); do 
	total=$((total + $t))
done

# append the total to the end of the merged pagelist+wordcount file:
echo "Total word count = $total" >> pagewordcount.list

# read back the file:
cat pagewordcount.list

# ciao
exit

I ssh-ed to my server and did
ls -1 *.html > pagelist.txt
which allowed me to feed the script this list.

baldwinlinguas.com/index.html
baldwinlinguas.com/esp.html
baldwinlinguas.com/fran.html
baldwinlinguas.com/port.html
baldwinlinguas.com/empregos.html
baldwinlinguas.com/transquote.html

So, then I ran the script on this list of the pages, and this is the output:

baldwinlinguas.com/index.html 535
baldwinlinguas.com/esp.html 342
baldwinlinguas.com/fran.html 295
baldwinlinguas.com/port.html 337
baldwinlinguas.com/empregos.html 662
baldwinlinguas.com/transquote.html 244
Total word count = 2415

So, it works. Someone with better bash fu could likely find a shorter path to this result.
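
For what it's worth, here is one possible shorter path, offered as a sketch and not as the definitive way to do it. The function names (strip_tags, count_words, total_words) are made up for this example, and the tag stripping is the same rough sed approach as above, so it assumes fairly plain HTML without inline scripts or tags split across lines.

```shell
#!/bin/bash
# Hypothetical helper names; none of these come from the original script.

# strip_tags: drop anything that looks like an HTML tag from stdin
strip_tags() {
    sed -e 's/<[^>]*>//g'
}

# count_words: word count of the HTML on stdin, tags excluded
count_words() {
    strip_tags | wc -w | tr -d ' '
}

# total_words: add up the second column of "page count" lines on stdin
total_words() {
    awk '{ total += $2 } END { print total + 0 }'
}

# Possible usage, assuming pagelist.txt holds one URL per line:
#   while read -r page; do
#       printf '%s %s\n' "$page" "$(curl -s "$page" | count_words)"
#   done < pagelist.txt > pagewordcount.list
#   echo "Total word count = $(total_words < pagewordcount.list)"
```

This skips the intermediate wordcount.txt file and the paste step, since each page and its count are written on the same line to begin with.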

Now, this is simple, of course, for a simple website, like baldwinlinguas.com.
On the other hand, if you have some huge wordpress installation, like this blog, and have tons o’ public php pages, rather than html, and even more php files in the backend, you have to do a bit of sorting, I imagine.

Were I to attempt that with the baldwinsoftware wiki, I would probably just go to the Sitemap and grab that list of pages, using their URLs, of course.
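
A rough sketch of that idea, assuming the site publishes a standard XML sitemap whose page addresses sit in <loc> elements (the wiki's own Sitemap page may well be plain HTML instead, in which case this would not apply). The function name sitemap_urls and the example.com address are placeholders, not anything from my setup.

```shell
#!/bin/bash
# sitemap_urls: print one URL per line from sitemap XML read on stdin.
# Assumes the standard <loc>...</loc> layout; hypothetical helper name.
sitemap_urls() {
    grep -o '<loc>[^<]*</loc>' | sed -e 's/<loc>//' -e 's/<\/loc>//'
}

# Possible usage (the URL is a placeholder):
#   curl -s http://example.com/sitemap.xml | sitemap_urls > pagelist.txt
```

The resulting pagelist.txt has the same one-URL-per-line shape the word count script expects.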

./tony

Written by tonybaldwin

September 21, 2011 at 5:25 am

search google, wikipedia, reverso from the bash terminal


Okay, so, I like to use my bash terminal. Call me a geek all you like; it matters not to me. I wear that badge with pride.

The bash terminal is quick and efficient for doing a lot of stuff that one might otherwise use some bloated, cpu sucking, eye-candied, gui monstrosity to do. So, when I find ways to use it for more stuff, more stuff I do with it.

Now, for my work (recall, I am professionally a translator) I must often do research, some of which entails heavy lifting, and, otherwise, often simply searching for word definitions and translations. I use TclDict, which I wrote, frequently, but, I also use a lot of online resources that I never programmed TclDict to access, and would generally use a browser for that stuff. Unless, of course, I can do it in my terminal!

For precisely such purposes, here are a couple of handy scripts I use while working.

First, let’s look up terms at Dict.org:

#!/bin/bash

if [[ -n "$*" ]]; then
searchterm="$*"
else
read -p "Enter your search term: " searchterm
fi
read -p "choose database (enter 'list' to list all): " db

# quoting $db keeps the test from breaking when nothing was entered
if [ "$db" = list ] ; then
curl dict://dict.org/show:db
read -p "choose database, again: " db
fi

curl "dict://dict.org/d:$searchterm:$db" | less

Now, let’s search google from the command line:

#!/bin/bash
if [[ -n "$*" ]]; then
searchterm="$*"
else
read -p "Enter your search term: " searchterm
fi
# spaces in the search term become "+" so the URL stays in one piece
lynx -accept_all_cookies "http://www.google.com/search?q=${searchterm// /+}"
# I accept all cookies to go direct to search results without having to approve each cookie.
# you can disable that, of course.

I saved that in ~/bin/goose # for GOOgle SEarch
and just do
goose $searchterm
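
Search terms with spaces, or with characters like & and +, need encoding before they are dropped into a URL, or the query gets cut short. Here is a minimal sketch of a pure-bash encoder. The function name urlencode is my own, and it is only meant for plain ASCII input; accented characters are not handled here.

```shell
#!/bin/bash
# urlencode: encode an ASCII string for use in a URL query (hypothetical helper).
# Letters, digits and . ~ _ - pass through, spaces become "+",
# everything else becomes %XX.
urlencode() {
    local s="$*" out="" c i
    for (( i = 0; i < ${#s}; i++ )); do
        c="${s:i:1}"
        case "$c" in
            [a-zA-Z0-9.~_-]) out+="$c" ;;
            ' ')             out+='+' ;;
            *)               out+=$(printf '%%%02X' "'$c") ;;
        esac
    done
    printf '%s\n' "$out"
}

# Possible usage inside a search script:
#   lynx "http://www.google.com/search?q=$(urlencode "$searchterm")"
```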

Or, search the google dictionary to translate a term:

#!/bin/bash
echo -e "Search google dictionary.\n"
read -p "Source language (two letters): " slang
read -p "Target language (two letters): " tlang
read -p "Search term: " sterm
lynx -dump "http://www.google.com/dictionary?langpair=$slang|$tlang&q=$sterm" | less

Note: For a monolingual search, just use the same language for source and target. Don’t leave either blank.

Or:

#!/bin/bash
# all three parameters are needed, so check the count, not just the first one
if [ $# -lt 3 ]; then
cat <<EOF
usage requires 3 parameters: source language, target language, search term.
Thus, I have this as ~/bin/gd, and do
gd en es cows
to translate "cows" to Spanish.
For monolingual search, enter the language twice.
As indicated, use the two letter code:
"en" for English, "fr" for French, etc.
EOF
exit 1
fi

lynx -dump "http://www.google.com/dictionary?langpair=$1|$2&q=$3" | less

For the above, I have it in ~/bin/gd, usage being simply “gd $sourcelanguage $targetlanguage $searchterm”.
Example:
me@machine:~$ gd en es cow
Searches the English to Spanish dictionary for “cow”.

We can use similar principles to search reverso:

#!/bin/bash
#search reverso
read -p "Enter the source language: " slang
read -p "Enter target language: " tlang
read -p "Enter your search term: " searchterm
lynx -dump "dictionary.reverso.net/$slang-$tlang/$searchterm" | less

With the google dictionary, you use the two-letter language code (i.e., “en” for English, “fr” for French, etc.). With reverso, you have to spell out the language (“english” for English, etc.).
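
If you would rather type the two-letter codes everywhere, a small lookup can translate them into the spelled-out names reverso wants. This is only a sketch: the function name reverso_lang is made up, and apart from “english” the spelled-out names in the table are my assumption of what reverso expects, so check them against the site.

```shell
#!/bin/bash
# reverso_lang: map a two-letter language code to a spelled-out name.
# Hypothetical helper; names other than "english" are assumptions.
reverso_lang() {
    case "$1" in
        en) echo english ;;
        es) echo spanish ;;
        fr) echo french ;;
        pt) echo portuguese ;;
        *)  echo "no reverso name known for '$1'" >&2; return 1 ;;
    esac
}

# Possible usage:
#   lynx -dump "dictionary.reverso.net/$(reverso_lang en)-$(reverso_lang es)/cow" | less
```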

With all of the above, I’ve used the program, less, to display the results, rather than spitting it all out to the terminal at once. Click here to learn how to use less, if needed.

Additionally, most of the above require Lynx Browser, which is generally available for any gnu/linux distribution via your favorite package manager (apt, synaptic, aptitude, yum, portage, pacman, etc.). For the dict.org script, I used cURL (also part of most gnu/linux distributions and installable with your favorite package manager).

Google Translate can also be accessed, but for this, we’ll use a bit of python magic (I know, I pick on google translate, a lot, but it can be useful):

#!/usr/bin/env python
from urllib2 import urlopen
from urllib import urlencode
import sys

# The google translate API can be found here:
# http://code.google.com/apis/ajaxlanguage/documentation/#Examples

lang1=sys.argv[1]
lang2=sys.argv[2]
langpair='%s|%s'%(lang1,lang2)
text=' '.join(sys.argv[3:])
base_url='http://ajax.googleapis.com/ajax/services/language/translate?'
params=urlencode( (('v',1.0),
('q',text),
('langpair',langpair),) )
url=base_url+params
content=urlopen(url).read()
start_idx=content.find('"translatedText":"')+18
translation=content[start_idx:]
end_idx=translation.find('"}, "')
translation=translation[:end_idx]
print translation

Originally found that here, on the ubuntuforums.

And now for Wikipedia we have a couple of options.
First, we have this awesome little handy script, tucked into my $PATH as “define”:

#!/bin/bash
dig +short txt "$1.wp.dg.cx"
exit

I use it simply with “define $searchterm”, and it gives a short definition from wikipedia.  I originally found it here.

Another extremely handy tool is Wikipedia2Text, which I simply installed from the debian repos via aptitude. When I use this, I also pipe it to less:
#!/bin/bash

if [[ -n "$*" ]]; then
searchterm="$*"
else
read -p "Enter your search term: " searchterm
fi

wikipedia2text $searchterm | less

I have that tucked into ~/bin/wikit, thus, do simply wikit $searchterm to get my results.

Enjoy!

All code here that I have written is free and released according to the GPL v. 3. Check the links for code I borrowed for licensing information (pretty sure it’s all GPL-ed, too).

./tony

Written by tonybaldwin

May 3, 2011 at 12:52 am

Oggify – convert all your wav and mp3 to ogg

A script to convert .mp3 and .wav files to .ogg

Requires mpg123 and oggenc, uses perl rename, but I can make one with the old rename (rename.ul now in ubuntu and debian).

Why should we use ogg?

cd into the dir full o’ tunes, and:


#!/bin/bash

# convert mp3 and wav to ogg
# tony baldwin http://www.BaldwinSoftware.com
# cleaning up file names

echo cleaning file names...

rename 's/ /_/g' *
rename y/A-Z/a-z/ *
rename 's/_-_/-/g' *
rename 's/\,//g' *

# converting all mp3 files to wav,
#so there will be nothing but wavs

echo Converting mp3 to wav...

for i in *.mp3
do
mpg123 -w "$i.wav" "$i"
done

# and, now, converting those wav files to ogg

echo Converting .wav to .ogg

for i in *.wav
do
oggenc "$i"
done

# Clean up, clean up, everybody everywhere
# Clean up, clean up, everybody do your share...

# cleaning some file names
# removing ".mp3" from $filename.mp3.ogg
# for result of $filename.ogg

rename 's/\.mp3//' *.ogg

# removing all those big, fat wav files (and the original mp3s).

echo Cleaning up after ourselves...

rm -f *.wav
rm -f *.mp3

echo -e "Your files are ready, friend.\nHappy listening!"

exit

# This program was written by tony baldwin - tony @ baldwinsoftware.com
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
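
For anyone without perl rename handy, the file name cleanup can also be sketched with plain bash parameter expansion and tr. This is an alternative I am assuming would fit, not what the script above does, and the function names clean_name and ogg_name are invented for the example.

```shell
#!/bin/bash
# clean_name: the same four cleanups as the rename lines, applied to one name.
# Hypothetical helper; prints the cleaned name, does not touch any file.
clean_name() {
    local n="$1"
    n="${n// /_}"                             # spaces to underscores
    n=$(printf '%s' "$n" | tr 'A-Z' 'a-z')    # lower case
    n="${n//_-_/-}"                           # "_-_" to "-"
    n="${n//,/}"                              # drop commas
    printf '%s\n' "$n"
}

# ogg_name: swap whatever extension a file has for .ogg
ogg_name() {
    printf '%s\n' "${1%.*}.ogg"
}

# Possible usage:
#   for f in *.mp3; do mv -- "$f" "$(clean_name "$f")"; done
```

With ogg_name, the target name can be worked out up front (oggenc takes an -o option for the output file), so the later step of stripping “.mp3” out of the .ogg names would not be needed.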

wiki page for this script.

I have a version with a little gui-ness with zenity, if anyone wants it. (i.e. with a graphical user interface)

./tony

Written by tonybaldwin

February 22, 2011 at 11:25 am

mv screenshots

Screenshot pictures tend to pile up in my /home, so I wrote this:

#!/bin/bash

# move pix or die, damn it

# [ -f *.jpg ] chokes as soon as there is more than one jpg, so ask ls instead
if ! ls *.jpg > /dev/null 2>&1
then
echo no pix d00d
exit
else
echo THESE
ls -1 *.jpg
echo are ALL pix, d00d.
for i in *.jpg
do
mv "$i" pix/screenshots
echo "I just moved $i to the screenshots dir"
done
echo all done d00d
fi
exit
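
The same job can be sketched as a small function that takes the source and destination directories as arguments, creates the destination if it is missing, and copes with spaces in file names. The name move_pix and its two parameters are made up for this example.

```shell
#!/bin/bash
# move_pix SRC DEST: move every .jpg in SRC into DEST, print how many moved.
# Hypothetical helper; DEST is created if it does not exist yet.
move_pix() {
    local src="$1" dest="$2" f count=0
    mkdir -p "$dest"
    for f in "$src"/*.jpg; do
        [ -e "$f" ] || continue      # the glob matched nothing
        mv -- "$f" "$dest"/ && count=$((count + 1))
    done
    echo "$count"
}

# Possible usage:
#   echo "moved $(move_pix "$HOME" "$HOME/pix/screenshots") pix, d00d"
```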

Written by tonybaldwin

January 27, 2011 at 1:54 pm

eXp0stulate – moving right along!

I’ve already made a few changes to eXp0stulate, primarily in the gui.
Rather than have a menu with options for posting to the 3 relevant services (livejournal, insanejournal, and dreamwidth), I have made buttons, and placed the “Post to: (which journal?)”(so you can post to a community, etc.) at the bottom of the interface.

Here’s a new screenshot:

I have plans to edit the “Insert” menu, since DW uses slightly different tags than IJ and LJ (which both use stock LJ tags, for cut, user, community, etc.).
I may add more html options to that menu, as well (bold, italic, small, big, headings-h1, h2, etc., horizontal ruler, paragraph, blockquote, lists, etc.).
I also think that html syntax highlighting would be pretty awesome, so one could more readily see errors in html tags in an entry, before posting.
And, of course, spellchecking would be gro0vy.
Beyond that, there’s other work to do. Currently eXp0stulate uses the old, flat post method. I’d like to get it writing out to xml, so it will function with the xml-rpc interface, which will then allow me to add both blogger and wordpress functions (which will also allow me to use it with my baldwinsoftware/blog).
I’d also like to incorporate a few more features, such as downloading and editing older entries. Right now, it only posts an entry.
This is a drawback, because, if the user finds an error in their entry once it’s been posted, they have to edit it on-site, rather than in eXp0stulate.
Not convenient.
If I add in the ftp function from tcltext and tclup, heck, it could be a combo blogging client and html editor…maybe…
So, I have a full TODO list on the eXp0stulate wiki page.

One advantage I do think it has is, once you’ve posted an entry, the text widget does NOT clear (unless you deliberately clear it), so, the entry remains loaded, and you can post to all three current services (IJ, LJ, DW) simply by pressing the three buttons at the bottom of the interface, and, if you like, even x-post to a few communities by modifying the “Post to: [which journal?]” field and pressing said buttons again.
(In Logjam, for instance, one must logout/login to the distinct services, reloading the entry before changing services, in order to xpost…here it’s just, write an entry, and click, click, click…)
That’s convenient, I think.

The user must be certain to make sure that “Post to: [which journal?]” is filled in with the name of the journal to which they are posting (their username if posting to their own journal, the name of a community, if that is what they wish). In TKLJ and Therapy, I had that field default to the user’s username for the relevant service, but, since this program posts to 3 (and hopefully, in the future, 5 or more) services, defaulting to any specific username seemed impractical. I have a username on LJ that differs from that which I use on IJ and DW. I suspect others may have differing usernames, as well, so the field simply defaults to “which journal?”, which means, if the user does not alter the field, they will be trying (unsuccessfully) to post to http://which journal?.livejournal.com/, or ij or dw, as the case may be.
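
One way to guard against that would be to check the field before building the journal address. eXp0stulate itself is written in Tcl/Tk, so what follows is only a bash sketch of the idea, not the program's code; the function name journal_url and the short service codes are invented here, and the check assumes journal names contain only letters, digits, underscores, and hyphens.

```shell
#!/bin/bash
# journal_url SERVICE JOURNAL: print the journal address, or fail if the
# "Post to:" field is empty or still holds the placeholder text.
# Hypothetical helper; lj/ij/dw are made-up short codes for the three services.
journal_url() {
    local service="$1" journal="$2" domain
    case "$service" in
        lj) domain=livejournal.com ;;
        ij) domain=insanejournal.com ;;
        dw) domain=dreamwidth.org ;;
        *)  echo "unknown service: $service" >&2; return 1 ;;
    esac
    # reject an empty name or one containing characters outside A-Za-z0-9_-
    case "$journal" in
        ''|*[!A-Za-z0-9_-]*)
            echo "fill in the 'Post to:' field first" >&2
            return 1 ;;
    esac
    printf 'http://%s.%s/\n' "$journal" "$domain"
}
```

The untouched default, “which journal?”, contains a space and a question mark, so it is refused before any posting is attempted.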

It would be good to load menus with the user’s communities, and also add user icon functionality, but the user icon feature would be useless for blogger and wordpress, of course, so I may just dispense with that.

Some might ask, “Why, Tony? Why make another blogging client for LJ, IJ, and DW?”
To which I answer, “Well…

  1. I keep blogs on several services, and want to make it as convenient and efficient as possible to X-post to all 6 blogs I keep;
  2. I want some features the others don’t have;
  3. Why climb the mountain? Because it’s there… (i.e., it was something to do, and I like hacking);
  4. No such blogging client exists written in Tcl/Tk, and I like to promote the language.”

Anyway, I’ve got to get away from this project at the moment and return to my translation work.

If you want to get on board and hack this thing up with me, let me know!
I’ll add you to the baldwinsoftware.com/wiki, forums, and invite you to the baldwinsoftware googlegroup/listserv.

be well,
tony


posted with eXp0stulate

Written by tonybaldwin

March 26, 2010 at 1:59 pm

Python v. Tcl/Tk: Denting & Tweeting

So. I have now made two little denter/tweeter programs (to send updates to twitter.com and identi.ca), one with Tcl/Tk, the other with Python. I figured a little comparison, perhaps, was in order.

If you look at them, of course, they look, well, just about the same.  Tkinter is, after all, analogous to Tk.
The Tcl/Tk program made it incredibly simple to display the response from the remote server, which I haven’t succeeded in doing with the python script, yet.  Both rely on calling an external program (curl) to send updates, rather than relying on the languages’ built-in tools.  I could probably work out HTTP POST in tcl rather painlessly.  I did try to use python’s urllib to post, unfruitfully, and resorted back to calling curl.
the links: iDenTickles (tcl/tk) / iDenTweetyPie (python)
the code for both programs is available at the above wiki links

The Tcl/Tk program, which has the added feature of displaying the server response, has only 47 lines of code, 245 words, 1844 characters.  It took me less than an hour to write it.

The Python program, however, which does precisely the same thing as the tcl/tk program, minus displaying the server response, has 104 lines of code, 564 words, and 4073 characters. It took me the better part of a day to write it. Oh, but the python program tells you if your update is too long.
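
Since both programs hand the actual posting to curl anyway, the length check and the curl call can be sketched in a few lines of bash. This is not the code of either program: the function names are made up, the API address is passed in as a parameter because I am not spelling out any particular endpoint here, and the 140 limit is twitter's.

```shell
#!/bin/bash
# Hypothetical sketch; not taken from iDenTickles or iDenTweetyPie.
MAX_LEN=140

# update_ok TEXT: succeed if TEXT fits within MAX_LEN characters
update_ok() {
    [ "${#1}" -le "$MAX_LEN" ]
}

# send_update API_URL USER TEXT: check the length, then post with curl.
# curl prompts for the password because only the user name is given to -u.
send_update() {
    local api_url="$1" user="$2" text="$3"
    if ! update_ok "$text"; then
        echo "update is ${#text} characters; the limit is $MAX_LEN" >&2
        return 1
    fi
    curl -s -u "$user" --data-urlencode "status=$text" "$api_url"
}
```

Whatever curl prints is the server's response, so showing it to the user comes for free in this form.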
One must ask oneself, of course, is this a testament to the power and simplicity of tcl/tk?  Or, is it simply an indication of my lack of skill with python?
I can’t answer that definitively, but, to me, it really looks like tcl/tk is a bit more efficient. Admittedly, I’m not a very skilled programmer at all, in truth. Timewise, of course, I have been writing tcl/tk for a couple of years, and am only just now delving into python. As such, I was able to throw the tcl/tk program together quickly, while my efforts to “translate” my tcl/tk program into python required a bit of research on the syntax for writing tkinter guis, and other elements. It just really looks to me as though Python/Tkinter takes a lot more code to do the same thing. I really have drawn that conclusion. Building a gui, especially, seems more cumbersome with tkinter than with simple, good old tcl/tk. I know there are other means of building a gui with python (wxwidgets, pygtk, pyqt, etc.), but I wanted to try the one most similar to that with which I am already familiar, and I believe it is a fairer comparison when using a similar gui toolkit.
At this juncture, I do have to say, I feel a great loyalty and deep affinity for tcl/tk. I don’t understand why it isn’t in wider use, frankly. It is an incredibly powerful language, used for a vast array of purposes, and, in my opinion, is probably the easiest programming language to learn (of course, I haven’t tried them all), especially for a beginning programmer. One can be up and running, creating useful programs, in a relatively short time. I also feel the need to give kudos to the tcl/tk community and the tcl.tk wiki, which is replete with tons of example code, detailed explanations, and great resources for learning how to program in tcl/tk. The tcl-ers that hang out at #tcl on irc.freenode.net, additionally, are extremely helpful, and patient. They won’t hold your hand, but they’ll tolerate a newbie, and point them in the right direction, without any snobbishness or derision.
I can’t say the same for my experiences with pythonistas. Their irc channel was a little less friendly, imho. Maybe I just caught them on a bad day, or maybe I was having a bad day. After all, Pythonistas are known for having a sense of humor. Admittedly, I was frustrated when I finally went to their channel for a bit of support, and a frustrated, whiny n00b is no fun to play with, anyway. Moreover, the python community does have a lot of documentation available online. Nonetheless, to me, it seems that it is written for other programmers, not for the uninitiated, so it is not so easily read as much of the tcl/tk resources. Their sample code is not explained well enough for someone new to programming to really make sense of it. This may also be a function of time, since tcl/tk has been around a bit longer than python.

I do want to make it very clear: I’m really not here to pick on python.  I know that it’s a powerful language with a great many uses, and a favorite of a great many real hackers who know a lot more about programming than I do.  I will continue to learn to write it, and believe it will serve me quite well for various purposes, and I believe I will continue to have fun learning it.  But, I think I might continue to point out how tcl/tk is much easier and seemingly efficient, too…

Written by tonybaldwin

March 23, 2010 at 3:30 am