Friday 25 July 2008

Word frequencies...from files

So I was chilling in IRC today when someone was going on about random() and how predictable the things people say are. Being kinda geeky, I decided to write a one-liner that pulls a given user's lines out of the IRC log file and counts every word they used, sorted by count, so you can see the words that user is most likely to say (the most common ones end up at the bottom of the output).

So here it is anyway:

fgrep "username" \#room.log|cut -f2 -d ">"|sed 's/ /\n/g'|sort|uniq -c|sort -g

You can use this on text files too:

cat foo.txt |sed 's/ /\n/g'|sort|uniq -c|sort -g
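
If you want something a bit more forgiving, here is a rough sketch of the same idea wrapped in a shell function. The name word_freq is made up for this example, and lowercasing the words and putting the most common ones first are my own assumptions about what is useful, not part of the one-liner above. It uses tr to split on any run of whitespace, so double spaces and tabs do not show up as empty "words", and that also avoids relying on sed understanding \n in the replacement.

```shell
# word_freq (hypothetical helper): print "count word" lines, most frequent first.
# Reads the file named in $1, or stdin when no argument is given.
word_freq() {
    cat -- "${1:--}" |
        tr -s '[:space:]' '\n' |        # one word per line, whitespace runs squeezed
        sed '/^$/d' |                   # drop the empty line left by leading whitespace
        tr '[:upper:]' '[:lower:]' |    # assumption: treat "The" and "the" as one word
        sort | uniq -c | sort -rn       # count duplicates, biggest count on top
}
```

Something like `word_freq foo.txt | head` should then show the top ten words, and for the IRC case you can pipe the fgrep and cut part from above straight into word_freq instead of naming a file.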