I was having fun with the Ruby programming language and Richa’s tweets in my spare time over the last few weeks. My idea (and implementation) of having fun is:
- Implement the Markov chain algorithm in Ruby.
- Make a change. Print the output in a structure that looks like a poem.
- Next change. Fetch tweets and apply the program to them. Tweets -> Markov algorithm -> poem.
Now into the details:
I wanted to play with Ruby after a long time. In The Practice of Programming, authors Kernighan and Pike give various implementations of the Markov chain algorithm in C, C++, Java, Awk and Perl. Implementing this algorithm in Ruby was a good option, so I got myself started on it.
If you are wondering what the algorithm is all about, K&P write that “the key observation is that we can use any existing text to construct a statistical model of the language as used in that text, and from that generate random text that has similar statistics to the original. If we imagine the input as a sequence of overlapping phrases, the algorithm divides each phrase into two parts, a multi-word prefix and a single suffix word that follows the prefix. A Markov chain algorithm emits output phrases by randomly choosing the suffix that follows the prefix, according to the statistics of (in our case) the original text. Three-word phrases work well -- a two-word prefix is used to select the suffix word.”
My first couple of attempts did not reach completion. I took the Java version and wrote similar code in Ruby. There is a Prefix class which has a Vector. Prefix objects become the keys in the main data structure statetab which is a Hashtable. The values in the Hashtable are arrays.
I finished the coding, but the Ruby map was not storing data properly. And I just got bored. I didn’t feel like debugging and fixing my program. Did you ever get that feeling? You look at the code and just don’t feel like making it work. It was like writer’s block.
So I opened the book again, and this time took the Perl version and started converting it to Ruby. In this version, there is an anonymous array as a value in a map. Again, after some coding, my energy was sapped and I stopped. I didn’t want to “design” a Ruby implementation. I only wanted to do coding work, that is, just take an existing implementation and convert it into Ruby.
I returned to the task a day or two after I stopped on the Perl version. I googled for Markov 2-prefix algorithm and one of the hits I got explained a very nice and simple implementation of the algorithm. When I scrolled up and looked around the page, I realized it was an example in the documentation of a programming language called Lua.
Lua? Holy ****. I never knew there was a programming language called Lua. If someone had asked me in a quiz whether a programming language called Lua existed, I would have said no and lost the point. But anyways, I was glad I found it. The code was so nice that I understood it on the first read.
The Lua example forms prefixes by concatenating two words with a space in between them. These prefixes become “keys” in a map. The “values” in the map are arrays. Arrays as values in a map! Thank goodness, that data structure I know, having used it in my Ruby programs in the past. The syntax of the declaration is
statetab = Hash.new { |h, k| h[k] = Array.new }
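To make the idea concrete, here is a minimal sketch of how a two-word-prefix Markov generator could be built around that declaration. This is not the program linked from this post. The names NONWORD, build_table and generate are made up for illustration, and the sample text is a placeholder.

```ruby
NONWORD = "\n" # sentinel that can never appear as a real word

# Build a table mapping each two-word prefix (joined by a space)
# to the array of words seen right after that prefix.
def build_table(words)
  statetab = Hash.new { |h, k| h[k] = Array.new }
  w1 = w2 = NONWORD
  words.each do |word|
    statetab["#{w1} #{w2}"] << word
    w1, w2 = w2, word
  end
  statetab["#{w1} #{w2}"] << NONWORD # mark the end of the input
  statetab
end

# Walk the table, choosing a random suffix for the current prefix at each step.
def generate(statetab, max_words, rng = Random.new)
  output = []
  w1 = w2 = NONWORD
  max_words.times do
    suffix = statetab["#{w1} #{w2}"].sample(random: rng)
    break if suffix == NONWORD
    output << suffix
    w1, w2 = w2, suffix
  end
  output
end

text = "the quick brown fox jumps over the quick brown dog"
table = build_table(text.split)
puts generate(table, 20).join(" ")
```

Because a suffix that occurs twice after a prefix is stored twice in the array, picking a random element automatically follows the statistics of the input text.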
And then it was a breeze. No writer’s block or programmer’s block this time. I had my first task done - my Ruby implementation of the Markov algorithm. If you want to stop here and take a look at the code, click here.
Next junction in the fun journey was to print the output as a poem. I decided on a poem structure. In poetry, you can have concrete structures that result in visual poetry. I settled for a simple pattern.
In this pattern, the poem has three stanzas. Each stanza has four lines. The lines alternate between 10 words and 5 words. Thus each stanza has 10 + 5 + 10 + 5 = 30 words. The three stanzas add up to a total of 90 words.
This did not require any fancy coding. Just save the words in an array and print the appropriate sub-arrays. That’s it. This program is available here. You may notice that I got rid of the functions prefix and insert from the first version. This makes the program smaller, but it would take longer for a new person to understand.
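The sub-array printing can be sketched roughly like this. It is only an illustration of the 10/5/10/5 pattern, not the linked program. LINE_LENGTHS, STANZAS and format_poem are names I made up here, the words are placeholders, and the blank line between stanzas is an assumption.

```ruby
LINE_LENGTHS = [10, 5, 10, 5] # words per line within one stanza
STANZAS = 3

# Slice a flat array of words into stanzas of lines following LINE_LENGTHS.
def format_poem(words)
  pos = 0
  Array.new(STANZAS) do
    LINE_LENGTHS.map do |len|
      line = words[pos, len] || [] # nil when we run past the end of the array
      pos += len
      line.join(" ")
    end.join("\n")
  end.join("\n\n")
end

words = (1..90).map { |i| "w#{i}" } # placeholder words instead of Markov output
puts format_poem(words)
```

If the generator stops early and returns fewer than 90 words, the last lines simply come out short or empty.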
Anyways, second junction crossed. To take the fun ahead, why not apply the Markov algorithm to tweets and generate a poem? Ha!! That would be amusing. And it was easy too, because I found that Ruby has a gem called Twitter that can be used to fetch tweets.
I read the API on the web and it was very simple to use. The user_timeline method was the one I needed, along with the max_id option. Every tweet has an id, and once you passed this id as the max_id value, the call would fetch tweets with ids lower than max_id. In other words, you would get tweets that are earlier in the timeline than the tweet whose id you passed to the method. Only one problem.
The result included the tweet whose id was equal to the id that I was passing. I don’t know if it is a bug or if I didn’t read the doc properly. Basically, when you passed an id value as max_id to user_timeline, it returned tweets including the one whose id was passed, not just the tweets earlier than it. This resulted in a couple of extra lines of code: every time, I had to subtract 1 (one) from the last tweet id fetched and then send that to fetch the next block of tweets.
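The subtract-one loop looks roughly like the sketch below. To keep it runnable without network access, the real user_timeline call is replaced by fetch_page, a hypothetical callable that takes a max_id and returns a page of tweets including the one whose id equals max_id. The fake timeline and the names fetch_all and fake_fetch are made up for the demonstration.

```ruby
# fetch_page is a hypothetical stand-in for the real API call.
# It takes a max_id (nil for the newest page) and returns an array of
# hashes with :id and :text, newest first, INCLUDING the tweet whose id
# equals max_id.
def fetch_all(fetch_page, pages)
  tweets = []
  max_id = nil
  pages.times do
    batch = fetch_page.call(max_id)
    break if batch.empty?
    tweets.concat(batch)
    max_id = batch.last[:id] - 1 # step past the oldest tweet already fetched
  end
  tweets
end

# Fake timeline for demonstration only: ids 10 down to 1, three per page.
TIMELINE = (1..10).map { |i| { id: i, text: "tweet #{i}" } }.reverse
fake_fetch = lambda do |max_id|
  TIMELINE.select { |t| max_id.nil? || t[:id] <= max_id }.first(3)
end

all = fetch_all(fake_fetch, 5)
puts all.map { |t| t[:id] }.inspect
```

Without the `- 1`, the oldest tweet of each page would show up again as the first tweet of the next page.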
Anyways, the program fetched tweets, ran the Markov algorithm on the tweets, and then printed my 90-word concrete poem. It did that every time I ran the program. Fetching tweets over the net takes time. Hence generating one poem was taking a few minutes.
So I broke the program down into two separate programs. The first one just fetches tweets. The second one makes the poem. I run the first program once to fetch the tweets and save them to a file. Then I run the second program, passing the file as input, and it generates a poem. Generating poems is very fast now, almost instantaneous.
I follow Richa, the Tollywood actress, on Twitter and I was using her tweets for testing throughout the course of my fun coding. Basically, this is the technical enablement of a poet from an actor :) (if Richa were a poet, that is). The poems are based on tweets, hence I call them twoems. A few sample Richa twoems are given below.
These poems were generated on Thursday, Friday and Sunday. The Thursday tweets I fetched are the tweets that were generic and sent to all followers. That is, those tweets do not have any twitter ids. The Friday and Sunday poems are based on all tweets, whether they were addressed to a particular follower or not. However, the program removes all twitter ids and retains only the text.
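The two kinds of input can be sketched like this. The MENTION pattern is my assumption of what a twitter id looks like inside tweet text, generic_only and strip_ids are names made up for illustration, and the sample tweets are invented placeholders.

```ruby
MENTION = /@\w+/ # assumed pattern for a twitter id inside tweet text

# Keep only tweets that are not addressed to anyone (the Thursday input).
def generic_only(tweets)
  tweets.reject { |t| t.match?(MENTION) }
end

# Remove every twitter id and tidy the leftover whitespace
# (the Friday and Sunday input).
def strip_ids(tweets)
  tweets.map { |t| t.gsub(MENTION, "").squeeze(" ").strip }
end

# Invented placeholder tweets, not real data.
sample = ["@someone thanks a ton!", "Flying back to hyd tomorrow", "hi @a and @b :)"]
puts generic_only(sample).inspect
puts strip_ids(sample).inspect
```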
Thursday twoems
So blank... Yayyyy katy perry's 'shatter nail polish'. I didn't
do it perfectly but pretty
cool! Flying back to hyd tomorrow for the "Kabaddi" song.
now on my flip hd
video cam but sum of the most romantic thing ever
when a guy sings love
songs from the nearby yucky lake/pond/bog/swamp! wishing everyone a very
Happy Bday to my room
after a wonderful 1st day back w my roomie :)
The rage of nagavalli Who's
that chick? Whooo's that chick! Luv Rihanna Need to recharge
after a long but fun
---------------
So blank... Yayyyy katy perry's 'shatter nail polish'. I didn't
get to see me dance
in Leader, but in Mirapakaay :)u guys will love it!
Ok my days feel really
weird when I'm not shooting...just wrapped up with the coolest
costar and director everrr! WHY
did God create mosquitoes again? what purpose do they serve
in this traffic.... It's time
for my costar Ravi sir and Deeksha....one of our very
few combo shots :) Ahhhhhh
dialogue stress!!! At sangi temple for MPK shoot! nitey nite
twitpalz, another early shoot tomorrow
--------------
So blank... Yayyyy katy perry's 'shatter nail polish'. I didn't
get crap for tweeting abt
NORMAL useless things! Gone r those days Omg guys. The
name of the Gods. Vow
not to tweet until I finish! Signing off for now...
Ma and Baba, are you
reading my tweets? I MISS MY MOM, DAD AND DOG!!!!!!!!
mama= mom's brother (bengali) kolkata
humidity definitely beats out bombay humidity.... chilling with the fam
and friends :) Ok I'm
over my jetlag but I feel like cleaning it. haha
i feel like doing anything
----------------------------
Listening to the day, oops! So much traveling back n
forth...cane wait to empty my
bags once i get my mumbai number and internet back
in 2 weeks! these outfits
are super cute! im playing an orthodox telugu brahmin character,
but her role has a
lot planned for today... Going 2 kol tomorrow for Mirrrapakaaay!
Ok its high time I
finished reading this book, In Spite of the ridiculous heat!
Today is unbearably hot! But
have to packkkkk!!! AHHHHHHH it's midnight and i have gud
news 4 my tweeps soon
--------------------------
Friday generated twoems
its okay :) thanks for the inspiration thank you No
guys, I do love andhra
cuisine and its 'weather specialties' I was just visiting from
the states so it BETTER
be soon! i have spent most of it :) :)
:) 2 dance rehearsals in
between. PHEW what a weird day...went to shoot 2 songs
in Araku :) LOL this
made my day more than i ever going home? stuck
at the airport to my
home in a few scripts :) hiiiii! haha that's sweet...hello!
lol don't be worried its okay
-------------------------------
its okay :) thanks for your motivating tweets guys! ill
be sure not to invite
me to Starbucks when I come to hyd for a
joggy with my doggy :)
hehee I'm in your heart when you're ready to go
home! Flight boards in an
auto for shelter! What a relief Can't express how amazing
it feels to be wearing
valentine's colors!:( awww thanks! i love my character was shy
don't worry i'll let you
be the only one full year since I got it
;). but luckily i was
--------------------------------------------
its okay :) hahaha omg.....that's funny :) Who's that chick?
Whooo's that chick! Luv Rihanna
yep that was delivered 2 me on a flight back
home lol! fluffy lovessss going
for a holiday flavor haha I like it. Mirapakaya and
Chandramukhi sequel with Venky sir!
why would it be tough? he's one of my bday
:) hi thanks a ton!
yup! In hyd haha yes its him :) thank youuu
thats okayyyyy i don't know
why ppl are interested in knowing 'if i had the
most amazing and relaxing massage
--------------------------------------------
Sunday twoems
nooooooo ur phone is offfffffff now I have a lottttttttt
planned! wow you are telling
someone who has been subscribed to them for sure :)
teeeeeeheeee :-P. well i think
its funny how ppl are interested in knowing 'if i
had an amazing 2-week holiday
with my super squishy puppy Fluffy! Perfect. Home sweet snowy
home :) hope you like
me in my head for the weekend! :) sunny state
here i come! HAPPY MOTHER'S
DAY!!!!!! missing my dog Fluffy, the snow, christmastime, starbucks winter
flavors, my car, my cozy
----------------------------------------
nooooooo ur phone is offfffffff now I have yet to
see my uncle and met
the 'Who's who' of the world! i will post a
pic and tweet it :)
awwww that's so you can only see u all too!
Last year was special. This
year its more like minimal practice so no time to
bust out the baking madness
;) like OLD TAIIMEZ! BROWNAYZ? omg lol dork yayayayay new
yrs with u n dimpz!
Don't u dare watch my own good... I had it
in Hyderabad though! its interesting
--------------------------------------
nooooooo ur phone is offfffffff now I am so happy
i've done both these films,
they are utterly different from each and every opinion counts,
especially when you get than
this?! i barely get on fb too I THINK i
have to wait and see..
omg that is a sweetie :) But I adooore her
so huge compliment! i love
my mom is here to cook the best android on
the 12th. shooting will begin
for chandramukhi sequel, and then mirapakaya soon after. yes im
in 'The Sound of Music"