Markov Chains in Ruby
January 15th, 2005
Writing and playing with text is good fun. Using Markov chains, one can generate text that is almost readable, almost understandable, but not quite. A Markov chain analysis records how often each word follows another.
Analysing text basically means counting which words follow each other. Producing text is done by picking a word, looking up which words came after it in the text samples, picking one of those at random, and then using this new word as the basis for determining the next one.
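As a rough illustration of the two steps just described, here is a minimal sketch. The names `build_followers` and `generate` are made up for this example and are not part of the class shown further down; the sample sentence is also just an assumption for demonstration.

```ruby
# Hypothetical helper names; not part of the post's MarkovTool class.

# Analysis: map each word to the list of words seen directly after it.
# A word that follows another several times appears several times in the
# list, so frequent followers are picked more often later on.
def build_followers(words)
  followers = Hash.new { |hash, key| hash[key] = [] }
  words.each_cons(2) { |current, following| followers[current] << following }
  followers
end

# Generation: start at a word and repeatedly pick a random recorded follower.
# Stops after max_words words or when the current word has no followers.
def generate(followers, start, max_words = 10)
  result = [start]
  while result.size < max_words
    candidates = followers.fetch(result.last, [])
    break if candidates.empty?
    result << candidates[rand(candidates.size)]
  end
  result
end

words = "the cat sat on the mat".split
table = build_followers(words)
puts generate(table, "the").join(" ")
```

Because the follower lists keep duplicates, no explicit frequency counter is needed: picking uniformly from the list already weights each follower by how often it occurred.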
For a game project we are working on, I need this functionality. Because Amuda is written in Ruby using Ruby on Rails as its MVC framework, I also wanted to do the Markov chains in Ruby. Because there doesn’t seem to be any Ruby implementation, I ported Gary Burd’s Markov Chain Generator. For those interested, read on to see the code. And if any Ruby gurus are out there, please help me improve my “Ruby-Fu”; I’m still a beginner in this language.
Here we go:
# Markov Chain Generator
# based on the Python version by Gary Burd: http://gary.burd.info/2003/11/markov-chain-generator.html
# Released into the public domain, please keep this notice intact
# (c) InVisible GmbH
# http://www.invisible.ch
#
require "yaml"
class Array
  # return a random element of the array, similar to random.choice in python
  def choice
    self[ rand(self.size) ]
  end
end
class MarkovTool
  attr_accessor :markov_data
  def initialize( markov_data = nil )
    # use an unlikely combination for end of paragraph marker
    @nlnl = "#-#-"
    @markov_data = markov_data if markov_data.class == Hash
    @markov_data ||= Hash.new
  end
  # the next lookup key: the paragraph marker after a newline,
  # the unchanged key if there is no word, otherwise the word itself
  def new_key( key, word)
    return @nlnl if word == "\n"
    return key if !word
    return word
  end
  # record, for every key, each word that follows it in the input
  def markov_data_from_words( words )
    key = @nlnl
    words.each do | word |
      @markov_data[ key ] ||= Array.new
      @markov_data[ key ] << word
      key = new_key( key, word )
    end
  end
  def words_from_markov_data
    key = @nlnl
    result = Array.new
    word = ""
    # repeat until we hit a newline or a word ending in a full stop;
    # remove the last clause to generate whole paragraphs
    while word && word != "\n" && word[-1] != "."[0]
      # an unknown key yields nil, which ends the loop
      word = @markov_data[ key ].choice rescue nil
      key = new_key( key, word )
      result << word if word
    end
    result
  end
  # split a string into words; an empty line becomes a paragraph break
  def words_from_string( line )
    result = Array.new
    words = line.split
    if words.size > 0
      words.each { | word | result << word }
    else
      result << "\n"
    end
    result
  end
  # read a file line by line and return all of its words
  def words_from_file( f )
    result = Array.new
    File.foreach( f ) do | line |
      result << self.words_from_string( line )
    end
    result.flatten
  end
  # build a paragraph out of the result array
  def paragraph_from_words( words )
    result = Array.new
    words.each do | word |
      result << word
    end
    result.join( " " )
  end
  # return a complete paragraph
  def get_paragraph
    wo = self.words_from_markov_data
    self.paragraph_from_words( wo )
  end
  def store_in_yaml( f )
    YAML.dump( @markov_data, f )
  end
  def load_from_yaml( f )
    @markov_data = YAML.load( f )
  end
end
if __FILE__ == $0 then
  m = MarkovTool.new
  # read existing markov data
  # File.open( "markov.yaml" ) { | yf | m.load_from_yaml( yf ) }
  if ARGV[0]
    # if we got a filename, read it, process it and store the markov data
    w = m.words_from_file( ARGV[0] )
    m.markov_data_from_words( w )
    File.open( "markov.yaml", "w" ) { | yf | m.store_in_yaml( yf ) }
  end
  # create a paragraph and display it
  p = m.get_paragraph
  puts p
end
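The listing stores and reloads its table through YAML. The following sketch shows that round trip in isolation, assuming an in-memory `StringIO` buffer in place of the `markov.yaml` file; the sample words are made up, and only the `"#-#-"` marker is taken from the code above.

```ruby
require "yaml"
require "stringio"

# A tiny table in the same shape MarkovTool builds: key => array of followers.
# "#-#-" is the post's paragraph marker; the words are invented for this demo.
data = { "#-#-" => ["Hello"], "Hello" => ["world.", "there."] }

buffer = StringIO.new
YAML.dump(data, buffer)        # the same call store_in_yaml makes on an open file
buffer.rewind                  # go back to the start before reading
restored = YAML.load(buffer)   # the same call load_from_yaml makes

puts restored == data
```

Since the table is just a hash of string arrays, it survives the dump and load unchanged, which is what lets a later run skip the analysis step.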
Entry Filed under: Ruby

7 Comments
1. Markov Chains… | June 8th, 2006 at 12:25
[...] I’m having fun with Markov chains and some texts. Here’s Wittgenstein, mixed in with Kant and McCartney markovized: [...]
2. IT Новост… | January 9th, 2007 at 10:01
[...] in Ruby: http://blog.invisible.ch/index.php?p=000361 Posted in Other | Trackback | del.icio.us | Top Of Page [...]
3. Веб мо… | January 9th, 2007 at 10:08
[...] Here you can look at another implementation in Python: http://burd.info/gary/2003/11/markov-chain-generator.html And here is an implementation in Ruby: http://blog.invisible.ch/index.php?p=000361 [...]
4. codex | January 26th, 2007 at 11:58
This site is related:
5. Matheedubbas | December 9th, 2007 at 00:03
Check my site: http://s3.dk
I have made an implementation on SQL Server and C# that will render
webpages with Markov-linked content. I would like to do the same with
MIDI. Anyone got any ideas for an application?
regs,
Matheedubbas
http://s3.dk
6. ramakrishna | June 21st, 2008 at 08:55
s=raw_input("enter the string:")
l=len(s)
b=""
w=" "
x="~!@#$%^&*()+|}{]['<>?,./"
m=len(x)
for i in range(l):
    c=0
    for j in range(m):
        if s[i]==x[j]:
            c=c+1
        elif s[i]=="-" or s[i]=="/":
            w=w+b
    if c==0:
        w=w+s[i]
w=w.lower()
print "w",w
x=w.split(" ")
y=float(len(x))
d={}
y=[]
max=0
for i in x:
    d[i]=d.get(i,0)+1
revitems = [(v,k) for k,v in d.items()]
revitems.sort()
revitems.reverse()
x=len(revitems)
index=1
rank=1
print "keys" "\t\t\t\t" "values" "\t\t" "rank"
for v, k in revitems:
    temp=revitems[index][0]
    print "%-20s%15d%15d"%(k,v,rank)
    index+=1
    if temp
m=len(d)
print "total word count:",m-1
7. crunchytoast.com » … | May 1st, 2009 at 17:49
[...] So! Check out the HMM Ruby Quiz and also Markov Chains in Ruby. These are examples of working code in under 50 or so lines. Also, for a clear if somewhat “mathy” explanation of Hidden Markov Processes and Markov Chains, check out Wikipedia’s HMM article. [...]