Markov Chains in Ruby

January 15th, 2005

Writing text and playing with text is good fun. Using Markov chains, one can generate text that is almost readable, almost understandable, but not quite. A Markov chain text generator analyses how frequently each word follows another.

Analysing text basically means counting which words follow each other. Text is produced by picking a word, looking up which words came after it in the text samples, picking one of those at random, and then using this new word as the basis for determining the next one.
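
As a minimal sketch of both steps (this is not the ported implementation, just an illustration), the following counts followers in a tiny sample and then walks the resulting table. The names `build_followers` and `generate` and the sample sentence are made up for this example, and `Array#sample` assumes Ruby 1.9 or later.

```ruby
# Illustrative sketch only; build_followers and generate are hypothetical names.

# Analysis: for every word, remember each word that directly followed it.
# A follower that occurs twice is stored twice, so it is twice as likely
# to be picked later.
def build_followers(words)
  followers = Hash.new { |hash, key| hash[key] = [] }
  words.each_cons(2) { |current, following| followers[current] << following }
  followers
end

# Generation: start from a word, pick one of its recorded followers at
# random, and repeat until a word has no followers or the limit is reached.
def generate(followers, start, max_words = 10)
  result = [start]
  word = start
  while result.size < max_words && followers.key?(word)
    word = followers[word].sample
    result << word
  end
  result
end

sample = "the cat sat on the mat and the cat ran".split
followers = build_followers(sample)
puts followers["the"].inspect   # prints ["cat", "mat", "cat"]
puts generate(followers, "the").join(" ")
```

Storing duplicates in the follower list keeps the code short: a uniformly random pick from the list is automatically weighted by how often each follower appeared.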

For a game project we are working on, I need this functionality. Because Amuda is written in Ruby using Ruby on Rails as its MVC framework, I also wanted to do the Markov chains in Ruby. Because there doesn’t seem to be any Ruby implementation, I ported Gary Burd’s Markov Chain Generator. For those interested, read on to see the code. If any Ruby gurus are out there, please help me improve my “Ruby-Fu”; I’m still a beginner in this language.

Here we go

    #  Markov Chain Generator
    # based on the Python version by Gary Burd: http://gary.burd.info/2003/11/markov-chain-generator.html
    # Released into the public domain, please keep this notice intact
    # (c) InVisible GmbH    
    # http://www.invisible.ch
    # 

# the library file name is lowercase; "YAML" fails on case-sensitive file systems
require "yaml"

class Array
  # return a random element of the array, similar to random.choice in python
  def choice
    self[ rand(self.size) ]
  end
end

class MarkovTool

  attr_accessor :markov_data

  def initialize( markov_data = nil )
    # use an unlikely combination for end of paragraph marker
    @nlnl = "#-#-"
    @markov_data = markov_data.is_a?( Hash ) ? markov_data : Hash.new
  end

  def new_key( key, word)
    return @nlnl if word == "\n" 
    return key if !word
    return word
  end


  def markov_data_from_words( words )
    key = @nlnl
    words.each do | word |
      @markov_data[ key ] ||= Array.new
      @markov_data[ key ] << word
      key = new_key( key, word )
    end
  end

  def words_from_markov_data
    key = @nlnl
    result = Array.new
    # repeat until we hit a newline or a full stop; drop the full-stop check to get paragraphs
    loop do
      # a key without recorded followers ends the chain (instead of appending nil)
      followers = @markov_data[ key ]
      break if followers.nil? || followers.empty?
      word = followers.choice
      key = new_key( key, word )
      result << word
      break if word == "\n" || word[-1] == "."[0]
    end
    result
  end

  # split a line into words; an empty line becomes a single "\n" (paragraph break)
  def words_from_string( line )
    words = line.split
    words.empty? ? [ "\n" ] : words
  end

  # read a file and return all of its words as one flat array
  def words_from_file( f )
    result = Array.new
    File.foreach( f ) do | line |
      result << self.words_from_string( line )
    end
    result.flatten
  end

  # build a paragraph out of the result array
  def paragraph_from_words( words )
    words.join( " " )
  end

  # return a complete paragraph
  def get_paragraph
    wo = self.words_from_markov_data

    self.paragraph_from_words( wo )
  end

  def store_in_yaml( f )
    YAML.dump( @markov_data, f )
  end

  def load_from_yaml( f )
    @markov_data = YAML.load( f )
  end

end

if __FILE__ == $0 then

  m = MarkovTool.new

  # read existing markov data
  # File.open( "markov.yaml" ) { | yf | m.load_from_yaml( yf ) }

  if ARGV[0]
    # if we got a filename, read it, process it and store the markov data
    w = m.words_from_file( ARGV[0] )
    m.markov_data_from_words( w )
    File.open( "markov.yaml", "w" ) { | yf | m.store_in_yaml( yf ) }
  end
  # create a paragraph and display it
  # avoid naming the variable p, which would shadow Kernel#p
  paragraph = m.get_paragraph
  puts paragraph
end
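
To make the stored data easier to picture, here is a small sketch of the kind of hash that ends up in markov.yaml. It does not use the class above; it rebuilds the same structure by hand, reusing the "#-#-" marker and the rule that an empty line (represented as "\n") resets the key. The input words and the variable names are invented for illustration.

```ruby
require "yaml"

# same end-of-paragraph marker as in the class above
start_marker = "#-#-"

# words as words_from_file would return them: one "\n" per empty line
words = ["Hello", "world.", "\n", "Hello", "Ruby."]

data = {}
key = start_marker
words.each do |word|
  # record the word as a follower of the current key
  (data[key] ||= []) << word
  # a paragraph break resets the key to the marker, otherwise the word becomes the key
  key = (word == "\n") ? start_marker : word
end

# data is now:
#   "#-#-"   => ["Hello", "Hello"]
#   "Hello"  => ["world.", "Ruby."]
#   "world." => ["\n"]
puts YAML.dump(data)
```

Every paragraph start is filed under the marker key, which is why generation always begins there.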

Entry Filed under: Ruby

7 Comments

  • 1. Markov Chains…  |  June 8th, 2006 at 12:25

    [...] I’m having fun with Markov chains and some texts. Here’s Wittgenstein, mixed in with Kant and McCartney markovized: [...]

  • 2. IT Новос…  |  January 9th, 2007 at 10:01

    [...] in Ruby: http://blog.invisible.ch/index.php?p=000361 Posted in Other | Trackback | del.icio.us | Top Of Page [...]

  • 3. Веб мо…  |  January 9th, 2007 at 10:08

    [...] Here you can look at another implementation in Python: http://burd.info/gary/2003/11/markov-chain-generator.html And here is an implementation in Ruby: http://blog.invisible.ch/index.php?p=000361 [...]

  • 4. codex  |  January 26th, 2007 at 11:58

    This site is related:

  • 5. Matheedubbas  |  December 9th, 2007 at 00:03

    Check my site: http://s3.dk

    I have made an implementation on SQL Server and C# that will render
    webpages with markov-linked content. I would like to do the same with
    MIDI. Anyone got any ideas for an application?

    regs,

    Matheedubbas

    http://s3.dk

  • 6. ramakrishna  |  June 21st, 2008 at 08:55

    s = raw_input("enter the string:")
    l = len(s)
    b = ""
    w = " "
    x = "~!@#$%^&*()+|}{]['<>?,./"
    m = len(x)
    for i in range(l):
        c = 0
        for j in range(m):
            if s[i] == x[j]:
                c = c + 1
            elif s[i] == "-" or s[i] == "/":
                w = w + b
        if c == 0:
            w = w + s[i]
    w = w.lower()
    print "w", w
    x = w.split(" ")
    y = float(len(x))
    d = {}
    y = []
    max = 0
    for i in x:
        d[i] = d.get(i, 0) + 1
    revitems = [(v, k) for k, v in d.items()]
    revitems.sort()
    revitems.reverse()
    x = len(revitems)
    index = 1
    rank = 1
    print "keys" "\t\t\t\t" "values" "\t\t" "rank"
    for v, k in revitems:
        temp = revitems[index][0]
        print "%-20s%15d%15d" % (k, v, rank)
        index += 1
        if temp
    m = len(d)
    print "total word count:", m - 1

  • 7. crunchytoast.com » …  |  May 1st, 2009 at 17:49

    [...] So! Check out the HMM Ruby Quiz and also Markov Chains in Ruby. These are examples of working code in under 50 or so lines. Also, for a clear if somewhat “mathy” explanation of Hidden Markov Processes and Markov Chains, check out Wikipedia’s HMM article. [...]
