n-gram – Python implementation

More info: http://en.wikipedia.org/wiki/N-gram

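def n_gram(string, size=1):
    """
    Return the character n-grams of `string` as a list of strings.

    For 'hi there' and size=3:
    [string[i:] for i in range(size)]
    generates: ['hi there', 'i there', ' there']
    zip(*['hi there', 'i there', ' there'])
    generates: [('h', 'i', ' '), ('i', ' ', 't'),
                (' ', 't', 'h'), ('t', 'h', 'e'),
                ('h', 'e', 'r'), ('e', 'r', 'e')]
    """
    # zip stops at the shortest slice, so only full-length n-grams are produced
    ngram = zip(*[string[i:] for i in range(size)])
    return [''.join(i) for i in ngram]

n_gram("hi there", size=3)  # ['hi ', 'i t', ' th', 'the', 'her', 'ere']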
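
The same zip-over-shifted-slices trick is not limited to characters. As a rough sketch, assuming simple whitespace tokenization is good enough, it can produce word-level n-grams as well. The name `word_ngrams` is hypothetical and not part of the original snippet.

```python
def word_ngrams(text, size=2):
    """Return the word-level n-grams of `text` as a list of tuples.

    Hypothetical helper for illustration; assumes whitespace tokenization.
    """
    tokens = text.split()
    # tokens[i:] is the token list shifted left by i positions;
    # zip pairs the shifted lists up and stops at the shortest one.
    return list(zip(*[tokens[i:] for i in range(size)]))


print(word_ngrams("the quick brown fox", size=2))
# [('the', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
```

If `size` is larger than the number of tokens, the shortest slice is empty and the result is an empty list.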