distribution of words

The Tower – v2

[Image: the-tower]

I am starting to write some tools to examine a draft of The Tower. The photo above is an example of the frequency distribution of its words. What does that picture look like for different poems? Good question.
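For anyone who wants to try the same picture on another poem, here is a minimal sketch of a rank-frequency distribution using only the Python standard library. It is not necessarily the code in the repository: the function names `rank_frequency` and `text_plot`, the whitespace-based word splitting, and the sample sentence are all assumptions made for illustration.

```python
from collections import Counter


def rank_frequency(text):
    """Return the word counts sorted from most to least frequent."""
    # Assumption: words are separated by whitespace, and case is ignored.
    words = text.lower().split()
    counts = Counter(words)
    return sorted(counts.values(), reverse=True)


def text_plot(freqs, width=40):
    """Draw one row of '#' per rank, scaled to the largest count."""
    top = freqs[0] if freqs else 1
    lines = []
    for rank, freq in enumerate(freqs, start=1):
        bar = "#" * max(1, round(width * freq / top))
        lines.append(f"{rank:>4} {bar} {freq}")
    return "\n".join(lines)


# Hypothetical sample text, not a line from the poem.
sample = "the cat and the dog and the bird"
print(text_plot(rank_frequency(sample)))
```

A few very frequent words followed by a long tail of words used once is the shape to expect from most texts, so comparing how steeply the bars fall off is one way to compare poems.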

Here are some common words in v2 of The Tower. N.B. I had to turn all the words into lower case to make this work. [('the', 261), ('i', 221), ('and', 203), ('a', 142), ('my', 130), ('to', 124), ('of', 101), ('her', 100), ('you', 87), ('in', 85), ('me', 76), ('is', 75), ('she', 62), ('your', 56), ('it', 54), ('?', 54), ('not', 45), ('what', 43), ('love', 38), ('with', 37), ('on', 36), ('we', 35), ('but', 34), ('are', 32), ('for', 32), ('that', 32), ('have', 31), ('or', 29)]
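A list like the one above can be produced with a few lines of Python. The sketch below is an assumption about the approach, not the code from the repository: `tokenize` and `most_common_words` are hypothetical names, and the regular expression is one possible choice that lowercases the text and keeps punctuation marks such as '?' as their own tokens, which would explain why '?' shows up in the counts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words and punctuation marks."""
    # A word may contain internal apostrophes (don't); any other
    # non-space, non-word character becomes a token of its own.
    return re.findall(r"\w+(?:['’]\w+)*|[^\w\s]", text.lower())


def most_common_words(text, n=10):
    """Return the n most frequent tokens as (token, count) pairs."""
    return Counter(tokenize(text)).most_common(n)


# Hypothetical sample text, not a line from the poem.
sample = "What is love? The tower is what I love."
print(most_common_words(sample, 3))
```

Without the lowercasing step, "What" and "what" would be counted as two different words, which is the problem the N.B. above is describing.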

Here are some uncommon words: 'nice': 1, 'lore': 1, 'echos': 1, 'chorus': 1, 'chora': 1, 'corpus': 1, 'complement': 1, 'polyphonic': 1, 'syncopation': 1, 'melodic': 1, 'measure': 1, 'joined': 1, 'gregorian': 1, 'chat': 1, 'created': 1, 'chippewas': 1, 'yume': 1, 'indian': 1, 'warberler': 1, 'warbler': 1, 'skeleton': 1, 'scrum': 1

There are 1165 words used once, 1772 distinct words, and 5860 words in total. Some of these uncommon words are banal (nice), and most of the frequent words are function words (articles, pronouns, conjunctions, prepositions, copulas). I may run some other poems of similar length through this and see what pops up.
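The three totals can be computed from a token list in one pass. This is a minimal sketch under the assumption that the text has already been split into lowercase tokens; `vocabulary_stats` and the sample tokens are hypothetical and only there for illustration.

```python
from collections import Counter


def vocabulary_stats(tokens):
    """Count total tokens, distinct tokens, and tokens used exactly once."""
    counts = Counter(tokens)
    used_once = [word for word, count in counts.items() if count == 1]
    return {
        "total": sum(counts.values()),
        "distinct": len(counts),
        "used_once": len(used_once),
    }


# Hypothetical sample tokens, not a line from the poem.
tokens = "the tower and the warbler and the skeleton".split()
print(vocabulary_stats(tokens))
```

For the counts quoted above, the ratio of distinct words to total words is 1772 / 5860, or about 0.30, and roughly 66% of the distinct words (1165 of 1772) appear only once. Those two ratios are handy for comparing poems of different lengths.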

This stuff is pretty basic, but thought-provoking and useful in editing. I am going to go through now and perhaps delete many, if not all, of the most common words. I am going to look at the least common words and think about what sort of world they build. It is what I want.

I am thinking my next code jig will involve meter and rhythm.

I have posted the code on https://github.com/msrobot0/.
