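How to count word frequency in a text string with Python
This article walks through an example of counting how often each word appears in a text string using Python. It is shared here for reference. The implementation is as follows: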
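# word frequency in a text
# tested with Python24    vegaseat    25aug2005
# Chinese wisdom ...
str1 = """Man who run in front of car, get tired.
Man who run behind car, get exhausted."""
# print(...) with a single argument behaves the same on Python 2 and Python 3
print("Original string:")
print(str1)
print("")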
# create a list of words separated at whitespaces
wordList1 = str1.split(None)
# strip any punctuation marks and build modified word list
# start with an empty list
wordList2 = []
for word1 in wordList1:
    # last character of each word
    lastchar = word1[-1:]
    # use a list of punctuation marks
    if lastchar in [",", ".", "!", "?", ";"]:
        word2 = word1.rstrip(lastchar)
    else:
        word2 = word1
    # build a wordList of lower case modified words
    wordList2.append(word2.lower())
print("Word list created from modified string:")
print(wordList2)
print("")
# create a word frequency dictionary
# start with an empty dictionary
freqD2 = {}
for word2 in wordList2:
    freqD2[word2] = freqD2.get(word2, 0) + 1
# create a sorted list of keys
# all words are lower case already
# sorted() returns a new list, so this also works where keys() is not a list
keyList = sorted(freqD2.keys())
print("Frequency of each word in the word list (sorted):")
for key2 in keyList:
    print("%-10s %d" % (key2, freqD2[key2]))
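For comparison, the same count can be sketched more compactly on Python 3 with the standard library's collections.Counter and re.findall. This is only an illustrative alternative, not the method of the listing above: the function name word_frequencies and the variable text are hypothetical, and the pattern \w+ is an assumption about what counts as a word (it drops punctuation anywhere around a word, but it also splits a contraction such as "don't" into two tokens).

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercase word to its count."""
    # \w+ matches runs of letters, digits and underscores, so punctuation
    # around a word is dropped rather than only one trailing character.
    words = re.findall(r"\w+", text.lower())
    return Counter(words)


# Same sample text as in the listing above.
text = """Man who run in front of car, get tired.
Man who run behind car, get exhausted."""

freq = word_frequencies(text)

# Print the words alphabetically with their counts.
for word in sorted(freq):
    print("%-10s %d" % (word, freq[word]))
```

Counter is a dict subclass, so freq["man"] reads just like freqD2["man"] in the dictionary version, and freq.most_common() can be used when the words should be ordered by count instead of alphabetically.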
I hope this article helps with your Python programming.