In fields such as search engines and speech recognition, it is common to count how often each word occurs. Below is a Groovy implementation that prints the six most frequent words along with their counts:
def content = """
    The Java Collections API is the basis for all the nice support that Groovy gives you
    through lists and maps. In fact, Groovy not only uses the same abstractions, it
    even works on the very same classes that make up the Java Collections API.
"""

def words = content.tokenize()
def wordFrequency = [:]
words.each {
    wordFrequency[it] = wordFrequency.get(it, 0) + 1
}
def wordList = wordFrequency.keySet().toList()
wordList.sort { wordFrequency[it] }
def result = ''
wordList[-1..-6].each {
    result += it.padLeft(12) + ": " + wordFrequency[it] + "\n"
}

println result
Output:
         the: 5
      Groovy: 2
        that: 2
 Collections: 2
        Java: 2
        same: 2

If the text to be processed is more complex, you can use regular expressions to handle it. Incidentally, Groovy supports regex at the language level!
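As a rough sketch of the regex approach, the variant below extracts words with a pattern instead of splitting on whitespace, so tokens such as "maps." and "fact," lose their trailing punctuation. The pattern `[A-Za-z]+` and the lower-casing step are assumptions made for this illustration, not part of the original listing; real text may need a broader pattern.

```groovy
// Same sample text as in the listing above.
def content = """
    The Java Collections API is the basis for all the nice support that Groovy gives you
    through lists and maps. In fact, Groovy not only uses the same abstractions, it
    even works on the very same classes that make up the Java Collections API.
"""

// A slashy string (/.../) is Groovy's literal syntax for regex patterns.
// String.findAll(regex) returns every match as a list of Strings.
def words = content.findAll(/[A-Za-z]+/)

def wordFrequency = [:]
words.each { word ->
    // Assumption of this sketch: treat "The" and "the" as the same word.
    def key = word.toLowerCase()
    wordFrequency[key] = wordFrequency.get(key, 0) + 1
}

println wordFrequency['the']   // 6 ("The" once plus "the" five times)
println wordFrequency['api']   // 2 ("API" and "API." now count together)
```

With this tokenization the counts differ from the whitespace-based version: "the" rises to 6 and "API" is counted twice, because case and punctuation no longer split a word into separate entries.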
Posted on 2007-02-01 23:31 by 山風小子
Category: Groovy & Grails