    Problem Statement

    In written languages, some symbols may appear more often than others. Expected frequency tables have been defined for many languages. For each symbol in a language, a frequency table will contain its expected percentage in a typical passage written in that language. For example, if the symbol "a" has an expected percentage of 5, then 5% of the letters in a typical passage will be "a". If a passage contains 350 letters, then 'a' has an expected count of 17.5 for that passage (17.5 = 350 * 5%). Please note that the expected count can be a non-integer value.
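The expected-count arithmetic above can be sketched in a few lines of Java. The class and method names here are hypothetical and do not come from the problem statement; the only assumption is that percentages arrive as whole numbers, as they do in the frequency tables.

```java
// Hypothetical helper for illustration; not part of the original statement.
final class ExpectedCountSketch {

    /** Expected count of a symbol: passage length times its percentage. */
    static double expectedCount(int letters, int percent) {
        // Dividing by 100.0 keeps the result a double, so 17.5 is not truncated.
        return letters * percent / 100.0;
    }

    public static void main(String[] args) {
        System.out.println(expectedCount(350, 5)); // prints 17.5
    }
}
```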
    The deviation of a text with respect to a language frequency table can be computed in the following manner. For each letter ('a'-'z') determine the difference between the expected count and the actual count in the text. The deviation is the sum of the squares of these differences. Blank spaces (' ') and line breaks (each element of text is a line) are ignored when calculating percentages.
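As a minimal sketch of the deviation described above, the following assumes the table has already been turned into an array of 26 percentages (index 0 for 'a', index 25 for 'z'). That array layout and the names `countLetters` and `deviation` are illustrative choices, not something the statement prescribes.

```java
// Illustrative sketch; the 26-slot array layout is an assumption.
final class DeviationSketch {

    /** Counts 'a'-'z' across all lines; spaces are skipped. */
    static int[] countLetters(String[] text) {
        int[] counts = new int[26];
        for (String line : text) {
            for (char c : line.toCharArray()) {
                if (c >= 'a' && c <= 'z') {
                    counts[c - 'a']++;
                }
            }
        }
        return counts;
    }

    /** Sum over all 26 letters of (expected count - actual count) squared. */
    static double deviation(int[] percent, int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        double sum = 0.0;
        for (int i = 0; i < 26; i++) {
            double diff = percent[i] * total / 100.0 - counts[i];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] percent = new int[26];
        percent[0] = 30; // 'a'
        percent[1] = 30; // 'b'
        percent[2] = 40; // 'c'
        int[] counts = countLetters(new String[] {"aa bbbb cccc"});
        System.out.println(deviation(percent, counts)); // prints 2.0
    }
}
```

Looping over all 26 letters, rather than only those in the table, is what picks up letters that occur in the text but are not expected at all.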
    Each frequency table will be described as a concatenation of up to 16 strings of the form "ANN", where A is a lowercase letter ('a'-'z') and NN its expected frequency as a two-digit percentage between "00" (meaning 0%) and "99" (meaning 99%), inclusive. Any letter not appearing in a table is not expected to appear in a typical passage (0%). You are given a String[] frequencies of frequency tables of different languages. Return the lowest deviation the given text has with respect to the frequency tables.
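One way to read the "ANN" format is to walk the string three characters at a time. The sketch below assumes well-formed input, as the constraints guarantee; `parseTable` is a hypothetical name, and the 26-slot result array is an illustrative choice.

```java
// Illustrative sketch; assumes the input follows the "ANN" format exactly.
final class TableParserSketch {

    /** Parses e.g. "a20b40c40" into percentages indexed by letter ('a' = 0). */
    static int[] parseTable(String table) {
        int[] percent = new int[26]; // letters absent from the table stay at 0
        for (int i = 0; i + 2 < table.length(); i += 3) {
            char letter = table.charAt(i);
            int tens = table.charAt(i + 1) - '0';
            int ones = table.charAt(i + 2) - '0';
            percent[letter - 'a'] = tens * 10 + ones;
        }
        return percent;
    }

    public static void main(String[] args) {
        int[] percent = parseTable("a20b40c40");
        System.out.println(percent[0]);  // prints 20
        System.out.println(percent[1]);  // prints 40
        System.out.println(percent[25]); // prints 0, since 'z' is not in the table
    }
}
```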
    Definition

    Class: SymbolFrequency
    Method: language
    Parameters: String[], String[]
    Returns: double
    Method signature: double language(String[] frequencies, String[] text)
    (be sure your method is public)


    Notes
    - The returned value must be accurate to within a relative or absolute value of 1E-9.
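The statement does not say how the checker applies this tolerance. A common reading is that an answer passes if either its absolute error or its relative error is at most 1E-9; the helper below is a sketch under that assumption, with a hypothetical name.

```java
// Sketch of one plausible tolerance check; the real checker is not described
// in the statement, so this is an assumption.
final class ToleranceSketch {

    static final double EPS = 1e-9;

    /** True if actual is within EPS of expected, absolutely or relatively. */
    static boolean closeEnough(double expected, double actual) {
        double absErr = Math.abs(expected - actual);
        return absErr <= EPS || absErr <= EPS * Math.abs(expected);
    }

    public static void main(String[] args) {
        System.out.println(closeEnough(10.8, 10.8 + 1e-12)); // prints true
        System.out.println(closeEnough(2.0, 2.1));           // prints false
    }
}
```

Under this reading, a plain double computed from the sums of squares can be returned as is, with no rounding or string formatting.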
    Constraints
    - frequencies will contain between 1 and 10 elements, inclusive.
    - Each element of frequencies will be formatted as described in the statement.
    - Each element of frequencies will contain between 6 and 48 characters, inclusive.
    - No letter will appear twice in the same element of frequencies.
    - The sum of the percentages in each element of frequencies will be equal to 100.
    - text will contain between 1 and 10 elements, inclusive.
    - Each element of text will contain between 1 and 50 characters, inclusive.
    - Each element of text will contain only lowercase letters ('a'-'z') and spaces (' ').
    - text will have at least one non-space character.
    Examples
    0)


    {"a30b30c40","a20b40c40"}
    {"aa bbbb cccc"}
    Returns: 0.0
    The first table indicates that 30% of the letters are expected to be 'a', 30% to be 'b', and 40% to be 'c'. The second table indicates that 20% are expected to be 'a', 40% to be 'b', and 40% to be 'c'. We consider the text to have length 10, as blank spaces are ignored. With respect to the first table, there are 2 'a' where 3 were expected (a difference of 1), one more 'b' than expected (again a difference of 1) and as many 'c' as expected. The sum of the squares of those numbers gives a deviation of 2.0. As for the second table, the text matches expected counts exactly, so its deviation with respect to that language is 0.0.
    1)


    {"a30b30c40","a20b40c40"}
    {"aaa bbbb ccc"}
    Returns: 2.0
    Here we use the same tables as in the previous example, but with a different text. The counts for 'b' and 'c' each differ by 1 from the expected counts in the first table, and the counts for 'a' and 'c' each differ by 1 from the expected counts in the second table. The text has a deviation of 2.0 with respect to both tables.
    2)


    {"a10b10c10d10e10f50"}
    {"abcde g"}
    Returns: 10.8
    Here, each of the letters 'a' through 'e' is expected to make up 10% of the letters (0.6 letters). Each of those letters actually appears once, so the difference is 0.4, which becomes 0.16 when squared. 50% of the letters (3 letters) are expected to be 'f', but 'f' does not appear at all. The square of this difference is 9.0. No 'g's are expected to appear, but there is one in the text. This adds 1.0 to the deviation. The final deviation for this table is: 0.16+0.16+0.16+0.16+0.16+9.0+1.0=10.8.
    3)


    {"a09b01c03d05e20g01h01i08l06n08o06r07s09t08u07x01"
    ,"a14b02c05d06e15g01h01i07l05n07o10r08s09t05u04x01"}
    {"this text is in english"
    ,"the letter counts should be close to"
    ,"that in the table"}
    Returns: 130.6578
    These two frequency tables correspond (roughly) to the frequencies found in the English and Spanish languages, respectively. The English passage, as expected, has a lower deviation in the first table than in the second one.
    4)


    {"a09b01c03d05e20g01h01i08l06n08o06r07s09t08u07x01"
    ,"a14b02c05d06e15g01h01i07l05n07o10r08s09t05u04x01"}
    {"en esta es una oracion en castellano"
    ,"las ocurrencias de cada letra"
    ,"deberian ser cercanas a las dadas en la tabla"}
    Returns: 114.9472
    The same tables again, but with a Spanish passage. This time the second table, which corresponds to frequencies in Spanish, gives the lowest deviation.
    5)

    {"z99y01", "z99y01", "z99y01", "z99y01", "z99y01",
     "z99y01", "z99y01", "z99y01", "z99y01", "z99y01"}
    {"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
     "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"}
    Returns: 495050.0
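The statement gives no explanation for this last example, so here is a short sketch of the arithmetic. All ten tables are identical, and the text is 500 copies of 'a': 495 'z' and 5 'y' are expected but absent, while 500 'a' are present but unexpected. The class name is hypothetical.

```java
// Worked arithmetic for example 5; names are illustrative only.
final class ExampleFiveSketch {

    static double deviation() {
        int letters = 10 * 50;              // ten lines of fifty 'a' characters
        double z = 99 * letters / 100.0;    // 495 'z' expected, none present
        double y = 1 * letters / 100.0;     // 5 'y' expected, none present
        double a = letters;                 // 500 'a' present, none expected
        return z * z + y * y + a * a;       // 245025 + 25 + 250000
    }

    public static void main(String[] args) {
        System.out.println(deviation()); // prints 495050.0
    }
}
```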

    This problem statement is the exclusive and proprietary property of TopCoder, Inc. Any unauthorized use or reproduction of this information without the prior written consent of TopCoder, Inc. is strictly prohibited. (c)2003, TopCoder, Inc. All rights reserved.

    import java.util.HashMap;
    import java.util.Map;

    /**
     *
     * @author Nicky Qu
     * All Rights Reserved
     */
    public class SymbolFrequency {

        public double language(String[] frequencies, String[] text) {
            // Count every letter in the text; spaces are ignored.
            // Local state keeps repeated calls independent of each other.
            Map<Character, Integer> patternMap = new HashMap<Character, Integer>();
            int total = 0;
            for (String line : text) {
                for (char c : line.toCharArray()) {
                    if (c == ' ') {
                        continue;
                    }
                    Integer old = patternMap.get(c);
                    patternMap.put(c, old == null ? 1 : old + 1);
                    total++;
                }
            }

            double frequency = Double.POSITIVE_INFINITY;
            for (String table : frequencies) {
                // Parse the "ANN" triples into letter -> expected percentage.
                // The map is rebuilt for every table so that letters from an
                // earlier table do not leak into this one.
                Map<Character, Integer> percent = new HashMap<Character, Integer>();
                for (int j = 0; j + 2 < table.length(); j += 3) {
                    percent.put(table.charAt(j), Integer.parseInt(table.substring(j + 1, j + 3)));
                }

                // Sum the squared differences over all 26 letters. A letter
                // missing from the table has an expected count of 0, and a
                // letter missing from the text has an actual count of 0.
                double tempFrequency = 0.0;
                for (char c = 'a'; c <= 'z'; c++) {
                    double expected = (percent.containsKey(c) ? percent.get(c) : 0) * total / 100.0;
                    int actual = patternMap.containsKey(c) ? patternMap.get(c) : 0;
                    tempFrequency += (expected - actual) * (expected - actual);
                }
                frequency = Math.min(frequency, tempFrequency);
            }
            // Return the double directly; formatting it to a string and parsing
            // it back only discards precision.
            return frequency;
        }
    }

    posted on 2007-10-21 20:14 wqwqwqwqwq  Views (926)  Comments (1)  Category: Data Structure && Algorithm

    FeedBack:
    # re: TopCoder TCHS2
    2007-10-22 13:48 | 曲強 Nicky
    public class SymbolFrequency {

        public double language(String[] frequencies, String[] text) {
            // Join all lines, dropping the spaces that the statement says to ignore.
            String s = "";
            for (String g : text)
                s += g.replaceAll(" ", "");
            char[][] lett = new char[frequencies.length][];
            int[][] perc = new int[frequencies.length][];
            double best = Double.POSITIVE_INFINITY;
            for (int i = 0; i < frequencies.length; i++) {
                // Split the table into its letters and two-digit percentages.
                lett[i] = new char[frequencies[i].length() / 3];
                perc[i] = new int[frequencies[i].length() / 3];
                for (int j = 0; j < frequencies[i].length(); j += 3) {
                    lett[i][j / 3] = frequencies[i].charAt(j);
                    perc[i][j / 3] = (frequencies[i].charAt(j + 1) - '0') * 10 + (frequencies[i].charAt(j + 2) - '0');
                }
                String dict = s;
                double curr = 0;
                int len = s.length();
                // Letters listed in the table: deleting a letter from dict
                // shrinks it by exactly that letter's actual count.
                for (int j = 0; j < lett[i].length; j++) {
                    dict = dict.replaceAll(lett[i][j] + "", "");
                    curr += Math.pow((len - dict.length()) - (perc[i][j] * .01 * s.length()), 2);
                    len = dict.length();
                }
                // Letters still left in dict have an expected count of 0.
                // The bound must be <= 'z'; with < 'z' an unlisted 'z' was skipped.
                for (char j = 'a'; j <= 'z'; j++) {
                    dict = dict.replaceAll(j + "", "");
                    curr += Math.pow(len - dict.length(), 2);
                    len = dict.length();
                }
                best = Math.min(curr, best);
            }
            return best;
        }
    }