Garbled Chinese text when scraping web pages in Java

Posted by tcr on 29 July 2010, 10:31 AM

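When I used net.htmlparser.jericho to scrape data from web pages, I found that Chinese text came out garbled whenever the page content was UTF-8. After reading a few references, this is the fix I ended up with: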

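Element Element_TD = (Element) it_TD.next();
Segment getContent_TD = (Segment) Element_TD.getContent();
// At this point a1 holds UTF-8 bytes that were wrongly decoded as ISO-8859-1
String a1 = getContent_TD.toString();
// Recover the original bytes, then decode them as UTF-8
String s2 = new String(a1.getBytes("ISO-8859-1"), "UTF-8");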

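The last line does the work: it turns the garbled string back into its ISO-8859-1 bytes and decodes those same bytes as UTF-8, which restores the Chinese text.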
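To see why that one line works, here is a minimal, self-contained sketch that uses only the JDK (no Jericho). The class name MojibakeDemo and the helpers garble and repair are made up for illustration; garble only simulates the bug by decoding UTF-8 bytes as ISO-8859-1, and it is an assumption that this matches what happened inside the parser in the original case.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Hypothetical helper: simulates the bug by decoding UTF-8 bytes as ISO-8859-1.
    static String garble(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Same idea as the last line of the snippet above: get the raw bytes back
    // via ISO-8859-1, then decode them with the charset they were really in.
    static String repair(String garbled) {
        byte[] rawBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character word for "Chinese"; escapes are used
        // so the result does not depend on the source file's own encoding.
        String original = "\u4e2d\u6587";
        String garbled = garble(original);
        String repaired = repair(garbled);

        System.out.println("garbled length:  " + garbled.length());
        System.out.println("round trip ok:   " + repaired.equals(original));
    }
}
```

The round trip is lossless because Java's ISO-8859-1 charset maps every byte value 0x00 to 0xFF to exactly one character. If the text was wrongly decoded with some other charset, bytes may already have been replaced and this trick may not recover them, so where possible it is safer to tell the reader or parser the correct encoding up front.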

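References:
逆思考: [JAVA][JSP] File Read/Write 中文亂碼 – yam天空部落 (on garbled Chinese when reading and writing files in Java/JSP)

字符，字节和编码 (Characters, Bytes, and Encoding)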

