I've been reading up on Unicode and UTF-8 encoding for a while and I think I understand it, so hopefully this won't be a stupid question:
I have a file which contains some CJK characters, and which has been saved as UTF-8. I have various Asian language packs installed and the characters are rendered properly by other applications, so I know that much works.
In my Java app, I read the file as follows:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

// Create objects: open the file and decode its bytes as UTF-8
FileInputStream fis = new FileInputStream(new File("xyz.sgf"));
InputStreamReader is = new InputStreamReader(fis, Charset.forName("UTF-8"));
BufferedReader br = new BufferedReader(is);

// Read the file line by line and accumulate the contents
StringBuffer sb = new StringBuffer();
String line;
while ((line = br.readLine()) != null) {
    sb.append(line);
}
System.out.println(sb);
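In case it matters, I assume the java.nio equivalent (a sketch on my part, assuming Java 7+ and Files.readAllLines with an explicit charset) would decode the same way:

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Read and decode the whole file as UTF-8 in one call
for (String l : Files.readAllLines(Paths.get("xyz.sgf"), StandardCharsets.UTF_8)) {
    System.out.println(l);
}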
The output shows the CJK characters as '???'. A call to is.getEncoding() confirms that the reader is definitely using UTF-8. What step am I missing to make the characters appear properly? If it makes a difference, I'm looking at the output in the Eclipse console.
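One thing I wondered is whether the problem is on the output side rather than the read. Here is a sketch of the check I have in mind, assuming the standard PrintStream(OutputStream, boolean, String) constructor, to see what charset System.out is actually encoding with and to force UTF-8 explicitly:

import java.io.PrintStream;
import java.nio.charset.Charset;

// The JVM's default charset is what System.out normally encodes with
System.out.println("Default charset: " + Charset.defaultCharset());

// Wrap stdout in an explicitly UTF-8 PrintStream and print again
PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
utf8Out.println(sb); // sb from the snippet above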