A supplementary note for the record.

Specifying the encoding

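    // Needs java.io.* and java.nio.charset.StandardCharsets.
    StringBuilder sb = new StringBuilder();
    // Specify the encoding on the reader to avoid garbled text.
    try (Reader in = new InputStreamReader(new FileInputStream("aaa.txt"), StandardCharsets.UTF_8)) {
        char[] data = new char[1024 * 8];
        int n;
        while ((n = in.read(data)) > 0) {
            sb.append(data, 0, n); // append only the characters actually read
        }
    }
    System.out.println(sb.toString());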
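
Why decode through a reader rather than calling new String(data, "utf8") on each byte buffer: a multi-byte character can be split across two buffers, and each half then decodes to garbage. The sketch below is only an illustration. The class name ChunkedDecodeDemo and both method names are made up for this note, and the per-chunk method imitates a byte read loop over an in-memory array instead of a real file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original note.
class ChunkedDecodeDemo {

    // Imitates a byte read loop: every chunk of at most bufferSize bytes
    // is decoded on its own, so a character split across chunks is lost.
    static String decodePerChunk(byte[] bytes, int bufferSize) {
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < bytes.length; off += bufferSize) {
            int len = Math.min(bufferSize, bytes.length - off);
            sb.append(new String(bytes, off, len, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Decodes through an InputStreamReader, which carries incomplete
    // byte sequences over to the next read.
    static String decodeWithReader(byte[] bytes, int bufferSize) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            char[] data = new char[bufferSize];
            int n;
            while ((n = reader.read(data)) > 0) {
                sb.append(data, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u7f16\u7801"; // two CJK characters, 3 bytes each in UTF-8
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        // A 4-byte buffer deliberately cuts the second character in half.
        System.out.println("per-chunk matches: " + decodePerChunk(bytes, 4).equals(text));
        System.out.println("reader matches:    " + decodeWithReader(bytes, 4).equals(text));
    }
}
```

With a 4-byte buffer the per-chunk result no longer equals the original string, while the reader-based result does. An 8 KB buffer makes the problem rarer, but it can still occur at a buffer boundary.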

Detecting an unknown encoding

1. Dependency

    <!-- https://mvnrepository.com/artifact/net.sourceforge.cpdetector/cpdetector -->
    <dependency>
        <groupId>net.sourceforge.cpdetector</groupId>
        <artifactId>cpdetector</artifactId>
        <version>1.0.7</version>
    </dependency>

2. Utility method

    // Needs java.io.File and the cpdetector classes
    // (assumed to live in info.monitorenter.cpdetector.io).
    public static String getFileEncode(String filePath) {
        String charsetName = null;
        try {
            File file = new File(filePath);
            CodepageDetectorProxy detector = CodepageDetectorProxy.getInstance();
            detector.add(new ParsingDetector(false));
            detector.add(JChardetFacade.getInstance());
            detector.add(ASCIIDetector.getInstance());
            detector.add(UnicodeDetector.getInstance());
            java.nio.charset.Charset charset = detector.detectCodepage(file.toURI().toURL());
            if (charset != null) {
                charsetName = charset.name();
            } else {
                charsetName = "UTF-8"; // default when nothing is detected
            }
        } catch (Exception ex) {
            ex.printStackTrace();
            return null; // detection failed
        }
        return charsetName;
    }
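
Once the charset name is known, it can be used to read the file. Because getFileEncode returns null when detection fails, and a detected name is not guaranteed to be supported by the running JVM, a fallback is worth having. The following is a rough sketch using only the JDK. The class DetectedCharsetReader and its methods are hypothetical names for this note, and falling back to UTF-8 is an assumption that mirrors the default used above.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class, not part of cpdetector.
class DetectedCharsetReader {

    // Turns a possibly null, malformed, or unsupported charset name
    // into a usable Charset, falling back to UTF-8 (an assumption).
    static Charset resolveCharset(String charsetName) {
        if (charsetName == null) {
            return StandardCharsets.UTF_8;
        }
        try {
            return Charset.forName(charsetName);
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException.
            return StandardCharsets.UTF_8;
        }
    }

    // Decodes a complete byte array in one step, so no character
    // can be split across buffers.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, resolveCharset(charsetName));
    }

    // Reads the whole file into memory and decodes it with the given charset name.
    static String readFile(String filePath, String charsetName) throws IOException {
        return decode(Files.readAllBytes(Paths.get(filePath)), charsetName);
    }
}
```

A call might look like DetectedCharsetReader.readFile("aaa.txt", getFileEncode("aaa.txt")). Reading the whole file into memory is only reasonable for small files; for large ones, the reader loop from the first section with resolveCharset(...) as the charset is the safer choice.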