How to handle Chinese text when reading a PDF with Java

2024-12-20 06:49:21
Recommended answers (1)
Answer 1:

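You can read or create PDFs containing Chinese text by using a system font directly. Pay attention to the versions of the jars you depend on.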

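<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itextpdf</artifactId>
    <version>5.5.8</version>
</dependency>
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itext-asian</artifactId>
    <version>5.2.0</version>
</dependency>
<dependency>
    <groupId>com.itextpdf.tool</groupId>
    <artifactId>xmlworker</artifactId>
    <version>5.5.6</version>
</dependency>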

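The code is below. Overriding XMLWorkerFontProvider#getFont so that it returns a font with Chinese glyphs is enough for the Chinese text in the XHTML input to be rendered: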
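import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import com.itextpdf.text.BaseColor;
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Font;
import com.itextpdf.text.pdf.BaseFont;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorkerFontProvider;
import com.itextpdf.tool.xml.XMLWorkerHelper;

public void createPdf(String src, String dest) throws IOException, DocumentException {
    // try-with-resources closes both streams even if parsing fails
    try (InputStream in = new FileInputStream(src);
         OutputStream out = new FileOutputStream(dest)) {
        Document document = new Document();
        PdfWriter writer = PdfWriter.getInstance(document, out);
        document.open();
        // The null argument means "no separate CSS file"
        XMLWorkerHelper.getInstance().parseXHtml(writer, document, in, (InputStream) null,
                new XMLWorkerFontProvider() {
                    @Override
                    public Font getFont(final String fontname, final String encoding,
                            final boolean embedded, final float size, final int style,
                            final BaseColor color) {
                        try {
                            // Ignore the requested font name and always use the Chinese system font
                            BaseFont bf = BaseFont.createFont("C:/Windows/Fonts/SIMYOU.TTF",
                                    BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
                            return new Font(bf, size, style, color);
                        } catch (Exception e) {
                            e.printStackTrace();
                            // Fall back to the default lookup if the font file cannot be loaded
                            return super.getFont(fontname, encoding, embedded, size, style, color);
                        }
                    }
                });
        document.close();
    }
}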

When creating a PDF, simply use a system font (the path below is for Windows):
BaseFont baseFont = BaseFont.createFont("C:/Windows/Fonts/SIMYOU.TTF",BaseFont.IDENTITY_H,BaseFont.NOT_EMBEDDED);
Font font = new Font(baseFont);
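
Both snippets above hard-code C:/Windows/Fonts/SIMYOU.TTF, so they fail on machines where that file is missing. One possible way around this, as a minimal sketch: check a list of candidate paths and use the first one that exists. FontLocator is a hypothetical helper, not part of iText, and every candidate path other than the one from the answer is an assumption that you would replace with fonts actually installed on your system.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper (not part of iText): finds the first usable font file. */
final class FontLocator {

    // Only the first entry comes from the answer; the others are assumed
    // example locations and may not exist on your machine.
    static final List<String> DEFAULT_CANDIDATES = Arrays.asList(
            "C:/Windows/Fonts/SIMYOU.TTF",
            "C:/Windows/Fonts/simhei.ttf",
            "/usr/share/fonts/chinese/SIMYOU.TTF");

    private FontLocator() {
    }

    /**
     * Returns the first candidate that is a readable regular file,
     * or an empty Optional if none of them is.
     */
    static Optional<String> firstExisting(List<String> candidates) {
        for (String candidate : candidates) {
            Path path = Paths.get(candidate);
            if (Files.isRegularFile(path) && Files.isReadable(path)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

The result of FontLocator.firstExisting(FontLocator.DEFAULT_CANDIDATES) could then be passed to BaseFont.createFont in place of the literal path. If the Optional is empty, it is probably better to fail with a clear error message than to silently produce a PDF with missing Chinese glyphs.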