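Converting PPT to PDF works by rendering each slide to an image and then building the PDF from those images. One problem comes up along the way: with both .ppt and .pptx files, Chinese text comes out garbled and is rendered as empty boxes. For .ppt files the fix is easy to find online: set every text run to one uniform font. For .pptx files, a slide that uses a single Chinese font renders fine, but a slide that mixes Microsoft YaHei (微軟雅黑) and SimSun (宋體) ends up with part of its Chinese text shown as boxes. My guess is that POI only reads the first font when it processes the slide, so text in the other Chinese fonts breaks.
I searched Baidu and Google for a long time. I saw a report on the Apache site calling this a bug, but the reply there was that it is a font problem. I think POI could probably handle this itself by reading the original font and applying it as the current font, although that would likely cost a fair amount of performance. I suspect many people have spent as much time as I did looking for a fix, and there is almost no ready-made solution online. I got there by trial and error. I won't cover the .ppt format because that fix is easy to find; for .pptx I could not find one anywhere.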
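Since the reply on the Apache side points at fonts, it can help to first confirm that the font you plan to use is actually visible to the JVM doing the rendering. The following is a minimal sketch of such a check using only the JDK; the class name `FontCheck` and both method names are made up for this illustration and are not part of POI or the original code.

```java
import java.awt.GraphicsEnvironment;

// Hypothetical helper (not from the original post): checks whether a font
// family name is among those the JVM reports as available.
final class FontCheck {

    /** Case-insensitive lookup of a family name in a list of names. */
    static boolean isListed(String[] families, String name) {
        for (String family : families) {
            if (family.equalsIgnoreCase(name)) {
                return true;
            }
        }
        return false;
    }

    /** Asks the local graphics environment for its installed font families. */
    static boolean isInstalled(String name) {
        String[] families = GraphicsEnvironment
                .getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        return isListed(families, name);
    }
}
```

Calling `FontCheck.isInstalled("SimSun")` before the conversion tells you whether that family is available. Family names are reported localized for the JVM's default locale, so the same font may appear under its English or its Chinese name depending on the machine.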
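The pptx rendered to an image before the fix:
The pptx rendered to an image after the fix:
Solution:
Walk through every shape and set its text to one uniform font. The code I found online did not work, so here is the version I modified myself: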
// i is the index of the current slide in the enclosing loop
for (XSLFShape shape : slide[i].getShapes()) {
    if (shape instanceof XSLFTextShape) {
        XSLFTextShape txtshape = (XSLFTextShape) shape;
        System.out.println("txtshape" + (i + 1) + ":" + txtshape.getShapeName());
        System.out.println("text:" + txtshape.getText());
        for (XSLFTextParagraph textPara : txtshape.getTextParagraphs()) {
            List<XSLFTextRun> textRunList = textPara.getTextRuns();
            for (XSLFTextRun textRun : textRunList) {
                // Force every run to the same font; the name should match a font installed on the machine
                textRun.setFontFamily("宋體");
            }
        }
    }
}
The complete code follows (apart from my fix above, most of it comes from Stack Overflow):
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.List;

import com.itextpdf.text.Document;
import com.itextpdf.text.Image;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfPCell;
import com.itextpdf.text.pdf.PdfPTable;
import com.itextpdf.text.pdf.PdfWriter;

import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.model.TextRun;
import org.apache.poi.hslf.usermodel.RichTextRun;
import org.apache.poi.hslf.usermodel.SlideShow;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import org.apache.poi.xslf.usermodel.XSLFTextParagraph;
import org.apache.poi.xslf.usermodel.XSLFTextRun;
import org.apache.poi.xslf.usermodel.XSLFTextShape;

// Place this method inside a class of your own.
public static void convertPPTToPDF(String sourcepath, String destinationPath, String fileType) throws Exception {
    FileInputStream inputStream = new FileInputStream(sourcepath);
    double zoom = 2; // render at twice the slide size for a sharper image
    AffineTransform at = new AffineTransform();
    at.setToScale(zoom, zoom);
    Document pdfDocument = new Document();
    PdfWriter pdfWriter = PdfWriter.getInstance(pdfDocument, new FileOutputStream(destinationPath));
    PdfPTable table = new PdfPTable(1);
    Dimension pgsize = null;
    Image slideImage = null;
    BufferedImage img = null;
    if (fileType.equalsIgnoreCase(".ppt")) {
        SlideShow ppt = new SlideShow(inputStream);
        inputStream.close();
        pgsize = ppt.getPageSize();
        Slide slide[] = ppt.getSlides();
        // Set the page size before opening the document so the first page uses it as well
        pdfDocument.setPageSize(new Rectangle((float) pgsize.getWidth(), (float) pgsize.getHeight()));
        pdfWriter.open();
        pdfDocument.open();
        for (int i = 0; i < slide.length; i++) {
            // .ppt fix: force every rich text run to one font
            TextRun[] truns = slide[i].getTextRuns();
            for (int k = 0; k < truns.length; k++) {
                RichTextRun[] rtruns = truns[k].getRichTextRuns();
                for (int l = 0; l < rtruns.length; l++) {
                    rtruns[l].setFontIndex(1);
                    rtruns[l].setFontName("宋體");
                }
            }
            img = new BufferedImage((int) Math.ceil(pgsize.width * zoom), (int) Math.ceil(pgsize.height * zoom), BufferedImage.TYPE_INT_RGB);
            Graphics2D graphics = img.createGraphics();
            graphics.setTransform(at);
            graphics.setPaint(Color.white);
            graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));
            slide[i].draw(graphics);
            graphics.dispose();
            slideImage = Image.getInstance(img, null);
            table.addCell(new PdfPCell(slideImage, true));
        }
    }
    if (fileType.equalsIgnoreCase(".pptx")) {
        XMLSlideShow ppt = new XMLSlideShow(inputStream);
        inputStream.close();
        pgsize = ppt.getPageSize();
        XSLFSlide slide[] = ppt.getSlides();
        pdfDocument.setPageSize(new Rectangle((float) pgsize.getWidth(), (float) pgsize.getHeight()));
        pdfWriter.open();
        pdfDocument.open();
        for (int i = 0; i < slide.length; i++) {
            // .pptx fix: force every text run in every text shape to one font
            for (XSLFShape shape : slide[i].getShapes()) {
                if (shape instanceof XSLFTextShape) {
                    XSLFTextShape txtshape = (XSLFTextShape) shape;
                    for (XSLFTextParagraph textPara : txtshape.getTextParagraphs()) {
                        List<XSLFTextRun> textRunList = textPara.getTextRuns();
                        for (XSLFTextRun textRun : textRunList) {
                            textRun.setFontFamily("宋體");
                        }
                    }
                }
            }
            img = new BufferedImage((int) Math.ceil(pgsize.width * zoom), (int) Math.ceil(pgsize.height * zoom), BufferedImage.TYPE_INT_RGB);
            Graphics2D graphics = img.createGraphics();
            graphics.setTransform(at);
            graphics.setPaint(Color.white);
            graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));
            slide[i].draw(graphics);
            // To inspect a slide image while debugging, write it out here:
            // javax.imageio.ImageIO.write(img, "jpg", new FileOutputStream("src/main/resources/test" + i + ".jpg"));
            graphics.dispose();
            slideImage = Image.getInstance(img, null);
            table.addCell(new PdfPCell(slideImage, true));
        }
    }
    pdfDocument.add(table);
    pdfDocument.close();
    pdfWriter.close();
    System.out.println("Powerpoint file converted to PDF successfully");
}
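The image step in the code above (a canvas `zoom` times the slide size, scaled and filled with white before the slide is drawn) can be tried on its own without POI or iText. Here is a minimal sketch of just that blank-canvas step; the class name `SlideCanvas` and the method `create` are hypothetical names chosen for this illustration.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

// Hypothetical helper (not from the original post): builds the white,
// zoomed canvas that a slide would later be drawn onto.
final class SlideCanvas {

    /** Returns a white image of ceil(width * zoom) x ceil(height * zoom) pixels. */
    static BufferedImage create(int pageWidth, int pageHeight, double zoom) {
        BufferedImage img = new BufferedImage(
                (int) Math.ceil(pageWidth * zoom),
                (int) Math.ceil(pageHeight * zoom),
                BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = img.createGraphics();
        // Scale so that drawing in slide coordinates fills the larger canvas
        graphics.setTransform(AffineTransform.getScaleInstance(zoom, zoom));
        graphics.setPaint(Color.white);
        graphics.fill(new Rectangle2D.Float(0, 0, pageWidth, pageHeight));
        graphics.dispose();
        return img;
    }
}
```

With a 720 x 540 slide and a zoom of 2, `SlideCanvas.create(720, 540, 2)` gives a 1440 x 1080 image. A larger zoom gives sharper text in the PDF at the cost of more memory per slide.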
Maven configuration:
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <!-- <version>3.13</version> -->
    <version>3.9</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <!-- <version>3.10-FINAL</version> -->
    <version>3.9</version>
</dependency>
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itextpdf</artifactId>
    <version>5.5.7</version>
</dependency>
<dependency>
    <groupId>com.itextpdf.tool</groupId>
    <artifactId>xmlworker</artifactId>
    <version>5.5.7</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <!-- <version>3.12</version> -->
    <version>3.9</version>
</dependency>
That is the fix for garbled Chinese text when converting PPT to PDF in Java. I hope it helps with your own work.