The snippets below show how to extract text from Word, Excel and PowerPoint documents with POI.
// Word text extraction
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("filename.doc"));
WordExtractor extractor = new WordExtractor(fs);
String wordText = extractor.getText();
// Excel text extraction
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("filename.xls"));
ExcelExtractor extractor = new ExcelExtractor(fs);
String excelText = extractor.getText();
// PowerPoint text extraction
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream("filename.ppt"));
PowerPointExtractor extractor = new PowerPointExtractor(fs);
String powerText = extractor.getText();
However, POI is not yet compatible with the Office 2007 file formats such as .docx, .xlsx and .pptx, but support is expected in the future.
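Since only the older binary formats are handled, a caller may want to check the file extension before picking an extractor. The following is a minimal sketch of that check in plain Java. It does not call POI at all, and the class name ExtractorChooser, the enum Kind and the method kindFor are hypothetical names made up for this illustration.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of POI: decides which extractor family
// (Word, Excel or PowerPoint) a file name belongs to.
final class ExtractorChooser {

    // The three legacy binary formats covered by the snippets above.
    enum Kind { WORD, EXCEL, POWERPOINT }

    // Returns the matching kind for .doc, .xls and .ppt (case-insensitive),
    // and an empty Optional for anything else, including .docx, .xlsx and .pptx.
    static Optional<Kind> kindFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return Optional.empty(); // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "doc": return Optional.of(Kind.WORD);
            case "xls": return Optional.of(Kind.EXCEL);
            case "ppt": return Optional.of(Kind.POWERPOINT);
            default:    return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] names = {"report.doc", "budget.XLS", "slides.ppt", "notes.docx"};
        for (String name : names) {
            String label = kindFor(name).map(Kind::name).orElse("unsupported");
            System.out.println(name + " -> " + label);
        }
    }
}
```

A caller could then switch on the returned Kind to create a WordExtractor, ExcelExtractor or PowerPointExtractor as shown earlier, and skip or report files for which the result is empty.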