Posts

Showing posts from May, 2016

Apache Tika + j2html + Bootstrap + Java

I was taking a look at Apache Tika today, which also gave me a chance to brush up my HTML and regex. I used a library called j2html, which made my output much easier to handle, and Bootstrap to make my table look neater. I ended up with the below kind of look and feel.

The columns are as follows:

1. File Name, along with a link to the file.
2. File Type string.
3. Metadata table: all metadata elements pulled out by Tika.
4. Categories: basically tags. Right now this is very rudimentary, but I will improve it soon. The categories are arrived at by analysing the content of the files.

Below is the code used to parse a file:

    Parser parser = new AutoDetectParser();                   // picks a parser based on the detected file type
    BodyContentHandler handler = new BodyContentHandler();    // collects the extracted body text
    Metadata metadata = new Metadata();                       // filled with the file's metadata
    FileInputStream inputStream = new FileInputStream(file);
    ParseContext context = new ParseContext();
    parser.parse(inputStream, handler, metadata, context);    // runs the parse

Once the above code is executed, handler.toString() can be used to get
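To make the four columns listed above concrete, here is a rough sketch of how one report row could be assembled. It is not the code from this post: the post built its HTML with j2html, while this sketch uses plain string building so it runs with the JDK alone. The class `ReportRow`, its method names, and the sample file are all hypothetical, and the CSS classes assume Bootstrap 3 (`table`, `table-bordered`, `table-condensed`).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the original post (which used j2html).
class ReportRow {

    // Escape the characters that are special in HTML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Column 3: a nested table with one row per metadata name/value pair.
    static String metadataTable(Map<String, String> metadata) {
        StringBuilder sb = new StringBuilder("<table class=\"table table-condensed\">");
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            sb.append("<tr><th>").append(escape(e.getKey())).append("</th><td>")
              .append(escape(e.getValue())).append("</td></tr>");
        }
        return sb.append("</table>").toString();
    }

    // One row: file name with link, file type, metadata table, categories.
    static String render(String fileName, String href, String fileType,
                         Map<String, String> metadata, List<String> categories) {
        return "<tr>"
            + "<td><a href=\"" + escape(href) + "\">" + escape(fileName) + "</a></td>"
            + "<td>" + escape(fileType) + "</td>"
            + "<td>" + metadataTable(metadata) + "</td>"
            + "<td>" + escape(String.join(", ", categories)) + "</td>"
            + "</tr>";
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration only.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("Content-Type", "application/pdf");
        metadata.put("title", "Q&A notes");
        String row = render("notes.pdf", "files/notes.pdf", "application/pdf",
                            metadata, Arrays.asList("notes", "pdf"));
        System.out.println("<table class=\"table table-bordered\">" + row + "</table>");
    }
}
```

In a real run, the metadata map would be filled from the Tika `Metadata` object and the file type from its content-type entry; escaping by hand as above is exactly the kind of chore a builder library such as j2html takes care of.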