I have finally started a Google Code project...
Tuesday, June 30, 2009
So, I have finally got round to setting up the HSBC Java API as a Google Code project (some of you may remember my posts from months back about a personal project for aggregating HSBC bank accounts). I have been working on this API on and off for a while now because of other commitments. The idea behind it is to give you easy access to your UK HSBC accounts and transaction history. So far I have found it useful for tracking my expenditure (by grouping transactions) and for sending notifications about the most recent transaction processed on my account. I hope other developers will find the project interesting and come up with their own ways to incorporate it into their applications.
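To give a rough idea of the two uses mentioned above, here is a minimal sketch of grouping transactions into per-category totals and picking out the most recent one. It does not use the HSBC Java API itself: the `Transaction` class, its fields, the category names and the sample data are all hypothetical stand-ins, so treat it as an illustration of the idea rather than how the project does it.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

public class ExpenditureSketch {

    // Hypothetical transaction type; NOT taken from the HSBC Java API.
    static final class Transaction {
        final LocalDate date;
        final String category;
        final BigDecimal amount;

        Transaction(LocalDate date, String category, BigDecimal amount) {
            this.date = date;
            this.category = category;
            this.amount = amount;
        }
    }

    // Sum the amounts of all transactions that share a category.
    static Map<String, BigDecimal> totalByCategory(List<Transaction> transactions) {
        Map<String, BigDecimal> totals = new TreeMap<>();
        for (Transaction t : transactions) {
            totals.merge(t.category, t.amount, BigDecimal::add);
        }
        return totals;
    }

    // Find the transaction with the latest date, e.g. to drive a notification.
    static Optional<Transaction> mostRecent(List<Transaction> transactions) {
        return transactions.stream()
                .max(Comparator.comparing((Transaction t) -> t.date));
    }

    // Made-up sample data, for illustration only.
    static List<Transaction> sampleTransactions() {
        return Arrays.asList(
                new Transaction(LocalDate.of(2009, 6, 1), "Groceries", new BigDecimal("30.00")),
                new Transaction(LocalDate.of(2009, 6, 15), "Travel", new BigDecimal("18.20")),
                new Transaction(LocalDate.of(2009, 6, 28), "Groceries", new BigDecimal("12.50")));
    }

    public static void main(String[] args) {
        List<Transaction> transactions = sampleTransactions();
        // Print one line per category with its total spend.
        totalByCategory(transactions).forEach(
                (category, total) -> System.out.println(category + ": " + total));
        // Print the latest transaction, if there is one.
        mostRecent(transactions).ifPresent(
                t -> System.out.println("Latest: " + t.date + " " + t.category + " " + t.amount));
    }
}
```

Amounts are kept as `BigDecimal` rather than `double` so that the totals do not pick up floating-point rounding errors.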
I have been playing around with extracting data from PDF files. Apache PDFBox looked pretty promising, but unfortunately I found it far behind some of the other libraries available. iText is a mature library, but it is really a PDF creator and lacks the ability to extract information. The library from LAB Asprise! impressed me far more: it took minutes to understand their API and start coding. The parsing is fast and, so far, appears accurate. The library is also extremely small for what it provides (just over 3 MB). If you are looking for a powerful Java API for processing PDFs, I strongly recommend it. Here is a code sample for extracting text (taken from their site), which shows how little code is needed:
PDFReader reader = new PDFReader(new File("my.pdf"));
reader.open(); // open the file.
int pages = reader.getNumberOfPages();
// Extract and print the text of each page in turn.
for (int i = 0; i < pages; i++) {
    String text = reader.extractTextFromPage(i);
    System.out.println("Page " + i + ": " + text);
}