Category Archives: Lucene

Token attributes versus term attributes

As you add documents to IndexWriter, the indexed fields sooner or later end up as TokenStreams; their tokens and attributes are then collected, inverted, and added to TermsHash, which is an internal segment-like representation of new inverted document … Continue reading

Posted in Lucene

SIGIR 2012 paper

Together with gsingers and rmuir, we’re submitting a paper on Lucene 4 to SIGIR 2012. This is pretty exciting – it’s my first “formal” paper submitted to such an important conference. We realized that the IR community at large is no … Continue reading


Large stored fields in Lucene

I’ve been thinking recently about a NoSQL use case for Lucene. It looks like Lucene already satisfies many needs that other NoSQL platforms also support, of course with the added benefit of robust search functionality, which is something that other … Continue reading
