Tuesday, 14. August 2007
Lucene Performance

I started a proof of concept to see whether Lucene could be used for searching log files. Here is a first result from the indexer (see also the benchmarks on the Lucene site):

    Hardware Environment

  • Dedicated machine for indexing: yes
  • CPU: Intel Pentium M, 1.6 GHz, 1 processor
  • RAM: 2 GB
  • Drive configuration: IDE 2.5" hard disk (in a Dell Latitude D810 notebook)

    Software Environment

  • Lucene Version: 2.2.0
  • Java Version: Java SE 1.6.0_02
  • Java VM: client VM
  • OS Version: WinXP with SP1
  • Location of index: local

    Lucene indexing variables

  • Number of source documents: 9
  • Total filesize of source documents: 15 MB
  • Average filesize of source documents: 2 MB
  • Source documents storage location: Filesystem
  • File type of source documents: log files
  • Parser(s) used, if any:
  • Analyzer(s) used: StandardAnalyzer
  • Number of fields per document: 3
  • Type of fields: text
  • Index persistence: FSDirectory
  • Index size: 3 MB

    Figures

  • Time taken (average of at least 3 indexing runs): 7 s (the first run took 37 s and was ignored)
  • Time taken / 1000 docs indexed: 150 s (estimated)
  • Memory consumption: started with -Xmx128m -Xms128m
  • Query speed: not yet measured
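
The averaging used for the time figure (several runs, first run thrown away as warm-up) can be sketched in plain Java. This is only a rough sketch and not the code I actually used: the class IndexTiming, its two methods and the placeholder workload are made up for illustration, and the real indexing call would go where the placeholder is.

```java
// Hypothetical helper, not part of Lucene: times several runs of a task
// and averages them while ignoring the first (warm-up) run.
final class IndexTiming {

    /** Average of all values except the first; needs at least two values. */
    static double averageIgnoringFirst(long[] millis) {
        if (millis.length < 2) {
            throw new IllegalArgumentException(
                "need a warm-up run plus at least one measured run");
        }
        long sum = 0;
        for (int i = 1; i < millis.length; i++) {
            sum += millis[i];
        }
        return (double) sum / (millis.length - 1);
    }

    /** Runs the task the given number of times; returns each duration in ms. */
    static long[] timeRuns(Runnable task, int runs) {
        long[] millis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption); replace with the real indexing call.
        Runnable placeholder = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };
        long[] millis = timeRuns(placeholder, 4);
        System.out.println("average (ms): " + averageIgnoringFirst(millis));
    }
}
```

Dropping the first run keeps one-off costs such as class loading, JIT warm-up and a cold disk cache out of the average, which is presumably why the first try was so much slower.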

    Notes

  • Note: first prototype, no special tuning/strategies
  • Note: maxFieldLength set to 1,000,000 (default of 10,000 was too small)
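
Why the default was too small: maxFieldLength limits how many terms of a single field get indexed, and everything after that limit is silently dropped, so the tail of a large log file would not be searchable. The following is a plain-Java illustration of that cut-off, not Lucene code. The class FieldLengthDemo and its method are invented for this sketch, and the whitespace split is only a stand-in for what StandardAnalyzer really does.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration, not Lucene code: mimics how a per-field term
// limit drops everything after the first maxFieldLength terms.
final class FieldLengthDemo {

    /** Splits on whitespace (a simplification) and keeps at most maxFieldLength terms. */
    static List<String> indexedTerms(String text, int maxFieldLength) {
        List<String> kept = new ArrayList<>();
        for (String term : text.trim().split("\\s+")) {
            if (kept.size() >= maxFieldLength) {
                break; // terms beyond the limit are never indexed
            }
            if (!term.isEmpty()) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String log = "INFO start INFO step ERROR disk full";
        // With a limit of 4 the ERROR entry is cut off and cannot be found.
        System.out.println(indexedTerms(log, 4).contains("ERROR"));
        // With a generous limit the whole line is kept.
        System.out.println(indexedTerms(log, 1_000_000).contains("ERROR"));
    }
}
```

In Lucene 2.2 itself the limit should be adjustable on the writer via IndexWriter.setMaxFieldLength(int), called before adding documents.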
