Configure full text indexing

From DSpace Wiki

Files:

  • [dspace]/bin/filter-media
  • [dspace]/config/dspace.cfg

Instructions:

  1. First, follow the instructions in Configure media filters, since full text indexing is performed by those media filters (specifically the HTMLFilter, PDFFilter, and WordFilter).
  2. You may wish to modify the search.maxfieldlength field in your dspace.cfg configuration file. This field specifies the maximum number of words to index for each document and defaults to the first 10,000 words. (Set it to -1 to index an unlimited number of words.)

    search.maxfieldlength = 10000

  3. If you choose to modify the search.maxfieldlength field, you must Re-index DSpace before the change will take effect.
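
The effect of search.maxfieldlength can be pictured with a small Python sketch. This is only an illustration of the semantics described in step 2, not DSpace's own code: the function names parse_max_field_length and words_to_index are hypothetical, and splitting on whitespace is an assumption standing in for whatever tokenizer the search index actually uses.

```python
def parse_max_field_length(cfg_text, default=10000):
    """Read search.maxfieldlength from dspace.cfg-style text.

    Falls back to 10000 (the default stated above) when the key is absent.
    """
    for raw_line in cfg_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "search.maxfieldlength":
            return int(value.strip())
    return default


def words_to_index(text, max_field_length):
    """Return the words that fall within the limit; -1 means no limit."""
    words = text.split()  # assumption: whitespace splitting approximates "words"
    if max_field_length == -1:
        return words
    return words[:max_field_length]


if __name__ == "__main__":
    limit = parse_max_field_length("search.maxfieldlength = 3\n")
    print(words_to_index("one two three four five", limit))
    # prints ['one', 'two', 'three']
```

With a limit of 3, words after the third are never indexed, so a search for "four" would not match this document. This is why a document indexed under the old limit needs to be re-indexed after the value changes.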

Notes:

  • Full text indexing is only available for the following formats at this time:

    • Adobe PDF (only if text-based or OCRed)
    • Microsoft Word
    • Plain Text
    • HTML
  • Full text searching in DSpace occurs when a user searches via the default search box (see below), or when a user selects the “Keyword” option from the Advanced Search screen.

Image:Searchbox.png
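
As a rough aid for checking whether an item can be full text indexed, the formats in the Notes can be written as a lookup table. The sketch below is a hypothetical helper, not part of DSpace. Pairing each format with a filter from step 1 is inferred from the filter names alone, and since this page names no filter for Plain Text, that entry is left as None.

```python
# Formats listed in the Notes, paired with the media filters named in step 1.
# The pairing is an assumption based on the filter names; the page names no
# filter for Plain Text, so it maps to None here.
INDEXABLE_FORMATS = {
    "Adobe PDF": "PDFFilter",  # only if text-based or OCRed
    "Microsoft Word": "WordFilter",
    "HTML": "HTMLFilter",
    "Plain Text": None,
}


def is_full_text_indexable(format_name):
    """True if the format appears in the list of supported formats."""
    return format_name in INDEXABLE_FORMATS


def filter_for(format_name):
    """Return the assumed media filter for a format, or None if none is named."""
    return INDEXABLE_FORMATS.get(format_name)


if __name__ == "__main__":
    for name in ("Adobe PDF", "JPEG"):
        print(name, is_full_text_indexable(name), filter_for(name))
```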