Managing Quickfind

This screen is used to set up and configure Quickfind, a mechanism that greatly increases the speed of text searches through the ExtraView database. This utility defines exactly what text will be indexed, and where the indexes will be stored. The Quickfind indexing mechanism is based on Apache Lucene, the open-source search library from the Apache Software Foundation.

Note that documents and attachments larger than 16MB are not indexed.

Information on the Quickfind Synchronization task is found here.

The basic setup of Quickfind is as follows:

  • Check the Task Manager to ensure the Quickfind Synchronize Task (FULL_TEXT_SYNCHRONIZE) is not running. If it is running, you must stop it before proceeding. See the page here for instructions on starting and stopping the task.
  • The basic setup and configuration of Quickfind is accomplished on the Admin, Initial Setup, Manage Quickfind Settings screen
  • Enable Quickfind from the screen by setting the Enable Quickfind list to Yes
  • Set up the path to the location where the index files will be stored on your file system. If you will use the default path as explained below, you can skip this step. All application servers in your installation must have read and write access to the index file location. The index location is also the value stored in the behavior setting named QUICKFIND_INDEX_LOCATION. You must also make sure you have sufficient disk space to store the indexes as they grow.
    • The default path to your indexes is quickfind_index. Note that this is using a pathname relative to your WEB-INF folder. Any pathname relative to WEB-INF may be used, except when you are running ExtraView within a WAR file environment, in which case you must use an absolute pathname
    • You may choose to enter an absolute pathname such as C:\ExtraView\QuickfindIndex within a Windows environment, or /usr/ExtraView/QuickfindIndex within a Linux environment
    • If the location is on an NFS mount, then you should include this in the index location by using the convention nfs:pathname
  • The prompt Allow Quickfind on user defined text fields determines whether the text within all the user defined fields should be indexed. The recommendation is that you set this to Yes
  • If you are upgrading an existing database, run the external utility named FullTextIndexSetup as described below, to create the initial indexes from the issue data and the file attachments that already exist. If this is a new installation, you can skip this step
  • Finally, return to the Task Manager screen to add and/or start the Quickfind Synchronize Task (FULL_TEXT_SYNCHRONIZE).
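
The path rules in the steps above can be sketched as a small pre-flight check to run on each application server. This is only an illustration, not part of ExtraView: the function names are hypothetical, and the handling of the nfs: prefix (stripping it and reporting it as a flag) is an assumption about how such a value might be interpreted.

```python
import os
import shutil
from pathlib import PurePosixPath, PureWindowsPath


def resolve_index_location(setting, web_inf_dir):
    """Return (path, is_nfs) for a QUICKFIND_INDEX_LOCATION-style value."""
    # Assumption: the nfs: convention is a plain prefix in front of the pathname.
    is_nfs = setting.startswith("nfs:")
    raw = setting[len("nfs:"):] if is_nfs else setting
    # Accept both Linux-style and Windows drive-letter absolute paths.
    is_absolute = (PurePosixPath(raw).is_absolute()
                   or PureWindowsPath(raw).is_absolute())
    # Relative values are taken relative to the WEB-INF folder.
    path = raw if is_absolute else os.path.join(web_inf_dir, raw)
    return path, is_nfs


def check_index_location(path, min_free_bytes):
    """Return a list of problems found; an empty list means none were detected."""
    if not os.path.isdir(path):
        return ["directory does not exist"]
    problems = []
    if not os.access(path, os.R_OK | os.W_OK):
        problems.append("no read/write access")
    if shutil.disk_usage(path).free < min_free_bytes:
        problems.append("less free space than required")
    return problems
```

A value for min_free_bytes could be taken from the figure reported by the Estimate Storage Requirements button, with some headroom added for growth.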

The Quickfind management utility

Observe the dates on which files were last indexed, and the count of files still to be indexed. The button named Estimate Storage Requirements gives an approximate figure for how much storage the indexes require with the current database. Make sure you have sufficient storage available at all times, as your database grows in size. On rare occasions, it might be necessary to manipulate the content types and the character set mappings of information to be indexed. In general, it is not recommended that you alter these settings.

FullTextIndexSetup

If you are enabling Quickfind on an existing installation, you should index all the existing information by using the external program named FullTextIndexSetup. This should be accomplished with the task named Quickfind Synchronization Task (FULL_TEXT_SYNCHRONIZE) turned off, or the ExtraView application stopped. Some upgrades may also require the indexes to be rebuilt, using the same FullTextIndexSetup program.

The FullTextIndexSetup utility is extremely quick on a new database, but may take some time on a very large existing database. It is difficult to predict exactly how long this will take on a large database, as many factors have an impact, including processing speed, memory, the amount of text, and the number of attachments. However, it is unlikely that the process will take more than a few hours. Your users can continue working during this period, and the search results will improve as the process continues. Our recommendation is to start the process after the majority of users leave work for the day.

The utility is found in the directory named WEB-INF/data.

The syntax to run this utility is:

For Windows platforms - FullTextIndexSetup.bat JAVA_HOME TOMCAT_HOME EV_BASE

    where JAVA_HOME is the path to your Java installation, TOMCAT_HOME is the path to your
    application server, and EV_BASE is the path to the ExtraView installation

or for Linux platforms - FullTextIndexSetup.sh evj

    where evj is the path to the ExtraView installation
    It is likely that you will have environment variables set for JAVA_HOME and TOMCAT_HOME,
    but these can be set in the shell script file if needed.

Optional parameters for FullTextIndexSetup

-report filename - generate a report in file filename. By default, the report will be written into the ExtraView log file

-testOnly - this allows a dry run migration of the attachments without any modification to the database or the repository. Errors will be reported

-directory - the directory to use for the full text indexes if you are using Microsoft SQL Server. This must be created prior to running the script

-tablespace - the tablespace for the full text indexes, if you are using Oracle. This must be created prior to running the script.

What is Indexed for Quickfind Keyword Searches

The following are “indexed” as part of the Quickfind operation. Indexing is done in the task named FULL_TEXT_SYNCHRONIZE, or through the FullTextIndexSetup operation, which is run as a command-line operation (see above).

  • Inbuilt Fields
    • the SHORT_DESCR field
    • If the behavior setting named QUICKFIND_INDEX_USERS is YES, the following user fields are indexed:
      • OWNER
      • ASSIGNED_TO
      • LAST_UPDATED_BY_USER
      • ORIGINATOR
      • CONTACT
      • All MODULEs: ASSIGNED_TO field
      • User Defined Field USER type fields
    • All User Defined Fields with a display type of TEXTFIELD
    • All large User Defined Fields with a display type of TEXTAREA, LOGAREA, PRINT_TEXT
    • The User Defined Fields with a display type of HTMLAREA, after removing the HTML tags
  • DOCUMENT fields
    • Document description
    • File name
    • Document content according to MIME type and content size:
      • Size must be less than 16MB
      • MIME type must not be any kind of video, audio, or image
      • The MIME type is mapped to an extractor – see MIME type extractors below
  • IMAGE fields
    • Image description
    • File name
  • Attachments - these are only searched when the Search Attachment checkbox is checked
    • Attachment description
    • Attachment content according to MIME type and content size:
      • Size must be less than 16MB
      • MIME type must not be any kind of video, audio, or image
      • MIME type is mapped to an extractor – see MIME type extractors below

    MIME Type Extractors

    There are two text extractors, plus a fallback for all other MIME types:

    1. PDF Text Extractor: uses the iText PdfReader object to tokenize the PDF strings. Document content will NOT be indexed if it is marked as being “encrypted”
    2. Office Text Extractor: uses one of the POI extractors appropriate to the type (Word, Excel, PowerPoint, Outlook, Publisher, or Visio). Note: if the Excel text extractor fails, Quickfind tries to extract text assuming the file is a comma-separated or tab-separated text file with an Excel MIME type
    3. Other MIME type documents are indexed as text
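
The content rules above (size limit, excluded media types, extractor mapping) can be summarized in a short sketch. It is illustrative only: the function name and the MIME type table are hypothetical, the real mapping is the one held in the Quickfind content type settings, and reading "16MB" as 16 x 1024 x 1024 bytes is an assumption.

```python
# Assumption: "16MB" is read as 16 MiB; content must be strictly smaller.
MAX_INDEXED_BYTES = 16 * 1024 * 1024

# Hypothetical sample mapping, not the table shipped with ExtraView.
EXTRACTORS = {
    "application/pdf": "pdf",
    "application/msword": "office",
    "application/vnd.ms-excel": "office",
    "application/vnd.ms-powerpoint": "office",
}


def choose_extractor(mime_type, size_bytes):
    """Return 'pdf', 'office' or 'text', or None when content is not indexed."""
    if size_bytes >= MAX_INDEXED_BYTES:
        return None
    # Video, audio, and image content is never indexed.
    major_type = mime_type.split("/", 1)[0].lower()
    if major_type in ("video", "audio", "image"):
        return None
    # Any other MIME type without a dedicated extractor is indexed as text.
    return EXTRACTORS.get(mime_type.lower(), "text")
```

Descriptions and file names are indexed regardless of this check; it applies only to the document or attachment content.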

Notes

As stated above, Quickfind uses Apache Lucene to provide the indexing mechanism. One support issue is that if your Java Virtual Machine or your application server (Apache Tomcat or another) crashes for any reason, the indexes may be left in a locked state on the server. The lock files are kept in the directory specified by the org.apache.lucene.lockdir system property if it is set, or by default in the directory specified by the java.io.tmpdir system property (on Unix systems this is usually /var/tmp or /tmp). If for some reason java.io.tmpdir is not set, then the directory path you specified to create your index is used. Lock files have names that start with lucene- followed by an MD5 hash of the index directory path. If you are certain that a lock file is not in use, you can delete it manually.
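
As a rough aid for the check described above, the following sketch picks the lock directory in the stated order and lists candidate lock files. It is not an ExtraView or Lucene tool: the function names are hypothetical, the two system property values must be supplied by you (for example from your application server startup options), and the script deliberately deletes nothing.

```python
import glob
import os


def resolve_lock_dir(lockdir_prop, tmpdir_prop, index_dir):
    """Pick the lock directory: lockdir property, then tmpdir, then index path."""
    return lockdir_prop or tmpdir_prop or index_dir


def find_lock_files(lock_dir):
    """List files whose names start with 'lucene-'; nothing is deleted."""
    pattern = os.path.join(glob.escape(lock_dir), "lucene-*")
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
```

For example, find_lock_files(resolve_lock_dir(None, "/tmp", "/usr/ExtraView/QuickfindIndex")) lists the candidates under /tmp. Confirm that the application server and the synchronization task are stopped before removing any file it reports.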

For full information on the use of Quickfind, see the page titled Keyword Searching.