Keyword Queries & Quickfind

Keyword queries are enabled and controlled with the following field, security permission keys, and behavior settings:

  • Place the inbuilt field named KEYWORD on a query filter layout
  • This is subject to the normal PR_RESOLUTION.KEYWORD security permission key, allowing you to control which roles have access to the facility
  • There is a further security permission key named PR_RESOLUTION.SEARCH_ATTACHMENTS. When this is enabled for a user role, they will see a checkbox appear beneath the keyword search. When the user checks this box, attachments to issues will also be searched for the keywords entered.
  • ALLOWED_ATTACH_SEARCH_FILE_EXT – This behavior setting controls which file attachment types are searched. For example, you may have large image files within your database, which there is no point in searching for keywords.
  • ALLOW_SEARCH_TEXT_UDFS – When this is set to YES, all User Defined Fields with a display type of TEXT are included in the keyword query search
  • SEARCH_ATTACH_THRESHOLD – This is another control to stop huge searches when using keyword queries on attachments. ExtraView first calculates the total size of the attachments that are to be scanned for keywords. If the size (in bytes) is greater than the number in this setting, ExtraView asks the user to confirm that they want to execute the search.
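
The interaction between ALLOWED_ATTACH_SEARCH_FILE_EXT and SEARCH_ATTACH_THRESHOLD can be pictured with a minimal Python sketch. This is an illustration only, not ExtraView's implementation: the Attachment class, the function names, the representation of the allowed extensions as a simple list, and the assumption that the size is totalled after filtering by extension are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Attachment:
    # Hypothetical record for an uploaded file; not an ExtraView type.
    file_name: str
    size_bytes: int


def searchable_attachments(attachments, allowed_extensions):
    """Keep only attachments whose file extension is in the allowed list."""
    allowed = {ext.lower().lstrip(".") for ext in allowed_extensions}
    kept = []
    for attachment in attachments:
        _, dot, ext = attachment.file_name.rpartition(".")
        if dot and ext.lower() in allowed:
            kept.append(attachment)
    return kept


def needs_confirmation(attachments, threshold_bytes):
    """Return True when the total size to scan exceeds the threshold."""
    return sum(a.size_bytes for a in attachments) > threshold_bytes


attachments = [
    Attachment("spec.txt", 40_000),
    Attachment("diagram.png", 9_000_000),
    Attachment("notes.doc", 120_000),
]

# Assumed setting values, chosen only for this example.
to_scan = searchable_attachments(attachments, ["txt", "doc"])
total_bytes = sum(a.size_bytes for a in to_scan)
ask_user = needs_confirmation(to_scan, threshold_bytes=100_000)
```

In this example the large image file is excluded by its extension, the remaining attachments total 160,000 bytes, and the user would be asked to confirm because that total is above the assumed threshold of 100,000 bytes.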

By default, ExtraView composes SQL queries to search for keywords in its underlying database. However, the Administrator may turn on an extremely fast, indexed search mechanism named Quickfind. Quickfind requires additional resources and storage, but for large sites with a significant amount of text and many attachment files, it improves search performance considerably.

Searching Microsoft Documents for Keywords

Microsoft documents, such as Word and Excel files, are stored using a character set known as UTF-16LE (Unicode, 16-bit, little-endian). If you want to search these documents for keywords, specify this character encoding when you upload them. This is especially important for Asian languages. Many searches with Roman alphabets will work even if the character set of the Microsoft document is not correctly identified.
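
Why the character encoding matters for Asian-language text can be shown with a short Python sketch. It uses only Python's built-in codecs and does not reproduce how ExtraView reads attachments; the sample text and the choice of Latin-1 as the "wrong" encoding are assumptions made for illustration.

```python
# Sample Japanese text ("monthly report") stored as UTF-16LE bytes.
stored_bytes = "月次報告書".encode("utf-16-le")
keyword = "報告"  # "report"

# Decoding with the correct character set recovers the original text,
# so the keyword can be found.
decoded_correctly = stored_bytes.decode("utf-16-le")
found_with_correct_encoding = keyword in decoded_correctly

# Decoding the same bytes with the wrong character set yields unrelated
# characters, so the keyword cannot be found.
decoded_wrongly = stored_bytes.decode("latin-1")
found_with_wrong_encoding = keyword in decoded_wrongly
```

Here the keyword is found only when the bytes are interpreted as UTF-16LE.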

Quickfind

If ExtraView Corporation is hosting your installation, you do not have direct access to the file system of the server, so you cannot configure, alter or use this feature without contacting ExtraView support.

Quickfind uses indexes built and maintained by ExtraView to speed up the search process for keywords. Understanding these indexes and constructing your search terms properly is critical in generating the expected results from your query. The feature is built upon the Apache Lucene technology.

To enable this feature, see the section on Managing Quickfind and the section on setting up the task that performs the indexing. The task is managed in the Manage Tasks and Threads administration utility.

The indexes built are based on words extracted from the data being searched. The database automatically extracts words from your documents, ignoring most special characters. Common words that are not useful for searches, such as “a”, “the”, and others are discarded. The list of words that are discarded depends upon your database vendor and the specific version of their product. A knowledgeable Database Administrator may alter the list of ignored words and other settings, as these are part of the database operation, not part of ExtraView’s functionality.
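
As a rough illustration of word extraction, the following Python sketch splits text into words and drops ignored words. It is a sketch under stated assumptions, not the database's actual algorithm: the STOP_WORDS set is hypothetical (the real list depends on the database vendor and version), every non-alphanumeric character is treated as a separator, and words are lowercased.

```python
import re

# Hypothetical list of ignored words; the real list is vendor-specific.
STOP_WORDS = {"a", "an", "the", "and", "of"}


def extract_index_words(text):
    """Split text into alphanumeric words and drop the ignored words."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return [w.lower() for w in words if w.lower() not in STOP_WORDS]


indexed_words = extract_index_words("The status of FIND_ME: a report!")
```

With these assumptions the indexed words are status, find, me and report. The underscore in FIND_ME is lost, so the index holds two separate words rather than the original string.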

Since only words are indexed by the database, ExtraView uses a more complete, but slower, method to search if the keywords that the user is searching for contain special characters. This search can be much slower, especially if the user is searching attachments. Special characters are any characters other than the alphanumeric characters a-z, A-Z, and 0-9. For example, if the keyword search is for the character string FIND_ME, a slower search is used to ensure that words containing the underscore are found. This is a limitation of the database operation underneath ExtraView, not a limitation of ExtraView.
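
The decision described above can be sketched in Python as follows. The function name and the returned labels are hypothetical, and the sketch assumes a single keyword is being tested; it only applies the stated rule that anything other than a-z, A-Z, and 0-9 counts as a special character.

```python
import re

# Any character other than a-z, A-Z, and 0-9 is treated as special.
SPECIAL_CHARACTER = re.compile(r"[^A-Za-z0-9]")


def choose_search_method(keyword):
    """Return which kind of search a single keyword would need."""
    if SPECIAL_CHARACTER.search(keyword):
        return "slower complete search"
    return "indexed search"


method_for_find_me = choose_search_method("FIND_ME")
method_for_findme = choose_search_method("FINDME")
```

FIND_ME falls back to the slower search because of the underscore, while FINDME can use the word index.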

One of the biggest implications of the word index is that searches for fragments of a word that do not start at the beginning of the word will not return that word. For example, a search for FORM can return results for FORM, but not inFORMal.

ExtraView extends the keyword search term entered to search for words that start with the same characters. For example, a search for APPLE can return results for APPLE as well as APPLEs and APPLEsauce.

The default installation of Quickfind into your database environment performs case-insensitive matching. A search term matches the same sequence of characters regardless of capitalization. For example, a search for apple also finds applE and ApPlE.
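
The matching rules for indexed words (match from the start of the word, extend to longer words, ignore case) can be summarized in a small Python sketch. The function name and the word list are hypothetical, and the sketch models only the behavior described in this section, not the internals of Quickfind.

```python
def matches_indexed_word(search_term, indexed_word):
    """True when the indexed word starts with the search term, ignoring case."""
    return indexed_word.lower().startswith(search_term.lower())


indexed_words = [
    "FORM", "inFORMal", "formal",
    "APPLE", "APPLEs", "APPLEsauce", "pineapple", "ApPlE",
]

form_hits = [w for w in indexed_words if matches_indexed_word("FORM", w)]
apple_hits = [w for w in indexed_words if matches_indexed_word("apple", w)]
```

A search for FORM finds FORM and formal but not inFORMal, and a search for apple finds APPLE, APPLEs, APPLEsauce and ApPlE but not pineapple.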

You may configure which text fields and file attachments are indexed. The indexing happens as a background task run on a timer within ExtraView. This timer is controlled by a behavior setting on the management screen for Quickfind, with the default being every five minutes. This means that text entered or attachments uploaded are not immediately available to the keyword search capability, but will be included in the results following the next completion of the Quickfind background task. To enter the settings, click Manage Quickfind Settings in the Display & Reports administration section.
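
To get a feel for the indexing delay, the following Python sketch computes when the next indexing run would start for text entered at a given time. It is a simplified model under stated assumptions: the function name is hypothetical, runs are assumed to start at exact multiples of the interval, and the time a run takes to complete is not modeled, so the text becomes searchable somewhat later than the time returned.

```python
from datetime import datetime, timedelta


def next_index_run(entered_at, first_run, interval):
    """Return the start of the first indexing run at or after entered_at."""
    if entered_at <= first_run:
        return first_run
    elapsed = entered_at - first_run
    # Ceiling division: number of whole intervals needed to reach entered_at.
    runs_needed = -(-elapsed // interval)
    return first_run + runs_needed * interval


first_run = datetime(2024, 1, 1, 9, 0)
interval = timedelta(minutes=5)  # the default described above

picked_up_at = next_index_run(datetime(2024, 1, 1, 9, 7), first_run, interval)
```

With the default five-minute interval and runs assumed to start on the hour, text entered at 09:07 would be picked up by the run that starts at 09:10.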


Managing Quickfind settings

Note that in addition to the Estimate Storage Requirements button, there are two other buttons at the bottom of the screen:

Manage Content Types – This allows you to manage the different content types that are indexed by Quickfind. It is not very likely that you will need to modify these settings as the default values provided are fairly extensive.

Manage Character Set Mapping – This utility is only used for Oracle databases. Again, it is not likely that you will need to alter the settings provided.

Quickfind versus Standard Search

  • Quickfind searches for keywords faster than the standard search mechanism.
  • Quickfind requires significantly more database disk storage than the standard search mechanism.
  • Text indexed with Quickfind will not be found immediately after it is inserted. There will be a delay, up to the time specified in the poll interval of the task, before the text may be found. In most circumstances this is not a significant issue.
  • The standard search mechanism will not find mixed-case words.
  • Simple words that are in the stop word list of Quickfind are not found with keyword searches.