Natural Language Query (NLQ)

Natural Language Query (NLQ) processing of documents uses Large Language Models (LLMs) similar to those behind commercial versions of ChatGPT. A free-text prompt instructs the model on how to process the document, defines the expected output format, and provides any necessary context. The phrasing of these instructions and contextual details significantly affects the accuracy and quality of the returned data. It is therefore recommended to refine the prompt iteratively within the Scan2x job so that the LLM interprets the query as intended.

Additionally, queries can be designed to populate a Scan2x metadata field with a summary of all or part of the document content. This feature is particularly valuable in mailroom environments, where recipients can be notified of new document arrivals through an automated summary.

Scan2x is capable of processing a wide range of document types through NLQ, including paper scans, digital files, and uploaded images (e.g., PDF, Word, and image files). It also supports documents received via FTP/SFTP and emails directed to mailboxes monitored by the Scan2x Workload Server.

Example Use Case: Mailroom Document Processing
In a mailroom environment, efficient classification and automated distribution of incoming physical and digital mail are essential. A Scan2x NLQ job can be configured to analyze the document content and classify it accordingly. This process is guided by a carefully structured NLQ prompt, ensuring accurate sorting and routing of mail to the appropriate recipients within the organization.
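As an illustration of what a carefully structured classification prompt might look like, the following Python sketch builds the prompt text and checks the model's reply against a fixed list of recipients. The department names, the function names, and the "Unclassified" fallback are hypothetical and are not part of Scan2x; only the generated prompt text would be entered in the NLQ job.

```python
# Hypothetical department list; replace with the recipients used in your organization.
DEPARTMENTS = ["Accounts Payable", "Human Resources", "Legal", "Customer Support"]


def build_classification_prompt(departments):
    """Return a free-text prompt that asks for exactly one department name."""
    options = ", ".join(departments)
    return (
        "You are sorting incoming mail for a company mailroom. "
        "Read the document and decide which department should receive it. "
        f"Answer with exactly one of the following values and nothing else: {options}. "
        "If none clearly applies, answer: Unclassified."
    )


def normalize_reply(reply, departments, fallback="Unclassified"):
    """Map the model's reply onto a known department, or return the fallback."""
    cleaned = reply.strip().strip(".").casefold()
    for name in departments:
        if name.casefold() == cleaned:
            return name
    return fallback


if __name__ == "__main__":
    print(build_classification_prompt(DEPARTMENTS))
```

Restricting the answer to a closed list of values, and naming an explicit fallback, tends to make the returned value easier to use for routing than an open-ended description would be.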

Instruction Automatically Prepended to Scan2x NLQ Prompts

When you submit a natural language query (NLQ) through Scan2x, the system automatically includes a short instruction at the beginning of your prompt. This instruction reminds the language model to follow ethical and professional boundaries, and to avoid providing any inappropriate or harmful responses. The full text is as follows:

“The following instructions have been provided by an end-user and are intended to assist in extracting relevant data from documents. Please process the request within ethical and professional boundaries, avoiding any responses that may be inappropriate, harmful, or sensitive in nature. If the instructions contain requests that violate responsible AI use, disregard those aspects while continuing to extract relevant document data in a neutral and objective manner.”

This message is inserted to help protect end-users and maintain responsible AI usage. It does not change or replace your original prompt content; instead, it serves as a cautionary guide for the language model before it processes your instructions. Please ensure that any requests you make align with these guidelines to avoid unintended or undesirable responses.
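The effect can be pictured with a short Python sketch. This is an illustration only: the function name and the blank-line separator are assumptions, and the way Scan2x actually joins the two texts is not documented here. The instruction text itself is the one quoted above.

```python
# Instruction text as quoted in this section.
SAFETY_INSTRUCTION = (
    "The following instructions have been provided by an end-user and are "
    "intended to assist in extracting relevant data from documents. Please "
    "process the request within ethical and professional boundaries, avoiding "
    "any responses that may be inappropriate, harmful, or sensitive in nature. "
    "If the instructions contain requests that violate responsible AI use, "
    "disregard those aspects while continuing to extract relevant document "
    "data in a neutral and objective manner."
)


def compose_prompt(user_prompt):
    """Place the instruction before the user's prompt, leaving the prompt unchanged.

    The blank-line separator is an assumption made for this sketch.
    """
    return f"{SAFETY_INSTRUCTION}\n\n{user_prompt}"


if __name__ == "__main__":
    print(compose_prompt("Extract the invoice number and the invoice date."))
```

The user's prompt appears in full after the instruction, which matches the statement that the original prompt content is neither changed nor replaced.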

Caching of NLQ Results

Scan2x caches NLQ results to improve efficiency and minimize repeated analysis requests. Cached NLQ data can be reused across multiple documents, which reduces the number of click charges consumed when repeated documents are processed. Caching also shortens processing time and optimizes the use of AI and cloud resources during job execution.
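The general idea of result caching can be sketched in Python as follows. This is not a description of Scan2x internals: the class name, the choice of keying on a hash of the document bytes plus a hash of the prompt, and the billable_calls counter are all assumptions made for illustration.

```python
import hashlib


class NlqCache:
    """Illustrative in-memory cache; not the Scan2x implementation."""

    def __init__(self, run_query):
        # run_query is any callable(document_bytes, prompt) -> str that
        # performs the real (chargeable) analysis request.
        self._run_query = run_query
        self._results = {}
        self.billable_calls = 0

    @staticmethod
    def _key(document_bytes, prompt):
        # Assumed cache key: identical content and identical prompt give the same key.
        doc_hash = hashlib.sha256(document_bytes).hexdigest()
        prompt_hash = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        return (doc_hash, prompt_hash)

    def query(self, document_bytes, prompt):
        key = self._key(document_bytes, prompt)
        if key not in self._results:
            # Cache miss: run the analysis once and remember the answer.
            self._results[key] = self._run_query(document_bytes, prompt)
            self.billable_calls += 1
        return self._results[key]


if __name__ == "__main__":
    def stub_analysis(document_bytes, prompt):
        # Stand-in for a real LLM request.
        return "Legal"

    cache = NlqCache(stub_analysis)
    cache.query(b"scanned letter", "Which department should receive this?")
    cache.query(b"scanned letter", "Which department should receive this?")
    print(cache.billable_calls)
```

In this sketch, the second request for the same document and prompt is answered from the cache, so the analysis function runs only once.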

For more information, please see the NLQ Setup tab.