How the Search Works

 

When you press the Fetch button, the queries in the Query Editor are submitted to the search engines you have selected. The retrieved results (documents) are fetched from their respective sources, and each document is then parsed into sentences and analyzed for concept associations (e.g. protein-protein associations). Agilent Literature Search uses a set of lexicons to define the concept names (and aliases) and the association terms (verbs) of interest. The concept lexicon supplied with this version of Agilent Literature Search contains gene/protein names and aliases. An association is extracted for every sentence containing at least two concept names and one verb. Associations are then converted into interactions, which are grouped into a network. The sentences and source hyperlinks for each association are stored as attributes of the corresponding interactions.
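To make the last step concrete, the grouping of associations into interactions can be sketched as below. This is only an illustration, not the product's actual code: the function name `build_network`, the dictionary field names, the choice to key an interaction by (source, verb, target), and the example.org links are all assumptions made for the example.

```python
from collections import defaultdict

def build_network(associations):
    """Group associations into interactions.

    Assumption: an interaction is identified by (source, verb, target);
    the supporting sentences and source links become its attributes.
    """
    network = defaultdict(lambda: {"sentences": [], "sources": []})
    for assoc in associations:
        key = (assoc["source"], assoc["verb"], assoc["target"])
        network[key]["sentences"].append(assoc["sentence"])
        network[key]["sources"].append(assoc["url"])
    return dict(network)

# Two hypothetical associations found in two different documents.
associations = [
    {"source": "MDM2", "verb": "binds", "target": "TP53",
     "sentence": "MDM2 binds p53.", "url": "https://example.org/doc1"},
    {"source": "MDM2", "verb": "binds", "target": "TP53",
     "sentence": "MDM2 binds TP53 in vitro.", "url": "https://example.org/doc2"},
]
network = build_network(associations)
```

Here both associations collapse into a single interaction that carries two sentences and two source links as attributes.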

Every sentence is tokenized into words and stemmed. The stemmed words are filtered using a dictionary of common English words, which is based on the dictionary provided with the UNIX operating system. All words in the document that match a user context term are marked as potential entities of interest.
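The preprocessing described above might look roughly like the following sketch. It rests on several assumptions: the stemmer here is a toy suffix stripper standing in for whatever stemmer the product actually uses, `COMMON_WORDS` is a tiny hypothetical substitute for the UNIX-derived dictionary, and the function names are invented for the example.

```python
import re

# Hypothetical stand-in for the common-English-word dictionary;
# the real list would be far larger.
COMMON_WORDS = {"the", "of", "in", "and", "is", "by", "a", "to"}

def tokenize(sentence):
    """Split a sentence into lowercase word tokens."""
    return re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", sentence.lower())

def stem(word):
    """Toy suffix stripper (assumption); a real stemmer is more careful."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(sentence, context_terms):
    """Return (kept_stems, marked_stems) for one sentence."""
    stems = [stem(token) for token in tokenize(sentence)]
    # Drop stems that appear in the common-word dictionary.
    kept = [s for s in stems if s not in COMMON_WORDS]
    # Mark stems that match a (stemmed) user context term.
    context = {stem(term.lower()) for term in context_terms}
    marked = [s for s in kept if s in context]
    return kept, marked

kept, marked = preprocess(
    "The expression of TP53 is regulated by MDM2 in tumors", ["tumor"]
)
```

With the context term "tumor", the word "tumors" is stemmed to "tumor" and marked as a potential entity of interest, while common words such as "the" and "of" are filtered out.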

Agilent Literature Search uses two lexicons: one of concept names (and their aliases) and one of association terms (verbs).

An association is extracted for every sentence that contains at least two nouns and one verb found in the lexicons.
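The extraction rule can be sketched as follows. The two miniature lexicons, the function name `extract_associations`, the pairwise handling of sentences with more than two concepts, and the use of the first matching verb are all assumptions for illustration; for brevity the sketch matches raw tokens rather than stems.

```python
import re
from itertools import combinations

# Hypothetical miniature lexicons. The concept lexicon maps each
# alias to a canonical concept name.
CONCEPTS = {"tp53": "TP53", "p53": "TP53", "mdm2": "MDM2"}
VERBS = {"binds", "inhibits", "activates"}

def extract_associations(sentence):
    """Return (concept, verb, concept) triples for one sentence.

    A sentence yields associations only if it contains at least two
    distinct concepts and at least one verb from the lexicons.
    """
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    concepts = sorted({CONCEPTS[t] for t in tokens if t in CONCEPTS})
    verbs = [t for t in tokens if t in VERBS]
    if len(concepts) < 2 or not verbs:
        return []
    # Assumption: pair up every two concepts and use the first verb found.
    return [(a, verbs[0], b) for a, b in combinations(concepts, 2)]

found = extract_associations("MDM2 binds p53 and inhibits it.")
```

Because "p53" is an alias of TP53, the sentence above counts as mentioning two concepts. A sentence such as "p53 binds TP53" mentions only one concept under two names and so yields nothing.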