When you press the Fetch button, the set of queries
in the Query Editor is submitted to multiple user-selected
search engines. The retrieved results (documents) are fetched from their
respective sources and each document is then parsed into sentences and analyzed
for concept associations (e.g. protein-protein associations). Agilent
Literature Search uses a set of lexicons for defining concept names
(and aliases) and association terms (verbs) of interest. The concept lexicon supplied with this version
of Agilent Literature Search contains gene/protein names and
aliases. An association is extracted for every sentence containing at
least two concept names and one verb. Associations are then
converted into interactions, which are further grouped into a network.
The sentences and source hyperlinks for each association are stored as
attributes of the corresponding interactions.
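The grouping step described above can be illustrated with a minimal Python sketch. It is not the Agilent Literature Search implementation: the function name build_network, the dictionary keys (concepts, sentence, url), and the example.org links are all hypothetical, and the sketch assumes that an association naming more than two concepts yields one interaction per concept pair.

```python
from collections import defaultdict
from itertools import combinations


def build_network(associations):
    """Group associations into interactions keyed by an unordered concept pair.

    Each interaction keeps the supporting sentences and source links as
    attributes. The data layout is an assumption made for illustration.
    """
    network = defaultdict(lambda: {"sentences": [], "sources": []})
    for assoc in associations:
        # Sorting makes (A, B) and (B, A) map to the same interaction.
        for a, b in combinations(sorted(set(assoc["concepts"])), 2):
            edge = network[(a, b)]
            edge["sentences"].append(assoc["sentence"])
            edge["sources"].append(assoc["url"])
    return dict(network)


# Hypothetical associations, as they might come out of the extraction step.
associations = [
    {"concepts": ["TP53", "MDM2"],
     "sentence": "MDM2 inhibits TP53.",
     "url": "https://example.org/doc1"},
    {"concepts": ["MDM2", "TP53", "ATM"],
     "sentence": "ATM activates TP53 and MDM2.",
     "url": "https://example.org/doc2"},
]
network = build_network(associations)
```

Here the MDM2/TP53 interaction ends up with two supporting sentences and two source links, one from each document, which mirrors how evidence accumulates on an interaction as more results are fetched.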
Every sentence is tokenized into words and stemmed. The stemmed words are filtered using a dictionary of common English words, which is based on the dictionary provided with the UNIX operating system. All the words in the text document that match a user context term are marked as potential entities of interest.
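As a rough illustration of this preprocessing, the following Python sketch tokenizes a sentence, stems the tokens, filters them, and marks context-term matches. Several things here are assumptions: the tiny COMMON_WORDS set stands in for the UNIX-derived dictionary, the suffix stripper is a toy and not the stemmer the tool uses, "filtered" is taken to mean that common words are discarded, and context terms are assumed to be stemmed before matching.

```python
import re

# Hypothetical stand-in for the dictionary of common English words.
COMMON_WORDS = {"the", "a", "of", "in", "and", "is", "by", "to"}


def tokenize(sentence):
    """Split a sentence into lowercase alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", sentence.lower())


def stem(word):
    """Toy suffix stripper; a real system would use a proper stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(sentence, context_terms):
    """Return (stems with common words removed, stems matching a context term)."""
    stems = [stem(token) for token in tokenize(sentence)]
    kept = [s for s in stems if s not in COMMON_WORDS]
    context = {stem(term.lower()) for term in context_terms}
    marked = [s for s in kept if s in context]
    return kept, marked


kept, marked = preprocess("The kinase binds receptors in the membrane", ["receptor"])
```

In this example, kept holds the stems that survive the common-word filter and marked holds the subset that matches the user context term "receptor".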
Agilent Literature Search uses two lexicons: one for concept names (and aliases) and one for association terms (verbs).
An association is extracted for every sentence containing at least two nouns and one verb that are contained in the lexicons. Associations are then converted into interactions, which are further grouped into a network.
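The extraction rule can be sketched in Python as follows. The lexicon contents, the function name extract_association, and the returned dictionary layout are hypothetical. The sketch also assumes that aliases resolve to one canonical name and that two aliases of the same concept count as a single concept, which the text above does not specify.

```python
import re

# Hypothetical lexicons; the real ones are supplied with the tool.
# The concept lexicon maps lowercase names and aliases to a canonical name.
CONCEPT_LEXICON = {"tp53": "TP53", "p53": "TP53", "mdm2": "MDM2"}
VERB_LEXICON = {"binds", "inhibits", "activates"}


def extract_association(sentence):
    """Return an association if the sentence names at least two distinct
    lexicon concepts and at least one lexicon verb; otherwise return None."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence.lower())
    concepts = []
    for token in tokens:
        name = CONCEPT_LEXICON.get(token)
        # Assumption: aliases of one concept are counted only once.
        if name is not None and name not in concepts:
            concepts.append(name)
    verbs = [token for token in tokens if token in VERB_LEXICON]
    if len(concepts) >= 2 and verbs:
        return {"concepts": concepts, "verbs": verbs, "sentence": sentence}
    return None


hit = extract_association("MDM2 inhibits p53 in tumor cells.")
miss = extract_association("p53 is a tumor suppressor.")
```

The first sentence yields an association between MDM2 and TP53 (via the alias p53) with the verb "inhibits". The second yields nothing, because it contains only one concept and no lexicon verb.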