Agilent Literature Search uses text mining technology to generate an "association" network from information extracted from the scientific literature. This can be useful in understanding how genes and proteins may interact in the context of a disease or other biological process. Agilent Literature Search can also be used to create a knowledge base for a set of genes and proteins. In this case, the network diagram can serve as a visual "table of contents" for the knowledge base. We refer to these networks as "association networks" because they represent associations in the literature between biological entities.
The Agilent Literature Search tool can be invoked in Cytoscape by selecting the Agilent Literature Search item on the Plugins menu. When you do this, the Agilent Literature Search control panel will appear, as shown in the figure below.
To execute a search, enter a set of search terms and, optionally, context words. You can also restrict search results to a particular species by checking the Concept lexicon limits search box. For each search term entered, a query line is constructed in the Query Editor panel, incorporating aliases and context, if desired. You can also manually edit the query lines in the Query Editor to specify more advanced search options, such as an "author" field. When you press the button, the set of queries in the Query Editor is submitted to multiple user-selected search engines.
The retrieved results (documents) are fetched from their respective sources, and each document is then parsed into sentences and analyzed for concept associations (e.g. protein-protein associations). Agilent Literature Search uses a set of lexicons to define concept names (and aliases) and association terms (verbs) of interest. The concept lexicon supplied with this version of Agilent Literature Search contains gene/protein names and aliases. An association is extracted for every sentence containing at least two concept names and one verb. Associations are then converted into interactions, which are grouped into a network. The sentences and source hyperlinks for each association are stored as attributes of the corresponding interactions.
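The extraction rule described above (at least two concept names and one verb in a sentence) can be illustrated with a minimal Python sketch. This is not the tool's actual implementation: the tiny lexicons, the naive sentence splitter, and the names `CONCEPTS`, `VERBS`, and `extract_associations` are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical toy lexicons (assumption): canonical name -> lowercase aliases.
CONCEPTS = {
    "BRAF": {"braf", "b-raf"},
    "MAP2K1": {"map2k1", "mek1"},
    "CDKN2A": {"cdkn2a", "p16"},
}
# Hypothetical association terms (assumption).
VERBS = {"activates", "phosphorylates", "inhibits", "binds"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    """Lowercase word tokens; hyphens are kept so 'B-Raf' stays one token."""
    return re.findall(r"[a-z0-9-]+", sentence.lower())


def extract_associations(text, source):
    """Map each concept pair to the sentences (and source) that support it."""
    edges = defaultdict(list)
    for sentence in split_sentences(text):
        words = set(tokens(sentence))
        # Resolve aliases to canonical concept names.
        found = sorted(name for name, aliases in CONCEPTS.items() if words & aliases)
        # Rule from the text: at least two concepts and one verb.
        if len(found) >= 2 and words & VERBS:
            for a, b in combinations(found, 2):
                edges[(a, b)].append({"sentence": sentence, "source": source})
    return edges
```

Each key of the returned mapping plays the role of an interaction, and the list stored under it holds the sentences and source reference that would be kept as attributes of that interaction.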
The networks can be viewed and manipulated in Cytoscape v2.3.
The Agilent Literature Search control panel consists of several panels:
The figure below shows an Agilent Literature Search control panel with information filled in for a set of genes related to melanoma, using the Human concept lexicon to resolve aliases, limiting results to the Human concept lexicon, and limiting search engine "hits" to 5 per search engine per query line. The stringency level is set to "limited".
The figure below shows the Literature Search control panel, with entries in the Query Matches panel that correspond to the sentences found in the literature. Note that you may see a different network due to additions to PubMed since this query was run.
The figure below shows the resulting network from the example melanoma search, displayed in Cytoscape. The thickness of each edge is mapped (via the Cytoscape VizMapper) to the number of sentences stored as a property on the edge. The network nodes containing user-entered search terms are highlighted in yellow.
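The idea of mapping sentence counts to edge thickness can be sketched as a simple clamped linear interpolation. This is only an illustration of the concept: the function name `edge_width` and the count and width ranges are assumptions, and the actual mapping is configured in the Cytoscape VizMapper rather than in code like this.

```python
def edge_width(sentence_count, min_count=1, max_count=10,
               min_width=1.0, max_width=8.0):
    """Linearly map a sentence count to a line width, clamped to the range.

    All default ranges are hypothetical values chosen for illustration.
    """
    if max_count <= min_count:
        return min_width
    # Clamp the count so very well supported edges do not grow without bound.
    clamped = max(min_count, min(sentence_count, max_count))
    fraction = (clamped - min_count) / (max_count - min_count)
    return min_width + fraction * (max_width - min_width)
```

With these assumed defaults, an edge supported by a single sentence gets the thinnest line, and any edge supported by ten or more sentences gets the thickest.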
A closer look at a zoomed-in area of the network is shown in the figure below.
For each node and edge, you can view the literature references that were used to generate it.