
The IGN Corpus

The Instance-level Gene Normalization (IGN) corpus was compiled from two datasets, one for abstract-level and the other for full-text-level evaluation. For each article, in addition to annotations of all gene and gene product mentions, the IGN corpus includes the following annotations:

  1. The corresponding Entrez Gene ID of each human gene mention,
  2. Species information of each gene mention,
  3. Gene full name/abbreviation pairs,
  4. Co-reference of gene mentions, and
  5. Sentence boundaries. 
The IGN corpus is intended to be released in the BioC XML format as a publicly available resource for other text mining systems.
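
As a rough illustration of how annotations of this kind might be read from a BioC XML file, here is a minimal Python sketch using only the standard library. The element names (`collection`, `document`, `passage`, `annotation`, `infon`, `location`, `text`) follow the general BioC layout. The infon keys `type`, `entrez_gene_id`, and `species`, the sample sentence, and the function name `read_annotations` are assumptions made for this example and may not match the keys used in the released corpus.

```python
import xml.etree.ElementTree as ET

# Hypothetical BioC snippet for illustration only. The infon keys
# ("type", "entrez_gene_id", "species") are assumed, not taken from the
# actual IGN corpus files.
SAMPLE = """<collection>
  <source>example</source>
  <document>
    <id>doc1</id>
    <passage>
      <offset>0</offset>
      <text>BRCA1 is mutated in breast cancer.</text>
      <annotation id="T1">
        <infon key="type">Gene</infon>
        <infon key="entrez_gene_id">672</infon>
        <infon key="species">Homo sapiens</infon>
        <location offset="0" length="5"/>
        <text>BRCA1</text>
      </annotation>
    </passage>
  </document>
</collection>"""


def read_annotations(xml_text):
    """Return one dict per annotation found in a BioC XML string."""
    root = ET.fromstring(xml_text)
    results = []
    for document in root.iter("document"):
        doc_id = document.findtext("id")
        for passage in document.iter("passage"):
            for ann in passage.iter("annotation"):
                # Collect all key/value infon pairs attached to the annotation.
                infons = {i.get("key"): i.text for i in ann.findall("infon")}
                loc = ann.find("location")
                results.append({
                    "doc_id": doc_id,
                    "mention": ann.findtext("text"),
                    "offset": int(loc.get("offset")),
                    "length": int(loc.get("length")),
                    "infons": infons,
                })
    return results


annotations = read_annotations(SAMPLE)
```

Each returned dictionary carries the mention text, its character offset and length, and whatever infon pairs the file provides, so the same loop should work whichever infon keys the corpus actually uses.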

Size    Revision    Time                      User
49k     v. 1        Jun 28, 2013, 4:44 AM     Hongjie Dai
115k    v. 1        Jun 28, 2013, 4:44 AM     Hongjie Dai
113k    v. 1        Dec 14, 2013, 2:44 AM     Hongjie Dai
54k     v. 1        Dec 14, 2013, 2:44 AM     Hongjie Dai