Wednesday, April 13, 2011

Available for download: The Icelandic Parsed Historical Corpus, IcePaHC 0.4



Available: IcePaHC 0.4 (now includes a visual Windows version)


IcePaHC 0.4, the latest version of the Icelandic Parsed Historical Corpus, is now available for download:


  • 440,000 words in total, drawn from every century from the 12th through the 19th, annotated for phrase structure, part-of-speech tagged, and lemmatized
  • An optional easy-to-install visual user interface for Windows
  • LGPL license: You are free to copy, modify and redistribute the corpus for research and/or profit

Joel C. Wallenberg (joel.wallenberg@gmail.com)
Anton Karl Ingason (anton.karl.ingason@gmail.com)
Einar Freyr Sigurðsson (einarfs@gmail.com)
Eiríkur Rögnvaldsson (eirikur@hi.is)
University of Iceland

The project is funded by the following grants:
  • Icelandic Research Fund (RANNÍS), grant no. 090662011, "Viable Language Technology beyond English – Icelandic as a test case".
  • U.S. National Science Foundation (NSF) International Research Fellowship Program (IRFP), grant #OISE-0853114, "Evolution of Language Systems: a comparative study of grammatical change in Icelandic and English".
/Anton Karl Ingason

--------------------------------

IcePaHC 0.4, the Icelandic treebank (now with a Windows version)

IcePaHC 0.4, the latest release of the Icelandic treebank, is out:

  • 440,000 words in total, from every century from the 12th through the 19th, syntactically parsed, tagged, and lemmatized
  • A simple Windows installer for the graphical user interface
  • LGPL license: users may copy, modify, and redistribute the corpus for research and/or profit

Joel C. Wallenberg (joel.wallenberg@gmail.com)
Anton Karl Ingason (anton.karl.ingason@gmail.com)
Einar Freyr Sigurðsson (einarfs@gmail.com)
Eiríkur Rögnvaldsson (eirikur@hi.is)

The project is funded by:
  • RANNÍS, grant no. 090662011, "Viable Language Technology beyond English – Icelandic as a test case" ("Hagkvæm máltækni utan ensku - íslenska tilraunin").
  • U.S. National Science Foundation (NSF) International Research Fellowship Program (IRFP), grant #OISE-0853114, "Evolution of Language Systems: a comparative study of grammatical change in Icelandic and English".
/Anton Karl Ingason

2 comments:

Signe said...

Does anyone else have trouble loading the webpage? (http://linguist.is/icelandic_treebank/Download)

Unknown said...

I'm sorry you had trouble accessing the download page. It must have been some temporary problem because many people have successfully downloaded the corpus. Let me know if you still have problems downloading!

Anton Karl Ingason
anton.karl.ingason@gmail.com