CSIRO Enterprise Research Collection

TREC Enterprise track 2007

Access to the collection

Note: to access the new corpus data, you must first sign and return the Organisation Agreement and be allocated a username and password. Instructions for where to return the signed agreement are in the agreement itself.

If you only fax the form, please also email us so that we have an email address in case we need to contact you, and so that we can confirm the fax got through.

The data may be downloaded either as a single tar file (CSIRO_Enterprise_Research_Collection.tar - 357MB in size) or as individual bundles and extras.
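
For those who take the single tar file, the following is a minimal sketch of how the bundles inside it might be listed before unpacking. It assumes the tar has already been downloaded locally and that the bundles appear in it as ordinary files ending in .gz; the internal layout of the archive is not described here, so treat the filter as an assumption. The function name list_bundles is hypothetical.

```python
import tarfile


def list_bundles(tar_path: str) -> list[str]:
    """Return the sorted names of the .gz members of a local tar file.

    Assumption: bundles are stored as regular files whose names end
    in ".gz". Any directory structure inside the archive is kept in
    the returned names.
    """
    with tarfile.open(tar_path, "r") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".gz")
        )


if __name__ == "__main__":
    # File name as given above; adjust the path to wherever it was saved.
    for name in list_bundles("CSIRO_Enterprise_Research_Collection.tar"):
        print(name)
```

Listing the members first makes it easy to compare the number of bundles found against the count quoted in the corpus stats.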

Quick stats on the corpus:

4493545213 bytes = 4.1849401 gigabytes
370715 docs
267 bundles - CSIRO0lmn.gz
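
The gigabyte figure above uses binary gigabytes (2^30 bytes each). As a quick check of that arithmetic, and to derive rough averages from the same three numbers, a small sketch follows. The averages are not stated in the original stats; they are simple divisions of the quoted totals and assume the byte count covers all documents.

```python
# Figures quoted in the corpus stats.
CORPUS_BYTES = 4493545213
DOC_COUNT = 370715
BUNDLE_COUNT = 267


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to binary gigabytes (2**30 bytes each)."""
    return n_bytes / 2**30


size_gib = to_gib(CORPUS_BYTES)

# Derived values (assumption: the byte total covers all documents).
avg_bytes_per_doc = CORPUS_BYTES / DOC_COUNT
avg_docs_per_bundle = DOC_COUNT / BUNDLE_COUNT

print(f"{size_gib:.7f} gigabytes (binary)")
print(f"about {avg_bytes_per_doc:.0f} bytes per document")
print(f"about {avg_docs_per_bundle:.0f} documents per bundle")
```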

Extras:

Crawl of *.csiro.au websites carried out using Funnelback 6.0 by Peter Thew and Peter Bailey. Data preparation by Nick Craswell and Peter Bailey.


More information