BoM to scan our climate history

Australia’s Bureau of Meteorology (BoM) is undertaking a major project to digitise historic technical records, dating from 1860 through to 1956, that detail meteorological observation elements (e.g. temperature, pressure, wind).

The Bureau is Australia’s national weather, climate and water agency, with regional offices located in each State capital and in Darwin.

The Bureau is seeking to have more than 300,000 scanned PDF pages of handwritten, tabulated historical weather observations transcribed into an accessible digital version that can be incorporated into the Bureau's database. The BoM says it needs to make the data from these forms accessible as a national data asset.

It expects that this process may be manual (data entry), automated (e.g. optical character recognition/intelligent character recognition), or a combination of the two.

The call for Expressions of Interest sets out the considerable challenge in digitising these historical records:

  • The records comprise numerous form types ranging in size from Foolscap to approximately A1.
  • The records are for a number of different stations, which are identified by name on the form and a corresponding 6-digit station number for ingestion into the Bureau's database.
  • The data is handwritten on pre-printed tabular forms. The writing is generally in ink but some forms may be written in pencil.
  • Printed text on forms may have been amended by hand. These alterations must be recorded in preference to the original printed text. Some interpretation may be required.
  • Each form contains a range of meteorological observation elements (e.g. temperature, pressure, wind) for various days and times/time periods.
  • Some handwritten values may have been corrected, written over, may not be completely contained in the designated column or may have been written in a different column altogether (this is common with corrections).
  • The data is generally in imperial units.
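
To give a sense of what handling two of these points could involve, here is a minimal Python sketch that checks a 6-digit station number and converts imperial readings to metric. The function names are hypothetical, and the specific units (degrees Fahrenheit for temperature, inches of mercury for pressure) are assumptions; the call for Expressions of Interest says only that the data is generally in imperial units.

```python
import re

# The call describes a 6-digit station number; it is kept as a string here
# so that any leading zeros survive (an assumption about how IDs are stored).
STATION_NUMBER_RE = re.compile(r"[0-9]{6}")


def is_valid_station_number(value: str) -> bool:
    """Return True if value is exactly six ASCII digits."""
    return STATION_NUMBER_RE.fullmatch(value.strip()) is not None


def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


def inches_hg_to_hpa(in_hg: float) -> float:
    """Convert pressure from inches of mercury to hectopascals.

    Uses the conventional factor 1 inHg = 33.8639 hPa (rounded).
    """
    return in_hg * 33.8639
```

A transcription pipeline would probably keep the value as written on the form alongside any converted value, so that the conversion can be audited later.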

The Bureau expects an accuracy rate of 99.8% for the digitisation project. Depending on how the project is rolled out, there could be many more pages in later phases.
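
One way such a target might be checked is to compare a sample of transcribed values against an independently verified reference. The sketch below assumes accuracy is measured per field (per table cell); the article does not say whether the Bureau measures it per character, per field or per page, and the function names are hypothetical.

```python
from typing import Sequence

# Target taken from the article: 99.8% accuracy.
TARGET_ACCURACY = 0.998


def field_accuracy(transcribed: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of fields whose transcribed value equals the reference value."""
    if len(transcribed) != len(reference):
        raise ValueError("transcribed and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty sample")
    matches = sum(t == r for t, r in zip(transcribed, reference))
    return matches / len(reference)


def meets_target(
    transcribed: Sequence[str],
    reference: Sequence[str],
    target: float = TARGET_ACCURACY,
) -> bool:
    """Return True if the sample reaches the target accuracy."""
    return field_accuracy(transcribed, reference) >= target
```

Under this per-field reading, a 99.8% target allows at most two wrong values in every 1,000 fields checked.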

An example of a historical meteorological record.