Oxford2014

The Oxford Open Science Group and the Oxford e-Research Centre are holding an event for Open Data Day at the Oxford e-Research Centre.

We aim to gather after 9:30am and end at 5pm.

The Centre is based in the centre of Oxford on Keble Road.

Lunch is being sponsored by White October, a local website & mobile app company that ran the first event in Oxford.

You can register for the day on our Google form, and tell us what skills you have or suggest any data sets for the day.

Data Sets

 * Carbon Data released from DECC


 * Google broadband and wireless prices and speeds internationally (http://policybythenumbers.blogspot.co.uk/2013/05/international-broadband-pricing-study.html)


 * International Aid Transparency Initiative (http://www.iatiregistry.org/)


 * Oxford City Council (http://www.oxford.gov.uk/PageRender/decCD/Transparency.htm)


 * Arts Council of England Grants 2012-2013 (https://docs.google.com/spreadsheet/ccc?key=0Ap_FnlqKeK9sdFloTFZIZFhnZkV4NEFpY256T2RRS3c&usp=sharing)


 * Oxfordshire County Council might have some transport data as well for the day.


 * Environment Agency http://www.geostore.com/environment-agency/WebStore?xml=environment-agency/xml/ogcDataDownload.xml


 * Flood data from the OKFn EN list: http://www.owenboswarva.com/opendata/EA/ea_flood_datasets.htm

Carbon data wrangling

 * https://github.com/wf4ever/ro-manager/tree/develop/src/simplecsvtordf - code and data

Did some hacking on converting CSV data on carbon emissions by UK region into RDF, then experimented with querying the data using Fuseki and SPARQL.

Source data: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/212445/Full_DatasetFINAL.xlsx

Exported "full dataset" tab in spreadsheet to CSV file.

Sample from CSV (cols 7, 8):

Region Name,Second Tier Authority,LA Region Name,LAD10CD,Year,A. Industry and Commercial Electricity,B. Industry and Commercial Gas,C. Large Industrial Installations,D. Industrial and Commercial Other Fuels,E. Agricultural Combustion,Industry and Commercial Total,F. Domestic Electricity,G. Domestic Gas,H. Domestic 'Other Fuels',Domestic Total,I. Road Transport (A roads),J. Road Transport (Motorways),K. Road Transport (Minor roads),L. Diesel Railways,M. Transport Other,Transport Total,N. LULUCF Net Emissions,Grand Total,"Population                                             ('000s, mid-year estimate)",Per Capita Emissions (t),,,,,,,,,,,,
North East,Darlington,Darlington,00EH,2005, 165.9, 139.9 , 0.0 , 29.0 , 4.0 , 338.7 , 100.6 , 156.5 , 31.5 , 288.7 , 87.6 , 49.0 , 74.2 , 6.1 , 5.2 , 222.1 , 4.1 ,853.6,100.3,8.5,,,,,,,,,,,,
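
The exported CSV has stray spaces around the values, a run of spaces inside the population heading, and a tail of empty cells on every row. A minimal Python sketch of tidying that up before conversion, using only the standard library. The sample is abridged to a few columns, and the helper name clean_rows is made up for illustration; this is not the code used on the day.

```python
import csv
import io

# Abridged copy of the sample: a handful of columns only, keeping the
# stray spaces and trailing empty cells seen in the export.
SAMPLE = """\
Region Name,Second Tier Authority,LA Region Name,LAD10CD,Year,A. Industry and Commercial Electricity,B. Industry and Commercial Gas,Grand Total,"Population        ('000s, mid-year estimate)",Per Capita Emissions (t),,,
North East,Darlington,Darlington,00EH,2005, 165.9, 139.9 ,853.6,100.3,8.5,,,
"""


def clean_rows(text):
    """Yield rows with whitespace normalised and trailing empty cells removed."""
    for row in csv.reader(io.StringIO(text)):
        # Collapse runs of whitespace and trim both ends of every cell.
        cells = [" ".join(cell.split()) for cell in row]
        # Drop the empty cells the spreadsheet export leaves at the end.
        while cells and cells[-1] == "":
            cells.pop()
        if cells:
            yield cells


header, first = list(clean_rows(SAMPLE))
record = dict(zip(header, first))
print(record["Second Tier Authority"], record["Year"], record["Grand Total"])
```

Values stay as strings here, matching the plain literals used in the RDF sample; converting them to numbers would be a separate step.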

Proposed sample as RDF (using Turtle):

@prefix ex: .

ex:colheadings a ex:ColumnHeadingText ;
    ex:colA "Region Name" ;
    ex:colB "Second Tier Authority" ;
    ex:colC "LA Region Name" ;
    ex:colD "LAD10CD" ;
    ex:colE "Year" ;
    ex:colF "A. Industry and Commercial Electricity" ;
    ex:colG "B. Industry and Commercial Gas" ;
    ex:colH "C. Large Industrial Installations" ;
    ex:colI "D. Industrial and Commercial Other Fuels" ;
    ex:colJ "E. Agricultural Combustion" ;
    ex:colK "Industry and Commercial Total" ;
    ex:colL "F. Domestic Electricity" ;
    ex:colM "G. Domestic Gas" ;
    ex:colN "H. Domestic 'Other Fuels'" ;
    ex:colO "Domestic Total" ;
    ex:colP "I. Road Transport (A roads)" ;
    ex:colQ "J. Road Transport (Motorways)" ;
    ex:colR "K. Road Transport (Minor roads)" ;
    ex:colS "L. Diesel Railways" ;
    ex:colT "M. Transport Other" ;
    ex:colU "Transport Total" ;
    ex:colV "N. LULUCF Net Emissions" ;
    ex:colW "Grand Total" ;
    ex:colX "Population" ;
    ex:colY "Per Capita Emissions (t)" ;
    .

ex:row a ex:RowData ;
    ex:colA "North East" ;
    ex:colB "Darlington" ;
    ex:colC "Darlington" ;
    ex:colD "00EH" ;
    ex:colE "2005" ;
    ex:colF "165.9" ;
    ex:colG "139.9" ;
    ex:colH "0.0" ;
    ex:colI "29.0" ;
    ex:colJ "4.0" ;
    ex:colK "338.7" ;
    ex:colL "100.6" ;
    ex:colM "156.5" ;
    ex:colN "31.5" ;
    ex:colO "288.7" ;
    ex:colP "87.6" ;
    ex:colQ "49.0" ;
    ex:colR "74.2" ;
    ex:colS "6.1" ;
    ex:colT "5.2" ;
    ex:colU "222.1" ;
    ex:colV "4.1" ;
    ex:colW "853.6" ;
    ex:colX "100.3" ;
    ex:colY "8.5" ;
    .

Hacked some previously written software to convert the tabular CSV data to RDF, as above.
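
For anyone wanting to try the same idea without that software, a rough Python sketch of the mapping: one resource for the column headings, one per data row, and a predicate per spreadsheet column (colA, colB, ...). This is not the simplecsvtordf code. The namespace http://example.org/carbon# is a placeholder, and the row subject names (ex:row1, ex:row2, ...) and function names are assumptions made for illustration.

```python
# Placeholder namespace (assumption); substitute whatever prefix the data should use.
PREFIX = "@prefix ex: <http://example.org/carbon#> ."


def column_letters(index):
    """Spreadsheet-style label for a zero-based column index (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def turtle_literal(text):
    """Quote a cell as a plain Turtle string, escaping backslash, quote and newline."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def to_turtle(subject, rdf_type, cells):
    """Render one row of cells as a Turtle description of `subject`."""
    lines = [f"{subject} a {rdf_type} ;"]
    for i, cell in enumerate(cells):
        lines.append(f"    ex:col{column_letters(i)} {turtle_literal(cell)} ;")
    lines.append("    .")
    return "\n".join(lines)


def csv_to_turtle(rows):
    """First row becomes the headings resource, every later row a RowData resource."""
    rows = list(rows)
    blocks = [PREFIX, to_turtle("ex:colheadings", "ex:ColumnHeadingText", rows[0])]
    for n, row in enumerate(rows[1:], start=1):
        # Hypothetical naming scheme: ex:row1, ex:row2, ...
        blocks.append(to_turtle(f"ex:row{n}", "ex:RowData", row))
    return "\n\n".join(blocks) + "\n"


doc = csv_to_turtle([["Region Name", "Year"], ["North East", "2005"]])
print(doc)
```

Every value is written as a plain string literal, as in the sample above; typed numeric literals would make range queries in SPARQL easier but are left out of this sketch.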

Installed latest Fuseki (http://jena.apache.org/documentation/serving_data/)

Loaded the converted data and ran a couple of queries, e.g.

prefix ex:

select ?a ?av ?b ?bv ?e ?ev ?w ?wv ?x ?xv where {
    ?s a ex:ColumnHeadingText ; ex:colA ?a ; ex:colB ?b ; ex:colE ?e ; ex:colW ?w ; ex:colX ?x .
    ?r a ex:RowData ; ex:colA ?av ; ex:colB ?bv ; ex:colE ?ev ; ex:colW ?wv ; ex:colX ?xv .
}
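
A query like this pairs each heading text with the matching value from every data row. A hedged Python sketch of sending such a query to a Fuseki server over the SPARQL protocol and flattening the JSON results, standard library only. The endpoint http://localhost:3030/ds/query assumes a local server with a dataset named "ds", the namespace is a placeholder, and the query is cut down to two columns; none of these come from the notes above.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:3030/ds/query"  # assumption: local Fuseki, dataset "ds"
EX = "http://example.org/carbon#"            # placeholder namespace (assumption)

QUERY = f"""
prefix ex: <{EX}>
select ?a ?av ?w ?wv where {{
    ?s a ex:ColumnHeadingText ; ex:colA ?a ; ex:colW ?w .
    ?r a ex:RowData ; ex:colA ?av ; ex:colW ?wv .
}}
"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows_from_results(document):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    variables = document["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in document["results"]["bindings"]
    ]


def run_query(endpoint, query):
    """Send the query and return flattened rows (needs a running server)."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        return rows_from_results(json.load(response))


# Hand-written results document in the SPARQL JSON format, so the
# flattening step can be tried without a server.
EXAMPLE = {
    "head": {"vars": ["a", "av", "w", "wv"]},
    "results": {"bindings": [{
        "a": {"type": "literal", "value": "Region Name"},
        "av": {"type": "literal", "value": "North East"},
        "w": {"type": "literal", "value": "Grand Total"},
        "wv": {"type": "literal", "value": "853.6"},
    }]},
}

for row in rows_from_results(EXAMPLE):
    print(f'{row["a"]}: {row["av"]}, {row["w"]}: {row["wv"]}')
```

With a server running and the data loaded, run_query(ENDPOINT, QUERY) would return the same kind of rows from the real data.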

Oxford flood tweet count vs water levels
Ryan added a blog post about the day: http://ryanbrooks.co.uk/posts/2014-02-22-open-data-day-2014/

Google Broadband Pricing
http://hangler.net/2014/02/22/oxford-open-data-day/

Twitter Hashtag

 * #oddox

Attendees
Graham Klyne (@gklyne) - carbon data wrangling

Yan Weigang - carbon data wrangling

@spikeheap Ryan Brooks

@iiSeymour Chris Seymour

@metaphract Tom Talbot

@iainemsley - Jenny Molloy