Sunday January 21st 2018

Categories

Insider

Archives

Open data, SSIS and Excel

CHECKRecently Downloaded the Large dataset from UK Government Open Data Website i.e. Land Registry Price Paid Data from 1995 to 2015 about 24 million definitive records dating back to January 1995. Used the SSIS to get the data into SQL Server from CSV. Initially aggregate the data using the following query,

select YEAR([Data of Transfer]), sum([Price in Pounds]),avg([Price in Pounds]),
Town,County,[Property Type],[Old New] from
dbo.ukhomes
group by YEAR([Data of Transfer]),Town,County,District,[Property Type],[Old New]

after the above aggregation, total rows became 354,884. Exported the data to EXcel for further analysis.

Now there are 6 columns – Year Trnsfrd, Amnt, Town, County, Property type, Old New
and 354,884 rows. In Excel Sheet A1 to F 354,884. I used the following technique to select a large data range. Specifically, to use [F5] select A1:f354,88421, do the following:

Select A1.
Press [F5].
Enter F354,884 in the Reference field, but don’t click OK yet.
Press and hold the [Shift] key.
While holding down [Shift], click OK.

Still working on the data and will share more on any pattern / trend etc in future posts on this topic.

Background on this dataset

Price Paid Data tracks the residential property sales in England and Wales that are lodged with Land Registry for registration. Our price paid data tracks the residential property sales in England and Wales that are lodged with us for registration. The dataset is a reliable source of house price information and consists of more than 24 million definitive records dating back to January 1995. Source

Kaleem Mian
Senior Consultant
DataScientist.ca