Data in Canada - Looking Ahead

Lars Vilhuber

Jan 22, 2020

Declining survey response rates

An image worth many words

an image

(source: The Economist)

Another statistic

calls (source: Pew)

Administrative data

Administrative data can be very useful

  • Universal coverage
  • Comprehensive network data

Evolution

Use of administrative data

Admin data can be hard to access

  • Onerous rules
  • Administrative hassle
Secure rooms

Access

SafePod
pod

Hassles

Security clearance

Annual training

Remote access key

Cost

Privacy and security are necessary

Threats

Reconstruction attacks are real
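
To make the threat concrete, here is a minimal sketch of a differencing attack, a much simpler relative of the database reconstruction attacks referred to above. It assumes a provider that publishes exact totals for arbitrary groups; the names, incomes, and the `published_total` helper are all hypothetical.

```python
# Hypothetical confidential microdata: person -> income.
incomes = {"A": 52_000, "B": 61_000, "C": 48_000, "D": 250_000}


def published_total(data, group):
    """Exact (unprotected) total income for the requested group."""
    return sum(data[person] for person in group)


# Two innocuous-looking published statistics...
everyone = set(incomes)
everyone_but_d = everyone - {"D"}
total_all = published_total(incomes, everyone)
total_without_d = published_total(incomes, everyone_but_d)

# ...whose difference reveals one person's confidential value exactly.
recovered_income_d = total_all - total_without_d
print(f"Recovered income for D: {recovered_income_d}")
```

Neither statistic is sensitive on its own; the disclosure comes from combining them, which is why protection has to consider everything that is released.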

Bad research practices are frequent

  • PII published through negligence
  • Unencrypted data on laptops stolen
Source: ruoaa (pixabay.com)

Privacy and security need not be an impediment

The French network

French network

Successful CASD

660 projects (January 2020)

Pay-per-use, scalable infrastructure

Accessible from multiple European countries, and from North America

German network

Forschungszentren des IAB

Features of the German network

Multiple physical access points in North America and Europe

Web-access possible

Other examples

  • Danish data (hostage-holding)
  • US Government personnel (remote access when highly trusted)

Non-survey data are also imperfect

Administrative data

Response rates are high

But measure only certain things

“Found data”

On the internet, everybody is xx years old

Representativeness/reliability of one (internet) company’s data

Compare income

Admin data                  | Survey data
----------------------------+-------------------------------------
Formal wages                | Formal and informal wages
Profit/Income from SE       | Hours spent on SE
Reporting location          | (One of) actual locations
Receipt of support income   | Memory of receipt of support income
Purchase of product         | Consumption of product
Indicators of poverty       | Perception of poverty
Occupation recorded         | Tasks performed

Self-employment incidence

Figure 1, Abraham et al 2017

Self-employment mis-measurement

Figure 2b, Abraham et al 2017, for same individual

See also Meyer and Mittag (2019)

Compare locations

Differences in residence location

Distance between Survey and Admin Residence location
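
A distance of this kind can be computed from pairs of coordinates. The following is a minimal sketch, assuming both residence locations are available as latitude/longitude in decimal degrees and using the haversine great-circle formula. The work behind the figure may have used a different distance measure, and the function name and coordinates below are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Hypothetical linked records: (person id, survey residence, admin residence).
linked = [
    ("p1", (45.4215, -75.6972), (45.4215, -75.6972)),  # same place in both sources
    ("p2", (45.4215, -75.6972), (45.5017, -73.5673)),  # sources disagree
]

for person, survey_loc, admin_loc in linked:
    distance = haversine_km(*survey_loc, *admin_loc)
    print(f"{person}: {distance:.1f} km between survey and admin residence")
```

A distance of zero means the two sources agree; large distances flag records where the survey and the administrative source place the same person in different locations.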

Differences in work locations

Commute Distance for Single-Establishment Firms by size

Looking forward

Easier but secure access to richer data

Access should be made easier

  • the technology, training, and legal frameworks are available

Silos need to be broken down

  • combine data from private/gov’t/provincial sectors (not just “big” data)

Researchers and providers need to be trained

  • on new access methods, new protection methods, stronger ethics

New ways of collecting data

Consider co-designing surveys and admin data

  • for better data
  • for novel data

BIT (UK), IIU (Canada), OES (USA)

More secure ways to collect

  • Differential privacy works for some collections
  • Data ownership is the new oil - for individuals, too
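
As one illustration of the differential privacy bullet above, the sketch below releases a count with Laplace noise. It is a minimal sketch of the textbook Laplace mechanism, not a description of any collection discussed in this talk. The function name `private_count`, the privacy-loss parameter `epsilon`, and the example count are assumptions made for illustration.

```python
import random


def private_count(true_count, epsilon, rng):
    """Return true_count plus Laplace noise calibrated to a counting query."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    sensitivity = 1.0  # adding or removing one person changes a count by at most 1
    scale = sensitivity / epsilon
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


rng = random.Random(42)  # fixed seed only so the example is repeatable
noisy = private_count(true_count=1234, epsilon=0.5, rng=rng)
print(f"Released count: {noisy:.1f}")
```

A smaller `epsilon` gives stronger protection but noisier counts. That trade-off is one reason the approach suits some collections better than others.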

Thank you