In Estonia, the Ministry of Education and Research keeps track of students, and the Tax and Customs Board keeps track of employment (by tracking income tax payments). If data scientists could access both databases, they could look for a correlation between working during studies and not graduating on time. However, this data cannot be shared because of the Personal Data Protection Act and the Taxation Act (not to mention the relevant EU regulation), which prevents such studies from being performed.
The Personal Data Protection Act does in fact permit processing of personal data for research purposes (see § 16), although mining the data in a privacy-preserving manner may still have some advantages.
We used the Sharemind Application Server with its analytics package Rmind to perform the study in a privacy-preserving way. The privacy-preserving solution was checked by the Estonian Data Protection Inspectorate. Their response was that our solution does not process Personally Identifiable Information (PII) in the meaning of the law.
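As a rough illustration of how an aggregate can be computed without any single server seeing an individual record, here is a minimal sketch of additive secret sharing among three parties, the general technique that Sharemind's secure computation builds on. This is not Sharemind or Rmind code: the function names, the modulus, and the sample values are all hypothetical and chosen only for the example.

```python
import secrets

MODULUS = 2**32   # ring size; an assumption made for this sketch
N_PARTIES = 3     # number of computing servers; also an assumption


def share(value, n_parties=N_PARTIES):
    """Split value into n additive shares that sum to value mod MODULUS."""
    # All shares but the last are uniformly random, so each share alone
    # carries no information about the value.
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recombine shares; only possible when all of them are brought together."""
    return sum(shares) % MODULUS


def secure_sum(values):
    """Sum secret-shared values: each server adds up only its own shares."""
    per_server = [0] * N_PARTIES
    for v in values:
        for i, s in enumerate(share(v)):
            per_server[i] = (per_server[i] + s) % MODULUS
    # Only the aggregated per-server totals are combined at the end.
    return reconstruct(per_server)


# Hypothetical input: months worked during studies for four students.
worked_months = [12, 0, 7, 24]
total = secure_sum(worked_months)
print(total)
```

In this sketch each server only ever holds random-looking shares and a running total of them; the individual inputs are never reassembled, only the final sum.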
For the study to be private in practice, the participating institutions need to audit the code that is run on the Sharemind server. In this case, the Tax and Customs Board had a person with the skills and willingness to audit the code:
Furthermore, the Tax and Customs Board reviewed Sharemind’s source code to ensure that everything is performed according to the study plan.
The findings of the study:
Our study showed relations between higher education and higher income, but we found no relation between working during studies and not graduating on time. Instead, it turned out that Estonian students of all fields work an equal amount. Also, our data showed clearly the reduction of employment during the financial crisis in 2008.