This week we take the week 4 exercise, perform additional analysis, and apply some design patterns to make the design (slightly) better.
If you did not return the week 4 assignment, there’s a reference Data Flow Diagram that you can work on. This will be made available after the week 5 deadline, at a location that will be separately communicated. You may opt to use either your own diagram from week 4 (which lets you start right away), or the reference diagram (in which case you need to wait).
Making the design better
In the discussion that followed your week 4 threat modelling, you managed to convince the company to make some changes to the architecture.
First, you think that co-locating the web service (Nginx and Node.js) and the PostgreSQL database on the same machine is far from optimal. If there is a security vulnerability in Nginx, or if the Node.js application has an SQL injection vulnerability, an attacker could potentially compromise the whole database.
Second, the use of the (publicly displayed) meter serial number as an encryption key doesn’t feel exactly right. The engineers proposed putting the serial number sticker inside the meter box, so that the user would have to break the seal in order to see it, and removing the meter number request from the web account creation stage.
Irrespective of your professional opinion about the system described above, the company for which you performed threat modelling last week is pushing forward with it and intends to hop on the PaaS (Platform as a Service) cloud bandwagon. Essentially, they’re going to ditch their own servers, which are an additional maintenance burden, and switch to Heroku. The engineers have drawn a corrected data flow diagram that shows the Heroku option. (You are glad to see that it actually implements additional security domains, and segregates the SQL database from the web front-end.)
Heroku (https://www.heroku.com/) is a PaaS platform that runs on Amazon cloud services (https://aws.amazon.com/). Essentially, Heroku provides a managed web application framework: for the system in the week 4 exercise, the Node.js and PostgreSQL installations are provided by Heroku, and the developers don’t have to maintain them any more.
However, the customers of the company are in Finland, and the company aims to run the system from Heroku servers residing in Amazon’s us-east-1 region (in plain English, this means that the servers are going to physically run, and the database will be stored, in a server farm in Northern Virginia, USA).
As you recall, it may not be straightforward for a European Union-based outfit to export EU customers’ personal data into the United States. The company’s lawyers have identified this as a potential risk; you have been contracted to determine the technical privacy aspects of the design.
On top of the DFD from week 4 (i.e., either your own or the reference one we provide), draw the “privacy domains” - meaning, which components are physically within which privacy jurisdiction. The jurisdictions here are Finland (in the EU) and the United States. Ensure that it is clear which data flows cross the physical EU/US boundary.
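If it helps to double-check your drawing, the boundary-crossing flows can also be listed mechanically. The following is only a minimal sketch: the component names, flow labels, and jurisdiction assignments are placeholders made up for illustration, not the reference diagram, so substitute the elements of your own DFD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One arrow in the data flow diagram."""
    source: str
    target: str
    label: str


# Placeholder assignment of components to privacy jurisdictions.
# These names are assumptions for illustration, not the reference DFD.
JURISDICTION = {
    "Meter": "FI",
    "Customer browser": "FI",
    "Node.js web app": "US",
    "PostgreSQL database": "US",
}

# Placeholder data flows between the components above.
FLOWS = [
    Flow("Meter", "Node.js web app", "meter data"),
    Flow("Customer browser", "Node.js web app", "account data"),
    Flow("Node.js web app", "PostgreSQL database", "SQL queries"),
]


def cross_border_flows(flows, jurisdiction):
    """Return the flows whose endpoints lie in different jurisdictions."""
    return [
        flow for flow in flows
        if jurisdiction[flow.source] != jurisdiction[flow.target]
    ]


if __name__ == "__main__":
    for flow in cross_border_flows(FLOWS, JURISDICTION):
        print(
            f"{flow.source} ({JURISDICTION[flow.source]}) -> "
            f"{flow.target} ({JURISDICTION[flow.target]}): {flow.label}"
        )
```

Every flow this prints should cross a privacy domain boundary in your picture; a flow that stays inside one jurisdiction should not.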
Then, determine all locations in the diagram that process or store personal data and/or special categories of personal data (see the GDPR text; the applicable definitions are in Article 4 (1) and Article 9).
Remind yourself about the “TRIM” considerations (see lecture notes) that you can perform for personal data.
What to return
- A data flow diagram that has the privacy domains drawn on top of it. As with the week 4 exercise, you should return the picture electronically.
- A list (a short list with bullets is fine) of:
  - What data stored or processed in the system (if any) should be classified as personal data as defined in Article 4 (1) of the GDPR?
  - What data stored or processed in the system (if any) should be classified as special categories of personal data as defined in Article 9 of the GDPR?
- Answers to the following TRIM considerations, with a maximum of three sentences per consideration:
  - You drew the privacy domain boundaries on the data flow diagram. Who are Controllers and who are Processors in this picture? (See lecture notes.)
  - What sort of Retention policy would you recommend for the company (i.e., when should personal data be destroyed)?
  - Is there a place where you could apply some privacy engineering to ensure that collected data is less likely to leak private information? You do not have to come up with a solution, but where should time be spent?
  - Can we Minimise some data transfer or storage? Could we replace some of the (sensitive) personal data flows or storage with something else that would not be personal data, thus freeing us from the requirements of the GDPR?
In your submission, please copy the bullets above and write your answer after each one. Please keep your answer to three sentences per consideration.
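To make the Retention consideration concrete, it can help to think about what a retention rule looks like once it is written down as something a system could enforce. The sketch below is purely illustrative: the record layout, the `collected_at` field name, and the 365-day period are assumptions, not a recommendation and not part of the expected answer.

```python
from datetime import datetime, timedelta, timezone

# Placeholder retention period; the real value is a policy decision.
RETENTION = timedelta(days=365)


def is_expired(collected_at, now, retention=RETENTION):
    """True if a record has been kept longer than the retention period."""
    return now - collected_at > retention


def purge(records, now, retention=RETENTION):
    """Return only the records that are still within the retention period."""
    return [
        record for record in records
        if not is_expired(record["collected_at"], now, retention)
    ]


if __name__ == "__main__":
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    records = [
        {"id": 1, "collected_at": now - timedelta(days=30)},
        {"id": 2, "collected_at": now - timedelta(days=400)},
    ]
    kept = purge(records, now)
    print([record["id"] for record in kept])
```

A written policy answers the same two questions this code has to answer: from which moment the clock starts, and how long the data may be kept after that.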