Applying Data Science to Social Engineering – good or evil?
Social engineering, defined by Wikipedia as “efforts to influence particular attitudes and social behaviours on a large scale, whether by governments, media or private groups in order to produce desired characteristics in a target population”, has been around for a long time. Governments have long practised it: examples include China’s “Great Leap Forward” and “Cultural Revolution”, India’s population control programs that pay women to be sterilised, and Singapore’s attempts to improve public behaviour (banning chewing gum, severe punishments for drug use). Social engineering efforts have traditionally been either overt, for example on-the-spot fines for littering, or more subtle, for example posters urging people not to litter and bins placed in convenient locations.
However, what is new is the application of data science to social engineering. Data science is defined as methods to gain “meaningful information from large sets of complex data” and includes techniques such as predictive analysis, statistical analysis, data mining and machine learning. Masses of data are being collected by search engines, social media, website cookies, apps and CCTV cameras. The technology now exists to link these collections and feed them into social engineering programs to influence social outcomes.
In authoritarian China, a system of ‘social credit’ has been trialled and will be fully operational by 2020. Social credit is a personal scorecard for China’s citizens and is fed by surveillance cameras equipped with facial recognition, body scanning and geo-tracking to cast a constant gaze over every citizen. Those with a high score will have access to cheap loans and a fast track to the best universities and jobs, while those at the bottom will have trouble accessing credit and government jobs and may be banned from travel. Even worse, it is not just the behaviour of an individual that is taken into account, but also that of the people they associate with, for example their spouse, parents, siblings and even friends. Having a friend who is a dissident may reduce your social credit, creating pressure to dissociate from them.
While we may not live in a dictatorship in Australia, data science is assisting the government in more subtle social engineering programs. NSW Premier & Cabinet established a Behavioural Insights (BI) Unit in 2012 to “improve the effectiveness of public services and policy by applying what we know about the way people act and think”. The BI Unit works with agencies and partner organisations to understand a range of social problems (for example, improving court attendance in domestic violence cases, reducing peak hour commuting), design solutions and trial interventions to identify what works. Data analysis is one way of understanding these issues, although the BI Unit relies more on qualitative interviews and site visits. Data and statistical analysis are also needed to determine how the interventions stack up against each other.
The NSW BI Unit is modelled on the UK’s Behavioural Insights Team (BIT), which was part of the UK Government but is now an independent venture. The BIT inaugurated a data science team in January 2017 with a focus on ‘predictive modelling’. This usually consists of gathering a large set of historical data, using computer algorithms to find patterns in the data that it would be impractical or impossible for a human to find, and then using those patterns either to understand the process in question or to predict where and when specific events are likely to happen, in order to plan for and respond to those events.
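To make the idea concrete, here is a minimal sketch of predictive modelling in Python. It is not the BIT's method: the data is synthetic, the two numeric indicators per record are hypothetical, and logistic regression is simply one common technique chosen for illustration. The sketch fits a model to "historical" records in which an event did or did not occur, then ranks new cases by predicted risk.

```python
import math
import random


def sigmoid(z):
    # Clamp extreme values so math.exp cannot overflow.
    if z < -60:
        return 0.0
    if z > 60:
        return 1.0
    return 1.0 / (1.0 + math.exp(-z))


def predict_risk(weights, bias, x):
    """Estimated probability (0 to 1) that the event occurs for record x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(rows, labels, epochs=500, lr=0.5):
    """Fit a logistic regression by plain batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_risk(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def make_synthetic_history(n=400, seed=0):
    """Invented 'historical data': two indicators in [0, 1] and an outcome.

    Assumption for illustration only: the event is more likely when the
    first indicator is high and the second is low.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        p = sigmoid(4.0 * x[0] - 4.0 * x[1])
        rows.append(x)
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


# Learn the pattern from history, then rank new (hypothetical) cases.
rows, labels = make_synthetic_history()
weights, bias = fit_logistic(rows, labels)

new_cases = {"case_a": [0.9, 0.1], "case_b": [0.1, 0.9], "case_c": [0.5, 0.5]}
ranked = sorted(
    new_cases,
    key=lambda name: predict_risk(weights, bias, new_cases[name]),
    reverse=True,
)
print(ranked)  # highest predicted risk first
```

In practice the ranked list would be used to direct limited resources, such as inspections, to the highest-risk cases first. A real project would also need held-out data to check that the model's predictions are accurate before anyone acts on them.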
Using predictive modelling coupled with machine learning, the BIT was able to analyse which factors contributed to a school’s risk profile and to improve the effectiveness of school inspections through better targeting of high-risk schools. Other projects include helping children’s social workers to escalate cases accurately and predicting the severity of collisions from factors such as road and driver characteristics and vehicle type. The results demonstrate the benefits that can be achieved across core government services.
Despite the enormous potential to benefit citizens, there are disquieting aspects to the application of data science to social engineering. Dr Ian Oppermann, the CEO of the NSW Data Analytics Centre (DAC), has said that the urban renewal project to determine who lives where and with whom in South Sydney has ‘genuinely frightened him’. The project aims to provide household composition, information normally collected by the census every 5 years. The latency of census data is, however, an issue for urban planners, so the project takes data on both inward and outward migration, type of home and school, to determine whether there are enough schools and hospitals for the area. Working with telcos, banks and car-share companies, it is now possible to track, at 30-minute intervals, not only who lives where and with whom, but also who travels in, who travels out, who travels around and who stays put.
The potential for re-identification and misuse of this data is indeed frightening, even under a benevolent, democratically elected government.
Doll Martin Associates has extensive experience in data protection and privacy impact assessments. We are also data professionals who understand the challenges of applying data science to business problems (see this article).