# Debate: Anonymising data in retail: CSR challenge or financial risk?
Welcome to the “Debatech” series of articles, in which we present our thoughts on topics related to tech and retail. Our aim? To share our findings and discoveries in order to advance public debate. This month’s author is Max Barbet, co-founder of YOUrban (2 Liens). With a background in information science from EPITECH, he offers a unique perspective on a controversial debate: personal data. In this article, you’ll find out how he sees the issues facing the retail giants, and his ideas for solutions. Without further ado, read Max Barbet’s article in Débatech.
Photo by Ivan
The use of personal data for marketing and data analysis has become commonplace in recent years. Aside from the enthusiasm for the prospects that this use represents in terms of marketing performance, it has also given rise to a number of controversies, particularly concerning the protection of privacy.
In response to these concerns, data anonymisation has emerged as a solution for protecting users’ identities, but what is the reality?
Is data anonymization a key issue for businesses?
Data anonymization is frequently used to protect users’ privacy. But do analyses built on anonymized data perform as well as before? What impact does anonymization have?
Anonymisation: a CSR risk or opportunity for businesses?
Data anonymization can be seen as a viable solution for protecting privacy while enabling companies to improve their marketing and sales performance. However, it is also true that under certain conditions it can lead to a loss of accuracy and relevance – particularly if the data is used to drive predictive models on typical customer profiles.
Before taking any steps, it is therefore necessary to clearly determine the objectives of the algorithms and methods that require this data, in order to minimise the risk of losing accuracy and relevance.
Anonymisation, a viable solution that has already proved its ability to combine ethics and performance
This risk can vary considerably depending on the algorithm used and the data to which it is applied.
For example, the study “An Efficient Big Data Anonymization Algorithm Based on Chaos and Perturbation Techniques” by C. Eyupoglu, M. Aydin, A. Zaim and A. Sertbas demonstrates the challenges of personal data security when it comes to Big Data.
In this study, a data anonymization algorithm based on chaos and perturbation has been proposed to preserve the confidentiality and usefulness of voluminous data.
The performance of the proposed algorithm is then evaluated in terms of Kullback-Leibler divergence, probabilistic anonymity, classification accuracy, and execution time.
Experimental results showed that the proposed algorithm is efficient and gives better results in terms of Kullback-Leibler divergence and classification accuracy than most existing algorithms using the same dataset.
Conclusion: by using chaos to perturb the data, this algorithm looks promising for data mining and helps to reconcile confidentiality with the usefulness of the data.
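To make the Kullback-Leibler criterion mentioned above more concrete, here is a minimal Python sketch. It does not reproduce the chaos-based algorithm from the study: as a stand-in, a hypothetical `perturb` function simply adds Gaussian noise to synthetic "basket value" data invented for the example. The bin settings and noise levels are also assumptions. The idea is only to show how the divergence between the original and the anonymised distribution can be measured, and how it tends to grow as the perturbation gets stronger.

```python
import math
import random
from collections import Counter


def histogram(values, bin_width, lo, n_bins):
    """Return smoothed bin probabilities for a list of numeric values."""
    counts = Counter(
        min(max(int((v - lo) // bin_width), 0), n_bins - 1) for v in values
    )
    # Add-one smoothing so that no bin has zero probability (keeps the log finite).
    total = len(values) + n_bins
    return [(counts.get(b, 0) + 1) / total for b in range(n_bins)]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q), in nats, for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def perturb(values, noise_scale, rng):
    """Hypothetical stand-in for an anonymiser: add Gaussian noise to each value."""
    return [v + rng.gauss(0.0, noise_scale) for v in values]


rng = random.Random(42)
# Synthetic data, invented for illustration only.
original = [rng.gauss(50.0, 10.0) for _ in range(5000)]

p = histogram(original, bin_width=5.0, lo=0.0, n_bins=20)
kl_by_scale = {}
for noise_scale in (1.0, 5.0, 20.0):
    anonymised = perturb(original, noise_scale, rng)
    q = histogram(anonymised, bin_width=5.0, lo=0.0, n_bins=20)
    kl_by_scale[noise_scale] = kl_divergence(p, q)
    print(f"noise={noise_scale:>4}: KL divergence = {kl_by_scale[noise_scale]:.4f}")
```

A low divergence suggests that the anonymised data still resembles the original distribution, and is therefore still usable for analysis. It says nothing by itself about how well individuals are protected, which is why the study also reports probabilistic anonymity.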
Is it possible to design simplified models that combine data anonymisation and performance?
On a purely technical level, it is also possible to design models that work with average characteristics that are representative of a group (or “persona”) rather than with an average of precise characteristics retrieved online.
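As an illustration of the persona idea, here is a minimal Python sketch under stated assumptions: the records, the field names (`customer_id`, `age`, `basket`, `visits`), the `assign_persona` rule and the `min_group_size` threshold are all hypothetical. It shows individual records being reduced to group-level averages, so that whatever model comes next only ever sees the persona, not the person.

```python
from collections import defaultdict
from statistics import mean

# Synthetic records, invented for illustration; "customer_id" is the personal identifier.
records = [
    {"customer_id": "c1", "age": 23, "basket": 18.0, "visits": 2},
    {"customer_id": "c2", "age": 31, "basket": 22.0, "visits": 4},
    {"customer_id": "c3", "age": 45, "basket": 60.0, "visits": 1},
    {"customer_id": "c4", "age": 52, "basket": 40.0, "visits": 3},
    {"customer_id": "c5", "age": 38, "basket": 50.0, "visits": 2},
]


def assign_persona(record):
    """Hypothetical rule mapping a record to a coarse persona label."""
    return "under_35" if record["age"] < 35 else "35_and_over"


def build_persona_profiles(rows, min_group_size=2):
    """Aggregate individual rows into per-persona averages, dropping identifiers."""
    groups = defaultdict(list)
    for row in rows:
        groups[assign_persona(row)].append(row)

    profiles = {}
    for persona, members in groups.items():
        if len(members) < min_group_size:
            # Group too small: skip it rather than risk singling someone out.
            continue
        profiles[persona] = {
            "size": len(members),
            "avg_basket": mean(m["basket"] for m in members),
            "avg_visits": mean(m["visits"] for m in members),
        }
    return profiles


profiles = build_persona_profiles(records)
for persona, profile in profiles.items():
    print(persona, profile)
```

Aggregation of this kind is not a guarantee of anonymity in itself: very small groups can still reveal individuals, which is why the sketch discards groups below a minimum size.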
Secondly, this approach calls for change management: it is essential to train employees in these strategic choices, so that they are reassured that performance will be maintained.
Photo by Tima Miroshnichenko
What strategy should be put in place to combine CSR, performance and GDPR?
As well as recruiting a technical profile who is aware of GDPR and algorithmic performance issues, it is essential to educate yourself independently about these challenges by reading the literature on the subject.
Here are some of my recommendations for further reading:
Recommendation 1: “Business Intelligence and Analytics: From Big Data to Big Impact” by Hsinchun Chen, Roger H. L. Chiang and Veda C. Storey
Keywords: big data, business intelligence, decision support
This study explores the trends and challenges of data analysis in the context of big data management. It looks at how businesses can leverage this data to improve performance and make better decisions through data analysis.
Recommendation 2: “Data privacy and security in business intelligence and analytics” by Delgado Mercè, Jaime.
Keywords: GDPR, cybersecurity, business intelligence, data science
This study explores the issues of privacy and data security in data analysis. It examines the risks associated with the use of personal data and the methods that can be used to protect privacy and data security, such as data anonymization and access rights management.
Recommendation 3: “Business intelligence meets big data: an overview on security and privacy” by C.A. Ardagna and E. Damiani
Keywords: big data, security, privacy, business intelligence, cybersecurity
This study explores the issues of security and privacy in the context of big data management and the use of business intelligence. It examines the risks associated with the use of personal data and the methods that can be used to protect privacy and data security, such as data anonymisation and access rights management. It also presents a literature review of the main security and privacy protection approaches used in the field of business intelligence and data analysis.
Are accuracy and performance necessarily linked?
Performance should not be confused with accuracy: while personal data can provide details of typical profiles, it does not necessarily improve sales results.
The example of retail is particularly telling: to improve sales, it’s not a question of understanding the expectations of several isolated individuals. Above all, it’s about identifying the common denominator that will enable you to reach a broad customer target and achieve your growth objectives.
Photo by Lucas Pezeta
So it’s important to reject the common belief that equates relevance for the company with accuracy in customer data. It is perfectly possible to achieve maximum performance while minimising the collection of personal information!
A final word
At YOUrban, we firmly believe in respect for privacy. That’s why the performance of our technological models does not depend on personal data. All our data is anonymised, from A to Z.
This article by no means represents a complete state of the art on the subject. It does, however, aim to provide some initial food for thought for ambitious companies that want to put the protection of customer data at the heart of their digital transition!
MINIBIO
Max Barbet has been working in high-impact tech for 4 years. Web development, data analysis, artificial intelligence, SaaS architecture: Max is a true Swiss Army knife… Today, his mission is to give life to YOUrban’s ambition: to relaunch retail, thanks to the power of data.
To do this, he studies numerous data sources, compares formats to choose the most relevant, and monitors technical development with the aim of combining two key words: ethics and performance. In this article, he shares his thoughts on how to use them.