[This article is authored by Avinash Kumar, II Year B.A. LL.B. (Hons.) student at Dr. Ram Manohar Lohia National Law University. This is the second part of a two-part series evaluating the challenges of handling Data Subject Access Requests (DSAR) under the Digital Personal Data Protection Act, 2023, from the perspective of Financial Service Providers (FSPs). In the previous part, the author analysed the scope of personal data, the right to access, and the tension between privacy and public interest. This part focuses on automation, KYC/AML compliance, and responsible data governance in DSAR management.]
PART-2
I. Introduction
In the first part of this blog, we looked at how personal data shapes financial privacy. We discussed the changing rules and the challenges financial service providers (FSPs) face as requests for data grow. In this part, we focus on how automation can help with handling these requests, and on the tension between Know Your Customer (KYC) and Anti-Money Laundering (AML) obligations on the one hand and individuals’ data rights on the other. The role of automation, particularly through Artificial Intelligence (AI) and Machine Learning (ML), in data processing, along with its biases, is also examined. I further highlight the need for strong data governance and why it is essential for meeting both regulatory requirements and changing customer expectations.
II. Automation Tools as a New Frontier in DSAR Handling
As discussed earlier, modern finance embeds personal data in complex instruments such as structured notes and in unorganised, fragmented records, requiring firms to distinguish which information is truly “personal” and which is merely general product-structuring information. To manage these specifics, financial institutions increasingly automate decisions through AI and algorithms that drive credit scoring, fraud detection, personalised service recommendations, redaction of sensitive data and more. The difficulty deepens where there is scope for algorithmic bias and where many functions operate as black boxes, making it hard to establish how a decision was made.
Consider the Apple Card gender bias controversy of 2019: the then newly launched Apple Card was drawn into allegations of gender bias in its credit limit assignments. The accusations were based on observations that men were being given markedly higher credit limits than women, even where couples had similar financial profiles and shared assets. Although regulators ultimately found no unlawful discrimination, the core contention was the opaque nature of the underlying machine learning model, which service representatives could not meaningfully explain.
A related issue that called for attention was that, even though gender was not used as an explicit input, the algorithm may have absorbed bias from historical data, better described as “latent feature bias”. Experts pointed out that the model could have relied on proxies correlated with being male, such as longer individual credit histories, favouring men as primary account holders with continuous credit records. Combined with the Apple Card’s lack of joint account features, this helps explain the systemic disparities: where a person’s credit history was primarily associated with a spouse’s account, their individual profile appeared “thin” to the model, which interpreted it as higher credit risk, regardless of shared income, assets or tax returns.
Given this risk of algorithmic bias, financial institutions (FIs) must be able to explain, at least at a high level, which of an individual’s data went through a machine learning model and how it contributed to the decision made. Regulations like the GDPR (Articles 15(1)(h) and 22) grant data subjects the right to meaningful information about the logic involved, as well as the envisaged consequences of automated decision-making. Failure to provide a candid explanation leads to customer dissatisfaction and regulatory scrutiny, jeopardising the organisation’s reputation and credibility.
III. KYC/AML Compliance and Data Subjects’ Rights
Financial institutions continuously monitor customer data while maintaining long-term records for audit, legal, and investigative purposes. This practice at times conflicts with data principals’ rights under privacy laws. Data privacy laws call for data minimisation, purpose limitation and the right to erasure, which sit uneasily with the far-reaching obligations of financial institutions, particularly those relating to AML, KYC and mandatory data retention.
For instance, the Payment and Settlement Systems Act, 2007 and the RBI KYC Master Directions, 2016 recognise the customer’s ‘implied consent’ where appropriate. The DPDPA does not provide for ‘implied consent’ or ‘deemed consent’, but it does recognise certain ‘legitimate uses’ under Section 7, permitting personal data to be processed where it is voluntarily disclosed, or for a medical emergency, protection of state interests, compliance with a judgment, employment-related purposes (including protection of trade secrets), and other specified purposes.
This creates a fresh challenge for institutions. They must interpret how these laws apply in undefined scenarios while fulfilling their duties, since they are expected to comply with both sets of obligations, i.e., sectoral regulatory requirements and the provisions of the DPDPA. In such situations, if a data principal withdraws consent, the fiduciary must stop processing their data and ensure its erasure. While a data principal can, in theory, withdraw consent and trigger erasure, in practice FIs can rarely erase all personal data immediately because of overlapping legal obligations. The retention requirements also diverge across sector-specific laws: for example, the Prevention of Money Laundering Act and the Indian Computer Emergency Response Team (CERT-In) Cybersecurity Directions mandate a retention period of five years, while the RBI’s Master Directions on Prepaid Payment Instruments require ten-year retention.
The GDPR, under Article 6(1)(c), permits the processing of personal data where “processing is necessary for compliance with a legal obligation to which the controller is subject”. Where the right to erasure is curtailed by financial regulatory retention requirements, which oblige FIs to keep identification records, transactional data and more for a specified duration, FIs must plan for this pre-emptively. It is important to have a clear retention schedule and to maintain an accurate inventory of data assets, noting their location, format, and legal basis for retention. Once the retention period expires and no other legitimate purpose for continued retention prevails, the data should be erased. This can be supported by implementing systems that automatically flag and delete data once retention periods expire, as sketched below.
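A minimal sketch of such a mechanism follows. The inventory structure, the field names and the approximation of a year as 365 days are assumptions made purely for illustration; an actual deployment would hook into the institution’s records management and deletion tooling.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class DataAsset:
    """One entry in the data asset inventory (illustrative fields only)."""
    asset_id: str
    location: str            # e.g. "core-banking/kyc-store"
    data_format: str         # e.g. "scanned PDF", "database row"
    legal_basis: str         # e.g. "PMLA record-keeping obligation"
    collected_on: date
    retention_years: int     # e.g. 5 under PMLA/CERT-In, 10 under PPI directions
    other_legitimate_purpose: bool = False

    def retention_expired(self, today: date) -> bool:
        # Approximate a year as 365 days for this sketch.
        expiry = self.collected_on + timedelta(days=365 * self.retention_years)
        return today >= expiry

def flag_for_erasure(inventory: list[DataAsset], today: date) -> list[DataAsset]:
    """Return assets whose mandated retention has lapsed and which have
    no remaining legitimate purpose, so they can be queued for deletion."""
    return [a for a in inventory
            if a.retention_expired(today) and not a.other_legitimate_purpose]
```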
That being said, these provisions do not grant financial institutions carte blanche. They must adhere to the principle of data minimisation, which means collecting only the personal information that is necessary and relevant for specific purposes, such as KYC and AML. This ensures that any data processed is essential for its intended use. In tandem, the principle of purpose limitation provides that data collected for AML/KYC compliance must not be repurposed for incompatible activities, even marketing unrelated financial products and services. Collecting excessive and irrelevant data under the guise of KYC/AML purposes, on a basis not shielded by a legal obligation, would potentially breach data protection principles.
The European Data Protection Board has voiced apprehension about AML and counter-terrorist financing (CFT) laws. Its papers signal that data is being shared on a large scale by private corporations for AML purposes in the absence of essential safeguards. The Spanish Data Protection Authority (AEPD) fined Open Bank €2.5 million; a key reason was that the bank requested sensitive AML verification documents through unencrypted emails and did not carry out the required Data Protection Impact Assessment (DPIA) for processing this data.
These failings undermined the security requirements under Article 32 of the GDPR, and requesting data via an unprotected channel does not comply with ‘data protection by design and by default’ under Article 25 of the GDPR. The DPDPA likewise reflects the principle of privacy by design, embedding privacy considerations at every stage of a system. Institutions are expected to provide a user-friendly mechanism for data principals to submit requests for access, opt-out and erasure; exercising these rights should be as simple as giving consent was in the first place.
FSPs are also required to comply with Section 70B(6) of the Information Technology Act, 2000, adhering to information security practices, procedures, prevention, response and reporting of cyber incidents for a safe and trusted internet. Institutions must report certain cybersecurity incidents involving data breaches and data leaks to CERT-In within six hours of noticing them, ensuring trusted digital banking for users.
IV. Responsible Data Practices
Beyond reducing the friction customers face in accessing their data, strong data governance is the backbone on which effective DSAR management is built. Without understanding what data is held, where in the system it is housed, whose rights are linked to it, and how it ought to be managed, responding to these requests on time demands substantial effort.
Without a comprehensive data inventory, organisations struggle to locate personal data, resulting in delayed responses to DSARs and potential non-compliance with privacy laws. Poor data governance also leads to siloed and inconsistent data, complicating management. This fragmentation increases operational risks and compliance costs, and compromises decision-making through reliance on inaccurate information. Institutions should map how data enters their systems, moves between them, is processed and is ultimately transferred. Data sprawl can be managed by categorising data and applying to each classification the appropriate levels of security, access control and retention policy.
For instance, personal data, including identifiers and financial records, must often be retained for specific periods based on regulations like GDPR, which allows retention only as long as necessary. Financial regulations typically require retention for 5 to 7 years for audit and tax compliance. Business records may also be kept for 5 to 7 years, while transaction logs are usually retained for 1 to 2 years for operational purposes. Public data can be maintained indefinitely unless it becomes obsolete. Each retention timeline should be documented in retention schedules, which need regular review and enforcement.
Many automated processes need to operate under Role-Based Access Control (RBAC), which ensures that employees are accountable for the quality, security and compliant use of data. In India, Significant Data Fiduciaries are required under Section 10(2)(a) of the DPDPA to appoint Data Protection Officers, who oversee whether the strategies followed are in line with the relevant regulations and act as the point of contact for data principals and the supervisory authority, as illustrated below.
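The illustration below is a deliberately simplified sketch of role-based access: the role names, permissions and hard-coded mapping are assumptions for the example only; a real institution would rely on its identity provider or a policy engine rather than a dictionary.

```python
# Illustrative role-to-permission map: marketing deliberately gets no access to
# KYC/AML data, reflecting purpose limitation.
ROLE_PERMISSIONS = {
    "dsar_analyst": {"read_kyc", "export_dsar_package"},
    "aml_officer":  {"read_kyc", "read_transactions"},
    "marketing":    set(),
}

def is_allowed(role: str, action: str) -> bool:
    """Allow an action only if the role has been explicitly granted it."""
    return action in ROLE_PERMISSIONS.get(role, set())

assert is_allowed("dsar_analyst", "read_kyc")
assert not is_allowed("marketing", "read_kyc")
```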
V. The Blend of Tech Efficiency and Human Judgment
In sectors where data volumes and complexity are high, financial institutions are deploying AI and ML. AI enhances data discovery by rummaging through datasets, including shadow data such as emails, scanned loan documents, and call transcripts. Financial institutions must ensure that this shadow data is processed and disposed of in a way that minimises risk.
IBM recommends that organisations adopt a data-first security approach by first conducting data inventory and classification to identify and prioritise sensitive information. Encrypting this data ensures it remains unreadable if accessed by unauthorised parties. To continuously monitor data locations, usage, and access patterns and detect risks such as shadow data or policy violations, Data Security Posture Management (DSPM) should be considered. Finally, enforcing policy-driven secure disposal ensures that outdated or unused data is properly destroyed, preventing unauthorised recovery and reducing compliance risks.
Further, IBM’s Cost of a Data Breach Report 2024 sheds light on how 35% of breaches involve shadow data, which is difficult to track and protect because it sits outside managed data stores, often for long periods. A subset of AI, Natural Language Processing (NLP), has marked a significant breakthrough here. NLP tools automatically classify documents, for instance distinguishing loan applications from help desk complaints, and, with the help of Named Entity Recognition, can correlate documents (e.g., linking an account number to a specific customer and transaction).
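As a rough illustration of how Named Entity Recognition can help correlate documents, the sketch below uses the open-source spaCy library (it assumes the small English model has been installed). The account-number pattern and the idea of keying findings by customer are assumptions for the example, not a prescribed method.

```python
import re
import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
ACCOUNT_RE = re.compile(r"\b\d{10,16}\b")   # hypothetical account-number format

def extract_identifiers(text: str) -> dict:
    """Pull person names (via NER) and account-number-like strings (via regex)
    so a document can be correlated with a customer record."""
    doc = nlp(text)
    return {
        "persons":  [ent.text for ent in doc.ents if ent.label_ == "PERSON"],
        "accounts": ACCOUNT_RE.findall(text),
    }

print(extract_identifiers("Complaint from Priya Sharma regarding account 123456789012."))
```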
ML models can identify patterns of personal identifiers, improving the identification of data subjects. Companies like Cognaize have reported 99.9% accuracy in AI-driven data extraction and redaction from complex reports. But given the possibility of algorithmic bias, and the dependence of such models on the quality of their input data, a human-in-the-loop approach is needed to ensure the quality and fairness of outputs, for example as sketched below.
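One simple way to operationalise a human-in-the-loop check, sketched below under assumed structures and a hypothetical threshold, is to auto-redact only high-confidence model findings and route everything else to a reviewer.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    document_id: str
    span: str          # text the model proposes to redact
    confidence: float  # model-reported confidence, 0.0 to 1.0

REVIEW_THRESHOLD = 0.90  # illustrative cut-off, not a recommended value

def route(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Split findings into those safe to auto-redact and those needing human review."""
    auto   = [f for f in findings if f.confidence >= REVIEW_THRESHOLD]
    review = [f for f in findings if f.confidence <  REVIEW_THRESHOLD]
    return auto, review
```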
VI. Conclusion
As the demand for DSARs increases and regulations become more stringent, FSPs need to build clear and scalable data governance systems. AI can surely help make compliance easier and lower costs, but it should be used carefully with proper accountability and human supervision.
To balance the demands of privacy and regulatory compliance, organisations need to proactively map data flows, enforce retention policies, deploy advanced discovery tools and maintain audit logs. While it remains to be seen how FIs will commit to safeguarding sensitive information in the coming digital age, they must reinforce customers’ trust in the evolving algorithmic environment.