ISO/TC 37/SC 4 N 1769, ISO/PWI 24620-5 Language resource management -- Controlled human communication (CHC) -- Part 5: Lexico-morphosyntactic principles and methodology for personal data recognition and protection in texts (DataPro)

Scope

The proposed deliverable specifies the basic principles and methodology to be used to recognize personal data written in free text in different languages and countries.

The main applications are the protection of personal data circulating in national and international industries and in private and public organizations.

The principles of data protection should apply to any information concerning an identified or identifiable natural person. For the purposes of the proposed standard, an identifiable natural person is one who can be identified, directly or indirectly, by reference to an identifier such as a name, an identification number, a bank account number, location data such as a residence address, or an online identifier. Factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person are excluded from this standard.

The proposed deliverable, which is directed at processing by human beings and/or automated processing, is also applicable to different domains (for example, law, finance and health) but excludes automated image processing.

Purpose

The exchange of personal data between public and private actors, including natural persons, associations and undertakings, is continually increasing. Rapid technological developments and globalization have brought new challenges for the protection of personal data. The scale of the collection and sharing of personal data has increased significantly. Technology allows both private companies and public authorities to make use of personal data on an unprecedented scale in order to pursue their activities. Natural persons increasingly make personal information available publicly and globally. Technology has transformed both the economy and social life, and should further facilitate the free flow of personal data within, for example, the European Union, as well as the transfer to and between other countries and international organizations, whilst ensuring a high level of protection of personal data. These developments require a robust and more coherent data protection framework.

Effective protection of personal data requires the strengthening and setting out in detail of the rights of natural persons as data subjects, and the obligations of those who process and determine the processing of personal data, as exemplified by the European Union’s General Data Protection Regulation (GDPR): https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R0679&from=EN#d1e1374-1-1

The principles of data protection should apply to any information concerning an identified or identifiable natural person. An identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, a bank account number, location data such as a residence address, or an online identifier, or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

In this context, numerous industries, governmental bodies, and private and public companies or organizations need to hide (mask), remove, anonymize or pseudonymize personal data before texts containing such data are processed.
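
To make the difference between these operations concrete, the following is a minimal Python sketch that applies masking, removal and pseudonymization to a span that has already been located in a text. The function names, the `PERSON_` prefix, the salt value and the sample sentence are illustrative assumptions and are not taken from the proposal. A salted hash is only one simple way to derive a stable pseudonym and does not by itself amount to anonymization.

```python
import hashlib


def mask(text, start, end, fill="*"):
    """Hide the span by overwriting each character with a fill character."""
    return text[:start] + fill * (end - start) + text[end:]


def remove(text, start, end):
    """Delete the span from the text entirely."""
    return text[:start] + text[end:]


def pseudonymize(text, start, end, salt="demo-salt"):
    """Replace the span with a stable pseudonym derived from a salted hash.

    The same input value always yields the same pseudonym, so references
    to one person stay linked without exposing the original value.
    """
    value = text[start:end]
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return text[:start] + "PERSON_" + digest[:8] + text[end:]


# Hypothetical sample: the span of the name is assumed to be known already.
sentence = "Invoice sent to Jane Doe yesterday."
start = sentence.index("Jane Doe")
end = start + len("Jane Doe")

masked = mask(sentence, start, end)
removed = remove(sentence, start, end)
pseudonymized = pseudonymize(sentence, start, end)
```

Each function takes the span boundaries as input, so the sketch assumes the recognition step has already been carried out.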

The purpose of this proposed standard is to provide principles and a methodology to detect and identify personal data so that it can be hidden or suppressed, that is, protected, before a text containing such data is transmitted and/or processed. The difficulty lies less in suppressing or hiding the data than in recognizing personal data in a text. Personal data in structured data is comparatively easy to recognize; personal data in free text is not.
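
The two-step idea of recognizing first and protecting second can be sketched in Python as follows. This is an illustration under stated assumptions, not the methodology of the proposed standard: the two regular expressions (a simplified e-mail pattern and a simplified, unspaced IBAN-like pattern), the labels and the sample sentence are all hypothetical, and pattern matching of this kind only covers identifiers with a regular surface form.

```python
import re

# Simplified, illustrative patterns (assumptions, not normative definitions).
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def recognize(text):
    """Return (start, end, label) for every pattern match, ordered by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), label))
    return sorted(hits)


def protect(text):
    """Replace each recognized span with a bracketed label.

    Spans are replaced from right to left so that earlier offsets stay valid.
    """
    for start, end, label in reversed(recognize(text)):
        text = text[:start] + "[" + label + "]" + text[end:]
    return text


sample = "Contact Ana at ana.silva@example.org or pay to GB82WEST12345698765432."
protected = protect(sample)
print(protected)
```

In this sketch the given name "Ana" is left untouched, because names have no fixed surface pattern. That gap is an example of why recognition in free text, rather than the hiding step itself, is the hard part.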

The market consists of national and international micro, small, medium and large-sized enterprises, as well as private and public bodies, that process text which could contain personal data, in all domains and languages and in different countries (work is already underway with academics and enterprises in different domains, countries and languages).

Comment on proposal

Please email further comments to: debbie.stead@bsigroup.com
