Filtering out the "oops" factor

If you use e-mail, you know how easy it is to send a message to the wrong person, or even to dozens of wrong people at the same time. As more and more medical information is stored in electronic formats rather than traditional paper files, it can become just as easy for details about your health to be sent somewhere you might not want them to go.

Stan Matwin, a professor at the School of Information Technology and Engineering (SITE), is already looking at ways of heading off such violations of our privacy.

"I'm not even thinking about malicious behaviour, but rather simple human error," says Matwin. "It's really to monitor for mistakes that will happen in an exchange-intensive environment."

Matwin's research is being funded by a $150,000 grant from Communications and Information Technology Ontario (CITO). He is working closely with law professor Michael Geist and SITE assistant professor Nathalie Japkowicz.

They will also be working with AmikaNow! Corporation, an Ottawa-based firm specializing in e-mail risk management, which could help turn the outcome of this research into a commercial product.

For Matwin, the problem is similar to that of writing software to prevent spam from getting onto an e-mail server. In this case, though, the goal is to keep some messages from ever being sent.

"We're looking at examples and counter-examples of what is and what isn't a privacy-compliant document," he says. "We'll feed those into our systems and we'll get some type of a filter that we think an organization could deploy on its e-mail server to use as a monitoring tool."
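
To make the idea of learning a filter from examples and counter-examples concrete, here is a minimal sketch in Python. It is not the team's system: the article does not say which learning method is used, so a simple naive Bayes word model stands in for it, and the training messages, the labels "flag" and "ok", and the function names are all invented for illustration.

```python
import math
import re
from collections import Counter

LABELS = ("flag", "ok")  # hypothetical labels: "flag" = may breach privacy


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(examples):
    """Count words per label from (text, label) training pairs."""
    word_counts = {label: Counter() for label in LABELS}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set().union(*word_counts.values())
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the higher naive Bayes log-score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing so an unseen word/label pair is not fatal.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented examples (non-compliant) and counter-examples (harmless).
training = [
    ("Patient John Doe diagnosis diabetes lab results attached", "flag"),
    ("Attached are the patient blood test results and diagnosis", "flag"),
    ("Staff meeting moved to Thursday at noon", "ok"),
    ("Please review the budget report before the meeting", "ok"),
]
model = train(training)
print(classify(model, "diagnosis and lab results for the patient"))
print(classify(model, "budget meeting on Thursday"))
```

A filter deployed on a mail server would need far more training data and a way for a person to review flagged messages; the sketch only shows how labelled examples can turn into a decision rule without anyone writing that rule by hand.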

Matwin is getting those examples from Geist, one of the country's leading experts on privacy and information technology. Such expertise is essential to this work, since new legislation is significantly altering the legal status of a great deal of personal data.

At the beginning of the year, for example, the federal Personal Information Protection and Electronic Documents Act (PIPEDA) began to impose guidelines on how organizations could gather and use this kind of data. Now the Ontario government is debating Bill 31, which would specify further guidelines for the use of medical information.

Such changes have made the careful handling of electronic health files a leading priority for medical organizations of all sizes and specialities. Health authorities have welcomed electronic record-keeping, because it improves the portability and adaptability of medical files. These files can travel far more efficiently among the many professionals who often collaborate on a patient's case. Yet the growing volume of these electronic interactions presents its own set of challenges.

Matwin and Japkowicz specialize in machine learning, composing rules for computers to manage a stream of data such as e-mail messages. Expressing such rules is all the more difficult because information that does not appear to be private can nevertheless compromise privacy.

Matwin recalls a case in the U.S. where a database of health records was "anonymized" for research purposes by removing all identifying details except for date of birth and ZIP code. In spite of this step, however, a researcher was able to pick out individuals in the database by cross-referencing it with other databases containing local birth records and postal addresses.
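
The cross-referencing described here can be sketched in a few lines of Python. Everything in the sketch is invented for illustration: the records, the names, the field names and the `reidentify` function are assumptions, and the real case involved far larger databases.

```python
from collections import defaultdict

# Hypothetical "anonymized" health records: names removed,
# but date of birth and ZIP code kept.
health_records = [
    {"dob": "1961-03-14", "zip": "00001", "diagnosis": "asthma"},
    {"dob": "1975-11-02", "zip": "00002", "diagnosis": "diabetes"},
    {"dob": "1980-06-30", "zip": "00003", "diagnosis": "migraine"},
]

# Hypothetical public register listing names with the same two fields.
public_register = [
    {"name": "Person A", "dob": "1961-03-14", "zip": "00001"},
    {"name": "Person B", "dob": "1975-11-02", "zip": "00002"},
    {"name": "Person C", "dob": "1975-11-02", "zip": "00002"},
    {"name": "Person D", "dob": "1990-01-01", "zip": "00003"},
]


def reidentify(anonymized, register):
    """Link records to names when (dob, zip) matches exactly one person."""
    names_by_key = defaultdict(list)
    for person in register:
        names_by_key[(person["dob"], person["zip"])].append(person["name"])

    matches = []
    for record in anonymized:
        names = names_by_key.get((record["dob"], record["zip"]), [])
        if len(names) == 1:  # a unique match pins the record to one person
            matches.append((names[0], record["diagnosis"]))
    return matches


print(reidentify(health_records, public_register))
```

In this toy data only the first record is re-identified: the second matches two people and the third matches no one. The point is that two fields that look harmless on their own can act as a key once a second database is available.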

The technology that has made this kind of "data mining" possible is inherently neither good nor bad, Matwin argues. Nevertheless, he admits feeling a moral obligation to help mitigate any negative consequences.

"We got the genie out of the bottle by providing all these computers, making the collection of the data easy and connecting all these data sources so it's really easy to put things together," he says. "If we've created this environment, then maybe some of us can think of how to create technical means that will counter these possible abuses."