Toronto Public Library’s EY Park talks AI, Data & Ataccama

With an impressive circulation of over 12 million items in its collection and 100+ branches, Toronto Public Library (TPL) has earned its reputation as one of the largest libraries in North America.

EY Park, Head of Information Management and CRM at Toronto Public Library, shared her thoughts with us on libraries, TPL’s accomplishments and future, and how growing volumes of data will be managed as technology advances.

“What you would normally think of as a library, as an institution of knowledge, has changed a lot,” says Park. “Libraries in this day and age offer multiple items, services, and spaces beyond the traditional book and shelf.”

One of the biggest challenges libraries face today is pushing past what Park calls the notion of relevance. To remain relevant, says Park, the services offered by TPL need to make sense for the populations surrounding a given branch.

TPL proudly offers both tangible and digital literary and audio-visual items, in addition to in-house artifacts, printers, electronics, seminars, tutorials, and meeting/special event spaces. Moreover, the TPL team is actively pursuing innovation by looking at all of the library systems from a relevance perspective to ensure these systems remain a pillar of all communities, including those that need libraries as a public service.

The data challenge

With 100+ branches, the library sought to merge data from multiple municipalities and transaction systems, including resource-tracking data and the records of millions of library users from the greater Toronto area and around the world.

“Bringing together very different library systems, data management practices, data quality rules, business rules, and more into one is a textbook example of what can cause bad data quality,” said Park.

To meet its modernization goals and anticipate, address, and implement best practice data management, TPL undertook a two-tiered, comprehensive data quality management project with Ataccama.

  1. Use Ataccama ONE Data Quality Management to discover and resolve disparities through automated data cleansing. This phase is currently in deployment.
  2. Integrate TPL’s current library interface with Ataccama ONE to allow its AI and machine learning capabilities to keep information about the current data quality state readily available. This will increase the speed and quality of decision making for the organization.

“Ataccama’s DQ solution is and will be a much anticipated and needed solution to managing complex and longstanding data quality challenges at TPL,” states Park.

It is certainly not news that most library systems have data quality issues. A major challenge they face is a lack of awareness that data quality solutions are available and applicable to them. For TPL in particular, with high staff retention and team members who have been proud to work for the library for decades, adding personnel with a DQ industry background is a relatively new development.

TPL’s employees have inherited knowledge from their former colleagues and hold a great deal of tacit knowledge. However, Park explains that when left undocumented, tacit knowledge is likely to be lost over time. From a data management perspective, this is detrimental to the data lifecycle, and ultimately results in data corrosion. The lifecycle of data needs to be shifted. This is the ultimate goal of Park’s team: ensuring that this data is clean, accurate, and ready for further use.

Libraries, tech & AI—for the public

Contrary to popular belief, libraries have become quite technologically savvy. TPL’s strategic planning is keenly focused on advancements such as a data cloud service stack for an analytical data warehouse, and implementing a data quality workflow.

All the while, however, the library must ensure that its first priority remains the needs of the public, such as makerspaces and digital innovation hubs equipped with green rooms, 3D printers, high-powered Macs with graphic design software, and more. As Park confirms, “You would be surprised at how relevant the library really has become.”

Among recent technological advancements, Artificial Intelligence (AI) is a trending topic. Ataccama and others leverage AI to improve functionality and efficiency. Park considers applying AI in a public service context an interesting issue. Currently embarking on a Master’s in Information and Ethics, Park is no stranger to the future of AI. “I personally believe AI and its algorithms are required to bring efficiencies to repeatable and predictable processes, especially when it comes to information and data management,” states Park.

The future of Toronto Public Library will involve ongoing progressions to future-proof modernization practices that will set a precedent for other library systems. TPL will have access to data that is succinct, validated and trusted. The streamlined data will effectively eliminate duplications of records and allow for cohesive data management processes across all of TPL’s different library systems.

“Our data cleansing and mastering goals are simple—to enable effective reach and understanding of the library patrons,” says Park. With clean master data, the library’s operational and strategic planning will be significantly more effective and based on confidence in the quality of its data.

About Ataccama

Ataccama delivers augmented data management with Ataccama ONE. It’s a robust platform integrating Data Discovery & Profiling, Metadata Management & Data Catalog, Data Quality Management, Master & Reference Data Management, and Big Data Processing & Data Integration. Ataccama ONE gives you the option to start with what you need and seamlessly extend as your business requires. The first step is free—try our one-click data profiling trusted by 55,000 users globally at
