Hackathon 2023: Gen AI for Data Management

The world of Gen AI is upon us. Across most industries, decision-makers are tasking transformation teams with bringing Gen AI functionalities to their organizations. Businesses are beginning to implement customer service assistants, task managers, language processing engines, and much more at an exciting rate. At the same time, companies are investing in high-quality data to ensure the success of their AI projects.

Pioneering AI for data management

Here at Ataccama, we've been leading the charge toward fully automated data management for years. Our platform Ataccama ONE was designed with AI at the core even in its first release in 2017. In 2020, we launched the second generation of Ataccama ONE, enabling unparalleled automation of monotonous manual work and pushing the rest of the market to deliver a “self-driving” experience for users. 

We have always understood the potential of automation to accelerate innovation, ensure security, and eliminate tedious tasks. That's why we prioritized it in our product development, delivering solutions that let our customers leverage AI for increased business value.

Our AI journey began with traditional intelligent algorithms, progressed to incorporating nuanced machine learning capabilities for data quality using classification and regression techniques, and eventually evolved into fully autonomous features like domain detection or anomaly detection. Today, the journey continues with natural language interfaces powered by advanced Large Language Models.

Roman Kucera

With the recent commercialization of LLMs, software companies are rethinking how generative AI can enhance the user experience. Our teams of data scientists, AI developers, and software engineers wanted to push the boundaries of their imagination and unlock the full potential of this technology. That's why we introduced the AI Hackathon.

Tackling new technologies through crowdsourcing

From June 9 to 11, 2023, Ataccama hosted an internal hackathon. During three days of uninterrupted work, we tasked our team members and colleagues with inventing innovative solutions for our products and processes using the latest generative AI technologies.

We provided each team with the Azure OpenAI service and a crash course on prompt engineering and the usage of LangChain. The only guidance was to align their inventions/hacks with Ataccama's core values: transparency, unconventional thinking, teamwork, industry value, and challenging fun.

Ten teams joined the race, filled with a diverse array of Ataccamers, from our most technical employees to people who had never coded before. To our delight, each team brought a unique project with exciting capabilities and plenty of real-world applications. Let's look at some of the most exciting projects, learn from the teams' experiences, and see how we can benefit from their success.

The Winners: A chat-like AI assistant for Ataccama ONE

The winning team developed a generative AI chat assistant for the Ataccama ONE platform. The assistant would be available on every screen of the data catalog, and users could automate several platform tasks by instructing it in plain text. The team named the assistant "Atamate" and demonstrated several of its capabilities, including:

  • Summarizing all the information available about a selected data asset.
  • Enriching various metadata (like table descriptions) based on other contexts.
  • Suggesting new data quality rules.
  • Finding similar datasets in the catalog.
  • Running DQ or profiling jobs on a selected table.
  • Running functions in bulk on multiple tables at the same time.
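
One way an assistant like this could turn a plain-text request into platform actions is to let the model pick from a small registry of named tools. The sketch below illustrates that pattern only. The tool names, the `fake_model` stand-in, and the JSON reply shape are hypothetical; none of them are Ataccama ONE APIs or the team's actual code.

```python
import json
from typing import Callable, Dict, List

# Hypothetical tool registry: names and behavior are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {}


def tool(name: str):
    """Register a function as an action the assistant may call."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return register


@tool("summarize_asset")
def summarize_asset(table: str) -> str:
    return f"summary of {table}"


@tool("run_profiling")
def run_profiling(table: str) -> str:
    return f"profiling started for {table}"


def fake_model(request: str) -> str:
    """Stand-in for an LLM call that replies with a JSON tool choice.

    A real model would be prompted with the tool list and asked to answer
    in this shape; here a keyword check keeps the example deterministic.
    """
    action = "run_profiling" if "profil" in request.lower() else "summarize_asset"
    # Treat words such as "schema.table" as table references.
    tables = [word for word in request.replace(",", " ").split() if "." in word]
    return json.dumps({"tool": action, "tables": tables})


def handle(request: str, model: Callable[[str], str] = fake_model) -> List[str]:
    """Ask the model which tool to use, then run it on every table (bulk)."""
    choice = json.loads(model(request))
    fn = TOOLS.get(choice["tool"])
    if fn is None:
        # Never execute an action that was not explicitly registered.
        raise ValueError(f"unknown tool: {choice['tool']}")
    return [fn(table) for table in choice["tables"]]


print(handle("Run profiling on sales.orders, sales.customers"))
```

Rejecting any tool name that is not in the registry keeps the model's output from triggering arbitrary actions, which matters once the assistant can run jobs in bulk.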

Challenges and Discoveries: They wanted to create a tool for Ataccama ONE that felt like it was helping you manage your data, not just answering questions about it. Some challenges they ran into along the way included:

  • Trouble testing the AI models. As data cataloging is a complex problem, it's challenging to get relevant data of a high enough quality to validate these ad-hoc prototypes effectively. Also, these AI models, by nature, have a different output every time, making automated testing more difficult in such a tight time frame.
  • Trouble with tooling. Generative AI companies have created a lot of excellent tooling features that allow you to develop projects with LLMs much faster. However, a lot of this tooling is in the very early stages of development, making it harder to work with. The LangChain library (their chosen AI tool) could have been more usable for real-life scenarios. The team basically had to re-implement it.

However, once they thoroughly worked out the features, they found that these models were less of a "black box" than other AIs because you can actually ask the tool what it thinks and how it came to a decision.

What it means for Ataccama

An Ataccama Virtual Assistant (AVA) is coming in V15 of the ONE platform, and we're using much of this team's code/work as a reference. A virtual assistant, powered by AI, allows users to take advantage of the powerful capabilities of Ataccama ONE without the need to become product experts. 

Users will have access to the most advanced capabilities regardless of their level of expertise. 

We've always strived to make our tools as self-service as possible to lower the barrier between technical and non-technical users. A chat assistant will be the next great leap in that direction, providing a secure space where a user can prompt the assistant with plain messages that are easy to write and interpret, and speed up repetitive tasks (in bulk) with no concern about what happens in between. 

Runners-up and interesting projects

The winning team wasn't the only one with interesting discoveries. Here are two other fabulous projects and lessons learned, and a sneak peek into how they’ve been incorporated into our product roadmap.

Writing data quality rules in plain English

Typically, you write rules for Ataccama ONE using our Ataccama expression language. One of the teams in our Hackathon decided to make this process less technical by creating a converter from plain English to the Ataccama expression language.

Named the "ExpreBot," the tool takes a rule that the user describes in a prompt and converts it into an Ata expression. It works so well that it can even find the right data column or reference data to ensure your rule applies to the appropriate sets. The bot can also validate the syntax of new rules, determining whether each one is a valid Ata expression or not. If the bot finds an error during validation, it can use the error description to correct itself. This feature addresses one of the primary limitations of language models: they can occasionally hallucinate. Finally, ExpreBot also explains your rule in plain English so that you can validate it yourself.
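
The validate-and-correct behavior described above can be sketched as a short retry loop: generate a candidate, check it, and feed any error message back into the next attempt. This is a minimal sketch under stated assumptions, not ExpreBot's implementation. The `validate` function is a toy check for balanced parentheses standing in for a real syntax validator, `fake_generate` stands in for the LLM call, and the expression strings are placeholders rather than real Ata expression syntax.

```python
from typing import Callable, Optional, Tuple


def validate(expression: str) -> Optional[str]:
    """Toy stand-in for a syntax checker: return an error message or None.

    It only checks that parentheses are balanced; the real expression
    grammar is not modelled here.
    """
    depth = 0
    for pos, ch in enumerate(expression):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return f"unexpected ')' at position {pos}"
    if depth > 0:
        return f"{depth} unclosed '('"
    return None


def generate_with_repair(
    rule_text: str,
    generate: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Generate an expression, retrying with validator feedback on failure.

    Returns the first valid candidate and the number of attempts it took.
    """
    feedback: Optional[str] = None
    error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(rule_text, feedback)
        error = validate(candidate)
        if error is None:
            return candidate, attempt
        # Pass the failed candidate and the error text to the next attempt.
        feedback = f"Previous attempt {candidate!r} failed validation: {error}"
    raise ValueError(f"no valid expression after {max_attempts} attempts: {error}")


def fake_generate(rule_text: str, feedback: Optional[str]) -> str:
    """Hypothetical generator: wrong on the first try, fixed once given feedback."""
    # Placeholder strings, not real expression-language syntax.
    return "length(trim(name)) > 0" if feedback else "length(trim(name) > 0"


print(generate_with_repair("name must not be empty", fake_generate))
```

Capping the number of attempts keeps a model that never produces a valid expression from looping indefinitely, and raising at the end makes the failure visible to the caller.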

What it means for Ataccama 

This is a great example of how using code generation, powered by large language models, will enhance the usability of business applications in the future. In our case, this capability will simplify the rule-creation process for all users of Ataccama ONE.

The use of code generation within business applications will enable users to provide instructions in natural language and have the application generate the correct inputs. This allows the user to care about logic, not syntax. In the future, we expect to combine this capability with Atamate's AI-powered rule suggestion capabilities. This will lead to a world where data management platforms provide high degrees of end-to-end automation.

AI answers questions about Ataccama products

Like many companies, we see the potential of generative AI to improve customer service. That's what our third team had in mind when they developed a chatbot that can answer questions about Ataccama products based on our product documentation.

They aimed to create a bot that was fully integrated into the Ataccama ONE platform with a natural language interface for user documentation. They tested it by feeding it a series of prompts with varying degrees of difficulty and then validated the answers to ensure their accuracy.

By the conclusion of their project, the bot could answer a wide array of questions, ranging from "How do I add a new DQ dimension?" to "Are there any changes regarding Power BI support in V14.3?", drawing on the entirety of the support materials we have for all the latest versions of the platform.
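
The article does not say how the bot locates the relevant documentation. One common pattern for this kind of assistant is to retrieve the best-matching passages first and then ask the model to answer from them. The sketch below shows only that retrieval step, using simple word overlap, and is an assumption rather than the team's method. The passage IDs and texts are invented placeholders, not real product documentation.

```python
import re
from typing import Dict, List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercase a text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(
    question: str, passages: Dict[str, str], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Rank passages by Jaccard word overlap with the question."""
    q = tokens(question)
    scored = []
    for pid, text in passages.items():
        p = tokens(text)
        union = q | p
        score = len(q & p) / len(union) if union else 0.0
        scored.append((pid, score))
    # Highest score first; ties broken by passage ID for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [hit for hit in scored if hit[1] > 0][:top_k]


def build_prompt(
    question: str, passages: Dict[str, str], hits: List[Tuple[str, float]]
) -> str:
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"[{pid}] {passages[pid]}" for pid, _ in hits)
    return f"Answer using only the passages below.\n{context}\nQuestion: {question}"


# Invented placeholder passages, used only to exercise the functions.
DOCS = {
    "dq-dimensions": "placeholder passage about adding a new DQ dimension",
    "power-bi": "placeholder passage about Power BI support changes",
    "install": "placeholder passage about installation",
}

question = "How do I add a new DQ dimension?"
hits = retrieve(question, DOCS)
print(build_prompt(question, DOCS, hits))
```

A production system would more likely rank passages with embeddings than with word overlap, but the overall shape stays the same: retrieve, build a prompt from the retrieved text, then generate the answer.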

What it means for Ataccama

This bot (and others like it) makes information access easier by leveraging natural language queries anyone can write. This means less digging through technical documentation, asking additional questions, becoming frustrated with FAQs, and having to connect long pieces of information from different sources.

We're in the final stages of productizing this capability, which means Ataccama customers will get faster answers to questions about the product and can spend more time delivering value to their business.

The future of Gen AI for ONE

This hackathon was just a kickoff as Ataccama moves forward and continues to explore the world of generative AI. There are many more exciting things to come for us as a company and for the data management discipline as a whole.

We're taking a two-pronged approach to Gen AI. We are in the midst of implementing several features (similar to the projects mentioned above), all of which increase automation and make everyone's work life easier with AI co-pilots and assistants. Some of these features in development will be available in our upcoming major release.

However, these short-term improvements are just the beginning. Gen AI tech will turn the data management industry on its head, and we’re taking the driver's seat to innovate our products as the technologies evolve. Our teams of AI engineers are heavily engaged in the AI technical community and are working on some groundbreaking ideas to transform the lives of our users. More is coming. Stay tuned!

You don't have to take our word for it. Join us for a fireside chat to hear from groundbreakers in the industry – OpenAI and (our client) T-Mobile – as they chat about how Gen AI is changing the data management landscape. 
