Prepare Your Data For Conversational AI With Schema.Org

By Jürgen Umbrich | Published On: August 19th, 2020 | Categories: conversational ai, knowledge graphs

When users communicate with chatbots or voice assistants in natural-language dialogues, advanced artificial intelligence techniques are at work in the background. This is the domain of Conversational AI, an increasingly important discipline.

Conversational AI relies on complete, well-structured data that is (mostly) modeled in the form of knowledge graphs. If your organization wants to implement a chatbot or voice assistant, the following question arises: How should existing data (e.g. website content, internal knowledge bases, customer service documentation) be structured and prepared so that Conversational AI can use it easily and quickly?

Probably the best framework for this is the set of definitions and schemas provided by schema.org.

What is schema.org and what is the goal of the initiative?

Schema.org is currently the de facto standard for annotating HTML files with semantic information. It was launched in 2011 as a joint initiative of the leading search engines Google, Bing, Yahoo and Yandex (the leading search engine in Russia).

At that time, the goal was to create a common standard for annotating HTML files with information that makes them easier for machines, and search engines in particular, to read.
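To make the idea of annotation more concrete, here is a minimal sketch in Python. It builds a schema.org description of an organization and wraps it in the script tag commonly used to embed JSON-LD in an HTML page. The company name, URL and phone number are invented for illustration, and `to_jsonld_script` is a hypothetical helper, not part of any library.

```python
import json

# Hypothetical company details, used purely for illustration.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Company",
    "url": "https://www.example.com",
    "telephone": "+43 000 000000",
}


def to_jsonld_script(data: dict) -> str:
    """Wrap a schema.org description in the script tag used for JSON-LD."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


snippet = to_jsonld_script(organization)
print(snippet)
```

The resulting snippet can be placed in the HTML of a page, where a crawler that understands schema.org can read the name, URL and phone number without having to guess them from the visible text.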

Above all, it is Google's leading role that has prompted a great number of companies to structure their data in accordance with the requirements of schema.org.

Companies choose to structure data according to schema.org because they want their pages to rank better on Google. The future of Conversational AI and voice search hardly figures in this calculation today, but with schema.org you are starting off on the right foot.

The earlier companies use schema.org, the better!

We are convinced of that. Why? Because the high level of acceptance and widespread use of schema.org’s guidelines on today’s web, together with Google’s dominance as a search engine and with its voice assistant, the Google Assistant, point in this direction.

Google indexes structured web content according to schema.org in order to optimize the answering of search queries for users. At the same time, Google also feeds its own Knowledge Graph, which is the basis for answering voice queries using the Google Assistant.

The growing shift from text-based to voice-based search (Voice Search) also increases the importance of Conversational AI. Conversational AI, however, requires clearly structured data in order to give relevant natural-language answers or to perform the desired actions.

Companies that want to be found in the upcoming Voice First era must provide their data and content in a structured form. Following the guidelines of schema.org is the optimal preparation for this.

Use existing structured data for Conversational AI

Many companies have already prepared their web content at least partially in accordance with schema.org. One big advantage is that this already enables better Google rankings. It also pays off in the context of Conversational AI, because this data simplifies the development of chatbots and voice assistants.

In cooperation with a specialized expert such as Onlim, the data can then be modeled in a Knowledge Graph, usually with simple additions or enrichments, which lays the basis for assistants powered by Conversational AI.

Structuring according to schema.org brings multiple benefits

Some companies have not yet structured their web content. In this case, the implementation of schema.org should be started immediately because, as we always advise our customers, you can kill two (or more) birds with one stone.

First, you optimize your website for search engines like Google and Bing; structuring according to schema.org already makes sense as a pure SEO measure. Second, including this data in a knowledge graph in the next step lets you develop the full potential of Conversational AI.

By the way – if you would like to learn more about the role of knowledge graphs for Conversational AI, we invite you to check out our whitepaper.

How Onlim’s Conversational AI builds on schema.org

Here at Onlim, we also decided to use schema.org as a framework for structuring data. We did so for a number of reasons, always considering which decision would benefit our customers most in the long term.

We can immediately access or use the abundance of existing data structured according to schema.org on the web. Depending on the application, we expand and complement the already structured data for our customers.

We collect structured data, then aggregate, connect and enrich it. We then use it to create the training data for Conversational AI in the form of a Knowledge Graph.

This can be illustrated briefly using the example of a hotel: If a user asks “Is this hotel open?”, the intent of asking about the hotel’s opening times is recognized. A chatbot or voice assistant can then look up the opening hours for the current day and offer them as a response.
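The hotel example could look roughly like the following Python sketch. It assumes the intent has already been recognized upstream and only shows how an answer can be derived from data structured with the schema.org types `Hotel` and `OpeningHoursSpecification`. The hotel name, the hours and the helper functions `hours_for` and `answer_is_open` are invented for illustration, and the sketch assumes `dayOfWeek` is given as a list of plain day names.

```python
from datetime import datetime, time

# Hypothetical hotel record; the name and hours are invented for illustration.
hotel = {
    "@context": "https://schema.org",
    "@type": "Hotel",
    "name": "Example Hotel",
    "openingHoursSpecification": [
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
            "opens": "07:00",
            "closes": "22:00",
        },
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": ["Saturday", "Sunday"],
            "opens": "08:00",
            "closes": "20:00",
        },
    ],
}

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def hours_for(place: dict, when: datetime):
    """Return (opens, closes) for the weekday of `when`, or None if no hours are listed."""
    day = DAY_NAMES[when.weekday()]  # datetime.weekday(): Monday is 0
    for spec in place.get("openingHoursSpecification", []):
        if day in spec["dayOfWeek"]:
            return time.fromisoformat(spec["opens"]), time.fromisoformat(spec["closes"])
    return None


def answer_is_open(place: dict, when: datetime) -> str:
    """Answer the 'is this place open?' intent from the structured data."""
    hours = hours_for(place, when)
    if hours is None:
        return f"{place['name']} is closed today."
    opens, closes = hours
    state = "open" if opens <= when.time() < closes else "closed"
    return (
        f"{place['name']} is {state} right now. "
        f"Today's hours are {opens:%H:%M} to {closes:%H:%M}."
    )


# August 19th, 2020 was a Wednesday.
print(answer_is_open(hotel, datetime(2020, 8, 19, 9, 30)))
```

Because the opening hours are stored in a standard structure, the same lookup works for any business described this way, not only for this one hotel.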

Further advantages of using schema.org

At the same time, we can offer our customers the option of publishing structured data or making it available through our Knowledge Graph. Customers can integrate this data into their website and thus improve their search engine rankings.

In addition, companies can easily integrate external data into their own ecosystem on a schema.org basis. For example, event organizers can display the discography of a performing artist with just a few clicks. In return, it also becomes easier to get their own products integrated on the websites of other companies.
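The discography example can be sketched in Python as follows. The sketch assumes that both the organizer's event and the external artist record use schema.org types (`MusicEvent`, `MusicGroup`, `MusicAlbum`) and share an `@id` for the artist. All names, identifiers and the helper functions `enrich_performer` and `discography` are hypothetical.

```python
# Hypothetical data: an organizer's own event record and an external
# artist record, both described with schema.org types.
event = {
    "@context": "https://schema.org",
    "@type": "MusicEvent",
    "name": "Example Summer Concert",
    "performer": {
        "@type": "MusicGroup",
        "@id": "https://example.org/artists/example-band",
    },
}

external_artists = [
    {
        "@type": "MusicGroup",
        "@id": "https://example.org/artists/example-band",
        "name": "Example Band",
        "album": [
            {"@type": "MusicAlbum", "name": "First Album"},
            {"@type": "MusicAlbum", "name": "Second Album"},
        ],
    }
]


def enrich_performer(event: dict, artists: list) -> dict:
    """Return a copy of the event whose performer is merged with external data."""
    by_id = {artist["@id"]: artist for artist in artists}
    performer = event["performer"]
    # External fields are added; the organizer's own fields win on conflict.
    merged = {**by_id.get(performer["@id"], {}), **performer}
    return {**event, "performer": merged}


def discography(event: dict) -> list:
    """List the album names of the event's performer, if any are known."""
    return [album["name"] for album in event["performer"].get("album", [])]


enriched = enrich_performer(event, external_artists)
print(discography(enriched))  # ['First Album', 'Second Album']
```

The merge is only this simple because both sides use the same vocabulary and identifiers. With two proprietary formats, a mapping layer would have to be written first.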

We are convinced that Knowledge Graphs are the best way to model data for Conversational AI. Schema.org is practically an industry consensus, and where necessary we extend the given schema.

Alternative approaches are possible but have limitations

Other providers are pursuing their own approaches for their Conversational AI that are not based on Knowledge Graphs. And while they might still be successful, these approaches are associated with massive limitations.

Of course, data that is not structured according to schema.org can also be read and used by search engines, and consequently by Conversational AI, but only partially. The real problem with individual solutions becomes visible, at the latest, when external data sources have to be integrated.

For example, new queries or actions may have to be added in the course of expanding a chatbot. In the absence of common standards and definitions, painful adjustments and transfers may then become necessary.

Alternative approaches can be useful for special cases

We see individual solutions or schemas as reasonable wherever, due to the specifics of the application, a customized structure proves to be a better data foundation than the generic structure of schema.org.

In all other cases, however, we always advise companies to rely on schema.org when annotating web content with semantic information. This is partly for the reasons outlined above, and partly because the schema.org guidelines will become even more important in the future of voice search.

The common future of Conversational AI and schema.org

41% of owners of devices with voice controls say they already use them weekly. Voice Search is the future of search. We therefore expect schema.org to expand its classes and vocabulary specifically for voice search in the future.

Descriptions for actions based on voice commands, such as searches, answers, scheduling and orders, could follow. API queries, such as the shipping status of packages at delivery services, could also soon be annotated. Annotations for answers could help make answers more meaningful in the future.

A schema for multilingualism would be useful to ensure the correct pronunciation by a voice assistant. For example, it could express: “This word is pronounced the same in English and German, but it is to be pronounced differently in French.”

Whatever the future of voice search brings, with Onlim you have a competent partner for the optimization of your data structure for Conversational AI. To discuss more, feel free to schedule a call here.
