Simple Vocabulary Creation Support Tool (SVD)

To integrate data across different organizations or fields, the terms and vocabularies used to describe the data must be consistent. Unifying data in this way normally requires not only specialist knowledge of data processing but also expertise in the target field, as well as time and effort. Our Simple Vocabulary Designer (SVD) is a controlled vocabulary creation support tool that combines a GUI with AI assistance, so that you can create controlled vocabularies more efficiently than with conventional methods and without specialist knowledge of data processing. Because SVD allows field experts to define terms on their own, it makes it easier to keep up with new terms and helps reduce obstacles to data integration.
SVD uses the Simple Knowledge Organization System (SKOS), a standard specification for describing controlled vocabularies and thesauri, as its input and output format, which facilitates integration with external data. We have also developed a new format, “SKOS Subset CSV (SKOS-SC),” which describes the required term lists more concisely than SKOS and can also be used for input and output.

SKOS Subset CSV (SKOS-SC) is a CSV-based file format that can describe synonymous relationships and hierarchical relationships between terms (or concepts) concisely.

Why the SKOS Subset CSV (SKOS-SC) format was developed

When data is exchanged between different companies or organizations, both parties need to use terms whose meanings they both understand. The machine-readable standard specification “Simple Knowledge Organization System (SKOS)” can be used to describe synonymous relationships and hierarchical relationships between terms.
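As a rough illustration of what SKOS expresses, the sketch below prints a single concept in Turtle (one of the RDF syntaxes) using the standard SKOS properties skos:prefLabel, skos:altLabel and skos:broader. The example.org URIs, the sample terms and the helper name concept_to_turtle are assumptions made for this illustration; they are not taken from SVD.

```python
# Standard SKOS namespace declaration for Turtle.
PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n"


def concept_to_turtle(uri, pref_label, alt_labels=(), broader=None, lang="en"):
    """Render one SKOS concept as a Turtle snippet (illustrative helper).

    Labels are inserted as-is, so this sketch assumes they contain no
    double quotes or backslashes.
    """
    lines = [f"<{uri}> a skos:Concept"]
    # The preferred label plays the role of the representative word.
    lines.append(f'    skos:prefLabel "{pref_label}"@{lang}')
    # Alternative labels are the synonyms.
    for alt in alt_labels:
        lines.append(f'    skos:altLabel "{alt}"@{lang}')
    # skos:broader points at the more general concept (the hypernym).
    if broader:
        lines.append(f"    skos:broader <{broader}>")
    return " ;\n".join(lines) + " .\n"


# Hypothetical URIs and terms, chosen only for the example.
turtle = PREFIX + concept_to_turtle(
    "http://example.org/vocab/maize",
    "maize",
    alt_labels=["corn"],
    broader="http://example.org/vocab/cereal",
)
print(turtle)
```

Even for one concept with one synonym, the author has to handle URIs, prefixes and punctuation rules, which is the kind of overhead a CSV-based format is meant to avoid.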

SKOS vocabularies are written in a format called “Resource Description Framework (RDF),” but creating vocabularies in RDF is not easy for users who have little experience in knowledge processing. To solve this issue, the National Institute of Informatics and Fujitsu Limited worked together to develop the SKOS-SC format. It allows you to describe synonymous relationships and hierarchical relationships between terms in CSV, a format that is widely used by the general public and is compatible with spreadsheet software.

Features of SKOS Subset CSV (SKOS-SC)

SKOS-SC has the following features:

  • Because its structure is built mainly around a label called “term name (用語名),” the user does not need to know RDF grammar or the conceptual system of vocabularies
  • Because data is written in the CSV format, the user can describe relationships between terms (or concepts) more intuitively
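To give a feel for how term relationships can be laid out as CSV rows, here is a minimal sketch that reads a small table and groups synonyms and broader terms. The column names term, preferred_term and broader_term, the sample rows and the function load_terms are all hypothetical placeholders; they are not the actual SKOS-SC column headers, which are defined by the SKOS-SC specification.

```python
import csv
import io

# Hypothetical layout: these column names are placeholders,
# NOT the real SKOS-SC headers.
SAMPLE_CSV = """term,preferred_term,broader_term
maize,maize,cereal
corn,maize,cereal
cereal,cereal,
"""


def load_terms(text):
    """Group rows into synonym sets and broader-term links."""
    synonyms = {}  # preferred term -> every term that shares its meaning
    broader = {}   # preferred term -> its broader (more general) term
    for row in csv.DictReader(io.StringIO(text)):
        preferred = row["preferred_term"]
        synonyms.setdefault(preferred, []).append(row["term"])
        # An empty cell means the term has no broader term.
        if row["broader_term"]:
            broader[preferred] = row["broader_term"]
    return synonyms, broader


synonyms, broader = load_terms(SAMPLE_CSV)
print(synonyms)
print(broader)
```

Because each row is keyed by a plain term name rather than a URI, a table like this can be edited directly in spreadsheet software.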

SVD extends SKOS-SC with additional data that expresses similarities between terms and other characteristics, so the user can define terms more intuitively.

Problems with multiple terms

The same thing, such as an object or an event, often has several names (terms). This makes language rich in expression, but it is also somewhat inconvenient. Creating a controlled vocabulary is an attempt to organize the relationships between terms that refer to the same thing in a way that everyone can easily understand.

Controlled vocabulary

Each term has different relationships with other terms.

  • Terms with the same meaning
    → Synonym
  • A representative word among terms with the same meaning
    → Representative word
  • A term whose meaning includes the meanings of other terms
    → Hypernym

A collection of terms with such relationships is called a controlled vocabulary.
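The three relationships above can be modelled with two lookup tables, as in the following minimal sketch. The class Vocabulary, its method names and the sample terms are assumptions made for this illustration and do not reflect how SVD stores vocabularies internally.

```python
from dataclasses import dataclass, field


@dataclass
class Vocabulary:
    """Toy controlled vocabulary (illustrative, not SVD's data model)."""

    # term -> representative word of its synonym group
    representative: dict = field(default_factory=dict)
    # representative word -> hypernym
    hypernym: dict = field(default_factory=dict)

    def add_synonyms(self, representative_word, synonyms, hypernym=None):
        """Register a synonym group and, optionally, its hypernym."""
        for term in {representative_word, *synonyms}:
            self.representative[term] = representative_word
        if hypernym is not None:
            self.hypernym[representative_word] = hypernym

    def normalize(self, term):
        """Return the representative word, or the term itself if unknown."""
        return self.representative.get(term, term)

    def ancestors(self, term):
        """List hypernyms from nearest to most general."""
        chain = []
        seen = set()  # guards against accidental cycles
        current = self.normalize(term)
        while current in self.hypernym and current not in seen:
            seen.add(current)
            current = self.normalize(self.hypernym[current])
            chain.append(current)
        return chain


vocab = Vocabulary()
vocab.add_synonyms("maize", ["corn"], hypernym="cereal")
vocab.add_synonyms("cereal", ["grain crop"], hypernym="crop")

print(vocab.normalize("corn"))   # maize
print(vocab.ancestors("corn"))   # ['cereal', 'crop']
```

Normalizing every incoming term to its representative word before comparison is what lets data from different sources line up on the same keys.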

About Simple Vocabulary Designer (SVD)

Creating a controlled vocabulary requires editing various kinds of information about each term, such as:

  • Synonym
  • Representative word
  • URI of the representative word
  • Hypernym

Organizing such a large number of relationships and pieces of information for every term is a daunting task. SVD was developed to support these tasks.

Intuitive operation is possible by visualizing relationships

Efficient work by utilizing AI

Vocabulary Creation Flow

(Based on achievements in the agricultural sector (WAGRI) by Professor Takeda, NII)

  • Document collection & extraction of candidate terms: approx. 1.5 months
  • Definition of synonymous/is-a relationships: approx. 1.5 months
  • Creation of a controlled vocabulary
  • Structuration (e.g. part-of, attribute-of, predicate logic): approx. 3 months

Docker and SVD

Docker is a software platform for building, testing and deploying applications quickly.
Docker packages software into standardized units called “containers.”
A container holds everything required to run the software, such as libraries, code and a runtime.
With Docker, application software can be run easily regardless of the computing environment.
Because SVD is packaged in a container, it can be started up promptly.

Download SVD

Quick Start Guide

SVD Help
