Get Our Extension

Distributional–relational database

From Wikipedia, in a visual modern way

A distributional–relational database, or word-vector database, is a database management system (DBMS) that uses distributional word-vector representations to enrich the semantics of structured data.

As distributional word-vectors can be built automatically from large-scale corpora,[1] this enrichment supports the construction of databases which can embed large-scale commonsense background knowledge into their operations. Distributional-Relational models can be applied to the construction of schema-agnostic databases (databases in which users can query the data without being aware of its schema), semantic search, schema-integration and inductive and abductive reasoning as well as different applications in which a semantically flexible knowledge representation model is needed. The main advantage of distributional–relational models over purely logical / semantic web models is the fact that the core semantic associations can be automatically captured from corpora in contrast to the definition of manually curated ontologies and rule knowledge bases.[2]

Discover more about Distributional–relational database related topics

Word embedding

Word embedding

In natural language processing (NLP), word embedding is a term used for the representation of words for text analysis, typically in the form of a real-valued vector that encodes the meaning of the word such that the words that are closer in the vector space are expected to be similar in meaning. Word embeddings can be obtained using a set of language modeling and feature learning techniques where words or phrases from the vocabulary are mapped to vectors of real numbers.

Data model

Data model

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to the properties of real-world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.

Text corpus

Text corpus

In linguistics, a corpus or text corpus is a language resource consisting of a large and structured set of texts. In corpus linguistics, they are used to do statistical analysis and hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory.

Schema-agnostic databases

Schema-agnostic databases

Schema-agnostic databases or vocabulary-independent databases aim at supporting users to be abstracted from the representation of the data, supporting the automatic semantic matching between queries and databases. Schema-agnosticism is the property of a database of mapping a query issued with the user terminology and structure, automatically mapping it to the dataset vocabulary.

Database schema

Database schema

The database schema is the structure of a database described in a formal language supported by the database management system (DBMS). The term "schema" refers to the organization of data as a blueprint of how the database is constructed. The formal definition of a database schema is a set of formulas (sentences) called integrity constraints imposed on a database. These integrity constraints ensure compatibility between parts of the schema. All constraints are expressible in the same language. A database can be considered a structure in realization of the database language. The states of a created conceptual schema are transformed into an explicit mapping, the database schema. This describes how real-world entities are modeled in the database.

Semantic search

Semantic search

Semantic search denotes search with meaning, as distinguished from lexical search where the search engine looks for literal matches of the query words or variants of them, without understanding the overall meaning of the query. Semantic search seeks to improve search accuracy by understanding the searcher's intent and the contextual meaning of terms as they appear in the searchable dataspace, whether on the Web or within a closed system, to generate more relevant results. Content that ranks well in semantic search is well-written in a natural voice, focuses on the user's intent, and considers related topics that the user may look for in the future.

Inductive reasoning

Inductive reasoning

Inductive reasoning is a method of reasoning in which a general principle is derived from a body of observations. It consists of making broad generalizations based on specific observations. Inductive reasoning is distinct from deductive reasoning. If the premises are correct, the conclusion of a deductive argument is certain; in contrast, the truth of the conclusion of an inductive argument is probable, based upon the evidence given.

Abductive reasoning

Abductive reasoning

Abductive reasoning is a form of logical inference formulated and advanced by American philosopher Charles Sanders Peirce beginning in the last third of the 19th century. It starts with an observation or set of observations and then seeks the simplest and most likely conclusion from the observations. This process, unlike deductive reasoning, yields a plausible conclusion but does not positively verify it. Abductive conclusions are thus qualified as having a remnant of uncertainty or doubt, which is expressed in retreat terms such as "best available" or "most likely". One can understand abductive reasoning as inference to the best explanation, although not all usages of the terms abduction and inference to the best explanation are exactly equivalent.

Semantic Web

Semantic Web

The Semantic Web, sometimes known as Web 3.0, is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable.

Ontology

Ontology

In metaphysics, ontology is the philosophical study of being, as well as related concepts such as existence, becoming, and reality.

Distributional–relational models

Distributional–relational models were first formalized [3][4] as a mechanism to cope with the vocabulary/semantic gap between users and the schema behind the data. In this scenario, distributional semantic relatedness measures, combined with semantic pivoting heuristics can support the approximation between user queries (expressed in their own vocabulary) and data (expressed in the vocabulary of the designer).

In this model, the database symbols (entities and relations) are embedded into a distributional semantic space and have a geometric interpretation under a latent or explicit semantic space. The geometric aspect supports the semantic approximation between entities from different databases or between a query term and a database entity. The distributional relational model then becomes a double layered model where the semantics of the structured data provides the fine-grained semantics intended by the database designer, which is extended by the distributional semantic model which contains the semantic associations expressed at a broader use. These models support the generalization from a closed communication scenario (in which database designers and users live in the same context, e.g. the same organization) to an open communication scenario (e.g. different organizations, the Web), creating an abstraction layer between users and the specific representation of the conceptual model.

Discover more about Distributional–relational models related topics

Distributional semantics

Distributional semantics

Distributional semantics is a research area that develops and studies theories and methods for quantifying and categorizing semantic similarities between linguistic items based on their distributional properties in large samples of language data. The basic idea of distributional semantics can be summed up in the so-called Distributional hypothesis: linguistic items with similar distributions have similar meanings.

Heuristic

Heuristic

A heuristic, or heuristic technique, is an approach to problem solving or self-discovery using 'a calculated guess' derived from previous experiences. Heuristics are mental shortcuts that ease the cognitive load of making a decision. Usually the opposite process to heuristics is the application of algorithms. Algorithms involve calculated answers and guesswork is eliminated.

Data

Data

In the pursuit of knowledge, data are a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted. A datum is an individual value in a collection of data. Data are usually organized into structures such as tables that provide additional context and meaning, and which may themselves be used as data in larger structures. Data may be used as variables in a computational process. Data may represent abstract ideas or concrete measurements. Data are commonly used in scientific research, economics, and in virtually every other form of human organizational activity. Examples of data sets include price indices, unemployment rates, literacy rates, and census data. In this context, data represents the raw facts and figures which can be used in such a manner in order to capture the useful information out of it.

Semantic space

Semantic space

Semantic spaces in the natural language domain aim to create representations of natural language that are capable of capturing meaning. The original motivation for semantic spaces stems from two core challenges of natural language: Vocabulary mismatch and ambiguity of natural language.

Geometry

Geometry

Geometry is, with arithmetic, one of the oldest branches of mathematics. It is concerned with properties of space such as the distance, shape, size, and relative position of figures. A mathematician who works in the field of geometry is called a geometer.

Source: "Distributional–relational database", Wikipedia, Wikimedia Foundation, (2022, November 25th), https://en.wikipedia.org/wiki/Distributional–relational_database.

Enjoying Wikiz?

Enjoying Wikiz?

Get our FREE extension now!

References
  1. ^ Harris, Z. (1954). "Distributional structure". Word. 10 (23): 146–162.
  2. ^ Métais, Elisabeth; Roche, Mathieu; Teisseire, Maguelonne (2014-06-16). Natural Language Processing and Information Systems: 19th International Conference on Applications of Natural Language to Information Systems, NLDB 2014, Montpellier, France, June 18-20, 2014. Proceedings. Springer. ISBN 978-3-319-07983-7.
  3. ^ Freitas, A. “Schema-agnostic queries over large-schema databases: a distributional semantics approach” PhD Thesis, 2015
  4. ^ Freitas, A., Handschuh, S., Curry, E., Distributional-Relational Models: Scalable Semantics for Databases, AAAI Spring Symposium, Knowledge Representation & Reasoning Track, Stanford, 2014

The content of this page is based on the Wikipedia article written by contributors..
The text is available under the Creative Commons Attribution-ShareAlike Licence & the media files are available under their respective licenses; additional terms may apply.
By using this site, you agree to the Terms of Use & Privacy Policy.
Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization & is not affiliated to WikiZ.com.