AI Tables Extraction

The AI Tables Extraction model is tuned to recognise and extract only the tabular data in a document, and it can detect when a table spans more than one page. Scan2x presents each table as a table field within the metadata, so the number of metadata fields generated varies with the number of tables in the document.

Metadata field mapping can then be used to normalise the data. For example, a table returned by the model as Table1 can be mapped to a metadata field called “LineItems”. Table filters can also be implemented to discard spurious tables returned by the model. See the online handbook for more information.
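The idea of mapping and filtering can be illustrated with a short Python sketch. This is not Scan2x configuration or a Scan2x API: the function names `filter_tables` and `map_fields`, the dictionary layout, and the "minimum rows and columns" filter rule are all assumptions made purely for illustration. In Scan2x itself, field mapping and table filters are set up as described in the online handbook.

```python
from typing import Dict, List

# Hypothetical representation: each table is a list of rows of cell text.
Table = List[List[str]]


def filter_tables(tables: Dict[str, Table],
                  min_rows: int = 2,
                  min_cols: int = 2) -> Dict[str, Table]:
    """Drop tables that are too small to be real data (an assumed filter rule)."""
    return {
        name: rows
        for name, rows in tables.items()
        if len(rows) >= min_rows
        and max((len(row) for row in rows), default=0) >= min_cols
    }


def map_fields(tables: Dict[str, Table],
               mapping: Dict[str, str]) -> Dict[str, Table]:
    """Rename table fields using the mapping; unmapped names are kept as-is."""
    return {mapping.get(name, name): rows for name, rows in tables.items()}


# Example input: one genuine table and one spurious single-cell "table".
extracted = {
    "Table1": [["Item", "Qty", "Price"], ["Widget", "2", "9.50"]],
    "Table2": [["Page 1 of 3"]],
}

normalised = map_fields(filter_tables(extracted), {"Table1": "LineItems"})
print(sorted(normalised))
```

Here the single-cell table is discarded by the filter, and the remaining table is exposed under the normalised name “LineItems” regardless of the name the model assigned to it.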

Handwriting language support

Language

English

Japanese

Chinese Simplified

Korean

French

Portuguese

German

Spanish

Italian

Russian

Thai

Arabic

Printed language support

Language

Abaza

Abkhazian

Achinese

Acoli

Adangme

Adyghe

Afar

Afrikaans

Akan

Albanian

Algonquin

Angika (Devanagari)

Arabic

Asturian

Asu (Tanzania)

Avaric

Awadhi-Hindi (Devanagari)

Aymara

Azerbaijani (Latin)

Bafia

Bagheli

Bambara

Bashkir

Basque

Belarusian (Cyrillic)

Belarusian (Latin)

Bemba (Zambia)

Bena (Tanzania)

Bhojpuri-Hindi (Devanagari)

Bikol

Bini

Bislama

Bodo (Devanagari)

Bosnian (Latin)

Brajbha

Breton

Bulgarian

Bundeli

Buryat (Cyrillic)

Catalan

Cebuano

Chamling

Chamorro

Chechen

Chhattisgarhi (Devanagari)

Chiga

Chinese Simplified

Chinese Traditional

Choctaw

Chukot

Chuvash

Cornish

Corsican

Cree

Creek

Crimean Tatar (Latin)

Croatian

Crow

Czech

Danish

Dargwa

Dari

Dhimal (Devanagari)

Dogri (Devanagari)

Duala

Dungan

Dutch

Efik

English

Erzya (Cyrillic)

Estonian

Faroese

Fijian

Filipino

Finnish

Fon

French

Friulian

Ga

Gagauz (Latin)

Galician

Ganda

Gayo

German

Gilbertese

Gondi (Devanagari)

Greek

Greenlandic

Guarani

Gurung (Devanagari)

Gusii

Haitian Creole

Halbi (Devanagari)

Hani

Haryanvi

Hawaiian

Hebrew

Herero

Hiligaynon

Hindi

Hmong Daw (Latin)

Ho (Devanagari)

Hungarian

Iban

Icelandic

Igbo

Iloko

Inari Sami

Indonesian

Ingush

Interlingua

Inuktitut (Latin)

Irish

Italian

Japanese

Jaunsari (Devanagari)

Javanese

Jola-Fonyi

Kabardian

Kabuverdianu

Kachin (Latin)

Kalenjin

Kalmyk

Kangri (Devanagari)

Kanuri

Karachay-Balkar

Kara-Kalpak (Cyrillic)

Kara-Kalpak (Latin)

Kashubian

Kazakh (Cyrillic)

Kazakh (Latin)

Khakas

Khaling

Khasi

K'iche'

Kikuyu

Kildin Sami

Kinyarwanda

Komi

Kongo

Korean

Korku

Koryak

Kosraean

Kpelle

Kuanyama

Kumyk (Cyrillic)

Kurdish (Arabic)

Kurdish (Latin)

Kurukh (Devanagari)

Kyrgyz (Cyrillic)

Lak

Lakota

Latin

Latvian

Lezghian

Lingala

Lithuanian

Lower Sorbian

Lozi

Lule Sami

Luo (Kenya and Tanzania)

Luxembourgish

Luyia

Macedonian

Machame

Madurese

Mahasu Pahari (Devanagari)

Makhuwa-Meetto

Makonde

Malagasy

Malay (Latin)

Maltese

Malto (Devanagari)

Mandinka

Manx

Maori

Mapudungun

Marathi

Mari (Russia)

Masai

Mende (Sierra Leone)

Meru

Meta'

Minangkabau

Mohawk

Mongolian (Cyrillic)

Mongondow

Montenegrin (Cyrillic)

Montenegrin (Latin)

Morisyen

Mundang

Nahuatl

Navajo

Ndonga

Neapolitan

Nepali

Ngomba

Niuean

Nogay

North Ndebele

Northern Sami (Latin)

Norwegian

Nyanja

Nyankole

Nzima

Occitan

Ojibwa

Oromo

Ossetic

Pampanga

Pangasinan

Papiamento

Pashto

Pedi

Persian

Polish

Portuguese

Punjabi (Arabic)

Quechua

Ripuarian

Romanian

Romansh

Rundi

Russian

Rwa

Sadri (Devanagari)

Sakha

Samburu

Samoan (Latin)

Sango

Sangu (Gabon)

Sanskrit (Devanagari)

Santali (Devanagari)

Scots

Scottish Gaelic

Sena

Serbian (Cyrillic)

Serbian (Latin)

Shambala

Shona

Siksika

Sirmauri (Devanagari)

Skolt Sami

Slovak

Slovenian

Soga

Somali (Arabic)

Somali (Latin)

Songhai

South Ndebele

Southern Altai

Southern Sami

Southern Sotho

Spanish

Sundanese

Swahili (Latin)

Swati

Swedish

Tabassaran

Tachelhit

Tahitian

Taita

Tajik (Cyrillic)

Tamil

Tatar (Cyrillic)

Tatar (Latin)

Teso

Tetum

Thai

Thangmi

Tok Pisin

Tongan

Tsonga

Tswana

Turkish

Turkmen (Latin)

Tuvan

Udmurt

Uighur (Cyrillic)

Ukrainian

Upper Sorbian

Urdu

Uyghur (Arabic)

Uzbek (Arabic)

Uzbek (Cyrillic)

Uzbek (Latin)

Vietnamese

Volapük

Vunjo

Walser

Welsh

Western Frisian

Wolof

Xhosa

Yucatec Maya

Zapotec

Zarma

Zhuang

Zulu

Copyright © 2025 Avantech Software