AI Tables Extraction
The AI Tables Extraction model is tuned to recognise and extract only the data in tabular form on a document and is able to understand when a table spans more than one page. Each table will be presented by Scan2x as a table field within metadata. The number of metadata fields generated by Scan2x will vary, depending upon the number of tables on the document.
Using metadata field mapping, it is then possible to normalise the data. For example, a table returned by the model as Table1 can be mapped to a metadata field called “LineItems”. Furthermore, it is possible to filter out spurious tables returned by the model by implementing table filters. See the online handbook for more information.
Handwriting language support
Language
|
Language
|
English
|
Japanese
|
Chinese Simplified
|
Korean
|
French
|
Portuguese
|
German
|
Spanish
|
Italian
|
Russian
|
Thai
|
Arabic
|
Printed language support
Language
|
Abaza
|
Abkhazian
|
Achinese
|
Acoli
|
Adangme
|
Adyghe
|
Afar
|
Afrikaans
|
Akan
|
Albanian
|
Algonquin
|
Angika (Devanagari)
|
Arabic
|
Asturian
|
Asu (Tanzania)
|
Avaric
|
Awadhi-Hindi (Devanagari)
|
Aymara
|
Azerbaijani (Latin)
|
Bafia
|
Bagheli
|
Bambara
|
Bashkir
|
Basque
|
Belarusian (Cyrillic)
|
Belarusian (Latin)
|
Bemba (Zambia)
|
Bena (Tanzania)
|
Bhojpuri-Hindi (Devanagari)
|
Bikol
|
Bini
|
Bislama
|
Bodo (Devanagari)
|
Bosnian (Latin)
|
Brajbha
|
Breton
|
Bulgarian
|
Bundeli
|
Buryat (Cyrillic)
|
Catalan
|
Cebuano
|
Chamling
|
Chamorro
|
Chechen
|
Chhattisgarhi (Devanagari)
|
Chiga
|
Chinese Simplified
|
Chinese Traditional
|
Choctaw
|
Chukot
|
Chuvash
|
Cornish
|
Corsican
|
Cree
|
Creek
|
Crimean Tatar (Latin)
|
Croatian
|
Crow
|
Czech
|
Danish
|
Dargwa
|
Dari
|
Dhimal (Devanagari)
|
Dogri (Devanagari)
|
Duala
|
Dungan
|
Dutch
|
Efik
|
English
|
Erzya (Cyrillic)
|
Estonian
|
Faroese
|
Fijian
|
Filipino
|
Finnish
|
Fon
|
French
|
Friulian
|
Ga
|
Gagauz (Latin)
|
Galician
|
Ganda
|
Gayo
|
German
|
Gilbertese
|
Gondi (Devanagari)
|
Greek
|
Greenlandic
|
Guarani
|
Gurung (Devanagari)
|
Gusii
|
Haitian Creole
|
Halbi (Devanagari)
|
Hani
|
Haryanvi
|
Hawaiian
|
Hebrew
|
Herero
|
Hiligaynon
|
Hindi
|
Hmong Daw (Latin)
|
Ho(Devanagiri)
|
Hungarian
|
Iban
|
Icelandic
|
Igbo
|
Iloko
|
Inari Sami
|
Indonesian
|
Ingush
|
Interlingua
|
Inuktitut (Latin)
|
Irish
|
Italian
|
Japanese
|
Jaunsari (Devanagari)
|
Javanese
|
Jola-Fonyi
|
Kabardian
|
Kabuverdianu
|
Kachin (Latin)
|
Kalenjin
|
Kalmyk
|
Kangri (Devanagari)
|
Kanuri
|
Karachay-Balkar
|
Kara-Kalpak (Cyrillic)
|
Kara-Kalpak (Latin)
|
Kashubian
|
Kazakh (Cyrillic)
|
Kazakh (Latin)
|
Khakas
|
Khaling
|
Khasi
|
K'iche'
|
Kikuyu
|
Kildin Sami
|
Kinyarwanda
|
Komi
|
Kongo
|
Korean
|
Korku
|
Koryak
|
Kosraean
|
Kpelle
|
Kuanyama
|
Kumyk (Cyrillic)
|
Kurdish (Arabic)
|
Kurdish (Latin)
|
Kurukh (Devanagari)
|
Kyrgyz (Cyrillic)
|
Lak
|
Lakota
|
Latin
|
Latvian
|
Lezghian
|
Lingala
|
Lithuanian
|
Lower Sorbian
|
Lozi
|
Lule Sami
|
Luo (Kenya and Tanzania)
|
Luxembourgish
|
Luyia
|
Macedonian
|
Machame
|
Madurese
|
Mahasu Pahari (Devanagari)
|
Makhuwa-Meetto
|
Makonde
|
Malagasy
|
Malay (Latin)
|
Maltese
|
Malto (Devanagari)
|
Mandinka
|
Manx
|
Maori
|
Mapudungun
|
Marathi
|
Mari (Russia)
|
Masai
|
Mende (Sierra Leone)
|
Meru
|
Meta'
|
Minangkabau
|
Mohawk
|
Mongolian (Cyrillic)
|
Mongondow
|
Montenegrin (Cyrillic)
|
Montenegrin (Latin)
|
Morisyen
|
Mundang
|
Nahuatl
|
Navajo
|
Ndonga
|
Neapolitan
|
Nepali
|
Ngomba
|
Niuean
|
Nogay
|
North Ndebele
|
Northern Sami (Latin)
|
Norwegian
|
Nyanja
|
Nyankole
|
Nzima
|
Occitan
|
Ojibwa
|
Oromo
|
Ossetic
|
Pampanga
|
Pangasinan
|
Papiamento
|
Pashto
|
Pedi
|
Persian
|
Polish
|
Portuguese
|
Punjabi (Arabic)
|
Quechua
|
Ripuarian
|
Romanian
|
Romansh
|
Rundi
|
Russian
|
Rwa
|
Sadri (Devanagari)
|
Sakha
|
Samburu
|
Samoan (Latin)
|
Sango
|
Sangu (Gabon)
|
Sanskrit (Devanagari)
|
Santali(Devanagiri)
|
Scots
|
Scottish Gaelic
|
Sena
|
Serbian (Cyrillic)
|
Serbian (Latin)
|
Shambala
|
Shona
|
Siksika
|
Sirmauri (Devanagari)
|
Skolt Sami
|
Slovak
|
Slovenian
|
Soga
|
Somali (Arabic)
|
Somali (Latin)
|
Songhai
|
South Ndebele
|
Southern Altai
|
Southern Sami
|
Southern Sotho
|
Spanish
|
Sundanese
|
Swahili (Latin)
|
Swati
|
Swedish
|
Tabassaran
|
Tachelhit
|
Tahitian
|
Taita
|
Tajik (Cyrillic)
|
Tamil
|
Tatar (Cyrillic)
|
Tatar (Latin)
|
Teso
|
Tetum
|
Thai
|
Thangmi
|
Tok Pisin
|
Tongan
|
Tsonga
|
Tswana
|
Turkish
|
Turkmen (Latin)
|
Tuvan
|
Udmurt
|
Uighur (Cyrillic)
|
Ukrainian
|
Upper Sorbian
|
Urdu
|
Uyghur (Arabic)
|
Uzbek (Arabic)
|
Uzbek (Cyrillic)
|
Uzbek (Latin)
|
Vietnamese
|
Volapük
|
Vunjo
|
Walser
|
Welsh
|
Western Frisian
|
Wolof
|
Xhosa
|
Yucatec Maya
|
Zapotec
|
Zarma
|
Zhuang
|
Zulu
|
|