AI Key Document Labels Extraction
The AI Key Document Labels model is not tied to a particular document type; it extracts data from any structured or semi-structured document. With this model, Scan2x looks for keys and values anywhere on the document that are presented in one of the following formats:
Key: Value
Or
Key: Value line 1
Value line 2
Examples include an Air Waybill number, a name, an address, and tabular data.
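To make the two layouts above concrete, here is a minimal Python sketch of how "Key: Value" text, including a value that continues on a second line, could be collected into fields. It is an illustration of the format only, not how the Scan2x model works internally; the function name `parse_key_values` and the sample text are hypothetical.

```python
def parse_key_values(lines):
    """Collect 'Key: Value' pairs from text lines.

    Assumption for this sketch: a line without a colon is treated as a
    continuation of the previous key's value.
    """
    fields = {}
    current_key = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip():
            # New "Key: Value" pair
            current_key = key.strip()
            fields[current_key] = value.strip()
        elif current_key is not None:
            # "Value line 2": append to the value of the previous key
            fields[current_key] = (fields[current_key] + "\n" + line).strip()
    return fields


# Hypothetical sample text, not taken from a real document
sample = [
    "Air Waybill No: 123-45678901",
    "Name: Jane Doe",
    "Address: 12 Example Street",
    "Valletta",
]
result = parse_key_values(sample)
```

In this sketch the address ends up as a single two-line value under the key `Address`, mirroring the second layout shown above.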
For every value found, the key is used as the metadata field name, so field names vary from one form type to another depending on the information found on the document. Using metadata field mapping, it is then possible to normalise the data, as with the Tables extraction above.
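The idea behind field mapping can be sketched in a few lines of Python: keys that differ between form types are mapped onto one common field name. This is only an illustration of the concept; the mapping table, the field names and the function `normalise_fields` are hypothetical and do not reflect how mapping is configured in Scan2x.

```python
# Hypothetical mapping from keys as found on documents to normalised field names
FIELD_MAP = {
    "awb no": "AirWaybillNumber",
    "air waybill number": "AirWaybillNumber",
    "name": "Name",
    "customer name": "Name",
}


def normalise_fields(extracted, field_map=FIELD_MAP):
    """Rename extracted keys to common field names; unmapped keys are kept as-is."""
    normalised = {}
    for key, value in extracted.items():
        # Compare case-insensitively, ignoring full stops and extra spaces
        lookup = " ".join(key.lower().replace(".", "").split())
        normalised[field_map.get(lookup, key)] = value
    return normalised


# Two hypothetical form types that label the same data differently
form_a = {"AWB No.": "123-45678901", "Customer Name": "Jane Doe"}
form_b = {"Air Waybill Number": "987-65432109", "Name": "John Roe", "Remarks": "Fragile"}

normalised_a = normalise_fields(form_a)
normalised_b = normalise_fields(form_b)
```

After normalisation both forms expose `AirWaybillNumber` and `Name`, while the unmapped `Remarks` key on the second form is passed through unchanged.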
Handwritten language support
English
Japanese
Chinese Simplified
Korean
French
Portuguese
German
Spanish
Italian
Russian
Thai
Arabic
Printed language support
Abaza
Abkhazian
Achinese
Acoli
Adangme
Adyghe
Afar
Afrikaans
Akan
Albanian
Algonquin
Angika (Devanagari)
Arabic
Asturian
Asu (Tanzania)
Avaric
Awadhi-Hindi (Devanagari)
Aymara
Azerbaijani (Latin)
Bafia
Bagheli
Bambara
Bashkir
Basque
Belarusian (Cyrillic)
Belarusian (Latin)
Bemba (Zambia)
Bena (Tanzania)
Bhojpuri-Hindi (Devanagari)
Bikol
Bini
Bislama
Bodo (Devanagari)
Bosnian (Latin)
Brajbha
Breton
Bulgarian
Bundeli
Buryat (Cyrillic)
Catalan
Cebuano
Chamling
Chamorro
Chechen
Chhattisgarhi (Devanagari)
Chiga
Chinese Simplified
Chinese Traditional
Choctaw
Chukot
Chuvash
Cornish
Corsican
Cree
Creek
Crimean Tatar (Latin)
Croatian
Crow
Czech
Danish
Dargwa
Dari
Dhimal (Devanagari)
Dogri (Devanagari)
Duala
Dungan
Dutch
Efik
English
Erzya (Cyrillic)
Estonian
Faroese
Fijian
Filipino
Finnish
Fon
French
Friulian
Ga
Gagauz (Latin)
Galician
Ganda
Gayo
German
Gilbertese
Gondi (Devanagari)
Greek
Greenlandic
Guarani
Gurung (Devanagari)
Gusii
Haitian Creole
Halbi (Devanagari)
Hani
Haryanvi
Hawaiian
Hebrew
Herero
Hiligaynon
Hindi
Hmong Daw (Latin)
Ho (Devanagari)
Hungarian
Iban
Icelandic
Igbo
Iloko
Inari Sami
Indonesian
Ingush
Interlingua
Inuktitut (Latin)
Irish
Italian
Japanese
Jaunsari (Devanagari)
Javanese
Jola-Fonyi
Kabardian
Kabuverdianu
Kachin (Latin)
Kalenjin
Kalmyk
Kangri (Devanagari)
Kanuri
Karachay-Balkar
Kara-Kalpak (Cyrillic)
Kara-Kalpak (Latin)
Kashubian
Kazakh (Cyrillic)
Kazakh (Latin)
Khakas
Khaling
Khasi
K'iche'
Kikuyu
Kildin Sami
Kinyarwanda
Komi
Kongo
Korean
Korku
Koryak
Kosraean
Kpelle
Kuanyama
Kumyk (Cyrillic)
Kurdish (Arabic)
Kurdish (Latin)
Kurukh (Devanagari)
Kyrgyz (Cyrillic)
Lak
Lakota
Latin
Latvian
Lezghian
Lingala
Lithuanian
Lower Sorbian
Lozi
Lule Sami
Luo (Kenya and Tanzania)
Luxembourgish
Luyia
Macedonian
Machame
Madurese
Mahasu Pahari (Devanagari)
Makhuwa-Meetto
Makonde
Malagasy
Malay (Latin)
Maltese
Malto (Devanagari)
Mandinka
Manx
Maori
Mapudungun
Marathi
Mari (Russia)
Masai
Mende (Sierra Leone)
Meru
Meta'
Minangkabau
Mohawk
Mongolian (Cyrillic)
Mongondow
Montenegrin (Cyrillic)
Montenegrin (Latin)
Morisyen
Mundang
Nahuatl
Navajo
Ndonga
Neapolitan
Nepali
Ngomba
Niuean
Nogay
North Ndebele
Northern Sami (Latin)
Norwegian
Nyanja
Nyankole
Nzima
Occitan
Ojibwa
Oromo
Ossetic
Pampanga
Pangasinan
Papiamento
Pashto
Pedi
Persian
Polish
Portuguese
Punjabi (Arabic)
Quechua
Ripuarian
Romanian
Romansh
Rundi
Russian
Rwa
Sadri (Devanagari)
Sakha
Samburu
Samoan (Latin)
Sango
Sangu (Gabon)
Sanskrit (Devanagari)
Santali (Devanagari)
Scots
Scottish Gaelic
Sena
Serbian (Cyrillic)
Serbian (Latin)
Shambala
Shona
Siksika
Sirmauri (Devanagari)
Skolt Sami
Slovak
Slovenian
Soga
Somali (Arabic)
Somali (Latin)
Songhai
South Ndebele
Southern Altai
Southern Sami
Southern Sotho
Spanish
Sundanese
Swahili (Latin)
Swati
Swedish
Tabassaran
Tachelhit
Tahitian
Taita
Tajik (Cyrillic)
Tamil
Tatar (Cyrillic)
Tatar (Latin)
Teso
Tetum
Thai
Thangmi
Tok Pisin
Tongan
Tsonga
Tswana
Turkish
Turkmen (Latin)
Tuvan
Udmurt
Uighur (Cyrillic)
Ukrainian
Upper Sorbian
Urdu
Uyghur (Arabic)
Uzbek (Arabic)
Uzbek (Cyrillic)
Uzbek (Latin)
Vietnamese
Volapük
Vunjo
Walser
Welsh
Western Frisian
Wolof
Xhosa
Yucatec Maya
Zapotec
Zarma
Zhuang
Zulu