Keywords: Kuzushiji, KMNIST, Handwriting Recognition, Document Image Analysis, Machine Learning, ICDAR 2019
Related work: KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition with Deep Learning (https://arxiv.org/abs/1910.09433)
The code and explanations are available on my GitHub.
Opening the door to a thousand years of Japanese culture
Imagine the history contained in a thousand years of books. What stories are in those books? What knowledge can we learn from the world before our time? What was the weather like 500 years ago? What happened when Mt. Fuji erupted? How can one fold 100 cranes using only one piece of paper? The answers to these questions are in those books.
Japan has millions of books and over a billion historical documents such as personal letters or diaries preserved nationwide. Most of them cannot be read by the majority of Japanese people living today because they were written in “Kuzushiji”.
Even though Kuzushiji, a cursive writing style, was used in Japan for over a thousand years, there are very few fluent readers of it today (only 0.01% of modern Japanese natives). Because so few people can read these texts, there has been a great deal of interest in using machine learning to automatically recognize them and transcribe them into modern Japanese characters. Nevertheless, several challenges in Kuzushiji recognition have made the performance of existing systems extremely poor.
The competition hosts need help from machine learning experts to transcribe Kuzushiji into contemporary Japanese characters. A model that does this well would be not only a great contribution to the machine learning community, but also a great help in making millions of documents more accessible, leading to new discoveries in Japanese history and culture.