Chandranath Adak is currently a researcher at CIBCI (Computational Intelligence & Brain-Computer Interface) Lab, University of Technology Sydney (UTS). He submitted his Ph.D. (Analytics) thesis on 18th January 2019, in the School of Software, FEIT, UTS and carried out the Ph.D. work under the supervision of Prof. Michael Blumenstein (principal), Prof. Bidyut B. Chaudhuri (co-) and Prof. C. T. Lin (co-).
Mr. Adak worked as a visiting researcher in different labs including School of Software, University of Technology Sydney (Mar. 2016, Feb. 2017, and Feb. 2018), DINFO, University of Florence, Italy (Jul. 2017 - Sep. 2017) and CVPR Unit, Indian Statistical Institute (Sep. 2015 - Feb. 2016).
Prior to joining UTS as a Ph.D. student, he was an HDR (Higher Degree Research) candidate in IIIS, School of ICT, Griffith University, Gold Coast, Australia. Before that, he worked as a research project linked personnel in CVPR Unit, Indian Statistical Institute, Kolkata, under the guidance of Prof. Bidyut B. Chaudhuri. Earlier, he completed his M.Tech. (2014) at the University of Kalyani, India, and his B.Tech. (2012) at West Bengal University of Technology, India, both in Computer Science and Engineering.
Chandranath's areas of interest are Image Processing, Pattern Recognition, Document Image Analysis, Data Analysis, Computer Vision, and Artificial Intelligence-related subjects.
School of Software, FEIT, University of Technology Sydney, Australia
Instructor | Autumn 2018
— UTS Subject code: #31250: “Introduction to Data Analytics” (Undergraduate).
— UTS Subject code: #32130: “Fundamentals of Data Analytics” (Postgraduate).
Tutor | Autumn 2019
— UTS Subject code: #32555: “Fundamentals of Software Development” (Postgraduate).
Adak, C, Chaudhuri, BB & Blumenstein, M 2019, 'An Empirical Study on Writer Identification and Verification From Intra-Variable Individual Handwriting', IEEE Access, vol. 7, pp. 24738-24758.
This paper deals with the identification and processing of struck-out texts in unconstrained offline handwritten document images. If run through an OCR engine, such texts will produce nonsense character-string outputs. Here we present a combined (a) pattern classification and (b) graph-based method for identifying such texts. In case (a), a feature-based two-class (normal vs. struck-out text) SVM classifier is used to detect moderate-sized struck-out components. In case (b), the skeleton of the text component is considered as a graph and the strike-out stroke is identified using a constrained shortest path algorithm. To identify zigzag or wavy struck-outs, all paths are found and some properties of zigzag and wavy lines are utilized. Some other types of strike-out stroke are also detected by modifying the above method. Large multi-word and multi-line struck-outs are segmented into smaller components and treated as above. The detected struck-out texts can then be blocked from entering the OCR engine. In another kind of application involving historical documents, page images along with their annotated ground-truth are to be generated. In this case the strike-out strokes can be deleted from the words, which are then fed to the OCR engine. For this purpose an inpainting-based cleaning approach is employed. We worked on 500 pages of documents and obtained an overall F-Measure of 91.56% (91.06%) in English (Bengali) script for struck-out text detection. Also, for strike-out stroke identification and deletion, the F-Measures obtained were 89.65% (89.31%) and 91.16% (89.29%), respectively.
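As a rough illustration of the graph-based step (b), the sketch below treats skeleton pixels as graph nodes and searches for the cheapest left-to-right path with Dijkstra's algorithm. The abstract does not state the actual constraints, so the extra cost on vertical moves is an assumed stand-in that simply favours near-horizontal strokes. The names `strike_path` and `vertical_penalty` are hypothetical.

```python
import heapq


def strike_path(skeleton, vertical_penalty=3.0):
    """Cheapest 8-connected path from the leftmost to the rightmost
    skeleton pixel, or [] if the two are not connected.

    skeleton: list of equal-length rows of 0/1 (1 = skeleton pixel).
    vertical_penalty: assumed extra cost per row changed; it biases the
    path toward horizontal strokes and is not taken from the paper.
    """
    pixels = {(r, c) for r, row in enumerate(skeleton)
              for c, v in enumerate(row) if v}
    if not pixels:
        return []
    start = min(pixels, key=lambda p: (p[1], p[0]))   # leftmost pixel
    goal = max(pixels, key=lambda p: (p[1], -p[0]))   # rightmost pixel
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        r, c = node
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nxt = (r + dr, c + dc)
                if nxt == node or nxt not in pixels:
                    continue
                cost = d + 1.0 + vertical_penalty * abs(dr)
                if cost < dist.get(nxt, float("inf")):
                    dist[nxt] = cost
                    prev[nxt] = node
                    heapq.heappush(heap, (cost, nxt))
    if goal not in dist:
        return []
    # Walk the predecessor links back from the goal to the start.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]
```

On a toy skeleton shaped like a cross, the returned path follows the horizontal bar and ignores the vertical one, which is the behaviour one would want from a straight strike-out stroke candidate.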
Adak, C, Marinai, S, Chaudhuri, BB & Blumenstein, M 2018, 'Offline Bengali writer verification by PDF-CNN and siamese net', Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018, pp. 381-386.
© 2018 IEEE. Automated handwriting analysis is a popular area of research owing to the variation of writing patterns. In this research area, writer verification is one of the most challenging branches, having direct impact on biometrics and forensics. In this paper, we deal with offline writer verification on complex handwriting patterns. Therefore, we choose a relatively complex script, i.e., Indic Abugida script Bengali (or, Bangla) containing more than 250 compound characters. From a handwritten sample, the probability distribution functions (PDFs) of some handcrafted features are obtained and input to a convolutional neural network (CNN). For such a CNN architecture, we coin the term 'PDF-CNN', where handcrafted feature PDFs are hybridized with auto-derived CNN features. Such hybrid features are then fed into a Siamese neural network for writer verification. The experiments are performed on a Bengali offline handwritten dataset of 100 writers. Our system achieves encouraging results, which sometimes exceed the results of state-of-the-art techniques on writer verification.
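To make the idea of a handcrafted-feature PDF concrete, here is a minimal sketch that turns raw feature measurements into a normalized histogram. The abstract does not list the features or the binning, so the function name `feature_pdf`, the bin count, and the example of stroke-direction angles in degrees are all assumptions for illustration only.

```python
def feature_pdf(values, n_bins, lo, hi):
    """Normalized histogram of `values` over [lo, hi) with n_bins equal bins.

    Values outside the range are ignored. If nothing falls inside the
    range, a vector of zeros is returned.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            # Clamp guards against floating-point edge cases near `hi`.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins
```

For example, hypothetical stroke angles `[10, 20, 100, 170]` binned into four 45-degree bins over `[0, 180)` give `[0.5, 0.0, 0.25, 0.25]`. A vector of this kind is what would be handed to the network as input in place of raw pixels.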
Adak, C, Chaudhuri, BB & Blumenstein, M 2018, 'Cognitive Analysis for Reading and Writing of Bengali Conjuncts', Proceedings of the International Joint Conference on Neural Networks, International Joint Conference on Neural Networks, IEEE, Rio de Janeiro, Brazil, pp. 1-7.
© 2018 IEEE. In this paper, we study the difficulties that humans face in reading and writing Bengali conjunct characters. Such difficulties appear when the human cognitive system faces certain obstructions to effortless reading/writing. In our computer-based investigation, we consider the reading/writing difficulty analysis task as a machine learning problem supervised by human perception. To this end, we employ two distinct models: (a) an auto-derived feature-based Inception network and (b) a hand-crafted feature-based SVM (Support Vector Machine). Two commonly used Bengali printed fonts and three contemporary handwritten databases are used for collecting subjective opinion scores from human readers/writers. We conduct our experiments on this corpus, which contains the perceptive ground-truth opinion of reading/writing complications. The experimental results obtained on various types of conjunct characters are promising.
Adak, C, Chaudhuri, BB & Blumenstein, M 2018, 'A study on idiosyncratic handwriting with impact on writer identification', Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR, International Conference on Frontiers in Handwriting Recognition, Niagara Falls, NY, USA, pp. 193-198.
© 2018 IEEE. In this paper, we study handwriting idiosyncrasy in terms of its structural eccentricity. Our approach is to find idiosyncratic handwritten text components and model the idiosyncrasy analysis task as a machine learning problem supervised by human cognition. We employ the Inception network for this purpose. The experiments are performed on two publicly available databases and an in-house database of Bengali offline handwritten samples. On these samples, subjective opinion scores of handwriting idiosyncrasy are collected from handwriting experts. We have analyzed the handwriting idiosyncrasy on this corpus, which comprises the perceptive ground-truth opinion. We also investigate the effect of idiosyncratic text on writer identification using SqueezeNet. The performance of our system is promising.
Adak, C, Chaudhuri, BB & Blumenstein, M 2017, 'Impact of struck-out text on writer identification', Proceedings of the International Joint Conference on Neural Networks, International Joint Conference on Neural Networks, IEEE, Anchorage, AK, USA, pp. 1465-1471.
© 2017 IEEE. The presence of struck-out text in handwritten manuscripts may affect the accuracy of automated writer identification. This paper presents a study on such effects of struck-out text. Here we consider offline English and Bengali handwritten document images. At first, the struck-out texts are detected using a hybrid classifier of a CNN (Convolutional Neural Network) and an SVM (Support Vector Machine). Then the writer identification process is activated on normal and struck-out text separately, to ascertain the impact of struck-out texts. For writer identification, we use two methods: (a) a hand-crafted feature-based SVM classifier, and (b) CNN-extracted auto-derived features with a recurrent neural model. For the experimental analysis, we have generated a database from 100 English and 100 Bengali writers. The performance of our system is very encouraging.
Adak, C, Chaudhuri, BB & Blumenstein, M 2017, 'Legibility and Aesthetic Analysis of Handwriting', Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, IAPR International Conference on Document Analysis and Recognition, IEEE, Kyoto, Japan, pp. 175-182.
© 2017 IEEE. This paper deals with computer-based cognitive analysis of the legibility and aesthetics of a handwritten document. Legible text creates a human perception that the writing can be read effortlessly because of its orthographic clarity. The aesthetic property relates to the beautiful appearance of a handwritten document. In this study, we deal with these properties on offline Bengali handwriting. We formulate both legibility and aesthetic analysis tasks as machine learning problems supervised by the human cognitive system. We employ automatically derived feature-based recurrent neural networks to investigate writing legibility. For aesthetics evaluation, we employ hand-crafted feature-based support vector machines (SVMs). We have collected contemporary Bengali handwriting samples, on which the subjective legibility and aesthetic scores are provided by human readers. On this corpus containing legibility and aesthetic ground-truth information, we executed our experiments. The experimental results obtained on various handwriting samples are encouraging.
Adak, C, Chaudhuri, BB & Blumenstein, M 2016, 'Named Entity Recognition from Unstructured Handwritten Document Images', Proceedings - 12th IAPR International Workshop on Document Analysis Systems, DAS 2016, International Workshop on Document Analysis Systems, IEEE, Santorini, Greece, pp. 375-380.
© 2016 IEEE. Named entity recognition is an important topic in the field of natural language processing, whereas in document image processing, such recognition is quite challenging without employing any linguistic knowledge. In this paper we propose an approach to detect named entities (NEs) directly from offline handwritten unstructured document images without explicit character/word recognition, and with very little aid from natural language and script rules. At the preprocessing stage, the document image is binarized, and then the text is segmented into words. The slant/skew/baseline corrections of the words are also performed. After preprocessing, the words are sent for NE recognition. We analyze the structural and positional characteristics of NEs and extract some relevant features from the word image. Then the BLSTM neural network is used for NE recognition. Our system also contains a post-processing stage to reduce the true NE rejection rate. The proposed approach produces encouraging results on both historical and modern document images, including those from an Australian archive, which are reported here for the very first time.
Adak, C, Chaudhuri, BB & Blumenstein, M 2016, 'Offline cursive Bengali word recognition using CNNs with a recurrent model', Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR, International Conference on Frontiers in Handwriting Recognition, IEEE, Shenzhen, China, pp. 429-434.
© 2016 IEEE. This paper deals with offline handwritten word recognition of a major Indic script: Bengali. Due to the structure of this script, the characters (mostly ortho-syllables) are frequently overlapping and hard to segment, especially when the writing is cursive. Individual character recognition and the combination of outputs can increase the likelihood of errors. Instead, a better approach can be sending the whole word to a suitable recognizer. Here we use the Convolutional Neural Network (CNN) integrated with a recurrent model for this purpose. Long short-term memory blocks are used as hidden units. Also, the CNN-derived features are employed in a recurrent model with a CTC (Connectionist Temporal Classification) layer to get the output. We have tested our method on three datasets: (a) a publicly available dataset, (b) a new dataset generated by our research group and (c) an unconstrained dataset. Dataset (a) contains 17,091 words, while our dataset (b) contains 107,550 words in total. In addition to these, dataset (c) comprises 5,223 words. We have compared our results with those of some earlier work in the area and have found improved performance, which is due to the novel integration of CNNs with the recurrent model.
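The role of the CTC layer at inference time can be illustrated with greedy (best-path) decoding: take the most likely label per frame, collapse consecutive repeats, and drop blanks. This is a generic sketch of that standard decoding rule, not the authors' implementation; the abstract does not say which decoding strategy was used. The function name, the blank index, and the toy alphabet are assumptions.

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Best-path CTC decoding.

    frame_probs: list of per-frame probability lists over label indices.
    alphabet: dict mapping each non-blank label index to its symbol.
    blank: index reserved for the CTC blank label (assumed to be 0).
    """
    # Most likely label index for each frame.
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    out = []
    prev = None
    for idx in best:
        # Emit a symbol only when the label changes and is not the blank.
        if idx != prev and idx != blank:
            out.append(alphabet[idx])
        prev = idx
    return "".join(out)
```

With a toy alphabet `{1: "a", 2: "b"}`, a per-frame best path of `a a blank a b b blank` decodes to `"aab"`: the blank between the two runs of `a` is what allows a repeated character to survive the collapsing step.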
Adak, C, Chaudhuri, BB & Blumenstein, M 2016, 'Writer identification by training on one script but testing on another', Proceedings - International Conference on Pattern Recognition, International Conference on Pattern Recognition, Mexico, pp. 1153-1158.
© 2016 IEEE. This paper deals with identifying a writer from his/her offline handwriting. In a multilingual country where a writer can scribe in multiple scripts, writer identification becomes challenging when we have individual handwriting data in one script while we need to verify/identify a writer from handwriting in another script. In this paper such an issue is addressed with two scripts: English and Bengali. Here we model the task as a classification problem, where training data contains only Bengali handwritten samples and testing is performed on English handwritten texts. This work is based on the understanding that a writer has some inherent stroke characteristics that are independent of the script in which (s)he writes. In this work, some implicit structural and statistical features are extracted, and multiple classifiers are employed for writer identification. Many training sessions are run on a database of 100 writers and the performances are analyzed. We have obtained encouraging results on this database, which show the effectiveness of our method.
Adak, C & Chaudhuri, BB 2015, 'Writer Identification from Offline Isolated Bangla Characters and Numerals', 2015 13th IAPR International Conference on Document Analysis and Recognition (ICDAR), IEEE, Nancy, France, pp. 486-490.
Adak, C, Maitra, P, Chaudhuri, BB & Blumenstein, M 2015, 'Binarization of old halftone text documents', IEEE Region 10 Annual International Conference, Proceedings/TENCON, IEEE Tencon (IEEE Region 10 Conference), IEEE, Macao, China, pp. 1-5.
© 2015 IEEE. A degraded document image should be cleaned before being subjected to Optical Character Recognition (OCR); otherwise the result may be erroneous. Though major studies have been conducted on degraded document image cleaning, halftone documents have received less attention. Since halftone documents contain halftone dot patterns, classical binarization techniques do not produce proper output for feeding into the OCR engine. In this paper, old halftone documents are considered for text area cleaning and binarization. First, the zone of interest (text area) is found using local binary patterns and contour analysis. Reasonably smaller zones are filtered out as noise. Then the foreground pixels are separated using background estimation. After this, an automated spatial smoothing technique is employed on the foreground. Finally, a local binarization technique is used to produce the binary image. The proposed method is tested on various old and degraded halftone documents and has produced fairly good results.
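The final local binarization step can be pictured with a simple local-mean threshold, sketched below. The abstract does not name the specific local technique, so this mean-based rule is only a stand-in for the general idea of thresholding each pixel against its neighbourhood. The function name, `radius`, and `offset` are hypothetical.

```python
def local_mean_binarize(gray, radius=1, offset=0):
    """Binarize a grayscale image against a local mean.

    gray: list of equal-length rows of intensities (0 = black, 255 = white).
    radius: half-size of the square window, clipped at the image border.
    offset: how much darker than the local mean a pixel must be.
    Returns rows of 0/1 where 1 marks a foreground (dark) pixel.
    """
    h = len(gray)
    w = len(gray[0]) if h else 0
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [gray[rr][cc]
                      for rr in range(max(0, r - radius), min(h, r + radius + 1))
                      for cc in range(max(0, c - radius), min(w, c + radius + 1))]
            mean = sum(window) / len(window)
            row.append(1 if gray[r][c] < mean - offset else 0)
        out.append(row)
    return out
```

On a 3x3 patch of intensity 200 with a single dark pixel of 50 at the centre, calling the function with `offset=10` marks only the centre as foreground. A uniform patch yields no foreground at all, since no pixel is darker than its own neighbourhood.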
Adak, C & Chaudhuri, BB 2014, 'An Approach of Strike-through Text Identification from Handwritten Documents', 2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR), IEEE, Hersonissos, Greece, pp. 643-648.
Adak, C & Chaudhuri, BB 2014, 'Text Line Identification in Tagore's Manuscript', 2014 IEEE Students' Technology Symposium (IEEE TechSym), 3rd IEEE Students' Technology Symposium (IEEE TechSym), IEEE, IIT Kharagpur, Kharagpur, India, pp. 210-213.
Adak, C 2014, 'An Approach for Printed Document Labeling', 2014 First International Conference on Automation, Control, Energy & Systems (ACES-14), IEEE, India, pp. 23-26.
Adak, C 2014, 'A Bilingual Machine Translation System: English & Bengali', 2014 First International Conference on Automation, Control, Energy & Systems (ACES-14), IEEE, India, pp. 271-274.
Adak, C 2013, 'Gabor Filter and Rough Clustering Based Edge Detection', 2013 International Conference on Human Computer Interactions (ICHCI), IEEE, Chennai, India.
Adak, C 2013, 'Unsupervised Text Extraction from G-Maps', 2013 International Conference on Human Computer Interactions (ICHCI), IEEE, Chennai, India.
Adak, C & Chaudhuri, BB 2013, 'Extraction of doodles and drawings from manuscripts', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 515-520.
In this paper we propose an approach to separate the non-texts from the texts of a manuscript. The non-texts are mainly in the form of doodles and drawings by some exceptional thinkers and writers. These have enormous historical value for the study of those writers' subconscious as well as productive minds. We also propose a computational approach to recover the struck-out texts to reduce human effort. The proposed technique has a preprocessing stage, which removes noise using a median filter and segments the object region using fuzzy c-means clustering. Next, connected component analysis finds the major portions of non-texts, and window examination eliminates the partially attached texts. The struck-out texts are extracted by eliminating straight lines, measuring the degree of continuity, and using some morphological operations. © Springer-Verlag 2013.
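The noise-removal part of the preprocessing stage can be sketched with a plain square-window median filter, shown below. The window size and the border handling (clipping the window at the image edge) are assumptions, since the abstract gives no parameters, and the function name is hypothetical.

```python
from statistics import median_low


def median_filter(gray, radius=1):
    """Replace each pixel with the median of its square neighbourhood.

    gray: list of equal-length rows of integer intensities.
    radius: half-size of the window; the window is clipped at the border.
    median_low is used so the result is always one of the input values.
    """
    h = len(gray)
    w = len(gray[0]) if h else 0
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [gray[rr][cc]
                      for rr in range(max(0, r - radius), min(h, r + radius + 1))
                      for cc in range(max(0, c - radius), min(w, c + radius + 1))]
            row.append(median_low(window))
        out.append(row)
    return out
```

A single bright speck of 255 in an otherwise flat 3x3 patch of intensity 10 is removed completely, which is the salt-and-pepper behaviour that makes median filtering a common first step before clustering or component analysis.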