In News: Researchers from IIT Madras have developed a method for reading documents in Bharati script using a multi-lingual optical character recognition (OCR) scheme. The team had earlier developed the Bharati Script, a unified script for nine Indian languages.

About Bharati Script:

  • It is an alternative script for the languages of India, developed by a team at the Indian Institute of Technology (IIT) Madras led by Dr. Srinivasa Chakravarthy.
  • Type of writing system: alphabet
  • Direction of writing: left to right in horizontal lines
  • Used to write: all major languages of India
  • The scripts that have been integrated include Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Telugu, Kannada, Malayalam and Tamil.
  • The Bharati characters are made up of three tiers stacked vertically. 
  • It has 17 vowels and 22 consonants.

It is hoped that a common script for the entire country will bring down many communication barriers in India.

Optical Character Recognition (OCR) scheme:

  • The first step is separating (or segmenting) the document into text and non-text.
  • The text is then segmented into paragraphs, sentences, words and letters.
  • Each letter must then be recognised and mapped to a character in a standard encoding such as ASCII or Unicode.
  • A letter has various components, such as the basic consonant, consonant modifiers and vowels.
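
The steps above can be illustrated with a rough Python sketch. It is not the IIT Madras method: a real OCR system segments a scanned image, whereas this toy works on plain text that has already been typed. The function names `segment` and `describe` are hypothetical, and Devanagari is used for the example because it is one of the integrated scripts and is available in Unicode.

```python
import unicodedata


def segment(text):
    """Split plain text into paragraphs -> sentences -> words -> letters.

    Assumptions for this sketch: paragraphs are separated by a blank line,
    sentences end with a full stop, and a "letter" is one Unicode code point.
    """
    paragraphs = []
    for para in text.split("\n\n"):
        sentences = []
        for sent in para.split("."):
            words = [list(word) for word in sent.split()]
            if words:  # skip empty pieces left over by the split
                sentences.append(words)
        if sentences:
            paragraphs.append(sentences)
    return paragraphs


def describe(letter):
    """Return (code point, Unicode name) for each component of a letter."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in letter]


# The Devanagari syllable "ki" is a basic consonant plus a vowel sign.
for code_point, name in describe("\u0915\u093f"):
    print(code_point, name)
# U+0915 DEVANAGARI LETTER KA
# U+093F DEVANAGARI VOWEL SIGN I
```

Here `segment` mirrors the paragraph, sentence, word and letter hierarchy, while `describe` shows how a single written letter breaks down into a basic consonant and a vowel component, each with its own Unicode code point.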