4.1. Thresholding and connected components detection The separation between the text zones and the location stages considerably increases the computation time and lead to an over-segmentation of the noise and paper texture on empty zones of the image. Indeed, none of the traditional methods (whether global [12] or local [13]) efficiently meets all the required conditions, in particular, a certain efficiency on all the images during a limited computing time. We have managed to optimize this stage by applying a local threshold only near the text zones (figure 6) that can be located by the cumulated gradients method with the multi-resolution and mathematical morphology [14]. After the binarization step, we detect the connected components (noted CCs) on the block card that are used then to guide the detection of foreground CCs [15]. The method used is inspired from Pavlidis’s studies on the line adjacency graph structure [16]. It consists in connecting the black pixels runs of two consecutive lines of a binary image. In the next sections of paper, we use the following notations: the sets 0... iNb indicate the lists of components in the CCs, lines and blocks card. (, , , ) <, < ddf f d f d fiii i i i i ix yxy x x y y are the bounding box coordinates of each component. The first pyramidal level is then formed by elements of set Cand the last level by elements of set B. Nc and Nb are the numbers of connected components at each level. 4.2. Physical structure extraction Let G be a non oriented graph at three hierarchical levels independent but coherent defined by the following relationship: 31 (, ) ( , ) kkkkSk GV E G V E such as: 11 i=1...Nc22 i=1...Nl33 i=1...NbV={ (,,,)} 1 V={ ( , , , ) } 2V={ (,,,)} 3ddf fiiiiiiddf fki iiiiiddf fiiiiiiv Descriptor c x y x y if kV v Descriptor l x y x y if kv Descriptor b x y x y if k
(1) Where Vk=1, 2 and 3 are the finite sets of graph Gnodes. These nodes are represented by the components descriptors at each k pyramidal level (see figure 4). The features are then selected in such manner that they can show up the difference between disjoint components. ELk>Sk is the finite set of edges represented by the pairs of adjacent nodes. Two nodes are then considered as adjacent if and only if their dissemblance d(vi,vj)(distance between their two descriptors) is strictly greater than a threshold S (the optimization mechanism of the thresholds Sk is detailed in [3]). This definition is given by the following relationship: 1 ( , ) (2) 0 kkkk ij kkSk i jif d sE, otherwise
The hierarchical coloring of graph G is used here to split-up the set of nodes at each level K into homogeneous subsets. It focuses on the dissemblance between the components (represented by nodes) of the same level in the data pyramid. The process uses a hybrid strategy of progression into the hierarchy of the graph: the colors of a level take part in the formation and the description of the next level nodes (see figure7). We can summarize the coloring stages of graph Gk=1,2 and 3 by the following algorithm [ 11 ]: Our method of physical layout segmentation is based, at the beginning, on the construction of G1which represents the components of the first pyramidal level (Formule1). This construction should be used by the raw blocks of the last pyramidal level to speed up the coloring process. When the graph G1 is colored by “Coloring ()” algorithm and the nodes of textual colors are preserved in 1textV , the graph 11 11 (, )textS GV E is colored. The obtained colors represent the text lines and constitute the graph
22 22 (, ) S GVE . A coloring of G2 is then applied in order to constitute 33 33 (, ) S GVE which represents the textual blocks. Consequently, a last coloring of G3 is necessary in order to merge the blocks which have the same alignment criteria and compose a last level of homogeneous corrected blocks. The following figure summarizes the different stages of segmentation: The evaluation of the performances of our approach has been achieved on a corpus of 10000 envelope images (considered as difficult and noisy). More than 95 % of address-blocks are correctly segmented by our method, in opposition to 60% by RLSA method and 30% by profile projection method. The analysis of these results shows that several errors of over or under segmentation introduced by RLSA and profile projection methods can be considerably reduced by using graph coloring based method (figure 9 and figure 10). These performances can be justified by the powerfulness of our method to efficiently extract or separate the textual components (characters, lines and blocks of text) and allows easy rejection of parasite components. 6. ConclusionWe presented a new physical layout segmentation method based on hierarchical graph coloring. This method uses both a hybrid segmentation strategy and a pyramidal architecture of data. We have seen that the hierarchical graph coloring reveals a high robustness of parasite components rejection. These parasites can be regarded as principal errors cause of traditional segmentation methods.Thanks to the restricted number of rules, this new technique is adapted to a large mail variety. Moreover, we could increase the coherences between the various segmentation stages and reduce the times computing. This work is granted by CESA Company (www.cesa.fr).7. References [1] Wang Ching-Huei, P.W. Palumbo, S.N. Srihari, Object recognition in visually complex environments: an architecture for locating address blocks on mail pieces, Pattern Recognition, 1988., 9th International Conference, IEEE, 1988, vol.1, Pages: 365 – 367. [2] C Viard-Gaudin, D Barba, A multi-resolution approach to extract the address block on flat mail pieces, ICASSP-91, International Conference, vol.4, Pages: 2701 - 2704 . [3] DJ. Gaceb, V. Eglin, F Lebourgeois, H. Emptoz. Address block localization based on graph theory. Document Recognition and Retrieval XIV, SPIE Int. Soc. Opt. Eng ed. San Jose (USA, Californie). p. 12. 2008. [4] Rémy Mullot, Livre: Les documents écrits de la numérisation à l’indexation par le contenu, Editeur : Hermes science Publication, Nov 2006, 365 pages, ISBN-10 : 2746211432.
- 上一篇:破碎机偏心速度英文文献和中文翻译
- 下一篇:模糊PID控制算法英文文献和中文翻译
-
-
-
-
-
-
-
浅析中国古代宗法制度
g-C3N4光催化剂的制备和光催化性能研究
巴金《激流三部曲》高觉新的悲剧命运
NFC协议物理层的软件实现+文献综述
现代简约美式风格在室内家装中的运用
高警觉工作人群的元情绪...
中国传统元素在游戏角色...
C++最短路径算法研究和程序设计
江苏省某高中学生体质现状的调查研究
上市公司股权结构对经营绩效的影响研究