Abstract
The result of a document image segmentation task, such as text line or word segmentation, is usually a labeled image in which each label corresponds to a different segmented region. Many applications need these regions to be stored and represented efficiently, using simple geometric shapes. A challenging task is to enclose all pixels of a specific label inside a polygon with a minimum number of vertices. Such a polygon promotes simplicity of description and storage efficiency, and it provides a more user-friendly representation that can be edited easily. The proposed method is a cost-effective approximation to the minimum-edge polygon problem: it computes a contour enclosing only the pixels of a given label and then uses a greedy algorithm to reduce that contour to a minimum-link polygon that preserves the separability between the differently labeled sets of pixels.
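To make the idea of greedy contour reduction under a separability constraint concrete, here is a minimal sketch. It is not the method proposed in this paper; it is a generic greedy vertex-removal heuristic written under several assumptions: pixels are represented by their centers, the contour runs along pixel corners, a point lying exactly on a polygon edge counts as a violation, and no check for self-intersecting polygons is performed. The names `locate`, `separates`, and `greedy_simplify` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def on_segment(p: Point, a: Point, b: Point, eps: float = 1e-9) -> bool:
    """True if p lies on the closed segment a-b (within tolerance eps)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) > eps:
        return False
    dot = (p[0] - a[0]) * (p[0] - b[0]) + (p[1] - a[1]) * (p[1] - b[1])
    return dot <= eps


def locate(p: Point, poly: Sequence[Point]) -> int:
    """Return 1 if p is strictly inside poly, 0 if on its boundary, -1 if outside.

    Uses even-odd ray casting with a ray pointing in the +x direction.
    """
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        a, b = poly[i], poly[(i + 1) % n]
        if on_segment(p, a, b):
            return 0
        if (a[1] > y) != (b[1] > y):
            x_cross = a[0] + (y - a[1]) * (b[0] - a[0]) / (b[1] - a[1])
            if x < x_cross:
                inside = not inside
    return 1 if inside else -1


def separates(poly: Sequence[Point], own: Sequence[Point],
              foreign: Sequence[Point]) -> bool:
    """Assumed separability test: own centers strictly inside, others strictly outside."""
    return (all(locate(p, poly) == 1 for p in own)
            and all(locate(q, poly) == -1 for q in foreign))


def greedy_simplify(contour: Sequence[Point], own: Sequence[Point],
                    foreign: Sequence[Point]) -> List[Point]:
    """Drop vertices one at a time while the polygon still separates the labels."""
    poly = list(contour)
    changed = True
    while changed and len(poly) > 3:
        changed = False
        i = 0
        while i < len(poly) and len(poly) > 3:
            candidate = poly[:i] + poly[i + 1:]
            if separates(candidate, own, foreign):
                poly = candidate      # removal keeps separability: accept it
                changed = True
            else:
                i += 1                # vertex is needed: move to the next one
    return poly


# Toy input: a 2x2 block of pixels with one label, surrounded by other labels.
own = [(0, 0), (1, 0), (0, 1), (1, 1)]
foreign = [(2, 0), (2, 1), (0, 2), (1, 2), (-1, 0), (-1, 1), (0, -1), (1, -1)]
# Pixel-corner contour with a redundant vertex in the middle of each side.
contour = [(-0.5, -0.5), (0.5, -0.5), (1.5, -0.5), (1.5, 0.5),
           (1.5, 1.5), (0.5, 1.5), (-0.5, 1.5), (-0.5, 0.5)]

simplified = greedy_simplify(contour, own, foreign)
print(simplified)
```

On this toy input the eight-vertex contour reduces to the four corners of the enclosing square; no corner can be dropped because the resulting diagonal edge would pass through or exclude a pixel center of the enclosed label. Re-checking the whole polygon after every candidate removal is simple but slow; it is used here only to keep the sketch short.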