Big Data

What is it and why does it matter?

Advanced methods to extract value from large and complex data sets

BigData@BTH

Which areas are we focusing on?
Big data analytics for decision support
Big data analytics for image processing
Core technologies
Foundations and enabling technologies

Read more about what we do in our subprojects

News

Doctoral thesis on detecting financial fraud successfully defended by Edgar Lopez-Rojas

October 28th, 2016

Today Edgar Alonso Lopez-Rojas publicly defended his doctoral thesis at the Department of Computer Science and Engineering, Blekinge Institute of Technology. The thesis “Applying Simulation to the Problem of Detecting Financial Fraud” introduces a financial [...]

All news items

Partner companies

Partner testimonials

Ericsson has worked together with BTH in several successful projects in the past. Since the subject of big data is of great importance to us, it was obvious that we should expand our ongoing collaboration with BTH to include this profile.
Martin Wallin, Ericsson
Together with the expertise of BTH we will test novel ideas, as well as develop prototype image processing and image analysis software. Our goal is to keep our position as the largest provider of digital Swedish Church records and other historical records online and to provide the most powerful yet easiest to use software to navigate all the documents.
Jonas Tehler, Arkiv Digital
We have had good experiences with research collaboration in previous projects. It has led to product improvements and valuable knowledge for the company. We expect that the close collaboration will continue, with BTH researchers and our developers actively working together and sharing information as well as results.
Stefan Bernbo, Compuverde
Together with BTH we are working on different image processing and analysis algorithms. The work has already been very fruitful, with both joint publications and patent applications.

“Data are becoming the new raw material of business.” – Craig Mundie, Senior Advisor to the CEO at Microsoft.

Recent publications

A. Cheddad, “Structure Preserving Binary Image Morphing using Delaunay triangulation,” Pattern Recognition Letters, 2016. (accepted for publication, to appear)

Abstract
Mathematical morphology has been of great significance to several scientific fields. Dilation, one of its fundamental operations, has relied heavily on common set-theoretic methods that use structuring elements of specific shapes to morph binary blobs. We hypothesised that performing morphological dilation while exploiting the geometric relationships between dot patterns would offer some advantages. We chose the Delaunay triangulation to examine the feasibility of this hypothesis because of its favourable geometric properties. We compared our proposed algorithm to existing methods, and it became apparent that Delaunay-based dilation has the potential to emerge as a powerful tool for preserving object structure and elucidating the influence of noise. Additionally, the proposed method no longer requires a structuring element to be defined, and the dilation adapts to the topology of the dot patterns. We assessed the property of object structure preservation using common measurement metrics. We also demonstrated this property through handwritten digit classification, using HOG descriptors extracted from images dilated by the different approaches and used to train Support Vector Machines. The confusion matrix shows that our algorithm has the best accuracy estimate in 80% of the cases. In both experiments, our approach shows consistently improved performance over the other methods, which supports the suitability of the proposed method.
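For readers unfamiliar with the baseline this abstract contrasts against, the following is a minimal sketch of classical binary dilation with an explicit structuring element. It is not the Delaunay-based method of the paper, whose details are not given here. The function name `dilate`, the `CROSS` element and the sample dot pattern are illustrative assumptions.

```python
def dilate(image, structuring_element):
    """Classical binary dilation of a 2D grid of 0/1 values.

    structuring_element is a list of (row_offset, col_offset) pairs
    relative to its origin. Every foreground pixel stamps a copy of
    the element onto the output; offsets falling outside the grid
    are dropped.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c]:
                for dr, dc in structuring_element:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1
    return out


# Hypothetical structuring element: the origin plus its 4 neighbours.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

# Hypothetical dot pattern with two isolated foreground pixels.
dots = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

dilated = dilate(dots, CROSS)
for row in dilated:
    print("".join("#" if v else "." for v in row))
```

Each dot grows into the shape of the chosen element regardless of where its neighbours lie. This dependence on a fixed, user-chosen shape is what the abstract says the proposed method avoids.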

Read more at http://www.sciencedirect.com/science/article/pii/S016786551630335X

H. Kusetogullari, “Unsupervised Text Binarization in Handwritten Historical Documents using k-means Clustering”, in Proc. IEEE International Science and Information Conference on Intelligent Systems, London, Sep. 21-22, 2016.

In this paper, we propose a novel technique for unsupervised text binarization in handwritten historical documents using k-means clustering. Text binarization poses many challenges, such as noise, faint characters and bleed-through, which must be overcome to increase the correct detection rate. To address them, a preprocessing strategy is first used to enhance the contrast and improve faint characters, and a Gaussian Mixture Model (GMM) is used to ignore the noise and other artifacts in the handwritten historical documents. The enhanced image is then normalized for use in the postprocessing part of the proposed method. The binarized handwritten image is obtained by partitioning the normalized pixel values into two clusters using k-means clustering with k = 2, assigning each normalized pixel to one of the two clusters by the minimum Euclidean distance between the pixel's normalized intensity and the mean normalized pixel value of each cluster. Experimental results verify the effectiveness of the proposed approach.
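The final clustering step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it omits the contrast enhancement and GMM stages, and it assumes the pixels are already normalized to the range 0.0 to 1.0 and flattened into a list. The function name `kmeans_binarize`, the initialisation at the darkest and brightest values, and the choice to treat the darker cluster as text are all assumptions.

```python
def kmeans_binarize(pixels, iterations=20):
    """Binarize normalized intensities with 1-D k-means, k = 2.

    Returns (text_mask, means). text_mask[i] is 1 when pixel i falls
    in the darker cluster (assumed here to be ink), else 0.
    """
    # Assumption: start the two means at the darkest and brightest values.
    means = [min(pixels), max(pixels)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: nearest cluster mean. In one dimension the
        # Euclidean distance is just the absolute difference.
        labels = [
            0 if abs(p - means[0]) <= abs(p - means[1]) else 1
            for p in pixels
        ]
        # Update step: recompute each mean from its members; an empty
        # cluster keeps its previous mean.
        new_means = []
        for k in (0, 1):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            new_means.append(sum(members) / len(members) if members else means[k])
        if new_means == means:
            break  # converged
        means = new_means
    # Cluster 0 started at the minimum, so it remains the darker cluster.
    text_mask = [1 if lab == 0 else 0 for lab in labels]
    return text_mask, means


# Hypothetical normalized intensities: three dark (ink) and four bright (paper).
sample = [0.05, 0.10, 0.15, 0.80, 0.85, 0.90, 0.95]
mask, cluster_means = kmeans_binarize(sample)
print(mask)
```

For a two-dimensional image, the same function can be applied to the flattened pixel list and the mask reshaped afterwards.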

Read more at http://www2.bth.se/bloggar/bigdata/files/2016/11/Unsupervised-text-binarization-in-handwritten-historical-documents.pdf

All publications