For web content analysis, the information contained in image files has been identified as a valuable source that is currently not used to its full extent. To categorize images for this purpose, Arcada researchers have developed a computer vision prototype using deep learning. A journal paper describing the methodology and preliminary results will have been submitted by the end of December 2017, and more detailed results will be ready for submission soon.
F-Secure has begun to explore how the new models can optimally be integrated into existing implementations of content filtering services, in collaboration with Arcada.
In connection with the image classification task, Arcada has also focused on developing new strategies, based on unsupervised learning and active learning, for dealing with noisy labels.
Arcada researchers have continued to study how sparse random projections can be used for large-scale learning tasks. The resulting paper presents results using data related to the Android malware detection use case (Scenario 1), but the methods can also be applied to the other scenarios and to a wider range of problems involving large unstructured data. Related work has been presented at the ELM 2017 conference (where Kaj-Mikael Björk acted as international liaison) to develop international collaboration around the subject.
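To illustrate the general idea of a sparse random projection, the following is a minimal sketch and not the implementation used in the papers listed below. It assumes an Achlioptas-style projection matrix with entries +1, 0, -1, and a sparse input given as a dictionary of feature indices. The function names, the sparsity parameter `s`, and the target dimension `k` are hypothetical choices made for this example.

```python
import math
import random


def sparse_projection_row(feature_index, k, s=3, seed=0):
    """Return the non-zero entries of the projection row for one input feature.

    Each of the k entries is +1 or -1 with probability 1/(2s) each, and 0
    otherwise. The row is regenerated on demand from (seed, feature_index),
    so the full matrix is never stored (an assumption of this sketch).
    """
    rng = random.Random(seed * 1_000_003 + feature_index)
    row = {}
    for j in range(k):
        u = rng.random()
        if u < 1.0 / (2 * s):
            row[j] = 1.0
        elif u < 1.0 / s:
            row[j] = -1.0
    return row


def project(sparse_x, k, s=3, seed=0):
    """Project a sparse vector {feature_index: value} to a dense list of length k."""
    scale = math.sqrt(s / k)  # keeps the expected squared norm unchanged
    z = [0.0] * k
    for i, v in sparse_x.items():
        for j, sign in sparse_projection_row(i, k, s, seed).items():
            z[j] += scale * sign * v
    return z


# Usage: three active binary features out of a very large feature space.
x = {12: 1.0, 4_000_000: 1.0, 987_654_321: 1.0}
z = project(x, k=256)
```

Because each row is derived from the feature index, the cost of projecting a sample depends only on its number of non-zero features and on `k`, not on the size of the full feature space. This is the property that makes this kind of representation attractive for very high-dimensional, unstructured data.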
Contributions published / to appear:
- Luiza Sayfullina, Emil Eirola, Dmitry Komashinsky, Paolo Palumbo, Juha Karhunen, "Android Malware Detection: Building Useful Representations", IEEE International Conference on Machine Learning and Applications (IEEE ICMLA'16), Anaheim, USA, 2016.
- Paolo Palumbo, Luiza Sayfullina, Dmitriy Komashinskiy, Emil Eirola, Juha Karhunen, "A pragmatic android malware detection procedure". Computers & Security, vol. 70, pp. 689-701, 2017.
- Monshizadeh, Mehrnoosh; Khatri, Vikramajeet; and Kantola, Raimo. “Detection as a service: An SDN application”. In 19th International Conference on Advanced Communication Technology (ICACT), Bongpyeong, pp. 285-290, 2017.
- Monshizadeh, Mehrnoosh; Khatri, Vikramajeet; Kantola, Raimo; and Yan, Zheng. “An Orchestrated Security Platform for Internet of Robots”. In Proceedings of the 12th International Conference on Green, Pervasive, and Cloud Computing (GPC), Italy, pp. 298-312, 2017.
- Monshizadeh, Mehrnoosh; and Khatri, Vikramajeet. “Mobile Virtual Network Operators (MVNO) Security”. In A Comprehensive Guide to 5G Security, Wiley Publishers, pp. 323-346, 2017.
- Monshizadeh, Mehrnoosh; and Khatri, Vikramajeet. “IoT Security”. In A Comprehensive Guide to 5G Security, Wiley Publishers, pp. 247-266, 2017.
- Monshizadeh, Mehrnoosh; Khatri, Vikramajeet; and Kantola, Raimo. “An adaptive detection and prevention architecture for unsafe traffic in SDN enabled mobile networks”. In IFIP/IEEE Symposium on Integrated Network and Service Management (IM), Lisbon, pp. 883-884, 2017.
- Anton Akusok, Emil Eirola, Yoan Miche, Ian Oliver, Kaj-Mikael Björk, Andrey Gritsenko, Stephen Baek, Amaury Lendasse, "Incremental ELMVIS for unsupervised learning", International Conference on Extreme Learning Machines (ELM2016), Singapore, 2016.
- Anton Akusok, Emil Eirola, Yoan Miche, Andrey Gritsenko, and Amaury Lendasse, “Advanced Query Strategies for Active Learning with Extreme Learning Machine”. In European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2017.
- Anton Akusok, Emil Eirola, “Comparison of Classification Methods for Very High-Dimensional Data in Sparse Random Projection Representation”. Submitted to Neurocomputing – Special Issue on Advances in Data Representation and Learning for Pattern Analysis, under review.
- Anton Akusok, Emil Eirola, Kaj-Mikael Björk, Amaury Lendasse, "Extreme Learning Tree". International Conference on Extreme Learning Machines (ELM 2017), Yantai, China, 2017.
- Leonardo Espinosa Leal, Kaj-Mikael Björk, Amaury Lendasse, Anton Akusok, "A Web Page Classifier Library Based on Random Image Content Analysis Using Deep Learning". Submitted to PETRA'18.
- Anton Akusok, Leonardo Espinosa Leal. “Full Page Web Content Analysis”. To appear.