Deep learning in histopathology: A review

In this advanced review, we address the challenges of analyzing scanned tissue slides to diagnose cancer using deep learning.


Abstract

Histopathology is diagnosis based on visual examination of tissue sections under a microscope. With the growing number of digitally scanned tissue slide images, computer-based segmentation and classification of these images is a high-demand area of research. Convolutional neural networks (CNNs) constitute the most popular classification architecture for a variety of image classification problems. However, applying CNNs to histology slides is not trivial and poses several challenges, ranging from variations in slide color to excessively high resolution and a lack of proper labeling. In this advanced review, we introduce the application of CNN-based architectures to digital histological image analysis, discuss some problems associated with such analysis, and look at possible solutions.

This article is categorized under: Application Areas > Health Care; Fundamental Concepts of Data and Knowledge > Big Data Mining; Technologies > Machine Learning

Themes in data mining, big data, and crime analytics

Fast-paced developments in crime analytics presently draw from big data and analytics. While this offers great potential for a better society, there are many factors that need serious consideration and regulation.


Abstract

This article examines the impact of new AI-related technologies in data mining and big data on important research questions in crime analytics. Because the field is so broad, the review focuses on a selection of the most important topics. Challenges for information management, and in turn law and society, include: AI-powered predictive policing; big data for legal and adversarial decisions; bias in the use of big data and analytics for profiling and predicting criminality; forecasting crime risk and crime rates; and regulating AI systems.

This article is categorized under: Algorithmic Development > Spatial and Temporal Data Mining; Fundamental Concepts of Data and Knowledge > Big Data Mining; Technologies > Artificial Intelligence; Application Areas > Data Mining Software Tools

Predicting home sale prices: A review of existing methods and illustration of data stream methods for improved performance

The framework of predicting home sale prices.


Abstract

Accurate and unbiased assessment of residential real property has always been important, not only to financial institutions that lend on or hold such assets but also to municipalities that rely on property taxes as a critical source of revenue. The common methodology for predicting residential property sale prices is based on traditional multiple regression, in spite of its known issues. Machine learning methods have been proposed as an alternative approach, but the results are far from satisfactory. A review of existing studies and relevant issues can help researchers better assess the pros and cons of the approaches in this important stream of research and move the field forward. This article provides such a review. In our review, we noticed that the regression-based methods and the machine learning methods share the use of batch-mode learning. Thus, in addition to reviewing recent research on batch-based residential property prediction models, this article also explores a new approach to constructing residential property price prediction models by treating past sale records as an evolving data stream. The results of our study show that the data stream approach outperforms the traditional regression method and demonstrate the potential of data stream methods for improving prediction models for residential property prices.

This article is categorized under: Application Areas > Business and Industry; Technologies > Machine Learning; Technologies > Prediction
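
The contrast between batch-mode learning and a data stream approach can be illustrated with a minimal sketch. The abstract does not say which stream learner the authors used, so the example below swaps in recursive least squares with a forgetting factor as a stand-in, evaluated in a test-then-train (prequential) loop. The class name `RecursiveLeastSquares`, the helper `prequential_errors`, and the parameter values are assumptions made for illustration only.

```python
import numpy as np


class RecursiveLeastSquares:
    """Online linear regression; a stand-in stream learner, not the article's method."""

    def __init__(self, n_features, forgetting=0.99, init_scale=1e3):
        self.w = np.zeros(n_features)              # current coefficient estimate
        self.P = np.eye(n_features) * init_scale   # scaled inverse covariance estimate
        self.forgetting = forgetting               # values < 1 down-weight older sales

    def predict(self, x):
        return float(self.w @ x)

    def update(self, x, y):
        # Standard recursive least squares update for one new record (x, y).
        Px = self.P @ x
        gain = Px / (self.forgetting + x @ Px)
        self.w = self.w + gain * (y - self.w @ x)
        self.P = (self.P - np.outer(gain, Px)) / self.forgetting


def prequential_errors(model, stream):
    """Test-then-train: predict each record first, then learn from it."""
    errors = []
    for x, y in stream:
        errors.append(abs(y - model.predict(x)))
        model.update(x, y)
    return errors


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "sale records": two features plus an intercept column.
    X = np.column_stack([rng.normal(size=(300, 2)), np.ones(300)])
    y = X @ np.array([3.0, -1.0, 10.0])
    model = RecursiveLeastSquares(n_features=3, forgetting=1.0)
    errs = prequential_errors(model, zip(X, y))
    print("last absolute error:", errs[-1])
```

Unlike a batch model that is refit on the whole history, the model here is updated one record at a time, and a forgetting factor below 1 lets it track drifting prices.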

Multivariate temporal data analysis ‐ a review

The growing number of monitoring devices increases the availability of multivariate longitudinal data, which provides significant opportunities for understanding how things evolve in various domains. Nevertheless, with these opportunities come challenges.


Abstract

With the information technology revolution, and especially the adoption of the Internet of Things, longitudinal data in many domains have become more available and accessible for secondary analysis. Such data provide meaningful opportunities to understand processes in many domains over time, but also pose challenges. A main challenge is the heterogeneity of the temporal variables, which stems from the different types of data (measurements or events) and the different types of sampling (fixed or irregular). Some variables are events, which may or may not have a duration. In this review, we discuss the various types of temporal data and the relevant analysis methods. We start with fixed-frequency variables and the associated forecasting and time series methods, proceed to sequential data and sequential pattern mining, and then turn to time interval mining for events of varying duration. The use of various deep learning-based architectures for temporal data is also discussed. We then present the challenge of heterogeneous multivariate temporal data analysis and discuss various options for dealing with it, focusing on the increasingly used option of transforming the data into symbolic time intervals through temporal abstraction and using time-interval-related pattern discovery for temporal knowledge discovery, clustering, classification, prediction, and more. Finally, we give an overview of the field and of areas in which more studies and contributions are needed.

This article is categorized under: Algorithmic Development > Spatial and Temporal Data Mining
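
The transformation of raw measurements into symbolic time intervals mentioned in the abstract can be sketched as follows. This is a minimal illustration of state-based temporal abstraction under simplifying assumptions: the cutoffs are given in advance, consecutive samples with the same symbol are merged regardless of the gap between them, and the function name `to_symbolic_intervals` and the temperature example are hypothetical rather than taken from the article.

```python
from bisect import bisect_right


def to_symbolic_intervals(times, values, cutoffs, labels):
    """Map each value to a state label and merge consecutive equal states.

    cutoffs must be sorted; labels needs len(cutoffs) + 1 entries.
    Returns a list of (label, start_time, end_time) tuples.
    """
    intervals = []
    for t, v in zip(times, values):
        symbol = labels[bisect_right(cutoffs, v)]
        if intervals and intervals[-1][0] == symbol:
            # Same state as the previous sample: extend the open interval.
            label, start, _ = intervals[-1]
            intervals[-1] = (label, start, t)
        else:
            # State changed: open a new interval at this time point.
            intervals.append((symbol, t, t))
    return intervals


if __name__ == "__main__":
    times = [0, 1, 2, 3, 4, 5]
    temperature = [36.5, 36.8, 38.2, 38.9, 37.0, 36.6]
    print(to_symbolic_intervals(times, temperature, [37.5], ["normal", "high"]))
    # prints [('normal', 0, 1), ('high', 2, 3), ('normal', 4, 5)]
```

Once every variable, whether regularly or irregularly sampled, is expressed as such intervals, heterogeneous variables share one representation over which time-interval patterns can be mined.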

Detecting communities using social network analysis in online learning environments: Systematic literature review

Most applicable SNA measures


Abstract

Uncovering community structure has significantly advanced the explanation, analysis, and forecasting of the behaviors and dynamics of networks in fields such as sociology, criminology, biology, medicine, communication, economics, and academia. Detecting and clustering communities is a powerful step toward identifying the structural properties and behavioral patterns of social networks. Recently, online learning has been progressively adopted in many educational practices, which raises many questions about assessing learners' engagement, collaboration, and behaviors in the newly emerging learning communities. This systematic literature review aims to assess the use of community detection techniques in analyzing network structure in online learning environments. It provides a comprehensive overview of the existing research that adopted those techniques, identifies the educational objectives behind their application, and suggests possible future research directions. Our analysis covered 65 studies found in the literature that applied different community discovery techniques to various types of online learning environments to analyze their users' interaction patterns. Our review revealed the potential of this field for improving educational practices and decisions and for utilizing the massive amount of data generated from interactions with those environments. Finally, we highlighted the need to include automated community discovery techniques in online learning environments to facilitate and enhance their use, and we stressed the need for further advanced research to uncover hidden opportunities.

This article is categorized under: Algorithmic Development > Statistics; Algorithmic Development > Web Mining; Application Areas > Education and Learning

Overview of accurate coresets

Accurate coreset illustration example for the problem of computing an optimal fitting line for an input set of points.


Abstract

A coreset of an input set is a small summarization of it, such that solving a problem on the coreset provably yields the same result as solving the same problem on the original (full) set, for a given family of problems (models/classifiers/loss functions). Coresets have been suggested for many fundamental problems, for example, in machine/deep learning, computer vision, databases, and theoretical computer science. This introductory paper was written following requests regarding the many inconsistent coreset definitions, the lack of source code, the deep theoretical background required from different fields, and the dense papers that make it hard for beginners to apply and develop coresets. The article provides folklore, classic, and simple results, including step-by-step proofs and figures, for the simplest (accurate) coresets. Nevertheless, we did not find most of their constructions in the literature. Moreover, we expect that putting them together in a retrospective context will help the reader grasp current results, which usually generalize these fundamental observations. Experts might appreciate the unified notation and the comparison table for existing results. Open source code is provided for all presented algorithms, to demonstrate their usage and to support readers who are more familiar with programming than with mathematics.

This article is categorized under: Algorithmic Development > Structure Discovery; Fundamental Concepts of Data and Knowledge > Big Data Mining; Technologies > Machine Learning
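
The idea of an accurate coreset for line fitting can be made concrete with a short sketch. It is not taken from the article's source code; it assumes the well-known fact that for the matrix A = [X | y], the triangular factor R of a QR decomposition satisfies ||A v|| = ||R v|| for every vector v, so the small matrix R gives exactly the same sum of squared errors as the full data for any candidate line. The function names `accurate_coreset`, `squared_error`, and `fit_line` are hypothetical.

```python
import numpy as np


def accurate_coreset(X, y):
    """Compress n rows of [X | y] into a (d+1) x (d+1) matrix (assumes n >= d+1)."""
    A = np.column_stack([X, y])
    return np.linalg.qr(A, mode="r")  # only the triangular factor R is needed


def squared_error(M, w):
    """Sum of squared residuals of the model w, computed from M = [X | y] or from R."""
    v = np.append(w, -1.0)            # M @ v equals X @ w - y when M is the full data
    return float(np.sum((M @ v) ** 2))


def fit_line(M):
    """Least-squares coefficients computed from M = [X | y] or from its coreset."""
    return np.linalg.lstsq(M[:, :-1], M[:, -1], rcond=None)[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=10_000)
    y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)
    X = np.column_stack([x, np.ones_like(x)])   # slope and intercept columns
    full = np.column_stack([X, y])
    R = accurate_coreset(X, y)
    print("coreset shape:", R.shape)            # 3 x 3 instead of 10000 x 3
    print("same fit:", np.allclose(fit_line(full), fit_line(R)))
```

The summary here is a small matrix rather than a weighted subset of the input points; it serves only to illustrate the defining property that the cost of every query is preserved exactly, up to floating-point rounding.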

Mining text from natural scene and video images: A survey

Text mining at a glance.


Abstract

In computer terminology, mining refers to extracting meaningful information or knowledge from a large amount of data/information using computers. Meaningful information can be extracted from normal text and from images obtained from different sources, such as natural scene images, video, and documents, by deriving semantics from the text and content of the images. Although there are many pieces of work on text/data mining, and several survey/review papers have been published in the literature, to the best of our knowledge there is no survey paper on mining textual information from natural scene, video, and document images that considers word spotting techniques. In this article, we therefore provide a comprehensive review of both non-spotting and spotting-based mining techniques. The mining approaches are categorized as feature-based, learning-based, and hybrid methods in order to analyze the strengths and limitations of the models in each category. In addition, the article discusses the usefulness of the methods in different situations and applications. Furthermore, based on the review of different mining approaches, this article identifies the limitations of the existing methods and suggests new applications and future directions for continuing the research. We believe such a review article will help researchers quickly become familiar with the state of the art and the progress made toward mining textual information from natural scene and video images.

This article is categorized under: Algorithmic Development > Text Mining

Critical insights into modern hyperspectral image applications through deep learning

Emerging Applications of Hyperspectral Imaging.


Abstract

Hyperspectral imaging has shown tremendous growth over the past three decades. It evolved through remote sensing and, with technological enhancements, has outgrown that field and spread into various other application areas. In addition, data cubes rich in spectral and spatial information are an advantage when capturing, analyzing, reviewing, and interpreting results from data. This review concentrates on emerging application areas of hyperspectral imaging. The areas were selected because they offer vast scope for future enhancement by exploiting cutting-edge technology, that is, deep learning. The review focuses on applications of hyperspectral imaging techniques in selected areas: remote sensing, document forgery, history and archaeology conservation, surveillance and security, machine vision for fruit quality inspection, and medical imaging. It is organized around the publicly available datasets and the features used in each domain. This review can act as a baseline for deep learning and machine vision experts, historical geographers, and scholars by providing them with a view of how hyperspectral imaging is implemented in multiple domains, along with future research prospects.

This article is categorized under: Technologies > Machine Learning; Technologies > Prediction

Multimodal sentimental analysis for social media applications: A comprehensive review

Block diagram of multimodal sentiment analysis.


Abstract

The analysis of sentiments is essential in identifying and classifying opinions regarding a source material, such as a product or service. It finds a variety of applications, such as product reviews, opinion polls, movie reviews on YouTube, news video analysis, and health care applications including stress and depression analysis. The traditional approach to sentiment analysis is based on text and involves collecting large amounts of textual data and applying different algorithms to extract the sentiment information from it. Multimodal sentimental analysis, in contrast, provides methods to carry out opinion analysis based on a combination of video, audio, and text, which goes well beyond conventional text-based sentimental analysis in understanding human behaviors. The remarkable increase in the use of social media provides a large collection of multimodal data that reflects users' sentiments on certain aspects. This multimodal sentimental analysis approach helps in classifying the polarity (positive, negative, or neutral) of individual sentiments. Our work aims to present a survey of recent developments in analyzing multimodal sentiments (involving text, audio, and video/image) that involve human–machine interaction, and of the challenges involved in analyzing them. A detailed survey of sentiment datasets, feature extraction algorithms, data fusion methods, and the efficiency of different classification techniques is presented in this work.

This article is categorized under: Commercial, Legal, and Ethical Issues > Social Considerations

Text‐based question answering from information retrieval and deep neural network perspectives: A survey

Text-based question answering (QA) has been widely studied in information retrieval (IR) communities. With the advent of deep learning (DL) techniques, various DL-based methods have been used for this task, but they have not been well studied or compared. In this paper, we provide a comprehensive overview of different models proposed for QA, covering both the traditional IR perspective and the more recent DL perspective.


Abstract

Text-based question answering (QA) is a challenging task that aims at finding short, concrete answers to users' questions. This line of research has been widely studied with information retrieval (IR) techniques and has received increasing attention in recent years through deep neural network approaches. Deep learning (DL) approaches, which are the main focus of this paper, provide a powerful technique for learning multiple layers of representations and the interaction between questions and answer sentences. In this paper, we provide a comprehensive overview of different models proposed for the QA task, covering both the traditional IR perspective and the more recent deep neural network perspective. We also introduce well-known datasets for the task and present available results from the literature to allow a comparison between different techniques.

This article is categorized under: Algorithmic Development > Text Mining; Technologies > Machine Learning