EMPNet: An extract-map-predict neural network architecture for cross-domain recommendation

Abstract

Cross-domain recommendation leverages a user’s historical interactions in the auxiliary domain to suggest items within the target domain, particularly for cold-start users with no prior activity in the target domain. Existing cross-domain recommendation models often overlook key aspects such as the complexities of transferring user interests between domains and the biases inherent in user behavior patterns. In contrast, our Extract-Map-Predict Neural Network Architecture (EMPNet) employs a disentanglement approach to map fine-grained user interests and exploits the biases inherent in cross-domain recommendation. In feature extraction, we use Bidirectional Encoder Representations from Transformers (BERT) and an Identity-Enhanced Multi-Head Attention Mechanism to obtain the user and item feature vectors. In cross-domain user mapping, we disentangle the user feature vector into domain-shared and domain-specific interests for fine-grained cross-domain mapping to obtain the feature vector of cold-start users in the target domain. In rating prediction, we design a biased Attentional Factorization Machine (AFM) to utilize biases extracted from user and item features. We experimentally evaluate EMPNet on the Amazon dataset. The results show that it clearly outperforms the selected baselines.

Path-based approximate matching of fuzzy spatiotemporal RDF data

Abstract

As fuzzy spatiotemporal information continuously increases in RDF databases, modeling and querying fuzzy spatiotemporal RDF data efficiently and effectively becomes challenging. Although many studies have addressed temporal, spatial, and spatiotemporal RDF databases, querying fuzzy spatiotemporal RDF data has received relatively little attention, especially approximate matching of such data. To address this gap, we first study the fuzzy spatiotemporal RDF data graph, the spatiotemporal RDF query graph, and paths in the fuzzy spatiotemporal RDF data graph. Then, we propose a scoring function for approximately evaluating a fuzzy spatiotemporal RDF data graph against a spatiotemporal RDF query graph. After dividing fuzzy spatiotemporal RDF data graphs into five categories based on their structure, we propose decomposition, matching, and combination algorithms for approximate matching of fuzzy spatiotemporal RDF data. Our approach adopts path-based matching, which makes it easy to discover the relations between two vertices in a fuzzy spatiotemporal RDF data graph. Finally, the experimental results demonstrate the performance advantages of our approach.

PriMonitor: An adaptive tuning privacy-preserving approach for multimodal emotion detection

Abstract

The proliferation of edge computing and the Internet of Vehicles (IoV) has significantly bolstered the popularity of deep learning-based driver assistance applications. This has paved the way for the integration of multimodal emotion detection systems, which effectively enhance driving safety and are increasingly prevalent in our daily lives. However, the utilization of in-vehicle cameras and microphones has raised concerns regarding the extensive collection of driver privacy data. Applying privacy-preserving techniques to a single modality alone proves insufficient in preventing privacy re-identification when correlated with other modalities. In this paper, we introduce PriMonitor, an adaptive tuning privacy-preserving approach for multimodal emotion detection. PriMonitor tackles these challenges by proposing a generalized random response-based differential privacy method that not only enhances the speed and data availability of text privacy protection but also ensures privacy preservation across multiple modalities. To determine suitable weight assignments within a given privacy budget, we introduce pre-aggregator and iterative mechanisms. Our PriMonitor effectively mitigates privacy re-identification due to modal correlation while maintaining a high level of accuracy in multimodal models. Experimental results validate the efficiency and competitiveness of our approach.
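As a point of reference for the random response idea mentioned above, the following is a minimal sketch of textbook k-ary (generalized) randomized response. It is not PriMonitor's mechanism: the function names, the label domain, and the choice of epsilon are assumptions made for illustration, and the pre-aggregator and iterative weight assignment are not modeled.

```python
import math
import random
from collections import Counter


def grr_keep_probability(epsilon: float, k: int) -> float:
    """Probability of reporting the true value among k possible values."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def grr_perturb(value, domain, epsilon, rng):
    """Report `value` with probability p, otherwise a uniformly chosen other value."""
    p = grr_keep_probability(epsilon, len(domain))
    if rng.random() < p:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


def grr_estimate_counts(reports, domain, epsilon):
    """Unbiased estimate of the true counts from perturbed reports (needs epsilon > 0)."""
    k = len(domain)
    n = len(reports)
    p = grr_keep_probability(epsilon, k)
    q = (1 - p) / (k - 1)  # probability of reporting one specific wrong value
    observed = Counter(reports)
    return {v: (observed[v] - n * q) / (p - q) for v in domain}


if __name__ == "__main__":
    # Hypothetical label domain, used only for this example.
    domain = ["happy", "sad", "angry", "neutral"]
    rng = random.Random(0)
    truth = ["happy"] * 700 + ["sad"] * 300
    reports = [grr_perturb(v, domain, 1.0, rng) for v in truth]
    print(grr_estimate_counts(reports, domain, 1.0))
```

A smaller epsilon pushes the keep probability toward 1/k, which strengthens privacy but increases the variance of the estimated counts.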

Invariant representation learning to popularity distribution shift for recommendation

Abstract

Recommender systems often suffer from severe performance drops due to popularity distribution shift (PDS), which arises from inconsistencies in item popularity between training and test data. Most existing methods aimed at mitigating PDS focus on reducing popularity bias, but they usually require inaccessible information or rely on implausible assumptions. To address this problem, we propose a novel framework, Invariant Representation Learning (IRL), for PDS. Specifically, to simulate diverse popularity environments in which popular items and active users become even more popular and active, or conversely less so, we apply perturbations to the user-item interaction matrix by adjusting the weights of popular items and active users in the matrix, without any prior assumptions or specialized information. The distributions of item and user representations differ across these simulated popularity environments. We further utilize contrastive learning to minimize the dissimilarities among the representations of users and items under different simulated popularity environments, resulting in invariant representations that remain consistent across varying popularity distributions. Extensive experiments on three real-world datasets demonstrate that IRL outperforms state-of-the-art baselines in effectively alleviating PDS for recommendation.
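One way to picture the perturbation step is a reweighting of observed interactions by user activity and item popularity. The sketch below is a hypothetical illustration, not the perturbation used by IRL: the function name `perturb_weights`, the product-of-degrees form, and the exponent `alpha` are all assumptions. A positive `alpha` makes popular items and active users weigh more, a negative `alpha` does the opposite, and `alpha = 0` leaves every interaction at weight 1.

```python
from collections import Counter


def perturb_weights(interactions, alpha):
    """Weight each (user, item) pair by (user_degree * item_degree) ** alpha.

    `interactions` is a list of (user, item) pairs; the returned dict maps
    each pair to its weight in the simulated popularity environment.
    """
    user_degree = Counter(u for u, _ in interactions)
    item_degree = Counter(i for _, i in interactions)
    return {
        (u, i): (user_degree[u] * item_degree[i]) ** alpha
        for u, i in interactions
    }


if __name__ == "__main__":
    # Hypothetical toy data: item "i1" is popular, user "u1" is active.
    data = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"), ("u3", "i1")]
    for alpha in (-0.5, 0.0, 0.5):
        print(alpha, perturb_weights(data, alpha))
```

Each value of `alpha` yields one simulated environment; representations learned under different environments could then be aligned with a contrastive objective.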

Multi-stage dynamic disinformation detection with graph entropy guidance

Abstract

Online disinformation has become one of the most severe concerns in today’s world. Recognizing disinformation in a timely and effective manner is very hard because its propagation process is dynamic and complicated. The most recent research leverages uniform time intervals to study the multi-stage propagation features of disinformation. However, uniform time intervals are unrealistic in the real world because information propagation is not regular. In light of these facts, we propose a novel and effective framework, Multi-stage Dynamic Disinformation Detection with Graph Entropy Guidance (MsDD), to better analyze multi-stage propagation patterns. Instead of traditional snapshots, we analyze the dynamic propagation network via graph entropy, which can effectively find dynamic, variable-length stages. In this way, we can explicitly learn the changing pattern of propagation stages and support timely detection even at the early stages. Based on this effective multi-stage analysis framework, we further propose a novel dynamic analysis model that captures both structural and sequential evolving features. Extensive experiments on two real-world datasets demonstrate the superiority of our model. The datasets and source code are available at https://github.com/researchxr/MsDD.
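To illustrate how an entropy signal can yield variable-length stages instead of uniform time intervals, here is a minimal sketch. It assumes a degree-based Shannon entropy (one common definition of graph entropy) and a simple threshold rule; the abstract does not specify MsDD's actual entropy definition or segmentation rule, and `degree_entropy`, `split_stages`, and `threshold` are hypothetical names.

```python
import math


def degree_entropy(edges):
    """Shannon entropy (in bits) of the distribution d_i / 2m over node degrees."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    total = sum(degree.values())
    if total == 0:
        return 0.0
    return -sum((d / total) * math.log2(d / total) for d in degree.values())


def split_stages(timed_edges, threshold):
    """Cut a new stage when entropy drifts more than `threshold` from the stage start.

    `timed_edges` is a list of (timestamp, u, v) tuples describing how the
    propagation graph grows; the result is a list of variable-length stages.
    """
    stages, current, seen = [], [], []
    base = 0.0
    for t, u, v in sorted(timed_edges, key=lambda e: e[0]):
        seen.append((u, v))
        h = degree_entropy(seen)
        if not current:
            base = h  # entropy at the first edge of this stage
        elif abs(h - base) > threshold:
            stages.append(current)
            current = []
            base = h
        current.append((t, u, v))
    if current:
        stages.append(current)
    return stages


if __name__ == "__main__":
    # Hypothetical cascade: one source post "a" reshared by four accounts.
    cascade = [(1, "a", "b"), (2, "a", "c"), (5, "a", "d"), (9, "a", "e")]
    for stage in split_stages(cascade, threshold=0.4):
        print(stage)
```

Because the cut points depend on how much the structure has changed rather than on elapsed time, quiet periods are merged and bursts are separated.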

Adaptive retrofitting for industrial machines: utilizing WebAssembly and peer-to-peer connectivity on the edge

Abstract

Leveraging previously untapped data sources offers significant potential for value creation in the manufacturing sector. However, asset-heavy shop floors, extended machine replacement cycles, and equipment diversity necessitate considerable investments for achieving smart manufacturing, which can be particularly challenging for small businesses. Retrofitting presents a viable solution, enabling the integration of low-cost sensors and microcontrollers with older machines to collect and transmit data. In this paper, we introduce a concept and a prototype for retrofitting industrial environments using lightweight web technologies at the edge. Our approach employs WebAssembly as a novel bytecode standard, facilitating a consistent development environment from the cloud to the edge by operating on both browsers and bare-metal hardware. By attaining near-native performance and modularity reminiscent of container-based service architectures, we demonstrate the feasibility of our approach. Our prototype was evaluated with an actual industrial robot within a showcase factory, including measurements of data exchange with a cutting-edge data lake system. We further extended the prototype to incorporate a peer-to-peer network that facilitates message routing and WebAssembly software updates. Our technology establishes a foundational framework for the transition towards Industry 4.0. By integrating considerations of sustainability and human factors, it further extends this groundwork to facilitate progression into Industry 5.0.

Discrete cross-modal hashing with relaxation and label semantic guidance

Abstract

Supervised cross-modal hashing has attracted many researchers. Existing studies seek a common semantic space or directly regress the zero-one label information into the Hamming space. Although they have achieved considerable success, they neglect some issues: 1) some methods designed for classification tasks are not suitable for retrieval tasks, since they fail to learn the personalized features of samples; 2) the outcomes of hash retrieval depend on both the length and the encoding method of the hash codes. Because a sample possesses more personalized features than label semantics, in this paper we propose a novel supervised cross-modal hashing collaborative learning method called discrete Cross-modal Hashing with Relaxation and Label Semantic Guidance (CHRLSG). First, we introduce two relaxation variables as latent spaces. One is used to extract text features and label semantic information collaboratively, and the other is used to extract image features and label semantics collaboratively. Second, more accurate hash codes are generated from the latent spaces, since CHRLSG collaboratively learns feature semantics and label semantics, with labels playing the dominant role and features the auxiliary role. Third, we utilize labels to strengthen the similarity relationship of inter-modal samples by preserving pairwise closeness. Label semantics are fully exploited to avoid classification errors. Fourth, we introduce class weights to further increase the discrimination between intra-modal samples that belong to different classes while keeping the similarity of samples unchanged. Therefore, the CHRLSG model not only preserves the relationship between samples but also maintains the consistency of label semantics during collaborative optimization. Experimental results on three common benchmark datasets demonstrate that the proposed model is superior to existing advanced methods.
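For readers unfamiliar with retrieval in the Hamming space, the sketch below shows only the final lookup step: a query from one modality is ranked against stored codes from another by Hamming distance. It does not implement CHRLSG's code learning; the 8-bit codes, item names, and function names are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query_code, database, top_k=3):
    """Return the names of the `top_k` items closest to the query code.

    `database` maps item names to integer-encoded hash codes; ties are
    broken by name so the ranking is deterministic.
    """
    ranked = sorted(
        database.items(),
        key=lambda kv: (hamming_distance(query_code, kv[1]), kv[0]),
    )
    return [name for name, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Hypothetical 8-bit image codes and a text query code.
    image_codes = {
        "img_cat": 0b10110010,
        "img_dog": 0b10110111,
        "img_car": 0b01001101,
    }
    text_query = 0b10110011
    print(retrieve(text_query, image_codes, top_k=2))
```

Longer codes give finer-grained distances at the cost of storage, which is one reason retrieval quality depends on code length as well as on how the codes are learned.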

Rumor blocking with pertinence set in large graphs

Abstract

Online social networks facilitate the spread of information, but rumors can also propagate widely and quickly, which may mislead some users. Therefore, suppressing the spread of rumors has become a daunting task. One widely used approach is to select users in the social network to spread the truth and compete against the rumor, so that users who receive the truth before receiving the rumor will not trust or propagate it. However, existing works only aim to speed up rumor blocking without considering the pertinence of users. For example, consider a social media platform operator aiming to enhance user online safety. Based on users’ online behavior, the users who are at high risk should be alerted first. Motivated by this, we formally define the rumor blocking with pertinence set (RBP) problem, which aims to find a truth seed set that maximizes the number of nodes affected by the truth and ensures that the number of influenced nodes within the pertinence set reaches at least a given threshold. To solve this problem, we design a hybrid greedy framework (HGF) algorithm with local and global phases. We prove that HGF can provide a \((1-1/e-\epsilon)\)-approximate solution with high probability while reducing the cost of the sampling process. Extensive experiments on 8 real social networks demonstrate the efficiency and effectiveness of our proposed algorithms.
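The two-phase idea can be pictured with a deliberately simplified coverage model. The sketch below is an assumption-laden illustration, not HGF: it replaces competitive diffusion and sampling with fixed, precomputed reach sets, it guesses that the local phase targets the pertinence set and the global phase targets overall coverage, and it carries no approximation guarantee. All names (`hybrid_greedy`, `reach`, `budget`, `threshold`) are hypothetical.

```python
def hybrid_greedy(reach, pertinence, budget, threshold):
    """Pick up to `budget` seeds from `reach` (seed -> set of nodes it protects).

    Local phase: greedily cover pertinence nodes until `threshold` is met.
    Global phase: spend the remaining budget on overall marginal coverage.
    Returns the chosen seeds (in order) and the set of covered nodes.
    """
    seeds, covered = [], set()

    def best(gain):
        # Sorted candidates make tie-breaking deterministic.
        candidates = [s for s in sorted(reach) if s not in seeds]
        return max(candidates, key=gain, default=None)

    # Local phase: focus on the pertinence set.
    while len(seeds) < budget and len(covered & pertinence) < threshold:
        pick = best(lambda s: len((reach[s] - covered) & pertinence))
        if pick is None or not (reach[pick] - covered) & pertinence:
            break  # no candidate adds a new pertinence node
        seeds.append(pick)
        covered |= reach[pick]

    # Global phase: maximize total coverage with what is left.
    while len(seeds) < budget:
        pick = best(lambda s: len(reach[s] - covered))
        if pick is None or not reach[pick] - covered:
            break  # no candidate adds any new node
        seeds.append(pick)
        covered |= reach[pick]

    return seeds, covered


if __name__ == "__main__":
    # Hypothetical reach sets; nodes 5 and 6 are the high-risk users.
    reach = {"a": {1, 2, 3, 4}, "b": {4, 5}, "c": {6, 7}, "d": {1, 2}}
    print(hybrid_greedy(reach, pertinence={5, 6}, budget=3, threshold=2))
```

In this toy instance a purely global greedy would pick seed "a" first; the local phase instead guarantees that the high-risk nodes are reached before the remaining budget is spent on overall coverage.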

Efficient approximation and privacy preservation algorithms for real time online evolving data streams

Abstract

Because it involves processing continuous, large, unstructured streams of data, mining real-time streaming data is a more challenging research issue than mining static data. The privacy issue persists when sensitive data is included in streaming data. In recent years, there has been significant progress in research on the anonymization of static data. For the anonymization of quasi-identifiers, two typical strategies are generalization and suppression. However, the high dynamicity and potentially infinite nature of streaming data make anonymization a challenging task. To this end, we propose a novel Efficient Approximation and Privacy Preservation Algorithms (EAPPA) framework in this paper to achieve efficient pre-processing of live streaming data and its privacy preservation with minimum Information Loss (IL) and computational requirements. As the existing privacy preservation solutions for streaming data suffer from the challenges of redundant data, we first propose an efficient technique of data approximation with data pre-processing. We design the Flajolet-Martin (FM) algorithm for robust and efficient approximation of unique elements in the data stream with a data cleaning mechanism. We feed the periodically approximated and pre-processed streaming data to the anonymization algorithm. Using adaptive clustering, we propose innovative k-anonymization and l-diversity privacy principles for data streams. The proposed approach scans a stream to detect and reuse clusters that fulfill the k-anonymity and l-diversity criteria for reducing anonymization time and IL. The experimental results reveal the efficiency of the EAPPA framework compared to state-of-the-art methods.
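The Flajolet-Martin step can be illustrated with a minimal textbook-style sketch. This is not the EAPPA implementation: the use of SHA-256 as the hash family, the number of hash functions, and the function names are assumptions, and the data cleaning and anonymization stages are not modeled. The constant 0.77351 is the correction factor from the original Flajolet-Martin analysis.

```python
import hashlib


def trailing_zeros(n: int, width: int = 32) -> int:
    """Number of trailing zero bits in n (returns `width` when n is 0)."""
    if n == 0:
        return width
    return (n & -n).bit_length() - 1


def lowest_zero_bit(bitmap: int) -> int:
    """Index of the least significant bit that is not set in `bitmap`."""
    r = 0
    while bitmap & (1 << r):
        r += 1
    return r


def fm_estimate(stream, num_hashes: int = 16) -> float:
    """Estimate the number of distinct items in `stream`.

    Each hash function keeps a bitmap of observed trailing-zero counts;
    the lowest unset bit R is averaged over all bitmaps and the estimate
    is 2**R / 0.77351.
    """
    bitmaps = [0] * num_hashes
    for item in stream:
        for i in range(num_hashes):
            # Salting with the index i simulates independent hash functions.
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            h = int.from_bytes(digest[:4], "big")
            bitmaps[i] |= 1 << trailing_zeros(h)
    mean_r = sum(lowest_zero_bit(b) for b in bitmaps) / num_hashes
    return 2 ** mean_r / 0.77351


if __name__ == "__main__":
    # Hypothetical stream: 1000 distinct ids, each seen three times.
    stream = [f"user{i}" for i in range(1000)] * 3
    print(round(fm_estimate(stream)))
```

The sketch uses constant memory per hash function, and repeated items leave the bitmaps unchanged, which is why the estimate is insensitive to redundant records in the stream.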

A novel robust memetic algorithm for dynamic community structures detection in complex networks

Abstract

Networks in the real world are dynamic and evolving. The most critical process in networks is to determine the structure of the community, based on which we can detect hidden communities in a complex network. The design of robust network structures is of great importance, meaning that a system must maintain its function in the face of attacks and failures and have a strong community structure. In this paper, we propose a robust memetic algorithm, called RDMA_NET (Robust Dynamic Memetic Algorithm), and use it to optimize the detection of dynamic communities in complex networks. In this method, we work on dynamic data, which affects two main parts: the initial population values and the calculation of the evaluation function of each population. There is no need to determine the number of communities in advance. We evaluate the method on two sets of real-world networks and the LFR dataset. The results show that our proposed method, RDMA_NET, can find a better solution than modern approaches and provide near-optimal performance in search of network topologies with a strong community structure.