The process unfolds in five core stages: data input, graph construction, feature processing, core computation, and output prediction, emphasizing the key roles of heterogeneous graph characteristics, meta-path index construction, and node-level attention mechanisms. The following is a detailed process description:

Model Overall Process Overview

The THAN model, based on a heterogeneous graph (containing users, cascade nodes, and various relationships), captures semantic associations between nodes through meta-path indexing, combines a time decay mechanism and multi-head attention to compute node embeddings, and is ultimately used for single-step prediction of cascade propagation. The process can be divided into five core stages, as follows:

1. Data Input and Initialization

Input Data:
  Heterogeneous graph data (graph): Includes nodes (user, cascade), edges (social, interaction, diffusion), and edge attributes (such as timestamp).
  Initial features: User features (user_initial_features) and cascade features (cascade_initial_features), stored in dictionary form (id_to_idx maps node ID to feature index).
  Configuration parameters: Meta-path types (e.g., U-U-social, U-U-interact, C-U-C), number of attention heads, time decay coefficient lambda_time, etc.

Initialization Operations:
  Device configuration (CPU/GPU) and memory optimization (cache clearing, asynchronous data transfer).
  Model component initialization: Feature projection layer (type_transform), multi-head attention parameters (att_params), etc.

2. Heterogeneous Graph Meta-Path Index Construction

The build_metapath_index method is used to pre-compute the meta-path index, capturing the associations between different types of nodes and time decay features, to speed up subsequent attention calculations.

Meta-Path Definition:
  U-U-social: User - Social - User (no timestamp, only stores neighbor relationships).
  U-U-interact: User - Interact - User (stores interaction timestamp, time decay value, and prefix sum).
  C-U-C: Cascade - User - Cascade (cascade relationship connected through intermediate users, stores diffusion timestamp, time decay value, and prefix sum).

Index Content: For meta-paths containing time information (U-U-interact, C-U-C), store by node pair:
  Sorted list of timestamps (ts).
  Time decay value (decay, calculated based on 1 - exp(-lambda*(t-T_earliest))).
  Prefix sum of decay values (prefix, accelerates cumulative weight calculation).

Index Cache: The calculation results are saved locally (save_metapath_full_cache) to avoid repeated calculations.

3. Feature Projection and Node Embedding Initialization

Feature Projection: Through type_tr
This document outlines a mathematical modeling approach for ...
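
The per-pair index content described in stage 2 (sorted timestamps, decay values, and their prefix sums) can be sketched in a few lines of Python. This is only an illustrative sketch, not the model's actual build_metapath_index: the function names build_pair_index and cumulative_decay, the (src, dst, t) edge tuple layout, and the choice of T_earliest as the earliest timestamp over all supplied edges are assumptions made for the example.

```python
import bisect
import math
from collections import defaultdict
from itertools import accumulate


def build_pair_index(edges, lambda_time):
    """Build {(src, dst): {"ts", "decay", "prefix"}} from (src, dst, t) edges.

    Hypothetical helper; the edge layout is an assumption for illustration.
    """
    if not edges:
        return {}
    # Assumption: T_earliest is the earliest timestamp over all edges passed in.
    t_earliest = min(t for _, _, t in edges)

    # Group timestamps by node pair.
    grouped = defaultdict(list)
    for src, dst, t in edges:
        grouped[(src, dst)].append(t)

    index = {}
    for pair, ts in grouped.items():
        ts.sort()
        # Decay formula as stated in the text: 1 - exp(-lambda * (t - T_earliest)).
        decay = [1.0 - math.exp(-lambda_time * (t - t_earliest)) for t in ts]
        # Running totals of the decay values, aligned with the sorted timestamps.
        prefix = list(accumulate(decay))
        index[pair] = {"ts": ts, "decay": decay, "prefix": prefix}
    return index


def cumulative_decay(entry, t_query):
    """Sum of decay values for events with timestamp <= t_query."""
    # Binary search on the sorted timestamps, then one prefix-sum lookup.
    k = bisect.bisect_right(entry["ts"], t_query)
    return entry["prefix"][k - 1] if k > 0 else 0.0


edges = [("u1", "u2", 10.0), ("u1", "u2", 4.0), ("u1", "u3", 7.0), ("u1", "u2", 6.0)]
index = build_pair_index(edges, lambda_time=0.1)
entry = index[("u1", "u2")]
```

With the prefix sums stored alongside the sorted timestamps, the cumulative weight up to any query time costs one binary search and one list lookup instead of a pass over every event for the pair, which is the speed-up the index is meant to provide.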