Dr. Marian-Andrei Rizoiu is a lecturer at the University of Technology Sydney, where he leads the Behavioral Data Science group, which studies the dynamics of human attention in the online environment. He is interested in stochastic behavioural modelling of human actions online, at the intersection of applied statistics, artificial intelligence and social data science. His research has made several key contributions, particularly to the areas of online popularity prediction and online privacy. For the past four years, he has been developing theoretical models for online information diffusion, which can account for complex social phenomena, such as the rise and fall of online popularity, the spread of misinformation or the adoption of disruptive technologies. He has approached questions such as "Why did X become popular, but not Y?" and "How can items be promoted?", which have implications for advertising and marketing. Marian-Andrei has also worked on detecting the evolution of privacy loss over time. His research has shown that privacy "leaks" over time, and it has identified the factors causing the loss: the individual's own actions and the environment. A striking conclusion is that privacy continues to decrease even for users who have retired from activity.
Marian-Andrei's research has an interdisciplinary focus. He has led two research grants: the first on quantifying the social influence of automatic diffusion systems in the electoral process (with social scientists) and the second on detecting hate speech for the early prediction of mass atrocities and genocides (with political scientists).
Marian-Andrei has published in the most selective venues in the field of Data Science and Web Research, such as the International World Wide Web Conference (WWW), the Conference on Web Search and Data Mining (WSDM), the International Conference on Web and Social Media (ICWSM), and the Conference on Information and Knowledge Management (CIKM). He serves as a PC member for prestigious conferences, such as AAAI, WWW and ICWSM, and as a reviewer for journals such as the Journal of Machine Learning Research. His work has received significant media attention, including from the Wikimedia Foundation for the work concerning the privacy of Wikipedia editors (which featured in the March 2016 Wikimedia Research Showcase). See more at http://www.rizoiu.eu
Media attention. Marian-Andrei's work has received significant media attention, including:
- Both the Business Insider and the ANU Reporter wrote about our findings concerning the bot influence in the 2016 US elections.
- I presented my findings concerning the privacy of Wikipedia editors to the Wikimedia Foundation (the legal entity that handles and represents Wikipedia), in the March 2016 edition of the Wikimedia Research Showcase. The showcase was live streamed on YouTube and reached an international audience of both researchers and the general public.
- My Wikipedia privacy work was featured in ANU’s news media outlet.
- My work on social media popularity was covered by the ANU Reporter and NCI News.
Can supervise: YES
- Machine Learning for social media;
- Big Social Data Science: algorithms and applications;
- influence, polarisation, radicalisation through the prism of online social media;
- spatio-temporal information diffusion;
- (technical) stochastic point process modelling, epidemic models, Bayesian learning.
See here for the complete list of courses taught and student projects.
Teaching. I hold a pedagogical degree in higher education and have 10 years of teaching experience. Overall, I have delivered more than 600 hours of lectures and tutoring for Undergraduate, Masters and Honours students, and I have lectured in international degree programs of excellence, such as the Erasmus Mundus Excellence Masters DMKM and the Franco-Ukrainian Masters IDSM (a cooperation between the University Lumiere Lyon and the University of Kharkov, Ukraine).
Supervision completion. More than 45 students: 4 PhD students, 2 RAs/postdocs, 1 visiting postgraduate student, 5 Honours (Masters by research) students, 4 summer scholar students, and more than 30 coursework masters students. See here for the complete list of alumni students and their projects.
Teaching quality. For the past four years, I have obtained high evaluations in ANU’s official Student Experience of Learning and Teaching (SELT) (see attached 2017 SELT evaluation of my teaching).
Diverse teaching. I taught a wide range of CS subjects (Programming, Calculus, Networking, Algorithms Design), of Machine Learning and Data Mining subjects (association rules mining, decision trees, clustering, symbolic learning, ensemble methods) and Social Media Analysis. This document details the complete list of these courses.
Kern, ML, McCarthy, PX, Chakrabarty, D & Rizoiu, M-A 2019, 'Social media-predicted personality traits and values can help match people to their ideal jobs.', Proceedings of the National Academy of Sciences of the United States of America.
Work is thought to be more enjoyable and beneficial to individuals and society when there is congruence between one's personality and one's occupation. We provide large-scale evidence that occupations have distinctive psychological profiles, which can successfully be predicted from linguistic information unobtrusively collected through social media. Based on 128,279 Twitter users representing 3,513 occupations, we automatically assess user personalities and visually map the personality profiles of different professions. Similar occupations cluster together, pointing to specific sets of jobs that one might be well suited for. Observations that contradict existing classifications may point to emerging occupations relevant to the 21st century workplace. Findings illustrate how social media can be used to match people to their ideal occupation.
Kim, D, Graham, T, Wan, Z & Rizoiu, M-A 2019, 'Analysing user identity via time-sensitive semantic edit distance (t-SED): a case study of Russian trolls on Twitter', Journal of Computational Social Science, vol. 2, no. 2, pp. 331-351.
Rizoiu, M-A, Guille, A & Velcin, J 2015, 'CommentWatcher: An Open Source Web-based platform for analyzing discussions on web forums.', CoRR, vol. abs/1504.07459.
Rizoiu, M-A, Velcin, J & Lallich, S 2015, 'Semantic-enriched visual vocabulary construction in a weakly supervised context', Intelligent Data Analysis, vol. 19, no. 1, pp. 161-185.
Rizoiu, M-A, Velcin, J & Lallich, S 2014, 'How to Use Temporal-Driven Constrained Clustering to Detect Typical Evolutions.', International Journal on Artificial Intelligence Tools, vol. 23.
Rizoiu, M-A, Velcin, J & Lallich, S 2013, 'Unsupervised feature construction for improving data representation and semantics', Journal of Intelligent Information Systems, vol. 40, no. 3, pp. 501-527.
Muşat, C, Trǎuşan-Matu, S, Velcin, J & Rizoiu, MA 2012, 'Automatic extraction of conceptual labels from topic models', UPB Scientific Bulletin, Series C: Electrical Engineering, vol. 74, no. 2, pp. 57-68.
This work outlines a novel system that automatically extracts conceptual labels for statistically obtained topics. By creating a projection of the topic, which is a distribution over all the vocabulary words, over the WordNet ontology we succeed in associating concepts to the said groups of words. The most important contributions of this paper are connected to the validation of the role of these concepts as topical labels and the determination of correlations that emerge between the utility of these labels and the strength of the relation between the concepts and the topics.
In this work, we develop a new approximation method to solve the analytically
intractable Bayesian inference for Gaussian process models with factorizable
Gaussian likelihoods and single-output latent functions. Our method -- dubbed
QP -- is similar to expectation propagation (EP); however, it minimizes the
$L^2$ Wasserstein distance instead of the Kullback-Leibler (KL) divergence. We
consider the specific case in which the non-Gaussian likelihood is approximated
by the Gaussian likelihood. We show that QP has the following properties: (1)
QP matches quantile functions rather than moments in EP; (2) QP and EP have the
same local update for the mean of the approximate Gaussian likelihood; (3) the
local variance estimate for the approximate likelihood is smaller for QP than
for EP, addressing EP's over-estimation of the variance; (4) the optimal
approximate Gaussian likelihood enjoys a univariate parameterization, reducing
memory consumption and computation time. Furthermore, we provide a unified
interpretation of EP and QP -- both are coordinate descent algorithms of a KL
and an $L^2$ Wasserstein global objective function respectively, under the same
assumptions. In the performed experiments, we employ eight real world datasets
and we show that QP outperforms EP for the task of Gaussian process binary
Rizoiu, MA & Velcin, J 2011, 'Topic extraction for ontology learning' in Ontology Learning and Knowledge Discovery Using the Web: Challenges and Recent Advances, pp. 38-60.
This chapter addresses the issue of topic extraction from text corpora for ontology learning. The first part provides an overview of some of the most significant solutions present today in the literature. These solutions deal mainly with the inferior layers of the Ontology Learning Layer Cake. They are related to the challenges of the Terms and Synonyms layers. The second part shows how these pieces can be bound together into an integrated system for extracting meaningful topics. While the extracted topics are not proper concepts as yet, they constitute a convincing approach towards concept building and therefore ontology learning. This chapter concludes by discussing the research undertaken for filling the gap between topics and concepts as well as perspectives that emerge today in the area of topic extraction. © 2011, IGI Global.
Dawson, NJ, Rizoiu, M-A, Johnston, B & Williams, M-A 2019, 'Adaptively selecting occupations to detect skill shortages from online job ads', IEEE International Conference on Big Data (IEEE Big Data 2019), Los Angeles, CA, USA, pp. 1-7.
Labour demand and skill shortages have historically been difficult to assess given the high costs of conducting representative surveys and the inherent delays of these indicators. This is particularly consequential for fast developing skills and occupations, such as those relating to Data Science and Analytics (DSA). This paper develops a data-driven solution to detecting skill shortages from online job advertisements (ads) data. We first propose a method to generate sets of highly similar skills based on a set of seed skills from job ads. This provides researchers with a novel method to adaptively select occupations based on granular skills data. Next, we apply this adaptive skills similarity technique to a dataset of over 6.7 million Australian job ads in order to identify occupations with the highest proportions of DSA skills. This uncovers 306,577 DSA job ads across 23 occupational classes from 2012-2019. Finally, we propose five variables for detecting skill shortages from online job ads: (1) posting frequency; (2) salary levels; (3) education requirements; (4) experience demands; and (5) job ad posting predictability. This contributes further evidence to the goal of detecting skills shortages in real-time. In conducting this analysis, we also find strong evidence of skills shortages in Australia for highly technical DSA skills and occupations. These results provide insights to Data Science researchers, educators, and policy-makers from other advanced economies about the types of skills that should be cultivated to meet growing DSA labour demands in the future.
Mihaita, A-S, Li, H, He, Z & Rizoiu, M-A 2019, 'Motorway Traffic Flow Prediction using Advanced Deep Learning', 2019 IEEE Intelligent Transportation Systems Conference (ITSC), IEEE, pp. 1683-1690.
Congestion prediction represents a major priority for traffic management centres around the world to ensure timely incident response handling. The increasing amounts of generated traffic data have been used to train machine learning predictors for traffic; however, this is a challenging task due to inter-dependencies of traffic flow both in time and space. Recently, deep learning techniques have shown significant prediction improvements over traditional models; however, open questions remain around their applicability, accuracy and parameter tuning. This paper proposes an advanced deep learning framework for simultaneously predicting the traffic flow on a large number of monitoring stations along a heavily used motorway in Sydney, Australia, including exit and entry loop count stations, and over varying training and prediction time horizons. The spatial and temporal features extracted from the 36.34 million data points are used in various deep learning architectures that exploit their spatial structure (convolutional neural networks), their temporal dynamics (recurrent neural networks), or both through hybrid spatio-temporal modelling (CNN-LSTM). We show that our deep learning models consistently outperform traditional methods, and we conduct a comparative analysis of the optimal time horizon of historical data required to predict traffic flow at different time points in the future.
Mihăiţă, AS, Liu, Z, Rizoiu, MA & Cai, C 2019, 'Arterial incident duration prediction using a bi-level framework of extreme gradient-tree boosting', ITS World Congress 2019 (ITSWC2019), Singapore.
© 2019 Association for Computing Machinery. Online videos account for a tremendous increase in Internet traffic. Most video hosting sites implement recommender systems, which connect the videos into a directed network and conceptually act as a source of pathways for users to navigate. At present, little is known about how human attention is allocated over such large-scale networks, and about the impacts of the recommender systems. In this paper, we first construct the Vevo network, a YouTube video network with 60,740 music videos interconnected by the recommendation links, and we collect their associated viewing dynamics. This results in a total of 310 million views every day over a period of 9 weeks. Next, we present large-scale measurements that connect the structure of the recommendation network and the video attention dynamics. We use the bow-tie structure to characterize the Vevo network and we find that its core component (23.1% of the videos), which occupies most of the attention (82.6% of the views), is made up of videos that are mainly recommended among themselves. This is indicative of the links between video recommendation and the inequality of attention allocation. Finally, we address the task of estimating the attention flow in the video recommendation network. We propose a model that accounts for the network effects for predicting video popularity, and we show it consistently outperforms the baselines. This model also identifies a group of artists gaining attention because of the recommendation network. Altogether, our observations and our models provide a new set of tools to better understand the impacts of recommender systems on collective social attention.
Zhang, R, Walder, C, Rizoiu, M-A & Xie, L 2019, 'Efficient Non-parametric Bayesian Hawkes Processes', Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, International Joint Conferences on Artificial Intelligence Organization, pp. 4299-4305.
In this paper, we develop an efficient non-parametric Bayesian estimation of the kernel function of Hawkes processes. The non-parametric Bayesian approach is important because it provides flexible Hawkes kernels and quantifies their uncertainty. Our method is based on the cluster representation of Hawkes processes. Utilizing the stationarity of the Hawkes process, we efficiently sample random branching structures and thus, we split the Hawkes process into clusters of Poisson processes. We derive two algorithms — a block Gibbs sampler and a maximum a posteriori estimator based on expectation maximization — and we show that our methods have a linear time complexity, both theoretically and empirically. On synthetic data, we show our methods to be able to infer flexible Hawkes triggering kernels. On two large-scale Twitter diffusion datasets, we show that our methods outperform the current state-of-the-art in goodness-of-fit and that the time complexity is linear in the size of the dataset. We also observe that on diffusions related to online videos, the learned kernels reflect the perceived longevity for different content types such as music or pets videos.
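The cluster (branching) representation mentioned in the abstract above can be illustrated with a small simulation. This is only a sketch of the general idea, not the paper's estimation method: it generates data from a Hawkes process rather than inferring a kernel. The function name, the parameters `mu`, `alpha`, `beta`, and the exponential kernel `alpha * beta * exp(-beta * t)` are all assumptions made for illustration. Immigrant events arrive as a homogeneous Poisson process, and every event independently spawns a Poisson number of offspring, so the whole process decomposes into clusters of Poisson processes.

```python
import random


def _poisson(rng, lam):
    """Sample Poisson(lam) by counting rate-1 exponential arrivals in [0, lam)."""
    count = 0
    s = rng.expovariate(1.0)
    while s < lam:
        count += 1
        s += rng.expovariate(1.0)
    return count


def simulate_hawkes_by_clusters(mu, alpha, beta, horizon, seed=0):
    """Simulate a Hawkes process on [0, horizon) through its branching structure.

    Assumed kernel (illustrative, not taken from the paper):
        phi(t) = alpha * beta * exp(-beta * t)
    so each event has Poisson(alpha) children; alpha < 1 keeps clusters finite.
    Returns a time-sorted list of (time, parent_index); parent_index is None
    for immigrant events and otherwise indexes into the returned list.
    """
    rng = random.Random(seed)
    events = []  # unsorted (time, parent position in this list)

    # Immigrants: homogeneous Poisson process with rate mu.
    t = rng.expovariate(mu)
    while t < horizon:
        events.append((t, None))
        t += rng.expovariate(mu)

    # Offspring: each event triggers children with exponential delays.
    i = 0
    while i < len(events):
        parent_time = events[i][0]
        for _ in range(_poisson(rng, alpha)):
            child_time = parent_time + rng.expovariate(beta)
            if child_time < horizon:
                events.append((child_time, i))
        i += 1

    # Sort by time and remap parent indices to the sorted order.
    order = sorted(range(len(events)), key=lambda k: events[k][0])
    new_index = {old: new for new, old in enumerate(order)}
    return [
        (events[old][0], None if events[old][1] is None else new_index[events[old][1]])
        for old in order
    ]


if __name__ == "__main__":
    sample = simulate_hawkes_by_clusters(mu=0.5, alpha=0.6, beta=1.0, horizon=50.0, seed=1)
    immigrants = sum(1 for _, parent in sample if parent is None)
    print(f"{len(sample)} events, {immigrants} immigrants")
```

Following the parent indices back from any event recovers the cluster it belongs to, which is the latent structure that branching-based samplers work with.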
Kong, Q, Rizoiu, M-A, Wu, S & Xie, L 2018, 'Will This Video Go Viral', Companion of The Web Conference 2018 (WWW '18), ACM Press.
Mishra, S, Rizoiu, MA & Xie, L 2018, 'Modeling popularity in asynchronous social media streams with recurrent neural networks', 12th International AAAI Conference on Web and Social Media, ICWSM 2018, AAAI, Stanford, USA, pp. 201-210.
Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Understanding and predicting the popularity of online items is an important open problem in social media analysis. Considerable progress has been made recently in data-driven predictions, and in linking popularity to external promotions. However, the existing methods typically focus on a single source of external influence, whereas for many types of online content such as YouTube videos or news articles, attention is driven by multiple heterogeneous sources simultaneously - e.g. microblogs or traditional media coverage. Here, we propose RNN-MAS, a recurrent neural network for modeling asynchronous streams. It is a sequence generator that connects multiple streams of different granularity via joint inference. We show RNN-MAS not only outperforms the current state-of-the-art Youtube popularity prediction system by 17%, but also captures complex dynamics, such as seasonal trends of unseen influence. We define two new metrics: the promotion score quantifies the gain in popularity from one unit of promotion for a Youtube video; the loudness level captures the effects of a particular user tweeting about the video. We use the loudness level to compare the effects of a video being promoted by a single highly-followed user (in the top 1% most followed users) against being promoted by a group of mid-followed users. We find that results depend on the type of content being promoted: superusers are more successful in promoting Howto and Gaming videos, whereas the cohort of regular users are more influential for Activism videos. This work provides more accurate and explainable popularity predictions, as well as computational tools for content producers and marketers to allocate resources for promotion campaigns.
Rizoiu, MA, Graham, T, Zhang, R, Zhang, Y, Ackland, R & Xie, L 2018, '#DebateNight: The role and influence of socialbots on Twitter during the first 2016 U.S. presidential debate', 12th International AAAI Conference on Web and Social Media, ICWSM 2018, AAAI, Stanford, USA, pp. 300-309.
Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Serious concerns have been raised about the role of 'socialbots' in manipulating public opinion and influencing the outcome of elections by retweeting partisan content to increase its reach. Here we analyze the role and influence of socialbots on Twitter by determining how they contribute to retweet diffusions. We collect a large dataset of tweets during the 1st U.S. presidential debate in 2016 and we analyze its 1.5 million users from three perspectives: user influence, political behavior (partisanship and engagement) and botness. First, we define a measure of user influence based on the user's active contributions to information diffusions, i.e. their tweets and retweets. Given that Twitter does not expose the retweet structure - it associates all retweets with the original tweet - we model the latent diffusion structure using only tweet time and user features, and we implement a scalable novel approach to estimate influence over all possible unfoldings. Next, we use partisan hashtag analysis to quantify user political polarization and engagement. Finally, we use the BotOrNot API to measure user botness (the likelihood of being a bot). We build a two-dimensional “polarization map” that allows for a nuanced analysis of the interplay between botness, partisanship and influence. We find that not only are socialbots more active on Twitter - starting more retweet cascades and retweeting more - but they are 2.5 times more influential than humans, and more politically engaged. Moreover, pro-Republican bots are both more influential and more politically engaged than their pro-Democrat counterparts. However we caution against blanket statements that software designed to appear human dominates politics-related activity on Twitter. Firstly, it is known that accounts controlled by teams of humans (e.g. organizational accounts) are often identified as bots. Seco...
Rizoiu, M-A, Mishra, S, Kong, Q, Carman, M & Xie, L 2018, 'SIR-Hawkes: Linking Epidemic Models and Hawkes Processes to Model Diffusions in Finite Populations', Proceedings of the 27th World Wide Web Conference (WWW 2018), ACM, Lyon, France, pp. 419-428.
Wu, S, Rizoiu, MA & Xie, L 2018, 'Beyond views: Measuring and predicting engagement in online videos', 12th International AAAI Conference on Web and Social Media, ICWSM 2018, AAAI, Stanford, USA, pp. 434-443.
The share of videos in internet traffic has been growing; therefore, understanding how videos capture attention on a global scale is also of growing importance. Most current research focuses on modeling the number of views, but we argue that video engagement, or time spent watching, is a more appropriate measure for resource allocation problems in attention, networking, and promotion activities. In this paper, we present a first large-scale measurement of video-level aggregate engagement from publicly available data streams, on a collection of 5.3 million YouTube videos published over two months in 2016. We study a set of metrics including time and the average percentage of a video watched. We define a new metric, relative engagement, that is calibrated against video properties and strongly correlates with recognized notions of quality. Moreover, we find that engagement measures of a video are stable over time, thus separating the concerns for modeling engagement and those for popularity - the latter is known to be unstable over time and driven by external promotions. We also find engagement metrics predictable from a cold-start setup, with most of their variance explained by video context, topics and channel information ($R^2 = 0.77$). Our observations imply several prospective uses of engagement metrics - choosing engaging topics for video production, or promoting engaging videos in recommender systems.
Rizoiu, MA & Xie, L 2017, 'Online popularity under promotion: Viral potential, forecasting, and the economics of time', Proceedings of the 11th International Conference on Web and Social Media, ICWSM 2017, pp. 182-191.
© Copyright 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Modeling the popularity dynamics of an online item is an important open problem in computational social science. This paper presents an in-depth study of popularity dynamics under external promotions, especially in predicting popularity jumps of online videos, and determining effective and efficient schedules to promote online content. The recently proposed Hawkes Intensity Process (HIP) models popularity as a non-linear interplay between exogenous stimuli and the endogenous reactions. Here, we propose two novel metrics based on HIP: to describe popularity gain per unit of promotion, and to quantify the time it takes for such effects to unfold. We make increasingly accurate forecasts of future popularity by including information about the intrinsic properties of the video, promotions it receives, and the non-linear effects of popularity ranking. We illustrate by simulation the interplay between the unfolding of popularity over time, and the time-sensitive value of resources. Lastly, our model lends a novel explanation of the commonly adopted periodic and constant promotion strategy in advertising, as increasing the perceived viral potential. This study provides quantitative guidelines about setting promotion schedules considering content virality, timing, and economics.
Rizoiu, M-A, Xie, L, Sanner, S, Cebrian, M, Yu, H & Van Hentenryck, P 2017, 'Expecting to be HIP: Hawkes Intensity Processes for Social Media Popularity', Proceedings of the 26th International Conference on World Wide Web (WWW'17), ACM, Perth, Australia, pp. 735-744.
Mishra, S, Rizoiu, M-A & Xie, L 2016, 'Feature Driven and Point Process Approaches for Popularity Prediction', Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM'16), ACM, IUPUI, Indianapolis, IN, pp. 1069-1078.
Rizoiu, M-A, Velcin, J, Bonnevay, S & Lallich, S 2016, 'ClusPath: a temporal-driven clustering to infer typical evolution paths', Data Mining and Knowledge Discovery, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD), Springer, Riva del Garda, Italy, pp. 1324-1349.
Rizoiu, M-A, Xie, L, Caetano, TS & Cebrián, M 2016, 'Evolution of Privacy Loss in Wikipedia', Proceedings of the Ninth ACM International Conference on Web Search and Data Mining (WSDM'16), ACM, San Francisco, CA, pp. 215-224.
Kim, YM, Velcin, J, Bonnevay, S & Rizoiu, MA 2015, 'Temporal multinomial mixture for instance-oriented evolutionary clustering', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 593-604.
© Springer International Publishing Switzerland 2015. Evolutionary clustering aims at capturing the temporal evolution of clusters. This issue is particularly important in the context of social media data that are naturally temporally driven. In this paper, we propose a new probabilistic model-based evolutionary clustering technique. The Temporal Multinomial Mixture (TMM) is an extension of classical mixture model that optimizes feature co-occurrences in the trade-off with temporal smoothness. Our model is evaluated for two recent case studies on opinion aggregation over time. We compare four different probabilistic clustering models and we show the superiority of our proposal in the task of instance-oriented clustering.
Rizoiu, M-A, Velcin, J & Lallich, S 2012, 'How to Use Temporal-Driven Constrained Clustering to Detect Typical Evolutions', International Journal on Artificial Intelligence Tools, IEEE 24th International Conference on Tools with Artificial Intelligence (ICTAI), World Scientific, Athens, Greece.
Rizoiu, MA 2013, 'Semi-supervised structuring of complex data', IJCAI International Joint Conference on Artificial Intelligence, pp. 3239-3240.
The objective of the thesis is to explore how complex data can be treated using unsupervised machine learning techniques, in which additional information is injected to guide the exploratory process. Starting from specific problems, our contributions take into account the different dimensions of the complex data: their nature (image, text), the additional information attached to the data (labels, structure, concept ontologies) and the temporal dimension. A special attention is given to data representation and how additional information can be leveraged to improve this representation.
Rizoiu, M-A, Velcin, J & Lallich, S 2012, 'Structuring typical evolutions using Temporal-Driven Constrained Clustering', 2012 IEEE 24th International Conference on Tools with Artificial Intelligence (ICTAI 2012), vol. 1, IEEE, Athens, Greece, pp. 610-617.
We propose a system which employs conceptual knowledge to improve topic models by removing unrelated words from the simplified topic description. We use WordNet to detect which topical words are not conceptually similar to the others and then test our assumptions against human judgment. Results obtained on two different corpora in different test conditions show that the words detected as unrelated had a much greater probability than the others to be chosen by human evaluators as not being part of the topic at all. We prove that there is a strong correlation between the said probability and an automatically calculated topical fitness and we discuss the variation of the correlation depending on the method and data used. © 2011 Springer-Verlag Berlin Heidelberg.
Musat, CC, Velcin, J, Trausan-Matu, S & Rizoiu, MA 2011, 'Improving topic evaluation using conceptual knowledge', IJCAI International Joint Conference on Artificial Intelligence, pp. 1866-1871.
The growing number of statistical topic models has led to the need to better evaluate their output. Traditional evaluation means estimate the model's fitness to unseen data. It has recently been shown that the output of human judgment can greatly differ from these measures. Thus, there is a pressing need for methods that better emulate human judgment. In this paper we present a system that computes the conceptual relevance of individual topics from a given model on the basis of information drawn from a given concept hierarchy, in this case WordNet. The notion of conceptual relevance is regarded as the ability to attribute a concept to each topic and separate words related to the topic from the unrelated ones based on that concept. In multiple experiments we demonstrate the correlation between the automatic evaluation method and the answers received from human evaluators, for various corpora and difficulty levels. By changing the evaluation focus from a statistical one to a conceptual one we were able to detect which topics are conceptually meaningful and rank them accordingly.
Rizoiu, M-A, Velcin, J & Chauchat, J-H 2010, 'Regrouper les données textuelles et nommer les groupes à l'aide de classes recouvrantes.', EGC, Cépaduès-Éditions, pp. 561-572.
Epidemic models and self-exciting processes are two types of models used to
describe diffusion phenomena online and offline. These models were originally
developed in different scientific communities, and their commonalities are
under-explored. This work establishes, for the first time, a general connection
between the two model classes via three new mathematical components. The first
is a generalized version of the stochastic Susceptible-Infected-Recovered (SIR)
model with arbitrary recovery time distributions; the second is the
relationship between the (latent and arbitrary) recovery time distribution,
recovery hazard function, and the infection kernel of self-exciting processes;
the third includes methods for simulating, fitting, evaluating and predicting
the generalized process. On three large Twitter diffusion datasets, we conduct
goodness-of-fit tests and holdout log-likelihood evaluation of self-exciting
processes with three infection kernels --- exponential, power-law and Tsallis
Q-exponential. We show that the modeling performance of the infection kernels
varies with respect to the temporal structures of diffusions, and also with
respect to user behavior, such as the likelihood of being bots. We further
improve the prediction of popularity by combining two models that are
identified as complementary by the goodness-of-fit tests.
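As a rough illustration of the three infection kernel families named in the abstract above, the following Python sketch evaluates each kernel and the resulting self-exciting intensity. The parametrizations (kappa, theta, c, q), the default values, and all function names are assumptions made for this example; they are not taken from the paper, whose exact forms may differ.

```python
import math

def exponential_kernel(t, kappa=0.8, theta=1.0):
    """kappa * theta * exp(-theta * t) for t >= 0; integrates to kappa."""
    if t < 0:
        return 0.0
    return kappa * theta * math.exp(-theta * t)

def power_law_kernel(t, kappa=0.8, c=1.0, theta=1.0):
    """kappa * theta * c**theta * (t + c)**-(1 + theta); integrates to kappa."""
    if t < 0:
        return 0.0
    return kappa * theta * c ** theta * (t + c) ** -(1.0 + theta)

def q_exponential(x, q):
    """Tsallis q-exponential [1 + (1 - q) x]_+ ** (1 / (1 - q)), used here for x <= 0.

    Reduces to exp(x) when q == 1.
    """
    if q == 1.0:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    return base ** (1.0 / (1.0 - q)) if base > 0 else 0.0

def tsallis_kernel(t, kappa=0.8, theta=1.0, q=1.5):
    """kappa * theta * (2 - q) * e_q(-theta * t); integrates to kappa when q < 2."""
    if t < 0:
        return 0.0
    return kappa * theta * (2.0 - q) * q_exponential(-theta * t, q)

def intensity(t, history, kernel, mu=0.1):
    """Self-exciting intensity: background rate mu plus kernel contributions
    from every past event strictly before time t."""
    return mu + sum(kernel(t - ti) for ti in history if ti < t)

if __name__ == "__main__":
    history = [0.5, 1.2, 1.9]  # hypothetical event times
    for kernel in (exponential_kernel, power_law_kernel, tsallis_kernel):
        print(kernel.__name__, intensity(2.0, history, kernel))
```

With each kernel normalized to integrate to kappa, that parameter plays the role of the expected number of direct offspring per event, which makes the three shapes comparable in this sketch.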
The Hawkes process (HP) has been widely applied to modeling self-exciting
events including neuron spikes, earthquakes and tweets. To avoid designing a
parametric triggering kernel and to quantify the prediction
confidence, the non-parametric Bayesian HP has been proposed. However,
inference in such models suffers from poor scalability or slow convergence. In
this paper, we aim to solve both problems. Specifically, first, we propose a
new non-parametric Bayesian HP in which the triggering kernel is modeled as a
squared sparse Gaussian process. Then, we propose a novel variational inference
schema for model optimization. We employ the branching structure of the HP so
that maximization of the evidence lower bound (ELBO) is tractable via the
expectation-maximization algorithm. We propose a tighter ELBO which improves
the fitting performance. Further, we accelerate the novel variational inference
schema to linear time complexity by leveraging the stationarity of the
triggering kernel. Unlike prior acceleration methods, ours enjoys
higher efficiency. Finally, we exploit synthetic data and two large social
media datasets to evaluate our method. We show that our approach outperforms
state-of-the-art non-parametric frequentist and Bayesian methods. We validate
the efficiency of our accelerated variational inference schema and practical
utility of our tighter ELBO for model selection. We observe that the tighter
ELBO exceeds the common one in model selection.
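The branching structure mentioned in the abstract above can be illustrated with a small simulation: every event is either an immigrant from the background process or the offspring of exactly one earlier event. The sketch below is only a generic cluster-based simulator with an exponential triggering kernel, not the inference scheme of the paper; the function names, the parameters (mu, kappa, theta, horizon) and the kernel choice are assumptions for this example.

```python
import random

def poisson_count(mean, rng):
    """Sample a Poisson(mean) count by counting unit-rate arrivals in [0, mean)."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def simulate_hawkes_branching(mu, kappa, theta, horizon, seed=0):
    """Simulate a Hawkes process on [0, horizon] through its branching structure.

    mu    : background (immigrant) rate, must be > 0
    kappa : expected number of direct offspring per event (keep it below 1)
    theta : decay rate of the exponential triggering kernel, must be > 0
    Returns a list of (time, parent_index) pairs in generation order;
    parent_index is None for immigrants and otherwise indexes into this list.
    """
    rng = random.Random(seed)
    events = []

    # Immigrants: homogeneous Poisson process with rate mu.
    t = rng.expovariate(mu)
    while t <= horizon:
        events.append((t, None))
        t += rng.expovariate(mu)

    # Offspring: each event, including later offspring, spawns Poisson(kappa)
    # children whose delays are drawn from an Exponential(theta) distribution.
    i = 0
    while i < len(events):
        parent_time = events[i][0]
        for _ in range(poisson_count(kappa, rng)):
            child_time = parent_time + rng.expovariate(theta)
            if child_time <= horizon:
                events.append((child_time, i))
        i += 1
    return events

if __name__ == "__main__":
    events = simulate_hawkes_branching(mu=1.0, kappa=0.5, theta=2.0, horizon=20.0, seed=1)
    immigrants = sum(1 for _, parent in events if parent is None)
    print(len(events), "events,", immigrants, "immigrants")
```

The returned list is not sorted by time, so that each parent_index stays valid; sort a copy by time if only the event sequence is needed.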
Rizoiu, M-A, Wang, T, Ferraro, G & Suominen, H, 'Transfer Learning for Hate Speech Detection in Social Media'.
In today's society more and more people are connected to the Internet, and
its information and communication technologies have become an essential part of
our everyday life. Unfortunately, the flip side of this increased connectivity
to social media and other online contents is cyber-bullying and -hatred, among
other harmful and anti-social behaviors. Models based on machine learning and
natural language processing provide a way to detect this hate speech in web
text in order to make discussion forums and other media and platforms safer.
The main difficulty, however, is annotating a sufficiently large number of
examples to train these models. In this paper, we report on developing
automated text analytics methods, capable of jointly learning a single
representation of hate from several smaller, unrelated data sets. We train and
test our methods on a total of $37,520$ English tweets that have been
annotated for differentiating harmless messages from racist or sexist contexts
in the first detection task, and hateful or offensive content in the second
detection task. Our most sophisticated method combines a deep neural network
architecture with transfer learning. It is capable of creating word and
sentence embeddings that are specific to these tasks while also embedding the
meaning of generic hate speech. It achieves a
macro-averaged F1 of $78\%$ and $72\%$ in the first and second task,
respectively. This method enables generating an interpretable two-dimensional
text visualization --- called the Map of Hate --- that is capable of separating
different types of hate speech and explaining what makes text harmful. These
methods and insights hold a potential for not only safer social media, but also
reduced need to expose human moderators and annotators to distressing