Sablok, G, Pérez-Pulido, AJ, Do, T, Seong, TY, Casimiro-Soriguer, CS, La Porta, N, Ralph, PJ, Squartini, A, Muñoz-Merida, A & Harikrishna, JA 2016, 'PlantFuncSSR: Integrating first and next generation transcriptomics for mining of SSR-functional domains markers', Frontiers in Plant Science, vol. 7, pp. 1-9.
© 2016 Sablok, Pérez-Pulido, Do, Seong, Casimiro-Soriguer, La Porta, Ralph, Squartini, Muñoz-Merida and Harikrishna. Analysis of repetitive DNA sequence content and divergence among the repetitive functional classes is a well-accepted approach for estimating inter- and intrageneric differences in plant genomes. Among these elements, microsatellites, or Simple Sequence Repeats (SSRs), have been widely demonstrated to be powerful genetic markers for discriminating species and varieties. We present the PlantFuncSSRs platform, which covers more than 364 plant species with more than 2 million functional SSRs. The SSRs are provided with detailed annotations for easy functional browsing and with information on primer pairs and associated functional domains. PlantFuncSSRs can be leveraged to identify function-based genic variability among the species of interest, which may be particularly useful for developing functional markers in plants. This comprehensive online portal unifies the mining of SSRs from first and next generation sequencing datasets with the corresponding primer pairs and in-depth functional annotation, such as gene ontology annotation, gene interactions, and identification from reference protein databases. PlantFuncSSRs is freely accessible at: http://www.bioinfocabd.upo.es/plantssr.
Do, TDT & Cao, L 2018, 'Coupled Poisson factorization integrated with user/item metadata for modeling popular and sparse ratings in scalable recommendation', 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, AAAI Conference on Artificial Intelligence, AAI, New Orleans, USA, pp. 2918-2925.
Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Modelling sparse and large data sets is in high demand yet challenging in recommender systems. By computing only on the non-zero ratings, Poisson Factorization (PF) enabled by variational inference has shown high efficiency in scalable recommendation, e.g., modeling millions of ratings. However, as PF learns the ratings by individual users on items with the Gamma distribution, it cannot capture the coupling relations between users (items) and the rating popularity (i.e., favorable rating scores that are given to one item) and rating sparsity (i.e., those users (items) with many zero ratings) for one item (user). This work proposes Coupled Poisson Factorization (CPF) to learn the couplings between users (items), and the user/item attributes (i.e., metadata) are integrated into CPF to form the Metadata-integrated CPF (mCPF), which handles not only sparse but also popular ratings in very large-scale data. Our empirical results show that the proposed models significantly outperform PF and address the key limitations in PF for scalable recommendation.
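The claim that PF needs computation only on the non-zero ratings can be illustrated with a minimal sketch of the plain Poisson log-likelihood. This is not the authors' CPF or mCPF model and does not perform variational inference; the names `theta` (user factors), `beta` (item factors) and both function names are assumptions made for illustration. The sketch relies on the fact that a zero rating contributes only its negative rate to the log-likelihood, and that the sum of all rates factorizes into a dot product of column sums.

```python
import math
import numpy as np

def loglik_dense(Y, theta, beta):
    """Poisson log-likelihood summed over every cell of the rating matrix."""
    rate = theta @ beta.T                          # rate[u, i] = theta_u . beta_i
    log_fact = np.vectorize(math.lgamma)(Y + 1.0)  # log(y!) for each cell
    return float(np.sum(Y * np.log(rate) - rate - log_fact))

def loglik_sparse(rows, cols, vals, theta, beta):
    """Same quantity, touching only the non-zero ratings."""
    # Rates for the observed (non-zero) cells only.
    rate_nz = np.einsum("nk,nk->n", theta[rows], beta[cols])
    log_fact = np.array([math.lgamma(v + 1.0) for v in vals])
    nonzero_part = np.sum(vals * np.log(rate_nz) - log_fact)
    # sum_{u,i} theta_u . beta_i = (sum_u theta_u) . (sum_i beta_i),
    # so the rate term never needs the full user-by-item matrix.
    total_rate = theta.sum(axis=0) @ beta.sum(axis=0)
    return float(nonzero_part - total_rate)

# Hypothetical toy data: 50 users, 40 items, 5 latent components.
rng = np.random.default_rng(0)
theta = rng.gamma(shape=1.0, scale=0.3, size=(50, 5))
beta = rng.gamma(shape=1.0, scale=0.3, size=(40, 5))
Y = rng.poisson(theta @ beta.T)                    # mostly zeros at these rates

rows, cols = np.nonzero(Y)
vals = Y[rows, cols].astype(float)

dense_value = loglik_dense(Y, theta, beta)
sparse_value = loglik_sparse(rows, cols, vals, theta, beta)
```

The two values agree up to floating-point error, while the sparse version scales with the number of non-zero ratings plus the size of the factor matrices rather than with the full user-by-item grid.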
Do, TDT & Cao, L 2018, 'Gamma-Poisson Dynamic Matrix Factorization Embedded with Metadata Influence', Advances in Neural Information Processing Systems, Advances in Neural Information Processing Systems, Neural Information Processing Systems Foundation, Montreal, Canada, pp. 1-12.
A conjugate Gamma-Poisson model for Dynamic Matrix Factorization incorporated with metadata influence (mGDMF for short) is proposed to effectively and efficiently model massive, sparse and dynamic data in recommendations. Modeling recommendation problems with a massive number of ratings and very sparse or even no ratings on some users/items in a dynamic setting is very demanding and poses critical challenges to well-studied matrix factorization models due to the large-scale, sparse and dynamic nature of the data. Our proposed mGDMF tackles these challenges by introducing three strategies: (1) constructing a stable Gamma-Markov chain model that smoothly drifts over time by combining both static and dynamic latent features of data; (2) incorporating the user/item metadata into the model to tackle sparse ratings; and (3) undertaking stochastic variational inference to efficiently handle massive data. mGDMF is conjugate, dynamic and scalable. Experiments show that mGDMF significantly (both effectively and efficiently) outperforms the state-of-the-art static and dynamic models on large, sparse and dynamic data.
Do, TDT & Cao, L 2018, 'Metadata-dependent infinite poisson factorization for efficiently modelling sparse and large matrices in recommendation', IJCAI International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, IJCAI, Stockholm, Sweden, pp. 5010-5016.
© 2018 International Joint Conferences on Artificial Intelligence. All rights reserved. Matrix Factorization (MF) is widely used in Recommender Systems (RSs) for estimating missing ratings in the rating matrix. MF faces major challenges in handling very sparse and large data. Poisson Factorization (PF), an MF variant, addresses these challenges with high efficiency by computing only on the non-missing elements. However, ignoring the missing elements in computation makes PF weak at, or incapable of, dealing with columns or rows with very few observations (corresponding to sparse items or users). In this work, Metadata-dependent Poisson Factorization (MPF) is introduced to address user/item sparsity by integrating user/item metadata into PF. MPF adds the metadata-based observed entries to the factorized PF matrices. In addition, as with MF, choosing a suitable number of latent components for PF is very expensive on very large datasets. Accordingly, we further extend MPF to Metadata-dependent Infinite Poisson Factorization (MIPF), which integrates a Bayesian Nonparametric (BNP) technique to automatically tune the number of latent components. Our empirical results show that, by integrating metadata, MPF/MIPF significantly outperform the state-of-the-art PF models on sparse and large datasets. MIPF also effectively estimates the number of latent components.