Sablok, G, Pérez-Pulido, AJ, Do, T, Seong, TY, Casimiro-Soriguer, CS, La Porta, N, Ralph, PJ, Squartini, A, Muñoz-Merida, A & Harikrishna, JA 2016, 'PlantFuncSSR: Integrating first and next generation transcriptomics for mining of SSR-functional domains markers', Frontiers in Plant Science, vol. 7, pp. 1-9.
© 2016 Sablok, Pérez-Pulido, Do, Seong, Casimiro-Soriguer, La Porta, Ralph, Squartini, Muñoz-Merida and Harikrishna. Analysis of repetitive DNA sequence content and divergence among the repetitive functional classes is a well-accepted approach for estimating inter- and intrageneric differences in plant genomes. Among these elements, microsatellites, or Simple Sequence Repeats (SSRs), have been widely demonstrated to be powerful genetic markers for discriminating species and varieties. We present the PlantFuncSSRs platform, which covers more than 364 plant species and more than 2 million functional SSRs. The SSRs are provided with detailed annotations for easy functional browsing, along with information on primer pairs and associated functional domains. PlantFuncSSRs can be leveraged to identify function-based genic variability among species of interest, which may be particularly useful for developing functional markers in plants. This comprehensive online portal unifies the mining of SSRs from first- and next-generation sequencing datasets with the corresponding primer pairs and in-depth functional annotation, such as gene ontology annotation, gene interactions and identification from reference protein databases. PlantFuncSSRs is freely accessible at: http://www.bioinfocabd.upo.es/plantssr.
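To make the idea of SSR mining concrete, here is a minimal sketch of how perfect tandem repeats can be located in a nucleotide sequence with a regular expression. It is not the PlantFuncSSRs pipeline: the abstract does not name the mining tool or its thresholds, so the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen purely for illustration.

```python
import re

# Hypothetical minimum repeat counts per motif length (illustrative thresholds only,
# not taken from the PlantFuncSSRs paper).
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Toy sequence: six copies of AC at offset 3, five copies of ATG at offset 17.
example = "TTG" + "AC" * 6 + "TT" + "ATG" * 5 + "CC"
print(find_ssrs(example))  # [(3, 'AC', 6), (17, 'ATG', 5)]
```

A real pipeline would additionally handle imperfect and compound repeats, design flanking primers, and attach functional annotation, none of which this sketch attempts.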
Do, TDT, Termier, A, Laurent, A, Negrevergne, B, Omidvar-Tehrani, B & Amer-Yahia, S 2015, 'PGLCM: efficient parallel mining of closed frequent gradual itemsets', Knowledge and Information Systems, vol. 43, no. 3, pp. 497-527.
© 2014, Springer-Verlag London. Numerical data (e.g., DNA micro-array data, sensor data) pose a challenging problem to existing frequent pattern mining methods, which can hardly handle them. In this framework, gradual patterns have been recently proposed to extract covariations of attributes, such as: “When X increases, Y decreases”. There exist some algorithms for mining frequent gradual patterns, but they cannot scale to real-world databases. In this paper we present GLCM, the first algorithm for mining closed frequent gradual patterns, which offers strong complexity guarantees: the mining time is linear in the number of closed frequent gradual itemsets. Our experimental study shows that GLCM is two orders of magnitude faster than the state of the art, with a constant low memory usage. We also present PGLCM, a parallelization of GLCM capable of exploiting multicore processors, with good scale-up properties on complex datasets. These are the first algorithms capable of mining large real-world datasets to discover gradual patterns.
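As an illustration of what a gradual pattern such as “When X increases, Y decreases” measures, the sketch below scores a pattern by the fraction of row pairs that can be ordered so that every attribute moves in the requested direction. This pair-based support is only one of several definitions used for gradual patterns, and the abstract does not say which one GLCM adopts; the brute-force function `gradual_support` is a hypothetical helper, not the GLCM or PGLCM algorithm, and it does not address closedness.

```python
from itertools import combinations


def gradual_support(rows, pattern):
    """Fraction of row pairs that can be ordered to satisfy every (attribute, direction) item.

    `pattern` is a list of (attribute, direction) pairs, where direction is
    "+" for "increases" and "-" for "decreases".
    """
    def respects(a, b):
        # True when moving from row a to row b follows every requested variation.
        return all(
            (b[attr] > a[attr]) if direction == "+" else (b[attr] < a[attr])
            for attr, direction in pattern
        )

    pairs = list(combinations(rows, 2))
    concordant = sum(1 for a, b in pairs if respects(a, b) or respects(b, a))
    return concordant / len(pairs)


# Toy data: Y falls as X rises, except between the second and third rows.
rows = [
    {"X": 1, "Y": 9},
    {"X": 2, "Y": 7},
    {"X": 3, "Y": 8},
    {"X": 4, "Y": 2},
]
print(gradual_support(rows, [("X", "+"), ("Y", "-")]))  # 5 of 6 pairs agree
```

Enumerating all pairs costs quadratic time per candidate pattern, which hints at why naive approaches struggle on real-world databases.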