
Alireza Ahadi

Biography

My primary research interests are in the cognitive development of the novice computer programmer, and bioinformatics.

Professional

Reviewer for ICER, ACE, ITiCSE, SIGCSE, TOCE, Scientific Reports (Nature), and PLoS ONE.
Academic Mentor, School of Software
Information Technology
 

Research Interests

My primary field of research is machine learning and knowledge discovery on large-scale datasets, especially in the context of computer science education research. However, I am also very passionate about bioinformatics. I am currently involved with predicting microRNA targets on coding and non-coding genes.

Bioinformatics, Data analytics, Databases, Programming Languages, Data Structures and Algorithms (DSA), and Fundamentals of Software Development (FSD).

Conferences

Ahadi, A., Lal, S., Leinonen, J., Hellas, A. & Lister, R. 2017, 'Performance and Consistency in Learning to Program', Australasian Computing Education Conference, Geelong.
Performance and consistency play a large role in learning. Decreasing the effort that one invests into course work may have short-term benefits such as reduced stress. However, as courses progress, neglected work accumulates and may cause challenges with learning the course content at hand. In this work, we analyze students' performance and consistency with programming assignments in an introductory programming course. We study how performance, when measured through progress in course assignments, evolves throughout the course, study weekly fluctuations in students' work consistency, and contrast this with students' performance in the course final exam. Our results indicate that whilst fluctuations in students' weekly performance do not distinguish poorly performing students from well-performing students with high accuracy, more accurate results can be achieved when focusing on the performance of students on individual assignments, which could be used for identifying struggling students who are at risk of dropping out of their studies.
Castro-Wunsch, K., Ahadi, A. & Petersen, A. 2017, 'Evaluating Neural Networks as a Method for Identifying Students in Need of Assistance', SIGCSE technical symposium on computer science education, Washington, USA.
Course instructors need to be able to identify students in need of assistance as early in the course as possible. Recent work has suggested that machine learning approaches applied to snapshots of small programming exercises may be an effective solution to this problem. However, these results have been obtained using data from a single institution, and prior work using features extracted from student code has been highly sensitive to differences in context. This work provides two contributions: first, a partial reproduction of previously published results, but in a different context, and second, an exploration of the efficacy of neural networks in solving this problem. Our findings confirm the importance of two features (the number of steps required to solve a problem and the correctness of key problems), indicate that machine learning techniques are relatively stable across contexts (both across terms in a single course and across courses), and suggest that neural-network-based approaches are as effective as the best Bayesian and decision tree methods. Furthermore, neural networks can be tuned to be reliably pessimistic, so they may serve a complementary role in solving the problem of identifying students who need assistance.
Ahadi, A., Behbood, V., Lister, R., Prior, J. & Vihavainen, A. 2016, 'Students' Syntactic Mistakes in Writing Seven Different Types of SQL Queries and its Application to Predicting Students' Success', Proceedings of the 47th ACM Technical Symposium on Computing Science Education, Special Interest Group in Computer Science Education, ACM, Memphis, Tennessee, pp. 401-406.
The computing education community has studied extensively the errors of novice programmers. In contrast, little attention has been given to students' mistakes in writing SQL statements. This paper represents the first large-scale quantitative analysis of students' syntactic mistakes in writing different types of SQL queries. Over 160 thousand snapshots of SQL queries were collected from over 2000 students across eight years. We describe the most common types of syntactic errors that students make. We also describe our development of an automatic classifier with an overall accuracy of 0.78 for predicting student performance in writing SQL queries.
Ihantola, P., Vihavainen, A., Ahadi, A., Butler, M., Börstler, J., Edwards, S.H., Isohanni, E., Korhonen, A., Petersen, A., Rivers, K., Rubio, M.A., Sheard, J., Skupas, B., Spacco, J., Szabo, C. & Toll, D. 2015, 'Educational Data Mining and Learning Analytics in Programming: Literature Review and Case Studies', ITiCSE-WGP'15: Proceedings of the 2015 ITiCSE on Working Group Reports, Innovation and Technology in Computer Science Education, Association for Computing Machinery, Lithuania, pp. 41-63.
Educational data mining and learning analytics promise better understanding of student behavior and knowledge, as well as new information on the tacit factors that contribute to student actions. This knowledge can be used to inform decisions related to course and tool design and pedagogy, and to further engage students and guide those at risk of failure. This working group report provides an overview of the body of knowledge regarding the use of educational data mining and learning analytics focused on the teaching and learning of programming. In a literature survey on mining students' programming processes for 2005-2015, we observe a significant increase in work related to the field. However, the majority of the studies focus on simplistic metric analysis and are conducted within a single institution and a single course. This indicates the existence of further avenues of research and a critical need for validation and replication to better understand the various contributing factors and the reasons why certain results occur. We introduce a novel taxonomy to analyse replicating studies and discuss the importance of replicating and reproducing previous work. We describe what is the state of the art in collecting and sharing programming data. To better understand the challenges involved in replicating or reproducing existing studies, we report our experiences from three case studies using programming data. Finally, we present a discussion of future directions for the education and research community.
Ahadi, A., Behbood, V., Prior, J. & Lister, R. 2016, 'Students' Semantic Mistakes in Writing Seven Different Types of SQL Queries', Innovation and Technology in Computer Science Education, Peru.
Leinonen, J., Longi, K., Klami, A., Ahadi, A. & Vihavainen, A. 2016, 'Typing Patterns and Authentication in Practical Programming Exams', ITiCSE'16: Proceedings of the 2016 ACM Conference on Innovation and Technology in Computer Science Education, Proceedings of the 2016 ACM Conference on Innovation and Technology in Computer Science Education, Association for Computing Machinery, Inc, Peru, pp. 160-165.
In traditional programming courses, students have usually been at least partly graded using pen and paper exams. One of the problems related to such exams is that they only partially connect to the practice conducted within such courses. Testing students in a more practical environment has been constrained due to the limited resources that are needed, for example, for authentication. In this work, we study whether students in a programming course can be identified in an exam setting based solely on their typing patterns. We replicate an earlier study that indicated that keystroke analysis can be used for identifying programmers. Then, we examine how a controlled machine examination setting affects the identification accuracy, i.e. if students can be identified reliably in a machine exam based on typing profiles built with data from students' programming assignments from a course. Finally, we investigate the identification accuracy in an uncontrolled machine exam, where students can complete the exam at any time using any computer they want. Our results indicate that even though the identification accuracy deteriorates when identifying students in an exam, the accuracy is high enough to reliably identify students if the identification is not required to be exact, but top k closest matches are regarded as correct.
Ahadi, A., Lister, R.F. & Vihavainen, A. 2016, 'On the Number of Attempts Students Made on Some Online Programming Exercises During Semester and their Subsequent Performance on Final Exam Questions', Annual Conference on Innovation and Technology in Computer Science Education, ITiCSE, Annual Conference on Innovation and Technology in Computer Science Education, ITiCSE, Association for Computing Machinery, Arequipa, Peru, pp. 218-223.
This paper explores the relationship between student performance on online programming exercises completed during semester and subsequent student performance on a final exam. We introduce an approach that combines whether or not a student produced a correct solution to an online exercise with information on the number of attempts at the exercise submitted by the student. We use data collected from students in an introductory Java course to assess the value of this approach. We compare the approach that utilizes the number of attempts to an approach that simply considers whether or not a student produced a correct solution to each exercise. We found that the results for the method that utilizes the number of attempts correlate better with performance on a final exam.
Ahadi, A. 2016, 'Early Identification of Novice Programmers' Challenges in Coding Using Machine Learning Techniques', Proceedings of the 2016 ACM Conference on International Computing Education Research, International Computing Education Research, ACM, Melbourne, Australia, pp. 263-264.
It is well known that many first year undergraduate university students struggle with learning to program. Educational Data Mining (EDM) applies machine learning and statistics to information generated from educational settings. In this PhD project, EDM is used to study first semester novice programmers, using data collected from students as they work on computers to complete their normal weekly laboratory exercises. Analysis of the generated snapshots has shown the potential for early identification of students who later struggle in the course. The aim of this study is to propose a method for early identification of "at risk" students while providing suggestions on how they can improve their coding style. This PhD project is within its final year.
Ahadi, A., Hellas, A., Ihantola, P., Korhonen, A. & Petersen, A. 2016, 'Replication in Computing Education Research: Researcher Attitudes and Experiences', Koli International Conference on Computing Education Research, Koli, Finland.
Reproducibility is a core principle of the scientific method. However, several scientific disciplines have suffered crises in confidence caused, in large part, by attitudes toward replication. This work reports on the value the computing education research community associates with studies that aim to replicate, reproduce or repeat earlier research. The results were obtained from a large-scale (n=73) survey of computing education researchers. An analysis of the responses confirms that researchers in our field hold many of the same biases as those in other fields experiencing a crisis in replication. In particular, researchers agree that original works -- novel works that report new phenomena -- have more impact and are more prestigious. They also agree that originality is an important criterion for accepting a paper, making such work more likely to be published. Furthermore, while the respondents agree that verifiability is a desirable property of published work, they doubt this standard is widely met in the computing education field and, in addition, are not eager to do the work of verifying others' work themselves.
Ahadi, A., Prior, J., Behbood, V. & Lister, R. 2015, 'A Quantitative Study of the Relative Difficulty for Novices of Writing Seven Different Types of SQL Queries', Proceedings of the 2015 ACM Conference on Innovation and Technology in Computer Science Education, 2015 ACM Conference on Innovation and Technology in Computer Science Education, ACM, Lithuania, pp. 201-206.
This paper presents a quantitative analysis of data collected by an online testing system for SQL "select" queries. The data was collected from almost one thousand students, over eight years. We examine which types of queries our students found harder to write. The seven types of SQL queries studied are: simple queries on one table; grouping, both with and without "having"; natural joins; simple and correlated sub-queries; and self-joins. The order of queries in the preceding sentence reflects the order of student difficulty we see in our data.
Ahadi, A., Lister, R., Haapala, H. & Vihavainen, A. 2015, 'Exploring Machine Learning Methods to Automatically Identify Students in Need of Assistance', Proceedings of the Eleventh Annual International Conference on International Computing Education Research, ACM, pp. 121-130.
Teague, D., Ahadi, A. & Lister, R. 2015, 'Mired in the Web: Vignettes from Charlotte and Other Novice Programmers', 17th Australasian Computing Education Conference (ACE 2015), 17th Australasian Computing Education Conference, ACS, Sydney, Australia, pp. 165-174.
Ahadi and Lister (2013) found that many of their introductory programming students had fallen behind as early as week 3 of semester, and those students often then stayed behind. Our later work (Ahadi, Lister and Teague 2014) supported that finding, for students at another institution. In this paper, we go one step further than those earlier studies by observing a number of students as they complete programming tasks while thinking aloud. We describe the types of inconsistencies students manifest, which are often not evident on analysis of conventional written tests. We again interpret our findings using neo-Piagetian theory. We conclude with some thoughts on the pedagogical implications of our research results.
Ahadi, A., Teague, D. & Lister, R.F. 2014, 'Falling Behind Early and Staying Behind When Learning to Program', Psychology of Programming Interest Group Annual Conference, Darwin College, Brighton, United Kingdom, pp. 77-88.
We have performed a study of novice programmers, using students at two different institutions, who were learning different programming languages. Influenced by the work of Dehnadi and Bornat, we gave our students a simple test, of our own devising, in their first three weeks of formal instruction in programming. That test only required knowledge of assignment statements. We found a wide performance difference between our two student cohorts. Furthermore, our test was a good indication of how students performed about 10 weeks later, in their final programming exam. We interpret our results in terms of our neo-Piagetian theory of how novices learn to program.
Ahadi, A. 2014, 'Applying Educational Data Mining to the Study of the Novice Programmer, within a Neo-Piagetian Theoretical Perspective', Psychology of Programming Interest Group.
Teague, D., Corney, M., Ahadi, A. & Lister, R.F. 2013, 'A Qualitative Think Aloud Study of the Early Neo-Piagetian Stages of Reasoning in Novice Programmers', Volume 136 - Fifteenth Australasian Computing Education Conference, Australasian Computing Education Conference, Australian Computer Society Inc, Adelaide, Australia, pp. 87-96.
Recent research indicates that some of the difficulties faced by novice programmers are manifested very early in their learning. In this paper, we present data from think aloud studies that demonstrate the nature of those difficulties. In the think alouds, novices were required to complete short programming tasks which involved either hand executing ("tracing") a short piece of code, or writing a single sentence describing the purpose of the code. We interpret our think aloud data within a neo-Piagetian framework, demonstrating that some novices reason at the sensorimotor and preoperational stages, not at the higher concrete operational stage at which most instruction is implicitly targeted.
Ahadi, A. & Lister, R.F. 2013, 'Geek Genes, Prior Knowledge, Stumbling Points and Learning Edge Momentum: Parts of the One Elephant?', Proceedings of the 2013 ACM Conference on International Computing Education Research ICER, ACM Conference on International Computing Education Research, ACM, San Diego, CA, USA, pp. 123-128.
Computing academics report bimodal grade distributions in their CS1 classes. Some academics believe that such a distribution is due to there being an innate talent for programming, a geek gene, which some students have, while other students do not have it. Robins introduced the concept of learning edge momentum, which offers an alternative explanation for the purported bimodal grade distribution. In this paper, we analyze empirical data from a real introductory programming class, looking for evidence of geek genes, learning edge momentum and other possible factors.
Corney, M., Teague, D., Ahadi, A. & Lister, R.F. 2012, 'Some Empirical Results for Neo-Piagetian Reasoning in Novice Programmers and the Relationship to Code Explanation Questions', Fourteenth Australasian Computing Education Conference (ACE2012), Australasian Computing Education Conference, Australian Computer Society Inc, Melbourne, Australia, pp. 77-86.
Recent research on novice programmers has suggested that they pass through neo-Piagetian stages: sensorimotor, preoperational, and concrete operational stages, before eventually reaching programming competence at the formal operational stage. This paper presents empirical results in support of this neo-Piagetian perspective. The major novel contributions of this paper are empirical results for some exam questions aimed at testing novices for the concrete operational abilities to reason with quantities that are conserved, processes that are reversible, and properties that hold under transitive inference. While the questions we used had been proposed earlier by Lister, he did not present any data for how students performed on these questions. Our empirical results demonstrate that many students struggle to answer these problems, despite the apparent simplicity of these problems. We then compare student performance on these questions with their performance on six "explain in plain English" questions.
Teague, D., Corney, M., Ahadi, A. & Lister, R.F. 2012, 'Swapping as the Hello World of Relational Reasoning: Replications, Reflections and Extensions', Fourteenth Australasian Computing Education Conference (ACE2012), Australasian Computing Education Conference, Australian Computer Society Inc, Melbourne, Australia, pp. 87-94.
At the previous conference in this series, Corney, Lister and Teague presented research results showing relationships between code writing, code tracing and code explaining, from as early as week 3 of semester. We concluded that the problems some students face in learning to program start very early in the semester. In this paper we report on our replication of that experiment, at two institutions, where one is the same as the original institution. In some cases, we did not find the same relationship between explaining code and writing code, but we believe this was because our teachers discussed the code in lectures between the two tests. Apart from that exception, our replication results at both institutions are consistent with our original study.
Teague, D., Corney, M., Fidge, C.F., Roggenkamp, M., Ahadi, A. & Lister, R.F. 2012, 'Using Neo-Piagetian Theory, Formative In-Class Tests and Think Alouds to Better Understand Student Thinking: A Preliminary Report on Computer Programming', Proceedings of the 23rd Annual Conference for the Australasian Association for Engineering Education - The Profession of Engineering Education: Advancing Teaching, Research and Careers, 23rd Annual Conference for the Australasian Association for Engineering Education - The Profession of Engineering Education: Advancing Teaching, Research and Careers, Swinburne University of Technology, Melbourne, Australia, pp. 1-9.
BACKGROUND Around the world, and for many years, students have struggled to learn to program computers. The reasons for this are poorly understood by their lecturers. PURPOSE When the intuitions of many skilled lecturers have failed to solve a pedagogical problem, then a systematic research programme is needed. We have implemented a research programme based on three elements: (1) a theory that provides an organising conceptual framework, (2) representative data on how the class performs on formative assessment tasks, and (3) microgenetic data from one-on-one think aloud sessions, to establish why students struggle with some of the formative tasks. DESIGN / METHOD We have adopted neo-Piagetian theory as our organising framework. We collect data by two methods. The first method is a series of small tests that we have students complete during lectures, at roughly two week intervals. These tests did not count toward the students' final grade, which affords us the opportunity to ask unusual questions that probe at the boundaries of student understanding. Think aloud sessions are the second data collection method, in which a small number of selected, volunteer students attempt problems similar to the problems in the in-class tests. RESULTS The results in this paper serve to illustrate our research programme rather than answer a single, tight research question. These illustrative results focus upon one very simple type of programming question that was put to students, very early in their first programming subject. That simple question required students to write code to swap the values in two variables (e.g., temp = a; a = b; b = temp). The common intuition among programming lecturers is that students should be able to easily solve such a problem by, say, week 4 of semester. On the contrary, we found that 40% of students in a class at one of the participating institutions answered this question incorrectly in week 4 of semester. CONCLUSIONS What is emerging from this res...

Journal articles

Hatoum, D., Yagoub, D., Ahadi, A., Nassif, N.T. & McGowan, E.M. 2017, 'Annexin/S100A protein family regulation through p14ARF-p53 activation: A role in cell survival and predicting treatment outcomes in breast cancer', PLoS ONE, vol. 12, no. 1.
© 2017 Hatoum et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. The annexin family and S100A associated proteins are important regulators of diverse calcium-dependent cellular processes including cell division, growth regulation and apoptosis. Dysfunction of individual annexin and S100A proteins is associated with cancer progression, metastasis and cancer drug resistance. This manuscript describes the novel finding of differential regulation of the annexin and S100A family of proteins by activation of p53 in breast cancer cells. Additionally, the observed differential regulation is found to be beneficial to the survival of breast cancer cells and to influence treatment efficacy. We have used unbiased, quantitative proteomics to determine the proteomic changes occurring post p14ARF-p53 activation in estrogen receptor (ER) breast cancer cells. In this report we identified differential regulation of the annexin/S100A family, through unique peptide recognition at the N-terminal regions, demonstrating p14ARF-p53 is a central orchestrator of the annexin/S100A family of calcium regulators in favor of pro-survival functions in the breast cancer cell. This regulation was found to be cell-type specific. Retrospective human breast cancer studies have demonstrated that tumors with functional wild type p53 (p53wt) respond poorly to some chemotherapy agents compared to tumors with a non-functional p53. Given that modulation of calcium signaling has been demonstrated to change sensitivity of chemotherapeutic agents to apoptotic signals, in principle, we explored the paradigm of how p53 modulation of calcium regulators in ER+ breast cancer patients impacts and influences therapeutic outcomes.
Ahadi, A., Lister, R. & Hellas, A. 2017, 'A Contingency Table Derived Methodology for Analyzing Course Data', ACM Transactions on Computing Education.
Ahadi, A., Hutvagner, G. & Connerty, P. 2016, 'RNA Binding Proteins in the miRNA Pathway', International Journal of Molecular Sciences, vol. 17, no. 31, pp. 1-16.
MicroRNAs (miRNAs) are short (~22 nucleotides (nt)) ribonucleic acids which post-transcriptionally regulate gene expression. miRNAs are key regulators of all cellular processes, and the correct expression of miRNAs in an organism is crucial for proper development and cellular function. As a result, the miRNA biogenesis pathway is highly regulated. In this review, we outline the basic steps of miRNA biogenesis and miRNA-mediated gene regulation focusing on the role of RNA binding proteins (RBPs). We also describe multiple mechanisms that regulate the canonical miRNA pathway, which depends on a wide range of RBPs. Moreover, we hypothesise that the interaction between miRNA regulation and RBPs is potentially more widespread based on the analysis of available high-throughput datasets.
Ahadi, A., Brennan, S., Kennedy, P., Hutvagner, G. & Tran, N. 2016, 'Long non-coding RNAs harboring miRNA seed regions are enriched in prostate cancer exosomes', Scientific Reports, vol. 6, pp. 1-14.
Long non-coding RNAs (lncRNAs) form the largest transcript class in the human transcriptome. These lncRNAs are expressed not only in the cells, but they are also present in the cell-derived extracellular vesicles such as exosomes. The function of these lncRNAs in cancer biology is not entirely clear, but they appear to be modulators of gene expression. In this study, we characterize the expression of lncRNAs in several prostate cancer exosomes and their parental cell lines. We show that certain lncRNAs are enriched in cancer exosomes with the overall expression signatures varying across cell lines. These exosomal lncRNAs are themselves enriched for miRNA seeds with a preference for let-7 family members as well as miR-17, miR-18a, miR-20a, miR-93 and miR-106b. The enrichment of miRNA seed regions in exosomal lncRNAs is matched with a concomitant high expression of the same miRNA. In addition, the exosomal lncRNAs also showed an over-representation of RNA binding protein binding motifs. The two most common motifs belonged to ELAVL1 and RBMX. Given the enrichment of miRNA and RBP sites on exosomal lncRNAs, their interplay may suggest a possible function in prostate cancer carcinogenesis.
Ahadi, A., Khoury, S., Tran, N. & Zhang, X. 2016, 'Expression of microRNAs in HPV negative tonsil cancers and their regulation of PDCD4', Genomics Data, vol. 8, pp. 93-96.
Global rates of tonsil cancer have been increasing since the turn of the millennium; however, we still have a limited understanding of the genes and pathways which control this disease. This array dataset, which is linked to our publication (Zhang et al., 2015), describes the profiling of human miRNAs in tonsil and normal adjacent tissues. With this dataset, we identified a list of microRNAs (miRNAs) which were highly over-represented in tonsil cancers and showed that several miRNAs were able to regulate the tumour suppressor PDCD4 in a temporal manner. The dataset has been deposited into Gene Expression Omnibus (GSE75630).
Ahadi, A., Khoury, S., Losseva, M. & Tran, N. 2016, 'A comparative analysis of lncRNAs in prostate cancer exosomes and their parental cell lines', Genomics Data, vol. 9, pp. 7-9.
Prostate cancer is the second leading cancer in men worldwide. Due to its heterogeneous nature, a considerable amount of research effort has been dedicated to identifying effective clinical biomarkers with a focus on proteins, messenger RNA and microRNAs. However, there is limited data on the role and expression of long noncoding RNAs (lncRNAs) in prostate cancer exosomes. This array dataset, which is linked to our publication, describes the profiling of human lncRNAs in prostate cancer and their exosomes from five different cell lines. From this dataset, we identified a list of statistically significant prostate cancer lncRNAs which are differentially expressed in the exosomes compared to their parent cell lines. This dataset has been deposited into Gene Expression Omnibus (GSE81034).
Ahadi, A., Sablok, G. & Hutvagner, G. 2016, 'miRTar2GO: a novel rule-based model learning method for cell line specific microRNA target prediction that integrates Ago2 CLIP-Seq and validated microRNA–target interaction data', Nucleic Acids Research, vol. 45, no. 6, pp. 1-10.
MicroRNAs (miRNAs) are 19-22 nucleotide (nt) long regulatory RNAs that regulate gene expression by recognizing and binding to complementary sequences on mRNAs. The key step in revealing the function of a miRNA is the identification of miRNA target genes. Recent biochemical advances including PAR-CLIP and HITS-CLIP allow for improved miRNA target predictions and are widely used to validate miRNA targets. Here, we present miRTar2GO, which is a model trained on the common rules of miRNA-target interactions, Argonaute (Ago) CLIP-Seq data and experimentally validated miRNA target interactions. miRTar2GO is designed to predict miRNA target sites using more relaxed miRNA-target binding characteristics. More importantly, miRTar2GO allows for the prediction of cell-type specific miRNA targets. We have evaluated miRTar2GO against other widely used miRNA target prediction algorithms and demonstrated that miRTar2GO produced significantly higher F1 and G scores. Target predictions, binding specifications, results of the pathway analysis and gene ontology enrichment of miRNA targets are freely available at http://www.mirtar2go.org.