Big Data Visual Analytics
Business Data Visual Modelling
Property Data Visual Mining
Clinic Data Visualization
System and Network Security
Subjects taught at UTS:
Property Data Visualization and Analytics (FDAB, 16238)
Property Market Research and Analysis (FDAB, 16643)
Data Visualization and Visual Analytics (FEIT, 32146)
Internet Programming (FEIT, 32516)
Li, G, Zhang, Y, Dong, Y, Liang, J, Zhang, J, Wang, J, McGuffin, MJ & Yuan, X 2020, 'BarcodeTree: Scalable Comparison of Multiple Hierarchies', IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 1, pp. 1022-1032.
We propose BarcodeTree (BCT), a novel visualization technique for comparing topological structures and node attribute values of multiple trees. BCT can provide an overview of one hundred shallow and stable trees simultaneously, without aggregating individual nodes. Each BCT is shown within a single row using a style similar to a barcode, allowing trees to be stacked vertically with matching nodes aligned horizontally to ease comparison and maintain space efficiency. We design several visual cues and interactive techniques to help users understand the topological structure and compare trees. In an experiment comparing two variants of BCT with icicle plots, the results suggest that BCTs make it easier to visually compare trees by reducing the vertical distance between different trees. We also present two case studies involving a dataset of hundreds of trees to demonstrate BCT's utility.
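The row-per-tree layout described above can be illustrated with a minimal text sketch. It is not the authors' implementation: the nested-dict tree format, the `GLYPHS` depth encoding and the function names are assumptions made for illustration only. Each node gets a fixed column from a preorder walk of the union of all trees, so matching nodes line up vertically and missing nodes show as gaps.

```python
# Hypothetical depth glyphs: depth 1 -> '#', 2 -> '=', 3 -> '-', deeper -> '.'
GLYPHS = "#=-."


def paths(tree, prefix=()):
    """Yield every node as its root-to-node path, in preorder (labels sorted)."""
    for label in sorted(tree):
        path = prefix + (label,)
        yield path
        yield from paths(tree[label], path)


def union_layout(trees):
    """Preorder walk of the union of all trees; fixes one column per node."""
    merged = {}
    for tree in trees:
        for path in paths(tree):
            node = merged
            for label in path:
                node = node.setdefault(label, {})
    return list(paths(merged))


def barcode_rows(trees):
    """Render each tree as one row; absent nodes leave a blank column."""
    layout = union_layout(trees)
    rows = []
    for tree in trees:
        present = set(paths(tree))
        rows.append("".join(
            GLYPHS[min(len(p), len(GLYPHS)) - 1] if p in present else " "
            for p in layout
        ))
    return rows


if __name__ == "__main__":
    t1 = {"root": {"a": {"a1": {}, "a2": {}}, "b": {}}}
    t2 = {"root": {"a": {"a1": {}}, "c": {}}}
    for row in barcode_rows([t1, t2]):
        print(repr(row))
```

Stacking the printed rows makes structural differences visible as gaps in otherwise aligned columns, which is the comparison idea the abstract describes.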
Qian, J & Zhang, J 2017, 'Application and investigation of big data visualization method for hospital outpatient management of antibiotics', Chinese Journal of Hospital Pharmacy, vol. 37, no. 18, pp. 1850-1856.
OBJECTIVE To explore big data visualization for the management of outpatient antibiotics. METHODS Antibiotic prescriptions from the 2015 outpatient section were analyzed using big data visualization and multidimensional analytics to identify relationships and multidimensional patterns in hospital outpatient antimicrobial drug usage. RESULTS Outpatient antibiotic usage in 2015 met the overall requirements and antimicrobial usage was effectively controlled: 92.2% of doctors in the outpatient department kept their total antimicrobial prescriptions under RMB 50 000 for the whole year, although a few departments and doctors still overused antimicrobial drugs. CONCLUSION Big data visualization and analytical techniques can effectively improve the reliability and accuracy of antimicrobial drug management in clinical practice.
Keywords: big data visualization outpatient antibiotics
Visualization and interaction of multidimensional data are challenges in visual data analytics, which requires optimized solutions that integrate the display, exploration and analytical reasoning of data into one visual pipeline for human-centered data analysis and interpretation. Although it is considered one of the most popular techniques for visualizing and analyzing multidimensional data, parallel coordinates visualization suffers from visual clutter and computational complexity, as do other visualization methods when the volume of data to be visualized increases. One straightforward way to address these problems is to change the ordering of the axes to minimize visual clutter. However, optimizing the ordering of axes is an NP-complete problem. In this paper, two axis re-ordering methods are proposed for parallel coordinates visualization: (1) a contribution-based method and (2) a similarity-based method.
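As an illustration of what a similarity-based axis ordering might look like, here is a minimal sketch. The abstract does not state which similarity measure or search strategy the paper uses, so two assumptions are made: similarity is the absolute Pearson correlation between two axes, and the ordering is built with a greedy nearest-neighbour heuristic, which is only an approximation since the exact problem is NP-complete. The names `pearson` and `order_axes` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # A constant axis has no defined correlation; treat it as dissimilar.
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def order_axes(columns):
    """Greedy ordering: keep the first axis, then repeatedly append the
    unplaced axis most similar (by |correlation|) to the last placed one."""
    names = list(columns)
    order = [names[0]]
    remaining = set(names[1:])
    while remaining:
        last = columns[order[-1]]
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining),
                   key=lambda name: abs(pearson(last, columns[name])))
        order.append(best)
        remaining.remove(best)
    return order


if __name__ == "__main__":
    data = {
        "a": [1, 2, 3, 4],
        "b": [4, 1, 3, 2],
        "c": [2, 4, 6, 8.5],
    }
    print(order_axes(data))
```

Placing strongly correlated axes next to each other tends to produce near-parallel line bundles between them, which is one common motivation for similarity-driven ordering.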
Big Data is composed of text, image, video, audio, mobile or other forms of data collected from multiple datasets, and is rapidly growing in size and complexity. It has created a huge volume of multidimensional data within a very short time period. This raises several new challenges, including how to classify Big Data for multiple datasets, how to analyze Big Data for different forms of data, and how to visualize Big Data without loss of information. In this paper, we extend our 5Ws density methods to Big Data behaviour analysis and visualization. Our approach classifies Big Data into the 5Ws dimensions based on the data behaviours, and then creates the 5Ws densities to measure Big Data patterns across multiple datasets for any form of data. We also establish non-dimensional data axes as additional parallel axes for Big Data visualization. The experimental results show that the proposed model significantly improves the accuracy of Big Data visualization and has large potential benefits and applications.
Zhang, J & Huang, ML 2016, 'Density approach: A new model for BigData analysis and visualization', Concurrency and Computation: Practice and Experience, vol. 28, no. 3, pp. 661-673.
Copyright © 2014 John Wiley & Sons, Ltd. In this paper, we extended our density model to BigData analysis and visualization. BigData, which contains images, videos, texts, audio files and other forms of data collected from multiple datasets, is difficult to process and visualize using traditional database management and visualization tools. The challenges are in representing multiple datasets and illustrating and visualizing data patterns to meet business, government and organization needs. We have established the 5Ws density model which uses the 5Ws dimensions for BigData analysis and visualization. The 5Ws dimensions are what the data contain, why the data were transferred, where the data came from, when the data occurred, who received the data and how the data were transferred. According to the network log dataset, an example of BigData, each data incident can be classified into these 5Ws dimensions. The network log dataset ISCX2012 is tested throughout our model. This new model not only classifies network attributes and patterns but also establishes density patterns that provide more analytical features for BigData analysis and visualization. The experimental result shows that this new model with clustered visualization can be efficiently used for BigData analysis and network intrusion detection. Concurrency and Computation: Practice and Experience, 2014.
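The idea of classifying each data incident into the 5Ws dimensions and deriving density patterns can be sketched as follows. The abstract does not give the density formula, so this sketch assumes that the density of a value is simply the share of records carrying that value in a given dimension. The record fields and values below are hypothetical, not taken from ISCX2012.

```python
from collections import Counter

# Dimension names follow the abstract: what, why, where, when, who, how.
DIMENSIONS = ("what", "why", "where", "when", "who", "how")


def densities(records, dimension):
    """Share of records per distinct value of one dimension (assumed formula)."""
    counts = Counter(record[dimension] for record in records)
    total = len(records)
    return {value: count / total for value, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical network-log incidents already classified into the dimensions.
    log = [
        {"what": "http", "why": "request", "where": "10.0.0.1",
         "when": "09:00", "who": "10.0.0.9", "how": "tcp"},
        {"what": "http", "why": "request", "where": "10.0.0.1",
         "when": "09:00", "who": "10.0.0.9", "how": "tcp"},
        {"what": "dns", "why": "lookup", "where": "10.0.0.1",
         "when": "09:01", "who": "10.0.0.8", "how": "udp"},
        {"what": "http", "why": "request", "where": "10.0.0.2",
         "when": "09:01", "who": "10.0.0.9", "how": "tcp"},
    ]
    for dim in DIMENSIONS:
        print(dim, densities(log, dim))
```

Under this assumption, a source ("where") or receiver ("who") with an unusually high density stands out immediately, which is the kind of pattern a density axis in parallel coordinates would expose.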
Zhang, J, Huang, ML & Meng, Z-P 2015, 'Visual Analytics for BigData Variety and Its Behaviours', Computer Science and Information Systems, vol. 12, no. 4, pp. 1171-1191.
Zhang, J, Huang, M & Hoang, DB 2013, 'Visual analytics for intrusion detection in spams', International Journal of Grid and Utility Computing, vol. 4, no. 2/3, pp. 178-186.
Spam email attacks are increasing at an alarming rate and have become more and more cunning in nature. This has created a need for visual spam email analysis within an intrusion detection system to identify these attacks. The challenges are how to increase the accuracy of detection and how to visualise large volumes of spam email to better understand the analysis results and identify email attacks. This paper proposes a DensityWeight model that strengthens and extends the system's capacity for analysing network attacks in spam emails, including DDoS attacks. An interactive visual clustering method, DATU, is introduced to classify and display spam emails. The experimental results show that the proposed model improves the accuracy of intrusion detection and provides a better understanding of the nature of spam email attacks through the network.
Ge, XJ & Zhang, J 2019, 'Analyse Property Data Through Visualisation', Proceedings from the PRRES Conference - 2019, 25th Annual Pacific-RIM Real Estate Society Conference, PACIFIC RIM REAL ESTATE SOCIETY (PRRES), Melbourne, Australia, pp. 1-1.
People's activities create 2.5 quintillion bytes of data every day (Marr, 2018). Examples of such activities include shopping, sleeping, and purchasing, selling or leasing property. Large datasets are usually high-dimensional and multivariate. Traditional text-based data may be able to record the facts of activities, but the hidden story behind the data may not be discovered. Data visualisation is an instrument for reasoning about quantitative information and allows us to analyse data behaviours by understanding data patterns, trends and correlations that could not be detected in traditional text-based data. This paper focuses on analysing property data for six suburbs in Sydney using visualisation. Data with 31 elements from the three-year censuses were used to create visual patterns for analysis. Parallel coordinates and dashboard techniques are applied to visualise data for the six selected suburbs. The results suggest that well-designed data graphics are a powerful tool, and that property data visualisation provides visual access to huge amounts of data in easily digestible form.
Zhang, J & Huang, ML 2016, '2D Approach Measuring Multidimensional Data Pattern in Big Data Visualization', Proceedings of 2016 IEEE International Conference on Big Data Analysis, IEEE International Conference on Big Data Analysis (ICBDA), IEEE, Hangzhou, China, pp. 194-199.
Big Data, comprising structured and unstructured data, contains millions of attributes in multiple dimensions. This raises three issues: 1) how to measure structured and unstructured multidimensional data patterns for Big Data analysis; 2) how to display multidimensional data patterns on a normal-sized screen; 3) how to optimize the data attributes in Big Data visualization. In this work, we visually analyze Big Data variety based on the complexity of multidimensional data. First, we introduce the 2D dimension, which divides the multidimensional dataset into 2D data pattern subsets, and then establish the 2D-Ratio algorithm to measure the 2D dimension in multiple data patterns. Second, we create two additional parallel axes using 2D-Ratio to compare 2D dimensional patterns for visualization. Third, dimension clustering and shrunk attributes are introduced in 2D-Ratio parallel coordinates to reduce data overcrowding. The experiment shows that our model can be efficiently and accurately used for Big Data analysis and visualization.
Wang, W, Huang, ML, Zhang, J & Lai, W 2015, 'Detecting Criminal Relationships through SOM Visual Analytics', Proceedings of the 19th International Conference Information Visualization, IEEE Information Visualization Conference, IEEE, Barcelona, pp. 316-321.
Feature analysis is beneficial to the detection of anonymous criminals in digital forensics, including people and activities, where a vast number of features extracted from databases are involved. Not all extracted features are continuous or distinct; some are discrete or share the same value as others. We found that using visual analytics to select features for forensic investigations not only reduces the time needed for selection, but also clearly displays slight changes in features and criminals, as well as the relationships between them, in order to find targets that differ significantly from others and to predict the more active features to be used in the future. Experiments show that visual feature analysis can help to obtain the desired results quickly and clearly.
Zhang, J & Huang, M 2015, 'A New Analytics Model for Large Scale Multidimensional Data Visualization', Volume 9106 of the series Lecture Notes in Computer Science (LNCS), The International Conference on Cloud Computing and Big Data, Springer, Huangshan, China, pp. 55-71.
With the rise of Big Data, the challenge for modern multidimensional data analysis and visualization is that data grows very quickly in size and complexity. In this paper, we first present a classification method called the 5Ws Dimensions, which classifies multidimensional data into the 5Ws definitions. The 5Ws Dimensions can be applied to multiple datasets such as text datasets, audio datasets and video datasets. Second, we establish a Pair-Density model to analyze data patterns and compare multidimensional data on the 5Ws patterns. Third, we create two additional parallel axes using pair-density for visualization. The attributes have been shrunk to reduce data overcrowding in pair-density parallel coordinates. This achieves more than 80% clutter reduction without loss of information. The experiment shows that our model can be efficiently used for Big Data analysis and visualization.
Huang, M & Zhang, J 2013, 'Visual Analysis and Detection of Network Flood Attacks through Two-Layer Density', Proc. of 3rd IEEE Int. Conference on Computer Science and Network Technology (ICCSNT-13), International Conference on Computer Science and Network Technology, IEEE, Dalian, China, pp. 625-629.
Flood attack patterns vary depending on the network environment. This has necessitated visual analysis within an Intrusion Detection System (IDS) to identify these flood-attack patterns. The challenges are how to increase the accuracy of detection and how to visualize and present flood attack patterns in networks for early detection. In this paper, we propose a Two-Layer density model for flood attack detection. The first density layer describes sending-density and receiving-density in analyzing Internet traffic. The second density layer describes attack-density and normal-density in analyzing local network traffic at a victim site. Several visualization techniques are used to facilitate the detection process. The experiments demonstrate that the Two-Layer density model significantly improves the accuracy of flood attack detection and provides users with a better understanding of the nature of flood attacks.
Wang, WB, Huang, ML, Lu, LF & Zhang, J 2014, 'Improving performance of forensics investigation with parallel coordinates visual analytics', Proceedings - 17th IEEE International Conference on Computational Science and Engineering, CSE 2014, Jointly with 13th IEEE International Conference on Ubiquitous Computing and Communications, IUCC 2014, 13th International Symposium on Pervasive Systems, Algorithms, and Networks, I-SPAN 2014 and 8th International Conference on Frontier of Computer Science and Technology, FCST 2014, International Conference on Computational Science and Engineering, Institute of Electrical and Electronics Engineers Inc., Chengdu, China, pp. 1838-1843.
Computer forensics investigators aim to analyse and present facts through the examination of digital evidence in a short time. As the volume of suspicious data grows, it becomes harder to capture digital evidence within a legally acceptable time. This paper proposes an effective method for reducing redundant investigation time by normalizing the data on hard disk drives (HDD) for computer forensics. We use a visualization technique, parallel coordinates, to analyse data rather than relying on data analysis algorithms alone, and choose a Red-Black tree structure to de-duplicate data. This reduces the time complexity, including the time spent searching, adding and deleting data. We show the advantages of our approach and demonstrate how this method can enhance the efficiency and quality of computer forensics tasks.
Zhang, J & Huang, ML 2013, 'Detecting flood attacks through new density-pattern based approach', Proceedings - 2013 IEEE International Conference on High Performance Computing and Communications, HPCC 2013 and 2013 IEEE International Conference on Embedded and Ubiquitous Computing, EUC 2013, IEEE International Conference on High Performance Computing and Communications, IEEE, Zhangjiajie, China, pp. 246-253.
Flood attacks are common threats to the Internet, which has necessitated visual analysis within an intrusion detection system to identify these attack patterns. The challenges are how to increase the accuracy of detection and how to visualize and present the patterns of flood attacks for early detection. In this paper, we introduce a Two-Density model that contains two coefficients, sending-density and receiving-density, for network traffic analysis during flood attacks. The attack pattern is established based on these two coefficients, which are also displayed in our clustering visualization graph. The experimental results demonstrate that the proposed model significantly improves the detection of flood attacks and provides a better understanding of the nature of flood attacks on networks. © 2013 IEEE.
Zhang, J, Huang, ML, Wang, WB, Lu, LF & Meng, Z-P 2014, 'Big data density analytics using parallel coordinate visualization', Proceedings - 17th IEEE International Conference on Computational Science and Engineering, CSE 2014, Jointly with 13th IEEE International Conference on Ubiquitous Computing and Communications, IUCC 2014, 13th International Symposium on Pervasive Systems, Algorithms, and Networks, I-SPAN 2014 and 8th International Conference on Frontier of Computer Science and Technology, FCST 2014, International Conference on Computational Science and Engineering, Institute of Electrical and Electronics Engineers Inc., Chengdu, China, pp. 1115-1120.
Parallel coordinates are a popular tool for visualizing high-dimensional data and analyzing multivariate data. With the rapid growth of data size and complexity, data clutter in parallel coordinates is a major issue for Big Data visualization. This has given rise to three problems: 1) how to rearrange the parallel axes without loss of data patterns; 2) how to shrink data attributes on each axis without loss of data trends; 3) how to visualize structured and unstructured data patterns for Big Data analysis. In this paper, we introduce the 5Ws dimensions as the parallel axes and establish the 5Ws sending density and receiving density as additional axes for Big Data visualization. Our model not only demonstrates Big Data attributes and patterns, but also reduces data overlapping by up to 80 percent without loss of data patterns. Experiments show that this new model can be efficiently used for Big Data analysis and visualization.
Zhang, J, Meng, Z & Huang, ML 2014, 'BigData visualization: Parallel coordinates using density approach', 2014 2nd International Conference on Systems and Informatics, ICSAI 2014, International Conference on Systems and Informatics, Institute of Electrical and Electronics Engineers Inc., Shanghai, China, pp. 1056-1063.
Information visualization is a very important tool in BigData analytics. BigData, the structured and unstructured data comprising images, videos, texts, audio and other forms of data collected from multiple datasets, is too big, too complex and moves too fast to analyse using traditional methods. This has given rise to two issues: 1) how to reduce multidimensional data without the loss of any data patterns for multiple datasets; 2) how to visualize BigData patterns for analysis. In this paper, we classify the BigData attributes into '5Ws' data dimensions, and then establish a '5Ws' density approach that represents the characteristics of data flow patterns. We use parallel coordinates to display the '5Ws' sending and receiving densities, which provide more analytic features for BigData analysis. The experiment shows that this new model with parallel coordinate visualization can be efficiently used for BigData analysis and visualization.
Zhang, J & Huang, M 2013, '5Ws Model for Big Data Analysis and Visualization', Computational Science and Engineering (CSE), 2013 IEEE 16th International Conference on, International Conference on Computational Science and Engineering, IEEE CS Press, Sydney, Australia, pp. 1021-1028.
Zhang, J & Huang, M 2013, 'Visual Analytics Model for Intrusion Detection in Flood Attack', 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), IEEE, Melbourne, Australia, pp. 277-284.
Flood attacks are a common form of Distributed Denial-of-Service (DDoS) attack on the Internet. This has necessitated visual analysis within an intrusion detection system to identify these attacks. The challenges are how to increase the accuracy of detection and how to visualize and present flood attacks in networks for early detection. In this paper, we introduce three coefficients, which not only classify the behaviors of flood attacks but also measure system performance under those attacks: a) attack-density, which describes the characteristics of a flood attack; b) system workload, which represents the system's capability in handling a flood attack; and c) scalability, which classifies the impact level of the flood attack at the victim site. A visual clustering method is used to display the DDoS flood attacks. The experimental results demonstrate that our new model significantly improves the accuracy of DDoS attack detection and provides a better understanding of the nature of flood attacks on networks.
Huang, M, Zhang, J, Nguyen, Q & Wang, J 2011, 'Visual Clustering of Spam Emails for DDoS Analysis', Proc. of 15th IEEE International Conference on Information Visualization (2011), IEEE Information Visualization Conference, IEEE Computer Society, London, United Kingdom, pp. 65-72.
Network attacks embedded in spam emails are becoming increasingly numerous and sophisticated. This has created a growing need for spam email analysis to identify these attacks. The use of intrusion detection systems has given rise to two further issues: 1) the presentation and understanding of large amounts of spam emails; 2) user-assisted input and quantified adjustment during the analysis process. In this paper we introduce a new analytical model that uses two coefficient vectors, 'density' and 'weight', for the analysis of spam email viruses and attacks. We then use a visual clustering method to classify and display the spam emails. The visualisation allows users to interactively select and scale down the scope of views for a better understanding of the different types of spam email attacks. The experiment shows that this new model with the clustering visualization can be effectively used for network security analysis.
Zhang, J, Huang, M & Hoang, DB 2011, 'Detecting DDoS Attack in Spam Emails using Density-Weight Model', Volume II, Proceedings of 2011 IEEE International Conference on Information Theory and Information Security, IEEE International Conference on Information Theory and Information Security, IEEE Press, Hangzhou, China, pp. 344-352.
DDoS attacks embedded in spam emails are becoming increasingly numerous and sophisticated. This has created a growing need for spam email analysis to identify these attacks. The use of intrusion detection systems has given rise to two new challenges: 1) how to increase the accuracy of detection; 2) how to present large spam email networks for better understanding. In this paper we introduce a new analytical model that uses two coefficient vectors, 'density' and 'weight', to measure the network density and system workload for the analysis of DDoS attacks in spam emails. We then use a visual clustering method to classify and display the spam emails for a better understanding of the spam email network. The experiment shows that the proposed model can increase the accuracy of DDoS attack detection.