My current research lies in the areas of Data Science and Artificial Intelligence (Natural Language Understanding). I try to answer these questions:
1) How to make better recommendations in the era of Big Data? For example, explainable Recommender Systems, conversational Recommender Systems, privacy-preserving Recommender Systems.
2) How to discover new knowledge from complex, large-scale, noisy, diverse, and dynamic data? For example, social data (e.g., social media posts on Twitter), pervasive data (e.g., exercise data from smartphones), text data, image data, environment data, climate data.
3) What is intelligence, and how can findings and theories of biological learning be modelled, tested, and applied to machines?
Student applications and collaborations from other disciplines and from industry are very welcome! My email address is [first name].[last name]@reading.ac.uk
- Grants:
1) Collaborative Innovation Fund of Royal Berkshire NHS Foundation Trust and University of Reading project entitled: Improving preoperative diagnosis of thyroid nodules by developing ultrasound artificial intelligence (AI) decision support system. 2020-2021 (PI)
2) EIT Food/Horizon2020 project entitled: Developing a Digital Toolkit to Enhance the Communication of Scientific Health Claims (Co-I, Project in total 495,204 € for phase 1, 708,622 € for phase 2), collaborators: Department of English Language and Applied Linguistics, Department of Design, the School of Agriculture, Policy and Development of University of Reading, Technical University of Munich (TUM), British Nutrition Foundation, start-up company Food Maestro. 2019-2020
Project website: https://www.healthclaimsunpacked.co.uk/
Toolkit website: https://www.unpackinghealthclaims.eu/
- Available PhD Student Projects
User Profiling for Personalisation. This project is to develop scalable and effective explicit and implicit user profiling approaches to discover new knowledge about users’ individual interests, preferences, emotional status, and information needs. It will use advanced natural language understanding, reinforcement learning, and deep learning techniques to construct user profiles based on both explicit and implicit user behaviour data. The proposed user profiling techniques will be applied to recommender systems to make personalised recommendations.
Hashing Techniques for High Dimensional Data. Hashing is a key technique for analysing big data. It is widely used for dimensionality reduction and data size reduction. This project is to develop novel hashing algorithms to sample, compress, and index big data, such as social and climate data, to facilitate effective and efficient information retrieval and recommendation. It will also explore machine learning based hashing techniques and contribute new solutions for making better use of big data.
Other directions such as Responsible Recommender Systems, Explainable Recommender Systems, Deep Learning based Recommender Systems are also available. Note: Good programming skills are required for all projects.
- Selected Recent Research Work
1. User Profiling and Recommender Systems.
Personalisation attempts to help users solve the information overload issue. It is the ability to provide content and services tailored to individuals based on knowledge about their preferences and behaviours. User profiling is the foundation of personalisation. I proposed novel user profiling approaches to discover knowledge about users such as their interests, preferences, and information needs, from massive social data that contains user generated content and behaviour information.
Social tags. I investigated the distinctive features and multiple relationships of social tags, and explored novel approaches to solve the tag quality problem and profile users accurately.
Social Tags & Item Taxonomy. I proposed an approach to integrate social tags from community users and the standard taxonomy information provided by experts to profile users and make recommendations.
Social Media. I modeled the recency phenomenon and the implicit information network among users, topics, and micro-blogs. I proposed using the temporal factor and the implicit information network in social media to profile users and recommend topics to them.
Ratings. Inspired by neural language models, I proposed a probabilistic rating auto-encoder to perform unsupervised feature learning and generate latent user feature profiles from large-scale user rating data.
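As a rough illustration of how a temporal factor can enter a user profile, the sketch below weights each topic a user has engaged with by an exponential decay on its age, so that recent activity counts for more. It is a minimal sketch and not the method from the publications: the function name `build_profile`, the `half_life_days` parameter, and the toy event data are all assumptions made for this example.

```python
import math
from collections import defaultdict

def build_profile(events, now, half_life_days=30.0):
    """Build a recency-weighted topic profile for one user.

    events: iterable of (topic, day) pairs, where day is a timestamp in days.
    now: the current time in days.
    half_life_days: hypothetical parameter; an event this old counts half
    as much as an event happening right now.
    """
    decay = math.log(2) / half_life_days
    weights = defaultdict(float)
    for topic, day in events:
        age = max(0.0, now - day)          # ignore timestamps in the future
        weights[topic] += math.exp(-decay * age)
    total = sum(weights.values())
    # Normalise so that the profile weights sum to 1.
    return {t: w / total for t, w in weights.items()} if total else {}

# Toy data (invented): two old football posts, two recent cooking posts,
# and one very recent travel post.
events = [("football", 0), ("football", 5),
          ("cooking", 95), ("cooking", 98), ("travel", 99)]
profile = build_profile(events, now=100)
ranked = sorted(profile, key=profile.get, reverse=True)
print(ranked)
```

With a 30-day half-life, the two old football posts together weigh less than the single recent travel post, which is the kind of recency effect a temporal profile is meant to capture.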
Some example figures are shown below. (For the related publications, please see G1 on the Publications page.)
2. Big Data Processing Techniques.
To address the challenges of big data and high dimensional data, I proposed parallel user profiling approaches and hashing-based indexing/blocking techniques.
Parallel user profiling. I proposed a parallel user profiling implementation based on advanced cloud computing techniques such as Hadoop, MapReduce and Cascading. The experiments were conducted on a 7GB delicious.com dataset with 420 million tag assignments.
Hashing-based indexing/blocking. I proposed noise-tolerant hashing-based indexing and two-stage similarity-aware indexing techniques for noisy large-scale datasets in the application areas of real-time entity resolution and real-time social recommender systems. In joint work with students and colleagues, a semantic-aware blocking approach was proposed to efficiently unify both textual and semantic features and map large noisy data into small data blocks.
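To give a flavour of what hashing-based blocking means in practice, the sketch below uses generic MinHash signatures with locality-sensitive hashing (LSH) banding to group similar records into small blocks, so that only records sharing a block need to be compared. This is a textbook technique shown under stated assumptions, not the noise-tolerant or semantic-aware methods described above; all function names, parameter values, and records are invented for illustration.

```python
import hashlib
from collections import defaultdict

def qgrams(text, q=2):
    """Split a lower-cased, space-free string into its set of character q-grams."""
    s = text.lower().replace(" ", "")
    return {s[i:i + q] for i in range(len(s) - q + 1)} or {s}

def _hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.md5(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def minhash_signature(tokens, num_hashes=32):
    """For each seeded hash function, keep the minimum hash over the token set."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]

def lsh_blocks(records, num_hashes=32, bands=16):
    """Map record ids to blocks keyed by (band index, band of the signature)."""
    rows = num_hashes // bands
    blocks = defaultdict(set)
    for rec_id, text in records.items():
        sig = minhash_signature(qgrams(text), num_hashes)
        for b in range(bands):
            blocks[(b, tuple(sig[b * rows:(b + 1) * rows]))].add(rec_id)
    return blocks

def candidate_pairs(blocks):
    """Return every pair of record ids that shares at least one block."""
    pairs = set()
    for ids in blocks.values():
        ordered = sorted(ids)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                pairs.add((a, b))
    return pairs

# Invented records: r1 and r2 differ only in case, r3 contains a typo.
records = {"r1": "John Smith", "r2": "john smith",
           "r3": "Jon Smith", "r4": "Maria Garcia"}
pairs = candidate_pairs(lsh_blocks(records))
print(sorted(pairs))
```

Records with identical q-gram sets, such as r1 and r2, always share every block. Whether a noisy variant such as r3 is paired with them depends on how many bands happen to agree, which is the trade-off between recall and block size that the number of bands controls.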
Some example figures are shown below. (For the related publications, please see G2 on the Publications page.)
3. Sentiment Analysis and Question Answering Systems.
We applied deep learning techniques to sentiment analysis and question answering systems. The sentiment analysis model was trained on 1 million tweets randomly selected from a 5.3TB Twitter dataset. Some example figures are shown below. (For the related publications, please see G3 on the Publications page.)
- Current Postdocs and Research Assistants
- Zehao Liu (2020- )
- Xiao Li (2020- )
- Current Student Projects
- Thanet Markchom (2019- ). PhD student project. Image based Explainable Recommender Systems
- Aleksandra Makarova (2019- ). PhD student project. Conversational Recommender Systems
- Chirag Khanna (2020). Master student project. Reinforcement Learning for Optimising Climate I/O data layout
- Selected Final Year undergraduate student projects: Conversational Chatbot in 3D Models (2020)
- Past Student Projects
- Umarani Ganeshbabu (2019). A Dynamic Bayesian Network Approach for Analysing Topic-sentiment Evolution
- Bhuvana Madhusudana (2019). Deep Learning for Bot Detection
- Banda Ramadan, Co-supervisor of PhD Student Project, Indexing Techniques for Real-time Entity Resolution, 2012-2015, ANU (Thesis)
- XingYi Xu, Co-supervisor of Master Student Project, Deep Learning for Sentiment Analysis, 2015-2016, The University of Melbourne (Paper)
- Haifan (Tony) Wu, Co-supervisor of Master Student Project, Recommender Systems based on Social Media in Health, 2015-2016, The University of Melbourne (Report)
- Mingyuan Cui, Co-supervisor of Master Student Project, Towards a Scalable and Robust Entity resolution Framework — Blocking under Relational Constraints, 2014, ANU (Paper | Report)
- Haoran Du, Co-supervisor of Master Student Project, Big Data Analysis: A Case Study for Recommender Systems, 2014, ANU (PDF | Report | Slides)
- Honours Student Project, Noise-Tolerant Approximate Blocking for Dynamic Real-time Entity Resolution, 2013, ANU (Paper | Thesis)
- Shouheng Li, Primary supervisor of Master Student Project, Two-stage Similarity-aware Indexing for Real-time Entity Resolution, 2013, ANU (Paper | Report)
- Chrislyn Braganza, Primary supervisor of Summer Research Scholar project, Real-time Social Recommender System, 2012, ANU
- Primary supervisor of Undergraduate IT Capstone Project, A Personalized Recommender System based on Microblogs, 2011, QUT