My current research lies in the areas of Data Science and Artificial Intelligence (Natural Language Processing). I try to answer these questions:
1) How to make better recommendations in the era of Big Data? For example, explainable Recommender Systems, conversational Recommender Systems, privacy-preserving Recommender Systems.
2) How to discover new knowledge from complex, large-scale, noisy, diverse, and dynamic data? For example, social data (e.g., social media posts on Twitter), pervasive data (e.g., exercise data from smartphones), text data, image data, environment data, climate data.
3) What is intelligence, and how can findings and theories of biological learning be modelled, tested, and applied to machines?
Student applications and collaborations from other disciplines and industry are very welcome! My email is my first name.last name@newcastle.ac.uk
- Grants:
1) Collaborative Innovation Fund of Royal Berkshire NHS Foundation Trust and University of Reading project entitled: Improving preoperative diagnosis of thyroid nodules by developing ultrasound artificial intelligence (AI) decision support system. Collaborators: Royal Berkshire Hospital Trust Foundation, NHS, 1/1/2021-30/12/2022 (PI)
2) EIT Food/Horizon2020 project entitled: Developing a Digital Toolkit to Enhance the Communication of Scientific Health Claims (Co-I, Project in total €495,204 for phase 1, €708,622 for phase 2, €466,000 for phase 3), collaborators: Department of English Language and Applied Linguistics, Department of Design, the School of Agriculture, Policy and Development of University of Reading, Technical University of Munich (TUM), British Nutrition Foundation, start-up company Food Maestro, food company Maspex, Institute of Animal Reproduction and Food Research of the Polish Academy of Sciences. 1/1/2019-31/12/2021
Project website: https://www.healthclaimsunpacked.co.uk/
Toolkit website: https://www.unpackinghealthclaims.eu/
- Vacancies/Scholarships
1) 1 postdoc position in NLP/Recommender Systems, deadline 23 March 2022, apply here
2) 1 EPSRC PhD studentship competition in Scalable Patient Profiling for Personalised Healthcare, 3.5 years, deadline 18 March 2022, apply here, select number 2 in this form. Not available.
3) 1 PhD studentship in Conversational Recommender Systems, 3.5 years. This scholarship is open to home students, but international students can apply for a scholarship to cover the fee difference between home and international students. Open until filled, apply here, contact me first.
- Available PhD Student Projects
User Profiling for Personalisation. This project aims to develop scalable and effective explicit and implicit user profiling approaches to discover new knowledge about users’ individual interests, preferences, emotional states, and information needs. It will use advanced natural language understanding, reinforcement learning, and deep learning techniques to construct user profiles based on both explicit and implicit user behaviour data. The proposed user profiling techniques will be applied to recommender systems to make personalised recommendations.
Hashing Techniques for High Dimensional Data. Hashing is a key technique for analysing big data. It is widely used for dimensionality reduction and data size reduction. This project aims to develop novel hashing algorithms to sample, compress, and index big data, such as social and climate data, to facilitate effective and efficient information retrieval and recommendation. It will also explore machine learning based hashing techniques and contribute new solutions for making better use of big data.
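To illustrate how hashing can compress data while keeping it comparable, here is a minimal sketch of MinHash, a standard textbook technique. It is not the algorithm this project will develop, and the function names and the choice of 64 hash functions are illustrative assumptions. Each set of tokens is reduced to a short fixed-length signature, and the fraction of matching signature positions approximates the Jaccard similarity of the original sets.

```python
import hashlib


def _hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes=64):
    """Compress a set of tokens into num_hashes integers (one minimum per seed)."""
    unique = set(tokens)
    if not unique:
        raise ValueError("tokens must be non-empty")
    return [min(_hash(t, seed) for t in unique) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching positions."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical tag sets for two users.
    user_a = {"python", "nlp", "hashing", "recsys", "climate"}
    user_b = {"python", "nlp", "hashing", "vision", "audio"}
    sig_a = minhash_signature(user_a)
    sig_b = minhash_signature(user_b)
    print("estimated:", estimate_jaccard(sig_a, sig_b))
    print("exact:", exact_jaccard(user_a, user_b))
```

The signature length trades accuracy for size: more hash functions give a tighter estimate at the cost of a larger signature per item.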
Other directions such as Responsible Recommender Systems (e.g., trustworthy, sustainable), Conversational Recommender Systems, Reinforcement Learning based Recommender Systems, Scalable Recommender Systems are also available. Note: Good programming skills and theoretical modelling are required for all projects.
- Selected Recent Research Work
1. User Profiling and Recommender Systems.
Personalisation attempts to help users solve the information overload issue. It is the ability to provide content and services tailored to individuals based on knowledge about their preferences and behaviours. User profiling is the foundation of personalisation. I proposed novel user profiling approaches to discover knowledge about users such as their interests, preferences, and information needs, from massive social data that contains user generated content and behaviour information.
Social tags. I investigated the distinctive features and multiple relationships of social tags, and explored novel approaches to solve the tag quality problem and profile users accurately.
Social Tags & Item Taxonomy. I proposed an approach to integrate social tags from community users and the standard taxonomy information provided by experts to profile users and make recommendations.
Social Media. I modelled the recency phenomenon and the implicit information network among users, topics, and micro-blogs. I proposed using the temporal factor and the implicit information network in social media to profile users and recommend topics to them.
Ratings. Inspired by neural language models, I proposed a probabilistic rating auto-encoder to perform unsupervised feature learning and generate latent user feature profiles from large-scale user rating data.
Images. The traditional implicit rating information network is augmented with visual factors based on item images to make recommendations.
Heterogeneous Information Network. I proposed a deep reinforcement learning based approach to profile users in heterogeneous information networks for recommender systems.
Some example figures are shown below. (For related publications, see G1 on the Publications page.)
2. Big Data Processing Techniques.
To address the challenges of big data and high dimensional data, I proposed parallel user profiling approaches and hashing-based indexing/blocking techniques.
Parallel user profiling. I proposed a parallel user profiling implementation based on cloud computing techniques such as Hadoop, MapReduce and Cascading. The experiments were conducted on a 7GB delicious.com dataset with 420 million tag assignments.
Hashing-based indexing/blocking. I proposed noise-tolerant hashing-based indexing and two-stage similarity-aware indexing techniques for noisy large-scale datasets, with applications in real-time entity resolution and real-time social recommender systems. In joint work with students and colleagues, a semantic-aware blocking approach was proposed that efficiently unifies textual and semantic features to map large noisy data into small data blocks.
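As a rough illustration of what blocking for real-time entity resolution involves, the sketch below uses a plain character q-gram inverted index. This is a generic textbook approach, not the noise-tolerant or two-stage techniques described above. The class name, the bigram size, and the 0.5 overlap threshold are all assumptions made for the example. An incoming record is compared only against stored records that share enough q-grams with it, which tolerates small typos while avoiding a full scan.

```python
from collections import Counter, defaultdict


def qgrams(text, q=2):
    """Return the set of lowercase character q-grams of a string."""
    s = text.lower().strip()
    grams = {s[i:i + q] for i in range(len(s) - q + 1)}
    if grams:
        return grams
    return {s} if s else set()


class QGramIndex:
    """Inverted index from q-grams to record ids (illustrative only)."""

    def __init__(self, q=2):
        self.q = q
        self.postings = defaultdict(set)

    def add(self, record_id, text):
        """Index a record under every q-gram it contains."""
        for gram in qgrams(text, self.q):
            self.postings[gram].add(record_id)

    def candidates(self, text, min_overlap=0.5):
        """Return ids of records sharing at least min_overlap of the query's q-grams."""
        grams = qgrams(text, self.q)
        counts = Counter()
        for gram in grams:
            for record_id in self.postings.get(gram, ()):
                counts[record_id] += 1
        needed = min_overlap * len(grams)
        return sorted(rid for rid, c in counts.items() if c >= needed)


if __name__ == "__main__":
    # Hypothetical records; the query contains a typo ("Smyth").
    index = QGramIndex()
    index.add(1, "Jonathan Smith")
    index.add(2, "Jonathon Smith")
    index.add(3, "Maria Garcia")
    print(index.candidates("Jonathan Smyth"))
```

Only the returned candidates would then be passed to a more expensive pairwise similarity comparison.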
Some example figures are shown below. (For related publications, see G2 on the Publications page.)
3. Natural Language Processing: Sentiment Analysis and Question Answering Systems.
We applied deep learning techniques to sentiment analysis and question answering systems. The sentiment analysis model was trained on 1 million tweets randomly selected from a 5.3TB Twitter dataset. Some example figures are shown below. (For related publications, see G3 on the Publications page.)
- Postdocs and Research Assistants
- Nicolay Rusnachenko (12/2022- )
- Zehao Liu (4/2020-12/2021)
- Xiao Li (7/2020-12/2021)
- Selected Current Student Projects
- Thanet Markchom (2019- ). PhD student project. Image based Explainable Recommender Systems
- Aleksandra Makarova (2019- ). PhD student project. Conversational Recommender Systems
- Zehao Liu (2020- ). PhD student project. Hashing Techniques for Recommender Systems
- Selected Past Student Projects
- Shanthini Ramu (2021). Master student project. Multimodal Sentiment Analysis based on Video, Audio, and Text
- Yu Zhou (2021). Master student project. Automatic Question Answer Generation Bots (github)
- Umarani Ganeshbabu (2019). A Dynamic Bayesian Network Approach for Analysing Topic-sentiment Evolution
- Bhuvana Madhusudana (2019). Deep Learning for Bot Detection
- Selected Final Year undergraduate student projects: Conversational Chatbot in 3D Models (2020, demo), twitter (2021, github)
- Banda Ramadan, Co-supervisor of PhD Student Project, Indexing Techniques for Real-time Entity Resolution, 2012-2015, ANU (Thesis)
- XingYi Xu, Co-supervisor of Master Student Project, Deep Learning for Sentiment Analysis, 2015-2016, The University of Melbourne (Paper)
- Haifan (Tony) Wu, Co-supervisor of Master Student Project, Recommender Systems based on Social Media in Health, 2015-2016, The University of Melbourne (Report)
- Mingyuan Cui, Co-supervisor of Master Student Project, Towards a Scalable and Robust Entity resolution Framework — Blocking under Relational Constraints, 2014, ANU (Paper | Report)
- Haoran Du, Co-supervisor of Master Student Project, Big Data Analysis: A Case Study for Recommender Systems, 2014, ANU (PDF | Report | Slides)
- Honours Student Project, Noise-Tolerant Approximate Blocking for Dynamic Real-time Entity Resolution, 2013, ANU (Paper | Thesis)
- Shouheng Li, Primary supervisor of Master Student Project, Two-stage Similarity-aware Indexing for Real-time Entity Resolution, 2013, ANU (Paper | Report)
- Chrislyn Braganza, Primary supervisor of Summer Research Scholar project, Real-time Social Recommender System, 2012, ANU
- Primary supervisor of Undergraduate IT Capstone Project, A Personalized Recommender System based on Microblogs, 2011, QUT