Become a sponsor to Benedek Rozemberczki
About Me
I am a Data Science PhD student who lives in sunny Edinburgh and creates machine learning tools that solve nontrivial problems related to graph-structured data. Currently I am mainly working on Karate Club, a library built on NetworkX that lets you perform the most standard unsupervised learning tasks on graph-structured data. Karate Club supports community detection, network embedding, and graph summarization procedures. I also maintain repositories that curate machine learning research on topics such as decision trees, gradient boosting, fraud detection, and graph classification. When you decide to become my GitHub Sponsor, you support the development of these data mining tools and the maintenance of these open source materials.
Mission
My mission is to create a scikit-learn-like library with an API-driven design for unsupervised learning from graph-structured data. There are no out-of-the-box tools for unsupervised machine learning from graph-structured data. With an easy-to-use tool, people could extract value from linked webpages, molecular graphs, financial transaction series, and social networks.
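To make the idea of a scikit-learn-like, API-driven design concrete, here is a minimal sketch using only the Python standard library. The class `ComponentCommunities`, its `get_memberships` method, and the toy graph are hypothetical names and data chosen for illustration; this is not code from Karate Club. It assumes an undirected graph stored as a dictionary of neighbour sets, and it uses connected components as a deliberately trivial stand-in for a real community detection algorithm.

```python
from collections import deque


class ComponentCommunities:
    """Hypothetical estimator: treats each connected component as one community."""

    def __init__(self):
        self._memberships = None

    def fit(self, graph):
        """Fit on an undirected graph given as {node: set_of_neighbours}."""
        memberships = {}
        community = 0
        for start in sorted(graph):
            if start in memberships:
                continue
            # Breadth-first search labels everything reachable from `start`.
            memberships[start] = community
            queue = deque([start])
            while queue:
                node = queue.popleft()
                for neighbour in graph[node]:
                    if neighbour not in memberships:
                        memberships[neighbour] = community
                        queue.append(neighbour)
            community += 1
        self._memberships = memberships
        return self

    def get_memberships(self):
        """Return {node: community_id} computed by the last call to fit."""
        if self._memberships is None:
            raise RuntimeError("Call fit before get_memberships.")
        return dict(self._memberships)


# Toy graph with two disconnected triangles (made-up data).
graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1},
    3: {4, 5}, 4: {3, 5}, 5: {3, 4},
}
model = ComponentCommunities().fit(graph)
memberships = model.get_memberships()
```

The point of the sketch is the shape of the interface: construct an estimator, call `fit` on a graph, then read the result through a getter, so that different algorithms could be swapped in without changing the calling code.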
Funding
Maintaining the flagship graph mining package, smaller graph neural network projects, and the curated machine learning repositories is a time-consuming process. These projects are only tangentially related to my research. Currently I am doing all of this without any external financial help. Getting funding would be truly beneficial for me and the machine learning community.
Goals
First, I would like to release the first stable and tested version of Karate Club, which covers a wide variety of community detection, node embedding, and network description algorithms. As part of this, I would benchmark the performance of these algorithms on new, larger datasets. Specifically, I would like to include scalable and accurate algorithms which can perform:
- Neighbourhood-based node embedding
- Attributed node embedding
- Structural node embedding
- Whole graph embedding (network summarization)
- Overlapping community detection
- Non-overlapping community detection
Second, I would like to develop a scikit-learn-like, sparsity-aware matrix factorization package which can be used in recommender systems and other sparse data scenarios. This would be beneficial both for researchers and practitioners.
Finally, the research paper aggregator repositories require frequent updates. With additional resources I could allocate more time to updating them more often.
To Infinity and Beyond
My long-term plan and dream is to create a company that would offer graph mining, geometric deep learning, and network science consulting and services. I am not there yet, but tools like Karate Club, Graph2Vec and Walklets are small but important steps in that direction.
Thank you!
Thank you for your consideration!
Featured work
- benedekrozemberczki/GEMSEC
  The TensorFlow reference implementation of 'GEMSEC: Graph Embedding with Self Clustering' (ASONAM 2019).
  Python 255
- benedekrozemberczki/karateclub
  Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)
  Python 2,185
- benedekrozemberczki/littleballoffur
  Little Ball of Fur - A graph sampling extension library for NetworKit and NetworkX (CIKM 2020)
  Python 706
- benedekrozemberczki/shapley
  The official implementation of "The Shapley Value of Classifiers in Ensemble Games" (CIKM 2021).
  Python 219
- benedekrozemberczki/awesome-graph-classification
  A collection of important graph embedding, classification and representation learning papers with implementations.
  Python 4,772
- benedekrozemberczki/pytorch_geometric_temporal
  PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models (CIKM 2021)
  Python 2,712