Embedding transaction graphs to classify fraudulent Ethereum wallet addresses.
adrian-io/ethXpose
The blockchain's decentralized nature has enabled financial innovation but also sophisticated fraud, including phishing scams and Ponzi schemes. Traditional fraud detection methods struggle with the scale and complexity of blockchain transactions.
ETHXpose provides a graph-based fraud detection approach using graph embeddings and machine learning models to classify illicit activity in Ethereum transactions in a computationally cost-effective way.
- Transaction graphs are constructed from XBlock data, where nodes represent Ethereum wallets and edges denote transactions.
- Instead of resource-intensive Graph Neural Networks (GNNs), Graph2Vec and Feather-G embeddings are used to convert transaction graphs into structured vectors.
- Classification is performed with Random Forest, Support Vector Machines (SVMs), or Gradient Boosting (GB).
Results show that Feather-G embeddings with Gradient Boosting achieve the highest accuracy (93.9%) and F1-score (93.6%), effectively detecting fraudulent wallets while remaining computationally efficient.
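For readers unfamiliar with the two metrics, the sketch below shows how accuracy and F1-score are conventionally computed from true and predicted labels. It assumes a binary setting with the fraudulent class as the positive label; the function name `accuracy_and_f1` and the toy labels are illustrative only and are not taken from this project's evaluation code or results.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels; `positive` marks the fraud class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, f1


# Toy labels (1 = fraudulent, 0 = normal), made up for illustration.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_accuracy, toy_f1 = accuracy_and_f1(toy_true, toy_pred)
```

Reporting F1 alongside accuracy matters here because the classes need not be balanced, and accuracy alone can hide poor recall on the fraudulent class.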
This project wraps the trained models into a FastAPI web service, enabling real-time wallet classification through a REST API.
This project uses Python 3.9 (recommended for compatibility with karateclub and scikit-learn).
Install dependencies with:
pip install -r requirements.txt
Data is collected from XBlock, an academic blockchain data platform.
- The dataset contains 1,660 phishing addresses, 200 Ponzi-scheme addresses, and 1,700 normal addresses.
- Transaction networks include first-order (direct neighbors) and second-order (neighbors of neighbors) networks.
- Each transaction records: sender, receiver, value, and timestamp.
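The data layout above can be illustrated with a small standard-library sketch that builds a wallet graph from transaction records and extracts first-order and second-order neighborhoods. The helper names (`build_graph`, `neighborhood`), the undirected treatment of edges, and the toy transactions are assumptions made for illustration; the project's own loading code and the XBlock file format may differ.

```python
from collections import defaultdict

# Toy transactions with the four recorded fields (not real XBlock data).
toy_transactions = [
    {"sender": "0xA", "receiver": "0xB", "value": 1.5, "timestamp": "2025-09-03 12:34:56"},
    {"sender": "0xB", "receiver": "0xC", "value": 0.2, "timestamp": "2025-09-03 12:40:00"},
    {"sender": "0xC", "receiver": "0xD", "value": 3.0, "timestamp": "2025-09-03 13:00:00"},
]


def build_graph(transactions):
    """Map each wallet to the set of wallets it transacted with.

    Assumption: edges are treated as undirected for neighborhood lookup.
    """
    adjacency = defaultdict(set)
    for tx in transactions:
        adjacency[tx["sender"]].add(tx["receiver"])
        adjacency[tx["receiver"]].add(tx["sender"])
    return adjacency


def neighborhood(adjacency, wallet, order=1):
    """Return wallets within `order` hops of `wallet`, excluding the wallet itself."""
    seen = {wallet}
    frontier = {wallet}
    for _ in range(order):
        # Expand one hop and keep only wallets not visited yet.
        frontier = {n for w in frontier for n in adjacency.get(w, ())} - seen
        seen |= frontier
    return seen - {wallet}


toy_graph = build_graph(toy_transactions)
first_order = neighborhood(toy_graph, "0xA", order=1)
second_order = neighborhood(toy_graph, "0xA", order=2)
```

In this toy chain, the first-order network of `0xA` contains only its direct counterparty, while the second-order network also includes that counterparty's neighbors.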
The dataset is available here.
Train and evaluate models:
python train.py
Classify a wallet address:
python test_classify.py --wallet_address <WALLET_ADDRESS> --embedding <EMBEDDING_METHOD> --model <MODEL_NAME>
Ensure the following graphs are available in api/data/graphs before training:
- Normal first-order nodes
- Normal second-order nodes
- Phishing first-order nodes
- Phishing second-order nodes
--graph STR        Order of transaction graphs (first, second). Default: 'first'
--embedding STR    Embedding algorithm (Feather-G, Graph2Vec, GL2Vec). Default: 'Feather-G'
--classifier STR   Classifier (SVM, MLP, RF, GB). Default: 'GB'
Examples:
Train and evaluate:
python train.py
Classify a wallet with Feather-G embeddings and RandomForest:
python test_classify.py --wallet_address 0x123456789abcdef --embedding "Feather-G" --model "RF"
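As a rough illustration of how the --graph, --embedding, and --classifier options listed above could be declared, here is a minimal argparse sketch. It is not the project's actual train.py: the function names `build_parser` and `model_filename` are hypothetical, and the `<graph>_<embedding>_<classifier>.joblib` naming pattern is only inferred from the model name `first_Feather-G_GB.joblib` used in the API request example.

```python
import argparse


def build_parser():
    """Declare the three documented options with their documented defaults."""
    parser = argparse.ArgumentParser(description="Train and evaluate a wallet classifier.")
    parser.add_argument("--graph", type=str, default="first",
                        choices=["first", "second"],
                        help="Order of transaction graphs.")
    parser.add_argument("--embedding", type=str, default="Feather-G",
                        choices=["Feather-G", "Graph2Vec", "GL2Vec"],
                        help="Embedding algorithm.")
    parser.add_argument("--classifier", type=str, default="GB",
                        choices=["SVM", "MLP", "RF", "GB"],
                        help="Classifier.")
    return parser


def model_filename(args):
    # Assumption: naming pattern inferred from "first_Feather-G_GB.joblib".
    return f"{args.graph}_{args.embedding}_{args.classifier}.joblib"


default_args = build_parser().parse_args([])
custom_args = build_parser().parse_args(["--graph", "second", "--classifier", "RF"])
```

Restricting each option with `choices` makes a mistyped embedding or classifier name fail at parse time rather than partway through training.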
The web service exposes the following endpoints:
POST /api/py/classify
Classify a wallet in real time.
Request JSON:
{"wallet_address": "0x0c90ddbeaf1d855e9fb6a7180b9dbd07156215b6", "model_name": "first_Feather-G_GB.joblib"}
Response JSON:
{"fraud_probability": 0.93, "graph": {"nodes": [{"id": "1", "label": "0xabc..."}], "edges": [{"source": "1", "target": "2", "value": 1.5, "timestamp": "2025-09-03 12:34:56"}]}}
GET /api/health
Check if the service is running:
{"status": "ok"}
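A client for the classify endpoint could look like the following standard-library sketch. The helper names (`build_classify_request`, `classify_wallet`) are hypothetical, and the base URL `http://localhost:8000` is an assumption that matches the local uvicorn command in this README; adjust it for a deployed instance.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: service started locally on port 8000


def build_classify_request(wallet_address, model_name, base_url=BASE_URL):
    """Build the POST request for /api/py/classify without sending it."""
    payload = json.dumps(
        {"wallet_address": wallet_address, "model_name": model_name}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/py/classify",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify_wallet(wallet_address, model_name, base_url=BASE_URL, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_classify_request(wallet_address, model_name, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires the service to be running):
#   result = classify_wallet(
#       "0x0c90ddbeaf1d855e9fb6a7180b9dbd07156215b6",
#       "first_Feather-G_GB.joblib",
#   )
#   print(result["fraud_probability"])
```

Splitting request construction from sending keeps the payload easy to inspect and test without a running server.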
Run locally:
uvicorn main:app --reload --host 0.0.0.0 --port 8000
- Recommended Python version: 3.9 for full package compatibility.
- For reproducibility and deployment, use pinned versions in requirements.txt.