Commit

Generalize distributed testing docs (#22)
* Generalize distributed testing docs

Signed-off-by: Austin Liu <[email protected]>

* Separate docs into single-node and distributed setup

Signed-off-by: Austin Liu <[email protected]>

---------

Signed-off-by: Austin Liu <[email protected]>
austin362667 authored Oct 7, 2024
1 parent d173581 commit a84eb1c
Showing 1 changed file with 90 additions and 11 deletions.
101 changes: 90 additions & 11 deletions docs/testing.md
@@ -1,8 +1,12 @@
# Distributed Testing
# Testing

Install Ray on at least two nodes.
* [Single Node Testing](#single-node-testing)
* [Distributed Testing](#distributed-testing)
* [Ray Installation Docs](https://docs.ray.io/en/latest/ray-overview/installation.html)

## Single Node Testing

https://docs.ray.io/en/latest/ray-overview/installation.html
Install Ray on one (head) node.

```shell
sudo apt install -y python3-pip python3.12-venv
python3 -m venv venv
source venv/bin/activate
pip3 install -U "ray[default]"
```

## Start Ray Head Node
### Start Ray Head Node

```shell
ray start --head --node-ip-address=10.0.0.23 --port=6379 --dashboard-host=0.0.0.0
ray start --head --dashboard-host 0.0.0.0 --include-dashboard true
```
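To confirm that the head node came up, one option is to probe its ports from Python. The sketch below is illustrative only: the helper name `port_is_open` is hypothetical, and it assumes the head node listens on port 6379 and the dashboard on port 8265, the values used elsewhere in this document.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed ports: 6379 for the head node, 8265 for the dashboard.
    for name, port in [("head node", 6379), ("dashboard", 8265)]:
        state = "reachable" if port_is_open("127.0.0.1", port) else "not reachable"
        print(f"{name} port {port}: {state}")
```

A successful TCP connection only shows that something is listening on the port; `ray status` remains the authoritative check.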

## Start Ray Worker Node(s)
### Start Ray Worker Node(s) (Optional)

This step is optional. If you add Ray worker nodes, the setup becomes distributed.

Also note that [Ray doesn't support multi-node clusters on macOS](https://docs.ray.io/en/latest/cluster/getting-started.html#where-can-i-deploy-ray-clusters).

```shell
ray start --address=10.0.0.23:6379 --redis-password='5241590000000000'
ray start --address=127.0.0.1:6379
```

## Install DataFusion Ray (on each node)
### Install DataFusion Ray (on head node)

Clone the repo at the version that you want to test, then run `maturin develop --release` in the virtual env.

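```shell
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
. "$HOME/.cargo/env"
```

```shell
pip3 install maturin
```

```shell
git clone https://github.com/apache/datafusion-ray.git
cd datafusion-ray
maturin develop --release
```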

## Submit Job
### Submit Job

1. If you started the cluster manually, connect to the existing cluster instead of initializing a new one.
```python
import ray

# Option A: start a new local cluster (disabled here)
# ray.init(resources={"worker": 1})

# Option B: connect to the cluster that is already running
ray.init()
```

2. Submit the job to the Ray cluster.
```shell
RAY_ADDRESS='http://127.0.0.1:8265' ray job submit --working-dir examples -- python3 tips.py
```
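The same submission can be driven from a Python script, for example in a test harness. This is a minimal sketch that wraps the `ray job submit` command shown above with `subprocess`; the function names `build_submit` and `submit` are hypothetical, and the dashboard URL, working directory, and script name are copied from the shell command.

```python
import os
import subprocess


def build_submit(dashboard_url: str, working_dir: str, script: str):
    """Return (env, argv) equivalent to the shell command above."""
    # RAY_ADDRESS tells the ray CLI which dashboard endpoint to talk to.
    env = dict(os.environ, RAY_ADDRESS=dashboard_url)
    argv = [
        "ray", "job", "submit",
        "--working-dir", working_dir,
        "--", "python3", script,
    ]
    return env, argv


def submit(dashboard_url: str, working_dir: str, script: str):
    """Run the submission; raises CalledProcessError if the CLI fails."""
    env, argv = build_submit(dashboard_url, working_dir, script)
    return subprocess.run(argv, env=env, check=True)


if __name__ == "__main__":
    # Only print the command here; call submit(...) to actually run it.
    env, argv = build_submit("http://127.0.0.1:8265", "examples", "tips.py")
    print(f"RAY_ADDRESS={env['RAY_ADDRESS']}", " ".join(argv))
```

Passing the arguments as a list avoids shell quoting problems, and `check=True` makes a failed submission surface as an exception rather than a silent non-zero exit code.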

## Distributed Testing

Install Ray on at least two nodes.

```shell
sudo apt install -y python3-pip python3.12-venv
python3 -m venv venv
source venv/bin/activate
pip3 install -U "ray[default]"
```
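Because the later steps install packages with `pip3`, it can help to confirm on each node that the virtual environment is actually active. The sketch below uses only the standard library; the helper name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"virtual env active: {sys.prefix}")
    else:
        print("no virtual env active; run 'source venv/bin/activate' first")
```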

### Start Ray Head Node

```shell
ray start --head --dashboard-host 0.0.0.0 --include-dashboard true
```

### Start Ray Worker Node(s)

Replace `{NODE_IP_ADDRESS}` with the head node's address as reachable from the worker nodes. The address is displayed in the output of the previous step.

```shell
ray start --address={NODE_IP_ADDRESS}:6379
```
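When scripting the worker setup, it is easy to leave the placeholder unreplaced. The following sketch builds the worker command from a head node address and rejects anything that is not a valid IPv4 address. The function name `worker_start_command` is hypothetical, and restricting the input to IPv4 is an assumption made for simplicity; hostnames would also be rejected by this check.

```python
import ipaddress


def worker_start_command(head_ip: str, port: int = 6379) -> str:
    """Build the `ray start` command for a worker node."""
    # Raises ValueError for malformed input, including an unreplaced
    # "{NODE_IP_ADDRESS}" placeholder.
    ipaddress.IPv4Address(head_ip)
    return f"ray start --address={head_ip}:{port}"


if __name__ == "__main__":
    # 10.0.0.23 is only an example address.
    print(worker_start_command("10.0.0.23"))
```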

### Install DataFusion Ray (on each node)

Clone the repo at the version that you want to test, then run `maturin develop --release` in the virtual env.

```shell
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
. "$HOME/.cargo/env"
```

```shell
pip3 install maturin
```

```shell
git clone https://github.com/apache/datafusion-ray.git
cd datafusion-ray
maturin develop --release
```
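After the build finishes, a quick import check on each node can catch a missed install before a job fails at runtime. This sketch assumes the package is importable as `datafusion_ray`, which is an assumption based on the repository name; the helper `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "datafusion_ray" is the assumed import name of the built package.
    for name in ("ray", "datafusion_ray"):
        status = "importable" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

Run it with the virtual env's interpreter; running it with the system Python would report both packages as missing.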

### Submit Job

1. If you started the cluster manually, connect to the existing cluster instead of initializing a new one.

```python
import ray

# Option A: start a new local cluster (disabled here)
# ray.init(resources={"worker": 1})

# Option B: connect to the cluster that is already running
ray.init()
```

2. Submit the job to the Ray cluster.

```shell
cd examples
RAY_ADDRESS='http://10.0.0.23:8265' ray job submit --working-dir `pwd` -- python3 tips.py
RAY_ADDRESS='http://{NODE_IP_ADDRESS}:8265' ray job submit --working-dir examples -- python3 tips.py
```
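Rather than commenting lines in and out as in step 1, the choice between starting a local cluster and connecting to an existing one can be made with a flag. The sketch below is one possible arrangement, not the project's own code: `init_kwargs` and `init_ray` are hypothetical helpers, and the `{"worker": 1}` resource is taken from the snippet in step 1.

```python
def init_kwargs(start_local: bool) -> dict:
    """Return the keyword arguments to pass to ray.init()."""
    if start_local:
        # Same custom resource as the commented-out call in step 1.
        return {"resources": {"worker": 1}}
    # No arguments: used here to attach to the manually started cluster.
    return {}


def init_ray(start_local: bool = False):
    """Initialize Ray; imported lazily so this file loads without Ray."""
    import ray

    return ray.init(**init_kwargs(start_local))


if __name__ == "__main__":
    # Only show the arguments; call init_ray(...) inside the job script.
    print("local cluster:", init_kwargs(True))
    print("existing cluster:", init_kwargs(False))
```

For a job submitted with `ray job submit`, `init_ray(False)` is the relevant case, since the cluster is already running.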
