Fix typos. Add getting started todo

chrisyeh96 · Aug 21, 2023 · 773d26e
1 parent 2d08cd2
Showing 1 changed file with 10 additions and 2 deletions.
diff --git a/evchargingenv.md b/evchargingenv.md
@@ -16,7 +16,15 @@ EVChargingEnv supports real historical data as well as data sampled from a 30-co
 An observation at time $$t$$ is $$s(t) = (t, d, e, m_{t-1}, \hat{m}_{t:t+k-1|t})$$. $$t \in \Z_+$$ is the fraction of day between 0 and 1, inclusive. $$d \in \Z^n$$ is estimated remaining duration of each EV (in \# of time steps). $$e \in \R_+^n$$ is remaining energy demand of each EV (in kWh). If no EV is charging at EVSE $$i$$, then $$d_i = 0$$ and $$e_i = 0$$. If an EV charging at EVSE $$i$$ has exceeded the user-specified estimated departure time, then $$d_i$$ becomes negative, while $$e_i$$ may still be nonzero.
 
 ## Action Space
-The action space is continuous $$a(t) \in [0, 1]^n$$, representing the pilot signal normalized by the maximum signal allowed $M$ (in amps) for each EVSE. Physical infrastructure in a charging network constrain the set $$\mathcal{A}_t$$ of feasible actions at each time step $$t$$. Furthermore, the EVSEs only support discrete pilot signals, so $$\mathcal{A}_t$$ is nonconvex. To satisfy these physical constraints, EVChargingEnv can project an agent's action $$a(t)$$ into the convex hull of $$\mathcal{A}_t$$ and round it to the nearest allowed pilot signal, resulting in final normalized pilot signals $$\tilde{a}(t)$$. ACNSim processes $$\tilde{a}(t)$$ and returns the actual charging rate $$M \bar{a} \in \R_+^n$$ (in amps) delivered at each EVSE, as well as the remaining demand $$e_i(t+1)$$.
+The action space is continuous $$a(t) \in [0, 1]^n$$, representing the pilot signal normalized by the maximum signal allowed $$M$$ (in amps) for each EVSE. Physical infrastructure in a charging network constrains the set $$\mathcal{A}_t$$ of feasible actions at each time step $$t$$. Furthermore, the EVSEs only support discrete pilot signals, so $$\mathcal{A}_t$$ is nonconvex. To satisfy these physical constraints, EVChargingEnv can project an agent's action $$a(t)$$ into the convex hull of $$\mathcal{A}_t$$ and round it to the nearest allowed pilot signal, resulting in final normalized pilot signals $$\tilde{a}(t)$$. ACNSim processes $$\tilde{a}(t)$$ and returns the actual charging rate $$M \bar{a} \in \R_+^n$$ (in amps) delivered at each EVSE, as well as the remaining demand $$e_i(t+1)$$.
 
 ## Reward Function
-The reward function is a sum of three components: $$r(t) = p(t) - c_V(t) - c_C(t)$$. The profit term $$p(t)$$ aims to maximize energy delivered to the EVs. The constraint violation cost $$c_V(t)$$ aims to reduce physical constraint violations and encourage the agent's action $$a(t)$$ to be in $$\mathcal{A}_t$$. Finally, the CO2 emissions cost $$c_C(t)$$, which is a function of the MOER $$m_t$$ and charging action, aims to reduce emissions by encouraging the agent to charge EVs when the MOER is low.
+The reward function is a sum of three components: $$r(t) = p(t) - c_V(t) - c_C(t)$$. The profit term $$p(t)$$ aims to maximize energy delivered to the EVs. The constraint violation cost $$c_V(t)$$ aims to reduce physical constraint violations and encourage the agent's action $$a(t)$$ to be in $$\mathcal{A}_t$$. Finally, the CO<sub>2</sub> emissions cost $$c_C(t)$$, which is a function of the MOER $$m_t$$ and charging action, aims to reduce emissions by encouraging the agent to charge EVs when the MOER is low.
+
+## Getting Started
+
+TODO
+
+```bash
+python run_script.py  # TODO
+```
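
Note on the Action Space paragraph touched by this diff: the rounding step it describes could be sketched roughly as below. This is only an illustration, not EVChargingEnv's actual code. The function name `round_to_allowed`, the set of allowed pilot signals, and the value of $$M$$ are hypothetical, and the projection onto the convex hull of $$\mathcal{A}_t$$ is replaced by a simple clip to $$[0, 1]$$, which ignores the network's infrastructure constraints.

```python
# Hypothetical values for illustration only; the real EVSE limits are not
# given in this diff.
MAX_AMPS = 32.0                               # assumed M (in amps)
ALLOWED_AMPS = [0.0, 8.0, 16.0, 24.0, 32.0]   # assumed discrete pilot signals


def round_to_allowed(action, allowed_amps, max_amps):
    """Clip each normalized action to [0, 1], then snap it to the nearest
    allowed pilot signal (also normalized by max_amps).

    Ties go to the lower level, because min() keeps the first minimum.
    """
    levels = [amps / max_amps for amps in allowed_amps]
    rounded = []
    for a in action:
        a = min(max(a, 0.0), 1.0)  # stand-in for the convex-hull projection
        rounded.append(min(levels, key=lambda level: abs(level - a)))
    return rounded


# One normalized action per EVSE; the last entry is out of range on purpose.
a_tilde = round_to_allowed([0.1, 0.55, 1.3], ALLOWED_AMPS, MAX_AMPS)
pilot_amps = [MAX_AMPS * a for a in a_tilde]
print(a_tilde, pilot_amps)
```

Multiplying the rounded action by the assumed $$M$$ recovers the pilot signal in amps that would be passed on to the simulator.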