
Formulating Dynamic Programming for EMS: State-Space and Cost Function


Written at somewhere on the Earth

In our previous post Dynamic Programming: The Gold Standard, we established that DP acts as a “map” for finding the shortest path. However, for a computer to read this map, we cannot simply feed it a physical vehicle. We must provide it with Mathematical Equations.

This step is known as Mathematical Modelling. It serves as the backbone of any control system. If the model is flawed, any subsequent optimisation results are meaningless.

Today, we will translate the physical problem of an FCHEV (Fuel Cell Hybrid Electric Vehicle) into the language of Mathematics: State-Space representation and the Cost Function. 📐

1. System Modelling

Before managing energy, we must determine how much energy the vehicle requires to move. This is the Longitudinal Dynamics problem.

The power demand at the wheels ($P_{req}$) at any time instant $t$ follows from the traction force, which is given by Newton’s Second Law:

$$F_{trac} = F_{aero} + F_{roll} + F_{grade} + m \cdot a$$

From the traction force, we can derive the electrical power required from the powertrain system (Fuel Cell + Battery), accounting for the efficiency of the electric motor and inverter:

$$P_{elec\_req}(t) = \frac{v(t)}{\eta_{motor}} \cdot \left( \frac{1}{2}\rho A C_d v(t)^2 + mgC_r \cos(\alpha) + mg \sin(\alpha) + m \frac{dv}{dt} \right)$$

Note: In the DP algorithm, since the driving cycle ($v(t)$ and $\alpha(t)$) is known a priori, $P_{elec\_req}(t)$ acts as a disturbance input at each time step.
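As a rough illustration of the formula above, here is a minimal Python sketch (Python is used only for illustration here). The function name and every default parameter value (mass, drag coefficient, motor efficiency, and so on) are hypothetical placeholders, not data from this post. Like the equation, it divides by $\eta_{motor}$ in all cases, so it does not treat regenerative braking separately.

```python
import math

def electrical_power_demand(v, a, alpha=0.0, *, mass=1500.0, rho=1.2,
                            frontal_area=2.2, c_d=0.3, c_r=0.01,
                            eta_motor=0.9, g=9.81):
    """Electrical power request in W.

    v: speed in m/s, a: acceleration in m/s^2, alpha: road grade in rad.
    All keyword defaults are illustrative assumptions.
    """
    f_aero = 0.5 * rho * frontal_area * c_d * v ** 2   # aerodynamic drag
    f_roll = mass * g * c_r * math.cos(alpha)          # rolling resistance
    f_grade = mass * g * math.sin(alpha)               # grade force
    f_inertia = mass * a                               # m * dv/dt
    f_trac = f_aero + f_roll + f_grade + f_inertia
    return v * f_trac / eta_motor
```

Evaluating this function for every sample of the driving cycle would give the disturbance sequence that DP treats as known in advance.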

2. Optimisation Problem Formulation

To apply Dynamic Programming, we must structure the system into a standard Discrete-time Optimal Control format. This structure typically comprises three elements: $x$ (State), $u$ (Control), and $w$ (Disturbance).

a. State Variable ($x$)

The state variable represents the system’s “memory.” In the EMS problem for hybrid vehicles, the most critical time-varying variable is the Battery State of Charge (SOC).

$$x_k = SOC_k$$

The State Transition Equation from step $k$ to $k+1$ is defined as:

$$SOC_{k+1} = SOC_k - \frac{V_{oc} - \sqrt{V_{oc}^2 - 4 R_{int} P_{batt}(u_k)}}{2 R_{int} Q_{batt}} \cdot \Delta t$$

(Do not be alarmed; this is simply the current calculation formula $I = P/V$, rewritten based on the simplified Rint battery model.)
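The transition equation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the default values for $V_{oc}$, $R_{int}$ and $Q_{batt}$ are hypothetical, and $Q_{batt}$ is taken in ampere-seconds so that it matches $\Delta t$ in seconds.

```python
import math

def soc_next(soc, p_batt, dt=1.0, *, v_oc=350.0, r_int=0.1,
             q_batt=40.0 * 3600.0):
    """One step of the Rint battery model.

    soc: state of charge (0..1), p_batt: battery power in W
    (positive = discharge), dt: step in s, q_batt: capacity in A*s.
    The default parameter values are illustrative assumptions.
    """
    disc = v_oc ** 2 - 4.0 * r_int * p_batt
    if disc < 0.0:
        # The quadratic for the current has no real solution.
        raise ValueError("battery power exceeds what the Rint model can deliver")
    current = (v_oc - math.sqrt(disc)) / (2.0 * r_int)  # battery current in A
    return soc - current * dt / q_batt
```

A positive `p_batt` (discharge) lowers the SOC, a negative one (charging) raises it, and zero power leaves it unchanged.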

b. Control Variable ($u$)

This is the decision variable. We can choose to control either the battery current or the Fuel Cell power. Typically, I select the Fuel Cell Power as the control variable:

$$u_k = P_{fc,k}$$

c. Power Balance Constraint

At every instant, the power supplied must equal the power consumed:

$$P_{fc} + P_{batt} = P_{elec\_req}$$

Consequently, the battery power ($P_{batt}$) becomes a dependent variable: $P_{batt} = P_{elec\_req} - P_{fc}$.

3. The Cost Function ($J$)

The objective of DP is to find a control sequence $\pi = \{u_0, u_1, ..., u_{N-1}\}$ that minimises a global cost function $J$.

$$J = \sum_{k=0}^{N-1} L(x_k, u_k) + \Phi(x_N)$$

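Where $L(x_k, u_k)$ is the instantaneous cost at step $k$ (here, the fuel consumed by the Fuel Cell) and $\Phi(x_N)$ is the terminal cost evaluated on the final state $x_N = SOC_N$.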
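To make the structure of $J$ concrete, the sketch below rolls a given control sequence through a state transition and adds up the costs. It is a generic illustration: `step`, `stage_cost` and `terminal_cost` are hypothetical callables that the reader supplies, not models from this post.

```python
def total_cost(x0, controls, step, stage_cost, terminal_cost):
    """Evaluate J = sum_k L(x_k, u_k) + Phi(x_N) for one control sequence.

    step(x, u, k) returns the next state, stage_cost(x, u) returns L,
    and terminal_cost(x) returns Phi. All three are caller-supplied.
    """
    x = x0
    cost = 0.0
    for k, u in enumerate(controls):
        cost += stage_cost(x, u)   # instantaneous cost L(x_k, u_k)
        x = step(x, u, k)          # move to x_{k+1}
    return cost + terminal_cost(x), x
```

DP does not enumerate sequences like this one by one; the function only shows what quantity is being minimised.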

4. System Constraints

While mathematics allows $P_{fc}$ to be infinite, physics does not. We must impose strict Inequality Constraints:

  1. Fuel Cell Constraints:
     $$0 \le P_{fc} \le P_{fc}^{max}$$
     $$-\Delta P_{down} \le (P_{fc,k} - P_{fc,k-1}) \le \Delta P_{up} \quad (\text{Ramp rate limits})$$

  2. Battery Constraints:
     $$SOC_{min} \le SOC_k \le SOC_{max} \quad (\text{e.g., } 0.4 \text{ to } 0.8)$$
     $$P_{batt}^{min} \le P_{batt,k} \le P_{batt}^{max}$$
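In an implementation, these inequalities are typically used to discard infeasible control candidates before any cost is computed. The following Python sketch shows one way to do that; the function name, the `LIMITS` dictionary and all its numbers are hypothetical, with powers assumed to be in kW.

```python
# Hypothetical limits, powers in kW. None of these values come from the post.
LIMITS = {
    "p_fc_max": 50.0,
    "dp_up": 10.0, "dp_down": 10.0,
    "soc_min": 0.4, "soc_max": 0.8,
    "p_batt_min": -30.0, "p_batt_max": 30.0,
}

def is_feasible(p_fc, p_fc_prev, p_req, soc_next, lim=LIMITS):
    """Return True if a candidate fuel cell power satisfies every constraint."""
    p_batt = p_req - p_fc  # power balance: the battery covers the remainder
    return (
        0.0 <= p_fc <= lim["p_fc_max"]
        and -lim["dp_down"] <= p_fc - p_fc_prev <= lim["dp_up"]
        and lim["soc_min"] <= soc_next <= lim["soc_max"]
        and lim["p_batt_min"] <= p_batt <= lim["p_batt_max"]
    )
```

The ramp check needs the previous fuel cell power, which this sketch simply takes as an argument.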

5. Numerical Example (Case Study)

To visualise how DP operates at a single time step ($t_k$), let’s walk through a simplified scenario with hypothetical data.

Assumptions at time step $k$:

The DP algorithm will discretise and test 3 feasible control candidates ($u$) for the Fuel Cell and compare them:

Step 1: Power Split Calculation

Using the balance equation $P_{batt} = P_{req} - P_{fc}$:

Step 2: Instantaneous Cost Calculation ($L$)

Looking up the Fuel Cell consumption map:

Step 3: State Transition Update ($SOC_{k+1}$)

Calculate the change in battery energy and the new $SOC$. (Simplified formula, obtained from the transition equation when $R_{int}$ is neglected: $\Delta SOC \approx - \frac{P_{batt} \cdot \Delta t}{V_{oc} Q_{batt}}$)

Step 4: Total Cost Evaluation (Cost-to-Go + Instantaneous Cost)

This is the decisive step. DP looks not only at the present but also at the future. The future cost ($J_{next}$) is retrieved from the Cost-to-Go matrix (which was calculated backwards from the end of the cycle).
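The backward calculation mentioned above is usually written as the Bellman recursion. Stated with this post's symbols, and writing $f$ for the state transition equation of Section 2 (a shorthand introduced here for brevity), it reads:

$$J_N^*(x_N) = \Phi(x_N)$$

$$J_k^*(x_k) = \min_{u_k} \left[ L(x_k, u_k) + J_{k+1}^*\big(f(x_k, u_k)\big) \right], \quad k = N-1, \dots, 0$$

In this notation, $J_{next}$ is $J_{k+1}^*$ evaluated at the SOC that a given candidate leads to.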

Assumption: The Cost-to-Go matrix indicates that having low SOC (59.8%) incurs a high future penalty (recharging needed later), while having high SOC (60.1%) reduces future costs.

| Candidate ($u$) | Instantaneous Fuel ($L$) | Assumed Future Cost ($J_{next}$) | Total Cost ($J$) |
| --- | --- | --- | --- |
| A ($P_{fc}=0$) | 0 g (Cheapest Now) | 100 g (High penalty) | 100 g |
| B ($P_{fc}=30$) | 0.45 g | 50 g (Medium) | 50.45 g |
| C ($P_{fc}=45$) | 0.75 g (Most Expensive) | 49.8 g (Low penalty) | 50.55 g |

DP’s Verdict: At this specific second, Candidate B is the optimal choice (lowest total cost of 50.45 g). Although Candidate A consumes zero hydrogen right now, DP “foresees” that depleting the battery will cost more in the long run, so it rejects the EV mode in this specific context.
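The comparison in Step 4 amounts to a minimum over the candidates of instantaneous cost plus future cost. A minimal Python sketch using the hypothetical numbers from the table (the function and variable names are my own illustration):

```python
def best_candidate(candidates):
    """Pick the candidate with the lowest (instantaneous + future) cost.

    candidates maps a label to a pair (L, J_next), both in grams.
    """
    totals = {name: stage + future for name, (stage, future) in candidates.items()}
    best = min(totals, key=totals.get)
    return best, totals

# Values taken from the table: (instantaneous fuel, assumed future cost)
candidates = {"A": (0.0, 100.0), "B": (0.45, 50.0), "C": (0.75, 49.8)}
best, totals = best_candidate(candidates)
print(best, round(totals[best], 2))  # B 50.45
```

A full DP solver repeats this comparison for every SOC grid point at every time step, working backwards from the end of the cycle.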


Conclusion

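We have successfully “translated” a physical vehicle into mathematical equations: a state ($x_k = SOC_k$), a control ($u_k = P_{fc,k}$), a known disturbance ($P_{elec\_req}$), a cost function ($J$), and a set of inequality constraints.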

With these components in place, the remaining task is to solve the Bellman equation. But how do we implement this on a computer? How do we handle state grid discretization?

In the next post, I will share the detailed MATLAB code to solve this problem. Get your MATLAB ready! 💻


👨🏻‍💻🏀
