Generating random Events to feed into a Policy

randomshinichi · February 21, 2020, 4:17am

Hello, I’m trying to simulate life as a Policy. At each timestep, several Events are generated and the Policy should choose between them.

From the tutorials, I don’t see a place to plug in Event generation before the timesteps are run. Is this not supported at all? Currently I just have a global variable that’s read on each iteration. It works, but I’m sure there is a better way.

from random import randint

events = generate_Events()  # my own helper (not shown); returns a list of Event objects
events_taken = []

def life_policy(params, step, sL, s):
    def choose_event(events, s):
        for e in events:
            # General Rules
            if e.Health <= 20: # If it damages my health too much, it's not worth it
                continue
            if e.Time == 10: # If an event takes too much time until payoff, avoid it.
                continue
            if e.Money <= -30: # Why do something that hurts my finances? The whole point is to make money
                continue

            if s['Health'] + e.Health <= 0: # If it will kill me, don't do it
                continue
            if s['Money'] + e.Money <= 0: # If it will bankrupt me immediately, don't do it
                continue
            return e # First event that passes every rule
        return None # No acceptable event this timestep

    # If I'm busy taking on an Event, don't take on more Events.
    # As time goes on, I get my energy back, and living costs are spent.
    if s['Time'] > 0:
        return {'add_to_Health': randint(1, 10), 'add_to_Time': -1, 'add_to_Money': -1000}

    e = choose_event(events, s)
    if e is None:
        # Assumption: if nothing qualifies, nothing changes this timestep.
        return {'add_to_Health': 0, 'add_to_Time': 0, 'add_to_Money': 0}
    # Once a Life Event has been chosen, log it, and remove it from the pool of Events.
    events_taken.append(e)
    events.remove(e)
    return {'add_to_Health': e.Health, 'add_to_Time': e.Time, 'add_to_Money': e.Money}

markusbkoch · February 21, 2020, 2:12pm

Hi @randomshinichi and welcome! Sounds like an interesting simulation and I hope you can publish it in full here someday.

The way I see it, the list of events is essentially an exogenous dataset and acts as a driver to the life_policy. We use similar design patterns when we want the trajectory of a state variable to mimic that of some reference time series, like the market price of an asset for example. Are you running into some issue with this approach?
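As a rough sketch of that pattern (not taken from any real model): a read-only reference series that a state update function indexes by the current timestep. The names reference_prices and update_price are made up for illustration. I'm also assuming that the state dict carries a 'timestep' key and that state update functions take (params, step, sL, s, _input) and return a (key, value) pair, so adjust to whatever your setup actually uses.

```python
# Hypothetical reference series; in a real model this might be loaded from a file.
reference_prices = (100.0, 101.5, 99.8, 102.3)

def update_price(params, step, sL, s, _input):
    # Assumption: s['timestep'] is an integer index into the series.
    # Clamp the index so the last value is reused once the series runs out.
    t = min(s['timestep'], len(reference_prices) - 1)
    return ('price', reference_prices[t])
```

Because the series is only ever read, never modified, any number of runs can share it safely.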

The only caveat I can think of is that with multiple Monte Carlo runs your simulation will return incorrect results. Monte Carlo runs are parallelized and they’ll all share the same list of events, so as each run modifies it (events.remove(e)) they’ll interfere with one another, which you’ll want to avoid.
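One way around that, sketched under some assumptions: keep the pool of events inside the state, so each run owns its own copy, and have a state update function return a new pool instead of mutating a shared list. Everything here is illustrative. Event is a stand-in namedtuple, update_events and run_once are hypothetical names, the policy just takes the first event to keep things short, and run_once is a hand-rolled loop standing in for the simulation engine, not its real API.

```python
from collections import namedtuple

# Hypothetical stand-in for the Event objects in the original post.
Event = namedtuple('Event', ['Health', 'Time', 'Money'])

def life_policy(params, step, sL, s):
    # Read this run's own pool from the state; never touch a global.
    pool = s['events']
    return {'chosen': pool[0] if pool else None}

def update_events(params, step, sL, s, _input):
    # Return a new tuple without the chosen event instead of mutating in place.
    chosen = _input['chosen']
    if chosen is None:
        return ('events', s['events'])
    pool = list(s['events'])
    pool.remove(chosen)
    return ('events', tuple(pool))

def run_once(initial_events, steps):
    # Hand-rolled loop standing in for the engine (an assumption, not its API).
    s = {'events': tuple(initial_events)}
    taken = []
    for step in range(steps):
        signal = life_policy({}, step, [], s)
        key, value = update_events({}, step, [], s, signal)
        s = {**s, key: value}
        if signal['chosen'] is not None:
            taken.append(signal['chosen'])
    return taken, s

shared = [Event(30, 2, 500), Event(25, 1, 200), Event(40, 3, 900)]
taken_a, final_a = run_once(shared, 2)
taken_b, final_b = run_once(shared, 2)
```

Both runs start from the same initial pool and pick the same events, and the original shared list is left untouched, so the runs no longer depend on each other.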