Working with Large Objects

Hi everyone, I am working on another tutorial for my website, this time on hash tables. My hope is to show simulations for deciding the optimal time to double the hash array and the optimal collision handling. My issue is that the space needed would explode if cadCAD keeps a copy of the hash table for every step of each simulation. Is there a recommended way to handle this that anyone knows of?

Hey Sean!

Right now, there are two ways to minimize memory footprint, although both of them are a bit dirty. One is to erase the state history on each timestep, and another is to use a tweaked version of cadCAD which doesn’t make deepcopies across state transitions.

For the first one, you would create a State Update Function which iterates over the state history, and deletes all the previous instances of the object. This would be as simple as something like:

ERASE_VARIABLES = ['variable_1', 'variable_2']

def s_used_memory(params, substep, state_history, prev_state, policy_input):
    # Walk every timestep, then every substep record within it
    for a in state_history:
        for b in a:
            for key in ERASE_VARIABLES:
                # Drop the reference so the large object can be freed
                b[key] = None
    return ('used_memory', None)
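To see the effect without running a full simulation, here is a minimal sketch that applies the same erasing logic to a hand-built history. It assumes the state history is a list of timesteps, each holding a list of substep state dicts, which is the shape the nested loop iterates over. The variable name `hash_table` and the mock history are made up for illustration and are not part of cadCAD.

```python
ERASE_VARIABLES = ['hash_table']  # hypothetical name of the large state variable


def s_used_memory(params, substep, state_history, prev_state, policy_input):
    """Blank the large variables in every past record of the history."""
    for timestep_records in state_history:
        for substep_state in timestep_records:
            for key in ERASE_VARIABLES:
                substep_state[key] = None
    return ('used_memory', None)


# Mock history (an assumption, not built by cadCAD): 3 timesteps x 2 substeps,
# each record holding its own large list.
mock_history = [
    [{'timestep': t, 'substep': s, 'hash_table': [0] * 10_000, 'used_memory': None}
     for s in range(2)]
    for t in range(3)
]

# Stand-in for the previous state, kept separate from the history records.
prev_state = {'timestep': 3, 'substep': 0, 'hash_table': [0] * 10_000, 'used_memory': None}

result = s_used_memory({}, 1, mock_history, prev_state, {})
remaining = [state['hash_table'] for records in mock_history for state in records]

print(result)                                      # ('used_memory', None)
print(all(value is None for value in remaining))   # True
print(prev_state['hash_table'] is not None)        # True
```

Only the history records are touched here; the separate `prev_state` dict keeps its object, so the simulation can keep working with the current value.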

For the second one, you could install the 'no_deepcopy' branch from a forked repo (danlessa/cadCAD on GitHub). To install it, run the following pip command:

pip install git+https://github.com/danlessa/cadCAD@no_deepcopy
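As a rough illustration of why skipping deepcopies saves memory, the sketch below compares `copy.deepcopy` and `copy.copy` on a state dict that holds one large object. It only shows the general Python behaviour; it makes no claim about how the branch is implemented internally, and the state layout is a made-up example.

```python
import copy

# Hypothetical state: one small variable and one large object.
state = {'timestep': 0, 'hash_table': list(range(100_000))}

deep = copy.deepcopy(state)   # duplicates the large list as well
shallow = copy.copy(state)    # new dict, but the same list object inside

print(deep['hash_table'] is state['hash_table'])     # False
print(shallow['hash_table'] is state['hash_table'])  # True
```

With deep copies, every stored record carries its own copy of the large object; with shallow copies, records share it until something rebinds the variable to a new object.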

Thank you! This is exactly what I was looking for.

I believe it also works if you clear only the last set of substates after timestep zero, rather than looping through the entire state history each time:

def p_free_memory(params, substep, state_history, state):
    if state['timestep'] > 0:
        for key in ['state_variable_to_clear']:
            substates = state_history[-1]
            for substate in substates:
                substate[key] = None
    return {}
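Here is a small sketch of how that behaves over several timesteps. The loop is a hypothetical stand-in for the simulation engine, not cadCAD itself: it assumes one substep per timestep and that the policy receives a copy of the latest record as its state. The variable name `hash_table` is also made up.

```python
import copy

CLEAR_KEYS = ['hash_table']  # hypothetical large state variable


def p_free_memory(params, substep, state_history, state):
    """Policy that clears the large variable from the latest history entry."""
    if state['timestep'] > 0:
        for key in CLEAR_KEYS:
            for substate in state_history[-1]:
                substate[key] = None
    return {}


# Stand-in driver (an assumption about the engine, for illustration only).
history = [[{'timestep': 0, 'hash_table': [0] * 1_000}]]
for _ in range(4):
    state = copy.deepcopy(history[-1][-1])   # policy sees a copy of the latest record
    p_free_memory({}, 1, history, state)
    new_state = {
        'timestep': state['timestep'] + 1,
        'hash_table': state['hash_table'] + [0],
    }
    history.append([new_state])

kept = [records[0]['hash_table'] is not None for records in history]
print(kept)  # [True, False, False, False, True]
```

In this sketch the initial state and the newest record keep their objects, while everything in between is cleared one timestep at a time, so the cost per step stays constant instead of growing with the length of the history.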

It works perfectly - but the cadCAD version is unfortunately stuck at 0.4.18, which has the params[0]/params["actual param that I want"] problem - could you please update this to 0.4.23?

I just put the no_deepcopy patch on 0.4.23 in my fork, and it seems to work fine:

pip install git+https://github.com/randomshinichi/cadCAD

I’ve just updated the branch to 0.4.23 btw

beautiful, thank you!

FYI: it is easier to consume that branch hack now. There’s a tweaked version of cadCAD with shallow copies + multi_proc + progress bars on https://pypi.org/project/cadCAD-tweaked/

To install, just run pip install cadCAD-tweaked

I'm going to maintain that package informally so it is always more or less up to date, but be watchful for unexpected results
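One plausible source of surprises with shallow copies in general (an assumption about Python semantics, not a documented behaviour of this package) is in-place mutation: if a state variable is modified in place, earlier snapshots that share the same object change too. A minimal sketch with made-up names:

```python
import copy

# Mutating in place: every shallow snapshot shares the same list.
history = []
state = {'timestep': 0, 'table': [1, 2, 3]}
for t in range(1, 3):
    history.append(copy.copy(state))
    state['table'].append(t * 10)   # in-place change leaks into old snapshots
    state['timestep'] = t

print(history[0]['table'])  # [1, 2, 3, 10, 20]

# Building a new object each step keeps old snapshots intact.
safe_history = []
safe_state = {'timestep': 0, 'table': [1, 2, 3]}
for t in range(1, 3):
    safe_history.append(copy.copy(safe_state))
    safe_state = {'timestep': t, 'table': safe_state['table'] + [t * 10]}

print(safe_history[0]['table'])  # [1, 2, 3]
```

So when running without deep copies, it is worth checking that state update functions return new objects rather than mutating the ones they were given.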
