Initialization
Now that you have decided which subsystems you want to train your NNP on and have prepared all the required files, you can initialize the ArcaNN procedure by running the following command from the $WORK_DIR folder:
python -m arcann_training initialization start
This should generate your first 000-training directory. In $WORK_DIR you will also find a default_input.json file that looks like this:
{
    "step_name": "initialization",
    "systems_auto": ["SYSNAME1", "SYSNAME2", "SYSNAME3"],
    "nnp_count": 3
}
The "systems_auto" keyword contains the names of all the systems that were found in your $WORK_DIR/user_files/ directory (i.e. all LMP files), and "nnp_count" is the number of NNPs used by default in the committee.
The initialization will create several folders. The most important one is the control/ folder, in which essential data files will be stored throughout the iterative procedure. These files are written in .json format and should NOT be modified. Right after initialization, the only file in control/ is config.json, which contains the essential information about your initialization choices (or defaults), such as your subsystem names and options. Finally, the empty 000-training folder, where you will perform the first iteration of training, should also have been created by the Python script.
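As a quick sanity check after initialization, the layout described above can be verified with a short read-only script. This is only a sketch: the helper name check_initialization is hypothetical and not part of ArcaNN, and it makes no assumption about the keys stored in config.json. It only loads the file and never writes to control/.

```python
import json
from pathlib import Path


def check_initialization(work_dir):
    """Return the parsed control/config.json if the expected layout exists.

    Hypothetical helper (not part of ArcaNN). It only reads files, since the
    contents of control/ should not be modified by hand.
    """
    work_dir = Path(work_dir)
    config_path = work_dir / "control" / "config.json"
    training_dir = work_dir / "000-training"

    # Collect everything that is missing so the error message is complete.
    missing = []
    if not config_path.is_file():
        missing.append(str(config_path))
    if not training_dir.is_dir():
        missing.append(str(training_dir))
    if missing:
        raise FileNotFoundError("Initialization looks incomplete, missing: " + ", ".join(missing))

    # Parse the configuration without altering it.
    return json.loads(config_path.read_text())


# Example usage, run from $WORK_DIR:
# config = check_initialization(".")
# print(json.dumps(config, indent=4))
```

If the function raises, the initialization command most likely did not complete and can simply be run again.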
If at this point you want to modify the datasets used for the first training, simply create an input.json from the default_input.json file and add or remove system names in the list. You can also change the number of NNPs if you wish. Then you only have to execute the initialization command again and your 000-training directory will be updated.
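Editing the file by hand is usually enough, but the same change can be scripted. The sketch below assumes that input.json accepts the same keys as default_input.json ("systems_auto" and "nnp_count"); the helper name write_input_json is hypothetical and not part of ArcaNN.

```python
import json
from pathlib import Path


def write_input_json(work_dir, systems, nnp_count=None):
    """Create input.json from default_input.json with a custom system list.

    Hypothetical helper (not part of ArcaNN). Assumes input.json uses the
    same keys as default_input.json.
    """
    work_dir = Path(work_dir)
    default = json.loads((work_dir / "default_input.json").read_text())

    # Start from the defaults and override only what the user asked for.
    user_input = dict(default)
    user_input["systems_auto"] = list(systems)
    if nnp_count is not None:
        user_input["nnp_count"] = int(nnp_count)

    (work_dir / "input.json").write_text(json.dumps(user_input, indent=4) + "\n")
    return user_input


# Example usage, run from $WORK_DIR (system names are the placeholders above):
# write_input_json(".", ["SYSNAME1", "SYSNAME3"], nnp_count=4)
```

After writing input.json, rerun the initialization command from $WORK_DIR so that the 000-training directory reflects the new choices.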