
AI Code Security Check

  • For security purposes, model code that will be sent to sensitive data has to be checked for safety.
  • For an initial check of the code, we provide a code security module powered by generative AI.
  • To use this service, you need generative AI service credentials that are compatible with dspy.

Preparation for analysis

  • The code cells below demonstrate the use of the AI code check feature.
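import os

import dspy
from phaeton.ai import codebase_security_check

# Configure the language model; the credentials are read from environment variables.
lm = dspy.LM(
    "azure/gpt-4o",
    api_key=os.environ["AZURE_AI_API_KEY"],
    api_base=os.environ["AZURE_AI_ENDPOINT"],
)

# Make this model the default for dspy.
dspy.configure(lm=lm)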
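Because the configuration above reads both credentials from environment variables, a missing variable only surfaces as a bare KeyError. A minimal sketch of an upfront check is given below. The helper name `missing_credentials` is hypothetical and not part of phaeton or dspy; the variable names are the ones used in the cell above.

```python
import os

# Environment variables read by the configuration cell above.
REQUIRED_VARS = ("AZURE_AI_API_KEY", "AZURE_AI_ENDPOINT")


def missing_credentials(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; not part of phaeton or dspy.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing credentials:", ", ".join(missing))
else:
    print("All credentials are set.")
```

Running this before creating the language model gives a readable message listing every missing variable at once, rather than failing on the first one.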
  • The code to be checked has to be inside a tarball. As an example, let’s clone the Git repository of the RIVM COVID-19 projection model and put it inside a tar archive.
!git clone https://github.com/rivm-syso/COVID-projectionmodel --depth 1
!tar -czvf COVID-projectionmodel.tar.gz ./COVID-projectionmodel/
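Where shell commands are not available, the archive can also be built with Python's standard `tarfile` module. The following is a sketch under stated assumptions: the helper name `make_tarball` is hypothetical and not part of phaeton, and skipping the `.git` directory is an optional design choice, not a requirement of the security check. Whether the check depends on a particular directory layout inside the archive is not documented here, so verify the result against your setup.

```python
import tarfile
from pathlib import Path


def make_tarball(source_dir, archive_path, exclude_dirs=(".git",)):
    """Pack source_dir into a gzip-compressed tarball.

    Any directory whose name appears in exclude_dirs is left out together
    with its contents. Hypothetical helper; not part of phaeton.
    """
    source = Path(source_dir)

    def skip_excluded(tarinfo):
        # tarinfo.name is the path inside the archive, e.g. "repo/.git/config".
        if any(part in exclude_dirs for part in Path(tarinfo.name).parts):
            return None  # returning None drops the member (and its children)
        return tarinfo

    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(str(source), arcname=source.name, filter=skip_excluded)
    return Path(archive_path)
```

For the example repository this would be called as `make_tarball("COVID-projectionmodel", "COVID-projectionmodel.tar.gz")`. Passing `exclude_dirs=()` keeps every file, as the `tar` command above does.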
  • We can run the code security check on the tarball as follows:
codebase_security_check(
    "COVID-projectionmodel.tar.gz",
    create_report=True,
    save_report=True,
    report_path='security_report.md'
)
  • The resulting output is saved to a Markdown file named security_report.md.

Example code security analysis output

The overall code is mostly safe to run, with the exception of a few files that require caution:

  1. ./README.html: Contains a potentially unsafe operation in Chunk 2, where a script is dynamically appended to the HTML document’s <head> section without verifying its content or source. This poses a risk of executing malicious code.

  2. ./R/00_masterscript_20210106.R: Sources external scripts, and while the chunks themselves are safe, the safety of the external scripts being sourced cannot be fully guaranteed without reviewing their content.

  3. ./.git/hooks/sendemail-validate.sample: Interacts with Git configurations and worktrees, which could potentially affect a repository. It is recommended to run this in a controlled environment and ensure proper implementation of the TODO sections.

For the rest of the files, they have been reviewed and deemed safe to run. They primarily involve data manipulation, visualization, configuration, and Git hooks, with no harmful operations or malicious intent.

Statistics

| File | Format | Is it safe? | Analysis remarks |
| --- | --- | --- | --- |
| ./R/code4model/Populationdata4model.R | R | | Both chunks of code have been reviewed and deemed safe to run. They focus on data manipulation and analysis without performing any harmful operations on the system. |
| ./R/code4model/OSIRISanalyses4model.R | R | | The code is safe to run. Both chunks have been reviewed and found to contain statistical or epidemiological modeling without any harmful operations or malicious intent. |
| ./R/code4model/DelaysProbabilities4model.R | R | | The code is safe to run. Both chunks involve mathematical operations and the creation of delay distributions without any harmful actions or system-level manipulations. |
| ./R/code4model/SEROanalysesContactinput4model.R | R | | The code is safe to run as all chunks involve data processing, manipulation, and summarization operations without any harmful or system-altering commands. |
| ./R/code4model/simulationcode4model_v3.R | R | | The code appears to be safe to run as both chunks do not include harmful operations such as file manipulation, system commands, or network access. The code seems to focus on simulation and statistical processes. However, it is important to ensure that the referenced objects and functions (e.g., ContactMatrices, SeasonalityCurves, LogY0, LogInfectivities, engine_allvars) are properly defined and do not introduce unsafe behavior. |
| ./R/code4model/Seasonality4model.R | R | | All code chunks have been evaluated as safe to run. |
| ./R/code4model/ContactsInfectivities4model.R | R | | The code appears to be safe to run based on the provided chunks. Chunk 1 defines functions related to epidemiological modeling without any harmful operations. Chunk 2 involves accessing and returning data, and while the full context of the code and data is not available, the snippet itself does not indicate any unsafe behavior. |
| ./R/code4model/SimulatePlot4model.R | R | | The code is safe to run. Both chunks have been reviewed and determined to contain no harmful operations or malicious intent. They focus on statistical modeling, plotting, and data visualization. |
| ./R/code4model/EPI_NICEdata4fit.R | R | | The code contains one chunk that is safe to run and another chunk that is incomplete with syntax errors, making it difficult to fully assess its safety. However, based on the visible portion, there are no indications of harmful operations. Caution is advised when running the incomplete chunk. |
| ./R/code4model/Replace_syntheticresults.R | R | | The code is safe to run. Both chunks are focused on loading data and defining functions without performing any harmful operations on the system. |
| ./R/code4model/readmatrices4model.R | R | | The code appears to be safe to run based on the analysis of both chunks. Chunk 1 involves reading .rds files, which should be verified as coming from a trusted source to avoid potential risks. Chunk 2 performs safe data manipulations and calculations without harmful operations. |
| ./R/code4model/EPI_NICEanalyses4model.R | R | | The code chunks provided are safe to run. Chunk 1 involves standard data processing operations without any harmful actions, and Chunk 2 contains non-executable text that poses no risk. |
| ./R/code4model/EPI_Reportingdelays4model.R | R | | The code appears to be safe to run. Both chunks define functions for calculating reporting delays, perform data manipulation using filtering and summarization, and return processed results. There are no indications of harmful operations such as file system access, network calls, or system modifications. |
| ./R/code4model_original/EPI_NICEdata4fit.R | R | | Both chunks of code have been reviewed and deemed safe to run. They involve data processing, filtering, and transformation operations without any harmful or malicious commands. |
| ./R/code4figures/Figure_S12_logbetaestimationhistory.R | R | | The code is safe to run. Both chunks involve standard data analysis and visualization tasks without any harmful operations or malicious intent. |
| ./R/code4figures/Figure_2.R | R | | The code appears to be safe to run overall. Both chunks involve standard operations such as loading data, performing calculations, and saving plots. However, caution should be exercised to ensure that external files and libraries used in the code are trustworthy and free from malicious content. |
| ./R/code4figures/Figure_S11_continuoustimemodel.R | R | | The code is safe to run. Both chunks involve standard operations such as data loading, function definition, modeling, data visualization, and statistical analysis, with no indications of harmful or malicious activities. |
| ./R/code4figures/Figure_1.R | R | | All code chunks have been reviewed and deemed safe to run. The code primarily involves data visualization and saving plots in R, with no harmful operations or system-level commands. |
| ./R/00_masterscript_20210106.R | R | | The code appears to be safe to run based on the provided chunks. However, the safety of the external scripts being sourced in Chunk 1 cannot be fully guaranteed without reviewing their content. It is recommended to ensure that these external scripts are from a trusted source and do not contain harmful code. |
| ./R/code4data_original/opschonen_data_nice_episode.R | R | | Both chunks of code have been reviewed and deemed safe to run. They involve standard data processing operations in R using packages like dplyr, without any indications of harmful actions such as file system access, network calls, or system-level commands. However, it is recommended to ensure the data being processed is secure and does not contain sensitive information. |
| ./R/code4data_original/opschonen_data_nice.R | R | | The full code is safe to run. Both chunks have been reviewed and do not contain harmful operations or system-level manipulations. They primarily source external R scripts and perform environment cleanup by removing a variable. |
| ./R/code4data_original/importeren_data_nice.R | R | | The full code is safe to run. Both chunks have been reviewed and show no indications of harmful operations or system-altering commands. |
| ./R/code4data_original/opschonen_data_nice_opnamedatum.R | R | | The full code is safe to run. Both chunks are focused on data manipulation using the dplyr package in R and do not include any harmful operations. However, as a precaution, ensure that the data being processed does not contain sensitive information and is handled securely. |
| ./R/code4data_original/opschonen_data_nice_filter.R | R | | Both code chunks have been reviewed and deemed safe to run. They involve standard data manipulation operations in R using packages like dplyr and tidyverse, with no indications of harmful actions such as file deletion, system modifications, or external network calls. |
| ./R/code4data_original/opschonen_data_nice_algemeen.R | R | | The full code is safe to run. Both chunks perform data transformation and cleaning operations using R’s dplyr package, with no indications of harmful actions such as file deletion, system modification, or external network calls. |
| ./results/Fig1.jpg | binary | | Binary files are considered unsafe by default. |
| ./results/Fig2.pdf | binary | | Binary files are considered unsafe by default. |
| ./results/additionalresults/FigS11.pdf | binary | | Binary files are considered unsafe by default. |
| ./results/additionalresults/FigS12.jpg | binary | | Binary files are considered unsafe by default. |
| ./results/additionalresults/maxlikelihoodsestimationhistory_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./results/additionalresults/simulations_continuoustime_10pct_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./results/additionalresults/FigS12.pdf | binary | | Binary files are considered unsafe by default. |
| ./results/additionalresults/maxlikelihoodcontinuoustime_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./results/additionalresults/FigS11.jpg | binary | | Binary files are considered unsafe by default. |
| ./results/Fig2.jpg | binary | | Binary files are considered unsafe by default. |
| ./results/Fig1.pdf | binary | | Binary files are considered unsafe by default. |
| ./results/simulations_10pct_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./results/maxlikelihood_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./README.html | JavaScript | | The code contains a potentially unsafe operation in Chunk 2, where a script is dynamically appended to the HTML document’s <head> section without verifying its content or source. This poses a risk of executing malicious code. Therefore, the overall code cannot be deemed entirely safe to run. |
| ./README.md | R | | Both code chunks are safe to run. The first chunk is a descriptive text about the repository and synthetic data usage, while the second chunk is a license text excerpt. Neither contains executable code or poses any risk. |
| ./.git/hooks/commit-msg.sample | Shell Script | | Both chunks of code are safe to run. They are Git hooks designed to check for duplicate “Signed-off-by” lines in commit messages and do not perform any harmful operations on the system. |
| ./.git/hooks/pre-merge-commit.sample | Shell Script | | All code chunks have been reviewed and deemed safe to run. The scripts are Git hook templates for pre-merge-commit checks, designed to assist in version control workflows without performing any harmful operations. |
| ./.git/hooks/pre-commit.sample | Shell Script | | The code is safe to run. Both chunks are Git pre-commit hook scripts designed to enforce checks like preventing non-ASCII filenames and whitespace errors before committing changes. They do not perform any harmful operations on the system. |
| ./.git/hooks/pre-rebase.sample | Bash | | Both code chunks are safe to run. The first chunk is a Git hook that enforces conditions during a rebase operation without performing harmful actions, and the second chunk is purely documentation explaining Git strategies and commands, which is non-executable and harmless. |
| ./.git/hooks/post-update.sample | Shell | | All code chunks have been evaluated as safe to run. |
| ./.git/hooks/pre-receive.sample | Shell Script | | All code chunks have been reviewed and deemed safe to run. The scripts are Git hooks that process push options, echo certain values, and reject pushes based on specific conditions without performing any harmful operations. |
| ./.git/hooks/push-to-checkout.sample | Shell | | The code appears to be safe to run as it primarily consists of Git hook scripts designed to handle updates to a checked-out tree during a git push. Both chunks interact with Git commands and do not contain harmful operations. However, the code is incomplete and assumes proper permissions and a valid Git environment. It is recommended to run it in a controlled environment to avoid unintended changes. |
| ./.git/hooks/pre-applypatch.sample | Shell | | All code chunks have been reviewed and deemed safe to run. The scripts are Git hook templates intended for verifying commits during the applypatch process and do not perform any harmful operations. |
| ./.git/hooks/applypatch-msg.sample | Shell Script | | All code chunks have been reviewed and deemed safe to run. They are Git hook examples for checking commit messages during the applypatch process and do not perform any harmful operations on the system. |
| ./.git/hooks/fsmonitor-watchman.sample | Perl | | The code appears to be safe to run based on the provided chunks. However, it is incomplete and may not function as intended without the missing parts. Additionally, it assumes the presence of specific dependencies and a properly configured environment. Ensure all prerequisites are met before executing the code. |
| ./.git/hooks/pre-push.sample | Shell Script | | All code chunks have been reviewed and deemed safe to run. The scripts are Git pre-push hooks designed to prevent pushing commits with log messages starting with “WIP” and do not perform any harmful operations on the system. |
| ./.git/hooks/update.sample | Bash | | Both code chunks have been reviewed and deemed safe to run. They are Git hooks designed to enforce repository policies and include safety checks to prevent improper usage. No harmful operations are performed, and the scripts are contextually safe for their intended use. |
| ./.git/hooks/sendemail-validate.sample | Shell | | The code appears to be generally safe to run, as it does not contain any explicitly harmful commands or operations. However, it interacts with Git configurations and worktrees, which could potentially affect a repository. It is recommended to run the code in a controlled environment and ensure that the TODO sections are properly implemented before using it in production. |
| ./.git/hooks/prepare-commit-msg.sample | Shell Script | | All code chunks have been reviewed and deemed safe to run. The scripts are Git hooks designed for modifying commit messages, utilizing standard tools like Perl and Git commands without performing any harmful operations on the system. |
| ./.git/index | binary | | Binary files are considered unsafe by default. |
| ./.git/description | Plain Text | | All code chunks have been evaluated as safe to run. |
| ./.git/logs/HEAD | Plain Text | | The code consists of metadata or log entries related to cloning a GitHub repository. It does not contain executable code or harmful instructions. Therefore, the code is safe to run. |
| ./.git/logs/refs/heads/master | Plain Text | | The code consists of metadata or log entries related to cloning a GitHub repository. It does not contain executable code or harmful instructions. Therefore, the code is safe to run. |
| ./.git/logs/refs/remotes/origin/HEAD | Plain Text | | The code consists of metadata or log entries related to cloning a GitHub repository. It does not contain executable code or harmful instructions. Therefore, the code is safe to run. |
| ./.git/shallow | Plain Text | | The full code is safe to run as all chunks are non-executable and do not perform any operations. |
| ./.git/config | Git Configuration File | | All code chunks are safe to run. They consist of Git configuration file snippets that do not execute any harmful operations. |
| ./.git/HEAD | Git Reference File | | All code chunks have been evaluated as safe to run. |
| ./.git/packed-refs | Git | | All chunks are safe to run. The code consists of Git reference file snippets, which are non-executable and pose no harm to the system. |
| ./.git/refs/heads/master | Plain Text | | The full code is safe to run as all chunks are non-executable and do not perform any operations. |
| ./.git/refs/remotes/origin/HEAD | Git | | All code chunks have been evaluated as safe to run. |
| ./.git/info/exclude | Git Ignore File | | All code chunks are safe to run. They consist of configuration snippets for Git’s exclude file and do not execute any harmful operations. |
| ./.git/objects/pack/pack-e7afdaf9ec81eece024c119642ead20676be848e.pack | binary | | Binary files are considered unsafe by default. |
| ./.git/objects/pack/pack-e7afdaf9ec81eece024c119642ead20676be848e.rev | binary | | Binary files are considered unsafe by default. |
| ./.git/objects/pack/pack-e7afdaf9ec81eece024c119642ead20676be848e.idx | binary | | Binary files are considered unsafe by default. |
| ./LICENSE | Plain Text | | The provided code chunks are safe to run. They consist of non-executable text related to the GNU Affero General Public License, including legal and informational content, and do not contain any harmful instructions. |
| ./COVID-projectionmodel.Rproj | R configuration file | | All code chunks have been reviewed and deemed safe to run. They appear to be configuration files with no harmful commands. |
| ./.gitignore | Plain Text | | All code chunks are safe to run as they consist of lists of file paths or filenames and do not contain executable code. |
| ./data/population/ROAZregios_synthetic.csv | CSV | | All chunks have been reviewed and determined to be safe to run. The content consists of datasets or structured information without any executable code or harmful instructions. |
| ./data/figuredata/PrognosisData20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./data/figuredata/Observations20220321.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_restrictions28sep_2020-11-26.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_batch2_1june_2020-11-26.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrix_D3asEpiPose1_residualincreased_27mei2020.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_intelligentlockdown_march_2020-11-26.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_2week-lockdown-november_2020-11-26.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_start-schoolyear-september_2020-11-26.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_batch3_summerholiday_2020-11-26.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_partiallockdown-october_2020-11-26.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/ContactmatricesD3praktijk_midpoint_24mrt2020.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_winter-lockdown_2020-12-16.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_partiallockdown-october-holiday_2020-11-26.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_winter-lockdown-christmas_2020-12-16.rds | binary | | Binary files are considered unsafe by default. |
| ./data/contacts/Contactmatrices_batch1_11may_2020-11-26.rds | binary | | Binary files are considered unsafe by default. |
| ./data/OSIRIS/OSIRISdata_20210106_synthetic.rds | binary | | Binary files are considered unsafe by default. |
| ./data/NICE/NICEIDdata_20210106_synthetic.rds | binary | | Binary files are considered unsafe by default. |
| ./data/NICE/data_nicedelay_20210106_synthetic.rds | binary | | Binary files are considered unsafe by default. |
| ./data/processedmodelinput/modelinput2021-01-06.RData | binary | | Binary files are considered unsafe by default. |
| ./data/processedmodelinput/modeldatainput2021-01-06.RData | binary | | Binary files are considered unsafe by default. |
| ./data/originalresults/PreAdmissionProbs_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./data/originalresults/NICEtimeseries_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./data/originalresults/NICEdelays_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./data/originalresults/NICEprobabilities_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./data/originalresults/ReportingDelays_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./data/originalresults/PreAdmissionDelays_20210106.rds | binary | | Binary files are considered unsafe by default. |
| ./data/sero/pico1data_synthetic.csv | CSV | | Both code chunks are safe to run. They consist of synthetic data and references to a COVID projection model without any executable code or harmful instructions. |
| ./data/sero/pico2data_synthetic.csv | CSV | | All code chunks have been reviewed and deemed safe to run. They consist of synthetic data and metadata without any executable code or harmful instructions. |
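
Since the report treats every binary file as unsafe by default, it can be useful to see in advance which archive members are likely to fall into that category. The sketch below applies a common heuristic, a NUL byte within the first block of a file, to each member of a tarball. This heuristic is an assumption for illustration and is not necessarily how the security module classifies files; the function name `list_probable_binaries` is hypothetical.

```python
import tarfile


def list_probable_binaries(archive_path, sample_size=8192):
    """Return names of regular files whose first bytes contain a NUL byte.

    Hypothetical helper; the NUL-byte test is a heuristic, not phaeton's rule.
    """
    flagged = []
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and special files
            handle = archive.extractfile(member)
            if handle is None:
                continue
            with handle:
                if b"\x00" in handle.read(sample_size):
                    flagged.append(member.name)
    return flagged
```

Calling `list_probable_binaries("COVID-projectionmodel.tar.gz")` returns a list of member names, which can then be reviewed by hand or left out of the archive if they are not needed for the check.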