Audit Summary
Notes from the initial data audit (Step 1). Full report lives at AUDIT.md at the project root.
Corpus
- 152 CSV files · 2.8 GB total · 10.5M data rows
- 152 tables documented in the DDL (6,631 column metadata entries, operator-supplied)
- 92 semantic relationships between 90 entity types (rises to ~124 after our manual node additions)
- 1,025 aircraft in the fleet roster, across 125 distinct models
Seed strategy
The DB load (db/ramco_aviation.sqlite, 605 MB) was filtered to the top 20 aircraft by activity score, plus the full master tables. This captures ~86% of real operational activity while keeping the DB small enough to ship.
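The selection and coverage arithmetic behind this strategy can be sketched as follows. This is a minimal illustration, not the project's loader: the function name `top_n_coverage` is hypothetical, and the activity score is assumed here to be tech logs plus discrepancies per reg, since the actual formula is not stated in these notes. The sample figures are the five rows from the Top of fleet table.

```python
def top_n_coverage(activity_by_reg, n=20):
    """Return the n highest-scoring regs and the share of total activity they cover."""
    # Rank regs by score, highest first.
    ranked = sorted(activity_by_reg.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:n]
    total = sum(activity_by_reg.values())
    covered = sum(score for _, score in top)
    # Guard against an empty or all-zero input.
    share = covered / total if total else 0.0
    return [reg for reg, _ in top], share


# ASSUMPTION: score = tech logs + discrepancies (the real scoring rule may differ).
scores = {
    "1132": 1040 + 1919,
    "101": 840 + 862,
    "6YJMB": 320 + 981,
    "1133": 467 + 593,
    "1819": 485 + 466,
}

regs, share = top_n_coverage(scores, n=2)
print(regs, f"{share:.1%}")
```

Run against the full fleet with `n=20`, the same calculation would yield the coverage figure to compare with the ~86% quoted above, provided the score definition matches the one used in the audit.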
Top of fleet
| Rank | Reg | Model | Tech logs | Discrepancies |
|---|---|---|---|---|
| 1 | 1132 | B767-200 | 1,040 | 1,919 |
| 2 | 101 | A310 | 840 | 862 |
| 3 | 6YJMB | A320-211 | 320 | 981 |
| 4 | 1133 | B767-200 | 467 | 593 |
| 5 | 1819 | A320-211 | 485 | 466 |
Empty tables (15)
DP_CFWDDTL_MEL_PART_CF_DTL, DP_DPMRDA_DOC_ATTACH_DTL, DP_PRMSCH_PARAM_SCHEDULE_DTL, FLOG_DFDPRSL_DEF_DISC_RESOL, FLOG_TLGCR_TECH_LOG_CR_DTL, FLOG_TLGDA_DOC_ATTACH_DTL, FLOG_TLGFIN_FINANCE_INFO, FLOG_TLGWU_WORK_UNIT_DTL, SWO_Error_Map, swo_msgprt_upd_dtl, swo_swocrgh_charge_details_his, swo_swopm_param_upd_dtl, swo_swoprth_part_details_his, swo_swores_swo_res_det, swo_visit_plan_eo_dtl.
Most are audit/history/attachment tables the operator doesn't use or didn't export.
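A check like this one can be reproduced with a small script. The sketch below assumes one CSV file per table, named after the table, with a header on the first row; the function name `find_empty_tables` and the directory layout are hypothetical, not taken from the audit tooling.

```python
import csv
from pathlib import Path


def find_empty_tables(csv_dir):
    """Return table names (file stems) whose CSV has no data rows after the header."""
    empty = []
    for path in sorted(Path(csv_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8", errors="replace") as fh:
            reader = csv.reader(fh)
            next(reader, None)  # skip the header row, if there is one
            # csv.reader yields [] for blank lines, so those do not count as data.
            has_data = any(row for row in reader)
        if not has_data:
            empty.append(path.stem)
    return empty
```

Calling `find_empty_tables("path/to/csv_export")` returns a sorted list of table names that can be compared against the list above.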
Data-quality flags
- 36 orphan aircraft regs: regs appearing in transactions but not in AC_ACI_AIRCRAFT_INFO. Mostly case-variant duplicates (VT-666 vs Vt-666) or test records.
- Wildcard doc numbers: some DISCP_DOC_NO = '%' rows exist; filtered out by the Brain at query time.
- Case-sensitive reg columns: handled via seed case-normalisation during load.
See Data Quality and Case Sensitivity in Rules Reference for how the Brain handles these.
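As a rough illustration of the first two flags, the sketch below separates case-variant duplicates from regs that are genuinely missing from the master, and drops wildcard doc numbers. It is not the Brain's implementation: the function names are hypothetical, upper-casing is assumed as the normalisation rule, and the sample values (including which spelling of VT-666 is in the master) are made up for the example.

```python
def classify_orphans(transaction_regs, master_regs):
    """Split regs absent from the master into case variants and true orphans."""
    master_exact = set(master_regs)
    # ASSUMPTION: normalisation is a plain upper-case fold.
    master_folded = {reg.upper() for reg in master_regs}
    case_variants, true_orphans = set(), set()
    for reg in set(transaction_regs):
        if reg in master_exact:
            continue  # exact match, not an orphan
        if reg.upper() in master_folded:
            case_variants.add(reg)  # differs from a master reg only by case
        else:
            true_orphans.add(reg)  # e.g. test records
    return sorted(case_variants), sorted(true_orphans)


def drop_wildcard_docs(rows, column="DISCP_DOC_NO"):
    """Filter out rows whose doc number is the literal wildcard '%'."""
    return [row for row in rows if row.get(column) != "%"]


# Hypothetical sample data.
master = ["VT-666", "1132"]
transactions = ["VT-666", "Vt-666", "TEST01", "1132"]
variants, orphans = classify_orphans(transactions, master)

docs = [{"DISCP_DOC_NO": "D-001"}, {"DISCP_DOC_NO": "%"}]
clean_docs = drop_wildcard_docs(docs)
```

Classifying orphans this way shows how many of the 36 would disappear under case-normalisation alone and how many need manual review.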